from:"Ilya Dryomov"

Re: [PATCH -next 1/5] net: ceph: Fix a typo in osdmap.c

2021-03-25 Thread Ilya Dryomov

On Thu, Mar 25, 2021 at 7:37 AM Lu Wei  wrote:
>
> Modify "inital" to "initial" in net/ceph/osdmap.c.
>
> Reported-by: Hulk Robot 
> Signed-off-by: Lu Wei 
> ---
>  net/ceph/osdmap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c
> index 2b1dd252f231..c959320c4775 100644
> --- a/net/ceph/osdmap.c
> +++ b/net/ceph/osdmap.c
> @@ -1069,7 +1069,7 @@ static struct crush_work *get_workspace(struct workspace_manager *wsm,
>
> /*
>  * Do not return the error but go back to waiting.  We
> -* have the inital workspace and the CRUSH computation
> +* have the initial workspace and the CRUSH computation
>  * time is bounded so we will get it eventually.
>  */
> WARN_ON(atomic_read(&wsm->total_ws) < 1);
> --
> 2.17.1
>

Hi Lu,

There is at least one other legit typo in that file: "ambigous".
I'd rather fix all typos at once, so curious why Hulk Robot didn't
catch it.

Thanks,

Ilya

Re: [PATCH RESEND][next] ceph: Fix fall-through warnings for Clang

2021-03-05 Thread Ilya Dryomov

On Fri, Mar 5, 2021 at 10:59 AM Gustavo A. R. Silva
 wrote:
>
> In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
> of warnings by explicitly adding a break and a goto statements instead
> of just letting the code fall through to the next case.
>
> Link: https://github.com/KSPP/linux/issues/115
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  fs/ceph/dir.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> index 83d9358854fb..3e575656713e 100644
> --- a/fs/ceph/dir.c
> +++ b/fs/ceph/dir.c
> @@ -631,10 +631,12 @@ static loff_t ceph_dir_llseek(struct file *file, loff_t offset, int whence)
> switch (whence) {
> case SEEK_CUR:
> offset += file->f_pos;
> +   break;
> case SEEK_SET:
> break;
> case SEEK_END:
> retval = -EOPNOTSUPP;
> +   goto out;
> default:
> goto out;
> }
> --
> 2.27.0
>

Applied.

Thanks,

Ilya
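
The effect of the two added statements can be reproduced outside the
kernel.  What follows is a minimal userspace sketch, not the kernel
function: demo_llseek() and its -EINVAL default are assumptions made for
illustration, and only the shape of the switch mirrors the quoted hunk.

```c
#include <errno.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Hypothetical stand-in for the switch in ceph_dir_llseek().
 * Returns the new offset, or a negative errno value on failure.
 */
long long demo_llseek(long long f_pos, long long offset, int whence)
{
	long long retval = -EINVAL;	/* assumed default, for illustration */

	switch (whence) {
	case SEEK_CUR:
		offset += f_pos;
		break;		/* explicit; used to fall through to SEEK_SET */
	case SEEK_SET:
		break;
	case SEEK_END:
		retval = -EOPNOTSUPP;
		goto out;	/* explicit; used to fall through to default */
	default:
		goto out;
	}

	if (offset >= 0)
		retval = offset;
out:
	return retval;
}
```

Behaviour is unchanged by the extra break and goto: each one only spells
out the jump that the fall-through used to perform, which is what lets
-Wimplicit-fallthrough stay quiet.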

Re: net/ceph/messenger_v1.c:1204:5: warning: stack frame size of 2944 bytes in function 'ceph_con_v1_try_read'

2021-03-01 Thread Ilya Dryomov

On Mon, Mar 1, 2021 at 9:36 AM kernel test robot  wrote:
>
> Hi Ilya,
>
> FYI, the error/warning still remains.
>
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head:   fe07bfda2fb9cdef8a4d4008a409bb02f35f1bd8
> commit: 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 libceph: move msgr1 protocol implementation to its own file
> date:   3 months ago

It's fine.  This commit just moved the code which has been this way for
years and never caused any real issues.  Please add it to the allowlist
if possible.

> config: powerpc64-randconfig-r001-20210301 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 5de09ef02e24d234d9fc0cd1c6dfe18a1bb784b0)
> reproduce (this is a W=1 build):
> wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # install powerpc64 cross compiling tool for clang build
> # apt-get install binutils-powerpc64-linux-gnu
> # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2f713615ddd9d805b6c5e79c52e0e11af99d2bf1
> git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git fetch --no-tags linus master
> git checkout 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=powerpc64
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All warnings (new ones prefixed by >>):
>
>__do_insb
>^
>arch/powerpc/include/asm/io.h:541:56: note: expanded from macro '__do_insb'
>#define __do_insb(p, b, n)  readsb((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
>   ~^
>In file included from net/ceph/messenger_v1.c:8:
>In file included from include/net/sock.h:38:
>In file included from include/linux/hardirq.h:10:
>In file included from arch/powerpc/include/asm/hardirq.h:6:
>In file included from include/linux/irq.h:20:
>In file included from include/linux/io.h:13:
>In file included from arch/powerpc/include/asm/io.h:604:
>arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer 
> arithmetic on a null pointer has undefined behavior 
> [-Wnull-pointer-arithmetic]
>DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c),
>^~~
>arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
> 'DEF_PCI_AC_NORET'
>__do_##name al; \
>^~
>:32:1: note: expanded from here
>__do_insw
>^
>arch/powerpc/include/asm/io.h:542:56: note: expanded from macro '__do_insw'
>#define __do_insw(p, b, n)  readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
>   ~^
>In file included from net/ceph/messenger_v1.c:8:
>In file included from include/net/sock.h:38:
>In file included from include/linux/hardirq.h:10:
>In file included from arch/powerpc/include/asm/hardirq.h:6:
>In file included from include/linux/irq.h:20:
>In file included from include/linux/io.h:13:
>In file included from arch/powerpc/include/asm/io.h:604:
>arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer 
> arithmetic on a null pointer has undefined behavior 
> [-Wnull-pointer-arithmetic]
>DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c),
>^~~
>arch/powerpc/include/asm/io.h:601:3: note: expanded from macro 
> 'DEF_PCI_AC_NORET'
>__do_##name al; \
>^~
>:36:1: note: expanded from here
>__do_insl
>^
>arch/powerpc/include/asm/io.h:543:56: note: expanded from macro '__do_insl'
>#define __do_insl(p, b, n)  readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n))
>   ~^
>In file included from net/ceph/messenger_v1.c:8:
>In file included from include/net/sock.h:38:
>In file included from include/linux/hardirq.h:10:
>In file included from arch/powerpc/include/asm/hardirq.h:6:
>In file included from include/linux/irq.h:20:
>In file included from include/linux/io.h:13:
>In file included from arch/powerpc/include/asm/io.h:604:
>arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer 
> arithmetic on a null pointer has undefined behavior 
> [-Wnull-pointer-arithmetic]
>DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c),
>^~
>arch/po

[GIT PULL] Ceph updates for 5.12-rc1

2021-02-22 Thread Ilya Dryomov

Hi Linus,

The following changes since commit f40ddce88593482919761f74910f42f4b84c004b:

  Linux 5.11 (2021-02-14 14:32:24 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.12-rc1

for you to fetch changes up to 558b4510f622a3d96cf9d95050a04e7793d343c7:

  ceph: defer flushing the capsnap if the Fb is used (2021-02-16 12:09:52 +0100)


With netfs helper library and fscache rework delayed, just a few cap
handling improvements to avoid grabbing mmap_lock in some code paths
and deal with capsnaps better and a mount option cleanup.


Ilya Dryomov (2):
  libceph: deprecate [no]cephx_require_signatures options
  libceph: remove osdtimeout option entirely

Jeff Layton (3):
  ceph: fix flush_snap logic after putting caps
  ceph: clean up inode work queueing
  ceph: allow queueing cap/snap handling after putting cap references

Xiubo Li (1):
  ceph: defer flushing the capsnap if the Fb is used

 fs/ceph/addr.c   |  2 +-
 fs/ceph/caps.c   | 70 +++-
 fs/ceph/inode.c  | 61 --
 fs/ceph/snap.c   | 10 +++
 fs/ceph/super.h  | 40 +
 include/linux/ceph/libceph.h |  7 ++---
 net/ceph/ceph_common.c   | 17 ---
 7 files changed, 115 insertions(+), 92 deletions(-)

Re: [PATCH] ceph: Fix an Oops in error handling

2021-02-02 Thread Ilya Dryomov

On Tue, Feb 2, 2021 at 6:47 AM Dan Carpenter  wrote:
>
> The "req" pointer is an error pointer and not NULL so this check needs
> to be fixed.
>
> Fixes: 1cf7fdf52d5a ("ceph: convert readpage to fscache read helper")
> Signed-off-by: Dan Carpenter 
> ---
>  fs/ceph/addr.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 5eec6f66fe52..fb0238a4d34f 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -273,7 +273,7 @@ static void ceph_netfs_issue_op(struct netfs_read_subrequest *subreq)
> if (err)
> iput(inode);
>  out:
> -   if (req)
> +   if (!IS_ERR_OR_NULL(req))
> ceph_osdc_put_request(req);
> if (err)
> netfs_subreq_terminated(subreq, err);

Hi Dan,

I think a better fix would be to set req to NULL in the offending
IS_ERR branch since ceph_osdc_new_request() never returns NULL or
use two separate goto labels.

While at it, the initialization of req and the check on req before
calling ceph_osdc_put_request() are redundant.

Thanks,

Ilya
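
The "two separate goto labels" alternative can be sketched in userspace
C.  Everything prefixed with demo_ is hypothetical: the helpers only
imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() convention, and
demo_new_request() stands in for an allocator that, like
ceph_osdc_new_request() as described above, returns either a valid
pointer or an error pointer and never NULL.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace imitation of the kernel's error-pointer helpers. */
#define DEMO_MAX_ERRNO 4095

static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_request { int refs; };

static int demo_puts;	/* counts calls to demo_put_request() */

/* Never returns NULL: a valid request or an error pointer. */
static struct demo_request *demo_new_request(int fail)
{
	struct demo_request *req;

	if (fail)
		return demo_err_ptr(-ENOMEM);
	req = calloc(1, sizeof(*req));
	if (!req)
		return demo_err_ptr(-ENOMEM);
	req->refs = 1;
	return req;
}

static void demo_put_request(struct demo_request *req)
{
	demo_puts++;
	free(req);
}

int demo_issue_op(int fail_alloc, int fail_submit)
{
	struct demo_request *req;	/* no initializer needed */
	int err = 0;

	req = demo_new_request(fail_alloc);
	if (demo_is_err(req)) {
		err = (int)demo_ptr_err(req);
		goto out;		/* nothing to put yet */
	}

	if (fail_submit)
		err = -EIO;		/* falls into the common put below */

	demo_put_request(req);		/* req is known to be valid here */
out:
	return err;
}
```

With the error-pointer case jumping past the put, the put itself needs no
check on req, and req needs no initial value, which is the redundancy the
reply points out.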

Re: [PATCH] ceph: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE

2021-02-02 Thread Ilya Dryomov

On Mon, Feb 1, 2021 at 8:52 AM Jiapeng Chong
 wrote:
>
> Fix the following coccicheck warning:
>
> ./fs/ceph/debugfs.c:347:0-23: WARNING: congestion_kb_fops should be
> defined with DEFINE_DEBUGFS_ATTRIBUTE.
>
> Reported-by: Abaci Robot 
> Signed-off-by: Jiapeng Chong 
> ---
>  fs/ceph/debugfs.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
> index 66989c8..617327e 100644
> --- a/fs/ceph/debugfs.c
> +++ b/fs/ceph/debugfs.c
> @@ -344,8 +344,8 @@ static int congestion_kb_get(void *data, u64 *val)
> return 0;
>  }
>
> -DEFINE_SIMPLE_ATTRIBUTE(congestion_kb_fops, congestion_kb_get,
> -   congestion_kb_set, "%llu\n");
> +DEFINE_DEBUGFS_ATTRIBUTE(congestion_kb_fops, congestion_kb_get,
> + congestion_kb_set, "%llu\n");
>
>
>  void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc)

Hi Jiapeng,

What is the benefit of this conversion?

From a quick look, with DEFINE_DEBUGFS_ATTRIBUTE the writeback_congestion_kb
file would no longer be seekable.  It may not matter much, but it is
something that should have been mentioned.

Further, debugfs_create_file() creates a full proxy for fops, protecting
against removal races.  DEFINE_DEBUGFS_ATTRIBUTE adds its own protection
but just for ->read() and ->write().  I don't think we need both.

Thanks,

Ilya

Re: [PATCH 0/6] ceph: convert to new netfs read helpers

2021-01-28 Thread Ilya Dryomov

On Thu, Jan 28, 2021 at 1:52 PM Jeff Layton  wrote:
>
> On Wed, 2021-01-27 at 23:50 +0100, Ilya Dryomov wrote:
> > On Tue, Jan 26, 2021 at 2:41 PM Jeff Layton  wrote:
> > >
> > > This patchset converts ceph to use the new netfs readpage, write_begin,
> > > and readahead helpers to handle buffered reads. This is a substantial
> > > reduction in code in ceph, but shouldn't really affect functionality in
> > > any way.
> > >
> > > Ilya, if you don't have any objections, I'll plan to let David pull this
> > > series into his tree to be merged with the netfs API patches themselves.
> >
> > Sure, that works for me.
> >
> > I would have expected that the new netfs infrastructure is pushed
> > to a public branch that individual filesystems could peruse, but since
> > David's set already includes patches for AFS and NFS, let's tag along.
> >
> > Thanks,
> >
> > Ilya
>
> David has a fscache-netfs-lib branch that has all of the infrastructure
> changes. See:
>
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib

I saw that, but AFAICS it hasn't been declared public (as in suitable
for other people to base their work on, with the promise that history
won't get rewritten).  It is branched off of what looks like a random
snapshot of Linus' tree instead of a release point, etc.

Thanks,

Ilya

Re: [PATCH 0/6] ceph: convert to new netfs read helpers

2021-01-27 Thread Ilya Dryomov

On Tue, Jan 26, 2021 at 2:41 PM Jeff Layton  wrote:
>
> This patchset converts ceph to use the new netfs readpage, write_begin,
> and readahead helpers to handle buffered reads. This is a substantial
> reduction in code in ceph, but shouldn't really affect functionality in
> any way.
>
> Ilya, if you don't have any objections, I'll plan to let David pull this
> series into his tree to be merged with the netfs API patches themselves.

Sure, that works for me.

I would have expected that the new netfs infrastructure is pushed
to a public branch that individual filesystems could peruse, but since
David's set already includes patches for AFS and NFS, let's tag along.

Thanks,

Ilya

[GIT PULL] Ceph fixes for 5.11-rc5

2021-01-22 Thread Ilya Dryomov

Hi Linus,

The following changes since commit e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62:

  Linux 5.11-rc2 (2021-01-03 15:55:30 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.11-rc5

for you to fetch changes up to 9d5ae6f3c50a6f718b6d4be3c7b0828966e01b05:

  libceph: fix "Boolean result is used in bitwise operation" warning (2021-01-21 16:49:59 +0100)


A patch to zero out sensitive cryptographic data and two minor cleanups
prompted by the fact that a bunch of code was moved in this cycle.

Ilya Dryomov (3):
  libceph: zero out session key and connection secret
  libceph, ceph: disambiguate ceph_connection_operations handlers
  libceph: fix "Boolean result is used in bitwise operation" warning

 fs/ceph/mds_client.c| 34 ++---
 net/ceph/auth_x.c   | 57 +
 net/ceph/crypto.c   |  3 ++-
 net/ceph/messenger_v1.c |  2 +-
 net/ceph/messenger_v2.c | 45 +-
 net/ceph/mon_client.c   | 14 ++--
 net/ceph/osd_client.c   | 40 +-
 7 files changed, 107 insertions(+), 88 deletions(-)
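
As background for the first patch in this pull: clearing key material
with a plain memset() just before freeing it can be dropped by the
compiler as a dead store.  The kernel provides memzero_explicit() for
this purpose; whether and where the patch uses it is not shown in this
message.  The function below is a hypothetical userspace approximation
that writes through a volatile pointer so the stores are kept.

```c
#include <stddef.h>

/*
 * Hypothetical userspace approximation of an "explicit" zeroing helper.
 * Writing through a volatile-qualified pointer keeps the compiler from
 * eliding the stores even if buf is never read again.
 */
void demo_memzero_explicit(void *buf, size_t len)
{
	volatile unsigned char *p = buf;

	while (len--)
		*p++ = 0;
}
```

A caller would invoke it on the key buffer right before releasing the
memory, for example demo_memzero_explicit(key, sizeof(key)).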

Re: [kbuild] net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in bitwise operation. Clarify expression with parentheses.

2021-01-20 Thread Ilya Dryomov

On Wed, Jan 20, 2021 at 1:43 PM Dan Carpenter  wrote:
>
> On Wed, Jan 20, 2021 at 12:01:59PM +0100, Ilya Dryomov wrote:
> > On Tue, Jan 19, 2021 at 8:46 PM Dan Carpenter  
> > wrote:
> > >
> > > tree:   
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git   
> > > master
> > > head:   1e2a199f6ccdc15cf111d68d212e2fd4ce65682e
> > > commit: 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 libceph: move msgr1 
> > > protocol implementation to its own file
> > > compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> > >
> > > If you fix the issue, kindly add following tag as appropriate
> > > Reported-by: kernel test robot 
> > >
> > >
> > > cppcheck possible warnings: (new ones prefixed by >>, may not real 
> > > problems)
> > >
> > > >> net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in 
> > > >> bitwise operation. Clarify expression with parentheses. 
> > > >> [clarifyCondition]
> > >  BUG_ON(!con->in_msg ^ skip);
> > >      ^
> > >
> > > vim +1099 net/ceph/messenger_v1.c
> > >
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1033  static int 
> > > read_partial_message(struct ceph_connection *con)
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1034  {
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1035      struct ceph_msg 
> > > *m = con->in_msg;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1036  int size;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1037  int end;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1038  int ret;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1039  unsigned int 
> > > front_len, middle_len, data_len;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1040  bool do_datacrc = 
> > > !ceph_test_opt(from_msgr(con->msgr), NOCRC);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1041  bool need_sign = 
> > > (con->peer_features & CEPH_FEATURE_MSG_AUTH);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1042  u64 seq;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1043  u32 crc;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1044
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1045  
> > > dout("read_partial_message con %p msg %p\n", con, m);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1046
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1047  /* header */
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1048      size = sizeof 
> > > (con->in_hdr);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1049  end = size;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1050  ret = 
> > > read_partial(con, end, size, &con->in_hdr);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1051  if (ret <= 0)
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1052  return 
> > > ret;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1053
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1054  crc = crc32c(0, 
> > > &con->in_hdr, offsetof(struct ceph_msg_header, crc));
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1055  if 
> > > (cpu_to_le32(crc) != con->in_hdr.crc) {
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1056  
> > > pr_err("read_partial_message bad hdr crc %u != expected %u\n",
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1057 
> > > crc, con->in_hdr.crc);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1058  return 
> > > -EBADMSG;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1059  }
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1060
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1061  front_len = 
> > > le32_to_cpu(con->in_hdr.front_len);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1062  if (front_len > 
> > > CEPH_MSG_MAX_FRONT_LEN)
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1063      return 
> > > -EIO;
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1064  middle_len = 
> > > le32_to_cpu(con->in_hdr.middle_len);
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1065  if (middle_len > 
> > > CEPH_MSG_MAX_MIDDLE_LEN)
> > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1066

Re: [kbuild] net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in bitwise operation. Clarify expression with parentheses.

2021-01-20 Thread Ilya Dryomov

On Tue, Jan 19, 2021 at 8:46 PM Dan Carpenter  wrote:
>
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git  
> master
> head:   1e2a199f6ccdc15cf111d68d212e2fd4ce65682e
> commit: 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 libceph: move msgr1 protocol 
> implementation to its own file
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
>
> cppcheck possible warnings: (new ones prefixed by >>, may not real problems)
>
> >> net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in 
> >> bitwise operation. Clarify expression with parentheses. [clarifyCondition]
>  BUG_ON(!con->in_msg ^ skip);
>      ^
>
> vim +1099 net/ceph/messenger_v1.c
>
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1033  static int 
> read_partial_message(struct ceph_connection *con)
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1034  {
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1035  struct ceph_msg *m = 
> con->in_msg;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1036  int size;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1037      int end;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1038  int ret;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1039  unsigned int 
> front_len, middle_len, data_len;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1040  bool do_datacrc = 
> !ceph_test_opt(from_msgr(con->msgr), NOCRC);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1041  bool need_sign = 
> (con->peer_features & CEPH_FEATURE_MSG_AUTH);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1042  u64 seq;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1043  u32 crc;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1044
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1045  
> dout("read_partial_message con %p msg %p\n", con, m);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1046
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1047  /* header */
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1048  size = sizeof 
> (con->in_hdr);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1049  end = size;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1050  ret = 
> read_partial(con, end, size, &con->in_hdr);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1051  if (ret <= 0)
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1052  return ret;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1053
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1054      crc = crc32c(0, 
> &con->in_hdr, offsetof(struct ceph_msg_header, crc));
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1055  if (cpu_to_le32(crc) 
> != con->in_hdr.crc) {
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1056  
> pr_err("read_partial_message bad hdr crc %u != expected %u\n",
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1057     crc, 
> con->in_hdr.crc);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1058  return 
> -EBADMSG;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1059  }
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1060
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1061      front_len = 
> le32_to_cpu(con->in_hdr.front_len);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1062  if (front_len > 
> CEPH_MSG_MAX_FRONT_LEN)
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1063  return -EIO;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1064  middle_len = 
> le32_to_cpu(con->in_hdr.middle_len);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1065  if (middle_len > 
> CEPH_MSG_MAX_MIDDLE_LEN)
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1066  return -EIO;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1067  data_len = 
> le32_to_cpu(con->in_hdr.data_len);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1068  if (data_len > 
> CEPH_MSG_MAX_DATA_LEN)
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1069  return -EIO;
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1070
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1071  /* verify seq# */
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1072  seq = 
> le64_to_cpu(con->in_hdr.seq);
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1073  if ((s64)seq - 
> (s64)con->in_seq < 1) {
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1074  
> pr_info("skipping %s%lld %s seq %lld expected %lld\n",
> 2f713615ddd9d805 Ilya Dryomov 2020-11-12  1075  
> ENTITY_NAME(con->peer_name),
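
The cppcheck complaint about BUG_ON(!con->in_msg ^ skip) is about
readability rather than behaviour: unary ! binds tighter than ^, so the
expression already means "(no message) XOR skip".  The sketch below uses
hypothetical function names and a plain pointer in place of con->in_msg
to show the original shape next to one equivalent spelling.  It is an
illustration only, not necessarily the change that was applied.

```c
#include <stdbool.h>
#include <stddef.h>

/* Original shape: !in_msg is evaluated first, then XORed with skip. */
bool demo_mismatch_xor(const void *in_msg, bool skip)
{
	return !in_msg ^ skip;
}

/* Same truth table, written as a comparison of two booleans. */
bool demo_mismatch_explicit(const void *in_msg, bool skip)
{
	return (in_msg == NULL) != skip;
}
```

Both return true when there is no message and skip is false, or when
there is a message and skip is true, that is, whenever "message present"
and "skip" fail to be mutually exclusive and exhaustive.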

[GIT PULL] Ceph fixes for 5.11-rc2

2020-12-30 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 5c8fe583cce542aa0b84adc939ce85293de36e5e:

  Linux 5.11-rc1 (2020-12-27 15:30:22 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.11-rc2

for you to fetch changes up to 664f1e259a982bf213f0cd8eea7616c89546585c:

  libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE (2020-12-28 20:34:33 +0100)


A fix for an edge case in MClientRequest encoding and a couple of
trivial fixups for the new msgr2 support.


Ilya Dryomov (4):
  ceph: reencode gid_list when reconnecting
  libceph: fix auth_signature buffer allocation in secure mode
  libceph: align session_key and con_secret to 16 bytes
  libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE

 fs/ceph/mds_client.c  | 53 ---
 include/linux/ceph/msgr.h |  4 ++--
 net/ceph/messenger_v2.c   | 15 +++---
 3 files changed, 36 insertions(+), 36 deletions(-)

[GIT PULL] Ceph updates for 5.11-rc1

2020-12-17 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 2c85ebc57b3e1817b6ce1a6b703928e113a90442:

  Linux 5.10 (2020-12-13 14:41:30 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.11-rc1

for you to fetch changes up to 2f0df6cfa325d7106b8a65bc0e02db1086e3f73b:

  libceph: drop ceph_auth_{create,update}_authorizer() (2020-12-14 23:21:50 +0100)

There is a build conflict caused by the split of crypto/sha.h into
crypto/sha1.h and crypto/sha2.h that affects net/ceph/messenger_v2.c.
The resolution is to include the latter, done in for-linus-merged
just in case.


The big ticket item here is support for msgr2 on-wire protocol, which
adds the option of full in-transit encryption using AES-GCM algorithm
(myself).  On top of that we have a series to avoid intermittent
errors during recovery with recover_session=clean and some MDS request
encoding work from Jeff, a cap handling fix and assorted observability
improvements from Luis and Xiubo and a good number of cleanups.  Luis
also ran into a corner case with quotas which sadly means that we are
back to denying cross-quota-realm renames.


Colin Ian King (1):
  ceph: remove redundant assignment to variable i

Ilya Dryomov (34):
  libceph: include middle_len in process_message() dout
  libceph: lower exponential backoff delay
  libceph: don't call reset_connection() on version/feature mismatches
  libceph: split protocol reset bits out of reset_connection()
  libceph: rename reset_connection() to ceph_con_reset_session()
  libceph: clear con->peer_global_seq on RESETSESSION
  libceph: remove redundant session reset log message
  libceph: drop msg->ack_stamp field
  libceph: handle discarding acked and requeued messages separately
  libceph: change ceph_msg_data_cursor_init() to take cursor
  libceph: change ceph_con_in_msg_alloc() to take hdr
  libceph: factor out ceph_con_get_out_msg()
  libceph: make sure our addr->port is zero and addr->nonce is non-zero
  libceph: don't export ceph_messenger_{init_fini}() to modules
  libceph: make con->state an int
  libceph: rename and export con->state states
  libceph: rename and export con->flags bits
  libceph: export zero_page
  libceph: export remaining protocol independent infrastructure
  libceph: separate msgr1 protocol implementation
  libceph: move msgr1 protocol implementation to its own file
  libceph: move msgr1 protocol specific fields to its own struct
  libceph: more insight into ticket expiry and invalidation
  libceph: safer en/decoding of cephx requests and replies
  libceph, ceph: incorporate nautilus cephx changes
  libceph: amend cephx init_protocol() and build_request()
  libceph: drop ac->ops->name field
  libceph: factor out finish_auth()
  libceph, ceph: get and handle cluster maps with addrvecs
  libceph, rbd: ignore addr->type while comparing in some cases
  libceph: introduce connection modes and ms_mode option
  libceph, ceph: implement msgr2.1 protocol (crc and secure modes)
  libceph, ceph: make use of __ceph_auth_get_authorizer() in msgr1
  libceph: drop ceph_auth_{create,update}_authorizer()

Jeff Layton (15):
  ceph: don't WARN when removing caps due to blocklisting
  ceph: make fsc->mount_state an int
  ceph: add new RECOVER mount_state when recovering session
  ceph: remove timeout on allowing reconnect after blocklisting
  ceph: queue MDS requests to REJECTED sessions when CLEANRECOVER is set
  ceph: fix up some warnings on W=1 builds
  ceph: acquire Fs caps when getting dir stats
  ceph: ensure we have Fs caps when fetching dir link count
  ceph: pass down the flags to grab_cache_page_write_begin
  ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails
  ceph: when filling trace, call ceph_get_inode outside of mutexes
  ceph: don't reach into request header for readdir info
  ceph: take a cred reference instead of tracking individual uid/gid
  ceph: clean up argument lists to __prepare_send_request and __send_request
  ceph: implement updated ceph_mds_request_head structure

Liu, Changcheng (1):
  libceph: remove unused port macros

Luis Henriques (4):
  ceph: fix race in concurrent __ceph_remove_cap invocations
  ceph: downgrade warning from mdsmap decode to debug
  Revert "ceph: allow rename operation under different quota realms"
  ceph: add ceph.caps vxattr

Xiubo Li (4):
  ceph: send dentry lease metrics to MDS daemon
  ceph: add status debugfs file
  ceph: add ceph.{cluster_fsid/client_id} vxattrs
  ceph: set osdmap epoch for setxattr

 drivers/block/rbd.c|8 +

[GIT PULL] Ceph fix for 5.10-rc3

2020-11-06 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 3cea11cd5e3b00d91caf0b4730194039b45c5891:

  Linux 5.10-rc2 (2020-11-01 14:43:51 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.10-rc3

for you to fetch changes up to 62575e270f661aba64778cbc5f354511cf9abb21:

  ceph: check session state after bumping session->s_seq (2020-11-04 20:55:49 +0100)


A fix for a potential stall on umount caused by the MDS dropping
our REQUEST_CLOSE message.  The code that handled this case was
inadvertently disabled in 5.9, this patch removes it entirely and
fixes the problem in a way that is consistent with ceph-fuse.


Jeff Layton (1):
  ceph: check session state after bumping session->s_seq

 fs/ceph/caps.c   |  2 +-
 fs/ceph/mds_client.c | 50 +++---
 fs/ceph/mds_client.h |  1 +
 fs/ceph/quota.c  |  2 +-
 fs/ceph/snap.c   |  2 +-
 5 files changed, 39 insertions(+), 18 deletions(-)

Re: [PATCH v2 31/39] docs: ABI: cleanup several ABI documents

2020-10-30 Thread Ilya Dryomov

On Fri, Oct 30, 2020 at 8:41 AM Mauro Carvalho Chehab
 wrote:
>
> There are some ABI documents that, while they don't generate
> any warnings, they have issues when parsed by get_abi.pl script
> on its output result.
>
> Address them, in order to provide a clean output.
>
> Acked-by: Jonathan Cameron  #for IIO
> Reviewed-by: Tom Rix  # for fpga-manager
> Reviewed-By: Kajol Jain # for 
> sysfs-bus-event_source-devices-hv_gpci and 
> sysfs-bus-event_source-devices-hv_24x7
> Acked-by: Oded Gabbay  # for Habanalabs
> Acked-by: Vaibhav Jain  # for sysfs-bus-papr-pmem
> Signed-off-by: Mauro Carvalho Chehab 
>
> [...]
>
>  Documentation/ABI/testing/sysfs-bus-rbd   |  37 ++-

Acked-by: Ilya Dryomov  # for rbd

Thanks,

Ilya

[GIT PULL] Ceph updates for 5.10-rc1

2020-10-21 Thread Ilya Dryomov

Hi Linus,

The following changes since commit bbf5c979011a099af5dc76498918ed7df445635b:

  Linux 5.9 (2020-10-11 14:15:50 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.10-rc1

for you to fetch changes up to 28e1581c3b4ea5f98530064a103c6217bedeea73:

  libceph: clear con->out_msg on Policy::stateful_server faults (2020-10-12 15:29:27 +0200)


We have:

- a patch that removes crush_workspace_mutex (myself).  CRUSH
  computations are no longer serialized and can run in parallel.

- a couple new filesystem client metrics for "ceph fs top" command
  (Xiubo Li)

- a fix for a very old messenger bug that affected the filesystem,
  marked for stable (myself)

- assorted fixups and cleanups throughout the codebase from Jeff
  and others.

Ilya Dryomov (9):
  libceph: multiple workspaces for CRUSH computations
  libceph, rbd, ceph: "blacklist" -> "blocklist"
  libceph: switch to the new "osd blocklist add" command
  ceph: add a note explaining session reject error string
  ceph: mark ceph_fmt_xattr() as printf-like for better type checking
  libceph: move a dout in queue_con_delay()
  libceph: fix ENTITY_NAME format suggestion
  libceph: format ceph_entity_addr nonces as unsigned
  libceph: clear con->out_msg on Policy::stateful_server faults

Jeff Layton (12):
  ceph: drop special-casing for ITER_PIPE in ceph_sync_read
  ceph: use kill_anon_super helper
  ceph: have ceph_writepages_start call pagevec_lookup_range_tag
  ceph: break out writeback of incompatible snap context to separate function
  ceph: don't call ceph_update_writeable_page from page_mkwrite
  ceph: fold ceph_sync_readpages into ceph_readpage
  ceph: fold ceph_sync_writepages into writepage_nounlock
  ceph: fold ceph_update_writeable_page into ceph_write_begin
  ceph: don't SetPageError on readpage errors
  ceph: drop separate mdsc argument from __send_cap
  ceph: break up send_cap_msg
  ceph: comment cleanups and clarifications

Luis Henriques (1):
  ceph: remove unnecessary return in switch statement

Matthew Wilcox (Oracle) (1):
  ceph: promote to unsigned long long before shifting

Xiubo Li (2):
  ceph: add ceph_sb_to_mdsc helper support to parse the mdsc
  ceph: metrics for opened files, pinned caps and opened inodes

Yan, Zheng (1):
  ceph: encode inodes' parent/d_name in cap reconnect message

Yanhu Cao (1):
  ceph: add column 'mds' to show caps in more user friendly

 Documentation/filesystems/ceph.rst |   6 +-
 drivers/block/rbd.c|   8 +-
 fs/ceph/addr.c | 416 +
 fs/ceph/caps.c | 128 
 fs/ceph/debugfs.c  |  18 +-
 fs/ceph/dir.c  |  20 +-
 fs/ceph/file.c |  85 +++-
 fs/ceph/inode.c|  10 +-
 fs/ceph/locks.c|   2 +-
 fs/ceph/mds_client.c   | 109 ++
 fs/ceph/mds_client.h   |   2 +-
 fs/ceph/metric.c   |  14 ++
 fs/ceph/metric.h   |   7 +
 fs/ceph/quota.c|  10 +-
 fs/ceph/snap.c |   2 +-
 fs/ceph/super.c|   8 +-
 fs/ceph/super.h|  13 +-
 fs/ceph/xattr.c|   3 +-
 include/linux/ceph/messenger.h |   2 +-
 include/linux/ceph/mon_client.h|   2 +-
 include/linux/ceph/osdmap.h|  14 +-
 include/linux/ceph/rados.h |   2 +-
 include/linux/crush/crush.h|   3 +
 net/ceph/messenger.c   |  13 +-
 net/ceph/mon_client.c  |  69 --
 net/ceph/osdmap.c  | 166 +--
 26 files changed, 689 insertions(+), 443 deletions(-)

[GIT PULL] Ceph fix for 5.9-rc5

2020-09-11 Thread Ilya Dryomov

Hi Linus,

The following changes since commit f4d51dffc6c01a9e94650d95ce0104964f8ae822:

  Linux 5.9-rc4 (2020-09-06 17:11:40 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.9-rc5

for you to fetch changes up to f44d04e696feaf13d192d942c4f14ad2e117065a:

  rbd: require global CAP_SYS_ADMIN for mapping and unmapping (2020-09-07 13:14:30 +0200)


A fix to add missing capability checks in rbd, marked for stable.


Ilya Dryomov (1):
  rbd: require global CAP_SYS_ADMIN for mapping and unmapping

 drivers/block/rbd.c | 12 
 1 file changed, 12 insertions(+)

Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;

2020-09-10 Thread Ilya Dryomov

 |  2 +-
>  drivers/tty/vt/vt_ioctl.c |  2 +-
>  drivers/usb/dwc3/core.c   |  2 +-
>  drivers/usb/gadget/legacy/inode.c |  2 +-
>  drivers/usb/gadget/udc/pxa25x_udc.c   |  4 ++--
>  drivers/usb/host/ohci-hcd.c   |  2 +-
>  drivers/usb/isp1760/isp1760-hcd.c |  2 +-
>  drivers/usb/musb/cppi_dma.c   |  2 +-
>  drivers/usb/phy/phy-fsl-usb.c     |  2 +-
>  drivers/video/fbdev/stifb.c   |  2 +-
>  fs/afs/yfsclient.c|  8 
>  fs/ceph/dir.c |  2 +-

For ceph:

Acked-by: Ilya Dryomov 

Thanks,

Ilya

Re: [PATCH AUTOSEL 5.8 25/42] ceph: fix inode number handling on arches with 32-bit ino_t

2020-08-31 Thread Ilya Dryomov

On Mon, Aug 31, 2020 at 5:30 PM Sasha Levin  wrote:
>
> From: Jeff Layton 
>
> [ Upstream commit ebce3eb2f7ef9f6ef01a60874ebd232450107c9a ]
>
> Tuan and Ulrich mentioned that they were hitting a problem on s390x,
> which has a 32-bit ino_t value, even though it's a 64-bit arch (for
> historical reasons).
>
> I think the current handling of inode numbers in the ceph driver is
> wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's
> actually not a problem. 32-bit arches can deal with 64-bit inode numbers
> just fine when userland code is compiled with LFS support (the common
> case these days).
>
> What we really want to do is just use 64-bit numbers everywhere, unless
> someone has mounted with the ino32 mount option. In that case, we want
> to ensure that we hash the inode number down to something that will fit
> in 32 bits before presenting the value to userland.
>
> Add new helper functions that do this, and only do the conversion before
> presenting these values to userland in getattr and readdir.
>
> The inode table hashvalue is changed to just cast the inode number to
> unsigned long, as low-order bits are the most likely to vary anyway.
>
> While it's not strictly required, we do want to put something in
> inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on
> the size of the ino_t type.
>
> NOTE: This is a user-visible change on 32-bit arches:
>
> 1/ inode numbers will be seen to have changed between kernel versions.
>32-bit arches will see large inode numbers now instead of the hashed
>ones they saw before.
>
> 2/ any really old software not built with LFS support may start failing
>stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we
>can do about these, but hopefully the intersection of people running
>such code on ceph will be very small.
>
> The workaround for both problems is to mount with "-o ino32".
>
> [ idryomov: changelog tweak ]
>
> URL: https://tracker.ceph.com/issues/46828
> Reported-by: Ulrich Weigand 
> Reported-and-Tested-by: Tuan Hoang1 
> Signed-off-by: Jeff Layton 
> Reviewed-by: "Yan, Zheng" 
> Signed-off-by: Ilya Dryomov 
> Signed-off-by: Sasha Levin 
> ---
>  fs/ceph/caps.c   | 14 -
>  fs/ceph/debugfs.c|  4 +--
>  fs/ceph/dir.c| 31 ---
>  fs/ceph/file.c   |  4 +--
>  fs/ceph/inode.c  | 19 ++--
>  fs/ceph/mds_client.h |  2 +-
>  fs/ceph/quota.c  |  4 +--
>  fs/ceph/super.h  | 73 +++-
>  8 files changed, 74 insertions(+), 77 deletions(-)
>
> diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> index 972c13aa42259..1206a481c5fc7 100644
> --- a/fs/ceph/caps.c
> +++ b/fs/ceph/caps.c
> @@ -886,8 +886,8 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, 
> int mask, int touch)
> int have = ci->i_snap_caps;
>
> if ((have & mask) == mask) {
> -   dout("__ceph_caps_issued_mask ino 0x%lx snap issued %s"
> -" (mask %s)\n", ci->vfs_inode.i_ino,
> +   dout("__ceph_caps_issued_mask ino 0x%llx snap issued %s"
> +" (mask %s)\n", ceph_ino(&ci->vfs_inode),
>  ceph_cap_string(have),
>  ceph_cap_string(mask));
> return 1;
> @@ -898,8 +898,8 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, 
> int mask, int touch)
> if (!__cap_is_valid(cap))
> continue;
> if ((cap->issued & mask) == mask) {
> -   dout("__ceph_caps_issued_mask ino 0x%lx cap %p issued 
> %s"
> -" (mask %s)\n", ci->vfs_inode.i_ino, cap,
> +   dout("__ceph_caps_issued_mask ino 0x%llx cap %p 
> issued %s"
> +" (mask %s)\n", ceph_ino(&ci->vfs_inode), cap,
>  ceph_cap_string(cap->issued),
>  ceph_cap_string(mask));
> if (touch)
> @@ -910,8 +910,8 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, 
> int mask, int touch)
> /* does a combination of caps satisfy mask? */
> have |= cap->issued;
> if ((have & mask) == mask) {
> -   dout("__ceph_caps_issued_mask ino 0x%lx combo issued 
> %s"
> -" (mask %s)\n", ci->vfs_inode.i_ino,
> +   dout("__ceph_caps_issued_mask ino 0x%

[GIT PULL] Ceph fixes for 5.9-rc3

2020-08-28 Thread Ilya Dryomov

Hi Linus,

The following changes since commit d012a7190fc1fd72ed48911e77ca97ba4521bccd:

  Linux 5.9-rc2 (2020-08-23 14:08:43 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.9-rc3

for you to fetch changes up to 496ceaf12432b3d136dcdec48424312e71359ea7:

  ceph: don't allow setlease on cephfs (2020-08-24 20:06:54 +0200)


We have an inode number handling change, prompted by s390x which is
a 64-bit architecture with a 32-bit ino_t, a patch to disallow leases
to avoid potential data integrity issues when CephFS is re-exported
via NFS or CIFS and a fix for the bulk of W=1 compilation warnings.


Ilya Dryomov (1):
  libceph: add __maybe_unused to DEFINE_CEPH_FEATURE

Jeff Layton (2):
  ceph: fix inode number handling on arches with 32-bit ino_t
  ceph: don't allow setlease on cephfs

 fs/ceph/caps.c | 14 
 fs/ceph/debugfs.c  |  4 +--
 fs/ceph/dir.c  | 31 +++-
 fs/ceph/file.c |  5 +--
 fs/ceph/inode.c| 19 +-
 fs/ceph/mds_client.h   |  2 +-
 fs/ceph/quota.c|  4 +--
 fs/ceph/super.h| 73 --
 include/linux/ceph/ceph_features.h |  8 ++---
 9 files changed, 79 insertions(+), 81 deletions(-)
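
The inode number handling change in this pull keeps 64-bit inode numbers everywhere and only hashes them down to 32 bits when the filesystem is mounted with ino32. A minimal userspace sketch of that idea follows. The helper names fold_ino_to_32() and present_ino() are hypothetical, and the XOR fold plus the "never return 0" rule are assumptions made for illustration, not a copy of the driver's helpers.

```c
#include <stdint.h>

/*
 * Hypothetical helper: fold a 64-bit inode number into 32 bits by
 * XOR-ing the high half into the low half.
 */
static uint32_t fold_ino_to_32(uint64_t ino64)
{
	uint32_t ino = (uint32_t)(ino64 & 0xffffffffu);

	ino ^= (uint32_t)(ino64 >> 32);
	if (ino == 0)
		ino = 2;	/* assumption: avoid presenting inode number 0 */
	return ino;
}

/*
 * Hypothetical helper: the value presented to userland.  The full
 * 64-bit number is used unless the ino32 option was requested.
 */
static uint64_t present_ino(uint64_t ino64, int ino32_opt)
{
	return ino32_opt ? fold_ino_to_32(ino64) : ino64;
}
```

In this sketch the conversion happens only at the point where a value is handed to userland (getattr and readdir in the commit message), so internal lookups keep working on the full 64-bit number.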

Re: [PATCH] rbd: Convert to use the preferred fallthrough macro

2020-08-19 Thread Ilya Dryomov

On Wed, Aug 19, 2020 at 3:03 PM Jens Axboe  wrote:
>
> On 8/19/20 1:53 AM, Miaohe Lin wrote:
> > Convert the uses of fallthrough comments to fallthrough macro.
>
> Applied, thanks.

Hi Jens,

This has already been folded into another patch in ceph-client.git.
Please drop it.

Thanks,

Ilya

Re: [PATCH] ceph: Convert to use the preferred fallthrough macro

2020-08-19 Thread Ilya Dryomov

On Wed, Aug 19, 2020 at 10:53 AM Miaohe Lin  wrote:
>
> Convert the uses of fallthrough comments to fallthrough macro.
>
> Signed-off-by: Hongxiang Lou 
> Signed-off-by: Miaohe Lin 
> ---
>  fs/ceph/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index d51c3f2fdca0..30cd00265181 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -252,7 +252,7 @@ static int ceph_init_file(struct inode *inode, struct 
> file *file, int fmode)
> case S_IFREG:
> ceph_fscache_register_inode_cookie(inode);
> ceph_fscache_file_set_cookie(inode, file);
> -   /* fall through */
> +   fallthrough;
> case S_IFDIR:
> ret = ceph_init_file_info(inode, file, fmode,
> S_ISDIR(inode->i_mode));
> --
> 2.19.1
>

Hi Miaohe,

I've already done that, folding into your previous patch:

  https://github.com/ceph/ceph-client/commit/3f19ae89547df1b8ccba359a2f7ddba0f108ffbd

Thanks,

Ilya
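
For readers unfamiliar with the conversion discussed in this thread, here is a small userspace sketch of a fallthrough pseudo-keyword replacing a "fall through" comment. The macro definition below is an approximation written for this example (it assumes a GCC- or Clang-style compiler), and pack_tail() is a hypothetical function loosely modelled on the tail handling of a byte-at-a-time hash.

```c
#include <stddef.h>
#include <stdint.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Approximation for this example; expands to a no-op if unsupported. */
#if __has_attribute(__fallthrough__)
# define fallthrough __attribute__((__fallthrough__))
#else
# define fallthrough do {} while (0)	/* fallthrough */
#endif

/* Hypothetical: pack up to three trailing bytes into one word. */
static uint32_t pack_tail(const unsigned char *k, size_t len)
{
	uint32_t a = 0;

	switch (len) {
	case 3:
		a += (uint32_t)k[2] << 16;
		fallthrough;
	case 2:
		a += (uint32_t)k[1] << 8;
		fallthrough;
	case 1:
		a += k[0];
		break;
	default:
		break;
	}
	return a;
}
```

Unlike a comment, the statement form is visible to the compiler, so an accidental fall-through elsewhere can still be flagged by -Wimplicit-fallthrough.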

Re: [RFC PATCH] ceph: Delete features that are not used in the kernel

2020-08-19 Thread Ilya Dryomov

On Wed, Aug 19, 2020 at 9:57 AM Leon Romanovsky  wrote:
>
> From: Leon Romanovsky 
>
> ceph_features.h declares features that are not in use in kernel
> code. This causes compilation warnings such as the following in
> almost every kernel build.
>
> ./include/linux/ceph/ceph_features.h:14:24: warning: 'CEPH_FEATURE_UID' 
> defined but not used [-Wunused-const-variable=]
>14 |  static const uint64_t CEPH_FEATURE_##name = (1ULL<   |^
> ./include/linux/ceph/ceph_features.h:75:1: note: in expansion of macro 
> 'DEFINE_CEPH_FEATURE'
>75 | DEFINE_CEPH_FEATURE( 0, 1, UID)
>   | ^~~
>
> The upstream kernel indeed doesn't use any of them, so delete them.
>
> Signed-off-by: Leon Romanovsky 
> ---
> I'm sending this as an RFC because the patch is probably wrong, but I
> would like to bring the existing problem to your attention and ask
> for an acceptable solution.

Hi Leon,

Yes, removing unused feature definitions is wrong.  Annotating them
as potentially unused would be much better -- I'll send a patch.

I don't think any of us builds with W=1, so these things don't get
noticed.

Thanks,

Ilya
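
The "annotate as potentially unused" approach suggested above can be sketched in userspace C as follows. MAYBE_UNUSED, DEFINE_FEATURE, the FEATURE_ names and peer_supports() are hypothetical stand-ins chosen for this example, and the attribute spelling assumes a GCC- or Clang-style compiler. Only the UID feature at bit 0 is taken from the warning quoted in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for an "unused is fine" annotation. */
#define MAYBE_UNUSED __attribute__((__unused__))

/*
 * Each feature becomes a static constant.  The annotation keeps the
 * definition in place while silencing "defined but not used" warnings
 * in translation units that never reference it.
 */
#define DEFINE_FEATURE(bit, name) \
	static const uint64_t FEATURE_##name MAYBE_UNUSED = (1ULL << (bit));

DEFINE_FEATURE(0, UID)
DEFINE_FEATURE(5, EXAMPLE)	/* hypothetical, never referenced below */

/* Hypothetical check: does the peer advertise the given feature bit? */
static int peer_supports(uint64_t peer_features, uint64_t feature)
{
	return (peer_features & feature) == feature;
}
```

The point of the design is that the list of feature bits stays complete and in sync with the protocol definition, even for bits the kernel client does not act on.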

Re: [PATCH] libceph: Convert to use the preferred fallthrough macro

2020-08-19 Thread Ilya Dryomov

On Tue, Aug 18, 2020 at 9:56 PM Jeff Layton  wrote:
>
> On Tue, 2020-08-18 at 08:26 -0400, Miaohe Lin wrote:
> > Convert the uses of fallthrough comments to fallthrough macro.
> >
> > Signed-off-by: Miaohe Lin 
> > ---
> >  net/ceph/ceph_hash.c| 20 ++--
> >  net/ceph/crush/mapper.c |  2 +-
> >  net/ceph/messenger.c|  4 ++--
> >  net/ceph/mon_client.c   |  2 +-
> >  net/ceph/osd_client.c   |  4 ++--
> >  5 files changed, 16 insertions(+), 16 deletions(-)
> >
> > diff --git a/net/ceph/ceph_hash.c b/net/ceph/ceph_hash.c
> > index 81e1e006c540..16a47c0eef37 100644
> > --- a/net/ceph/ceph_hash.c
> > +++ b/net/ceph/ceph_hash.c
> > @@ -50,35 +50,35 @@ unsigned int ceph_str_hash_rjenkins(const char *str, 
> > unsigned int length)
> >   switch (len) {
> >   case 11:
> >   c = c + ((__u32)k[10] << 24);
> > - /* fall through */
> > + fallthrough;
> >   case 10:
> >   c = c + ((__u32)k[9] << 16);
> > - /* fall through */
> > + fallthrough;
> >   case 9:
> >   c = c + ((__u32)k[8] << 8);
> >   /* the first byte of c is reserved for the length */
> > - /* fall through */
> > + fallthrough;
> >   case 8:
> >   b = b + ((__u32)k[7] << 24);
> > - /* fall through */
> > + fallthrough;
> >   case 7:
> >   b = b + ((__u32)k[6] << 16);
> > - /* fall through */
> > + fallthrough;
> >   case 6:
> >   b = b + ((__u32)k[5] << 8);
> > - /* fall through */
> > + fallthrough;
> >   case 5:
> >   b = b + k[4];
> > - /* fall through */
> > + fallthrough;
> >   case 4:
> >   a = a + ((__u32)k[3] << 24);
> > - /* fall through */
> > + fallthrough;
> >   case 3:
> >   a = a + ((__u32)k[2] << 16);
> > - /* fall through */
> > + fallthrough;
> >   case 2:
> >   a = a + ((__u32)k[1] << 8);
> > - /* fall through */
> > + fallthrough;
> >   case 1:
> >   a = a + k[0];
> >   /* case 0: nothing left to add */
> > diff --git a/net/ceph/crush/mapper.c b/net/ceph/crush/mapper.c
> > index 07e5614eb3f1..7057f8db4f99 100644
> > --- a/net/ceph/crush/mapper.c
> > +++ b/net/ceph/crush/mapper.c
> > @@ -987,7 +987,7 @@ int crush_do_rule(const struct crush_map *map,
> >   case CRUSH_RULE_CHOOSELEAF_FIRSTN:
> >   case CRUSH_RULE_CHOOSE_FIRSTN:
> >   firstn = 1;
> > - /* fall through */
> > + fallthrough;
> >   case CRUSH_RULE_CHOOSELEAF_INDEP:
> >   case CRUSH_RULE_CHOOSE_INDEP:
> >   if (wsize == 0)
> > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> > index 27d6ab11f9ee..bdfd66ba3843 100644
> > --- a/net/ceph/messenger.c
> > +++ b/net/ceph/messenger.c
> > @@ -412,7 +412,7 @@ static void ceph_sock_state_change(struct sock *sk)
> >   switch (sk->sk_state) {
> >   case TCP_CLOSE:
> >   dout("%s TCP_CLOSE\n", __func__);
> > - /* fall through */
> > + fallthrough;
> >   case TCP_CLOSE_WAIT:
> >   dout("%s TCP_CLOSE_WAIT\n", __func__);
> >   con_sock_state_closing(con);
> > @@ -2751,7 +2751,7 @@ static int try_read(struct ceph_connection *con)
> >   switch (ret) {
> >   case -EBADMSG:
> >   con->error_msg = "bad crc/signature";
> > - /* fall through */
> > + fallthrough;
> >   case -EBADE:
> >   ret = -EIO;
> >   break;
> > diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c
> > index 3d8c8015e976..d633a0aeaa55 100644
> > --- a/net/ceph/mon_client.c
> > +++ b/net/ceph/mon_client.c
> > @@ -1307,7 +1307,7 @@ static struct ceph_msg *mon_alloc_msg(struct 
> > ceph_connection *con,
> >* request had a non-zero tid.  Work around this weirdness
> >* by allocating a new message.
> >*/
> > - /* fall through */
> > + fallthrough;
> >   case CEPH_MSG_MON_MAP:
> >   case CEPH_MSG_MDS_MAP:
> >   case CEPH_MSG_OSD_MAP:
> > diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> > index e4fbcad6e7d8..7901ab6c79fd 100644
> > --- a/net/ceph/osd_client.c
> > +++ b/net/ceph/osd_client.c
> > @@ -3854,7 +3854,7 @@ static void scan_requests(struct ceph_osd *osd,
> >   if (!force_resend && !force_resend_writes)
> >   break;
> >
> > - /* fall through */
> > + fallthrough;
> >   case CALC_TARGET_NEED_RESEND:
> >

Re: [PATCH V2 6/6] ceph_debug: Remove now unused dout macro definitions

2020-08-17 Thread Ilya Dryomov

On Mon, Aug 17, 2020 at 3:34 AM Joe Perches  wrote:
>
> All the uses have been converted to pr_debug, so remove these.
>
> Signed-off-by: Joe Perches 
> ---
>  include/linux/ceph/ceph_debug.h | 30 --
>  1 file changed, 30 deletions(-)
>
> diff --git a/include/linux/ceph/ceph_debug.h b/include/linux/ceph/ceph_debug.h
> index d5a5da838caf..81c0d7195f1e 100644
> --- a/include/linux/ceph/ceph_debug.h
> +++ b/include/linux/ceph/ceph_debug.h
> @@ -6,34 +6,4 @@
>
>  #include 
>
> -#ifdef CONFIG_CEPH_LIB_PRETTYDEBUG
> -
> -/*
> - * wrap pr_debug to include a filename:lineno prefix on each line.
> - * this incurs some overhead (kernel size and execution time) due to
> - * the extra function call at each call site.
> - */
> -
> -# if defined(DEBUG) || defined(CONFIG_DYNAMIC_DEBUG)
> -#  define dout(fmt, ...)   \
> -   pr_debug("%.*s %12.12s:%-4d : " fmt,\
> -8 - (int)sizeof(KBUILD_MODNAME), "",   \
> -kbasename(__FILE__), __LINE__, ##__VA_ARGS__)
> -# else
> -/* faux printk call just to see any compiler warnings. */
> -#  define dout(fmt, ...)   do {\
> -   if (0)  \
> -   printk(KERN_DEBUG fmt, ##__VA_ARGS__);  \
> -   } while (0)
> -# endif
> -
> -#else
> -
> -/*
> - * or, just wrap pr_debug
> - */
> -# define dout(fmt, ...)pr_debug(" " fmt, ##__VA_ARGS__)
> -
> -#endif
> -
>  #endif
> --
> 2.26.0
>

Hi Joe,

Yeah, roughly the same thing can be achieved with +flmp instead
of just +p with PRETTYDEBUG, but PRETTYDEBUG formatting actually
predates those flags and some of us still use bash scripts from
back then.  We also have a few guides and blog entries with just
+p, but that's not a big deal.

I'd be fine with removing CONFIG_CEPH_LIB_PRETTYDEBUG since it's
disabled by default and in all major distributions, but I'm not a
fan of a wide-sweeping dout -> pr_debug change.  We do extensive
backporting to older kernels and these kinds of changes are rather
annoying.  dout is shorter to type too ;)

I know that in some cases the function names are outdated or
duplicated, but I prefer fixing them gradually, along with actual
code changes in the area (i.e. similar to whitespace).

Thanks,

Ilya
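
As an illustration of the filename:lineno prefix that the PRETTYDEBUG variant of dout adds, here is a userspace sketch. base_name(), format_debug_prefix() and dbg() are hypothetical names for this example. Only the "%12.12s:%-4d : " portion of the format is taken from the quoted macro; the module-name padding is left out, and fprintf() stands in for pr_debug().

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical: strip leading directories from a path. */
static const char *base_name(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash ? slash + 1 : path;
}

/* Build a fixed-width "file:line : " prefix into buf. */
static int format_debug_prefix(char *buf, size_t size,
			       const char *file, int line)
{
	return snprintf(buf, size, "%12.12s:%-4d : ", base_name(file), line);
}

/* Hypothetical debug macro; ##__VA_ARGS__ is a GNU extension. */
#define dbg(fmt, ...) do {						\
	char prefix_[32];						\
	format_debug_prefix(prefix_, sizeof(prefix_), __FILE__, __LINE__); \
	fprintf(stderr, "%s" fmt, prefix_, ##__VA_ARGS__);		\
} while (0)
```

The fixed-width fields keep the message text aligned across call sites, which is what makes the output convenient for scripts to process.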

[GIT PULL] Ceph updates for 5.9-rc1

2020-08-12 Thread Ilya Dryomov

Hi Linus,

The following changes since commit bcf876870b95592b52519ed4aafcf9d95999bc9c:

  Linux 5.8 (2020-08-02 14:21:45 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.9-rc1

for you to fetch changes up to 02e37571f9e79022498fd0525c073b07e9d9ac69:

  ceph: handle zero-length feature mask in session messages (2020-08-05 17:47:07 +0200)


Xiubo has completed his work on filesystem client metrics; they are
sent to all available MDSes once per second now.  Other than that, we
have a lot of fixes and cleanups all around the filesystem, including
a tweak to cut down on MDS request resends in multi-MDS setups from
Yanhu and fixups for SELinux symlink labeling and MClientSession
message decoding from Jeff.


Alexander A. Klimov (1):
  libceph: replace HTTP links with HTTPS ones

Colin Ian King (1):
  ceph: remove redundant initialization of variable mds

Ilya Dryomov (2):
  libceph: use target_copy() in send_linger()
  libceph: dump class and method names on method calls

Jeff Layton (5):
  ceph: clean up and optimize ceph_check_delayed_caps()
  libceph: just have osd_req_op_init() return a pointer
  ceph: set sec_context xattr on symlink creation
  ceph: move sb->wb_pagevec_pool to be a global mempool
  ceph: handle zero-length feature mask in session messages

Jia Yang (1):
  ceph: remove unused variables in ceph_mdsmap_decode()

Randy Dunlap (1):
  ceph: delete repeated words in fs/ceph/

Xiubo Li (9):
  ceph: add check_session_state() helper and make it global
  ceph: add global total_caps to count the mdsc's total caps number
  ceph: switch to WARN_ON_ONCE in encode_supported_features()
  ceph: fix potential mdsc use-after-free crash
  ceph: do not access the kiocb after aio requests
  ceph: check the sesion state and return false in case it is closed
  ceph: periodically send perf metrics to MDSes
  ceph: send client provided metric flags in client metadata
  ceph: fix use-after-free for fsc->mdsc

Xu Wang (1):
  ceph: remove unnecessary cast in kfree()

Yanhu Cao (1):
  ceph: use frag's MDS in either mode

 fs/ceph/Kconfig|   2 +-
 fs/ceph/addr.c |  23 +++--
 fs/ceph/caps.c |  12 +--
 fs/ceph/debugfs.c  |  16 +---
 fs/ceph/dir.c  |   4 +
 fs/ceph/file.c |   5 +-
 fs/ceph/mds_client.c   | 184 +
 fs/ceph/mds_client.h   |   7 +-
 fs/ceph/mdsmap.c   |  10 +-
 fs/ceph/metric.c   | 149 ++
 fs/ceph/metric.h   |  91 ++
 fs/ceph/super.c|  64 ++---
 fs/ceph/super.h|   6 +-
 fs/ceph/xattr.c|  12 +--
 include/linux/ceph/ceph_features.h |   2 +-
 include/linux/ceph/ceph_fs.h   |   1 +
 include/linux/ceph/libceph.h   |   1 +
 include/linux/ceph/osd_client.h|   2 +-
 include/linux/crush/crush.h|   2 +-
 net/ceph/Kconfig   |   2 +-
 net/ceph/ceph_hash.c   |   2 +-
 net/ceph/crush/hash.c  |   2 +-
 net/ceph/crush/mapper.c|   2 +-
 net/ceph/debugfs.c |   3 +
 net/ceph/osd_client.c  |  43 -
 25 files changed, 511 insertions(+), 136 deletions(-)

Re: [PATCH] Replace HTTP links with HTTPS ones: CEPH COMMON CODE (LIBCEPH)

2020-07-08 Thread Ilya Dryomov

On Wed, Jul 8, 2020 at 8:53 AM Alexander A. Klimov
 wrote:
>
> Rationale:
> Reduces attack surface on kernel devs opening the links for MITM
> as HTTPS traffic is much harder to manipulate.
>
> Deterministic algorithm:
> For each file:
>   If not .svg:
> For each line:
>   If doesn't contain `\bxmlns\b`:
> For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
>   If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
> If both the HTTP and HTTPS versions
> return 200 OK and serve the same content:
>   Replace HTTP with HTTPS.
>
> Signed-off-by: Alexander A. Klimov 
> ---
>  Continuing my work started at 93431e0607e5.
>  See also: git log --oneline '--author=Alexander A. Klimov 
> ' v5.7..master
>  (Actually letting a shell for loop submit all this stuff for me.)
>
>  If there are any URLs to be removed completely or at least not HTTPSified:
>  Just clearly say so and I'll *undo my change*.
>  See also: https://lkml.org/lkml/2020/6/27/64
>
>  If there are any valid, but yet not changed URLs:
>  See: https://lkml.org/lkml/2020/6/26/837
>
>  If you apply the patch, please let me know.
>
>
>  net/ceph/ceph_hash.c| 2 +-
>  net/ceph/crush/hash.c   | 2 +-
>  net/ceph/crush/mapper.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ceph/ceph_hash.c b/net/ceph/ceph_hash.c
> index 9a5850f264ed..81e1e006c540 100644
> --- a/net/ceph/ceph_hash.c
> +++ b/net/ceph/ceph_hash.c
> @@ -4,7 +4,7 @@
>
>  /*
>   * Robert Jenkin's hash function.
> - * http://burtleburtle.net/bob/hash/evahash.html
> + * https://burtleburtle.net/bob/hash/evahash.html
>   * This is in the public domain.
>   */
>  #define mix(a, b, c)   \
> diff --git a/net/ceph/crush/hash.c b/net/ceph/crush/hash.c
> index e5cc603cdb17..fe79f6d2d0db 100644
> --- a/net/ceph/crush/hash.c
> +++ b/net/ceph/crush/hash.c
> @@ -7,7 +7,7 @@
>
>  /*
>   * Robert Jenkins' function for mixing 32-bit values
> - * http://burtleburtle.net/bob/hash/evahash.html
> + * https://burtleburtle.net/bob/hash/evahash.html
>   * a, b = random bits, c = input and output
>   */
>  #define crush_hashmix(a, b, c) do {\
> diff --git a/net/ceph/crush/mapper.c b/net/ceph/crush/mapper.c
> index 3f323ed9df52..07e5614eb3f1 100644
> --- a/net/ceph/crush/mapper.c
> +++ b/net/ceph/crush/mapper.c
> @@ -298,7 +298,7 @@ static __u64 crush_ln(unsigned int xin)
>   *
>   * for reference, see:
>   *
> - * 
> http://en.wikipedia.org/wiki/Exponential_distribution#Distribution_of_the_minimum_of_exponential_random_variables
> + * 
> https://en.wikipedia.org/wiki/Exponential_distribution#Distribution_of_the_minimum_of_exponential_random_variables
>   *
>   */
>

Applied with a couple more link fixes folded in.

Thanks,

Ilya

Re: [PATCH] fs: ceph: Remove unnecessary cast in kfree()

2020-07-08 Thread Ilya Dryomov

On Wed, Jul 8, 2020 at 9:27 AM Xu Wang  wrote:
>
> Remove unnecessary casts in the argument to kfree.
>
> Signed-off-by: Xu Wang 
> ---
>  fs/ceph/xattr.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> index 71ee34d160c3..3a733ac33d9b 100644
> --- a/fs/ceph/xattr.c
> +++ b/fs/ceph/xattr.c
> @@ -497,10 +497,10 @@ static int __set_xattr(struct ceph_inode_info *ci,
> kfree(*newxattr);
> *newxattr = NULL;
> if (xattr->should_free_val)
> -   kfree((void *)xattr->val);
> +   kfree(xattr->val);
>
> if (update_xattr) {
> -   kfree((void *)name);
> +   kfree(name);
> name = xattr->name;
> }
> ci->i_xattrs.names_size -= xattr->name_len;
> @@ -566,9 +566,9 @@ static void __free_xattr(struct ceph_inode_xattr *xattr)
> BUG_ON(!xattr);
>
> if (xattr->should_free_name)
> -   kfree((void *)xattr->name);
> +   kfree(xattr->name);
> if (xattr->should_free_val)
> -   kfree((void *)xattr->val);
> +   kfree(xattr->val);
>
> kfree(xattr);
>  }
> @@ -582,9 +582,9 @@ static int __remove_xattr(struct ceph_inode_info *ci,
> rb_erase(&xattr->node, &ci->i_xattrs.index);
>
> if (xattr->should_free_name)
> -   kfree((void *)xattr->name);
> +   kfree(xattr->name);
> if (xattr->should_free_val)
> -   kfree((void *)xattr->val);
> +   kfree(xattr->val);
>
> ci->i_xattrs.names_size -= xattr->name_len;
> ci->i_xattrs.vals_size -= xattr->val_len;

Applied.

Thanks,

Ilya

[GIT PULL] Ceph fixes for 5.8-rc2

2020-06-19 Thread Ilya Dryomov

Hi Linus,

The following changes since commit b3a9e3b9622ae10064826dccb4f7a52bd88c7407:

  Linux 5.8-rc1 (2020-06-14 12:45:04 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.8-rc2

for you to fetch changes up to 7ed286f3e061ee394782bd9fb4ed96bff0b5a021:

  libceph: don't omit used_replica in target_copy() (2020-06-16 16:02:08 +0200)


An important follow-up for replica reads support that went into -rc1
and two target_copy() fixups.


Ilya Dryomov (3):
  libceph: move away from global osd_req_flags
  libceph: don't omit recovery_deletes in target_copy()
  libceph: don't omit used_replica in target_copy()

 drivers/block/rbd.c  |  4 +++-
 include/linux/ceph/libceph.h |  4 ++--
 net/ceph/ceph_common.c   | 14 ++
 net/ceph/osd_client.c|  9 -
 4 files changed, 15 insertions(+), 16 deletions(-)

[GIT PULL] Ceph updates for 5.8-rc1

2020-06-08 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162:

  Linux 5.7 (2020-05-31 16:49:15 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.8-rc1

for you to fetch changes up to dc1dad8e1a612650b1e786e992cb0c6e101e226a:

  rbd: compression_hint option (2020-06-01 23:32:35 +0200)


The highlights are:

- OSD/MDS latency and caps cache metrics infrastructure for the
  filesystem (Xiubo Li).  Currently available through debugfs and
  will be periodically sent to the MDS in the future.

- support for replica reads (balanced and localized reads) for
  rbd and the filesystem (myself).  The default remains to always
  read from primary; users can opt in with the new crush_location
  and read_from_replica options.  Note that reading from replica
  is safe for general use only since Octopus.

- support for RADOS allocation hint flags (myself).  Currently
  used by rbd to propagate the compressible/incompressible hint
  given with the new compression_hint map option and ready for
  passing on more advanced hints, e.g. based on fadvise() from
  the filesystem.

- support for efficient cross-quota-realm renames (Luis Henriques)

- assorted cap handling improvements and cleanups, particularly
  untangling some of the locking (Jeff Layton)


Gustavo A. R. Silva (1):
  libceph, rbd: replace zero-length array with flexible-array

Ilya Dryomov (7):
  libceph: add non-asserting rbtree insertion helper
  libceph: decode CRUSH device/bucket types and names
  libceph: crush_location infrastructure
  libceph: support for balanced and localized reads
  libceph: read_from_replica option
  libceph: support for alloc hint flags
  rbd: compression_hint option

Jeff Layton (11):
  ceph: reorganize __send_cap for less spinlock abuse
  ceph: split up __finish_cap_flush
  ceph: add comments for handle_cap_flush_ack logic
  ceph: don't release i_ceph_lock in handle_cap_trunc
  ceph: don't take i_ceph_lock in handle_cap_import
  ceph: document what protects i_dirty_item and i_flushing_item
  ceph: fix potential race in ceph_check_caps
  ceph: throw a warning if we destroy session with mutex still locked
  ceph: convert mdsc->cap_dirty to a per-session list
  ceph: request expedited service on session's last cap flush
  ceph: ceph_kick_flushing_caps needs the s_mutex

Luis Henriques (3):
  ceph: normalize 'delta' parameter usage in check_quota_exceeded
  ceph: allow rename operation under different quota realms
  ceph: don't return -ESTALE if there's still an open file

Xiubo Li (6):
  ceph: add dentry lease metric support
  ceph: add caps perf metric for each superblock
  ceph: add read/write latency metric support
  ceph: add metadata perf metric support
  ceph: make sure mdsc->mutex is nested in s->s_mutex to fix dead lock
  ceph: skip checking caps when session reconnecting and releasing reqs

Yan, Zheng (1):
  ceph: reset i_requested_max_size if file write is not wanted

 drivers/block/rbd.c |  44 -
 drivers/block/rbd_types.h   |   2 +-
 fs/ceph/Makefile|   2 +-
 fs/ceph/acl.c   |   2 +-
 fs/ceph/addr.c  |  20 ++
 fs/ceph/caps.c  | 425 ++--
 fs/ceph/debugfs.c   | 100 +-
 fs/ceph/dir.c   |  26 ++-
 fs/ceph/export.c|   9 +-
 fs/ceph/file.c  |  30 +++
 fs/ceph/inode.c |   4 +-
 fs/ceph/mds_client.c|  48 -
 fs/ceph/mds_client.h|  15 +-
 fs/ceph/metric.c| 148 ++
 fs/ceph/metric.h|  62 ++
 fs/ceph/quota.c |  62 +-
 fs/ceph/super.h |  34 +++-
 fs/ceph/xattr.c |   4 +-
 include/linux/ceph/libceph.h|  13 +-
 include/linux/ceph/mon_client.h |   2 +-
 include/linux/ceph/osd_client.h |   8 +-
 include/linux/ceph/osdmap.h |  19 +-
 include/linux/ceph/rados.h  |  14 ++
 include/linux/crush/crush.h |  14 +-
 net/ceph/ceph_common.c  |  75 +++
 net/ceph/crush/crush.c  |   3 +-
 net/ceph/debugfs.c  |   6 +-
 net/ceph/osd_client.c   | 103 +-
 net/ceph/osdmap.c   | 363 +-
 29 files changed, 1405 insertions(+), 252 deletions(-)
 create mode 100644 fs/ceph/metric.c
 create mode 100644 fs/ceph/metric.h

[GIT PULL] Ceph fixes for 5.7-rc8

2020-05-29 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 9cb1fd0efd195590b828b9b865421ad345a4a145:

  Linux 5.7-rc7 (2020-05-24 15:32:54 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.7-rc8

for you to fetch changes up to fb33c114d3ed5bdac230716f5b0a93b56b92a90d:

  ceph: flush release queue when handling caps for unknown inode (2020-05-27 13:03:57 +0200)


Cache tiering and cap handling fixups, both marked for stable.


Jeff Layton (1):
  ceph: flush release queue when handling caps for unknown inode

Jerry Lee (1):
  libceph: ignore pool overlay and cache logic on redirects

 fs/ceph/caps.c| 2 +-
 net/ceph/osd_client.c | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

[PATCH v3] vsprintf: don't obfuscate NULL and error pointers

2020-05-19 Thread Ilya Dryomov

I don't see what security concern is addressed by obfuscating NULL
and IS_ERR() error pointers, printed with %p/%pK.  Given the number
of sites where %p is used (over 10000) and the fact that NULL pointers
aren't uncommon, it probably wouldn't take long for an attacker to
find the hash that corresponds to 0.  Although harder, the same goes
for most common error values, such as -1, -2, -11, -14, etc.

The NULL part actually fixes a regression: NULL pointers weren't
obfuscated until commit 3e5903eb9cff ("vsprintf: Prevent crash when
dereferencing invalid pointers") which went into 5.2.  I'm tacking
the IS_ERR() part on here because error pointers won't leak kernel
addresses and printing them as pointers shouldn't be any different
from e.g. %d with PTR_ERR_OR_ZERO().  Obfuscating them just makes
debugging based on existing pr_debug and friends excruciating.

Note that the "always print 0's for %pK when kptr_restrict == 2"
behaviour which goes way back is left as is.

Example output with the patch applied:

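                            ptr         error-ptr              NULL
%p:            0000000001f8cc5b  fffffffffffffff2  0000000000000000
%pK, kptr = 0: 0000000001f8cc5b  fffffffffffffff2  0000000000000000
%px:           ffff888048c04020  fffffffffffffff2  0000000000000000
%pK, kptr = 1: ffff888048c04020  fffffffffffffff2  0000000000000000
%pK, kptr = 2: 0000000000000000  0000000000000000  0000000000000000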

Fixes: 3e5903eb9cff ("vsprintf: Prevent crash when dereferencing invalid pointers")
Signed-off-by: Ilya Dryomov 
Reviewed-by: Petr Mladek 
Reviewed-by: Sergey Senozhatsky 
Acked-by: Steven Rostedt (VMware) 
Acked-by: Linus Torvalds 
---
 lib/test_printf.c | 19 ++-
 lib/vsprintf.c|  7 +++
 2 files changed, 25 insertions(+), 1 deletion(-)

Hi Petr,

This just came up again, please consider sending this to Linus
for 5.7.

Prior discussion was split in three threads and revolved around the
vision for how lib/test_printf.c should be structured between Rasmus
and yourself.  The fix itself wasn't disputed and has several acks.

If you want to restructure the test suite before adding any new
test cases, v1 doesn't have them, but I'm reposting with test cases
because I think it's best to add them right away to prevent further
regressions.

v3:
- don't use EAGAIN macro in error_pointer() test case as the
  actual error code varies between architectures

v2:
- fix null_pointer() test case (it didn't catch the original
  regression because test_hashed() doesn't really test much)
  and add error_pointer() test case

diff --git a/lib/test_printf.c b/lib/test_printf.c
index 2d9f520d2f27..6b1622f4d7c2 100644
--- a/lib/test_printf.c
+++ b/lib/test_printf.c
@@ -214,6 +214,7 @@ test_string(void)
 #define PTR_STR "ffff0123456789ab"
 #define PTR_VAL_NO_CRNG "(____ptrval____)"
 #define ZEROS "00000000"   /* hex 32 zero bits */
+#define ONES "ffffffff"    /* hex 32 one bits */
 
 static int __init
 plain_format(void)
@@ -245,6 +246,7 @@ plain_format(void)
 #define PTR_STR "456789ab"
 #define PTR_VAL_NO_CRNG "(ptrval)"
 #define ZEROS ""
+#define ONES ""
 
 static int __init
 plain_format(void)
@@ -330,14 +332,28 @@ test_hashed(const char *fmt, const void *p)
test(buf, fmt, p);
 }
 
+/*
+ * NULL pointers aren't hashed.
+ */
 static void __init
 null_pointer(void)
 {
-   test_hashed("%p", NULL);
+   test(ZEROS "00000000", "%p", NULL);
test(ZEROS "00000000", "%px", NULL);
test("(null)", "%pE", NULL);
 }
 
+/*
+ * Error pointers aren't hashed.
+ */
+static void __init
+error_pointer(void)
+{
+   test(ONES "fffffff5", "%p", ERR_PTR(-11));
+   test(ONES "fffffff5", "%px", ERR_PTR(-11));
+   test("(efault)", "%pE", ERR_PTR(-11));
+}
+
 #define PTR_INVALID ((void *)0x000000ab)
 
 static void __init
@@ -649,6 +665,7 @@ test_pointer(void)
 {
plain();
null_pointer();
+   error_pointer();
invalid_pointer();
symbol_ptr();
kernel_ptr();
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 7c488a1ce318..f0f0522cd5a7 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -794,6 +794,13 @@ static char *ptr_to_id(char *buf, char *end, const void 
*ptr,
unsigned long hashval;
int ret;
 
+   /*
+* Print the real pointer value for NULL and error pointers,
+* as they are not actual addresses.
+*/
+   if (IS_ERR_OR_NULL(ptr))
+   return pointer_string(buf, end, ptr, spec);
+
/* When debugging early boot use non-cryptographically secure hash. */
if (unlikely(debug_boot_weak_hash)) {
hashval = hash_long((unsigned long)ptr, 32);
-- 
2.19.2
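
The rule the patch adds to ptr_to_id() can be sketched in userspace C as
follows.  This is only an illustration under stated assumptions:
is_err_or_null(), toy_hash() and format_ptr() are hypothetical names, the
hash is a placeholder for the kernel's keyed hash, and MAX_ERRNO is
assumed to be 4095 as in include/linux/err.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_ERRNO 4095	/* assumed to match the kernel's error pointer range */

/* Userspace stand-in for the kernel's IS_ERR_OR_NULL(). */
static int is_err_or_null(const void *ptr)
{
	uintptr_t v = (uintptr_t)ptr;

	return v == 0 || v >= (uintptr_t)-MAX_ERRNO;
}

/* Placeholder hash; the kernel uses a keyed hash, not this. */
static unsigned int toy_hash(const void *ptr)
{
	return (unsigned int)(((uintptr_t)ptr * 2654435761u) >> 4);
}

/*
 * Format ptr roughly the way "%p" behaves with the patch applied:
 * NULL and error pointers are printed as is, everything else is hashed.
 */
static void format_ptr(char *buf, size_t len, const void *ptr)
{
	int width = (int)(2 * sizeof(void *));

	assert(buf && len > 0);
	if (is_err_or_null(ptr))
		snprintf(buf, len, "%0*llx", width,
			 (unsigned long long)(uintptr_t)ptr);
	else
		snprintf(buf, len, "%0*x", width, toy_hash(ptr));
}
```

Calling format_ptr() with NULL yields all zeros, and calling it with an
error pointer yields the raw value, so a debug message still reveals which
error was stored in the pointer.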

Re: linux-next: new contact(s) for the ceph tree?

2020-05-10 Thread Ilya Dryomov

On Sat, May 9, 2020 at 5:47 AM Stephen Rothwell  wrote:
>
> Hi Sage,
>
> On Sat, 9 May 2020 01:03:14 +0000 (UTC) Sage Weil  wrote:
> >
> > Jeff Layton 
>
> Done.
> > On Sat, 9 May 2020, Stephen Rothwell wrote:
> > >
> > > I noticed commit
> > >
> > >   3a5ccecd9af7 ("MAINTAINERS: remove myself as ceph co-maintainer")
> > >
> > > appear recently.  So who should I now list as the contact(s) for the
> > > ceph tree?

Hi Stephen,

I thought maintainers were on the list automatically.  If there is
a separate list, please add me as well.

Thanks,

Ilya

[GIT PULL] Ceph fixes for 5.7-rc5

2020-05-08 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 0e698dfa282211e414076f9dc7e83c1c288314fd:

  Linux 5.7-rc4 (2020-05-03 14:56:04 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.7-rc5

for you to fetch changes up to 12ae44a40a1be891bdc6463f8c7072b4ede746ef:

  ceph: demote quotarealm lookup warning to a debug message (2020-05-08 18:44:40 +0200)


Fixes for an endianness handling bug that prevented mounts on
big-endian arches, a spammy log message and a couple error paths.
Also included a MAINTAINERS update.


Jeff Layton (1):
  ceph: fix endianness bug when handling MDS session feature bits

Luis Henriques (1):
  ceph: demote quotarealm lookup warning to a debug message

Sage Weil (1):
  MAINTAINERS: remove myself as ceph co-maintainer

Wu Bo (2):
  ceph: fix special error code in ceph_try_get_caps()
  ceph: fix double unlock in handle_cap_export()

 MAINTAINERS  | 6 --
 fs/ceph/caps.c   | 3 ++-
 fs/ceph/mds_client.c | 8 +++-
 fs/ceph/quota.c  | 4 ++--
 4 files changed, 7 insertions(+), 14 deletions(-)

Re: [PATCH] rbd: Replace zero-length array with flexible-array

2020-05-08 Thread Ilya Dryomov

On Thu, May 7, 2020 at 9:15 PM Gustavo A. R. Silva
 wrote:
>
> The current codebase makes use of the zero-length array language
> extension to the C90 standard, but the preferred mechanism to declare
> variable-length types such as these ones is a flexible array member[1][2],
> introduced in C99:
>
> struct foo {
> int stuff;
> struct boo array[];
> };
>
> By making use of the mechanism above, we will get a compiler warning
> in case the flexible array does not occur last in the structure, which
> will help us prevent some kind of undefined behavior bugs from being
> inadvertently introduced[3] to the codebase from now on.
>
> Also, notice that, dynamic memory allocations won't be affected by
> this change:
>
> "Flexible array members have incomplete type, and so the sizeof operator
> may not be applied. As a quirk of the original implementation of
> zero-length arrays, sizeof evaluates to zero."[1]
>
> sizeof(flexible-array-member) triggers a warning because flexible array
> members have incomplete type[1]. There are some instances of code in
> which the sizeof operator is being incorrectly/erroneously applied to
> zero-length arrays and the result is zero. Such instances may be hiding
> some bugs. So, this work (flexible-array member conversions) will also
> help to get completely rid of those sorts of issues.
>
> This issue was found with the help of Coccinelle.
>
> [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
> [2] https://github.com/KSPP/linux/issues/21
> [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
>
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  drivers/block/rbd_types.h |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/block/rbd_types.h b/drivers/block/rbd_types.h
> index ac98ab6ccd3b..a600e0eb6b6f 100644
> --- a/drivers/block/rbd_types.h
> +++ b/drivers/block/rbd_types.h
> @@ -93,7 +93,7 @@ struct rbd_image_header_ondisk {
> __le32 snap_count;
> __le32 reserved;
> __le64 snap_names_len;
> -   struct rbd_image_snap_ondisk snaps[0];
> +   struct rbd_image_snap_ondisk snaps[];
>  } __attribute__((packed));
>
>
>

Applied (folded into libceph patch).

Thanks,

Ilya
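
For reference, a minimal sketch of how a structure ending in a flexible
array member is typically sized and allocated.  struct snap, struct header
and header_alloc() are made-up names for illustration, not the rbd on-disk
types; the point is that the allocation size is computed from the header
plus the element count, which is why the [0] to [] conversion leaves
existing allocations unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type, standing in for a snapshot record. */
struct snap {
	unsigned long long id;
};

/* Hypothetical header; the flexible array member must come last. */
struct header {
	unsigned int snap_count;
	struct snap snaps[];
};

/*
 * Allocate a header with room for count trailing elements.
 * sizeof(*h) covers only the fixed part, so the array is added on top.
 */
static struct header *header_alloc(unsigned int count)
{
	struct header *h;
	unsigned int i;

	h = malloc(sizeof(*h) + (size_t)count * sizeof(h->snaps[0]));
	if (!h)
		return NULL;

	h->snap_count = count;
	for (i = 0; i < count; i++)
		h->snaps[i].id = 0;
	return h;
}
```

Applying sizeof to h->snaps itself would not compile here, whereas with a
zero-length array it would silently evaluate to zero.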

Re: [PATCH] libceph: Replace zero-length array with flexible-array

2020-05-08 Thread Ilya Dryomov

On Thu, May 7, 2020 at 8:47 PM Gustavo A. R. Silva
 wrote:
>
> The current codebase makes use of the zero-length array language
> extension to the C90 standard, but the preferred mechanism to declare
> variable-length types such as these ones is a flexible array member[1][2],
> introduced in C99:
>
> struct foo {
> int stuff;
> struct boo array[];
> };
>
> By making use of the mechanism above, we will get a compiler warning
> in case the flexible array does not occur last in the structure, which
> will help us prevent some kind of undefined behavior bugs from being
> inadvertently introduced[3] to the codebase from now on.
>
> Also, notice that, dynamic memory allocations won't be affected by
> this change:
>
> "Flexible array members have incomplete type, and so the sizeof operator
> may not be applied. As a quirk of the original implementation of
> zero-length arrays, sizeof evaluates to zero."[1]
>
> sizeof(flexible-array-member) triggers a warning because flexible array
> members have incomplete type[1]. There are some instances of code in
> which the sizeof operator is being incorrectly/erroneously applied to
> zero-length arrays and the result is zero. Such instances may be hiding
> some bugs. So, this work (flexible-array member conversions) will also
> help to get completely rid of those sorts of issues.
>
> This issue was found with the help of Coccinelle.
>
> [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
> [2] https://github.com/KSPP/linux/issues/21
> [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour")
>
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  include/linux/ceph/mon_client.h |2 +-
>  include/linux/crush/crush.h |2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h
> index dbb8a6959a73..ce4ffeb384d7 100644
> --- a/include/linux/ceph/mon_client.h
> +++ b/include/linux/ceph/mon_client.h
> @@ -19,7 +19,7 @@ struct ceph_monmap {
> struct ceph_fsid fsid;
> u32 epoch;
> u32 num_mon;
> -   struct ceph_entity_inst mon_inst[0];
> +   struct ceph_entity_inst mon_inst[];
>  };
>
>  struct ceph_mon_client;
> diff --git a/include/linux/crush/crush.h b/include/linux/crush/crush.h
> index 54741295c70b..38b0e4d50ed9 100644
> --- a/include/linux/crush/crush.h
> +++ b/include/linux/crush/crush.h
> @@ -87,7 +87,7 @@ struct crush_rule_mask {
>  struct crush_rule {
> __u32 len;
> struct crush_rule_mask mask;
> -   struct crush_rule_step steps[0];
> +   struct crush_rule_step steps[];
>  };
>
>  #define crush_rule_size(len) (sizeof(struct crush_rule) + \
>

Applied.

Thanks,

Ilya

Re: [PATCH] ceph: demote quotarealm lookup warning to a debug message

2020-05-07 Thread Ilya Dryomov

On Thu, May 7, 2020 at 3:44 PM Jeff Layton  wrote:
>
> On Tue, 2020-05-05 at 13:59 +0100, Luis Henriques wrote:
> > A misconfigured cephx can easily result in having the kernel client
> > flooding the logs with:
> >
> >   ceph: Can't lookup inode 1 (err: -13)
> >
> > Change this message to debug level.
> >
> > Link: https://tracker.ceph.com/issues/44546
> > Signed-off-by: Luis Henriques 
> > ---
> > Hi!
> >
> > This patch should fix some harmless warnings when using cephx to restrict
> > users access to certain filesystem paths.  I've added a comment to the
> > tracker where removing this warning could result (unlikely, IMHO!) in an
> > admin to miss not-so-harmless bogus configurations.
> >
> > Cheers,
> > --
> > Luís
> >
> >  fs/ceph/quota.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c
> > index de56dee60540..19507e2fdb57 100644
> > --- a/fs/ceph/quota.c
> > +++ b/fs/ceph/quota.c
> > @@ -159,8 +159,8 @@ static struct inode *lookup_quotarealm_inode(struct 
> > ceph_mds_client *mdsc,
> >   }
> >
> >   if (IS_ERR(in)) {
> > - pr_warn("Can't lookup inode %llx (err: %ld)\n",
> > - realm->ino, PTR_ERR(in));
> > + dout("Can't lookup inode %llx (err: %ld)\n",
> > +  realm->ino, PTR_ERR(in));
> >   qri->timeout = jiffies + msecs_to_jiffies(60 * 1000); /* XXX 
> > */
> >   } else {
> >   qri->timeout = 0;
> >
>
> Ilya,
>
> We've had a number of reports where people get a ton of kernel log spam
> when they hit this problem. I think we probably ought to mark this patch
> for stable and go ahead and send it to Linus for v5.7 -- any objection?

Sure, I'll queue it up.

Thanks,

Ilya

[GIT PULL] Ceph fixes for 5.4-rc4

2019-10-18 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 4f5cafb5cb8471e54afdc9054d973535614f7675:

  Linux 5.4-rc3 (2019-10-13 16:37:36 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.4-rc4

for you to fetch changes up to 25e6be21230d3208d687dad90b6e43419013c351:

  rbd: cancel lock_dwork if the wait is interrupted (2019-10-15 17:43:15 +0200)


A future-proofing decoding fix from Jeff intended for stable and
a patch for a mostly benign race from Dongsheng.


Dongsheng Yang (1):
  rbd: cancel lock_dwork if the wait is interrupted

Jeff Layton (1):
  ceph: just skip unrecognized info in ceph_reply_info_extra

 drivers/block/rbd.c  |  9 ++---
 fs/ceph/mds_client.c | 21 +++--
 2 files changed, 17 insertions(+), 13 deletions(-)

Re: [PATCH] function dispatch should return if mds session does not exist

2019-10-14 Thread Ilya Dryomov

On Mon, Oct 14, 2019 at 11:01 AM Yanhu Cao  wrote:
>
> we shouldn't call ceph_msg_put, otherwise libceph will pass
> invalid pointer to mm.
>
> kernel panic - not syncing: fatal exception
> [5452201.213885] ------------[ cut here ]------------
> [5452201.213889] kernel BUG at mm/slub.c:3901!
> [5452201.213938] invalid opcode: 0000 [#1] SMP PTI
> [5452201.213971] CPU: 35 PID: 3037447 Comm: kworker/35:1 Kdump: loaded Not tainted 4.19.15 #1
> [5452201.214020] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 01/22/2018
> [5452201.214088] Workqueue: ceph-msgr ceph_con_workfn [libceph]
> [5452201.214129] RIP: 0010:kfree+0x15b/0x170
> [5452201.214156] Code: 8b 02 f6 c4 80 75 08 49 8b 42 08 a8 01 74 1b 49 8b 
> 02 31 f6 f6 c4 80 74 05 41 0f b6 72 51 5b 5d 41 5c 4c 89 d7 e9 95 03 f9 ff 
> <0f> 0b 48 83 e8 01 e9 01 ff ff ff 49 83 ea 01 e9 e9 fe ff ff 90 0f
> [5452201.214262] RSP: 0018:b8c3a0607cb0 EFLAGS: 00010246
> [5452201.214296] RAX: eee84008 RBX: 9130c000 RCX: 
> 80200016
> [5452201.214339] RDX: 6f0ec000 RSI:  RDI: 
> 9130c000
> [5452201.214383] RBP: 91107f823970 R08: 0001 R09: 
> 
> [5452201.214426] R10: eee84000 R11: 0001 R12: 
> c076c45d
> [5452201.214469] R13: 91107f823970 R14: 91107f8239e0 R15: 
> 91107f823900
> [5452201.214513] FS:  () GS:9110bfbc() 
> knlGS:
> [5452201.214562] CS:  0010 DS:  ES:  CR0: 80050033
> [5452201.214598] CR2: 55993ab29620 CR3: 003a1e00a003 CR4: 
> 003606e0
> [5452201.214641] DR0:  DR1:  DR2: 
> 
> [5452201.214685] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [5452201.214728] Call Trace:
> [5452201.214759]  ceph_msg_release+0x15d/0x190 [libceph]
> [5452201.214811]  dispatch+0x66/0xa50 [ceph]
> [5452201.214846]  try_read+0x7f3/0x11d0 [libceph]
> [5452201.214878]  ? dequeue_entity+0x37e/0x7e0
> [5452201.214907]  ? pick_next_task_fair+0x291/0x610
> [5452201.214937]  ? dequeue_task_fair+0x5d/0x700
> [5452201.214966]  ? __switch_to+0x8c/0x470
> [5452201.214999]  ceph_con_workfn+0xa2/0x5b0 [libceph]
> [5452201.215033]  process_one_work+0x16b/0x370
> [5452201.215062]  worker_thread+0x49/0x3f0
> [5452201.215089]  kthread+0xf5/0x130
> [5452201.215112]  ? max_active_store+0x80/0x80
> [5452201.215139]  ? kthread_bind+0x10/0x10
> [5452201.215167]  ret_from_fork+0x1f/0x30
>
> Link: https://tracker.ceph.com/issues/42288
>
> Signed-off-by: Yanhu Cao 
> ---
>  fs/ceph/mds_client.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index a8a8f84f3bbf..066358fea347 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -4635,7 +4635,7 @@ static void dispatch(struct ceph_connection *con, 
> struct ceph_msg *msg)
> mutex_lock(&mdsc->mutex);
> if (__verify_registered_session(mdsc, s) < 0) {
> mutex_unlock(&mdsc->mutex);
> -   goto out;
> +   return;
> }
> mutex_unlock(&mdsc->mutex);
>
> @@ -4672,7 +4672,6 @@ static void dispatch(struct ceph_connection *con, 
> struct ceph_msg *msg)
> pr_err("received unknown message type %d %s\n", type,
>ceph_msg_type_name(type));
> }
> -out:
> ceph_msg_put(msg);
>  }
>

Hi Yanhu,

This doesn't look right to me.  The messenger hands its reference to
the dispatch function, the dispatch function is responsible for putting
it.  Even if the session isn't registered, the message should still be
valid and should still be freed.  The bug is somewhere else...

Thanks,

Ilya
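
The ownership rule described above can be sketched with a toy reference
count.  This is an illustration only: struct msg, msg_new(), msg_put() and
this dispatch() are simplified stand-ins, not the libceph API.  The caller
hands its reference to dispatch(), so every exit path, including the
"session not registered" one, has to drop it.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy message with a plain (non-atomic) reference count. */
struct msg {
	int refs;
	int type;
};

static int msgs_freed;	/* counts frees so the behaviour can be checked */

static struct msg *msg_new(int type)
{
	struct msg *m = malloc(sizeof(*m));

	if (!m)
		return NULL;
	m->refs = 1;	/* the creator holds the initial reference */
	m->type = type;
	return m;
}

static void msg_put(struct msg *m)
{
	assert(m->refs > 0);
	if (--m->refs == 0) {
		free(m);
		msgs_freed++;
	}
}

/* dispatch() owns the reference it is handed, on every path. */
static void dispatch(struct msg *m, int session_registered)
{
	if (!session_registered)
		goto out;	/* nothing to handle, but the put is still owed */

	/* ... handle m->type here ... */
out:
	msg_put(m);
}
```

Returning early without the put would leak the message rather than fix
a bad free, which is why the crash has to originate elsewhere.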

[GIT PULL] Ceph updates for 5.4-rc1

2019-09-25 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 4d856f72c10ecb060868ed10ff1b1453943fc6c8:

  Linux 5.3 (2019-09-15 14:19:32 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.4-rc1

for you to fetch changes up to 3ee5a7015c8b7cb4de21f7345f8381946f2fce55:

  ceph: call ceph_mdsc_destroy from destroy_fs_client (2019-09-16 12:06:25 +0200)


The highlights are:

- automatic recovery of a blacklisted filesystem session (Zheng Yan).
  This is disabled by default and can be enabled by mounting with the
  new "recover_session=clean" option.

- serialize buffered reads and O_DIRECT writes (Jeff Layton).  Care is
  taken to avoid serializing O_DIRECT reads and writes with each other,
  this is based on the exclusion scheme from NFS.

- handle large osdmaps better in the face of fragmented memory (myself)

- don't limit what security.* xattrs can be get or set (Jeff Layton).
  We were overly restrictive here, unnecessarily preventing things like
  file capability sets stored in security.capability from working.

- allow copy_file_range() within the same inode and across different
  filesystems within the same cluster (Luis Henriques)


David Disseldorp (1):
  libceph: handle OSD op ceph_pagelist_append() errors

Dongsheng Yang (1):
  rbd: fix response length parameter for encoded strings

Erqi Chen (1):
  ceph: reconnect connection if session hang in opening state

Ilya Dryomov (6):
  ceph: fix indentation in __get_snap_name()
  libceph: drop unused con parameter of calc_target()
  rbd: pull rbd_img_request_create() dout out into the callers
  ceph: include ceph_debug.h in cache.c
  libceph: avoid a __vmalloc() deadlock in ceph_kvmalloc()
  libceph: use ceph_kvmalloc() for osdmap arrays

Jeff Layton (18):
  ceph: allow copy_file_range when src and dst inode are same
  ceph: don't list vxattrs in listxattr()
  ceph: don't SetPageError on writepage errors
  ceph: remove ceph_get_cap_mds and __ceph_get_cap_mds
  ceph: fetch cap_gen under spinlock in ceph_add_cap
  ceph: eliminate session->s_trim_caps
  ceph: fix comments over ceph_add_cap
  ceph: have __mark_caps_flushing return flush_tid
  ceph: remove unneeded test in try_flush_caps
  ceph: remove CEPH_I_NOFLUSH
  ceph: remove incorrect comment above __send_cap
  ceph: update the mtime when truncating up
  ceph: don't freeze during write page faults
  ceph: add buffered/direct exclusionary locking for reads and writes
  ceph: turn ceph_security_invalidate_secctx into static inline
  ceph: only set CEPH_I_SEC_INITED if we got a MAC label
  ceph: allow arbitrary security.* xattrs
  ceph: call ceph_mdsc_destroy from destroy_fs_client

John Hubbard (2):
  ceph: don't return a value from void function
  ceph: use release_pages() directly

Krzysztof Wilczynski (1):
  ceph: move static keyword to the front of declarations

Luis Henriques (2):
  ceph: fix directories inode i_blkbits initialization
  ceph: allow object copies across different filesystems in the same cluster

Yan, Zheng (9):
  libceph: add function that reset client's entity addr
  libceph: add function that clears osd client's abort_err
  ceph: allow closing session in restarting/reconnect state
  ceph: track and report error of async metadata operation
  ceph: pass filp to ceph_get_caps()
  ceph: add helper function that forcibly reconnects to ceph cluster.
  ceph: return -EIO if read/write against filp that lost file locks
  ceph: invalidate all write mode filp after reconnect
  ceph: auto reconnect after blacklisted

 Documentation/filesystems/ceph.txt |  14 +++
 drivers/block/rbd.c|  18 ++--
 fs/ceph/Makefile   |   2 +-
 fs/ceph/addr.c |  61 +++--
 fs/ceph/cache.c|   2 +
 fs/ceph/caps.c | 173 +++--
 fs/ceph/debugfs.c  |   1 -
 fs/ceph/export.c   |  60 ++---
 fs/ceph/file.c | 104 +-
 fs/ceph/inode.c|  50 ++-
 fs/ceph/io.c   | 163 ++
 fs/ceph/io.h   |  12 +++
 fs/ceph/locks.c|   8 +-
 fs/ceph/mds_client.c   | 110 +--
 fs/ceph/mds_client.h   |   8 +-
 fs/ceph/super.c|  52 +--
 fs/ceph/super.h|  49 +++
 fs/ceph/xattr.c|  76 ++--
 include/linux/ceph/libceph.h   |   1 +
 include/linux/ceph/messenger.h |   1 +
 include/linux/ceph/mon_client.h|   1 +
 incl

Re: [GIT PULL afs: Development for 5.4

2019-09-19 Thread Ilya Dryomov

On Thu, Sep 19, 2019 at 3:55 PM Matthew Wilcox  wrote:
>
> On Thu, Sep 19, 2019 at 10:49:22AM +0100, David Howells wrote:
> > David Howells  wrote:
> >
> > > > However, I was close to unpulling it again. It has a merge commit with
> > > > this merge message:
> > > >
> > > > Merge remote-tracking branch 'net/master' into afs-next
> > > >
> > > > and that simply is not acceptable.
> > >
> > > Apologies - I meant to rebase that away.  There was a bug fix to rxrpc in
> > > net/master that didn't get pulled into your tree until Saturday.
> >
> > Actually, waiting for all outstanding fixes to get merged and then rebasing
> > might not be the right thing here.  The problem is that there are fixes in
> > both trees: afs fixes go directly into yours whereas rxrpc fixes go via
> > networking and I would prefer to base my patches on both of them for testing
> > > purposes.  What's the preferred method for dealing with that?  Base on a merge
> > > of the latest of those fixes in each tree?
>
> Why is it organised this way?  I mean, yes, technically, rxrpc is a
> generic layer-6 protocol that any blah blah blah, but in practice no
> other user has come up in the last 37 years, so why bother pretending
> one is going to?  Just git mv net/rxrpc fs/afs/ and merge everything
> through your tree.
>
> I feel similarly about net/9p, net/sunrpc and net/ceph.  Every filesystem
> comes with its own presentation layer; nobody reuses an existing one.
> Just stop pretending they're separate components.

net/ceph is also being used by drivers/block/rbd.c.  net/ceph was split
out of fs/ceph when rbd was introduced.  We continued to manage them in
a single ceph-client.git tree though.

Thanks,

Ilya

Re: [PATCH v2] ceph: allow object copies across different filesystems in the same cluster

2019-09-09 Thread Ilya Dryomov

On Mon, Sep 9, 2019 at 12:29 PM Luis Henriques  wrote:
>
> OSDs are able to perform object copies across different pools.  Thus,
> there's no need to prevent copy_file_range from doing remote copies if the
> source and destination superblocks are different.  Only return -EXDEV if
> they have different fsid (the cluster ID).
>
> Signed-off-by: Luis Henriques 
> ---
>  fs/ceph/file.c | 18 ++
>  1 file changed, 14 insertions(+), 4 deletions(-)
>
> Hi,
>
> Here's the patch changelog since initial submission:
>
> - Dropped have_fsid checks on client structs
> - Use %pU to print the fsid instead of raw hex strings (%*ph)
> - Fixed 'To:' field in email so that this time the patch hits vger
>
> Cheers,
> --
> Luis
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 685a03cc4b77..4a624a1dd0bb 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -1904,6 +1904,7 @@ static ssize_t __ceph_copy_file_range(struct file 
> *src_file, loff_t src_off,
> struct ceph_inode_info *src_ci = ceph_inode(src_inode);
> struct ceph_inode_info *dst_ci = ceph_inode(dst_inode);
> struct ceph_cap_flush *prealloc_cf;
> +   struct ceph_fs_client *src_fsc = ceph_inode_to_client(src_inode);
> struct ceph_object_locator src_oloc, dst_oloc;
> struct ceph_object_id src_oid, dst_oid;
> loff_t endoff = 0, size;
> @@ -1915,8 +1916,17 @@ static ssize_t __ceph_copy_file_range(struct file 
> *src_file, loff_t src_off,
>
> if (src_inode == dst_inode)
> return -EINVAL;
> -   if (src_inode->i_sb != dst_inode->i_sb)
> -   return -EXDEV;
> +   if (src_inode->i_sb != dst_inode->i_sb) {
> +   struct ceph_fs_client *dst_fsc = 
> ceph_inode_to_client(dst_inode);
> +
> +   if (ceph_fsid_compare(&src_fsc->client->fsid,
> + &dst_fsc->client->fsid)) {
> +   dout("Copying object across different clusters:");
> +   dout("  src fsid: %pU dst fsid: %pU\n",
> +&src_fsc->client->fsid, &dst_fsc->client->fsid);

Hi Luis,

This should be a single dout.

Thanks,

Ilya

Re: [PATCH] ceph: Move static keyword to the front of declarations

2019-09-02 Thread Ilya Dryomov

On Sat, Aug 31, 2019 at 11:57 PM Krzysztof Wilczynski  wrote:
>
> Move the static keyword to the front of declarations of
> snap_handle_length, handle_length and connected_handle_length,
> and resolve the following compiler warnings that can be seen
> when building with warnings enabled (W=1):
>
> fs/ceph/export.c:38:2: warning:
>   ‘static’ is not at beginning of declaration [-Wold-style-declaration]
>
> fs/ceph/export.c:88:2: warning:
>   ‘static’ is not at beginning of declaration [-Wold-style-declaration]
>
> fs/ceph/export.c:90:2: warning:
>   ‘static’ is not at beginning of declaration [-Wold-style-declaration]
>
> Signed-off-by: Krzysztof Wilczynski 
> ---
> Related: https://lore.kernel.org/r/20190827233017.gk9...@google.com
>
>  fs/ceph/export.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/fs/ceph/export.c b/fs/ceph/export.c
> index 020d39a85ecc..b6bfa94332c3 100644
> --- a/fs/ceph/export.c
> +++ b/fs/ceph/export.c
> @@ -35,7 +35,7 @@ struct ceph_nfs_snapfh {
>  static int ceph_encode_snapfh(struct inode *inode, u32 *rawfh, int *max_len,
>   struct inode *parent_inode)
>  {
> -   const static int snap_handle_length =
> +   static const int snap_handle_length =
> sizeof(struct ceph_nfs_snapfh) >> 2;
> struct ceph_nfs_snapfh *sfh = (void *)rawfh;
> u64 snapid = ceph_snap(inode);
> @@ -85,9 +85,9 @@ static int ceph_encode_snapfh(struct inode *inode, u32 
> *rawfh, int *max_len,
>  static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len,
>   struct inode *parent_inode)
>  {
> -   const static int handle_length =
> +   static const int handle_length =
> sizeof(struct ceph_nfs_fh) >> 2;
> -   const static int connected_handle_length =
> +   static const int connected_handle_length =
> sizeof(struct ceph_nfs_confh) >> 2;
> int type;

Applied.

Thanks,

Ilya

Re: bug report: libceph: follow redirect replies from osds

2019-08-30 Thread Ilya Dryomov

On Fri, Aug 30, 2019 at 4:05 PM Colin Ian King  wrote:
>
> Hi,
>
> Static analysis with Coverity has picked up an issue with commit:
>
> commit 205ee1187a671c3b067d7f1e974903b44036f270
> Author: Ilya Dryomov 
> Date:   Mon Jan 27 17:40:20 2014 +0200
>
> libceph: follow redirect replies from osds
>
> Specifically in function ceph_redirect_decode in net/ceph/osd_client.c:
>
> 3485
> 3486    len = ceph_decode_32(p);
>
> CID 17904: Unused value (UNUSED_VALUE)
>
> 3487    *p += len; /* skip osd_instructions */
> 3488
> 3489    /* skip the rest */
> 3490    *p = struct_end;
>
> The double write to *p looks wrong, I suspect the *p += len; should be
> just incrementing pointer p as in: p += len.  Am I correct to assume
> this is the correct fix?

Hi Colin,

No, the double write to *p is correct.  It skips over len bytes and
then skips to the end of the redirect reply.  There is no bug here but
we could drop

  len = ceph_decode_32(p);
  *p += len; /* skip osd_instructions */

and skip to the end directly to make Coverity happier.

Thanks,

Ilya
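
A small sketch of the cursor convention in question, assuming a decoder
that takes a pointer to the caller's cursor.  decode_32() and skip_tail()
are hypothetical helpers, not the libceph functions.  Both writes go
through *p because they update the caller's cursor; "p += len" would only
move the local pointer-to-pointer and leave the caller's cursor alone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit little-endian value at *p and advance the cursor past it. */
static uint32_t decode_32(void **p)
{
	const unsigned char *b = *p;
	uint32_t v = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
		     ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);

	*p = (char *)*p + 4;
	return v;
}

/* Skip a length-prefixed field, then jump to the end of the structure. */
static void skip_tail(void **p, void *struct_end)
{
	uint32_t len = decode_32(p);

	*p = (char *)*p + len;	/* skip the variable-length field */
	assert((char *)*p <= (char *)struct_end);
	*p = struct_end;	/* skip the rest; overwrites the line above */
}
```

The first update is redundant once the cursor is set to struct_end, which
is what the static checker is reporting, but it is not a pointer bug.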

[GIT PULL] Ceph fixes for 5.3-rc7

2019-08-30 Thread Ilya Dryomov

Hi Linus,

The following changes since commit a55aa89aab90fae7c815b0551b07be37db359d76:

  Linux 5.3-rc6 (2019-08-25 12:01:23 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.3-rc7

for you to fetch changes up to d435c9a7b85be1e820668d2f3718c2d9f24d5548:

  rbd: restore zeroing past the overlap when reading from parent (2019-08-28 12:34:11 +0200)


A fix for a -rc1 regression in rbd and a trivial static checker fix.


Ilya Dryomov (1):
  rbd: restore zeroing past the overlap when reading from parent

Jia-Ju Bai (1):
  libceph: don't call crypto_free_sync_skcipher() on a NULL tfm

 drivers/block/rbd.c | 11 +++
 net/ceph/crypto.c   |  6 --
 2 files changed, 15 insertions(+), 2 deletions(-)

Re: [PATCH AUTOSEL 5.2 66/76] ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()

2019-08-30 Thread Ilya Dryomov

On Thu, Aug 29, 2019 at 11:16 PM Sasha Levin  wrote:
>
> On Thu, Aug 29, 2019 at 10:51:04PM +0200, Ilya Dryomov wrote:
> >On Thu, Aug 29, 2019 at 8:15 PM Sasha Levin  wrote:
> >>
> >> From: Luis Henriques 
> >>
> >> [ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ]
> >>
> >> Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the
> >> i_xattrs.prealloc_blob buffer while holding the i_ceph_lock.  This can be
> >> fixed by postponing the call until later, when the lock is released.
> >>
> >> The following backtrace was triggered by fstests generic/117.
> >>
> >>   BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
> >>   in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress
> >>   3 locks held by fsstress/650:
> >>#0: 870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50
> >>#1: ba0c4c74 (&type->i_mutex_dir_key#6){}, at: 
> >> vfs_setxattr+0x55/0xa0
> >>#2: 8dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: 
> >> __ceph_setxattr+0x297/0x810
> >>   CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437
> >>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> >> rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
> >>   Call Trace:
> >>dump_stack+0x67/0x90
> >>___might_sleep.cold+0x9f/0xb1
> >>vfree+0x4b/0x60
> >>ceph_buffer_release+0x1b/0x60
> >>__ceph_setxattr+0x2b4/0x810
> >>__vfs_setxattr+0x66/0x80
> >>__vfs_setxattr_noperm+0x59/0xf0
> >>vfs_setxattr+0x81/0xa0
> >>setxattr+0x115/0x230
> >>? filename_lookup+0xc9/0x140
> >>? rcu_read_lock_sched_held+0x74/0x80
> >>? rcu_sync_lockdep_assert+0x2e/0x60
> >>    ? __sb_start_write+0x142/0x1a0
> >>? mnt_want_write+0x20/0x50
> >>path_setxattr+0xba/0xd0
> >>__x64_sys_lsetxattr+0x24/0x30
> >>do_syscall_64+0x50/0x1c0
> >>entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >>   RIP: 0033:0x7ff23514359a
> >>
> >> Signed-off-by: Luis Henriques 
> >> Reviewed-by: Jeff Layton 
> >> Signed-off-by: Ilya Dryomov 
> >> Signed-off-by: Sasha Levin 
> >> ---
> >>  fs/ceph/xattr.c | 8 ++--
> >>  1 file changed, 6 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> >> index 0619adbcbe14c..8382299fc2d84 100644
> >> --- a/fs/ceph/xattr.c
> >> +++ b/fs/ceph/xattr.c
> >> @@ -1028,6 +1028,7 @@ int __ceph_setxattr(struct inode *inode, const char 
> >> *name,
> >> struct ceph_inode_info *ci = ceph_inode(inode);
> >> struct ceph_mds_client *mdsc = 
> >> ceph_sb_to_client(inode->i_sb)->mdsc;
> >> struct ceph_cap_flush *prealloc_cf = NULL;
> >> +   struct ceph_buffer *old_blob = NULL;
> >> int issued;
> >> int err;
> >> int dirty = 0;
> >> @@ -1101,13 +1102,15 @@ int __ceph_setxattr(struct inode *inode, const 
> >> char *name,
> >> struct ceph_buffer *blob;
> >>
> >> spin_unlock(&ci->i_ceph_lock);
> >> -   dout(" preaallocating new blob size=%d\n", 
> >> required_blob_size);
> >> +   ceph_buffer_put(old_blob); /* Shouldn't be required */
> >> +   dout(" pre-allocating new blob size=%d\n", 
> >> required_blob_size);
> >> blob = ceph_buffer_new(required_blob_size, GFP_NOFS);
> >> if (!blob)
> >> goto do_sync_unlocked;
> >> spin_lock(&ci->i_ceph_lock);
> >> +   /* prealloc_blob can't be released while holding 
> >> i_ceph_lock */
> >> if (ci->i_xattrs.prealloc_blob)
> >> -   ceph_buffer_put(ci->i_xattrs.prealloc_blob);
> >> +   old_blob = ci->i_xattrs.prealloc_blob;
> >> ci->i_xattrs.prealloc_blob = blob;
> >> goto retry;
> >> }
> >> @@ -1123,6 +1126,7 @@ int __ceph_setxattr(struct inode *inode, const char 
> >> *name,
> >> }
> >>
> >> spin_unlock(&ci->i_ceph_lock);
> >> +   ceph_buffer_put(old_blob);
> >> if (lock_snap_rwsem)
> >> up_read(&mdsc->snap_rwsem);
> >> if (dirty)
> >
> >Hi Sasha,
> >
> >I didn't tag i_ceph_lock series for stable because this is a very old
> >bug which no one ever hit in real life, at least to my knowledge.
>
> I can drop it if you prefer.

Either is fine with me.  I just wanted to explain my rationale for not
tagging them for stable in the first place and point out that there is
a prerequisite.

Thanks,

Ilya
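
The pattern used by the quoted fix, stashing the old buffer under the lock
and releasing it only after unlocking, can be sketched as follows.  This
is a simplified illustration: struct cache, cache_lock(), blob_free() and
cache_replace_blob() are hypothetical, and an int flag stands in for the
spinlock so that the "no free while locked" property can be checked.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy object; the locked flag stands in for a spinlock. */
struct cache {
	int locked;
	char *blob;
};

static int frees_under_lock;	/* would indicate the bug being fixed */

static void cache_lock(struct cache *c)
{
	assert(!c->locked);
	c->locked = 1;
}

static void cache_unlock(struct cache *c)
{
	assert(c->locked);
	c->locked = 0;
}

/* Freeing may sleep in the kernel case, so record any misuse. */
static void blob_free(struct cache *c, char *blob)
{
	if (c->locked)
		frees_under_lock++;
	free(blob);
}

/* Swap in a new blob; the old one is released after the lock is dropped. */
static int cache_replace_blob(struct cache *c, size_t size)
{
	char *old;
	char *fresh = malloc(size);

	if (!fresh)
		return -1;

	cache_lock(c);
	old = c->blob;		/* only remember it while locked */
	c->blob = fresh;
	cache_unlock(c);

	blob_free(c, old);	/* free(NULL) is a no-op */
	return 0;
}
```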

Re: [PATCH AUTOSEL 5.2 66/76] ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()

2019-08-29 Thread Ilya Dryomov

On Thu, Aug 29, 2019 at 8:15 PM Sasha Levin  wrote:
>
> From: Luis Henriques 
>
> [ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ]
>
> Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the
> i_xattrs.prealloc_blob buffer while holding the i_ceph_lock.  This can be
> fixed by postponing the call until later, when the lock is released.
>
> The following backtrace was triggered by fstests generic/117.
>
>   BUG: sleeping function called from invalid context at mm/vmalloc.c:2283
>   in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress
>   3 locks held by fsstress/650:
>#0: 870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50
>#1: ba0c4c74 (&type->i_mutex_dir_key#6){}, at: 
> vfs_setxattr+0x55/0xa0
>#2: 8dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: 
> __ceph_setxattr+0x297/0x810
>   CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
> rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014
>   Call Trace:
>dump_stack+0x67/0x90
>___might_sleep.cold+0x9f/0xb1
>vfree+0x4b/0x60
>ceph_buffer_release+0x1b/0x60
>__ceph_setxattr+0x2b4/0x810
>__vfs_setxattr+0x66/0x80
>__vfs_setxattr_noperm+0x59/0xf0
>vfs_setxattr+0x81/0xa0
>setxattr+0x115/0x230
>? filename_lookup+0xc9/0x140
>? rcu_read_lock_sched_held+0x74/0x80
>? rcu_sync_lockdep_assert+0x2e/0x60
>? __sb_start_write+0x142/0x1a0
>? mnt_want_write+0x20/0x50
>path_setxattr+0xba/0xd0
>__x64_sys_lsetxattr+0x24/0x30
>do_syscall_64+0x50/0x1c0
>entry_SYSCALL_64_after_hwframe+0x49/0xbe
>   RIP: 0033:0x7ff23514359a
>
> Signed-off-by: Luis Henriques 
> Reviewed-by: Jeff Layton 
> Signed-off-by: Ilya Dryomov 
> Signed-off-by: Sasha Levin 
> ---
>  fs/ceph/xattr.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> index 0619adbcbe14c..8382299fc2d84 100644
> --- a/fs/ceph/xattr.c
> +++ b/fs/ceph/xattr.c
> @@ -1028,6 +1028,7 @@ int __ceph_setxattr(struct inode *inode, const char *name,
> struct ceph_inode_info *ci = ceph_inode(inode);
> struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc;
> struct ceph_cap_flush *prealloc_cf = NULL;
> +   struct ceph_buffer *old_blob = NULL;
> int issued;
> int err;
> int dirty = 0;
> @@ -1101,13 +1102,15 @@ int __ceph_setxattr(struct inode *inode, const char *name,
> struct ceph_buffer *blob;
>
> spin_unlock(&ci->i_ceph_lock);
> -   dout(" preaallocating new blob size=%d\n", required_blob_size);
> +   ceph_buffer_put(old_blob); /* Shouldn't be required */
> +   dout(" pre-allocating new blob size=%d\n", required_blob_size);
> blob = ceph_buffer_new(required_blob_size, GFP_NOFS);
> if (!blob)
> goto do_sync_unlocked;
> spin_lock(&ci->i_ceph_lock);
> +   /* prealloc_blob can't be released while holding i_ceph_lock */
> if (ci->i_xattrs.prealloc_blob)
> -   ceph_buffer_put(ci->i_xattrs.prealloc_blob);
> +   old_blob = ci->i_xattrs.prealloc_blob;
> ci->i_xattrs.prealloc_blob = blob;
> goto retry;
> }
> @@ -1123,6 +1126,7 @@ int __ceph_setxattr(struct inode *inode, const char *name,
> }
>
> spin_unlock(&ci->i_ceph_lock);
> +   ceph_buffer_put(old_blob);
> if (lock_snap_rwsem)
> up_read(&mdsc->snap_rwsem);
> if (dirty)

Hi Sasha,

I didn't tag i_ceph_lock series for stable because this is a very old
bug which no one ever hit in real life, at least to my knowledge.

Please note that each of these patches requires 5c498950f730 ("libceph:
allow ceph_buffer_put() to receive a NULL ceph_buffer").
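
For readers following along, the pattern used by these patches can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: struct buf_model, struct inode_model, the lock_held flag and replace_prealloc_blob() are all hypothetical stand-ins for ceph_buffer, the inode, i_ceph_lock and the relevant part of __ceph_setxattr(). The old buffer is only remembered while the lock is held, and the put happens after the unlock. Accepting NULL in the put routine is what the prerequisite commit provides, so the caller needs no extra check.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ceph_buffer. */
struct buf_model {
	size_t len;
	void *data;
};

/* Hypothetical stand-in for the inode; lock_held models i_ceph_lock. */
struct inode_model {
	int lock_held;
	struct buf_model *prealloc_blob;
};

static struct buf_model *buf_model_new(size_t len)
{
	struct buf_model *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = malloc(len);
	if (!b->data) {
		free(b);
		return NULL;
	}
	b->len = len;
	return b;
}

/* Freeing may sleep in the real code, so it must not run under the lock. */
static void buf_model_put(const struct inode_model *ci, struct buf_model *b)
{
	assert(!ci->lock_held);
	if (!b)		/* NULL is accepted, as with the prerequisite commit */
		return;
	free(b->data);
	free(b);
}

/* Swap in a new blob; the old one is released only after unlocking. */
static int replace_prealloc_blob(struct inode_model *ci, size_t len)
{
	struct buf_model *old_blob = NULL;
	struct buf_model *blob = buf_model_new(len);	/* allocate unlocked */

	if (!blob)
		return -1;

	ci->lock_held = 1;			/* spin_lock() in the real code */
	old_blob = ci->prealloc_blob;		/* remember it, do not free here */
	ci->prealloc_blob = blob;
	ci->lock_held = 0;			/* spin_unlock() */

	buf_model_put(ci, old_blob);		/* lock released; NULL is fine */
	return 0;
}
```

Calling buf_model_put() between the two lock_held assignments would trip the assertion, which is the user-space analogue of the "sleeping function called from invalid context" splat quoted above.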

Thanks,

Ilya

[GIT PULL] Ceph fixes for 5.3-rc6

2019-08-23 Thread Ilya Dryomov

Hi Linus,

The following changes since commit d1abaeb3be7b5fa6d7a1fbbd2e14e3310005c4c1:

  Linux 5.3-rc5 (2019-08-18 14:31:08 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.3-rc6

for you to fetch changes up to a561372405cf6bc6f14239b3a9e57bb39f2788b0:

  libceph: fix PG split vs OSD (re)connect race (2019-08-22 10:47:41 +0200)


Three important fixes tagged for stable (an indefinite hang, a crash on
an assert and a NULL pointer dereference) plus a small series from Luis
fixing instances of vfree() under spinlock.


Erqi Chen (1):
  ceph: clear page dirty before invalidate page

Ilya Dryomov (1):
  libceph: fix PG split vs OSD (re)connect race

Jeff Layton (1):
  ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply

Luis Henriques (4):
  libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
  ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
  ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob()
  ceph: fix buffer free while holding i_ceph_lock in fill_inode()

 fs/ceph/addr.c  |  5 +++--
 fs/ceph/caps.c  |  5 -
 fs/ceph/inode.c |  7 ---
 fs/ceph/locks.c |  3 +--
 fs/ceph/snap.c  |  4 +++-
 fs/ceph/super.h |  2 +-
 fs/ceph/xattr.c | 19 ++-
 include/linux/ceph/buffer.h |  3 ++-
 net/ceph/osd_client.c   |  9 -
 9 files changed, 36 insertions(+), 21 deletions(-)

Re: [PATCH] net/ceph replace ceph_kvmalloc with kvmalloc

2019-08-12 Thread Ilya Dryomov

On Mon, Aug 12, 2019 at 11:42 AM Marc Koderer  wrote:
>
> There is nearly no difference between both implementations.
> ceph_kvmalloc existed before kvmalloc which makes me think it's
> a leftover.
>
> Signed-off-by: Marc Koderer 
> ---
>  net/ceph/buffer.c  |  3 +--
>  net/ceph/ceph_common.c | 11 ---
>  net/ceph/crypto.c  |  2 +-
>  net/ceph/messenger.c   |  2 +-
>  4 files changed, 3 insertions(+), 15 deletions(-)
>
> diff --git a/net/ceph/buffer.c b/net/ceph/buffer.c
> index 5622763ad402..6ca273d2246a 100644
> --- a/net/ceph/buffer.c
> +++ b/net/ceph/buffer.c
> @@ -7,7 +7,6 @@
>
>  #include 
>  #include 
> -#include  /* for ceph_kvmalloc */
>
>  struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp)
>  {
> @@ -17,7 +16,7 @@ struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp)
> if (!b)
> return NULL;
>
> -   b->vec.iov_base = ceph_kvmalloc(len, gfp);
> +   b->vec.iov_base = kvmalloc(len, gfp);
> if (!b->vec.iov_base) {
> kfree(b);
> return NULL;
> diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c
> index 4eeea4d5c3ef..6c1769a815af 100644
> --- a/net/ceph/ceph_common.c
> +++ b/net/ceph/ceph_common.c
> @@ -185,17 +185,6 @@ int ceph_compare_options(struct ceph_options *new_opt,
>  }
>  EXPORT_SYMBOL(ceph_compare_options);
>
> -void *ceph_kvmalloc(size_t size, gfp_t flags)
> -{
> -   if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
> -   void *ptr = kmalloc(size, flags | __GFP_NOWARN);
> -   if (ptr)
> -   return ptr;
> -   }
> -
> -   return __vmalloc(size, flags, PAGE_KERNEL);
> -}
> -
>
>  static int parse_fsid(const char *str, struct ceph_fsid *fsid)
>  {
> diff --git a/net/ceph/crypto.c b/net/ceph/crypto.c
> index 5d6724cee38f..a9deead1e4ff 100644
> --- a/net/ceph/crypto.c
> +++ b/net/ceph/crypto.c
> @@ -144,7 +144,7 @@ void ceph_crypto_key_destroy(struct ceph_crypto_key *key)
>  static const u8 *aes_iv = (u8 *)CEPH_AES_IV;
>
>  /*
> - * Should be used for buffers allocated with ceph_kvmalloc().
> + * Should be used for buffers allocated with kvmalloc().
>   * Currently these are encrypt out-buffer (ceph_buffer) and decrypt
>   * in-buffer (msg front).
>   *
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 962f521c863e..f1f2fcc6f780 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -3334,7 +3334,7 @@ struct ceph_msg *ceph_msg_new2(int type, int front_len, int max_data_items,
>
> /* front */
> if (front_len) {
> -   m->front.iov_base = ceph_kvmalloc(front_len, flags);
> +   m->front.iov_base = kvmalloc(front_len, flags);
> if (m->front.iov_base == NULL) {
> dout("ceph_msg_new can't allocate %d bytes\n",
>  front_len);

Hi Marc,

I'm working on a patch for https://tracker.ceph.com/issues/40481 which
changes ceph_kvmalloc() to properly deal with non-GFP_KERNEL contexts.
We can't switch to kvmalloc() because it doesn't actually fall back to
vmalloc() for GFP_NOFS or GFP_NOIO.
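
To make the point concrete, here is a small user-space model of the decision described above. It is a sketch under assumptions, not the kernel implementation: the M_* flag bits, enum alloc_path and model_kvmalloc() are hypothetical names invented for this illustration, and the kmalloc_ok parameter simply states whether a physically contiguous allocation would have succeeded. The only behaviour it encodes is the one at issue: when the flags are weaker than GFP_KERNEL, there is no fallback to the vmalloc path.

```c
#include <assert.h>

/* Hypothetical flag bits for this model only; the kernel's values differ. */
#define M_RECLAIM	0x1u
#define M_IO		0x2u
#define M_FS		0x4u
#define M_GFP_KERNEL	(M_RECLAIM | M_IO | M_FS)
#define M_GFP_NOFS	(M_RECLAIM | M_IO)
#define M_GFP_NOIO	(M_RECLAIM)

enum alloc_path { PATH_FAIL, PATH_KMALLOC, PATH_VMALLOC };

/*
 * Model of the decision only: kmalloc_ok says whether a physically
 * contiguous allocation would succeed for this request.
 */
static enum alloc_path model_kvmalloc(int kmalloc_ok, unsigned int flags)
{
	assert((flags & ~M_GFP_KERNEL) == 0);	/* only the modelled bits */

	/* Flags weaker than GFP_KERNEL: contiguous attempt only, no fallback. */
	if ((flags & M_GFP_KERNEL) != M_GFP_KERNEL)
		return kmalloc_ok ? PATH_KMALLOC : PATH_FAIL;

	/* Full GFP_KERNEL: fall back to the vmalloc path if needed. */
	return kmalloc_ok ? PATH_KMALLOC : PATH_VMALLOC;
}
```

In this model a large GFP_NOFS request that cannot be satisfied contiguously simply fails, whereas the same request with GFP_KERNEL would reach the vmalloc path.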

Thanks,

Ilya

Re: [PATCH] net: ceph: Fix a possible null-pointer dereference in ceph_crypto_key_destroy()

2019-07-30 Thread Ilya Dryomov

On Wed, Jul 24, 2019 at 11:43 AM Jia-Ju Bai  wrote:
>
> In set_secret(), key->tfm is assigned to NULL on line 55, and then
> ceph_crypto_key_destroy(key) is executed.
>
> ceph_crypto_key_destroy(key)
>   crypto_free_sync_skcipher(key->tfm)
>     crypto_skcipher_tfm(tfm)
>       return &tfm->base;
>
> Thus, a possible null-pointer dereference may occur.
>
> To fix this bug, key->tfm is checked before calling
> crypto_free_sync_skcipher().
>
> This bug is found by a static analysis tool STCheck written by us.
>
> Signed-off-by: Jia-Ju Bai 
> ---
>  net/ceph/crypto.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ceph/crypto.c b/net/ceph/crypto.c
> index 5d6724cee38f..ac28463bcfd8 100644
> --- a/net/ceph/crypto.c
> +++ b/net/ceph/crypto.c
> @@ -136,7 +136,8 @@ void ceph_crypto_key_destroy(struct ceph_crypto_key *key)
> if (key) {
> kfree(key->key);
> key->key = NULL;
> -   crypto_free_sync_skcipher(key->tfm);
> +   if (key->tfm)
> +   crypto_free_sync_skcipher(key->tfm);
> key->tfm = NULL;
> }
>  }

Hi Jia-Ju,

Yeah, looks like the only reason this continued to work after
69d6302b65a8 ("libceph: Remove VLA usage of skcipher") is because
crypto_sync_skcipher is a trivial wrapper around crypto_skcipher
added just for type checking AFAICT.

struct crypto_sync_skcipher {
        struct crypto_skcipher base;
};

Before that ceph_crypto_key_destroy() used crypto_free_skcipher(),
which is safe to call on a NULL tfm.

Applied with a slight modification -- I moved key->tfm = NULL under
the new if and amended the changelog.

https://github.com/ceph/ceph-client/commit/b3d79916ff99074d289d66f1643b423ae0008c50
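
The shape of the applied change can be sketched in user-space C as follows. This is an illustration under assumptions rather than the kernel code: struct cipher, struct sync_cipher, struct key_model, sync_cipher_free() and key_model_destroy() are hypothetical stand-ins, and the free routine here deliberately asserts on NULL to model a helper that must not be handed a NULL tfm.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's crypto types. */
struct cipher { int unused; };
struct sync_cipher { struct cipher base; };	/* thin wrapper, as in the mail */

struct key_model {
	void *key;
	struct sync_cipher *tfm;
};

static int cipher_frees;	/* counts frees so callers can observe them */

/* Models a free routine that must not be handed NULL. */
static void sync_cipher_free(struct sync_cipher *tfm)
{
	assert(tfm != NULL);
	free(tfm);
	cipher_frees++;
}

static void key_model_destroy(struct key_model *key)
{
	if (key) {
		free(key->key);
		key->key = NULL;
		if (key->tfm) {
			sync_cipher_free(key->tfm);
			key->tfm = NULL;	/* reset moved under the check */
		}
	}
}
```

With the check in place, destroying a key whose tfm was never set (or destroying the same key twice) never reaches the free routine.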

Thanks,

Ilya

Re: fs/ceph/export.c:459:3-12: code aligned with following code on line 461 (fwd)

2019-07-29 Thread Ilya Dryomov

On Sat, Jun 29, 2019 at 4:09 PM Julia Lawall  wrote:
>
> There is no bug here, but some code starting on line 461 seems to be
> incorrectly indented.
>
> julia
>
> -- Forwarded message --
> Date: Sat, 29 Jun 2019 19:51:04 +0800
> From: kbuild test robot 
> To: kbu...@01.org
> Cc: Julia Lawall 
> Subject: fs/ceph/export.c:459:3-12: code aligned with following code on line 461
>
> CC: kbuild-...@01.org
> CC: linux-kernel@vger.kernel.org
> TO: "Yan, Zheng" 
> CC: Ilya Dryomov 
>
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> head:   249155c20f9b0754bc1b932a33344cfb4e0c2101
> commit: 570df4e9c23f861aa3f8f2954468c534a033bf1a ceph: snapshot nfs re-export
> date:   8 weeks ago
> :: branch date: 5 days ago
> :: commit date: 8 weeks ago
>
> If you fix the issue, kindly add following tag
> Reported-by: kbuild test robot 
> Reported-by: Julia Lawall 
>
> >> fs/ceph/export.c:459:3-12: code aligned with following code on line 461
>
> # https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=570df4e9c23f861aa3f8f2954468c534a033bf1a
> git remote add linus https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git remote update linus
> git checkout 570df4e9c23f861aa3f8f2954468c534a033bf1a
> vim +459 fs/ceph/export.c
>
> a8e63b7d Sage Weil  2009-10-06  401
> 570df4e9 Yan, Zheng 2017-11-15  402  static int __get_snap_name(struct dentry *parent, char *name,
> 570df4e9 Yan, Zheng 2017-11-15  403struct dentry *child)
> 570df4e9 Yan, Zheng 2017-11-15  404  {
> 570df4e9 Yan, Zheng 2017-11-15  405 struct inode *inode = d_inode(child);
> 570df4e9 Yan, Zheng 2017-11-15  406 struct inode *dir = d_inode(parent);
> 570df4e9 Yan, Zheng 2017-11-15  407 struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
> 570df4e9 Yan, Zheng 2017-11-15  408 struct ceph_mds_request *req = NULL;
> 570df4e9 Yan, Zheng 2017-11-15  409 char *last_name = NULL;
> 570df4e9 Yan, Zheng 2017-11-15  410 unsigned next_offset = 2;
> 570df4e9 Yan, Zheng 2017-11-15  411 int err = -EINVAL;
> 570df4e9 Yan, Zheng 2017-11-15  412
> 570df4e9 Yan, Zheng 2017-11-15  413 if (ceph_ino(inode) != ceph_ino(dir))
> 570df4e9 Yan, Zheng 2017-11-15  414 goto out;
> 570df4e9 Yan, Zheng 2017-11-15  415 if (ceph_snap(inode) == CEPH_SNAPDIR) {
> 570df4e9 Yan, Zheng 2017-11-15  416 if (ceph_snap(dir) == CEPH_NOSNAP) {
> 570df4e9 Yan, Zheng 2017-11-15  417 strcpy(name, fsc->mount_options->snapdir_name);
> 570df4e9 Yan, Zheng 2017-11-15  418 err = 0;
> 570df4e9 Yan, Zheng 2017-11-15  419 }
> 570df4e9 Yan, Zheng 2017-11-15  420 goto out;
> 570df4e9 Yan, Zheng 2017-11-15  421 }
> 570df4e9 Yan, Zheng 2017-11-15  422 if (ceph_snap(dir) != CEPH_SNAPDIR)
> 570df4e9 Yan, Zheng 2017-11-15  423 goto out;
> 570df4e9 Yan, Zheng 2017-11-15  424
> 570df4e9 Yan, Zheng 2017-11-15  425 while (1) {
> 570df4e9 Yan, Zheng 2017-11-15  426 struct ceph_mds_reply_info_parsed *rinfo;
> 570df4e9 Yan, Zheng 2017-11-15  427 struct ceph_mds_reply_dir_entry *rde;
> 570df4e9 Yan, Zheng 2017-11-15  428 int i;
> 570df4e9 Yan, Zheng 2017-11-15  429
> 570df4e9 Yan, Zheng 2017-11-15  430 req = ceph_mdsc_create_request(fsc->mdsc, CEPH_MDS_OP_LSSNAP,
> 570df4e9 Yan, Zheng 2017-11-15  431 USE_AUTH_MDS);
> 570df4e9 Yan, Zheng 2017-11-15  432 if (IS_ERR(req)) {
> 570df4e9 Yan, Zheng 2017-11-15  433 err = PTR_ERR(req);
> 570df4e9 Yan, Zheng 2017-11-15  434 req = NULL;
> 570df4e9 Yan, Zheng 2017-11-15  435 goto out;
> 570df4e9 Yan, Zheng 2017-11-15  436 }
> 570df4e9 Yan, Zheng 2017-11-15  437 err = ceph_alloc_readdir_reply_buffer(req, inode);
> 570df4e9 Yan, Zheng 2017-11-15  438 if (err)
> 570df4e9 Yan, Zheng 2017-11-15  439 goto out;
> 570df4e9 Yan, Zheng 2017-11-15  440
> 570df4e9 Yan, Zheng 2017-11-15  441 req->r_direct_mode = USE_AUTH_MDS;
> 570df4e9 Yan, Zheng 2017-11-15  442 req->r_readdir_offset = next_offset;
> 570df4e9 Yan, Zheng 2017-11-15  443 req->r_args.readdir.flags =
> 570df4e9 Yan, Zheng 2017-11-15  444 cpu_to_le16(CEPH_READDIR_REPLY_BITFLAGS);
> 570df4e9 Yan, Zheng 2017-11-15  445 if (last_name) {
> 570df4e9 Yan, Zheng 2017-11-15  446 req->r_p

Re: [PATCH 0/4] Sleeping functions in invalid context bug fixes

2019-07-23 Thread Ilya Dryomov

On Fri, Jul 19, 2019 at 5:20 PM Jeff Layton  wrote:
>
> On Fri, 2019-07-19 at 15:32 +0100, Luis Henriques wrote:
> > Hi,
> >
> > I'm sending three "sleeping function called from invalid context" bug
> > fixes that I had on my TODO for a while.  All of them are ceph_buffer_put
> > related, and all the fixes follow the same pattern: delay the operation
> > until the ci->i_ceph_lock is released.
> >
> > The first patch simply allows ceph_buffer_put to receive a NULL buffer so
> > that the NULL check doesn't need to be performed in all the other patches.
> > IOW, it's not really required, just convenient.
> >
> > (Note: maybe these patches should all be tagged for stable.)
> >
> > Luis Henriques (4):
> >   libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer
> >   ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
> >   ceph: fix buffer free while holding i_ceph_lock in
> > __ceph_build_xattrs_blob()
> >   ceph: fix buffer free while holding i_ceph_lock in fill_inode()
> >
> >  fs/ceph/caps.c  |  5 -
> >  fs/ceph/inode.c |  7 ---
> >  fs/ceph/snap.c  |  4 +++-
> >  fs/ceph/super.h |  2 +-
> >  fs/ceph/xattr.c | 19 ++-
> >  include/linux/ceph/buffer.h |  3 ++-
> >  6 files changed, 28 insertions(+), 12 deletions(-)
>
> This all looks good to me. I'll plan to merge these into the testing
> branch soon, and tag them for stable.
>
> PS: On a related note (and more of a question for Ilya)...
>
> I'm wondering if we get any benefit from having our own ceph_kvmalloc
> routine. Why are we not better off using the stock kvmalloc routine
> instead? Forcing a vmalloc just because we've gone above 32k allocation
> doesn't seem like the right thing to do.

I don't remember off the top of my head and can't check right now.
Could be that kvmalloc() didn't exist back then.  I'll add that to my
TODO list.
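
For reference, the 32k figure mentioned above corresponds to the size check in ceph_kvmalloc(), PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER. A minimal user-space sketch of that arithmetic follows, assuming 4 KiB pages and a costly order of 3; the MODEL_* macros and both helper functions are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for this sketch: 4 KiB pages and a costly order of 3. */
#define MODEL_PAGE_SIZE			4096UL
#define MODEL_PAGE_ALLOC_COSTLY_ORDER	3

/* Largest size for which the contiguous allocation is attempted first. */
static size_t model_kmalloc_cutoff(void)
{
	return MODEL_PAGE_SIZE << MODEL_PAGE_ALLOC_COSTLY_ORDER;
}

/* Non-zero if a request of this size would try kmalloc() before vmalloc. */
static int model_tries_kmalloc_first(size_t size)
{
	assert(size > 0);
	return size <= model_kmalloc_cutoff();
}
```

Under these assumptions the cutoff is 4096 << 3 = 32768 bytes, so a request one byte larger goes straight to the vmalloc path.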

Thanks,

Ilya

[GIT PULL] Ceph updates for 5.3-rc1

2019-07-17 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 0ecfebd2b52404ae0c54a878c872bb93363ada36:

  Linux 5.2 (2019-07-07 15:41:56 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.3-rc1

for you to fetch changes up to d31d07b97a5e76f41e00eb81dcca740e84aa7782:

  ceph: fix end offset in truncate_inode_pages_range call (2019-07-08 14:01:45 +0200)

There is a trivial conflict caused by commit 9ffbe8ac05db
("locking/lockdep: Rename lockdep_assert_held_exclusive() ->
lockdep_assert_held_write()").  I included the resolution in
for-linus-merged.


Lots of exciting things this time!

- support for rbd object-map and fast-diff features (myself).  This
  will speed up reads, discards and things like snap diffs on sparse
  images.

- ceph.snap.btime vxattr to expose snapshot creation time (David
  Disseldorp).  This will be used to integrate with "Restore Previous
  Versions" feature added in Windows 7 for folks who reexport ceph
  through SMB.

- security xattrs for ceph (Zheng Yan).  Only selinux is supported
  for now due to the limitations of ->dentry_init_security().

- support for MSG_ADDR2, FS_BTIME and FS_CHANGE_ATTR features (Jeff
  Layton).  This is actually a single feature bit which was missing
  because of the filesystem pieces.  With this in, the kernel client
  will finally be reported as "luminous" by "ceph features" -- it is
  still being reported as "jewel" even though all required Luminous
  features were implemented in 4.13.

- stop NULL-terminating ceph vxattrs (Jeff Layton).  The convention
  with xattrs is to not terminate and this was causing inconsistencies
  with ceph-fuse.

- change filesystem time granularity from 1 us to 1 ns, again fixing
  an inconsistency with ceph-fuse (Luis Henriques).

On top of this there are some additional dentry name handling and cap
flushing fixes from Zheng.  Finally, Jeff is formally taking over for
Zheng as the filesystem maintainer.


Andrea Parri (1):
  ceph: fix improper use of smp_mb__before_atomic()

Christoph Hellwig (1):
  libceph: remove ceph_get_direct_page_vector()

Dan Carpenter (1):
  ceph: silence a checker warning in mdsc_show()

David Disseldorp (6):
  ceph: clean up ceph.dir.pin vxattr name sizeof()
  ceph: carry snapshot creation time with inodes
  ceph: add ceph.snap.btime vxattr
  ceph: fix listxattr vxattr buffer length calculation
  ceph: remove unused vxattr length helpers
  ceph: fix "ceph.dir.rctime" vxattr value

Hariprasad Kelam (1):
  ceph: fix warning PTR_ERR_OR_ZERO can be used

Ilya Dryomov (21):
  rbd: get rid of obj_req->xferred, obj_req->result and img_req->xferred
  rbd: replace obj_req->tried_parent with obj_req->read_state
  rbd: get rid of RBD_OBJ_WRITE_{FLAT,GUARD}
  rbd: move OSD request submission into object request state machines
  rbd: introduce image request state machine
  libceph: rename r_unsafe_item to r_private_item
  rbd: introduce obj_req->osd_reqs list
  rbd: factor out rbd_osd_setup_copyup()
  rbd: factor out __rbd_osd_setup_discard_ops()
  rbd: move OSD request allocation into object request state machines
  rbd: rename rbd_obj_setup_*() to rbd_obj_init_*()
  rbd: introduce copyup state machine
  rbd: lock should be quiesced on reacquire
  rbd: quiescing lock should wait for image requests
  rbd: new exclusive lock wait/wake code
  libceph: bump CEPH_MSG_MAX_DATA_LEN (again)
  libceph: change ceph_osdc_call() to take page vector for response
  libceph: export osd_req_op_data() macro
  rbd: call rbd_dev_mapping_set() from rbd_dev_image_probe()
  rbd: support for object-map and fast-diff
  rbd: setallochint only if object doesn't exist

Jeff Layton (22):
  libceph: fix sa_family just after reading address
  libceph: add ceph_decode_entity_addr
  libceph: ADDR2 support for monmap
  libceph: switch osdmap decoding to use ceph_decode_entity_addr
  libceph: fix watch_item_t decoding to use ceph_decode_entity_addr
  libceph: correctly decode ADDR2 addresses in incremental OSD maps
  ceph: have MDS map decoding use entity_addr_t decoder
  ceph: fix decode_locker to use ceph_decode_entity_addr
  libceph: use TYPE_LEGACY for entity addrs instead of TYPE_NONE
  libceph: rename ceph_encode_addr to ceph_encode_banner_addr
  ceph: add btime field to ceph_inode_info
  ceph: handle btime in cap messages
  libceph: turn on CEPH_FEATURE_MSG_ADDR2
  ceph: allow querying of STATX_BTIME in ceph_getattr
  iversion: add a routine to update a raw value with a larger one
  ceph: add change_attr field to ceph_inode_info
  ceph: handle change_attr in cap messages
  ceph

Re: linux-next: build failure after merge of the tip tree

2019-07-10 Thread Ilya Dryomov

On Wed, Jul 10, 2019 at 2:01 AM Stephen Rothwell  wrote:
>
> Hi all,
>
> On Tue, 9 Jul 2019 16:54:59 +1000 Stephen Rothwell wrote:
> >
> > After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
> > failed like this:
> >
> > drivers/block/rbd.c: In function 'wake_lock_waiters':
> > drivers/block/rbd.c:3933:2: error: implicit declaration of function 'lockdep_assert_held_exclusive'; did you mean 'lockdep_assert_held_write'? [-Werror=implicit-function-declaration]
> >   lockdep_assert_held_exclusive(&rbd_dev->lock_rwsem);
> >   ^
> >   lockdep_assert_held_write
> >
> > Caused by commit
> >
> >   9ffbe8ac05db ("locking/lockdep: Rename lockdep_assert_held_exclusive() -> lockdep_assert_held_write()")
> >
> > interacting with commits
> >
> >   637cd060537d ("rbd: new exclusive lock wait/wake code")
> >   a2b1da09793d ("rbd: lock should be quiesced on reacquire")
> >
> > from the ceph tree.
> >
> > I have added the following merge fix patch for today.
> >
> > From: Stephen Rothwell 
> > Date: Tue, 9 Jul 2019 16:46:12 +1000
> > Subject: [PATCH] rbd: fix up for lockdep_assert_held_exclusive rename
> >
> > Signed-off-by: Stephen Rothwell 
> > ---
> >  drivers/block/rbd.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> > index 723c3ef4bd59..02216fbdb854 100644
> > --- a/drivers/block/rbd.c
> > +++ b/drivers/block/rbd.c
> > @@ -3930,7 +3930,7 @@ static void wake_lock_waiters(struct rbd_device *rbd_dev, int result)
> >   struct rbd_img_request *img_req;
> >
> >   dout("%s rbd_dev %p result %d\n", __func__, rbd_dev, result);
> > - lockdep_assert_held_exclusive(&rbd_dev->lock_rwsem);
> > + lockdep_assert_held_write(&rbd_dev->lock_rwsem);
> >
> >   cancel_delayed_work(&rbd_dev->lock_dwork);
> >   if (!completion_done(&rbd_dev->acquire_wait)) {
> > @@ -4209,7 +4209,7 @@ static bool rbd_quiesce_lock(struct rbd_device *rbd_dev)
> >   bool need_wait;
> >
> >   dout("%s rbd_dev %p\n", __func__, rbd_dev);
> > - lockdep_assert_held_exclusive(&rbd_dev->lock_rwsem);
> > + lockdep_assert_held_write(&rbd_dev->lock_rwsem);
> >
> >   if (rbd_dev->lock_state != RBD_LOCK_STATE_LOCKED)
> >   return false;
>
> This fix now needs to be applied to the merge of the ceph tree.

Hi Stephen,

Yes, that is what I figured would happen.  I assume you would keep
carrying this fixup until the ceph tree is merged.

Thanks,

Ilya

[GIT PULL] Ceph fix for 5.2-rc7

2019-06-28 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 4b972a01a7da614b4796475f933094751a295a2f:

  Linux 5.2-rc6 (2019-06-22 16:01:36 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.2-rc7

for you to fetch changes up to d6b8bd679c9c8856fa04b80490765c43a4cb613b:

  ceph: fix ceph_mdsc_build_path to not stop on first component (2019-06-27 18:27:36 +0200)


A small fix for a potential -rc1 regression from Jeff.


Jeff Layton (1):
  ceph: fix ceph_mdsc_build_path to not stop on first component

 fs/ceph/mds_client.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Re: [PATCH v4 3/3] ceph: don't NULL terminate virtual xattrs

2019-06-25 Thread Ilya Dryomov

On Mon, Jun 24, 2019 at 6:27 PM Jeff Layton  wrote:
>
> The convention with xattrs is to not store the termination with string
> data, given that it returns the length. This is how setfattr/getfattr
> operate.
>
> Most of ceph's virtual xattr routines use snprintf to plop the string
> directly into the destination buffer, but snprintf always NULL
> terminates the string. This means that if we send the kernel a buffer
> that is the exact length needed to hold the string, it'll end up
> truncated.
>
> Add a ceph_fmt_xattr helper function to format the string into an
> on-stack buffer that should always be large enough to hold the whole
> thing and then memcpy the result into the destination buffer. If it does
> turn out that the formatted string won't fit in the on-stack buffer,
> then return -E2BIG and do a WARN_ONCE().
>
> Change over most of the virtual xattr routines to use the new helper. A
> couple of the xattrs are sourced from strings however, and it's
> difficult to know how long they'll be. Just have those memcpy the result
> in place after verifying the length.
>
> Signed-off-by: Jeff Layton 
> ---
>  fs/ceph/xattr.c | 84 ++---
>  1 file changed, 59 insertions(+), 25 deletions(-)
>
> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> index 9b77dca0b786..37b458a9af3a 100644
> --- a/fs/ceph/xattr.c
> +++ b/fs/ceph/xattr.c
> @@ -109,22 +109,49 @@ static ssize_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val,
> return ret;
>  }
>
> +/*
> + * The convention with strings in xattrs is that they should not be NULL
> + * terminated, since we're returning the length with them. snprintf always
> + * NULL terminates however, so call it on a temporary buffer and then memcpy
> + * the result into place.
> + */
> +static int ceph_fmt_xattr(char *val, size_t size, const char *fmt, ...)
> +{
> +   int ret;
> +   va_list args;
> +   char buf[96]; /* NB: reevaluate size if new vxattrs are added */
> +
> +   va_start(args, fmt);
> +   ret = vsnprintf(buf, size ? sizeof(buf) : 0, fmt, args);
> +   va_end(args);
> +
> +   /* Sanity check */
> +   if (size && ret + 1 > sizeof(buf)) {
> +   WARN_ONCE(true, "Returned length too big (%d)", ret);
> +   return -E2BIG;
> +   }
> +
> +   if (ret <= size)
> +   memcpy(val, buf, ret);
> +   return ret;
> +}

Nit: perhaps check size at the top and bail early instead of checking
it at every step?
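
Something along these lines, written as a user-space sketch rather than a drop-in replacement: fmt_xattr_sketch() is a hypothetical name, the 96-byte buffer is taken from the patch, and the early return for size == 0 is just one way to do the check once at the top. The exact-length case still works because only ret bytes are copied, without the terminator.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: format into a temporary buffer, then copy the
 * result without its terminating NUL. Returns the formatted length,
 * or a negative errno value.
 */
static int fmt_xattr_sketch(char *val, size_t size, const char *fmt, ...)
{
	int ret;
	va_list args;
	char buf[96];

	va_start(args, fmt);
	if (!size) {
		/* Length query only: bail early, nothing is written. */
		ret = vsnprintf(NULL, 0, fmt, args);
		va_end(args);
		return ret < 0 ? -EINVAL : ret;
	}
	ret = vsnprintf(buf, sizeof(buf), fmt, args);
	va_end(args);

	if (ret < 0)
		return -EINVAL;
	if ((size_t)ret >= sizeof(buf))
		return -E2BIG;		/* would not fit the temporary buffer */

	assert(val != NULL);		/* a sized request needs a destination */
	if ((size_t)ret <= size)
		memcpy(val, buf, ret);	/* no NUL terminator copied */
	return ret;
}
```

As in the patch, a destination that is too small gets nothing copied and the caller still sees the required length in the return value.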

Thanks,

Ilya

Re: [PATCH v3 1/2] ceph: fix buffer length handling in virtual xattrs

2019-06-24 Thread Ilya Dryomov

On Mon, Jun 24, 2019 at 12:26 PM Jeff Layton  wrote:
>
> On Mon, 2019-06-24 at 12:00 +0200, Ilya Dryomov wrote:
> > On Fri, Jun 21, 2019 at 4:18 PM Jeff Layton  wrote:
> > > The convention with xattrs is to not store the termination with string
> > > data, given that it returns the length. This is how setfattr/getfattr
> > > operate.
> > >
> > > Most of ceph's virtual xattr routines use snprintf to plop the string
> > > directly into the destination buffer, but snprintf always NULL
> > > terminates the string. This means that if we send the kernel a buffer
> > > that is the exact length needed to hold the string, it'll end up
> > > truncated.
> > >
> > > Add new routines to format the string into an on-stack buffer that is
> > > always large enough to hold the whole thing and then memcpy the result
> > > into the destination buffer. Then, change over the virtual xattr
> > > routines to use the new helper functions as appropriate.
> > >
> > > Finally, make the code return ERANGE if the destination buffer size was
> > > too small to hold the returned value.
> > >
> > > Signed-off-by: Jeff Layton 
> > > ---
> > >  fs/ceph/xattr.c | 103 
> > >  1 file changed, 78 insertions(+), 25 deletions(-)
> > >
> > > diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> > > index 6621d27e64f5..359d3cbbb37b 100644
> > > --- a/fs/ceph/xattr.c
> > > +++ b/fs/ceph/xattr.c
> > > @@ -112,22 +112,47 @@ static size_t ceph_vxattrcb_layout(struct 
> > > ceph_inode_info *ci, char *val,
> > > return ret;
> > >  }
> > >
> > > +/* Enough to hold any possible expression of integer TYPE in base 10 */
> > > +#define INT_STR_SIZE(_type)3*sizeof(_type)+2
> > > +
> > > +/*
> > > + * snprintf always NULL terminates, but we need for xattrs to not be. For
> > > + * the integer vxattrs, just create an on-stack buffer for snprintf's
> > > + * destination, and just don't copy the termination to the actual buffer.
> > > + */
> > > > +#define GENERATE_XATTR_INT_FORMATTER(_lbl, _format, _type)                  \
> > > > +static size_t format_ ## _lbl ## _xattr(char *val, size_t size, _type src)  \
> > > > +{                                                                           \
> > > > +   size_t ret;                                                              \
> > > > +   char buf[INT_STR_SIZE(_type)];                                           \
> > > > +                                                                            \
> > > > +   ret = snprintf(buf, size ? sizeof(buf) : 0, _format, src);               \
> > > > +   if (ret <= size)                                                         \
> > > > +   memcpy(val, buf, ret);                                                   \
> > > > +   return ret;                                                              \
> > > > +}
> > > +
> > > +GENERATE_XATTR_INT_FORMATTER(u, "%u", unsigned int)
> > > +GENERATE_XATTR_INT_FORMATTER(d, "%d", int)
> > > +GENERATE_XATTR_INT_FORMATTER(lld, "%lld", long long)
> > > +GENERATE_XATTR_INT_FORMATTER(llu, "%llu", unsigned long long)
> > > +
> > > >  static size_t ceph_vxattrcb_layout_stripe_unit(struct ceph_inode_info *ci,
> > >char *val, size_t size)
> > >  {
> > > -   return snprintf(val, size, "%u", ci->i_layout.stripe_unit);
> > > +   return format_u_xattr(val, size, ci->i_layout.stripe_unit);
> > >  }
> > >
> > > >  static size_t ceph_vxattrcb_layout_stripe_count(struct ceph_inode_info *ci,
> > > char *val, size_t size)
> > >  {
> > > -   return snprintf(val, size, "%u", ci->i_layout.stripe_count);
> > > +   return format_u_xattr(val, size, ci->i_layout.stripe_count);
> > >  }
> > >
> > > >  static size_t ceph_vxattrcb_layout_object_size(struct ceph_inode_info *ci,
> > >char *val, size_t size)
> > >  {
> > > -   return snprintf(val, size, "%u", ci->i_layout.object_

Re: [PATCH v3 1/2] ceph: fix buffer length handling in virtual xattrs

2019-06-24 Thread Ilya Dryomov

On Fri, Jun 21, 2019 at 4:18 PM Jeff Layton  wrote:
>
> The convention with xattrs is to not store the termination with string
> data, given that it returns the length. This is how setfattr/getfattr
> operate.
>
> Most of ceph's virtual xattr routines use snprintf to plop the string
> directly into the destination buffer, but snprintf always NULL
> terminates the string. This means that if we send the kernel a buffer
> that is the exact length needed to hold the string, it'll end up
> truncated.
>
> Add new routines to format the string into an on-stack buffer that is
> always large enough to hold the whole thing and then memcpy the result
> into the destination buffer. Then, change over the virtual xattr
> routines to use the new helper functions as appropriate.
>
> Finally, make the code return ERANGE if the destination buffer size was
> too small to hold the returned value.
>
> Signed-off-by: Jeff Layton 
> ---
>  fs/ceph/xattr.c | 103 
>  1 file changed, 78 insertions(+), 25 deletions(-)
>
> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c
> index 6621d27e64f5..359d3cbbb37b 100644
> --- a/fs/ceph/xattr.c
> +++ b/fs/ceph/xattr.c
> @@ -112,22 +112,47 @@ static size_t ceph_vxattrcb_layout(struct ceph_inode_info *ci, char *val,
> return ret;
>  }
>
> +/* Enough to hold any possible expression of integer TYPE in base 10 */
> +#define INT_STR_SIZE(_type)3*sizeof(_type)+2
> +
> +/*
> + * snprintf always NULL terminates, but we need for xattrs to not be. For
> + * the integer vxattrs, just create an on-stack buffer for snprintf's
> + * destination, and just don't copy the termination to the actual buffer.
> + */
> +#define GENERATE_XATTR_INT_FORMATTER(_lbl, _format, _type)  \
> +static size_t format_ ## _lbl ## _xattr(char *val, size_t size, _type src)   \
> +{   \
> +   size_t ret;  \
> +   char buf[INT_STR_SIZE(_type)];   \
> +\
> +   ret = snprintf(buf, size ? sizeof(buf) : 0, _format, src);   \
> +   if (ret <= size) \
> +   memcpy(val, buf, ret);   \
> +   return ret;  \
> +}
> +
> +GENERATE_XATTR_INT_FORMATTER(u, "%u", unsigned int)
> +GENERATE_XATTR_INT_FORMATTER(d, "%d", int)
> +GENERATE_XATTR_INT_FORMATTER(lld, "%lld", long long)
> +GENERATE_XATTR_INT_FORMATTER(llu, "%llu", unsigned long long)
> +
>  static size_t ceph_vxattrcb_layout_stripe_unit(struct ceph_inode_info *ci,
>char *val, size_t size)
>  {
> -   return snprintf(val, size, "%u", ci->i_layout.stripe_unit);
> +   return format_u_xattr(val, size, ci->i_layout.stripe_unit);
>  }
>
>  static size_t ceph_vxattrcb_layout_stripe_count(struct ceph_inode_info *ci,
> char *val, size_t size)
>  {
> -   return snprintf(val, size, "%u", ci->i_layout.stripe_count);
> +   return format_u_xattr(val, size, ci->i_layout.stripe_count);
>  }
>
>  static size_t ceph_vxattrcb_layout_object_size(struct ceph_inode_info *ci,
>char *val, size_t size)
>  {
> -   return snprintf(val, size, "%u", ci->i_layout.object_size);
> +   return format_u_xattr(val, size, ci->i_layout.object_size);
>  }
>
>  static size_t ceph_vxattrcb_layout_pool(struct ceph_inode_info *ci,
> @@ -141,10 +166,14 @@ static size_t ceph_vxattrcb_layout_pool(struct ceph_inode_info *ci,
>
> down_read(&osdc->lock);
> pool_name = ceph_pg_pool_name_by_id(osdc->osdmap, pool);
> -   if (pool_name)
> -   ret = snprintf(val, size, "%s", pool_name);
> -   else
> -   ret = snprintf(val, size, "%lld", (unsigned long long)pool);
> +   if (pool_name) {
> +   ret = strlen(pool_name);
> +
> +   if (ret <= size)
> +   memcpy(val, pool_name, ret);
> +   } else {
> +   ret = format_lld_xattr(val, size, pool);
> +   }
> up_read(&osdc->lock);
> return ret;
>  }
> @@ -155,7 +184,11 @@ static size_t ceph_vxattrcb_layout_pool_namespace(struct 
> ceph_inode_info *ci,
> int ret = 0;
> struct ceph_string *ns = ceph_try_get_string(ci->i_layout.pool_ns);
> if (ns) {
> -   ret = snprintf(val, size, "%.*s", (int)ns->len, ns->str);
> +   ret = ns->len;
> +
> +   if (ret <= size)
> +   memcpy(val, ns->str, ns->len);
> +
> ceph_put_string(ns);
> }
> return ret;
> @@ -166,50 +199,61 @@ static size_t 
> ceph_vxattrcb_layout_pool_nam
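
The integer formatter quoted in the patch above can be tried outside the kernel. What follows is only a userspace sketch of the same idea: format into an on-stack scratch buffer, then copy the digits without the terminating NUL when the caller's buffer is large enough. The name format_u_xattr_sketch and the SCRATCH_SIZE constant are assumptions of this sketch (the patch sizes the buffer with INT_STR_SIZE(_type), which is not reproduced here).

```c
#include <stdio.h>
#include <string.h>

/* Assumption: a generous stand-in for INT_STR_SIZE(_type) from the patch. */
#define SCRATCH_SIZE 32

/*
 * Format src as decimal text.  Returns the number of characters needed.
 * Copies them into val, without a trailing NUL, only if they fit in size.
 */
static size_t format_u_xattr_sketch(char *val, size_t size, unsigned int src)
{
	char buf[SCRATCH_SIZE];
	size_t ret;

	/* size == 0 is a pure length query: snprintf() writes nothing. */
	ret = snprintf(buf, size ? sizeof(buf) : 0, "%u", src);
	if (ret <= size)
		memcpy(val, buf, ret);
	return ret;
}
```

Calling it with a zero size reports the required length without touching val, which is the behaviour a getxattr length probe relies on.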

[GIT PULL] Ceph fixes for 5.2-rc4

2019-06-08 Thread Ilya Dryomov

Hi Linus,

The following changes since commit f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a:

  Linux 5.2-rc3 (2019-06-02 13:55:33 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.2-rc4

for you to fetch changes up to 7b2f936fc8282ab56d4d21247f2f9c21607c085c:

  ceph: fix error handling in ceph_get_caps() (2019-06-05 20:34:39 +0200)


A change to call iput() asynchronously to avoid a possible deadlock
when iput_final() needs to wait for in-flight I/O (e.g. readahead) and
a fixup for a cleanup that went into -rc1.


Yan, Zheng (3):
  ceph: single workqueue for inode related works
  ceph: avoid iput_final() while holding mutex or in dispatch thread
  ceph: fix error handling in ceph_get_caps()

 fs/ceph/caps.c   |  34 ++-
 fs/ceph/file.c   |   2 +-
 fs/ceph/inode.c  | 155 +++
 fs/ceph/mds_client.c |  28 ++
 fs/ceph/quota.c  |   9 ++-
 fs/ceph/snap.c   |  16 --
 fs/ceph/super.c  |  28 +++---
 fs/ceph/super.h  |  19 ---
 8 files changed, 156 insertions(+), 135 deletions(-)

[GIT PULL] Ceph updates for 5.2-rc1

2019-05-16 Thread Ilya Dryomov

Hi Linus,

The following changes since commit e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd:

  Linux 5.1 (2019-05-05 17:42:58 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.2-rc1

for you to fetch changes up to 00abf69dd24fd185982379c5cc3bb7b6d1fc:

  ceph: flush dirty inodes before proceeding with remount (2019-05-07 19:43:05 +0200)


On the filesystem side we have:

- a fix to enforce quotas set above the mount point (Luis Henriques)

- support for exporting snapshots through NFS (Zheng Yan)

- proper statx implementation (Jeff Layton).  statx flags are mapped
  to MDS caps, with AT_STATX_{DONT,FORCE}_SYNC taken into account.

- some follow-up dentry name handling fixes, in particular elimination
  of our hand-rolled helper and the switch to __getname() as suggested
  by Al (Jeff Layton)

- a set of MDS client cleanups in preparation for async MDS requests
  in the future (Jeff Layton)

- a fix to sync the filesystem before remounting (Jeff Layton)

On the rbd side, work is on-going on object-map and fast-diff image
features.


Arnd Bergmann (3):
  rbd: avoid clang -Wuninitialized warning
  rbd: convert all rbd_assert(0) to BUG()
  libceph: fix clang warning for CEPH_DEFINE_OID_ONSTACK

Ilya Dryomov (2):
  rbd: client_mutex is never nested
  rbd: don't assert on writes to snapshots

Jeff Layton (20):
  ceph: remove superfluous inode_lock in ceph_fsync
  ceph: properly handle granular statx requests
  ceph: fix NULL pointer deref when debugging is enabled
  ceph: make iterate_session_caps a public symbol
  ceph: dump granular cap info in "caps" debugfs file
  ceph: fix potential use-after-free in ceph_mdsc_build_path
  ceph: use ceph_mdsc_build_path instead of clone_dentry_name
  ceph: use __getname/__putname in ceph_mdsc_build_path
  ceph: use pathlen values returned by set_request_path_attr
  ceph: after an MDS request, do callback and completions
  ceph: have ceph_mdsc_do_request call ceph_mdsc_submit_request
  ceph: move wait for mds request into helper function
  ceph: fix comment over ceph_drop_caps_for_unlink
  ceph: simplify arguments and return semantics of try_get_cap_refs
  ceph: just call get_session in __ceph_lookup_mds_session
  ceph: print inode number in __caps_issued_mask debugging messages
  libceph: fix unaligned accesses in ceph_entity_addr handling
  libceph: make ceph_pr_addr take an struct ceph_entity_addr pointer
  ceph: fix unaligned access in ceph_send_cap_releases
  ceph: flush dirty inodes before proceeding with remount

Luis Henriques (2):
  ceph: factor out ceph_lookup_inode()
  ceph: quota: fix quota subdir mounts

Yan, Zheng (1):
  ceph: snapshot nfs re-export

Zhi Zhang (1):
  ceph: remove duplicated filelock ref increase

 drivers/block/rbd.c|  24 +--
 fs/ceph/caps.c |  93 +--
 fs/ceph/debugfs.c  |  40 -
 fs/ceph/export.c   | 356 ++---
 fs/ceph/file.c |   2 +-
 fs/ceph/inode.c|  85 ++
 fs/ceph/locks.c|  13 --
 fs/ceph/mds_client.c   | 205 ++--
 fs/ceph/mds_client.h   |  33 +++-
 fs/ceph/mdsmap.c   |   2 +-
 fs/ceph/quota.c| 177 ++--
 fs/ceph/super.c|   7 +
 fs/ceph/super.h|   2 +
 include/linux/ceph/ceph_fs.h   |   6 +
 include/linux/ceph/messenger.h |   3 +-
 include/linux/ceph/osdmap.h|  13 +-
 net/ceph/cls_lock_client.c |   2 +-
 net/ceph/debugfs.c |   4 +-
 net/ceph/messenger.c   | 121 +++---
 net/ceph/mon_client.c  |   6 +-
 net/ceph/osd_client.c  |   2 +-
 21 files changed, 845 insertions(+), 351 deletions(-)

[GIT PULL] Ceph fixes for 5.1-rc7

2019-04-25 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 085b7755808aa11f78ab9377257e1dad2e6fa4bb:

  Linux 5.1-rc6 (2019-04-21 10:45:57 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc7

for you to fetch changes up to 37659182bff1eeaaeadcfc8f853c6d2b6dbc3f47:

  ceph: fix ci->i_head_snapc leak (2019-04-23 21:37:54 +0200)


dentry name handling fixes from Jeff and a memory leak fix from Zheng.
Both are old issues, marked for stable.


Jeff Layton (3):
  ceph: only use d_name directly when parent is locked
  ceph: ensure d_name stability in ceph_dentry_hash()
  ceph: handle the case where a dentry has been renamed on outstanding req

Yan, Zheng (1):
  ceph: fix ci->i_head_snapc leak

 fs/ceph/dir.c|  6 -
 fs/ceph/inode.c  | 16 +++-
 fs/ceph/mds_client.c | 70 +++-
 fs/ceph/snap.c   |  7 +-
 4 files changed, 85 insertions(+), 14 deletions(-)

[GIT PULL] Ceph fixes for 5.1-rc3

2019-03-29 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 8c2ffd9174779014c3fe1f96d9dc3641d9175f00:

  Linux 5.1-rc2 (2019-03-24 14:02:26 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc3

for you to fetch changes up to daf5cc27eed99afdea8d96e71b89ba41f5406ef6:

  ceph: fix use-after-free on symlink traversal (2019-03-27 19:00:37 +0100)


A patch to avoid choking on multipage bvecs in the messenger and
a small use-after-free fix.


Al Viro (1):
  ceph: fix use-after-free on symlink traversal

Ilya Dryomov (1):
  libceph: fix breakage caused by multipage bvecs

 fs/ceph/inode.c  | 2 +-
 net/ceph/messenger.c | 8 ++--
 2 files changed, 7 insertions(+), 3 deletions(-)

Re: [PATCH] [v2] ceph: fix clang warning for CEPH_DEFINE_OID_ONSTACK

2019-03-27 Thread Ilya Dryomov

On Mon, Mar 25, 2019 at 1:51 PM Arnd Bergmann  wrote:
>
> clang complains about assigning a variable to itself during the
> declaration:
>
> fs/ceph/ioctl.c:187:26: error: variable 'oid' is uninitialized when used 
> within its own initialization [-Werror,-Wuninitialized]
> CEPH_DEFINE_OID_ONSTACK(oid);
> ^~~
> include/linux/ceph/osdmap.h:122:52: note: expanded from macro 
> 'CEPH_DEFINE_OID_ONSTACK'
> struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid)
>   ~~~ ^~~
> include/linux/ceph/osdmap.h:120:29: note: expanded from macro 
> 'CEPH_OID_INIT_ONSTACK'
> ({ ceph_oid_init(&oid); oid; })
> ^~~
>
> We use this trick in other places, but it is completely unnecessary
> here, as we can just use a regular struct initializer.
>
> Signed-off-by: Arnd Bergmann 
> ---
> v2: rearrange to only have one instance of the initializer
> ---
>  include/linux/ceph/osdmap.h | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/include/linux/ceph/osdmap.h b/include/linux/ceph/osdmap.h
> index 5675b1f09bc5..8794cf0f0b39 100644
> --- a/include/linux/ceph/osdmap.h
> +++ b/include/linux/ceph/osdmap.h
> @@ -110,17 +110,16 @@ struct ceph_object_id {
> int name_len;
>  };
>
> +#define CEPH_OID_INITIALIZER(oid) { .name = (oid).inline_name }
> +
> +#define CEPH_DEFINE_OID_ONSTACK(oid)   \
> +   struct ceph_object_id oid = CEPH_OID_INITIALIZER(oid)
> +
>  static inline void ceph_oid_init(struct ceph_object_id *oid)
>  {
> -   oid->name = oid->inline_name;
> -   oid->name_len = 0;
> +   *oid = (struct ceph_object_id)CEPH_OID_INITIALIZER(*oid);
>  }
>
> -#define CEPH_OID_INIT_ONSTACK(oid) \
> -({ ceph_oid_init(&oid); oid; })
> -#define CEPH_DEFINE_OID_ONSTACK(oid)   \
> -   struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid)
> -
>  static inline bool ceph_oid_empty(const struct ceph_object_id *oid)
>  {
> return oid->name == oid->inline_name && !oid->name_len;

Applied.

Thanks,

Ilya
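
For readers who want to see the v2 approach in isolation, here is a minimal userspace sketch, not the kernel header itself. It keeps a single initializer macro and derives both the on-stack definition and the init function from it, which is the "only one instance of the initializer" point in the v2 note. The names object_id, OID_INITIALIZER, DEFINE_OID_ONSTACK, oid_init, oid_empty and the OID_INLINE_LEN value are made up for this sketch.

```c
#include <stddef.h>

#define OID_INLINE_LEN 32	/* hypothetical size, chosen for the sketch */

struct object_id {
	char *name;
	char inline_name[OID_INLINE_LEN];
	int name_len;
};

/* The one place that says what an initialized object looks like. */
#define OID_INITIALIZER(oid) { .name = (oid).inline_name }

/* On-stack definition: only the address of a member is taken, no value is read. */
#define DEFINE_OID_ONSTACK(oid) \
	struct object_id oid = OID_INITIALIZER(oid)

/* Re-initialization reuses the same initializer through a compound literal. */
static inline void oid_init(struct object_id *oid)
{
	*oid = (struct object_id)OID_INITIALIZER(*oid);
}

static inline int oid_empty(const struct object_id *oid)
{
	return oid->name == oid->inline_name && !oid->name_len;
}
```

Members not named in the initializer are zeroed, so name_len starts at 0 in both paths without being spelled out.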

[PATCH] dm table: propagate BDI_CAP_STABLE_WRITES

2019-03-26 Thread Ilya Dryomov

Some devices don't use blk_integrity but still want stable pages
because they do their own checksumming.  Examples include rbd and iSCSI
when data digests are negotiated.  Stacking DM (and thus LVM) on top of
these devices results in sporadic checksum errors.

Set BDI_CAP_STABLE_WRITES if any underlying device has it set.

Signed-off-by: Ilya Dryomov 
---
 drivers/md/dm-table.c | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index ba9481f1bf3c..cde3b49b2a91 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1844,6 +1844,36 @@ static bool dm_table_supports_secure_erase(struct 
dm_table *t)
return true;
 }
 
+static int device_requires_stable_pages(struct dm_target *ti,
+   struct dm_dev *dev, sector_t start,
+   sector_t len, void *data)
+{
+   struct request_queue *q = bdev_get_queue(dev->bdev);
+
+   return q && bdi_cap_stable_pages_required(q->backing_dev_info);
+}
+
+/*
+ * If any underlying device requires stable pages, a table must require
+ * them as well.  Only targets that support iterate_devices are considered:
+ * don't want error, zero, etc to require stable pages.
+ */
+static bool dm_table_requires_stable_pages(struct dm_table *t)
+{
+   struct dm_target *ti;
+   unsigned i;
+
+   for (i = 0; i < dm_table_get_num_targets(t); i++) {
+   ti = dm_table_get_target(t, i);
+
+   if (ti->type->iterate_devices &&
+   ti->type->iterate_devices(ti, device_requires_stable_pages, NULL))
+   return true;
+   }
+
+   return false;
+}
+
 void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
   struct queue_limits *limits)
 {
@@ -1896,6 +1926,15 @@ void dm_table_set_restrictions(struct dm_table *t, 
struct request_queue *q,
 
dm_table_verify_integrity(t);
 
+   /*
+* Some devices don't use blk_integrity but still want stable pages
+* because they do their own checksumming.
+*/
+   if (dm_table_requires_stable_pages(t))
+   q->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES;
+   else
+   q->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES;
+
/*
 * Determine whether or not this queue's I/O timings contribute
 * to the entropy pool, Only request-based targets use this.
-- 
2.19.2
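
The propagation rule in this patch is "set the flag if any underlying device has it, clear it otherwise". A small userspace model of that rule, under simplified assumptions, might look like the sketch below. The types sketch_dev and sketch_target, the CAP_STABLE_WRITES value and both function names are hypothetical; a target with a NULL device list stands in for a target type without iterate_devices (error, zero, and so on).

```c
#include <stdbool.h>
#include <stddef.h>

#define CAP_STABLE_WRITES 0x1u	/* hypothetical flag value */

struct sketch_dev {
	unsigned int capabilities;
};

struct sketch_target {
	const struct sketch_dev *devs;	/* NULL: target cannot iterate devices */
	size_t num_devs;
};

/* True if any device under this target wants stable pages. */
static bool target_requires_stable_pages(const struct sketch_target *ti)
{
	size_t i;

	if (!ti->devs)
		return false;

	for (i = 0; i < ti->num_devs; i++)
		if (ti->devs[i].capabilities & CAP_STABLE_WRITES)
			return true;

	return false;
}

/* Returns caps with the stable-writes bit set or cleared for the whole table. */
static unsigned int table_capabilities(unsigned int caps,
				       const struct sketch_target *targets,
				       size_t num_targets)
{
	size_t i;

	for (i = 0; i < num_targets; i++)
		if (target_requires_stable_pages(&targets[i]))
			return caps | CAP_STABLE_WRITES;

	return caps & ~CAP_STABLE_WRITES;
}
```

Clearing the bit in the else branch matters when a table is reloaded without the device that needed stable pages.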

Re: ceph: fix use-after-free on symlink traversal

2019-03-26 Thread Ilya Dryomov

On Tue, Mar 26, 2019 at 2:39 AM Al Viro  wrote:
>
> free the symlink body after the same RCU delay we have for freeing the
> struct inode itself, so that traversal during RCU pathwalk wouldn't step
> into freed memory.
>
> Signed-off-by: Al Viro 
> ---
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index e3346628efe2..2d61ddda9bf5 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -524,6 +524,7 @@ static void ceph_i_callback(struct rcu_head *head)
> struct inode *inode = container_of(head, struct inode, i_rcu);
> struct ceph_inode_info *ci = ceph_inode(inode);
>
> +   kfree(ci->i_symlink);
> kmem_cache_free(ceph_inode_cachep, ci);
>  }
>
> @@ -566,7 +567,6 @@ void ceph_destroy_inode(struct inode *inode)
> }
> }
>
> -   kfree(ci->i_symlink);
> while ((n = rb_first(&ci->i_fragtree)) != NULL) {
> frag = rb_entry(n, struct ceph_inode_frag, node);
> rb_erase(n, &ci->i_fragtree);

Al, I see you directed this patch at Linus instead of ceph-devel.
I can pick it up for -rc3 as I have an important libceph fix pending
anyway.  Let me know if you want me to handle it.

Thanks,

Ilya

Re: [PATCH] rbd: avoid clang -Wuninitialized warning

2019-03-25 Thread Ilya Dryomov

On Fri, Mar 22, 2019 at 5:55 PM Arnd Bergmann  wrote:
>
> On Fri, Mar 22, 2019 at 5:33 PM Ilya Dryomov  wrote:
> >
> > On Fri, Mar 22, 2019 at 3:36 PM Arnd Bergmann  wrote:
> > >
> > > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> > > index 4ba967d65cf9..cbcc3baf3807 100644
> > > --- a/drivers/block/rbd.c
> > > +++ b/drivers/block/rbd.c
> > > @@ -2399,7 +2399,7 @@ static int rbd_obj_read_from_parent(struct 
> > > rbd_obj_request *obj_req)
> > >   &obj_req->bvec_pos);
> > > break;
> > > default:
> > > -   rbd_assert(0);
> > > +   BUG();
> > > }
> > > } else {
> > > ret = rbd_img_fill_from_bvecs(child_img_req,
> >
> > Hi Arnd,
> >
> > You did a couple of these last year in commit c6244b3b2377 ("rbd: avoid
> > Wreturn-type warnings").
>
> Ah, I had completely forgotten about that. Different bug and different
> compiler, but same solution ;-)
>
> >  Let's change all of those default cases to BUG
> > in one go.  Do you want to do that or should I?
>
> I've prepared another patch now and sent it out, please
> apply it on top. I'd like this one-line patch to stay separate though
> since it captures the warning message and may need to
> be backported to stable kernels later.

Applied.

Thanks,

Ilya

[GIT PULL] Ceph fixes for 5.1-rc2

2019-03-22 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b:

  Linux 5.1-rc1 (2019-03-17 14:22:26 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc2

for you to fetch changes up to 9d4a227f6ef189cf37eb22641f6ee788b7dc41bb:

  rbd: drop wait_for_latest_osdmap() (2019-03-20 16:27:40 +0100)


A follow up for the new alloc_size logic and a blacklisting fix, marked
for stable.


Ilya Dryomov (3):
  rbd: set io_min, io_opt and discard_granularity to alloc_size
  libceph: wait for latest osdmap in ceph_monc_blacklist_add()
  rbd: drop wait_for_latest_osdmap()

 drivers/block/rbd.c  | 28 ++--
 include/linux/ceph/libceph.h |  2 ++
 net/ceph/ceph_common.c   | 18 +-
 net/ceph/mon_client.c|  9 +
 4 files changed, 34 insertions(+), 23 deletions(-)

Re: [PATCH] ceph: fix clang warning for CEPH_DEFINE_OID_ONSTACK

2019-03-22 Thread Ilya Dryomov

On Fri, Mar 22, 2019 at 3:08 PM Arnd Bergmann  wrote:
>
> clang complains about assigning a variable to itself during the
> declaration:
>
> fs/ceph/ioctl.c:187:26: error: variable 'oid' is uninitialized when used 
> within its own initialization [-Werror,-Wuninitialized]
> CEPH_DEFINE_OID_ONSTACK(oid);
> ^~~
> include/linux/ceph/osdmap.h:122:52: note: expanded from macro 
> 'CEPH_DEFINE_OID_ONSTACK'
> struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid)
>   ~~~ ^~~
> include/linux/ceph/osdmap.h:120:29: note: expanded from macro 
> 'CEPH_OID_INIT_ONSTACK'
> ({ ceph_oid_init(&oid); oid; })
> ^~~
>
> We use this trick in other places, but it is completely unnecessary
> here, as we can just use a regular struct initializer.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  include/linux/ceph/osdmap.h | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/include/linux/ceph/osdmap.h b/include/linux/ceph/osdmap.h
> index 5675b1f09bc5..82f957a7a0d6 100644
> --- a/include/linux/ceph/osdmap.h
> +++ b/include/linux/ceph/osdmap.h
> @@ -116,10 +116,8 @@ static inline void ceph_oid_init(struct ceph_object_id 
> *oid)
> oid->name_len = 0;
>  }
>
> -#define CEPH_OID_INIT_ONSTACK(oid) \
> -({ ceph_oid_init(&oid); oid; })
>  #define CEPH_DEFINE_OID_ONSTACK(oid)   \
> -   struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid)
> +   struct ceph_object_id oid = { .name = oid.inline_name }
>
>  static inline bool ceph_oid_empty(const struct ceph_object_id *oid)
>  {

Hi Arnd,

I don't like this because the initialization is no longer contained to
ceph_oid_init().  Now there are two things to patch instead of one.

How is this going to be fixed in other places?

Thanks,

Ilya

[GIT PULL] Ceph updates for 5.1-rc1

2019-03-12 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 1c163f4c7b3f621efff9b28a47abb36f7378d783:

  Linux 5.0 (2019-03-03 15:21:29 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc1

for you to fetch changes up to d11ae8e0a76afc506071831854348f2ea1f3290e:

  Documentation: modern versions of ceph are not backed by btrfs (2019-03-05 18:55:18 +0100)


The highlights are:

- rbd will now ignore discards that aren't aligned and big enough to
  actually free up some space (myself).  This is controlled by the new
  alloc_size map option and can be disabled if needed.

- support for rbd deep-flatten feature (myself).  Deep-flatten allows
  "rbd flatten" to fully disconnect the clone image and its snapshots
  from the parent and make the parent snapshot removable.

- a new round of cap handling improvements (Zheng Yan).  The kernel
  client should now be much more prompt about releasing its caps and
  it is possible to put a limit on the number of caps held.

- support for getting ceph.dir.pin extended attribute (Zheng Yan)


Gustavo A. R. Silva (1):
  libceph: use struct_size() for kmalloc() in crush_decode()

Ilya Dryomov (11):
  rbd: get rid of obj_req->obj_request_count
  rbd: handle DISCARD and WRITE_ZEROES separately
  rbd: round off and ignore discards that are too small
  rbd: remove experimental designation from kernel layering
  rbd: clear ->xferred on error from rbd_obj_issue_copyup()
  rbd: factor out __rbd_osd_req_create()
  rbd: stop copying num_osd_ops in rbd_obj_issue_copyup()
  rbd: introduce rbd_obj_issue_copyup_ops()
  rbd: copyup with an empty snapshot context (aka deep-copyup)
  rbd: whole-object write and zeroout should copyup when snapshots exist
  rbd: advertise support for RBD_FEATURE_DEEP_FLATTEN

Jeff Layton (1):
  Documentation: modern versions of ceph are not backed by btrfs

Yan, Zheng (12):
  ceph: set special inode's blocksize to page size
  ceph: decode feature bits in session message
  ceph: split large reconnect into multiple messages
  ceph: map snapid to anonymous bdev ID
  ceph: support versioned reply
  ceph: support getting ceph.dir.pin vxattr
  ceph: send cap releases more aggressively
  ceph: touch existing cap when handling reply
  ceph: remove dentry_lru file from debugfs
  ceph: delete stale dentry when last reference is dropped
  ceph: periodically trim stale dentries
  ceph: add mount option to limit caps count

zhengbin (1):
  ceph: pass inclusive lend parameter to filemap_write_and_wait_range()

 Documentation/filesystems/ceph.txt |  14 +-
 drivers/block/rbd.c| 400 +++--
 fs/ceph/caps.c |  72 ++--
 fs/ceph/debugfs.c  |  27 --
 fs/ceph/dir.c  | 455 +++-
 fs/ceph/file.c |  13 +-
 fs/ceph/inode.c|  52 +--
 fs/ceph/mds_client.c   | 698 ++---
 fs/ceph/mds_client.h   |  43 ++-
 fs/ceph/snap.c | 159 -
 fs/ceph/super.c|  21 +-
 fs/ceph/super.h|  43 ++-
 fs/ceph/xattr.c|  20 +-
 include/linux/ceph/types.h |   1 +
 net/ceph/osdmap.c  |   5 +-
 15 files changed, 1597 insertions(+), 426 deletions(-)

[GIT PULL] Ceph fixes for 5.0-rc8

2019-02-21 Thread Ilya Dryomov

Hi Linus,

The following changes since commit a3b22b9f11d9fbc48b0291ea92259a5a810e9438:

  Linux 5.0-rc7 (2019-02-17 18:46:40 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.0-rc8

for you to fetch changes up to 04242ff3ac0abbaa4362f97781dac268e6c3541a:

  ceph: avoid repeatedly adding inode to mdsc->snap_flush_list (2019-02-18 18:08:29 +0100)


Two bug fixes for old issues, both marked for stable.

Ilya Dryomov (1):
  libceph: handle an empty authorize reply

Yan, Zheng (1):
  ceph: avoid repeatedly adding inode to mdsc->snap_flush_list

 fs/ceph/snap.c   |  3 ++-
 net/ceph/messenger.c | 15 +--
 2 files changed, 11 insertions(+), 7 deletions(-)

[GIT PULL] Ceph fixes for 5.0-rc4

2019-01-24 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 49a57857aeea06ca831043acbb0fa5e0f50602fd:

  Linux 5.0-rc3 (2019-01-21 13:14:44 +1300)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.0-rc4

for you to fetch changes up to 74827ee29565f86e2a64495a5e3e58d3371d74ee:

  ceph: quota: cleanup license mess (2019-01-21 14:53:23 +0100)


A fix for a potential use-after-free, a patch to close a (mostly benign)
race in the messenger and a licence clarification for quota.c.


Ilya Dryomov (1):
  libceph: avoid KEEPALIVE_PENDING races in ceph_con_keepalive()

Thomas Gleixner (1):
  ceph: quota: cleanup license mess

Yan, Zheng (1):
  ceph: clear inode pointer when snap realm gets dropped by its inode

 fs/ceph/caps.c   |  2 ++
 fs/ceph/quota.c  | 13 -
 net/ceph/messenger.c |  5 +++--
 3 files changed, 5 insertions(+), 15 deletions(-)

Re: [patch 6/9] ceph: quota: Cleanup license mess

2019-01-18 Thread Ilya Dryomov

On Fri, Jan 18, 2019 at 12:15 AM Thomas Gleixner  wrote:
>
> Precise and non-ambiguous license information is important. The recently
> added aegis header file has a SPDX license identifier, which is nice, but

Looks like cut-and-paste from crypto/aegis.h patch?

I'm changing this to say "recently added quota.c file".

> at the same time it has a contradictory license boiler plate text.
>
>   SPDX-License-Identifier: GPL-2.0
>
> versus
>
>   * This program is free software; you can redistribute it and/or
>   * modify it under the terms of the GNU General Public License
>   * as published by the Free Software Foundation; either version 2
>   * of the License, or (at your option) any later version.
>
> Oh well.
>
> As the other ceph related files are licensed under the GPL v2 only, it's
> assumed that the SPDX id is correct and the boiler plate was randomly
> copied into that patch.
>
> Remove the boiler plate as it is wrong and even if correct it is redundant.
>
> Fixes: fb18a57568c2 ("ceph: quota: add initial infrastructure to support cephfs quotas")
> Signed-off-by: Thomas Gleixner 
> Cc: Luis Henriques 
> Cc: Jiri Kosina 
> Cc: "Yan, Zheng" 
> Cc: Sage Weil 
> Cc: Ilya Dryomov 
> Cc: ceph-de...@vger.kernel.org
> ---
>
> P.S.: This patch is part of a larger cleanup, but independent of other
>   patches and is intended to be picked up by the maintainer directly.
>
> ---
>  fs/ceph/quota.c |   13 -
>  1 file changed, 13 deletions(-)
>
> --- a/fs/ceph/quota.c
> +++ b/fs/ceph/quota.c
> @@ -3,19 +3,6 @@
>   * quota.c - CephFS quota
>   *
>   * Copyright (C) 2017-2018 SUSE
> - *
> - * This program is free software; you can redistribute it and/or
> - * modify it under the terms of the GNU General Public License
> - * as published by the Free Software Foundation; either version 2
> - * of the License, or (at your option) any later version.
> - *
> - * This program is distributed in the hope that it will be useful,
> - * but WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> - * GNU General Public License for more details.
> - *
> - * You should have received a copy of the GNU General Public License
> - * along with this program; if not, see <http://www.gnu.org/licenses/>.
>   */
>
>  #include 

Applied.

Thanks,

Ilya

Re: [PATCH net-next] libceph, ceph: use struct_size() in kmalloc()

2019-01-17 Thread Ilya Dryomov

On Tue, Jan 15, 2019 at 8:41 PM Gustavo A. R. Silva
 wrote:
>
> One of the more common cases of allocation size calculations is finding
> the size of a structure that has a zero-sized array at the end, along
> with memory for some number of elements for that array. For example:
>
> struct foo {
> int stuff;
> struct boo entry[];
> };
>
> instance = kmalloc(sizeof(struct foo) + count * sizeof(struct boo), GFP_KERNEL);
>
> Instead of leaving these open-coded and prone to type mistakes, we can
> now use the new struct_size() helper:
>
> instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);
>
> This code was detected with the help of Coccinelle.
>
> Signed-off-by: Gustavo A. R. Silva 
> ---
>  net/ceph/osdmap.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c
> index 98c0ff3d6441..48a31dc9161c 100644
> --- a/net/ceph/osdmap.c
> +++ b/net/ceph/osdmap.c
> @@ -495,9 +495,8 @@ static struct crush_map *crush_decode(void *pbyval, void 
> *end)
>   / sizeof(struct crush_rule_step))
> goto bad;
>  #endif
> -   r = c->rules[i] = kmalloc(sizeof(*r) +
> - yes*sizeof(struct crush_rule_step),
> - GFP_NOFS);
> +   r = kmalloc(struct_size(r, steps, yes), GFP_NOFS);
> +   c->rules[i] = r;
> if (r == NULL)
> goto badmem;
> dout(" rule %d is at %p\n", i, r);

Applied.

Thanks,

Ilya
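
To make the motivation concrete outside the kernel, here is a hedged userspace sketch of what an overflow-checked size calculation for a struct with a flexible array member can look like. It is not the kernel's struct_size() implementation. The helper flex_size(), the rule and rule_step types and rule_alloc() are names invented for this example; the only behaviour borrowed from the kernel helper is saturating to SIZE_MAX on overflow so that the allocation fails instead of being undersized.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct rule_step {
	int op;
	int arg1;
	int arg2;
};

struct rule {
	unsigned int len;
	struct rule_step steps[];	/* flexible array member */
};

/* base + elem * count, saturating to SIZE_MAX if the result would overflow. */
static size_t flex_size(size_t base, size_t elem, size_t count)
{
	if (elem && count > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + elem * count;
}

/* Allocate a rule with room for count steps, or return NULL. */
static struct rule *rule_alloc(size_t count)
{
	size_t bytes = flex_size(sizeof(struct rule),
				 sizeof(struct rule_step), count);
	struct rule *r;

	if (bytes == SIZE_MAX)
		return NULL;	/* overflow: refuse rather than under-allocate */

	r = malloc(bytes);
	if (r)
		r->len = (unsigned int)count;
	return r;
}
```

The open-coded form in the removed lines computes the same value for sane inputs; the difference only shows up when count is large enough to wrap.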

Re: [PATCH] libceph: protect pending flags in ceph_con_keepalive()

2019-01-15 Thread Ilya Dryomov

On Tue, Jan 15, 2019 at 7:56 AM Myungho Jung  wrote:
>
> On Mon, Jan 14, 2019 at 09:37:25PM +0100, Ilya Dryomov wrote:
> > On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung  wrote:
> > > I reproduced on vm using syzkaller utils and verified the fix by syzbot.
> >
> > Hi Myungho,
> >
> > I think this might be a better fix:
> >
> > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> > index d5718284db57..c5f5313e3537 100644
> > --- a/net/ceph/messenger.c
> > +++ b/net/ceph/messenger.c
> > @@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con)
> >  {
> > dout("con_keepalive %p\n", con);
> > mutex_lock(&con->mutex);
> > +   con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING);
> > clear_standby(con);
> > mutex_unlock(&con->mutex);
> > -   if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 &&
> > -   con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
> > +
> > +   if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
> > queue_con(con);
> >  }
> >  EXPORT_SYMBOL(ceph_con_keepalive);
> >
> > WRITE_PENDING can be set without con->mutex held from socket callbacks.
> > This is the reason we use atomic bit ops here, so testing WRITE_PENDING
> > under the lock didn't make sense to me.
> >
> > At the same time, KEEPALIVE_PENDING could have been a non-atomic flag.
> > I spent some time trying to make sense of conditioning queue_con() call
> > on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm
> > setting it with con_flag_set(), making ceph_con_keepalive() symmetric
> > with ceph_con_send().
> >
> > Thanks,
> >
> > Ilya
>
> Hi Ilya,
>
> Yes, it looks clear and makes sense to have an atomic operation in the if
> statement, but it still triggers a warning. KEEPALIVE_PENDING should be set
> after clear_standby() because con_fault() can be called right before
> acquiring the lock here, which sets the flag in standby state. I tested the
> change with syzbot and confirmed there was no warning.

Right, it still triggers one of the warnings.  I was too focused on
WRITE_PENDING and missed that in plain sight.  I'll update the patch.

Thanks for testing!

Ilya
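
The ordering discussed in this exchange can be modelled in userspace with C11 atomics. This is a sketch under assumptions, not the updated patch: it assumes the revision simply sets KEEPALIVE_PENDING after clear_standby() while the lock is held, and keeps the atomic test-and-set of WRITE_PENDING outside it. The conn_sketch type, the flag values and the queued counter (standing in for queue_con()) are hypothetical, and the mutex is only indicated in comments.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum {
	FLAG_KEEPALIVE_PENDING = 1 << 0,
	FLAG_WRITE_PENDING     = 1 << 1,
};

struct conn_sketch {
	atomic_uint flags;
	bool standby;
	unsigned int queued;	/* counts how often work was queued */
};

/* Sets flag and reports whether it was already set. */
static bool flag_test_and_set(struct conn_sketch *con, unsigned int flag)
{
	return atomic_fetch_or(&con->flags, flag) & flag;
}

static void keepalive_sketch(struct conn_sketch *con)
{
	/* kernel: mutex_lock(&con->mutex) would be taken here */
	con->standby = false;		/* stands in for clear_standby() */
	atomic_fetch_or(&con->flags, FLAG_KEEPALIVE_PENDING);
	/* kernel: mutex_unlock(&con->mutex) would be dropped here */

	/* WRITE_PENDING may also be set elsewhere, hence the atomic op. */
	if (!flag_test_and_set(con, FLAG_WRITE_PENDING))
		con->queued++;		/* stands in for queue_con() */
}
```

A second call while WRITE_PENDING is still set does not queue again, which is the property the test-and-set is there to provide.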

Re: [PATCH] libceph: protect pending flags in ceph_con_keepalive()

2019-01-14 Thread Ilya Dryomov

On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung  wrote:
> I reproduced on vm using syzkaller utils and verified the fix by syzbot.

Hi Myungho,

I think this might be a better fix:

diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index d5718284db57..c5f5313e3537 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con)
 {
dout("con_keepalive %p\n", con);
mutex_lock(&con->mutex);
+   con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING);
clear_standby(con);
mutex_unlock(&con->mutex);
-   if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 &&
-   con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
+
+   if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
queue_con(con);
 }
 EXPORT_SYMBOL(ceph_con_keepalive);

WRITE_PENDING can be set without con->mutex held from socket callbacks.
This is the reason we use atomic bit ops here, so testing WRITE_PENDING
under the lock didn't make sense to me.

At the same time, KEEPALIVE_PENDING could have been a non-atomic flag.
I spent some time trying to make sense of conditioning queue_con() call
on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm
setting it with con_flag_set(), making ceph_con_keepalive() symmetric
with ceph_con_send().

Thanks,

Ilya

[GIT PULL] Ceph updates for 5.0-rc2

2019-01-11 Thread Ilya Dryomov

Hi Linus,

The following changes since commit bfeffd155283772bbe78c6a05dec7c0128ee500c:

  Linux 5.0-rc1 (2019-01-06 17:08:20 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.0-rc2

for you to fetch changes up to 85f5a4d666fd9be73856ed16bb36c5af5b406b29:

  rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set (2019-01-10 09:45:09 +0100)


A patch to allow setting abort_on_full and a fix for an old "rbd unmap"
edge case, marked for stable.


Dongsheng Yang (1):
  libceph: allow setting abort_on_full for rbd

Ilya Dryomov (1):
  rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set

Souptick Joarder (1):
  ceph: use vmf_error() in ceph_filemap_fault()

 drivers/block/rbd.c |  9 -
 fs/ceph/addr.c  |  5 +
 fs/ceph/super.c |  4 ++--
 include/linux/ceph/libceph.h|  6 --
 include/linux/ceph/osd_client.h |  1 -
 net/ceph/ceph_common.c  | 11 ++-
 net/ceph/debugfs.c  |  2 +-
 net/ceph/osd_client.c   |  4 ++--
 8 files changed, 24 insertions(+), 18 deletions(-)

Re: [PATCH] fs/ceph/addr.c: Convert to use vmf_error()

2019-01-05 Thread Ilya Dryomov

On Fri, Jan 4, 2019 at 8:26 PM Souptick Joarder  wrote:
>
> This code is converted to use vmf_error().
>
> Signed-off-by: Souptick Joarder 
> ---
>  fs/ceph/addr.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 8eade7a..fa2a85d 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -1495,10 +1495,7 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault 
> *vmf)
> if (err < 0 || off >= i_size_read(inode)) {
> unlock_page(page);
> put_page(page);
> -   if (err == -ENOMEM)
> -   ret = VM_FAULT_OOM;
> -   else
> -   ret = VM_FAULT_SIGBUS;
> +   ret = vmf_error(err);
> goto out_inline;
> }
> if (err < PAGE_SIZE)

Applied.

Thanks,

Ilya
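
The four removed lines and the single added call express the same mapping from an errno value to a fault code. As a userspace illustration only, with made-up constants rather than the kernel's VM_FAULT_* values, the mapping can be written as a small helper. The names fault_code, FAULT_OOM, FAULT_SIGBUS and fault_from_errno are assumptions of this sketch.

```c
#include <errno.h>

/* Hypothetical stand-ins for VM_FAULT_OOM and VM_FAULT_SIGBUS. */
enum fault_code {
	FAULT_OOM = 1,
	FAULT_SIGBUS = 2,
};

/* Out-of-memory maps to the OOM code; every other error maps to SIGBUS. */
static enum fault_code fault_from_errno(int err)
{
	if (err == -ENOMEM)
		return FAULT_OOM;
	return FAULT_SIGBUS;
}
```

Keeping the mapping in one helper means each fault handler does not have to repeat the if/else.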

[GIT PULL] Ceph updates for 4.21-rc1

2019-01-03 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 8fe28cb58bcb235034b64cbbb7550a8a43fd88be:

  Linux 4.20 (2018-12-23 15:55:59 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.21-rc1

for you to fetch changes up to 5ccedf1ccd710ba32f36986b49eeb764e53e7ef1:

  ceph: don't encode inode pathes into reconnect message (2018-12-26 16:08:36 +0100)


A fairly quiet round: a couple of messenger performance improvements
from myself and a few cap handling fixes from Zheng.


Chengguang Xu (1):
  ceph: remove redundant assignment

Ilya Dryomov (4):
  libceph: drop last_piece logic from write_partial_message_data()
  libceph: use sock_no_sendpage() as a fallback in ceph_tcp_sendpage()
  libceph: use MSG_SENDPAGE_NOTLAST with ceph_tcp_sendpage()
  libceph: switch more to bool in ceph_tcp_sendmsg()

Yan, Zheng (6):
  ceph: cleanup splice_dentry()
  ceph: don't update importing cap's mseq when handing cap export
  ceph: don't request excl caps when mount is readonly
  ceph: skip updating 'wanted' caps if caps are already issued
  ceph: update wanted caps after resuming stale session
  ceph: don't encode inode pathes into reconnect message

 fs/ceph/caps.c   |  75 ++
 fs/ceph/inode.c  |  60 ++--
 fs/ceph/mds_client.c | 129 ++-
 fs/ceph/mds_client.h |  16 ---
 fs/ceph/mdsmap.c |   1 -
 net/ceph/messenger.c |  55 +-
 6 files changed, 174 insertions(+), 162 deletions(-)

Re: [PATCH] libceph: protect pending flags in ceph_con_keepalive()

2019-01-02 Thread Ilya Dryomov

On Thu, Dec 27, 2018 at 8:08 PM Myungho Jung  wrote:
>
> con_flag_test_and_set() sets CON_FLAG_KEEPALIVE_PENDING and
> CON_FLAG_WRITE_PENDING flags without protection in ceph_con_keepalive().
> It triggers WARN_ON() in clear_standby() if the flags are set after
> con_fault() changes connection state to CON_STATE_STANDBY. Move
> con_flag_test_and_set() to be called before releasing the lock and store
> the condition to check after the critical section.
>
> Reported-by: syzbot+acdeb633f6211ccdf...@syzkaller.appspotmail.com
> Signed-off-by: Myungho Jung 
> ---
>  net/ceph/messenger.c | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
> index 2f126eff275d..e15da22d4f37 100644
> --- a/net/ceph/messenger.c
> +++ b/net/ceph/messenger.c
> @@ -3216,12 +3216,16 @@ void ceph_msg_revoke_incoming(struct ceph_msg *msg)
>   */
>  void ceph_con_keepalive(struct ceph_connection *con)
>  {
> +   bool pending;
> +
> dout("con_keepalive %p\n", con);
> mutex_lock(&con->mutex);
> clear_standby(con);
> +   pending = (con_flag_test_and_set(con,
> +CON_FLAG_KEEPALIVE_PENDING) == 0 &&
> +  con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0);
> mutex_unlock(&con->mutex);
> -   if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 &&
> -   con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0)
> +   if (pending)
> queue_con(con);
>  }
>  EXPORT_SYMBOL(ceph_con_keepalive);

Hi Myungho,

Were you able to reproduce?  If so, did you use the syzkaller output or
something else?

Thanks,

Ilya
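
The pattern the patch proposes (do the test-and-set while the mutex is
held, remember the outcome, act on it after unlocking) can be sketched in
userspace roughly as below.  Everything here is a simplified model: the
struct, the boolean "locked" field standing in for con->mutex, and the
queued counter standing in for queue_con() are assumptions made for the
example.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	FLAG_KEEPALIVE_PENDING = 1u << 0,
	FLAG_WRITE_PENDING     = 1u << 1,
};

struct conn_sketch {
	unsigned int flags;
	bool locked;	/* models con->mutex being held */
	int queued;	/* counts would-be queue_con() calls */
};

/* Sets the flag and returns whether it was already set. */
static bool flag_test_and_set_locked(struct conn_sketch *c, unsigned int flag)
{
	bool was_set = (c->flags & flag) != 0;

	assert(c->locked);	/* in this model, flags change only under the lock */
	c->flags |= flag;
	return was_set;
}

static void keepalive_sketch(struct conn_sketch *c)
{
	bool pending;

	c->locked = true;
	/* Short-circuits exactly like the original condition. */
	pending = !flag_test_and_set_locked(c, FLAG_KEEPALIVE_PENDING) &&
		  !flag_test_and_set_locked(c, FLAG_WRITE_PENDING);
	c->locked = false;

	if (pending)
		c->queued++;	/* decision was taken inside the critical section */
}
```

A second call on the same connection finds KEEPALIVE_PENDING already set
and does not queue again.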

Re: [PATCH 06/10] block: rbd: convert to use BUS_ATTR_WO and RO

2018-12-21 Thread Ilya Dryomov

On Fri, Dec 21, 2018 at 8:55 AM Greg Kroah-Hartman
 wrote:
>
> We are trying to get rid of BUS_ATTR() and the usage of that in rbd.c
> can be trivially converted to use BUS_ATTR_WO and RO, so use those
> macros instead.
>
> Cc: Ilya Dryomov 
> Cc: Sage Weil 
> Cc: Alex Elder 
> Cc: Jens Axboe 
> Signed-off-by: Greg Kroah-Hartman 
> ---
>  drivers/block/rbd.c | 45 +++--
>  1 file changed, 19 insertions(+), 26 deletions(-)
>
> diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
> index 8e5140bbf241..d871d364fdcf 100644
> --- a/drivers/block/rbd.c
> +++ b/drivers/block/rbd.c
> @@ -428,14 +428,13 @@ static bool single_major = true;
>  module_param(single_major, bool, 0444);
>  MODULE_PARM_DESC(single_major, "Use a single major number for all rbd 
> devices (default: true)");
>
> -static ssize_t rbd_add(struct bus_type *bus, const char *buf,
> -  size_t count);
> -static ssize_t rbd_remove(struct bus_type *bus, const char *buf,
> - size_t count);
> -static ssize_t rbd_add_single_major(struct bus_type *bus, const char *buf,
> -   size_t count);
> -static ssize_t rbd_remove_single_major(struct bus_type *bus, const char *buf,
> -  size_t count);
> +static ssize_t add_store(struct bus_type *bus, const char *buf, size_t 
> count);
> +static ssize_t remove_store(struct bus_type *bus, const char *buf,
> +   size_t count);
> +static ssize_t add_single_major_store(struct bus_type *bus, const char *buf,
> + size_t count);
> +static ssize_t remove_single_major_store(struct bus_type *bus, const char 
> *buf,
> +size_t count);
>  static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth);
>
>  static int rbd_dev_id_to_minor(int dev_id)
> @@ -464,16 +463,16 @@ static bool rbd_is_lock_owner(struct rbd_device 
> *rbd_dev)
> return is_lock_owner;
>  }
>
> -static ssize_t rbd_supported_features_show(struct bus_type *bus, char *buf)
> +static ssize_t supported_features_show(struct bus_type *bus, char *buf)
>  {
> return sprintf(buf, "0x%llx\n", RBD_FEATURES_SUPPORTED);
>  }
>
> -static BUS_ATTR(add, 0200, NULL, rbd_add);
> -static BUS_ATTR(remove, 0200, NULL, rbd_remove);
> -static BUS_ATTR(add_single_major, 0200, NULL, rbd_add_single_major);
> -static BUS_ATTR(remove_single_major, 0200, NULL, rbd_remove_single_major);
> -static BUS_ATTR(supported_features, 0444, rbd_supported_features_show, NULL);
> +static BUS_ATTR_WO(add);
> +static BUS_ATTR_WO(remove);
> +static BUS_ATTR_WO(add_single_major);
> +static BUS_ATTR_WO(remove_single_major);
> +static BUS_ATTR_RO(supported_features);
>
>  static struct attribute *rbd_bus_attrs[] = {
> &bus_attr_add.attr,
> @@ -5934,9 +5933,7 @@ static ssize_t do_rbd_add(struct bus_type *bus,
> goto out;
>  }
>
> -static ssize_t rbd_add(struct bus_type *bus,
> -  const char *buf,
> -  size_t count)
> +static ssize_t add_store(struct bus_type *bus, const char *buf, size_t count)
>  {
> if (single_major)
> return -EINVAL;
> @@ -5944,9 +5941,8 @@ static ssize_t rbd_add(struct bus_type *bus,
> return do_rbd_add(bus, buf, count);
>  }
>
> -static ssize_t rbd_add_single_major(struct bus_type *bus,
> -   const char *buf,
> -   size_t count)
> +static ssize_t add_single_major_store(struct bus_type *bus, const char *buf,
> + size_t count)
>  {
> return do_rbd_add(bus, buf, count);
>  }
> @@ -6050,9 +6046,7 @@ static ssize_t do_rbd_remove(struct bus_type *bus,
> return count;
>  }
>
> -static ssize_t rbd_remove(struct bus_type *bus,
> - const char *buf,
> - size_t count)
> +static ssize_t remove_store(struct bus_type *bus, const char *buf, size_t 
> count)
>  {
> if (single_major)
> return -EINVAL;
> @@ -6060,9 +6054,8 @@ static ssize_t rbd_remove(struct bus_type *bus,
> return do_rbd_remove(bus, buf, count);
>  }
>
> -static ssize_t rbd_remove_single_major(struct bus_type *bus,
> -  const char *buf,
> -  size_t count)
> +static ssize_t remove_single_major_store(struct bus_type *bus, const char 
> *buf,
> +size_t count)
>  {
> return do_rbd_remove(bus, buf, count);
>  }

Acked-by: Ilya Dryomov 

Thanks,

Ilya
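
The renames in the patch follow from how the _WO/_RO macros build the
callback name from the attribute name.  A rough userspace model of that
token pasting is shown below; the struct layout and the *_SKETCH macros
are simplified stand-ins, not the driver core's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

struct bus_type_sketch { const char *name; };

struct bus_attr_sketch {
	const char *name;
	unsigned int mode;
	ssize_t (*show)(struct bus_type_sketch *bus, char *buf);
	ssize_t (*store)(struct bus_type_sketch *bus, const char *buf,
			 size_t count);
};

/* Write-only: mode 0200, callback must be named <name>_store. */
#define BUS_ATTR_WO_SKETCH(_name)				\
	struct bus_attr_sketch bus_attr_##_name = {		\
		.name = #_name, .mode = 0200, .store = _name##_store }

/* Read-only: mode 0444, callback must be named <name>_show. */
#define BUS_ATTR_RO_SKETCH(_name)				\
	struct bus_attr_sketch bus_attr_##_name = {		\
		.name = #_name, .mode = 0444, .show = _name##_show }

static ssize_t add_store(struct bus_type_sketch *bus, const char *buf,
			 size_t count)
{
	(void)bus;
	(void)buf;
	return (ssize_t)count;	/* pretend the whole write was consumed */
}

static ssize_t supported_features_show(struct bus_type_sketch *bus, char *buf)
{
	(void)bus;
	buf[0] = '\0';
	return 0;
}

static BUS_ATTR_WO_SKETCH(add);
static BUS_ATTR_RO_SKETCH(supported_features);

void bus_attr_sketch_demo(void)
{
	assert(bus_attr_add.store == add_store);
	assert(bus_attr_supported_features.show == supported_features_show);
}
```

With this shape a function called rbd_add would no longer be picked up,
which is why the patch renames it to add_store.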

[GIT PULL] Ceph fix for 4.20-rc7

2018-12-14 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 40e020c129cfc991e8ab4736d2665351ffd1468d:

  Linux 4.20-rc6 (2018-12-09 15:31:00 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc7

for you to fetch changes up to 6f9718fe41c3a47e4362bddf145e2db6ad7d8e87:

  ceph: make 'nocopyfrom' a default mount option (2018-12-11 18:22:17 +0100)


Luis discovered a problem with the new copyfrom offload on the server
side.  Disable it for now.


Luis Henriques (1):
  ceph: make 'nocopyfrom' a default mount option

 fs/ceph/super.c | 4 ++--
 fs/ceph/super.h | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

Re: [PATCH] ceph: make 'nocopyfrom' a default mount option

2018-12-10 Thread Ilya Dryomov

On Mon, Dec 10, 2018 at 11:23 AM Luis Henriques  wrote:
>
> Since we found a problem with the 'copy-from' operation after objects have
> been truncated, offloading object copies to OSDs should be discouraged
> until the issue is fixed.
>
> Thus, this patch adds the 'nocopyfrom' mount option to the default mount
> options which effectively means that remote copies won't be done in
> copy_file_range unless they are explicitly enabled at mount time.
>
> Link: https://tracker.ceph.com/issues/37378
> Signed-off-by: Luis Henriques 
> ---
>  fs/ceph/super.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/super.h b/fs/ceph/super.h
> index c005a5400f2e..79a265ba9200 100644
> --- a/fs/ceph/super.h
> +++ b/fs/ceph/super.h
> @@ -42,7 +42,9 @@
>  #define CEPH_MOUNT_OPT_NOQUOTADF   (1<<13) /* no root dir quota in 
> statfs */
>  #define CEPH_MOUNT_OPT_NOCOPYFROM  (1<<14) /* don't use RADOS 
> 'copy-from' op */
>
> -#define CEPH_MOUNT_OPT_DEFAULT    CEPH_MOUNT_OPT_DCACHE
> +#define CEPH_MOUNT_OPT_DEFAULT \
> +   (CEPH_MOUNT_OPT_DCACHE |\
> +CEPH_MOUNT_OPT_NOCOPYFROM)
>
>  #define ceph_set_mount_opt(fsc, opt) \
> (fsc)->mount_options->flags |= CEPH_MOUNT_OPT_##opt;

Thanks Luis, I'll pick it up for 4.20.

Ilya
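
As an illustration of what folding NOCOPYFROM into the default mask means
in practice, here is a small userspace sketch.  The *_SKETCH names, the
DCACHE bit position, and the helper that models an explicit opt-in at
mount time are assumptions made for the example; only the (1<<14) value
is taken from the quoted header.

```c
#include <assert.h>
#include <stdbool.h>

#define MOUNT_OPT_DCACHE_SKETCH      (1 << 9)	/* bit position assumed */
#define MOUNT_OPT_NOCOPYFROM_SKETCH  (1 << 14)	/* as in the quoted header */

/* New default: offloaded copies are off unless requested. */
#define MOUNT_OPT_DEFAULT_SKETCH \
	(MOUNT_OPT_DCACHE_SKETCH | MOUNT_OPT_NOCOPYFROM_SKETCH)

struct mount_options_sketch {
	int flags;
};

static bool copyfrom_enabled(const struct mount_options_sketch *opts)
{
	return !(opts->flags & MOUNT_OPT_NOCOPYFROM_SKETCH);
}

/* Hypothetical handler for an explicit opt-in at mount time. */
static void enable_copyfrom(struct mount_options_sketch *opts)
{
	opts->flags &= ~MOUNT_OPT_NOCOPYFROM_SKETCH;
}

void mount_opt_sketch_demo(void)
{
	struct mount_options_sketch opts = { .flags = MOUNT_OPT_DEFAULT_SKETCH };

	assert(!copyfrom_enabled(&opts));	/* default: no remote copies */
	enable_copyfrom(&opts);
	assert(copyfrom_enabled(&opts));	/* explicit opt-in */
}
```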

Re: [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode

2018-12-04 Thread Ilya Dryomov

On Tue, Dec 4, 2018 at 12:01 PM Greg Kroah-Hartman
 wrote:
>
> 4.14-stable review patch.  If anyone has any objections, please let me know.
>
> --
>
> commit cc255c76c70f7a87d97939621eae04b600d9f4a1 upstream.
>
> Derive the signature from the entire buffer (both AES cipher blocks)
> instead of using just the first half of the first block, leaving out
> data_crc entirely.
>
> This addresses CVE-2018-1129.
>
> Link: http://tracker.ceph.com/issues/24837
> Signed-off-by: Ilya Dryomov 
> Reviewed-by: Sage Weil 
> Signed-off-by: Ben Hutchings 
> Signed-off-by: Sasha Levin 
> ---
>  include/linux/ceph/ceph_features.h |  7 +--
>  net/ceph/auth_x.c  | 73 +++---
>  2 files changed, 60 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/ceph/ceph_features.h 
> b/include/linux/ceph/ceph_features.h
> index 59042d5ac520..70f42eef813b 100644
> --- a/include/linux/ceph/ceph_features.h
> +++ b/include/linux/ceph/ceph_features.h
> @@ -165,9 +165,9 @@ DEFINE_CEPH_FEATURE(58, 1, FS_FILE_LAYOUT_V2) // overlap
>  DEFINE_CEPH_FEATURE(59, 1, FS_BTIME)
>  DEFINE_CEPH_FEATURE(59, 1, FS_CHANGE_ATTR) // overlap
>  DEFINE_CEPH_FEATURE(59, 1, MSG_ADDR2) // overlap
> -DEFINE_CEPH_FEATURE(60, 1, BLKIN_TRACING)  // *do not share this bit*
> +DEFINE_CEPH_FEATURE(60, 1, OSD_RECOVERY_DELETES) // *do not share this bit*
> +DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2) // *do not share this bit*
>
> -DEFINE_CEPH_FEATURE(61, 1, RESERVED2)  // unused, but slow down!
>  DEFINE_CEPH_FEATURE(62, 1, RESERVED)   // do not use; used as a 
> sentinal
>  DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // 
> client-facing
>
> @@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, 
> LUMINOUS) // client-facin
>  CEPH_FEATURE_SERVER_JEWEL |\
>  CEPH_FEATURE_MON_STATEFUL_SUB |\
>  CEPH_FEATURE_CRUSH_TUNABLES5 | \
> -CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING)
> +CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING | \
> +CEPH_FEATURE_CEPHX_V2)
>
>  #define CEPH_FEATURES_REQUIRED_DEFAULT   \
> (CEPH_FEATURE_NOSRCADDR |\
> diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c
> index ce28bb07d8fd..10eb759bbcb4 100644
> --- a/net/ceph/auth_x.c
> +++ b/net/ceph/auth_x.c
> @@ -9,6 +9,7 @@
>
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>
> @@ -803,26 +804,64 @@ static int calc_signature(struct ceph_x_authorizer *au, 
> struct ceph_msg *msg,
>   __le64 *psig)
>  {
> void *enc_buf = au->enc_buf;
> -   struct {
> -   __le32 len;
> -   __le32 header_crc;
> -   __le32 front_crc;
> -   __le32 middle_crc;
> -   __le32 data_crc;
> -   } __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
> int ret;
>
> -   sigblock->len = cpu_to_le32(4*sizeof(u32));
> -   sigblock->header_crc = msg->hdr.crc;
> -   sigblock->front_crc = msg->footer.front_crc;
> -   sigblock->middle_crc = msg->footer.middle_crc;
> -   sigblock->data_crc =  msg->footer.data_crc;
> -   ret = ceph_x_encrypt(&au->session_key, enc_buf, CEPHX_AU_ENC_BUF_LEN,
> -sizeof(*sigblock));
> -   if (ret < 0)
> -   return ret;
> +   if (!CEPH_HAVE_FEATURE(msg->con->peer_features, CEPHX_V2)) {
> +   struct {
> +   __le32 len;
> +   __le32 header_crc;
> +   __le32 front_crc;
> +   __le32 middle_crc;
> +   __le32 data_crc;
> +   } __packed *sigblock = enc_buf + ceph_x_encrypt_offset();
> +
> +   sigblock->len = cpu_to_le32(4*sizeof(u32));
> +   sigblock->header_crc = msg->hdr.crc;
> +   sigblock->front_crc = msg->footer.front_crc;
> +   sigblock->middle_crc = msg->footer.middle_crc;
> +   sigblock->data_crc =  msg->footer.data_crc;
> +
> +   ret = ceph_x_encrypt(&au->session_key, enc_buf,
> +CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock));
> +   if (ret < 0)
> +   return ret;
> +
> +   *psig = *(__le64 *)(enc_buf + sizeof(u32));
> +   } else {
> +   struct {
> +   __le32 header_crc;
> +   __le32 front_crc;
> +   __le32 front_len;
> +   __le32 middle_crc;
> +

[GIT PULL] Ceph fix for 4.20-rc4

2018-11-23 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 9ff01193a20d391e8dbce4403dd5ef87c7eaaca6:

  Linux 4.20-rc3 (2018-11-18 13:33:44 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc4

for you to fetch changes up to 7e241f647dc7087a0401418a187f3f5b527cc690:

  libceph: fall back to sendmsg for slab pages (2018-11-19 17:59:47 +0100)


A messenger fix, marked for stable.


Ilya Dryomov (1):
  libceph: fall back to sendmsg for slab pages

 net/ceph/messenger.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

[GIT PULL] Ceph fixes for 4.20-rc2

2018-11-09 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a:

  Linux 4.20-rc1 (2018-11-04 15:37:52 -0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc2

for you to fetch changes up to 23c625ce3065e40c933a4239efb9b11f1194a343:

  libceph: assume argonaut on the server side (2018-11-08 17:51:11 +0100)


Two CephFS fixes (copy_file_range and quota) and a small feature bit
cleanup.


Ilya Dryomov (1):
  libceph: assume argonaut on the server side

Luis Henriques (2):
  ceph: add destination file data sync before doing any remote copy
  ceph: quota: fix null pointer dereference in quota check

 fs/ceph/file.c | 11 +--
 fs/ceph/mds_client.c   | 12 +++-
 fs/ceph/quota.c|  3 ++-
 include/linux/ceph/ceph_features.h |  8 +---
 4 files changed, 15 insertions(+), 19 deletions(-)

[GIT PULL] Ceph updates for 4.20-rc1

2018-10-31 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d:

  Linux 4.19 (2018-10-22 07:37:37 +0100)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc1

for you to fetch changes up to ea4cdc548e5e74a529cdd1aea885d74b4aa8f1b3:

  ceph: new mount option to disable usage of copy-from op (2018-10-22 10:28:24 
+0200)


The highlights are:

- a series that fixes some old memory allocation issues in libceph
  (myself).  We no longer allocate memory in places where allocation
  failures cannot be handled and BUG when the allocation fails.

- support for copy_file_range() syscall (Luis Henriques).  If size and
  alignment conditions are met, it leverages RADOS copy-from operation.
  Otherwise, a local copy is performed.

- a patch that reduces memory requirement of ceph_sync_read() from the
  size of the entire read to the size of one object (Zheng Yan).

- fallocate() syscall is now restricted to FALLOC_FL_PUNCH_HOLE (Luis
  Henriques)


Chengguang Xu (3):
  ceph: reset cap hold timeout only for requeued inode
  rbd: add __init/__exit annotations
  ceph: check snap first in ceph_set_acl()

Ilya Dryomov (12):
  libceph: bump CEPH_MSG_MAX_DATA_LEN
  libceph: osd_req_op_cls_init() doesn't need to take opcode
  libceph: introduce ceph_pagelist_alloc()
  libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist()
  libceph: no need to call osd_req_opcode_valid() in osd_req_encode_op()
  ceph: num_ops is off by one in ceph_aio_retry_work()
  libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get()
  libceph: assign cookies in linger_submit()
  libceph: introduce alloc_watch_request()
  libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls
  libceph: preallocate message data items
  libceph: check reply num_data_items in setup_request_data()

Luis Henriques (5):
  ceph: only allow punch hole mode in fallocate
  ceph: add non-blocking parameter to ceph_try_get_caps()
  libceph: support the RADOS copy-from operation
  ceph: support copy_file_range file operation
  ceph: new mount option to disable usage of copy-from op

Xuehan Xu (1):
  ceph: set timeout conditionally in __cap_delay_requeue

Yan, Zheng (4):
  Revert "ceph: fix dentry leak in splice_dentry()"
  ceph: fix dentry leak in ceph_readdir_prepopulate
  ceph: check if LOOKUPNAME request was aborted when filling trace
  ceph: refactor ceph_sync_read()

 Documentation/filesystems/ceph.txt |   5 +
 drivers/block/rbd.c|  28 +-
 fs/ceph/acl.c  |  13 +-
 fs/ceph/addr.c |   2 +-
 fs/ceph/caps.c |  21 +-
 fs/ceph/file.c | 573 +++--
 fs/ceph/inode.c|  13 +-
 fs/ceph/mds_client.c   |   9 +-
 fs/ceph/super.c|  13 +
 fs/ceph/super.h|   3 +-
 fs/ceph/xattr.c|   3 +-
 include/linux/ceph/libceph.h   |   8 +-
 include/linux/ceph/messenger.h |  24 +-
 include/linux/ceph/msgpool.h   |  11 +-
 include/linux/ceph/osd_client.h|  22 +-
 include/linux/ceph/pagelist.h  |  11 +-
 include/linux/ceph/rados.h |  28 ++
 net/ceph/messenger.c   | 107 +++
 net/ceph/msgpool.c |  27 +-
 net/ceph/osd_client.c  | 363 +--
 net/ceph/pagelist.c|  20 ++
 21 files changed, 900 insertions(+), 404 deletions(-)

Re: [PATCH] ceph: only allow punch hole mode in fallocate

2018-10-10 Thread Ilya Dryomov

On Wed, Oct 10, 2018 at 1:19 PM Luis Henriques  wrote:
>
> Ilya Dryomov  writes:
>
> > On Wed, Oct 10, 2018 at 6:21 AM Yan, Zheng  wrote:
> >>
> >> On Wed, Oct 10, 2018 at 1:54 AM Luis Henriques  wrote:
>
> 
>
> >> Applied, thanks
> >
> > I don't think it should go to stable kernels.  Strictly speaking it's
> > a behaviour change -- it's been this way for many years and, unless you
> > are close to ENOSPC, it sort of appears to work.  I'll take off the
> > stable tag unless I hear objections.
>
> Right, it can in fact break applications that rely on the previous
> (bogus) behaviour.  But it can also be claimed that it *will* break
> applications anyway with an updated kernel, so backporting it to older
> kernels will just allow a consistent behaviour.
>
> Anyway, I'm OK either way.  But if you drop the stable tag make sure you
> also remove the 'Fixes:' tag as I believe the stable folks will still
> pick this patch if it includes a valid SHA1 in it.

Yeah, we've run into this in the past.

Thanks,

Ilya
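
To make the behaviour change under discussion concrete: the patch in this
thread accepts only the exact mode PUNCH_HOLE | KEEP_SIZE and clamps the
hole to the current file size.  A userspace sketch of just that argument
handling follows; the function name, the out-parameter and the *_SKETCH
constants are made up for the example, and none of the cap or page cache
handling is modelled.

```c
#include <assert.h>
#include <errno.h>

#define FL_KEEP_SIZE_SKETCH   0x01
#define FL_PUNCH_HOLE_SKETCH  0x02

/*
 * Returns 0 and stores the number of bytes to zero in *len_out
 * (0 when the hole lies entirely beyond EOF), or -EOPNOTSUPP for
 * any mode other than PUNCH_HOLE | KEEP_SIZE.
 */
static int punch_hole_range_sketch(int mode, long long offset,
				   long long length, long long size,
				   long long *len_out)
{
	if (mode != (FL_KEEP_SIZE_SKETCH | FL_PUNCH_HOLE_SKETCH))
		return -EOPNOTSUPP;

	if (offset >= size) {		/* nothing to punch beyond EOF */
		*len_out = 0;
		return 0;
	}
	if (offset + length > size)	/* clamp the hole to EOF */
		length = size - offset;

	*len_out = length;
	return 0;
}

void punch_hole_sketch_demo(void)
{
	long long len = -1;

	/* Plain preallocation (mode 0) used to be accepted; now it is not. */
	assert(punch_hole_range_sketch(0, 0, 4096, 8192, &len) == -EOPNOTSUPP);
}
```

The mode 0 case is the one that applications relying on the old
behaviour would notice.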

Re: [PATCH] ceph: only allow punch hole mode in fallocate

2018-10-10 Thread Ilya Dryomov

On Wed, Oct 10, 2018 at 6:21 AM Yan, Zheng  wrote:
>
> On Wed, Oct 10, 2018 at 1:54 AM Luis Henriques  wrote:
> >
> > Current implementation of cephfs fallocate isn't correct as it doesn't
> > really reserve the space in the cluster, which means that a subsequent
> > call to a write may actually fail due to lack of space.  In fact, it is
> > currently possible to fallocate an amount of space that is larger than the
> > free space in the cluster.
> >
> > Since there's no easy solution to fix this at the moment, this patch
> > simply removes support for all fallocate operations but
> > FALLOC_FL_PUNCH_HOLE (which implies FALLOC_FL_KEEP_SIZE).
> >
> > Link: https://tracker.ceph.com/issues/36317
> > Cc: sta...@vger.kernel.org
> > Fixes: ad7a60de882a ("ceph: punch hole support")
> > Signed-off-by: Luis Henriques 
> > ---
> >  fs/ceph/file.c | 45 +
> >  1 file changed, 9 insertions(+), 36 deletions(-)
> >
> > diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> > index 92ab20433682..91a7ad259bcf 100644
> > --- a/fs/ceph/file.c
> > +++ b/fs/ceph/file.c
> > @@ -1735,7 +1735,6 @@ static long ceph_fallocate(struct file *file, int 
> > mode,
> > struct ceph_file_info *fi = file->private_data;
> > struct inode *inode = file_inode(file);
> > struct ceph_inode_info *ci = ceph_inode(inode);
> > -   struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
> > struct ceph_cap_flush *prealloc_cf;
> > int want, got = 0;
> > int dirty;
> > @@ -1743,10 +1742,7 @@ static long ceph_fallocate(struct file *file, int 
> > mode,
> > loff_t endoff = 0;
> > loff_t size;
> >
> > -   if ((offset + length) > max(i_size_read(inode), fsc->max_file_size))
> > -   return -EFBIG;
> > -
> > -   if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
> > +   if (mode != (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
> > return -EOPNOTSUPP;
> >
> > if (!S_ISREG(inode->i_mode))
> > @@ -1763,18 +1759,6 @@ static long ceph_fallocate(struct file *file, int 
> > mode,
> > goto unlock;
> > }
> >
> > -   if (!(mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE)) &&
> > -   ceph_quota_is_max_bytes_exceeded(inode, offset + length)) {
> > -   ret = -EDQUOT;
> > -   goto unlock;
> > -   }
> > -
> > -   if (ceph_osdmap_flag(&fsc->client->osdc, CEPH_OSDMAP_FULL) &&
> > -   !(mode & FALLOC_FL_PUNCH_HOLE)) {
> > -   ret = -ENOSPC;
> > -   goto unlock;
> > -   }
> > -
> > if (ci->i_inline_version != CEPH_INLINE_NONE) {
> > ret = ceph_uninline_data(file, NULL);
> > if (ret < 0)
> > @@ -1782,12 +1766,12 @@ static long ceph_fallocate(struct file *file, int 
> > mode,
> > }
> >
> > size = i_size_read(inode);
> > -   if (!(mode & FALLOC_FL_KEEP_SIZE)) {
> > -   endoff = offset + length;
> > -   ret = inode_newsize_ok(inode, endoff);
> > -   if (ret)
> > -   goto unlock;
> > -   }
> > +
> > +   /* Are we punching a hole beyond EOF? */
> > +   if (offset >= size)
> > +   goto unlock;
> > +   if ((offset + length) > size)
> > +   length = size - offset;
> >
> > if (fi->fmode & CEPH_FILE_MODE_LAZY)
> > want = CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO;
> > @@ -1798,16 +1782,8 @@ static long ceph_fallocate(struct file *file, int 
> > mode,
> > if (ret < 0)
> > goto unlock;
> >
> > -   if (mode & FALLOC_FL_PUNCH_HOLE) {
> > -   if (offset < size)
> > -   ceph_zero_pagecache_range(inode, offset, length);
> > -   ret = ceph_zero_objects(inode, offset, length);
> > -   } else if (endoff > size) {
> > -   truncate_pagecache_range(inode, size, -1);
> > -   if (ceph_inode_set_size(inode, endoff))
> > -   ceph_check_caps(ceph_inode(inode),
> > -   CHECK_CAPS_AUTHONLY, NULL);
> > -   }
> > +   ceph_zero_pagecache_range(inode, offset, length);
> > +   ret = ceph_zero_objects(inode, offset, length);
> >
> > if (!ret) {
> > spin_lock(&ci->i_ceph_lock);
> > @@ -1817,9 +1793,6 @@ static long ceph_fallocate(struct file *file, int 
> > mode,
> > spin_unlock(&ci->i_ceph_lock);
> > if (dirty)
> > __mark_inode_dirty(inode, dirty);
> > -   if ((endoff > size) &&
> > -   ceph_quota_is_max_bytes_approaching(inode, endoff))
> > -   ceph_check_caps(ci, CHECK_CAPS_NODELAY, NULL);
> > }
> >
> > ceph_put_cap_refs(ci, got);
>
> Applied, thanks

I don't think it should go to stable kernels.  Strictly speaking it's
a behaviour change -- it's been this way for many years an

Re: [PATCH] ceph: use an enum instead of 'static const' to define constants

2018-10-08 Thread Ilya Dryomov

On Mon, Oct 8, 2018 at 5:37 PM Arnd Bergmann  wrote:
>
> On Mon, Oct 8, 2018 at 4:23 PM Ilya Dryomov  wrote:
> > On Fri, Oct 5, 2018 at 6:18 PM Arnd Bergmann  wrote:
> > > @@ -71,7 +71,7 @@
> > >   * This ensures that no two versions who have different meanings for
> > >   * the bit ever speak to each other.
> > >   */
> > > -
> > > +enum ceph_features {
> > >  DEFINE_CEPH_FEATURE( 0, 1, UID)
> > >  DEFINE_CEPH_FEATURE( 1, 1, NOSRCADDR)
> > >  DEFINE_CEPH_FEATURE_RETIRED( 2, 1, MONCLOCKCHECK, JEWEL, LUMINOUS)
> > > @@ -170,13 +170,13 @@ DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2) // 
> > > *do not share this bit*
> > >
> > >  DEFINE_CEPH_FEATURE(62, 1, RESERVED)   // do not use; used as a 
> > > sentinal
> > >  DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // 
> > > client-facing
> > > -
> > > +};
> >
> > I don't particularly like this because it looks like lower constants
> > are actually ints and the rest are unsigned longs, even though they all
> > have ULL suffixes.  The standard seems to require that enum constants
> > be representable as ints, is the non-pedantic behaviour documented
> > somewhere?
>
> I had not realized that this is a gcc extension, or that it behaves slightly
> differently from the standard C++ behavior that apparently adopted a
> saner variant (all values in an enum have the same type).
>
> How about we just add a __maybe_unused to DEFINE_CEPH_FEATURE
> then to shut up the warning?

Fine with me.

Thanks,

Ilya
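
The alternative agreed on above (keep the static const definitions and
mark them as possibly unused) might look roughly like the sketch below.
It is a userspace approximation: MAYBE_UNUSED_SKETCH spells out the
GCC/Clang attribute directly, the *_SKETCH names are invented, and the
value of incarnation 1 is assumed to be zero.

```c
#include <assert.h>
#include <stdint.h>

/* GCC/Clang attribute; silences "defined but not used" warnings. */
#define MAYBE_UNUSED_SKETCH __attribute__((unused))

#define FEATURE_INCARNATION_1_SKETCH (0ULL)	/* value assumed */

#define DEFINE_FEATURE_SKETCH(bit, incarnation, name)			\
	static const uint64_t FEATURE_##name MAYBE_UNUSED_SKETCH =	\
		(1ULL << (bit));					\
	static const uint64_t FEATUREMASK_##name MAYBE_UNUSED_SKETCH =	\
		(1ULL << (bit) | FEATURE_INCARNATION_##incarnation##_SKETCH);

DEFINE_FEATURE_SKETCH(0, 1, UID)
DEFINE_FEATURE_SKETCH(61, 1, CEPHX_V2)

void feature_sketch_demo(void)
{
	/* Every constant keeps the full 64-bit type, unlike enumerators. */
	assert(sizeof(FEATURE_UID) == sizeof(uint64_t));
	assert(FEATURE_CEPHX_V2 == (1ULL << 61));
}
```

This keeps all constants typed as uint64_t, which sidesteps the question
of how wide individual enumerators are.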

Re: [PATCH] ceph: use an enum instead of 'static const' to define constants

2018-10-08 Thread Ilya Dryomov

On Fri, Oct 5, 2018 at 6:18 PM Arnd Bergmann  wrote:
>
> Building with W=1 produces lots of warnings for files including
> ceph_features.h:
>
> include/linux/ceph/ceph_features.h:15:24: error: 'CEPH_FEATUREMASK_SERVER_M' 
> defined but not used [-Werror=unused-const-variable=]
>
> The normal way to define compile-time constants in the kernel is
> to use either macros or enums, and gcc does not warn about those.
>
> Converting to an enum is simple here and means we can still use
> the names while debugging.
>
> Signed-off-by: Arnd Bergmann 
> ---
>  include/linux/ceph/ceph_features.h | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/ceph/ceph_features.h 
> b/include/linux/ceph/ceph_features.h
> index 6b92b3395fa9..676908eca060 100644
> --- a/include/linux/ceph/ceph_features.h
> +++ b/include/linux/ceph/ceph_features.h
> @@ -11,15 +11,15 @@
>  #define CEPH_FEATURE_INCARNATION_2 (1ull<<57) // CEPH_FEATURE_SERVER_JEWEL
>
>  #define DEFINE_CEPH_FEATURE(bit, incarnation, name)\
> -   static const uint64_t CEPH_FEATURE_##name = (1ULL<<bit);   \
> -   static const uint64_t CEPH_FEATUREMASK_##name = \
> -   (1ULL<<bit | CEPH_FEATURE_INCARNATION_##incarnation);
> +   CEPH_FEATURE_##name = (1ULL<<bit),  \
> +   CEPH_FEATUREMASK_##name =   \
> +   (1ULL<<bit | CEPH_FEATURE_INCARNATION_##incarnation),
>
>  /* this bit is ignored but still advertised by release *when* */
> -#define DEFINE_CEPH_FEATURE_DEPRECATED(bit, incarnation, name, when) \
> -   static const uint64_t DEPRECATED_CEPH_FEATURE_##name = (1ULL<<bit); \
> -   static const uint64_t DEPRECATED_CEPH_FEATUREMASK_##name =  \
> -   (1ULL<<bit | CEPH_FEATURE_INCARNATION_##incarnation);
> +#define DEFINE_CEPH_FEATURE_DEPRECATED(bit, incarnation, name, when)   \
> +   DEPRECATED_CEPH_FEATURE_##name = (1ULL<<bit),   \
> +   DEPRECATED_CEPH_FEATUREMASK_##name =\
> +   (1ULL<<bit | CEPH_FEATURE_INCARNATION_##incarnation),
>
>  /*
>   * this bit is ignored by release *unused* and not advertised by
> @@ -71,7 +71,7 @@
>   * This ensures that no two versions who have different meanings for
>   * the bit ever speak to each other.
>   */
> -
> +enum ceph_features {
>  DEFINE_CEPH_FEATURE( 0, 1, UID)
>  DEFINE_CEPH_FEATURE( 1, 1, NOSRCADDR)
>  DEFINE_CEPH_FEATURE_RETIRED( 2, 1, MONCLOCKCHECK, JEWEL, LUMINOUS)
> @@ -170,13 +170,13 @@ DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2) // *do 
> not share this bit*
>
>  DEFINE_CEPH_FEATURE(62, 1, RESERVED)   // do not use; used as a 
> sentinal
>  DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // 
> client-facing
> -
> +};

I don't particularly like this because it looks like lower constants
are actually ints and the rest are unsigned longs, even though they all
have ULL suffixes.  The standard seems to require that enum constants
be representable as ints, is the non-pedantic behaviour documented
somewhere?

Thanks,

Ilya

[GIT PULL] Ceph updates for 4.19-rc3

2018-09-07 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 57361846b52bc686112da6ca5368d11210796804:

  Linux 4.19-rc2 (2018-09-02 14:37:30 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.19-rc3

for you to fetch changes up to e92c0eaf754310f9f31e9229a3f7274a67478f82:

  rbd: support cloning across namespaces (2018-09-06 16:18:04 +0200)


Two rbd patches to complete support for images within namespaces that
went into -rc1 and a use-after-free fix.

The rbd changes have been sitting in a branch for quite a while but
couldn't be included into the -rc1 pull request because of a pending
wire protocol backwards compatibility fixup that only got committed
early this week.

Said fixup ended up being really trivial -- just an extra byte added,
so I decided to send these changes for -rc3.  If it's too late in the
cycle for this follow-up to be pulled, let me know and I'll send the
use-after-free fix separately; we will have the necessary stop gaps on
the server side to prevent the current 4.19 code from doing anything
unexpected.

--------
Ilya Dryomov (3):
  ceph: avoid a use-after-free in ceph_destroy_options()
  rbd: factor out get_parent_info()
  rbd: support cloning across namespaces

 drivers/block/rbd.c | 235 +++-
 fs/ceph/super.c |  16 ++--
 2 files changed, 189 insertions(+), 62 deletions(-)

[GIT PULL] Ceph updates for 4.19-rc1

2018-08-20 Thread Ilya Dryomov

Hi Linus,

The following changes since commit acb1872577b346bd15ab3a3f8dff780d6cca4b70:

  Linux 4.18-rc7 (2018-07-29 14:44:52 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.19-rc1

for you to fetch changes up to 0fcf6c02b205f80f24eb548b236543ec151cb01c:

  ceph: don't drop message if it contains more data than expected (2018-08-13 
17:55:44 +0200)


The main things are support for cephx v2 authentication protocol and
basic support for rbd images within namespaces (myself).  Also included
y2038 conversion patches from Arnd, a pile of miscellaneous fixes from
Chengguang and Zheng's feature bit infrastructure for the filesystem.


Arnd Bergmann (5):
  libceph: use timespec64 in for keepalive2 and ticket validity
  ceph: stop using current_kernel_time()
  ceph: use timespec64 for inode timestamp
  libceph: use timespec64 for r_mtime
  ceph: use timespec64 for r_stamp

Chengguang Xu (14):
  ceph: add retry logic for error -ERANGE in ceph_get_acl()
  ceph: restore ctime as well in the case of restoring old mode
  libceph: stop parsing when a bad int arg is detected
  ceph: return errors from posix_acl_equiv_mode() correctly
  ceph: add d_drop for some error cases in ceph_mknod()
  ceph: add d_drop for some error cases in ceph_symlink()
  ceph: add new field max_file_size in ceph_fs_client
  ceph: add additional range check in ceph_fallocate()
  ceph: add additional offset check in ceph_write_iter()
  ceph: add additional size check in ceph_setattr()
  ceph: compare fsc->max_file_size and inode->i_size for max file size limit
  ceph: change to void return type for __do_request()
  ceph: refactor ceph_unreserve_caps()
  ceph: refactor error handling code in ceph_reserve_caps()

Ilya Dryomov (14):
  libceph: make ceph_osdc_notify{,_ack}() payload_len u32
  libceph: change ceph_pagelist_encode_string() to take u32
  libceph: amend "bad option arg" error message
  rbd: pass rbd_spec into parse_rbd_opts_token()
  rbd: support for images within namespaces
  libceph: remove now unused ceph_{en,de}code_timespec()
  libceph: store ceph_auth_handshake pointer in ceph_connection
  libceph: factor out __prepare_write_connect()
  libceph: factor out __ceph_x_decrypt()
  libceph: factor out encrypt_authorizer()
  libceph: add authorizer challenge
  libceph: implement CEPHX_V2 calculation mode
  libceph: check authorizer reply/challenge length before reading
  libceph: weaken sizeof check in ceph_x_verify_authorizer_reply()

Souptick Joarder (1):
  ceph: adding new return type vm_fault_t

Stephen Hemminger (1):
  ceph: fix whitespace

Yan, Zheng (3):
  ceph: fix incorrect use of strncpy
  ceph: support cephfs' own feature bits
  ceph: don't drop message if it contains more data than expected

YueHaibing (2):
  libceph: remove unnecessary non NULL check for request_key
  crush: fix using plain integer as NULL warning

 drivers/block/rbd.c                | 125 +--
 fs/ceph/acl.c                      |  30 +++--
 fs/ceph/addr.c                     |  74 ++--
 fs/ceph/cache.c                    |  11 +-
 fs/ceph/caps.c                     | 138 ++---
 fs/ceph/dir.c                      |  20 ++--
 fs/ceph/file.c                     |  34 --
 fs/ceph/inode.c                    |  83 ++---
 fs/ceph/mds_client.c               |  98 ++-
 fs/ceph/mds_client.h               |  14 ++-
 fs/ceph/quota.c                    |   2 +-
 fs/ceph/snap.c                     |   6 +-
 fs/ceph/super.c                    |   6 +-
 fs/ceph/super.h                    |  12 +-
 fs/ceph/xattr.c                    |   4 +-
 include/linux/ceph/auth.h          |   8 ++
 include/linux/ceph/ceph_features.h |   7 +-
 include/linux/ceph/decode.h        |  18 ++-
 include/linux/ceph/messenger.h     |   8 +-
 include/linux/ceph/msgr.h          |   2 +-
 include/linux/ceph/osd_client.h    |  10 +-
 include/linux/ceph/pagelist.h      |   2 +-
 net/ceph/Kconfig                   |   1 -
 net/ceph/Makefile                  |   1 -
 net/ceph/auth.c                    |  16 +++
 net/ceph/auth_none.c               |   1 -
 net/ceph/auth_none.h               |   1 -
 net/ceph/auth_x.c                  | 239 +
 net/ceph/auth_x.h                  |   3 +-
 net/ceph/auth_x_protocol.h         |   7 ++
 net/ceph/ceph_common.c             |  13 +-
 net/ceph/cls_lock_client.c         |   4 +-
 net/ceph/crush/mapper.c            |   4 +-
 net/ceph/messenger.c               | 113 +++---
 net/ceph/mon_client.c              |   2 +-
 net/ceph/osd_client.c              |  27 +++--
 net/ceph/p

Re: Warning when using eMMC and partprobe: generic_make_request: Trying to write to read-only block-device

2018-08-14 Thread Ilya Dryomov

On Tue, Aug 14, 2018 at 4:41 PM Stefan Agner  wrote:
>
> Hi,
>
> Using Linux 4.18 on a i.MX 6Q I see the following warning during
> boot-up:
>
> [   23.928916] ------------[ cut here ]------------
> [   23.933795] WARNING: CPU: 1 PID: 527 at block/blk-core.c:2161
> generic_make_request_checks+0x868/0xa18
> [   23.943306] generic_make_request: Trying to write to read-only
> block-device mmcblk2boot0 (partno 0)
> [   23.952569] Modules linked in: joydev flexcan can_dev coda imx_vdoa
> v4l2_mem2mem videobuf2_vmalloc dw_hdmi_ahb_audio evbug nhc_mobility
> nhc_hop nhc_routing nhc_ipv6 nhc_dest nhc_fragment nhc_udp fuse
> bluetooth_6lowpan 6lowpan
> [   23.973115] CPU: 1 PID: 527 Comm: partprobe Not tainted 4.18.0 #1
> [   23.979336] Hardware name: Freescale i.MX6 Quad/DualLite (Device
> Tree)
> [   23.985984] Backtrace:
> [   23.988513] [] (dump_backtrace) from []
> (show_stack+0x18/0x1c)
> [   23.996231]  r7: r6:60060013 r5: r4:c118ca44
> [   24.002009] [] (show_stack) from []
> (dump_stack+0xb4/0xec)
> [   24.009377] [] (dump_stack) from []
> (__warn+0xc4/0x108)
> [   24.016471]  r10:c1108908 r9:c04864f0 r8:0871 r7:c0ea02dc
> r6:0009 r5:
> [   24.024447]  r4:d73d1d1c r3:abf2ba7b
> [   24.028111] [] (__warn) from []
> (warn_slowpath_fmt+0x4c/0x6c)
> [   24.035741]  r9:d73d r8:c01011e4 r7:c04873cc r6:d6f27400
> r5:c0ea05b0 r4:c1108908
> [   24.043641] [] (warn_slowpath_fmt) from []
> (generic_make_request_checks+0x868/0xa18)
> [   24.053296]  r3:d73d1d74 r2:c0ea05b0
> [   24.058425]  r5:d6d4d0a0 r4:d6f94240
> [   24.063538] [] (generic_make_request_checks) from
> [] (generic_make_request+0xc0/0x480)
> [   24.076301]  r10:d6f94240 r9:d73d r8:c01011e4 r7:c1108908
> r6:d73d1e98 r5:c1108908
> [   24.087212]  r4:d6d4d0a0
> [   24.091251] [] (generic_make_request) from []
> (submit_bio+0x38/0x19c)
> [   24.102438]  r10:0076 r9:d73d r8:c01011e4 r7:7fff
> r6:d73d1e98 r5:c1108908
> [   24.113265]  r4:d6f94240
> [   24.117278] [] (submit_bio) from []
> (submit_bio_wait+0x5c/0x98)
> [   24.127914]  r10:0076 r9:d73d r8:c01011e4 r7:7fff
> r6:d73d1e98 r5:c1108908
> [   24.138724]  r4:d6f94240
> [   24.142739] [] (submit_bio_wait) from []
> (blkdev_issue_flush+0x80/0xb0)
> [   24.154038]  r6: r5:d4164340 r4:d6f94240
> [   24.160143] [] (blkdev_issue_flush) from []
> (blkdev_fsync+0x3c/0x54)
> [   24.171143]  r7:7fff r6:d4164428 r5:7fff r4:
> [   24.178322] [] (blkdev_fsync) from []
> (vfs_fsync_range+0x44/0x84)
> [   24.189080]  r6: r5: r4:d66ce000
> [   24.195173] [] (vfs_fsync_range) from []
> (do_fsync+0x44/0x78)
> [   24.205530]  r7:0076 r6: r5:d66ce000 r4:d66ce000
> [   24.212672] [] (do_fsync) from []
> (sys_fsync+0x14/0x18)
> [   24.221140]  r6:01320248 r5:0001 r4:013201d0
> [   24.227266] [] (sys_fsync) from []
> (ret_fast_syscall+0x0/0x28)
> [   24.237781] Exception stack(0xd73d1fa8 to 0xd73d1ff0)
> [   24.244320] 1fa0:   013201d0 0001 0004
> be863b1c 0064 
> [   24.255437] 1fc0: 013201d0 0001 01320248 0076 b6ef8aec
> b6ef4c04 b6f37fa4 
> [   24.266593] 1fe0: 0076 be863a00 b6e61faf b6de8306
> [   24.273302] irq event stamp: 13037
> [   24.278202] hardirqs last  enabled at (13045): []
> console_unlock+0x3e0/0x4e4
> [   24.289163] hardirqs last disabled at (13054): []
> console_unlock+0x80/0x4e4
> [   24.300085] softirqs last  enabled at (12880): []
> __do_softirq+0x224/0x2d0
> [   24.310934] softirqs last disabled at (12833): []
> irq_exit+0xc8/0x1ac
> [   24.321438] ---[ end trace 05a4aba40df38a0c ]---
>
> The system I am using calls partprobe for some reason, which causes the
> stack trace to appear.
>
> The mmcblkXbootY partitions are hardware partitions on eMMC devices which
> are by default set to read only. Partition probing should really not lead
> to a write as far as I can tell...
>
> strace shows what partprobe is actually doing:
>
> ...
> openat(AT_FDCWD, "/dev/mmcblk2boot0", O_RDONLY|O_LARGEFILE) = 4
> ...
> ioctl(4, BLKFLSBUF) = 0
> ...
> ioctl(4, BLKSSZGET, [512])  = 0
> fadvise64_64(4, 0, 0, POSIX_FADV_RANDOM) = 0
> fstat64(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(179, 64), ...}) = 0
> ioctl(4, BLKGETSIZE64, [2097152])   = 0
> ...
> ioctl(4, CDROM_GET_CAPABILITY, 0)   = -1 EINVAL (Invalid argument)
> ioctl(4, BLKALIGNOFF, [0])  = 0
> ioctl(4, BLKIOMIN, [512])   = 0
> ioctl(4, BLKIOOPT, [0]) = 0
> ioctl(4, BLKPBSZGET, [512]) = 0
> ioctl(4, BLKSSZGET, [512])  = 0
> ioctl(4, BLKGETSIZE64, [2097152])   = 0
> ioctl(4, HDIO_GETGEO, {heads=4, sectors=16, cylinders=64, start=0}) = 0
> fsync(4)= 0
> close(4)= 0
> ...
>
> Any idea?

Looks like it's coming from that fsync():

  sys_fsync
    do_fsync
      vfs_fsync_range
        blkdev_fsync
          blkdev_issue_
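
The strace output quoted above shows the device being opened O_RDONLY and fsync() on that descriptor still returning 0. A minimal userspace sketch of the same pattern on Linux follows. It uses a temporary regular file instead of an eMMC boot partition, and the function name fsync_on_rdonly_fd() and the temp path are made up for illustration, so it only shows that fsync() is accepted on a read-only descriptor, not the block-layer warning itself.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a scratch file, reopen it read-only and
 * call fsync() on the read-only descriptor.
 * Returns fsync()'s return value, or -2 if the setup failed.
 */
int fsync_on_rdonly_fd(void)
{
	char path[] = "/tmp/fsync_rdonly_XXXXXX";
	int fd, ret;

	fd = mkstemp(path);		/* creates the file read-write */
	if (fd < 0)
		return -2;
	close(fd);

	fd = open(path, O_RDONLY);	/* same open mode as in the strace */
	if (fd < 0) {
		unlink(path);
		return -2;
	}

	ret = fsync(fd);		/* flush request on a read-only fd */
	close(fd);
	unlink(path);
	return ret;
}

#ifdef DEMO_MAIN
int main(void)
{
	printf("fsync() on O_RDONLY fd returned %d\n", fsync_on_rdonly_fd());
	return 0;
}
#endif
```

Build with -DDEMO_MAIN to get a standalone program. The point is only that the open mode does not stop userspace from issuing the flush, which matches the fsync(4) call seen in the trace.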

[GIT PULL] Ceph fix for 4.18-rc3

2018-06-29 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 7daf201d7fe8334e2d2364d4e8ed3394ec9af819:

  Linux 4.18-rc2 (2018-06-24 20:54:29 +0800)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.18-rc3

for you to fetch changes up to 8b8f53af1ed9df88a4c0fbfdf3db58f62060edf3:

  ceph: fix dentry leak in splice_dentry() (2018-06-26 18:42:44 +0200)


A trivial dentry leak fix from Zheng.


Yan, Zheng (1):
  ceph: fix dentry leak in splice_dentry()

 fs/ceph/inode.c | 1 +
 1 file changed, 1 insertion(+)

[GIT PULL] Ceph updates for 4.18-rc1

2018-06-14 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 29dcea88779c856c7dc92040a0c01233263101d4:

  Linux 4.17 (2018-06-03 14:15:21 -0700)

are available in the git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.18-rc1

for you to fetch changes up to 23edca864951250af845a11da86bb3ea63522ed2:

  rbd: flush rbd_dev->watch_dwork after watch is unregistered (2018-06-04 20:46:02 +0200)


The main piece is a set of libceph changes that revamps how OSD
requests are aborted, improving CephFS ENOSPC handling and making
"umount -f" actually work (Zheng and myself).  The rest is mostly
mount option handling cleanups from Chengguang and assorted fixes
from Zheng, Luis and Dongsheng.


Chengguang Xu (5):
  libceph, rbd: add error handling for osd_req_op_cls_init()
  ceph: fix alignment of rasize
  ceph: strengthen rsize/wsize/readdir_max_bytes validation
  ceph: show ino32 if the value is different with default
  ceph: update description of some mount options

Dongsheng Yang (1):
  rbd: flush rbd_dev->watch_dwork after watch is unregistered

Ilya Dryomov (13):
  libceph: get rid of more_kvec in try_write()
  libceph: use MSG_TRUNC for discarding received bytes
  ceph: show wsize only if non-default
  libceph: introduce ceph_osdc_abort_requests()
  libceph: no need to call flush_workqueue() before destruction
  libceph: move more code into __complete_request()
  libceph: defer __complete_request() to a workqueue
  libceph: use for_each_request() in ceph_osdc_abort_on_full()
  libceph: don't warn if req->r_abort_on_full is set
  libceph: avoid a use-after-free during map check
  libceph: don't abort reads in ceph_osdc_abort_on_full()
  libceph: make abort_on_full a per-osdc setting
  libceph: allocate the locator string with GFP_NOFAIL

Luis Henriques (2):
  ceph: fix st_nlink stat for directories
  ceph: fix use-after-free in ceph_statfs()

Yan, Zheng (10):
  ceph: use bit flags to define vxattr attributes
  ceph: always get rstat from auth mds
  ceph: update i_files/i_subdirs only when Fs cap is issued
  ceph: define argument structure for handle_cap_grant
  ceph: handle the new nfiles/nsubdirs fields in cap message
  ceph: support file lock on directory
  ceph: abort osd requests on force umount
  ceph: flush pending works before shutdown super
  ceph: fix wrong check for the case of updating link count
  ceph: prevent i_version from going back

 Documentation/filesystems/ceph.txt |   8 +-
 drivers/block/rbd.c                |  11 +-
 fs/ceph/addr.c                     |   1 -
 fs/ceph/caps.c                     | 160 ---
 fs/ceph/dir.c                      |   2 +
 fs/ceph/file.c                     |   1 -
 fs/ceph/inode.c                    |  67 +++-
 fs/ceph/super.c                    |  35 --
 fs/ceph/xattr.c                    |  60 ++-
 include/linux/ceph/ceph_fs.h       |   1 +
 include/linux/ceph/osd_client.h    |   8 +-
 include/linux/ceph/osdmap.h        |   8 +-
 net/ceph/messenger.c               |  31 ++
 net/ceph/osd_client.c              | 216 ++---
 net/ceph/osdmap.c                  |  19 ++--
 15 files changed, 372 insertions(+), 256 deletions(-)

[GIT PULL] Ceph fixes for 4.17-rc5

2018-05-11 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 75bc37fefc4471e718ba8e651aa74673d4e0a9eb:

  Linux 4.17-rc4 (2018-05-06 16:57:38 -1000)

are available in the git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.17-rc5

for you to fetch changes up to fc218544fbc800d1c91348ec834cacfb257348f7:

  ceph: fix iov_iter issues in ceph_direct_read_write() (2018-05-10 10:15:12 +0200)


These patches fix two long-standing bugs in the DIO code path, one of
which is a crash trivially triggerable with splice().


Ilya Dryomov (3):
  ceph: fix rsize/wsize capping in ceph_direct_read_write()
  libceph: add osd_req_op_extent_osd_data_bvecs()
  ceph: fix iov_iter issues in ceph_direct_read_write()

 drivers/block/rbd.c |   4 +-
 fs/ceph/file.c  | 205 
 include/linux/ceph/osd_client.h |  12 ++-
 net/ceph/osd_client.c   |  27 +-
 4 files changed, 158 insertions(+), 90 deletions(-)

[PATCH 2/2] iov_iter: fix memory leak in pipe_get_pages_alloc()

2018-05-02 Thread Ilya Dryomov

Make n signed to avoid leaking the pages array if __pipe_get_pages()
fails.

Signed-off-by: Ilya Dryomov 
---
 lib/iov_iter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 4d5bf40d399d..fdae394172fa 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1102,7 +1102,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
                   size_t *start)
 {
        struct page **p;
-       size_t n;
+       ssize_t n;
        int idx;
        int npages;
 
-- 
2.4.3
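
To illustrate why the signedness of n matters here, below is a small userspace sketch, not the kernel code. It assumes the usual pattern of storing a helper's return value in a local and then testing "n > 0" to decide between the success path and the cleanup path. The names fake_get_pages(), success_branch_taken_unsigned() and success_branch_taken_signed() are hypothetical stand-ins.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical stand-in for a helper that fails with a negative errno. */
static ssize_t fake_get_pages(void)
{
	return -EFAULT;
}

/*
 * "Before": the result is stored in an unsigned local. The negative
 * error wraps to a huge positive value, so the success check passes
 * and any cleanup in the else branch would be skipped.
 */
int success_branch_taken_unsigned(void)
{
	size_t n = fake_get_pages();

	return n > 0;
}

/*
 * "After": the result is stored in a signed local. The error stays
 * negative, so the success check fails and cleanup can run.
 */
int success_branch_taken_signed(void)
{
	ssize_t n = fake_get_pages();

	return n > 0;
}

#ifdef DEMO_MAIN
int main(void)
{
	printf("size_t n:  success branch taken = %d\n",
	       success_branch_taken_unsigned());
	printf("ssize_t n: success branch taken = %d\n",
	       success_branch_taken_signed());
	return 0;
}
#endif
```

Build with -DDEMO_MAIN to run it standalone. With the unsigned local the error is treated as success, which is the kind of path on which a freshly allocated array would never be freed.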

[PATCH 1/2] iov_iter: fix return type of __pipe_get_pages()

2018-05-02 Thread Ilya Dryomov

It returns -EFAULT and happens to be a helper for pipe_get_pages()
whose return type is ssize_t.

Signed-off-by: Ilya Dryomov 
---
 lib/iov_iter.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 970212670b6a..4d5bf40d399d 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1012,7 +1012,7 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
 }
 EXPORT_SYMBOL(iov_iter_gap_alignment);
 
-static inline size_t __pipe_get_pages(struct iov_iter *i,
+static inline ssize_t __pipe_get_pages(struct iov_iter *i,
                                 size_t maxsize,
                                 struct page **pages,
                                 int idx,
-- 
2.4.3

[GIT PULL] Ceph fixes for 4.17-rc3

2018-04-27 Thread Ilya Dryomov

Hi Linus,

The following changes since commit 6d08b06e67cd117f6992c46611dfb4ce267cd71e:

  Linux 4.17-rc2 (2018-04-22 19:20:09 -0700)

are available in the git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-4.17-rc3

for you to fetch changes up to 9c55ad1c214d9f8c4594ac2c3fa392c1c32431a7:

  libceph: validate con->state at the top of try_write() (2018-04-26 17:39:08 +0200)


A CephFS quota follow-up and fixes for two older issues in the
messenger layer, marked for stable.

Ilya Dryomov (3):
  libceph: un-backoff on tick when we have a authenticated session
  libceph: reschedule a tick in finish_hunting()
  libceph: validate con->state at the top of try_write()

Yan, Zheng (1):
  ceph: check if mds create snaprealm when setting quota

 fs/ceph/xattr.c   | 28 +---
 net/ceph/messenger.c  |  7 +++
 net/ceph/mon_client.c | 14 +++---
 3 files changed, 43 insertions(+), 6 deletions(-)

Re: [4.4,50/97] ext4: add validity checks for bitmap block numbers -- regression?

2018-04-23 Thread Ilya Dryomov

Hi Greg,

Commit 7dac4a1726a9 ("ext4: add validity checks for bitmap block
numbers") seems to be the cause of the regression reported here:

  https://marc.info/?l=linux-ext4&m=152416385122029&w=2

ext4 folks are probably busy at LSF, so no reply yet.  Should this
commit be held until we get word from Ted?

Please excuse broken threading.

Thanks,

Ilya
