Re: [PATCH -next 1/5] net: ceph: Fix a typo in osdmap.c
On Thu, Mar 25, 2021 at 7:37 AM Lu Wei wrote: > > Modify "inital" to "initial" in net/ceph/osdmap.c. > > Reported-by: Hulk Robot > Signed-off-by: Lu Wei > --- > net/ceph/osdmap.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c > index 2b1dd252f231..c959320c4775 100644 > --- a/net/ceph/osdmap.c > +++ b/net/ceph/osdmap.c > @@ -1069,7 +1069,7 @@ static struct crush_work *get_workspace(struct > workspace_manager *wsm, > > /* > * Do not return the error but go back to waiting. We > -* have the inital workspace and the CRUSH computation > +* have the initial workspace and the CRUSH computation > * time is bounded so we will get it eventually. > */ > WARN_ON(atomic_read(&wsm->total_ws) < 1); > -- > 2.17.1 > Hi Lu, There is at least one other legit typo in that file: "ambigous". I'd rather fix all typos at once, so curious why Hulk Robot didn't catch it. Thanks, Ilya
Re: [PATCH RESEND][next] ceph: Fix fall-through warnings for Clang
On Fri, Mar 5, 2021 at 10:59 AM Gustavo A. R. Silva wrote: > > In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple > of warnings by explicitly adding a break and a goto statements instead > of just letting the code fall through to the next case. > > Link: https://github.com/KSPP/linux/issues/115 > Signed-off-by: Gustavo A. R. Silva > --- > fs/ceph/dir.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c > index 83d9358854fb..3e575656713e 100644 > --- a/fs/ceph/dir.c > +++ b/fs/ceph/dir.c > @@ -631,10 +631,12 @@ static loff_t ceph_dir_llseek(struct file *file, loff_t > offset, int whence) > switch (whence) { > case SEEK_CUR: > offset += file->f_pos; > + break; > case SEEK_SET: > break; > case SEEK_END: > retval = -EOPNOTSUPP; > + goto out; > default: > goto out; > } > -- > 2.27.0 > Applied. Thanks, Ilya
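Editorial note: the hunk above makes every case end in an explicit break or goto, which is what -Wimplicit-fallthrough asks for. Below is a minimal userspace sketch of the same whence handling. resolve_offset() and its parameters are hypothetical names for illustration, not kernel API, and the -EINVAL default is an assumption because the quoted hunk does not show how retval is initialized.

```c
#include <errno.h>
#include <stdio.h>   /* SEEK_SET, SEEK_CUR, SEEK_END */

/*
 * Hypothetical userspace model of the switch in the quoted hunk.
 * Returns the resolved absolute offset, or a negative errno value.
 */
long long resolve_offset(long long cur_pos, long long offset, int whence)
{
	switch (whence) {
	case SEEK_CUR:
		offset += cur_pos;
		break;              /* explicit; behaves like falling into SEEK_SET */
	case SEEK_SET:
		break;
	case SEEK_END:
		return -EOPNOTSUPP; /* the patch reaches its exit path via "goto out" */
	default:
		return -EINVAL;     /* assumption: default error is not shown in the hunk */
	}
	return offset;
}
```

With this sketch, resolve_offset(10, 5, SEEK_CUR) yields 15 and resolve_offset(10, 5, SEEK_END) yields -EOPNOTSUPP. Behavior is unchanged from the fall-through version; only the intent is spelled out.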
Re: net/ceph/messenger_v1.c:1204:5: warning: stack frame size of 2944 bytes in function 'ceph_con_v1_try_read'
On Mon, Mar 1, 2021 at 9:36 AM kernel test robot wrote: > > Hi Ilya, > > FYI, the error/warning still remains. > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: fe07bfda2fb9cdef8a4d4008a409bb02f35f1bd8 > commit: 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 libceph: move msgr1 protocol > implementation to its own file > date: 3 months ago It's fine. This commit just moved the code which has been this way for years and never caused any real issues. Please add it to the allowlist if possible. > config: powerpc64-randconfig-r001-20210301 (attached as .config) > compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project > 5de09ef02e24d234d9fc0cd1c6dfe18a1bb784b0) > reproduce (this is a W=1 build): > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > # install powerpc64 cross compiling tool for clang build > # apt-get install binutils-powerpc64-linux-gnu > # > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 > git remote add linus > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > git fetch --no-tags linus master > git checkout 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 > # save the attached .config to linux build tree > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross > ARCH=powerpc64 > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All warnings (new ones prefixed by >>): > >__do_insb >^ >arch/powerpc/include/asm/io.h:541:56: note: expanded from macro '__do_insb' >#define __do_insb(p, b, n) readsb((PCI_IO_ADDR)_IO_BASE+(p), (b), (n)) > ~^ >In file included from net/ceph/messenger_v1.c:8: >In file included from include/net/sock.h:38: >In file included from include/linux/hardirq.h:10: >In file included from arch/powerpc/include/asm/hardirq.h:6: >In file included from include/linux/irq.h:20: >In 
file included from include/linux/io.h:13: >In file included from arch/powerpc/include/asm/io.h:604: >arch/powerpc/include/asm/io-defs.h:45:1: warning: performing pointer > arithmetic on a null pointer has undefined behavior > [-Wnull-pointer-arithmetic] >DEF_PCI_AC_NORET(insw, (unsigned long p, void *b, unsigned long c), >^~~ >arch/powerpc/include/asm/io.h:601:3: note: expanded from macro > 'DEF_PCI_AC_NORET' >__do_##name al; \ >^~ >:32:1: note: expanded from here >__do_insw >^ >arch/powerpc/include/asm/io.h:542:56: note: expanded from macro '__do_insw' >#define __do_insw(p, b, n) readsw((PCI_IO_ADDR)_IO_BASE+(p), (b), (n)) > ~^ >In file included from net/ceph/messenger_v1.c:8: >In file included from include/net/sock.h:38: >In file included from include/linux/hardirq.h:10: >In file included from arch/powerpc/include/asm/hardirq.h:6: >In file included from include/linux/irq.h:20: >In file included from include/linux/io.h:13: >In file included from arch/powerpc/include/asm/io.h:604: >arch/powerpc/include/asm/io-defs.h:47:1: warning: performing pointer > arithmetic on a null pointer has undefined behavior > [-Wnull-pointer-arithmetic] >DEF_PCI_AC_NORET(insl, (unsigned long p, void *b, unsigned long c), >^~~ >arch/powerpc/include/asm/io.h:601:3: note: expanded from macro > 'DEF_PCI_AC_NORET' >__do_##name al; \ >^~ >:36:1: note: expanded from here >__do_insl >^ >arch/powerpc/include/asm/io.h:543:56: note: expanded from macro '__do_insl' >#define __do_insl(p, b, n) readsl((PCI_IO_ADDR)_IO_BASE+(p), (b), (n)) > ~^ >In file included from net/ceph/messenger_v1.c:8: >In file included from include/net/sock.h:38: >In file included from include/linux/hardirq.h:10: >In file included from arch/powerpc/include/asm/hardirq.h:6: >In file included from include/linux/irq.h:20: >In file included from include/linux/io.h:13: >In file included from arch/powerpc/include/asm/io.h:604: >arch/powerpc/include/asm/io-defs.h:49:1: warning: performing pointer > arithmetic on a null pointer has 
undefined behavior > [-Wnull-pointer-arithmetic] >DEF_PCI_AC_NORET(outsb, (unsigned long p, const void *b, unsigned long c), >^~ >arch/po
[GIT PULL] Ceph updates for 5.12-rc1
Hi Linus, The following changes since commit f40ddce88593482919761f74910f42f4b84c004b: Linux 5.11 (2021-02-14 14:32:24 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.12-rc1 for you to fetch changes up to 558b4510f622a3d96cf9d95050a04e7793d343c7: ceph: defer flushing the capsnap if the Fb is used (2021-02-16 12:09:52 +0100) With netfs helper library and fscache rework delayed, just a few cap handling improvements to avoid grabbing mmap_lock in some code paths and deal with capsnaps better and a mount option cleanup. Ilya Dryomov (2): libceph: deprecate [no]cephx_require_signatures options libceph: remove osdtimeout option entirely Jeff Layton (3): ceph: fix flush_snap logic after putting caps ceph: clean up inode work queueing ceph: allow queueing cap/snap handling after putting cap references Xiubo Li (1): ceph: defer flushing the capsnap if the Fb is used fs/ceph/addr.c | 2 +- fs/ceph/caps.c | 70 +++- fs/ceph/inode.c | 61 -- fs/ceph/snap.c | 10 +++ fs/ceph/super.h | 40 + include/linux/ceph/libceph.h | 7 ++--- net/ceph/ceph_common.c | 17 --- 7 files changed, 115 insertions(+), 92 deletions(-)
Re: [PATCH] ceph: Fix an Oops in error handling
On Tue, Feb 2, 2021 at 6:47 AM Dan Carpenter wrote: > > The "req" pointer is an error pointer and not NULL so this check needs > to be fixed. > > Fixes: 1cf7fdf52d5a ("ceph: convert readpage to fscache read helper") > Signed-off-by: Dan Carpenter > --- > fs/ceph/addr.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c > index 5eec6f66fe52..fb0238a4d34f 100644 > --- a/fs/ceph/addr.c > +++ b/fs/ceph/addr.c > @@ -273,7 +273,7 @@ static void ceph_netfs_issue_op(struct > netfs_read_subrequest *subreq) > if (err) > iput(inode); > out: > - if (req) > + if (!IS_ERR_OR_NULL(req)) > ceph_osdc_put_request(req); > if (err) > netfs_subreq_terminated(subreq, err); Hi Dan, I think a better fix would be either to set req to NULL in the offending IS_ERR branch (ceph_osdc_new_request() never returns NULL) or to use two separate goto labels. While at it, the initialization of req and the check on req before calling ceph_osdc_put_request() are redundant. Thanks, Ilya
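Editorial note: the bug discussed above comes from testing an error pointer against NULL. The sketch below models the idea in userspace, under stated assumptions: err_ptr(), is_err() and ptr_err() are stand-ins patterned on the kernel's ERR_PTR/IS_ERR/PTR_ERR helpers, while struct request, new_request(), put_request() and issue_op() are hypothetical and only illustrate the "separate exit path" shape suggested in the reply. It is not the code that was eventually merged.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace stand-ins patterned on the kernel's error-pointer helpers. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool is_err(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errno values. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct request { int refs; };

/* Hypothetical allocator: returns a valid object or an error pointer, never NULL. */
struct request *new_request(struct request *storage, bool fail)
{
	if (fail)
		return err_ptr(-ENOMEM);
	storage->refs = 1;
	return storage;
}

void put_request(struct request *req) { req->refs--; }

/* Returns 0 or a negative errno; an error pointer never reaches put_request(). */
int issue_op(struct request *storage, bool fail)
{
	struct request *req = new_request(storage, fail);

	if (is_err(req))
		return (int)ptr_err(req);   /* separate exit: nothing to put */

	/* ... submit the request ... */
	put_request(req);
	return 0;
}
```

Note that a plain "if (req)" test would treat err_ptr(-ENOMEM) as a valid pointer, because an error pointer is non-NULL. That is the condition the original patch set out to fix.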
Re: [PATCH] ceph: Replace DEFINE_SIMPLE_ATTRIBUTE with DEFINE_DEBUGFS_ATTRIBUTE
On Mon, Feb 1, 2021 at 8:52 AM Jiapeng Chong wrote: > > Fix the following coccicheck warning: > > ./fs/ceph/debugfs.c:347:0-23: WARNING: congestion_kb_fops should be > defined with DEFINE_DEBUGFS_ATTRIBUTE. > > Reported-by: Abaci Robot > Signed-off-by: Jiapeng Chong > --- > fs/ceph/debugfs.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c > index 66989c8..617327e 100644 > --- a/fs/ceph/debugfs.c > +++ b/fs/ceph/debugfs.c > @@ -344,8 +344,8 @@ static int congestion_kb_get(void *data, u64 *val) > return 0; > } > > -DEFINE_SIMPLE_ATTRIBUTE(congestion_kb_fops, congestion_kb_get, > - congestion_kb_set, "%llu\n"); > +DEFINE_DEBUGFS_ATTRIBUTE(congestion_kb_fops, congestion_kb_get, > + congestion_kb_set, "%llu\n"); > > > void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc) Hi Jiapeng, What is the benefit of this conversion? From a quick look, with DEFINE_DEBUGFS_ATTRIBUTE the writeback_congestion_kb file would no longer be seekable. It may not matter much, but it is something that should have been mentioned. Further, debugfs_create_file() creates a full proxy for fops, protecting against removal races. DEFINE_DEBUGFS_ATTRIBUTE adds its own protection but just for ->read() and ->write(). I don't think we need both. Thanks, Ilya
Re: [PATCH 0/6] ceph: convert to new netfs read helpers
On Thu, Jan 28, 2021 at 1:52 PM Jeff Layton wrote: > > On Wed, 2021-01-27 at 23:50 +0100, Ilya Dryomov wrote: > > On Tue, Jan 26, 2021 at 2:41 PM Jeff Layton wrote: > > > > > > This patchset converts ceph to use the new netfs readpage, write_begin, > > > and readahead helpers to handle buffered reads. This is a substantial > > > reduction in code in ceph, but shouldn't really affect functionality in > > > any way. > > > > > > Ilya, if you don't have any objections, I'll plan to let David pull this > > > series into his tree to be merged with the netfs API patches themselves. > > > > Sure, that works for me. > > > > I would have expected that the new netfs infrastructure is pushed > > to a public branch that individual filesystems could peruse, but since > > David's set already includes patches for AFS and NFS, let's tag along. > > > > Thanks, > > > > Ilya > > David has a fscache-netfs-lib branch that has all of the infrastructure > changes. See: > > > https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=fscache-netfs-lib I saw that, but AFAICS it hasn't been declared public (as in suitable for other people to base their work on, with the promise that history won't get rewritten). It is branched off of what looks like a random snapshot of Linus' tree instead of a release point, etc. Thanks, Ilya
Re: [PATCH 0/6] ceph: convert to new netfs read helpers
On Tue, Jan 26, 2021 at 2:41 PM Jeff Layton wrote: > > This patchset converts ceph to use the new netfs readpage, write_begin, > and readahead helpers to handle buffered reads. This is a substantial > reduction in code in ceph, but shouldn't really affect functionality in > any way. > > Ilya, if you don't have any objections, I'll plan to let David pull this > series into his tree to be merged with the netfs API patches themselves. Sure, that works for me. I would have expected that the new netfs infrastructure is pushed to a public branch that individual filesystems could peruse, but since David's set already includes patches for AFS and NFS, let's tag along. Thanks, Ilya
[GIT PULL] Ceph fixes for 5.11-rc5
Hi Linus, The following changes since commit e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62: Linux 5.11-rc2 (2021-01-03 15:55:30 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.11-rc5 for you to fetch changes up to 9d5ae6f3c50a6f718b6d4be3c7b0828966e01b05: libceph: fix "Boolean result is used in bitwise operation" warning (2021-01-21 16:49:59 +0100) A patch to zero out sensitive cryptographic data and two minor cleanups prompted by the fact that a bunch of code was moved in this cycle. ---- Ilya Dryomov (3): libceph: zero out session key and connection secret libceph, ceph: disambiguate ceph_connection_operations handlers libceph: fix "Boolean result is used in bitwise operation" warning fs/ceph/mds_client.c| 34 ++--- net/ceph/auth_x.c | 57 + net/ceph/crypto.c | 3 ++- net/ceph/messenger_v1.c | 2 +- net/ceph/messenger_v2.c | 45 +- net/ceph/mon_client.c | 14 ++-- net/ceph/osd_client.c | 40 +- 7 files changed, 107 insertions(+), 88 deletions(-)
Re: [kbuild] net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in bitwise operation. Clarify expression with parentheses.
On Wed, Jan 20, 2021 at 1:43 PM Dan Carpenter wrote: > > On Wed, Jan 20, 2021 at 12:01:59PM +0100, Ilya Dryomov wrote: > > On Tue, Jan 19, 2021 at 8:46 PM Dan Carpenter > > wrote: > > > > > > tree: > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > > > master > > > head: 1e2a199f6ccdc15cf111d68d212e2fd4ce65682e > > > commit: 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 libceph: move msgr1 > > > protocol implementation to its own file > > > compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 > > > > > > If you fix the issue, kindly add following tag as appropriate > > > Reported-by: kernel test robot > > > > > > > > > cppcheck possible warnings: (new ones prefixed by >>, may not real > > > problems) > > > > > > >> net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in > > > >> bitwise operation. Clarify expression with parentheses. > > > >> [clarifyCondition] > > > BUG_ON(!con->in_msg ^ skip); > > > ^ > > > > > > vim +1099 net/ceph/messenger_v1.c > > > > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1033 static int > > > read_partial_message(struct ceph_connection *con) > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1034 { > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1035 struct ceph_msg > > > *m = con->in_msg; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1036 int size; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1037 int end; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1038 int ret; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1039 unsigned int > > > front_len, middle_len, data_len; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1040 bool do_datacrc = > > > !ceph_test_opt(from_msgr(con->msgr), NOCRC); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1041 bool need_sign = > > > (con->peer_features & CEPH_FEATURE_MSG_AUTH); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1042 u64 seq; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1043 u32 crc; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1044 > > > 2f713615ddd9d805 Ilya Dryomov 
2020-11-12 1045 > > > dout("read_partial_message con %p msg %p\n", con, m); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1046 > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1047 /* header */ > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1048 size = sizeof > > > (con->in_hdr); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1049 end = size; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1050 ret = > > > read_partial(con, end, size, &con->in_hdr); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1051 if (ret <= 0) > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1052 return > > > ret; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1053 > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1054 crc = crc32c(0, > > > &con->in_hdr, offsetof(struct ceph_msg_header, crc)); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1055 if > > > (cpu_to_le32(crc) != con->in_hdr.crc) { > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1056 > > > pr_err("read_partial_message bad hdr crc %u != expected %u\n", > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1057 > > > crc, con->in_hdr.crc); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1058 return > > > -EBADMSG; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1059 } > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1060 > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1061 front_len = > > > le32_to_cpu(con->in_hdr.front_len); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1062 if (front_len > > > > CEPH_MSG_MAX_FRONT_LEN) > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1063 return > > > -EIO; > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1064 middle_len = > > > le32_to_cpu(con->in_hdr.middle_len); > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1065 if (middle_len > > > > CEPH_MSG_MAX_MIDDLE_LEN) > > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1066
Re: [kbuild] net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in bitwise operation. Clarify expression with parentheses.
On Tue, Jan 19, 2021 at 8:46 PM Dan Carpenter wrote: > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 1e2a199f6ccdc15cf111d68d212e2fd4ce65682e > commit: 2f713615ddd9d805b6c5e79c52e0e11af99d2bf1 libceph: move msgr1 protocol > implementation to its own file > compiler: gcc-9 (Debian 9.3.0-15) 9.3.0 > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > > cppcheck possible warnings: (new ones prefixed by >>, may not real problems) > > >> net/ceph/messenger_v1.c:1099:23: warning: Boolean result is used in > >> bitwise operation. Clarify expression with parentheses. [clarifyCondition] > BUG_ON(!con->in_msg ^ skip); > ^ > > vim +1099 net/ceph/messenger_v1.c > > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1033 static int > read_partial_message(struct ceph_connection *con) > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1034 { > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1035 struct ceph_msg *m = > con->in_msg; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1036 int size; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1037 int end; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1038 int ret; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1039 unsigned int > front_len, middle_len, data_len; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1040 bool do_datacrc = > !ceph_test_opt(from_msgr(con->msgr), NOCRC); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1041 bool need_sign = > (con->peer_features & CEPH_FEATURE_MSG_AUTH); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1042 u64 seq; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1043 u32 crc; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1044 > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1045 > dout("read_partial_message con %p msg %p\n", con, m); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1046 > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1047 /* header */ > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1048 size = sizeof > (con->in_hdr); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1049 end 
= size; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1050 ret = > read_partial(con, end, size, &con->in_hdr); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1051 if (ret <= 0) > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1052 return ret; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1053 > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1054 crc = crc32c(0, > &con->in_hdr, offsetof(struct ceph_msg_header, crc)); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1055 if (cpu_to_le32(crc) > != con->in_hdr.crc) { > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1056 > pr_err("read_partial_message bad hdr crc %u != expected %u\n", > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1057 crc, > con->in_hdr.crc); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1058 return > -EBADMSG; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1059 } > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1060 > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1061 front_len = > le32_to_cpu(con->in_hdr.front_len); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1062 if (front_len > > CEPH_MSG_MAX_FRONT_LEN) > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1063 return -EIO; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1064 middle_len = > le32_to_cpu(con->in_hdr.middle_len); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1065 if (middle_len > > CEPH_MSG_MAX_MIDDLE_LEN) > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1066 return -EIO; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1067 data_len = > le32_to_cpu(con->in_hdr.data_len); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1068 if (data_len > > CEPH_MSG_MAX_DATA_LEN) > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1069 return -EIO; > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1070 > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1071 /* verify seq# */ > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1072 seq = > le64_to_cpu(con->in_hdr.seq); > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1073 if ((s64)seq - > (s64)con->in_seq < 1) { > 2f713615ddd9d805 Ilya Dryomov 2020-11-12 1074 > pr_info("skipping %s%lld %s seq %lld expected %lld\n", > 
2f713615ddd9d805 Ilya Dryomov 2020-11-12 1075 > ENTITY_NAME(con->peer_name),
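Editorial note: the cppcheck warning quoted above concerns "!con->in_msg ^ skip". Because unary ! binds tighter than ^, the expression already parses as "(!con->in_msg) ^ skip"; the warning is about readability, not a miscompile. The sketch below shows two equivalent spellings of the same consistency check. The function names are hypothetical, and skip is modeled as a bool for simplicity; nothing here is claimed to be the form the kernel fix adopted.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * True when the state is inconsistent: either no message and no skip,
 * or both a message and a skip. Parenthesized XOR form.
 */
bool mismatch_xor(const void *in_msg, bool skip)
{
	return (!in_msg) ^ skip;
}

/* Same truth table, written as a comparison of two booleans. */
bool mismatch_neq(const void *in_msg, bool skip)
{
	return (in_msg == NULL) != skip;
}
```

Both helpers return true for (NULL, false) and (non-NULL, true), and false for the other two combinations, so either spelling could feed an assertion such as BUG_ON() without changing behavior.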
[GIT PULL] Ceph fixes for 5.11-rc2
Hi Linus, The following changes since commit 5c8fe583cce542aa0b84adc939ce85293de36e5e: Linux 5.11-rc1 (2020-12-27 15:30:22 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.11-rc2 for you to fetch changes up to 664f1e259a982bf213f0cd8eea7616c89546585c: libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE (2020-12-28 20:34:33 +0100) A fix for an edge case in MClientRequest encoding and a couple of trivial fixups for the new msgr2 support. Ilya Dryomov (4): ceph: reencode gid_list when reconnecting libceph: fix auth_signature buffer allocation in secure mode libceph: align session_key and con_secret to 16 bytes libceph: add __maybe_unused to DEFINE_MSGR2_FEATURE fs/ceph/mds_client.c | 53 --- include/linux/ceph/msgr.h | 4 ++-- net/ceph/messenger_v2.c | 15 +++--- 3 files changed, 36 insertions(+), 36 deletions(-)
[GIT PULL] Ceph updates for 5.11-rc1
Hi Linus, The following changes since commit 2c85ebc57b3e1817b6ce1a6b703928e113a90442: Linux 5.10 (2020-12-13 14:41:30 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.11-rc1 for you to fetch changes up to 2f0df6cfa325d7106b8a65bc0e02db1086e3f73b: libceph: drop ceph_auth_{create,update}_authorizer() (2020-12-14 23:21:50 +0100) There is a build conflict caused by the split of crypto/sha.h into crypto/sha1.h and crypto/sha2.h that affects net/ceph/messenger_v2.c. The resolution is to include the latter, done in for-linus-merged just in case. The big ticket item here is support for msgr2 on-wire protocol, which adds the option of full in-transit encryption using AES-GCM algorithm (myself). On top of that we have a series to avoid intermittent errors during recovery with recover_session=clean and some MDS request encoding work from Jeff, a cap handling fix and assorted observability improvements from Luis and Xiubo and a good number of cleanups. Luis also ran into a corner case with quotas which sadly means that we are back to denying cross-quota-realm renames. 
Colin Ian King (1): ceph: remove redundant assignment to variable i Ilya Dryomov (34): libceph: include middle_len in process_message() dout libceph: lower exponential backoff delay libceph: don't call reset_connection() on version/feature mismatches libceph: split protocol reset bits out of reset_connection() libceph: rename reset_connection() to ceph_con_reset_session() libceph: clear con->peer_global_seq on RESETSESSION libceph: remove redundant session reset log message libceph: drop msg->ack_stamp field libceph: handle discarding acked and requeued messages separately libceph: change ceph_msg_data_cursor_init() to take cursor libceph: change ceph_con_in_msg_alloc() to take hdr libceph: factor out ceph_con_get_out_msg() libceph: make sure our addr->port is zero and addr->nonce is non-zero libceph: don't export ceph_messenger_{init_fini}() to modules libceph: make con->state an int libceph: rename and export con->state states libceph: rename and export con->flags bits libceph: export zero_page libceph: export remaining protocol independent infrastructure libceph: separate msgr1 protocol implementation libceph: move msgr1 protocol implementation to its own file libceph: move msgr1 protocol specific fields to its own struct libceph: more insight into ticket expiry and invalidation libceph: safer en/decoding of cephx requests and replies libceph, ceph: incorporate nautilus cephx changes libceph: amend cephx init_protocol() and build_request() libceph: drop ac->ops->name field libceph: factor out finish_auth() libceph, ceph: get and handle cluster maps with addrvecs libceph, rbd: ignore addr->type while comparing in some cases libceph: introduce connection modes and ms_mode option libceph, ceph: implement msgr2.1 protocol (crc and secure modes) libceph, ceph: make use of __ceph_auth_get_authorizer() in msgr1 libceph: drop ceph_auth_{create,update}_authorizer() Jeff Layton (15): ceph: don't WARN when removing caps due to blocklisting ceph: make fsc->mount_state an 
int ceph: add new RECOVER mount_state when recovering session ceph: remove timeout on allowing reconnect after blocklisting ceph: queue MDS requests to REJECTED sessions when CLEANRECOVER is set ceph: fix up some warnings on W=1 builds ceph: acquire Fs caps when getting dir stats ceph: ensure we have Fs caps when fetching dir link count ceph: pass down the flags to grab_cache_page_write_begin ceph: fix inode refcount leak when ceph_fill_inode on non-I_NEW inode fails ceph: when filling trace, call ceph_get_inode outside of mutexes ceph: don't reach into request header for readdir info ceph: take a cred reference instead of tracking individual uid/gid ceph: clean up argument lists to __prepare_send_request and __send_request ceph: implement updated ceph_mds_request_head structure Liu, Changcheng (1): libceph: remove unused port macros Luis Henriques (4): ceph: fix race in concurrent __ceph_remove_cap invocations ceph: downgrade warning from mdsmap decode to debug Revert "ceph: allow rename operation under different quota realms" ceph: add ceph.caps vxattr Xiubo Li (4): ceph: send dentry lease metrics to MDS daemon ceph: add status debugfs file ceph: add ceph.{cluster_fsid/client_id} vxattrs ceph: set osdmap epoch for setxattr drivers/block/rbd.c|8 +
[GIT PULL] Ceph fix for 5.10-rc3
Hi Linus, The following changes since commit 3cea11cd5e3b00d91caf0b4730194039b45c5891: Linux 5.10-rc2 (2020-11-01 14:43:51 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.10-rc3 for you to fetch changes up to 62575e270f661aba64778cbc5f354511cf9abb21: ceph: check session state after bumping session->s_seq (2020-11-04 20:55:49 +0100) A fix for a potential stall on umount caused by the MDS dropping our REQUEST_CLOSE message. The code that handled this case was inadvertently disabled in 5.9, this patch removes it entirely and fixes the problem in a way that is consistent with ceph-fuse. Jeff Layton (1): ceph: check session state after bumping session->s_seq fs/ceph/caps.c | 2 +- fs/ceph/mds_client.c | 50 +++--- fs/ceph/mds_client.h | 1 + fs/ceph/quota.c | 2 +- fs/ceph/snap.c | 2 +- 5 files changed, 39 insertions(+), 18 deletions(-)
Re: [PATCH v2 31/39] docs: ABI: cleanup several ABI documents
On Fri, Oct 30, 2020 at 8:41 AM Mauro Carvalho Chehab wrote: > > There are some ABI documents that, while they don't generate > any warnings, they have issues when parsed by get_abi.pl script > on its output result. > > Address them, in order to provide a clean output. > > Acked-by: Jonathan Cameron #for IIO > Reviewed-by: Tom Rix # for fpga-manager > Reviewed-By: Kajol Jain # for > sysfs-bus-event_source-devices-hv_gpci and > sysfs-bus-event_source-devices-hv_24x7 > Acked-by: Oded Gabbay # for Habanalabs > Acked-by: Vaibhav Jain # for sysfs-bus-papr-pmem > Signed-off-by: Mauro Carvalho Chehab > > [...] > > Documentation/ABI/testing/sysfs-bus-rbd | 37 ++- Acked-by: Ilya Dryomov # for rbd Thanks, Ilya
[GIT PULL] Ceph updates for 5.10-rc1
Hi Linus, The following changes since commit bbf5c979011a099af5dc76498918ed7df445635b: Linux 5.9 (2020-10-11 14:15:50 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.10-rc1 for you to fetch changes up to 28e1581c3b4ea5f98530064a103c6217bedeea73: libceph: clear con->out_msg on Policy::stateful_server faults (2020-10-12 15:29:27 +0200) We have: - a patch that removes crush_workspace_mutex (myself). CRUSH computations are no longer serialized and can run in parallel. - a couple new filesystem client metrics for "ceph fs top" command (Xiubo Li) - a fix for a very old messenger bug that affected the filesystem, marked for stable (myself) - assorted fixups and cleanups throughout the codebase from Jeff and others. ---- Ilya Dryomov (9): libceph: multiple workspaces for CRUSH computations libceph, rbd, ceph: "blacklist" -> "blocklist" libceph: switch to the new "osd blocklist add" command ceph: add a note explaining session reject error string ceph: mark ceph_fmt_xattr() as printf-like for better type checking libceph: move a dout in queue_con_delay() libceph: fix ENTITY_NAME format suggestion libceph: format ceph_entity_addr nonces as unsigned libceph: clear con->out_msg on Policy::stateful_server faults Jeff Layton (12): ceph: drop special-casing for ITER_PIPE in ceph_sync_read ceph: use kill_anon_super helper ceph: have ceph_writepages_start call pagevec_lookup_range_tag ceph: break out writeback of incompatible snap context to separate function ceph: don't call ceph_update_writeable_page from page_mkwrite ceph: fold ceph_sync_readpages into ceph_readpage ceph: fold ceph_sync_writepages into writepage_nounlock ceph: fold ceph_update_writeable_page into ceph_write_begin ceph: don't SetPageError on readpage errors ceph: drop separate mdsc argument from __send_cap ceph: break up send_cap_msg ceph: comment cleanups and clarifications Luis Henriques (1): ceph: remove unnecessary return in switch statement Matthew Wilcox 
(Oracle) (1): ceph: promote to unsigned long long before shifting Xiubo Li (2): ceph: add ceph_sb_to_mdsc helper support to parse the mdsc ceph: metrics for opened files, pinned caps and opened inodes Yan, Zheng (1): ceph: encode inodes' parent/d_name in cap reconnect message Yanhu Cao (1): ceph: add column 'mds' to show caps in more user friendly Documentation/filesystems/ceph.rst | 6 +- drivers/block/rbd.c| 8 +- fs/ceph/addr.c | 416 + fs/ceph/caps.c | 128 fs/ceph/debugfs.c | 18 +- fs/ceph/dir.c | 20 +- fs/ceph/file.c | 85 +++- fs/ceph/inode.c| 10 +- fs/ceph/locks.c| 2 +- fs/ceph/mds_client.c | 109 ++ fs/ceph/mds_client.h | 2 +- fs/ceph/metric.c | 14 ++ fs/ceph/metric.h | 7 + fs/ceph/quota.c| 10 +- fs/ceph/snap.c | 2 +- fs/ceph/super.c| 8 +- fs/ceph/super.h| 13 +- fs/ceph/xattr.c| 3 +- include/linux/ceph/messenger.h | 2 +- include/linux/ceph/mon_client.h| 2 +- include/linux/ceph/osdmap.h| 14 +- include/linux/ceph/rados.h | 2 +- include/linux/crush/crush.h| 3 + net/ceph/messenger.c | 13 +- net/ceph/mon_client.c | 69 -- net/ceph/osdmap.c | 166 +-- 26 files changed, 689 insertions(+), 443 deletions(-)
[GIT PULL] Ceph fix for 5.9-rc5
Hi Linus, The following changes since commit f4d51dffc6c01a9e94650d95ce0104964f8ae822: Linux 5.9-rc4 (2020-09-06 17:11:40 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.9-rc5 for you to fetch changes up to f44d04e696feaf13d192d942c4f14ad2e117065a: rbd: require global CAP_SYS_ADMIN for mapping and unmapping (2020-09-07 13:14:30 +0200) A fix to add missing capability checks in rbd, marked for stable. Ilya Dryomov (1): rbd: require global CAP_SYS_ADMIN for mapping and unmapping drivers/block/rbd.c | 12 1 file changed, 12 insertions(+)
Re: [trivial PATCH] treewide: Convert switch/case fallthrough; to break;
| 2 +-
> drivers/tty/vt/vt_ioctl.c           | 2 +-
> drivers/usb/dwc3/core.c             | 2 +-
> drivers/usb/gadget/legacy/inode.c   | 2 +-
> drivers/usb/gadget/udc/pxa25x_udc.c | 4 ++--
> drivers/usb/host/ohci-hcd.c         | 2 +-
> drivers/usb/isp1760/isp1760-hcd.c   | 2 +-
> drivers/usb/musb/cppi_dma.c         | 2 +-
> drivers/usb/phy/phy-fsl-usb.c       | 2 +-
> drivers/video/fbdev/stifb.c         | 2 +-
> fs/afs/yfsclient.c                  | 8
> fs/ceph/dir.c                       | 2 +-

For ceph:

Acked-by: Ilya Dryomov

Thanks,

                Ilya
Re: [PATCH AUTOSEL 5.8 25/42] ceph: fix inode number handling on arches with 32-bit ino_t
On Mon, Aug 31, 2020 at 5:30 PM Sasha Levin wrote: > > From: Jeff Layton > > [ Upstream commit ebce3eb2f7ef9f6ef01a60874ebd232450107c9a ] > > Tuan and Ulrich mentioned that they were hitting a problem on s390x, > which has a 32-bit ino_t value, even though it's a 64-bit arch (for > historical reasons). > > I think the current handling of inode numbers in the ceph driver is > wrong. It tries to use 32-bit inode numbers on 32-bit arches, but that's > actually not a problem. 32-bit arches can deal with 64-bit inode numbers > just fine when userland code is compiled with LFS support (the common > case these days). > > What we really want to do is just use 64-bit numbers everywhere, unless > someone has mounted with the ino32 mount option. In that case, we want > to ensure that we hash the inode number down to something that will fit > in 32 bits before presenting the value to userland. > > Add new helper functions that do this, and only do the conversion before > presenting these values to userland in getattr and readdir. > > The inode table hashvalue is changed to just cast the inode number to > unsigned long, as low-order bits are the most likely to vary anyway. > > While it's not strictly required, we do want to put something in > inode->i_ino. Instead of basing it on BITS_PER_LONG, however, base it on > the size of the ino_t type. > > NOTE: This is a user-visible change on 32-bit arches: > > 1/ inode numbers will be seen to have changed between kernel versions. >32-bit arches will see large inode numbers now instead of the hashed >ones they saw before. > > 2/ any really old software not built with LFS support may start failing >stat() calls with -EOVERFLOW on inode numbers >2^32. Nothing much we >can do about these, but hopefully the intersection of people running >such code on ceph will be very small. > > The workaround for both problems is to mount with "-o ino32". 
> > [ idryomov: changelog tweak ] > > URL: https://tracker.ceph.com/issues/46828 > Reported-by: Ulrich Weigand > Reported-and-Tested-by: Tuan Hoang1 > Signed-off-by: Jeff Layton > Reviewed-by: "Yan, Zheng" > Signed-off-by: Ilya Dryomov > Signed-off-by: Sasha Levin > --- > fs/ceph/caps.c | 14 - > fs/ceph/debugfs.c| 4 +-- > fs/ceph/dir.c| 31 --- > fs/ceph/file.c | 4 +-- > fs/ceph/inode.c | 19 ++-- > fs/ceph/mds_client.h | 2 +- > fs/ceph/quota.c | 4 +-- > fs/ceph/super.h | 73 +++- > 8 files changed, 74 insertions(+), 77 deletions(-) > > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c > index 972c13aa42259..1206a481c5fc7 100644 > --- a/fs/ceph/caps.c > +++ b/fs/ceph/caps.c > @@ -886,8 +886,8 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, > int mask, int touch) > int have = ci->i_snap_caps; > > if ((have & mask) == mask) { > - dout("__ceph_caps_issued_mask ino 0x%lx snap issued %s" > -" (mask %s)\n", ci->vfs_inode.i_ino, > + dout("__ceph_caps_issued_mask ino 0x%llx snap issued %s" > +" (mask %s)\n", ceph_ino(&ci->vfs_inode), > ceph_cap_string(have), > ceph_cap_string(mask)); > return 1; > @@ -898,8 +898,8 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, > int mask, int touch) > if (!__cap_is_valid(cap)) > continue; > if ((cap->issued & mask) == mask) { > - dout("__ceph_caps_issued_mask ino 0x%lx cap %p issued > %s" > -" (mask %s)\n", ci->vfs_inode.i_ino, cap, > + dout("__ceph_caps_issued_mask ino 0x%llx cap %p > issued %s" > +" (mask %s)\n", ceph_ino(&ci->vfs_inode), cap, > ceph_cap_string(cap->issued), > ceph_cap_string(mask)); > if (touch) > @@ -910,8 +910,8 @@ int __ceph_caps_issued_mask(struct ceph_inode_info *ci, > int mask, int touch) > /* does a combination of caps satisfy mask? */ > have |= cap->issued; > if ((have & mask) == mask) { > - dout("__ceph_caps_issued_mask ino 0x%lx combo issued > %s" > -" (mask %s)\n", ci->vfs_inode.i_ino, > + dout("__ceph_caps_issued_mask ino 0x%
[GIT PULL] Ceph fixes for 5.9-rc3
Hi Linus,

The following changes since commit d012a7190fc1fd72ed48911e77ca97ba4521bccd:

  Linux 5.9-rc2 (2020-08-23 14:08:43 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.9-rc3

for you to fetch changes up to 496ceaf12432b3d136dcdec48424312e71359ea7:

  ceph: don't allow setlease on cephfs (2020-08-24 20:06:54 +0200)

We have an inode number handling change, prompted by s390x which is
a 64-bit architecture with a 32-bit ino_t, a patch to disallow leases
to avoid potential data integrity issues when CephFS is re-exported
via NFS or CIFS and a fix for the bulk of W=1 compilation warnings.

Ilya Dryomov (1):
      libceph: add __maybe_unused to DEFINE_CEPH_FEATURE

Jeff Layton (2):
      ceph: fix inode number handling on arches with 32-bit ino_t
      ceph: don't allow setlease on cephfs

 fs/ceph/caps.c                     | 14
 fs/ceph/debugfs.c                  |  4 +--
 fs/ceph/dir.c                      | 31 +++-
 fs/ceph/file.c                     |  5 +--
 fs/ceph/inode.c                    | 19 +-
 fs/ceph/mds_client.h               |  2 +-
 fs/ceph/quota.c                    |  4 +--
 fs/ceph/super.h                    | 73 --
 include/linux/ceph/ceph_features.h |  8 ++--
 9 files changed, 79 insertions(+), 81 deletions(-)
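
For readers following the inode number change: the commit message quoted earlier in this archive says the driver should use 64-bit numbers everywhere and only hash them down to 32 bits before presenting them to userland when mounted with ino32. Below is a minimal userspace sketch of that idea. The function names, the XOR fold and the non-zero fallback value are illustrative assumptions, not a copy of the helpers in fs/ceph/super.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a 64-bit inode number into 32 bits by XORing its two halves.
 * Illustrative only: the name and the fallback value used when the
 * fold yields 0 are assumptions made for this sketch.
 */
static uint32_t fold_ino64_to_ino32(uint64_t ino64)
{
	uint32_t ino = (uint32_t)(ino64 & 0xffffffffu);

	ino ^= (uint32_t)(ino64 >> 32);
	if (!ino)
		ino = 2;	/* never hand out inode number 0 */
	return ino;
}

/*
 * Value to present to userland: the full 64-bit number, unless the
 * (hypothetical) ino32_mounted flag is set, in which case the folded
 * 32-bit value is used.
 */
static uint64_t present_ino(uint64_t ino64, int ino32_mounted)
{
	return ino32_mounted ? fold_ino64_to_ino32(ino64) : ino64;
}
```

The point of doing the fold only at the presentation boundary (getattr, readdir) is that everything inside the client keeps working with the full 64-bit value.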
Re: [PATCH] rbd: Convert to use the preferred fallthrough macro
On Wed, Aug 19, 2020 at 3:03 PM Jens Axboe wrote:
>
> On 8/19/20 1:53 AM, Miaohe Lin wrote:
> > Convert the uses of fallthrough comments to fallthrough macro.
>
> Applied, thanks.

Hi Jens,

This has already been folded into another patch in ceph-client.git.
Please drop it.

Thanks,

                Ilya
Re: [PATCH] ceph: Convert to use the preferred fallthrough macro
On Wed, Aug 19, 2020 at 10:53 AM Miaohe Lin wrote:
>
> Convert the uses of fallthrough comments to fallthrough macro.
>
> Signed-off-by: Hongxiang Lou
> Signed-off-by: Miaohe Lin
> ---
>  fs/ceph/file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index d51c3f2fdca0..30cd00265181 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -252,7 +252,7 @@ static int ceph_init_file(struct inode *inode, struct file *file, int fmode)
>         case S_IFREG:
>                 ceph_fscache_register_inode_cookie(inode);
>                 ceph_fscache_file_set_cookie(inode, file);
> -               /* fall through */
> +               fallthrough;
>         case S_IFDIR:
>                 ret = ceph_init_file_info(inode, file, fmode,
>                                           S_ISDIR(inode->i_mode));
> --
> 2.19.1
>

Hi Miaohe,

I've already done that, folding into your previous patch:

  https://github.com/ceph/ceph-client/commit/3f19ae89547df1b8ccba359a2f7ddba0f108ffbd

Thanks,

                Ilya
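
As a side note on the conversion discussed in this thread, here is a small userspace sketch of a switch that uses a fallthrough pseudo-keyword in place of a comment. The macro below is a stand-in written for this example (the kernel ships its own definition), and the demo_* names are hypothetical.

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's fallthrough pseudo-keyword.
 * Uses the statement attribute where the compiler has it and an empty
 * statement otherwise.  This definition is an assumption made for the
 * example, not the kernel's own.
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)	/* fallthrough */
#endif

enum demo_kind { DEMO_REG, DEMO_DIR, DEMO_OTHER };

/* Returns a bitmask of the setup steps that ran for the given kind. */
static int demo_init(enum demo_kind kind)
{
	int steps = 0;

	switch (kind) {
	case DEMO_REG:
		steps |= 1;	/* setup done for regular files only */
		fallthrough;
	case DEMO_DIR:
		steps |= 2;	/* setup shared by files and directories */
		break;
	default:
		break;
	}
	return steps;
}
```

Unlike a comment, the attribute form is something the compiler can check, which is what makes -Wimplicit-fallthrough usable with Clang.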
Re: [RFC PATCH] ceph: Delete features that are not used in the kernel
On Wed, Aug 19, 2020 at 9:57 AM Leon Romanovsky wrote:
>
> From: Leon Romanovsky
>
> The ceph_features.h has declaration of features that are not in-use
> in kernel code. This causes to seeing such compilation warnings in
> almost every kernel compilation.
>
> ./include/linux/ceph/ceph_features.h:14:24: warning: 'CEPH_FEATURE_UID' defined but not used [-Wunused-const-variable=]
>    14 | static const uint64_t CEPH_FEATURE_##name = (1ULL< |^
> ./include/linux/ceph/ceph_features.h:75:1: note: in expansion of macro 'DEFINE_CEPH_FEATURE'
>    75 | DEFINE_CEPH_FEATURE( 0, 1, UID)
>       | ^~~
>
> The upstream kernel indeed doesn't have any use of them, so delete it.
>
> Signed-off-by: Leon Romanovsky
> ---
> I'm sending this as RFC because probably the patch is wrong, but I
> would like to bring your attention to the existing problem and asking
> for an acceptable solution.

Hi Leon,

Yes, removing unused feature definitions is wrong.  Annotating them as
potentially unused would be much better -- I'll send a patch.

I don't think any of us builds with W=1, so these things don't get
noticed.

Thanks,

                Ilya
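
To illustrate the "annotate as potentially unused" approach mentioned in the reply, here is a userspace sketch. The macro is a simplified, hypothetical two-argument variant (the DEFINE_CEPH_FEATURE quoted above takes three arguments), demo_maybe_unused stands in for the kernel's __maybe_unused, and the FOO/BAR feature names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's __maybe_unused annotation. */
#define demo_maybe_unused __attribute__((__unused__))

/*
 * Hypothetical, simplified feature-definition macro.  Every constant
 * it defines is marked as possibly unused, so a translation unit that
 * never references one of them builds without
 * -Wunused-const-variable warnings.
 */
#define DEMO_DEFINE_FEATURE(bit, name) \
	static const uint64_t demo_maybe_unused DEMO_FEATURE_##name = (1ULL << (bit));

DEMO_DEFINE_FEATURE(0, UID)	/* never referenced below, still no warning */
DEMO_DEFINE_FEATURE(1, FOO)
DEMO_DEFINE_FEATURE(3, BAR)

/* Returns 1 if every bit in mask is set in features, 0 otherwise. */
static int demo_has_feature(uint64_t features, uint64_t mask)
{
	return (features & mask) == mask;
}
```

This keeps the full list of feature bits in the header as documentation of the protocol, which is the reason given above for not deleting the unused ones.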
Re: [PATCH] libceph: Convert to use the preferred fallthrough macro
On Tue, Aug 18, 2020 at 9:56 PM Jeff Layton wrote: > > On Tue, 2020-08-18 at 08:26 -0400, Miaohe Lin wrote: > > Convert the uses of fallthrough comments to fallthrough macro. > > > > Signed-off-by: Miaohe Lin > > --- > > net/ceph/ceph_hash.c| 20 ++-- > > net/ceph/crush/mapper.c | 2 +- > > net/ceph/messenger.c| 4 ++-- > > net/ceph/mon_client.c | 2 +- > > net/ceph/osd_client.c | 4 ++-- > > 5 files changed, 16 insertions(+), 16 deletions(-) > > > > diff --git a/net/ceph/ceph_hash.c b/net/ceph/ceph_hash.c > > index 81e1e006c540..16a47c0eef37 100644 > > --- a/net/ceph/ceph_hash.c > > +++ b/net/ceph/ceph_hash.c > > @@ -50,35 +50,35 @@ unsigned int ceph_str_hash_rjenkins(const char *str, > > unsigned int length) > > switch (len) { > > case 11: > > c = c + ((__u32)k[10] << 24); > > - /* fall through */ > > + fallthrough; > > case 10: > > c = c + ((__u32)k[9] << 16); > > - /* fall through */ > > + fallthrough; > > case 9: > > c = c + ((__u32)k[8] << 8); > > /* the first byte of c is reserved for the length */ > > - /* fall through */ > > + fallthrough; > > case 8: > > b = b + ((__u32)k[7] << 24); > > - /* fall through */ > > + fallthrough; > > case 7: > > b = b + ((__u32)k[6] << 16); > > - /* fall through */ > > + fallthrough; > > case 6: > > b = b + ((__u32)k[5] << 8); > > - /* fall through */ > > + fallthrough; > > case 5: > > b = b + k[4]; > > - /* fall through */ > > + fallthrough; > > case 4: > > a = a + ((__u32)k[3] << 24); > > - /* fall through */ > > + fallthrough; > > case 3: > > a = a + ((__u32)k[2] << 16); > > - /* fall through */ > > + fallthrough; > > case 2: > > a = a + ((__u32)k[1] << 8); > > - /* fall through */ > > + fallthrough; > > case 1: > > a = a + k[0]; > > /* case 0: nothing left to add */ > > diff --git a/net/ceph/crush/mapper.c b/net/ceph/crush/mapper.c > > index 07e5614eb3f1..7057f8db4f99 100644 > > --- a/net/ceph/crush/mapper.c > > +++ b/net/ceph/crush/mapper.c > > @@ -987,7 +987,7 @@ int crush_do_rule(const struct crush_map *map, > > case 
CRUSH_RULE_CHOOSELEAF_FIRSTN: > > case CRUSH_RULE_CHOOSE_FIRSTN: > > firstn = 1; > > - /* fall through */ > > + fallthrough; > > case CRUSH_RULE_CHOOSELEAF_INDEP: > > case CRUSH_RULE_CHOOSE_INDEP: > > if (wsize == 0) > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > > index 27d6ab11f9ee..bdfd66ba3843 100644 > > --- a/net/ceph/messenger.c > > +++ b/net/ceph/messenger.c > > @@ -412,7 +412,7 @@ static void ceph_sock_state_change(struct sock *sk) > > switch (sk->sk_state) { > > case TCP_CLOSE: > > dout("%s TCP_CLOSE\n", __func__); > > - /* fall through */ > > + fallthrough; > > case TCP_CLOSE_WAIT: > > dout("%s TCP_CLOSE_WAIT\n", __func__); > > con_sock_state_closing(con); > > @@ -2751,7 +2751,7 @@ static int try_read(struct ceph_connection *con) > > switch (ret) { > > case -EBADMSG: > > con->error_msg = "bad crc/signature"; > > - /* fall through */ > > + fallthrough; > > case -EBADE: > > ret = -EIO; > > break; > > diff --git a/net/ceph/mon_client.c b/net/ceph/mon_client.c > > index 3d8c8015e976..d633a0aeaa55 100644 > > --- a/net/ceph/mon_client.c > > +++ b/net/ceph/mon_client.c > > @@ -1307,7 +1307,7 @@ static struct ceph_msg *mon_alloc_msg(struct > > ceph_connection *con, > >* request had a non-zero tid. Work around this weirdness > >* by allocating a new message. > >*/ > > - /* fall through */ > > + fallthrough; > > case CEPH_MSG_MON_MAP: > > case CEPH_MSG_MDS_MAP: > > case CEPH_MSG_OSD_MAP: > > diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c > > index e4fbcad6e7d8..7901ab6c79fd 100644 > > --- a/net/ceph/osd_client.c > > +++ b/net/ceph/osd_client.c > > @@ -3854,7 +3854,7 @@ static void scan_requests(struct ceph_osd *osd, > > if (!force_resend && !force_resend_writes) > > break; > > > > - /* fall through */ > > + fallthrough; > > case CALC_TARGET_NEED_RESEND: > >
Re: [PATCH V2 6/6] ceph_debug: Remove now unused dout macro definitions
On Mon, Aug 17, 2020 at 3:34 AM Joe Perches wrote:
>
> All the uses have be converted to pr_debug, so remove these.
>
> Signed-off-by: Joe Perches
> ---
>  include/linux/ceph/ceph_debug.h | 30 --
>  1 file changed, 30 deletions(-)
>
> diff --git a/include/linux/ceph/ceph_debug.h b/include/linux/ceph/ceph_debug.h
> index d5a5da838caf..81c0d7195f1e 100644
> --- a/include/linux/ceph/ceph_debug.h
> +++ b/include/linux/ceph/ceph_debug.h
> @@ -6,34 +6,4 @@
>
>  #include
>
> -#ifdef CONFIG_CEPH_LIB_PRETTYDEBUG
> -
> -/*
> - * wrap pr_debug to include a filename:lineno prefix on each line.
> - * this incurs some overhead (kernel size and execution time) due to
> - * the extra function call at each call site.
> - */
> -
> -# if defined(DEBUG) || defined(CONFIG_DYNAMIC_DEBUG)
> -#  define dout(fmt, ...) \
> -       pr_debug("%.*s %12.12s:%-4d : " fmt, \
> -                8 - (int)sizeof(KBUILD_MODNAME), "", \
> -                kbasename(__FILE__), __LINE__, ##__VA_ARGS__)
> -# else
> -/* faux printk call just to see any compiler warnings. */
> -#  define dout(fmt, ...) do { \
> -               if (0) \
> -                       printk(KERN_DEBUG fmt, ##__VA_ARGS__); \
> -       } while (0)
> -# endif
> -
> -#else
> -
> -/*
> - * or, just wrap pr_debug
> - */
> -# define dout(fmt, ...) pr_debug(" " fmt, ##__VA_ARGS__)
> -
> -#endif
> -
>  #endif
> --
> 2.26.0
>

Hi Joe,

Yeah, roughly the same thing can be achieved with +flmp instead of just
+p with PRETTYDEBUG, but PRETTYDEBUG formatting actually predates those
flags and some of us still use bash scripts from back then.  We also
have a few guides and blog entries with just +p, but that's not a big
deal.

I'd be fine with removing CONFIG_CEPH_LIB_PRETTYDEBUG since it's
disabled by default and in all major distributions, but I'm not a fan
of a wide-sweeping dout -> pr_debug change.  We do extensive
backporting to older kernels and these kinds of changes are rather
annoying.  dout is shorter to type too ;)

I know that in some cases the function names are outdated or
duplicated, but I prefer fixing them gradually, along with actual code
changes in the area (i.e. similar to whitespace).

Thanks,

                Ilya
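
The "faux printk call" branch of the dout macro quoted above relies on a small trick that may be worth spelling out: putting the call behind if (0) lets the compiler check the format string against the arguments while generating no code for it. A userspace sketch of the same idea follows, with printf standing in for printk and hypothetical demo_* names.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Compiled-out debug macro.  The printf() call sits behind "if (0)",
 * so format/argument mismatches are still diagnosed at build time,
 * but the call is never executed and its arguments are never
 * evaluated.  ##__VA_ARGS__ is a GNU extension, as in the original.
 */
#define demo_dout(fmt, ...) do {			\
	if (0)						\
		printf(fmt, ##__VA_ARGS__);		\
} while (0)

static int demo_calls;

/* Has a visible side effect so we can tell whether it was called. */
static int demo_next_id(void)
{
	return ++demo_calls;
}

/* Returns how many times demo_next_id() ran: 0 with debugging off. */
static int demo_run(void)
{
	demo_dout("id %d\n", demo_next_id());	/* type-checked only */
	return demo_calls;
}
```

One consequence to keep in mind with this pattern is that arguments with side effects behave differently depending on whether debugging is compiled in.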
[GIT PULL] Ceph updates for 5.9-rc1
Hi Linus,

The following changes since commit bcf876870b95592b52519ed4aafcf9d95999bc9c:

  Linux 5.8 (2020-08-02 14:21:45 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.9-rc1

for you to fetch changes up to 02e37571f9e79022498fd0525c073b07e9d9ac69:

  ceph: handle zero-length feature mask in session messages (2020-08-05 17:47:07 +0200)

Xiubo has completed his work on filesystem client metrics, they are
sent to all available MDSes once per second now.

Other than that, we have a lot of fixes and cleanups all around the
filesystem, including a tweak to cut down on MDS request resends in
multi-MDS setups from Yanhu and fixups for SELinux symlink labeling
and MClientSession message decoding from Jeff.

Alexander A. Klimov (1):
      libceph: replace HTTP links with HTTPS ones

Colin Ian King (1):
      ceph: remove redundant initialization of variable mds

Ilya Dryomov (2):
      libceph: use target_copy() in send_linger()
      libceph: dump class and method names on method calls

Jeff Layton (5):
      ceph: clean up and optimize ceph_check_delayed_caps()
      libceph: just have osd_req_op_init() return a pointer
      ceph: set sec_context xattr on symlink creation
      ceph: move sb->wb_pagevec_pool to be a global mempool
      ceph: handle zero-length feature mask in session messages

Jia Yang (1):
      ceph: remove unused variables in ceph_mdsmap_decode()

Randy Dunlap (1):
      ceph: delete repeated words in fs/ceph/

Xiubo Li (9):
      ceph: add check_session_state() helper and make it global
      ceph: add global total_caps to count the mdsc's total caps number
      ceph: switch to WARN_ON_ONCE in encode_supported_features()
      ceph: fix potential mdsc use-after-free crash
      ceph: do not access the kiocb after aio requests
      ceph: check the sesion state and return false in case it is closed
      ceph: periodically send perf metrics to MDSes
      ceph: send client provided metric flags in client metadata
      ceph: fix use-after-free for fsc->mdsc

Xu Wang (1):
      ceph: remove unnecessary cast in kfree()

Yanhu Cao (1):
      ceph: use frag's MDS in either mode

 fs/ceph/Kconfig                    |   2 +-
 fs/ceph/addr.c                     |  23 +++--
 fs/ceph/caps.c                     |  12 +--
 fs/ceph/debugfs.c                  |  16 +---
 fs/ceph/dir.c                      |   4 +
 fs/ceph/file.c                     |   5 +-
 fs/ceph/mds_client.c               | 184 +
 fs/ceph/mds_client.h               |   7 +-
 fs/ceph/mdsmap.c                   |  10 +-
 fs/ceph/metric.c                   | 149 ++
 fs/ceph/metric.h                   |  91 ++
 fs/ceph/super.c                    |  64 ++---
 fs/ceph/super.h                    |   6 +-
 fs/ceph/xattr.c                    |  12 +--
 include/linux/ceph/ceph_features.h |   2 +-
 include/linux/ceph/ceph_fs.h       |   1 +
 include/linux/ceph/libceph.h       |   1 +
 include/linux/ceph/osd_client.h    |   2 +-
 include/linux/crush/crush.h        |   2 +-
 net/ceph/Kconfig                   |   2 +-
 net/ceph/ceph_hash.c               |   2 +-
 net/ceph/crush/hash.c              |   2 +-
 net/ceph/crush/mapper.c            |   2 +-
 net/ceph/debugfs.c                 |   3 +
 net/ceph/osd_client.c              |  43 -
 25 files changed, 511 insertions(+), 136 deletions(-)
Re: [PATCH] Replace HTTP links with HTTPS ones: CEPH COMMON CODE (LIBCEPH)
On Wed, Jul 8, 2020 at 8:53 AM Alexander A. Klimov wrote: > > Rationale: > Reduces attack surface on kernel devs opening the links for MITM > as HTTPS traffic is much harder to manipulate. > > Deterministic algorithm: > For each file: > If not .svg: > For each line: > If doesn't contain `\bxmlns\b`: > For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: > If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: > If both the HTTP and HTTPS versions > return 200 OK and serve the same content: > Replace HTTP with HTTPS. > > Signed-off-by: Alexander A. Klimov > --- > Continuing my work started at 93431e0607e5. > See also: git log --oneline '--author=Alexander A. Klimov > ' v5.7..master > (Actually letting a shell for loop submit all this stuff for me.) > > If there are any URLs to be removed completely or at least not HTTPSified: > Just clearly say so and I'll *undo my change*. > See also: https://lkml.org/lkml/2020/6/27/64 > > If there are any valid, but yet not changed URLs: > See: https://lkml.org/lkml/2020/6/26/837 > > If you apply the patch, please let me know. > > > net/ceph/ceph_hash.c| 2 +- > net/ceph/crush/hash.c | 2 +- > net/ceph/crush/mapper.c | 2 +- > 3 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/net/ceph/ceph_hash.c b/net/ceph/ceph_hash.c > index 9a5850f264ed..81e1e006c540 100644 > --- a/net/ceph/ceph_hash.c > +++ b/net/ceph/ceph_hash.c > @@ -4,7 +4,7 @@ > > /* > * Robert Jenkin's hash function. > - * http://burtleburtle.net/bob/hash/evahash.html > + * https://burtleburtle.net/bob/hash/evahash.html > * This is in the public domain. 
> */ > #define mix(a, b, c) \ > diff --git a/net/ceph/crush/hash.c b/net/ceph/crush/hash.c > index e5cc603cdb17..fe79f6d2d0db 100644 > --- a/net/ceph/crush/hash.c > +++ b/net/ceph/crush/hash.c > @@ -7,7 +7,7 @@ > > /* > * Robert Jenkins' function for mixing 32-bit values > - * http://burtleburtle.net/bob/hash/evahash.html > + * https://burtleburtle.net/bob/hash/evahash.html > * a, b = random bits, c = input and output > */ > #define crush_hashmix(a, b, c) do {\ > diff --git a/net/ceph/crush/mapper.c b/net/ceph/crush/mapper.c > index 3f323ed9df52..07e5614eb3f1 100644 > --- a/net/ceph/crush/mapper.c > +++ b/net/ceph/crush/mapper.c > @@ -298,7 +298,7 @@ static __u64 crush_ln(unsigned int xin) > * > * for reference, see: > * > - * > http://en.wikipedia.org/wiki/Exponential_distribution#Distribution_of_the_minimum_of_exponential_random_variables > + * > https://en.wikipedia.org/wiki/Exponential_distribution#Distribution_of_the_minimum_of_exponential_random_variables > * > */ > Applied with a couple more link fixes folded in. Thanks, Ilya
Re: [PATCH] fs: ceph: Remove unnecessary cast in kfree()
On Wed, Jul 8, 2020 at 9:27 AM Xu Wang wrote: > > Remove unnecassary casts in the argument to kfree. > > Signed-off-by: Xu Wang > --- > fs/ceph/xattr.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c > index 71ee34d160c3..3a733ac33d9b 100644 > --- a/fs/ceph/xattr.c > +++ b/fs/ceph/xattr.c > @@ -497,10 +497,10 @@ static int __set_xattr(struct ceph_inode_info *ci, > kfree(*newxattr); > *newxattr = NULL; > if (xattr->should_free_val) > - kfree((void *)xattr->val); > + kfree(xattr->val); > > if (update_xattr) { > - kfree((void *)name); > + kfree(name); > name = xattr->name; > } > ci->i_xattrs.names_size -= xattr->name_len; > @@ -566,9 +566,9 @@ static void __free_xattr(struct ceph_inode_xattr *xattr) > BUG_ON(!xattr); > > if (xattr->should_free_name) > - kfree((void *)xattr->name); > + kfree(xattr->name); > if (xattr->should_free_val) > - kfree((void *)xattr->val); > + kfree(xattr->val); > > kfree(xattr); > } > @@ -582,9 +582,9 @@ static int __remove_xattr(struct ceph_inode_info *ci, > rb_erase(&xattr->node, &ci->i_xattrs.index); > > if (xattr->should_free_name) > - kfree((void *)xattr->name); > + kfree(xattr->name); > if (xattr->should_free_val) > - kfree((void *)xattr->val); > + kfree(xattr->val); > > ci->i_xattrs.names_size -= xattr->name_len; > ci->i_xattrs.vals_size -= xattr->val_len; Applied. Thanks, Ilya
[GIT PULL] Ceph fixes for 5.8-rc2
Hi Linus,

The following changes since commit b3a9e3b9622ae10064826dccb4f7a52bd88c7407:

  Linux 5.8-rc1 (2020-06-14 12:45:04 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.8-rc2

for you to fetch changes up to 7ed286f3e061ee394782bd9fb4ed96bff0b5a021:

  libceph: don't omit used_replica in target_copy() (2020-06-16 16:02:08 +0200)

An important follow-up for replica reads support that went into -rc1
and two target_copy() fixups.

Ilya Dryomov (3):
      libceph: move away from global osd_req_flags
      libceph: don't omit recovery_deletes in target_copy()
      libceph: don't omit used_replica in target_copy()

 drivers/block/rbd.c          |  4 +++-
 include/linux/ceph/libceph.h |  4 ++--
 net/ceph/ceph_common.c       | 14 ++
 net/ceph/osd_client.c        |  9 -
 4 files changed, 15 insertions(+), 16 deletions(-)
[GIT PULL] Ceph updates for 5.8-rc1
Hi Linus,

The following changes since commit 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162:

  Linux 5.7 (2020-05-31 16:49:15 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.8-rc1

for you to fetch changes up to dc1dad8e1a612650b1e786e992cb0c6e101e226a:

  rbd: compression_hint option (2020-06-01 23:32:35 +0200)

The highlights are:

- OSD/MDS latency and caps cache metrics infrastructure for the
  filesystem (Xiubo Li).  Currently available through debugfs and
  will be periodically sent to the MDS in the future.

- support for replica reads (balanced and localized reads) for rbd
  and the filesystem (myself).  The default remains to always read
  from primary, users can opt-in with the new crush_location and
  read_from_replica options.  Note that reading from replica is safe
  for general use only since Octopus.

- support for RADOS allocation hint flags (myself).  Currently used
  by rbd to propagate the compressible/incompressible hint given with
  the new compression_hint map option and ready for passing on more
  advanced hints, e.g. based on fadvise() from the filesystem.

- support for efficient cross-quota-realm renames (Luis Henriques)

- assorted cap handling improvements and cleanups, particularly
  untangling some of the locking (Jeff Layton)

Gustavo A. R. Silva (1):
      libceph, rbd: replace zero-length array with flexible-array

Ilya Dryomov (7):
      libceph: add non-asserting rbtree insertion helper
      libceph: decode CRUSH device/bucket types and names
      libceph: crush_location infrastructure
      libceph: support for balanced and localized reads
      libceph: read_from_replica option
      libceph: support for alloc hint flags
      rbd: compression_hint option

Jeff Layton (11):
      ceph: reorganize __send_cap for less spinlock abuse
      ceph: split up __finish_cap_flush
      ceph: add comments for handle_cap_flush_ack logic
      ceph: don't release i_ceph_lock in handle_cap_trunc
      ceph: don't take i_ceph_lock in handle_cap_import
      ceph: document what protects i_dirty_item and i_flushing_item
      ceph: fix potential race in ceph_check_caps
      ceph: throw a warning if we destroy session with mutex still locked
      ceph: convert mdsc->cap_dirty to a per-session list
      ceph: request expedited service on session's last cap flush
      ceph: ceph_kick_flushing_caps needs the s_mutex

Luis Henriques (3):
      ceph: normalize 'delta' parameter usage in check_quota_exceeded
      ceph: allow rename operation under different quota realms
      ceph: don't return -ESTALE if there's still an open file

Xiubo Li (6):
      ceph: add dentry lease metric support
      ceph: add caps perf metric for each superblock
      ceph: add read/write latency metric support
      ceph: add metadata perf metric support
      ceph: make sure mdsc->mutex is nested in s->s_mutex to fix dead lock
      ceph: skip checking caps when session reconnecting and releasing reqs

Yan, Zheng (1):
      ceph: reset i_requested_max_size if file write is not wanted

 drivers/block/rbd.c             |  44 -
 drivers/block/rbd_types.h       |   2 +-
 fs/ceph/Makefile                |   2 +-
 fs/ceph/acl.c                   |   2 +-
 fs/ceph/addr.c                  |  20 ++
 fs/ceph/caps.c                  | 425 ++--
 fs/ceph/debugfs.c               | 100 +-
 fs/ceph/dir.c                   |  26 ++-
 fs/ceph/export.c                |   9 +-
 fs/ceph/file.c                  |  30 +++
 fs/ceph/inode.c                 |   4 +-
 fs/ceph/mds_client.c            |  48 -
 fs/ceph/mds_client.h            |  15 +-
 fs/ceph/metric.c                | 148 ++
 fs/ceph/metric.h                |  62 ++
 fs/ceph/quota.c                 |  62 +-
 fs/ceph/super.h                 |  34 +++-
 fs/ceph/xattr.c                 |   4 +-
 include/linux/ceph/libceph.h    |  13 +-
 include/linux/ceph/mon_client.h |   2 +-
 include/linux/ceph/osd_client.h |   8 +-
 include/linux/ceph/osdmap.h     |  19 +-
 include/linux/ceph/rados.h      |  14 ++
 include/linux/crush/crush.h     |  14 +-
 net/ceph/ceph_common.c          |  75 +++
 net/ceph/crush/crush.c          |   3 +-
 net/ceph/debugfs.c              |   6 +-
 net/ceph/osd_client.c           | 103 +-
 net/ceph/osdmap.c               | 363 +-
 29 files changed, 1405 insertions(+), 252 deletions(-)
 create mode 100644 fs/ceph/metric.c
 create mode 100644 fs/ceph/metric.h
[GIT PULL] Ceph fixes for 5.7-rc8
Hi Linus,

The following changes since commit 9cb1fd0efd195590b828b9b865421ad345a4a145:

  Linux 5.7-rc7 (2020-05-24 15:32:54 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.7-rc8

for you to fetch changes up to fb33c114d3ed5bdac230716f5b0a93b56b92a90d:

  ceph: flush release queue when handling caps for unknown inode (2020-05-27 13:03:57 +0200)

Cache tiering and cap handling fixups, both marked for stable.

Jeff Layton (1):
      ceph: flush release queue when handling caps for unknown inode

Jerry Lee (1):
      libceph: ignore pool overlay and cache logic on redirects

 fs/ceph/caps.c        | 2 +-
 net/ceph/osd_client.c | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)
[PATCH v3] vsprintf: don't obfuscate NULL and error pointers
I don't see what security concern is addressed by obfuscating NULL and IS_ERR() error pointers, printed with %p/%pK. Given the number of sites where %p is used (over 1) and the fact that NULL pointers aren't uncommon, it probably wouldn't take long for an attacker to find the hash that corresponds to 0. Although harder, the same goes for most common error values, such as -1, -2, -11, -14, etc. The NULL part actually fixes a regression: NULL pointers weren't obfuscated until commit 3e5903eb9cff ("vsprintf: Prevent crash when dereferencing invalid pointers") which went into 5.2. I'm tacking the IS_ERR() part on here because error pointers won't leak kernel addresses and printing them as pointers shouldn't be any different from e.g. %d with PTR_ERR_OR_ZERO(). Obfuscating them just makes debugging based on existing pr_debug and friends excruciating. Note that the "always print 0's for %pK when kptr_restrict == 2" behaviour which goes way back is left as is. Example output with the patch applied: ptr error-ptr NULL %p:01f8cc5b fff2 %pK, kptr = 0: 01f8cc5b fff2 %px: 888048c04020 fff2 %pK, kptr = 1: 888048c04020 fff2 %pK, kptr = 2: Fixes: 3e5903eb9cff ("vsprintf: Prevent crash when dereferencing invalid pointers") Signed-off-by: Ilya Dryomov Reviewed-by: Petr Mladek Reviewed-by: Sergey Senozhatsky Acked-by: Steven Rostedt (VMware) Acked-by: Linus Torvalds --- lib/test_printf.c | 19 ++- lib/vsprintf.c| 7 +++ 2 files changed, 25 insertions(+), 1 deletion(-) Hi Petr, This just came up again, please consider sending this to Linus for 5.7. Prior discussion was split in three threads and revolved around the vision for how lib/test_printf.c should be structured between Rasmus and yourself. The fix itself wasn't disputed and has several acks. If you want to restructure the test suite before adding any new test cases, v1 doesn't have them, but I'm reposting with test cases because I think it's best to add them right away to prevent further regressions. 
v3: - don't use EAGAIN macro in error_pointer() test case as the actual error code varies between architectures v2: - fix null_pointer() test case (it didn't catch the original regression because test_hashed() doesn't really test much) and add error_pointer() test case diff --git a/lib/test_printf.c b/lib/test_printf.c index 2d9f520d2f27..6b1622f4d7c2 100644 --- a/lib/test_printf.c +++ b/lib/test_printf.c @@ -214,6 +214,7 @@ test_string(void) #define PTR_STR "0123456789ab" #define PTR_VAL_NO_CRNG "(ptrval)" #define ZEROS "" /* hex 32 zero bits */ +#define ONES ""/* hex 32 one bits */ static int __init plain_format(void) @@ -245,6 +246,7 @@ plain_format(void) #define PTR_STR "456789ab" #define PTR_VAL_NO_CRNG "(ptrval)" #define ZEROS "" +#define ONES "" static int __init plain_format(void) @@ -330,14 +332,28 @@ test_hashed(const char *fmt, const void *p) test(buf, fmt, p); } +/* + * NULL pointers aren't hashed. + */ static void __init null_pointer(void) { - test_hashed("%p", NULL); + test(ZEROS "", "%p", NULL); test(ZEROS "", "%px", NULL); test("(null)", "%pE", NULL); } +/* + * Error pointers aren't hashed. + */ +static void __init +error_pointer(void) +{ + test(ONES "fff5", "%p", ERR_PTR(-11)); + test(ONES "fff5", "%px", ERR_PTR(-11)); + test("(efault)", "%pE", ERR_PTR(-11)); +} + #define PTR_INVALID ((void *)0x00ab) static void __init @@ -649,6 +665,7 @@ test_pointer(void) { plain(); null_pointer(); + error_pointer(); invalid_pointer(); symbol_ptr(); kernel_ptr(); diff --git a/lib/vsprintf.c b/lib/vsprintf.c index 7c488a1ce318..f0f0522cd5a7 100644 --- a/lib/vsprintf.c +++ b/lib/vsprintf.c @@ -794,6 +794,13 @@ static char *ptr_to_id(char *buf, char *end, const void *ptr, unsigned long hashval; int ret; + /* +* Print the real pointer value for NULL and error pointers, +* as they are not actual addresses. +*/ + if (IS_ERR_OR_NULL(ptr)) + return pointer_string(buf, end, ptr, spec); + /* When debugging early boot use non-cryptographically secure hash. 
*/ if (unlikely(debug_boot_weak_hash)) { hashval = hash_long((unsigned long)ptr, 32); -- 2.19.2
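
For readers unfamiliar with error pointers, the check added to ptr_to_id() in the patch above can be approximated in userspace as follows. This is a sketch under stated assumptions: DEMO_MAX_ERRNO is assumed to mirror the kernel's MAX_ERRNO, the demo_* helpers are hypothetical re-implementations rather than the kernel's macros, and demo_obfuscate() is a placeholder, not the hash the kernel uses.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095	/* assumed to mirror the kernel's MAX_ERRNO */

static int demo_object;		/* something with an ordinary address */

/* Encode a negative errno value as a pointer, in the style of ERR_PTR(). */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True for NULL and for pointers in the top DEMO_MAX_ERRNO addresses. */
static int demo_is_err_or_null(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Placeholder for the real hashing step; NOT a secure hash. */
static uintptr_t demo_obfuscate(const void *ptr)
{
	return (uintptr_t)ptr ^ (uintptr_t)0x5a5a5a5a;
}

/*
 * Value that would be printed for a pointer: NULL and error pointers
 * are passed through unchanged because they are not real addresses,
 * everything else is obfuscated.
 */
static uintptr_t demo_printed_value(const void *ptr)
{
	if (demo_is_err_or_null(ptr))
		return (uintptr_t)ptr;
	return demo_obfuscate(ptr);
}
```

With this shape, an error pointer built from -11 keeps its low bits (0x...fff5), which matches the expected strings in the error_pointer() test case above.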
Re: linux-next: new contact(s) for the ceph tree?
On Sat, May 9, 2020 at 5:47 AM Stephen Rothwell wrote: > > Hi Sage, > > On Sat, 9 May 2020 01:03:14 + (UTC) Sage Weil wrote: > > > > Jeff Layton > > Done. > > On Sat, 9 May 2020, Stephen Rothwell wrote: > > > > > > I noticed commit > > > > > > 3a5ccecd9af7 ("MAINTAINERS: remove myself as ceph co-maintainer") > > > > > > appear recently. So who should I now list as the contact(s) for the > > > ceph tree? Hi Stephen, I thought maintainers were on the list automatically. If there is a separate list, please add me as well. Thanks, Ilya
[GIT PULL] Ceph fixes for 5.7-rc5
Hi Linus,

The following changes since commit 0e698dfa282211e414076f9dc7e83c1c288314fd:

  Linux 5.7-rc4 (2020-05-03 14:56:04 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.7-rc5

for you to fetch changes up to 12ae44a40a1be891bdc6463f8c7072b4ede746ef:

  ceph: demote quotarealm lookup warning to a debug message (2020-05-08 18:44:40 +0200)

Fixes for an endianness handling bug that prevented mounts on
big-endian arches, a spammy log message and a couple error paths.
Also included a MAINTAINERS update.

Jeff Layton (1):
      ceph: fix endianness bug when handling MDS session feature bits

Luis Henriques (1):
      ceph: demote quotarealm lookup warning to a debug message

Sage Weil (1):
      MAINTAINERS: remove myself as ceph co-maintainer

Wu Bo (2):
      ceph: fix special error code in ceph_try_get_caps()
      ceph: fix double unlock in handle_cap_export()

 MAINTAINERS          | 6 --
 fs/ceph/caps.c       | 3 ++-
 fs/ceph/mds_client.c | 8 +++-
 fs/ceph/quota.c      | 4 ++--
 4 files changed, 7 insertions(+), 14 deletions(-)
Re: [PATCH] rbd: Replace zero-length array with flexible-array
On Thu, May 7, 2020 at 9:15 PM Gustavo A. R. Silva wrote: > > The current codebase makes use of the zero-length array language > extension to the C90 standard, but the preferred mechanism to declare > variable-length types such as these ones is a flexible array member[1][2], > introduced in C99: > > struct foo { > int stuff; > struct boo array[]; > }; > > By making use of the mechanism above, we will get a compiler warning > in case the flexible array does not occur last in the structure, which > will help us prevent some kind of undefined behavior bugs from being > inadvertently introduced[3] to the codebase from now on. > > Also, notice that, dynamic memory allocations won't be affected by > this change: > > "Flexible array members have incomplete type, and so the sizeof operator > may not be applied. As a quirk of the original implementation of > zero-length arrays, sizeof evaluates to zero."[1] > > sizeof(flexible-array-member) triggers a warning because flexible array > members have incomplete type[1]. There are some instances of code in > which the sizeof operator is being incorrectly/erroneously applied to > zero-length arrays and the result is zero. Such instances may be hiding > some bugs. So, this work (flexible-array member conversions) will also > help to get completely rid of those sorts of issues. > > This issue was found with the help of Coccinelle. > > [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html > [2] https://github.com/KSPP/linux/issues/21 > [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") > > Signed-off-by: Gustavo A. R. 
Silva > --- > drivers/block/rbd_types.h |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/block/rbd_types.h b/drivers/block/rbd_types.h > index ac98ab6ccd3b..a600e0eb6b6f 100644 > --- a/drivers/block/rbd_types.h > +++ b/drivers/block/rbd_types.h > @@ -93,7 +93,7 @@ struct rbd_image_header_ondisk { > __le32 snap_count; > __le32 reserved; > __le64 snap_names_len; > - struct rbd_image_snap_ondisk snaps[0]; > + struct rbd_image_snap_ondisk snaps[]; > } __attribute__((packed)); > > > Applied (folded into libceph patch). Thanks, Ilya
Re: [PATCH] libceph: Replace zero-length array with flexible-array
On Thu, May 7, 2020 at 8:47 PM Gustavo A. R. Silva wrote: > > The current codebase makes use of the zero-length array language > extension to the C90 standard, but the preferred mechanism to declare > variable-length types such as these ones is a flexible array member[1][2], > introduced in C99: > > struct foo { > int stuff; > struct boo array[]; > }; > > By making use of the mechanism above, we will get a compiler warning > in case the flexible array does not occur last in the structure, which > will help us prevent some kind of undefined behavior bugs from being > inadvertently introduced[3] to the codebase from now on. > > Also, notice that, dynamic memory allocations won't be affected by > this change: > > "Flexible array members have incomplete type, and so the sizeof operator > may not be applied. As a quirk of the original implementation of > zero-length arrays, sizeof evaluates to zero."[1] > > sizeof(flexible-array-member) triggers a warning because flexible array > members have incomplete type[1]. There are some instances of code in > which the sizeof operator is being incorrectly/erroneously applied to > zero-length arrays and the result is zero. Such instances may be hiding > some bugs. So, this work (flexible-array member conversions) will also > help to get completely rid of those sorts of issues. > > This issue was found with the help of Coccinelle. > > [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html > [2] https://github.com/KSPP/linux/issues/21 > [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") > > Signed-off-by: Gustavo A. R. 
Silva > --- > include/linux/ceph/mon_client.h |2 +- > include/linux/crush/crush.h |2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/linux/ceph/mon_client.h b/include/linux/ceph/mon_client.h > index dbb8a6959a73..ce4ffeb384d7 100644 > --- a/include/linux/ceph/mon_client.h > +++ b/include/linux/ceph/mon_client.h > @@ -19,7 +19,7 @@ struct ceph_monmap { > struct ceph_fsid fsid; > u32 epoch; > u32 num_mon; > - struct ceph_entity_inst mon_inst[0]; > + struct ceph_entity_inst mon_inst[]; > }; > > struct ceph_mon_client; > diff --git a/include/linux/crush/crush.h b/include/linux/crush/crush.h > index 54741295c70b..38b0e4d50ed9 100644 > --- a/include/linux/crush/crush.h > +++ b/include/linux/crush/crush.h > @@ -87,7 +87,7 @@ struct crush_rule_mask { > struct crush_rule { > __u32 len; > struct crush_rule_mask mask; > - struct crush_rule_step steps[0]; > + struct crush_rule_step steps[]; > }; > > #define crush_rule_size(len) (sizeof(struct crush_rule) + \ > Applied. Thanks, Ilya
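For readers unfamiliar with the pattern, here is a minimal userspace sketch of a C99 flexible array member and the matching size computation. The names struct step, struct rule, rule_size() and rule_new() are made up for illustration. They only mirror the shape of crush_rule and are not the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct step {
	int op;
	int arg;
};

struct rule {
	unsigned int len;
	struct step steps[];	/* flexible array member, must come last */
};

/* Bytes needed for a rule that carries len steps. */
static size_t rule_size(unsigned int len)
{
	return sizeof(struct rule) + len * sizeof(struct step);
}

/* Allocate a zeroed rule with room for len steps; NULL on failure. */
static struct rule *rule_new(unsigned int len)
{
	struct rule *r = calloc(1, rule_size(len));

	if (r)
		r->len = len;
	return r;
}

/* Allocate, touch the last step, free; returns the stored length. */
static unsigned int demo_rule_len(unsigned int len)
{
	struct rule *r = rule_new(len);
	unsigned int stored;

	assert(r && len > 0);
	r->steps[len - 1].op = 1;
	stored = r->len;
	free(r);
	return stored;
}
```

sizeof(struct rule) does not count the trailing array, so the allocation size is computed the same way as with the old [0] form. What changes is that the compiler can now reject a flexible array that is not the last member.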
Re: [PATCH] ceph: demote quotarealm lookup warning to a debug message
On Thu, May 7, 2020 at 3:44 PM Jeff Layton wrote: > > On Tue, 2020-05-05 at 13:59 +0100, Luis Henriques wrote: > > A misconfigured cephx can easily result in having the kernel client > > flooding the logs with: > > > > ceph: Can't lookup inode 1 (err: -13) > > > > Change this message to debug level. > > > > Link: https://tracker.ceph.com/issues/44546 > > Signed-off-by: Luis Henriques > > --- > > Hi! > > > > This patch should fix some harmless warnings when using cephx to restrict > > users access to certain filesystem paths. I've added a comment to the > > tracker where removing this warning could result (unlikely, IMHO!) in an > > admin to miss not-so-harmless bogus configurations. > > > > Cheers, > > -- > > Luís > > > > fs/ceph/quota.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/fs/ceph/quota.c b/fs/ceph/quota.c > > index de56dee60540..19507e2fdb57 100644 > > --- a/fs/ceph/quota.c > > +++ b/fs/ceph/quota.c > > @@ -159,8 +159,8 @@ static struct inode *lookup_quotarealm_inode(struct > > ceph_mds_client *mdsc, > > } > > > > if (IS_ERR(in)) { > > - pr_warn("Can't lookup inode %llx (err: %ld)\n", > > - realm->ino, PTR_ERR(in)); > > + dout("Can't lookup inode %llx (err: %ld)\n", > > + realm->ino, PTR_ERR(in)); > > qri->timeout = jiffies + msecs_to_jiffies(60 * 1000); /* XXX > > */ > > } else { > > qri->timeout = 0; > > > > Ilya, > > We've had a number of reports where people get a ton of kernel log spam > when they hit this problem. I think we probably ought to mark this patch > for stable and go ahead and send it to Linus for v5.7 -- any objection? Sure, I'll queue it up. Thanks, Ilya
[GIT PULL] Ceph fixes for 5.4-rc4
Hi Linus, The following changes since commit 4f5cafb5cb8471e54afdc9054d973535614f7675: Linux 5.4-rc3 (2019-10-13 16:37:36 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.4-rc4 for you to fetch changes up to 25e6be21230d3208d687dad90b6e43419013c351: rbd: cancel lock_dwork if the wait is interrupted (2019-10-15 17:43:15 +0200) A future-proofing decoding fix from Jeff intended for stable and a patch for a mostly benign race from Dongsheng. Dongsheng Yang (1): rbd: cancel lock_dwork if the wait is interrupted Jeff Layton (1): ceph: just skip unrecognized info in ceph_reply_info_extra drivers/block/rbd.c | 9 ++--- fs/ceph/mds_client.c | 21 +++-- 2 files changed, 17 insertions(+), 13 deletions(-)
Re: [PATCH] function dispatch should return if mds session does not exist
On Mon, Oct 14, 2019 at 11:01 AM Yanhu Cao wrote: > > we shouldn't call ceph_msg_put, otherwise libceph will pass > invalid pointer to mm. > > kernel panic - not syncing: fatal exception > [5452201.213885] [ cut here ] > [5452201.213889] kernel BUG at mm/slub.c:3901! > [5452201.213938] invalid opcode: [#1] SMP PTI > [5452201.213971] CPU: 35 PID: 3037447 Comm: kworker/35:1 Kdump: loaded > Not tainted 4.19.15 #1 > [5452201.214020] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 > Gen9, BIOS P89 01/22/2018 > [5452201.214088] Workqueue: ceph-msgr ceph_con_workfn [libceph] > [5452201.214129] RIP: 0010:kfree+0x15b/0x170 > [5452201.214156] Code: 8b 02 f6 c4 80 75 08 49 8b 42 08 a8 01 74 1b 49 8b > 02 31 f6 f6 c4 80 74 05 41 0f b6 72 51 5b 5d 41 5c 4c 89 d7 e9 95 03 f9 ff > <0f> 0b 48 83 e8 01 e9 01 ff ff ff 49 83 ea 01 e9 e9 fe ff ff 90 0f > [5452201.214262] RSP: 0018:b8c3a0607cb0 EFLAGS: 00010246 > [5452201.214296] RAX: eee84008 RBX: 9130c000 RCX: > 80200016 > [5452201.214339] RDX: 6f0ec000 RSI: RDI: > 9130c000 > [5452201.214383] RBP: 91107f823970 R08: 0001 R09: > > [5452201.214426] R10: eee84000 R11: 0001 R12: > c076c45d > [5452201.214469] R13: 91107f823970 R14: 91107f8239e0 R15: > 91107f823900 > [5452201.214513] FS: () GS:9110bfbc() > knlGS: > [5452201.214562] CS: 0010 DS: ES: CR0: 80050033 > [5452201.214598] CR2: 55993ab29620 CR3: 003a1e00a003 CR4: > 003606e0 > [5452201.214641] DR0: DR1: DR2: > > [5452201.214685] DR3: DR6: fffe0ff0 DR7: > 0400 > [5452201.214728] Call Trace: > [5452201.214759] ceph_msg_release+0x15d/0x190 [libceph] > [5452201.214811] dispatch+0x66/0xa50 [ceph] > [5452201.214846] try_read+0x7f3/0x11d0 [libceph] > [5452201.214878] ? dequeue_entity+0x37e/0x7e0 > [5452201.214907] ? pick_next_task_fair+0x291/0x610 > [5452201.214937] ? dequeue_task_fair+0x5d/0x700 > [5452201.214966] ? 
__switch_to+0x8c/0x470 > [5452201.214999] ceph_con_workfn+0xa2/0x5b0 [libceph] > [5452201.215033] process_one_work+0x16b/0x370 > [5452201.215062] worker_thread+0x49/0x3f0 > [5452201.215089] kthread+0xf5/0x130 > [5452201.215112] ? max_active_store+0x80/0x80 > [5452201.215139] ? kthread_bind+0x10/0x10 > [5452201.215167] ret_from_fork+0x1f/0x30 > > Link: https://tracker.ceph.com/issues/42288 > > Signed-off-by: Yanhu Cao > --- > fs/ceph/mds_client.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c > index a8a8f84f3bbf..066358fea347 100644 > --- a/fs/ceph/mds_client.c > +++ b/fs/ceph/mds_client.c > @@ -4635,7 +4635,7 @@ static void dispatch(struct ceph_connection *con, > struct ceph_msg *msg) > mutex_lock(&mdsc->mutex); > if (__verify_registered_session(mdsc, s) < 0) { > mutex_unlock(&mdsc->mutex); > - goto out; > + return; > } > mutex_unlock(&mdsc->mutex); > > @@ -4672,7 +4672,6 @@ static void dispatch(struct ceph_connection *con, > struct ceph_msg *msg) > pr_err("received unknown message type %d %s\n", type, >ceph_msg_type_name(type)); > } > -out: > ceph_msg_put(msg); > } > Hi Yanhu, This doesn't look right to me. The messenger hands its reference to the dispatch function, the dispatch function is responsible for putting it. Even if the session isn't registered, the message should still be valid and should still be freed. The bug is somewhere else... Thanks, Ilya
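The ownership rule Ilya describes (the messenger hands its reference to dispatch, and dispatch must drop it on every path) can be sketched with a toy refcounted message. This is a userspace illustration under that assumption only. struct msg, msg_new(), msg_put() and the session_registered flag are hypothetical and do not reproduce the libceph API:

```c
#include <assert.h>
#include <stdlib.h>

struct msg {
	int refs;
	int type;
};

static int msgs_freed;	/* counts messages actually released */

/* Create a message holding one reference, owned by the caller. */
static struct msg *msg_new(int type)
{
	struct msg *m = malloc(sizeof(*m));

	if (m) {
		m->refs = 1;
		m->type = type;
	}
	return m;
}

/* Drop one reference; free the message when the last one goes away. */
static void msg_put(struct msg *m)
{
	assert(m && m->refs > 0);
	if (--m->refs == 0) {
		msgs_freed++;
		free(m);
	}
}

/* Consumes the caller's reference on every path, including early exit. */
static void dispatch(struct msg *m, int session_registered)
{
	if (!session_registered)
		goto out;	/* nothing to handle, but the ref is still ours */

	/* ... handle m->type here ... */
out:
	msg_put(m);
}

/* Returns how many messages were freed by one dispatch() call. */
static int frees_after_dispatch(int session_registered)
{
	int before = msgs_freed;
	struct msg *m = msg_new(1);

	if (!m)
		return -1;
	dispatch(m, session_registered);
	return msgs_freed - before;
}
```

Replacing the "goto out" with a bare return, as the patch proposes, would leave frees_after_dispatch(0) at 0 in this model, meaning a leaked message rather than a fix for the crash.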
[GIT PULL] Ceph updates for 5.4-rc1
Hi Linus, The following changes since commit 4d856f72c10ecb060868ed10ff1b1453943fc6c8: Linux 5.3 (2019-09-15 14:19:32 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.4-rc1 for you to fetch changes up to 3ee5a7015c8b7cb4de21f7345f8381946f2fce55: ceph: call ceph_mdsc_destroy from destroy_fs_client (2019-09-16 12:06:25 +0200) The highlights are: - automatic recovery of a blacklisted filesystem session (Zheng Yan). This is disabled by default and can be enabled by mounting with the new "recover_session=clean" option. - serialize buffered reads and O_DIRECT writes (Jeff Layton). Care is taken to avoid serializing O_DIRECT reads and writes with each other, this is based on the exclusion scheme from NFS. - handle large osdmaps better in the face of fragmented memory (myself) - don't limit what security.* xattrs can be get or set (Jeff Layton). We were overly restrictive here, unnecessarily preventing things like file capability sets stored in security.capability from working. 
- allow copy_file_range() within the same inode and across different filesystems within the same cluster (Luis Henriques) David Disseldorp (1): libceph: handle OSD op ceph_pagelist_append() errors Dongsheng Yang (1): rbd: fix response length parameter for encoded strings Erqi Chen (1): ceph: reconnect connection if session hang in opening state Ilya Dryomov (6): ceph: fix indentation in __get_snap_name() libceph: drop unused con parameter of calc_target() rbd: pull rbd_img_request_create() dout out into the callers ceph: include ceph_debug.h in cache.c libceph: avoid a __vmalloc() deadlock in ceph_kvmalloc() libceph: use ceph_kvmalloc() for osdmap arrays Jeff Layton (18): ceph: allow copy_file_range when src and dst inode are same ceph: don't list vxattrs in listxattr() ceph: don't SetPageError on writepage errors ceph: remove ceph_get_cap_mds and __ceph_get_cap_mds ceph: fetch cap_gen under spinlock in ceph_add_cap ceph: eliminate session->s_trim_caps ceph: fix comments over ceph_add_cap ceph: have __mark_caps_flushing return flush_tid ceph: remove unneeded test in try_flush_caps ceph: remove CEPH_I_NOFLUSH ceph: remove incorrect comment above __send_cap ceph: update the mtime when truncating up ceph: don't freeze during write page faults ceph: add buffered/direct exclusionary locking for reads and writes ceph: turn ceph_security_invalidate_secctx into static inline ceph: only set CEPH_I_SEC_INITED if we got a MAC label ceph: allow arbitrary security.* xattrs ceph: call ceph_mdsc_destroy from destroy_fs_client John Hubbard (2): ceph: don't return a value from void function ceph: use release_pages() directly Krzysztof Wilczynski (1): ceph: move static keyword to the front of declarations Luis Henriques (2): ceph: fix directories inode i_blkbits initialization ceph: allow object copies across different filesystems in the same cluster Yan, Zheng (9): libceph: add function that reset client's entity addr libceph: add function that clears osd client's abort_err ceph: 
allow closing session in restarting/reconnect state ceph: track and report error of async metadata operation ceph: pass filp to ceph_get_caps() ceph: add helper function that forcibly reconnects to ceph cluster. ceph: return -EIO if read/write against filp that lost file locks ceph: invalidate all write mode filp after reconnect ceph: auto reconnect after blacklisted Documentation/filesystems/ceph.txt | 14 +++ drivers/block/rbd.c| 18 ++-- fs/ceph/Makefile | 2 +- fs/ceph/addr.c | 61 +++-- fs/ceph/cache.c| 2 + fs/ceph/caps.c | 173 +++-- fs/ceph/debugfs.c | 1 - fs/ceph/export.c | 60 ++--- fs/ceph/file.c | 104 +- fs/ceph/inode.c| 50 ++- fs/ceph/io.c | 163 ++ fs/ceph/io.h | 12 +++ fs/ceph/locks.c| 8 +- fs/ceph/mds_client.c | 110 +-- fs/ceph/mds_client.h | 8 +- fs/ceph/super.c| 52 +-- fs/ceph/super.h| 49 +++ fs/ceph/xattr.c| 76 ++-- include/linux/ceph/libceph.h | 1 + include/linux/ceph/messenger.h | 1 + include/linux/ceph/mon_client.h| 1 + incl
Re: [GIT PULL] afs: Development for 5.4
On Thu, Sep 19, 2019 at 3:55 PM Matthew Wilcox wrote: > > On Thu, Sep 19, 2019 at 10:49:22AM +0100, David Howells wrote: > > David Howells wrote: > > > > > > However, I was close to unpulling it again. It has a merge commit with > > > > this merge message: > > > > > > > > Merge remote-tracking branch 'net/master' into afs-next > > > > > > > > and that simply is not acceptable. > > > > > > Apologies - I meant to rebase that away. There was a bug fix to rxrpc in > > > net/master that didn't get pulled into your tree until Saturday. > > > > Actually, waiting for all outstanding fixes to get merged and then rebasing > > might not be the right thing here. The problem is that there are fixes in > > both trees: afs fixes go directly into yours whereas rxrpc fixes go via > > networking and I would prefer to base my patches on both of them for testing > > purposes. What's the preferred method for dealing with that? Base on a > > merge > > of the lastest of those fixes in each tree? > > Why is it organised this way? I mean, yes, technically, rxrpc is a > generic layer-6 protocol that any blah blah blah, but in practice no > other user has come up in the last 37 years, so why bother pretending > one is going to? Just git mv net/rxrpc fs/afs/ and merge everything > through your tree. > > I feel similarly about net/9p, net/sunrpc and net/ceph. Every filesystem > comes with its own presentation layer; nobody reuses an existing one. > Just stop pretending they're separate components. net/ceph is also being used by drivers/block/rbd.c. net/ceph was split out of fs/ceph when rbd was introduced. We continued to manage them in a single ceph-client.git tree though. Thanks, Ilya
Re: [PATCH v2] ceph: allow object copies across different filesystems in the same cluster
On Mon, Sep 9, 2019 at 12:29 PM Luis Henriques wrote: > > OSDs are able to perform object copies across different pools. Thus, > there's no need to prevent copy_file_range from doing remote copies if the > source and destination superblocks are different. Only return -EXDEV if > they have different fsid (the cluster ID). > > Signed-off-by: Luis Henriques > --- > fs/ceph/file.c | 18 ++ > 1 file changed, 14 insertions(+), 4 deletions(-) > > Hi, > > Here's the patch changelog since initial submittion: > > - Dropped have_fsid checks on client structs > - Use %pU to print the fsid instead of raw hex strings (%*ph) > - Fixed 'To:' field in email so that this time the patch hits vger > > Cheers, > -- > Luis > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c > index 685a03cc4b77..4a624a1dd0bb 100644 > --- a/fs/ceph/file.c > +++ b/fs/ceph/file.c > @@ -1904,6 +1904,7 @@ static ssize_t __ceph_copy_file_range(struct file > *src_file, loff_t src_off, > struct ceph_inode_info *src_ci = ceph_inode(src_inode); > struct ceph_inode_info *dst_ci = ceph_inode(dst_inode); > struct ceph_cap_flush *prealloc_cf; > + struct ceph_fs_client *src_fsc = ceph_inode_to_client(src_inode); > struct ceph_object_locator src_oloc, dst_oloc; > struct ceph_object_id src_oid, dst_oid; > loff_t endoff = 0, size; > @@ -1915,8 +1916,17 @@ static ssize_t __ceph_copy_file_range(struct file > *src_file, loff_t src_off, > > if (src_inode == dst_inode) > return -EINVAL; > - if (src_inode->i_sb != dst_inode->i_sb) > - return -EXDEV; > + if (src_inode->i_sb != dst_inode->i_sb) { > + struct ceph_fs_client *dst_fsc = > ceph_inode_to_client(dst_inode); > + > + if (ceph_fsid_compare(&src_fsc->client->fsid, > + &dst_fsc->client->fsid)) { > + dout("Copying object across different clusters:"); > + dout(" src fsid: %pU dst fsid: %pU\n", > +&src_fsc->client->fsid, &dst_fsc->client->fsid); Hi Luis, This should be a single dout. Thanks, Ilya
Re: [PATCH] ceph: Move static keyword to the front of declarations
On Sat, Aug 31, 2019 at 11:57 PM Krzysztof Wilczynski wrote: > > Move the static keyword to the front of declarations of > snap_handle_length, handle_length and connected_handle_length, > and resolve the following compiler warnings that can be seen > when building with warnings enabled (W=1): > > fs/ceph/export.c:38:2: warning: > ‘static’ is not at beginning of declaration [-Wold-style-declaration] > > fs/ceph/export.c:88:2: warning: > ‘static’ is not at beginning of declaration [-Wold-style-declaration] > > fs/ceph/export.c:90:2: warning: > ‘static’ is not at beginning of declaration [-Wold-style-declaration] > > Signed-off-by: Krzysztof Wilczynski > --- > Related: https://lore.kernel.org/r/20190827233017.gk9...@google.com > > fs/ceph/export.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/fs/ceph/export.c b/fs/ceph/export.c > index 020d39a85ecc..b6bfa94332c3 100644 > --- a/fs/ceph/export.c > +++ b/fs/ceph/export.c > @@ -35,7 +35,7 @@ struct ceph_nfs_snapfh { > static int ceph_encode_snapfh(struct inode *inode, u32 *rawfh, int *max_len, > struct inode *parent_inode) > { > - const static int snap_handle_length = > + static const int snap_handle_length = > sizeof(struct ceph_nfs_snapfh) >> 2; > struct ceph_nfs_snapfh *sfh = (void *)rawfh; > u64 snapid = ceph_snap(inode); > @@ -85,9 +85,9 @@ static int ceph_encode_snapfh(struct inode *inode, u32 > *rawfh, int *max_len, > static int ceph_encode_fh(struct inode *inode, u32 *rawfh, int *max_len, > struct inode *parent_inode) > { > - const static int handle_length = > + static const int handle_length = > sizeof(struct ceph_nfs_fh) >> 2; > - const static int connected_handle_length = > + static const int connected_handle_length = > sizeof(struct ceph_nfs_confh) >> 2; > int type; Applied. Thanks, Ilya
Re: bug report: libceph: follow redirect replies from osds
On Fri, Aug 30, 2019 at 4:05 PM Colin Ian King wrote: > > Hi, > > Static analysis with Coverity has picked up an issue with commit: > > commit 205ee1187a671c3b067d7f1e974903b44036f270 > Author: Ilya Dryomov > Date: Mon Jan 27 17:40:20 2014 +0200 > > libceph: follow redirect replies from osds > > Specifically in function ceph_redirect_decode in net/ceph/osd_client.c: > > 3485 > 3486len = ceph_decode_32(p); > > CID 17904: Unused value (UNUSED_VALUE) > > 3487*p += len; /* skip osd_instructions */ > 3488 > 3489/* skip the rest */ > 3490*p = struct_end; > > The double write to *p looks wrong, I suspect the *p += len; should be > just incrementing pointer p as in: p += len. Am I correct to assume > this is the correct fix? Hi Colin, No, the double write to *p is correct. It skips over len bytes and then skips to the end of the redirect reply. There is no bug here but we could drop len = ceph_decode_32(p); *p += len; /* skip osd_instructions */ and skip to the end directly to make Coverity happier. Thanks, Ilya
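To illustrate why both writes go through *p, here is a small userspace sketch of the decode-cursor idiom, assuming a little-endian length prefix. decode_32(), skip_rest() and the sample buffer are made up for the example and are not the libceph helpers:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian u32 at the cursor and advance the cursor past it. */
static uint32_t decode_32(const unsigned char **p)
{
	const unsigned char *b = *p;
	uint32_t v = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
		     ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);

	*p += 4;
	return v;
}

/* Skip a length-prefixed field, then jump to the end of the struct. */
static void skip_rest(const unsigned char **p, const unsigned char *struct_end)
{
	uint32_t len = decode_32(p);

	assert(*p + len <= struct_end);
	*p += len;		/* moves the caller's cursor past the field */
	*p = struct_end;	/* skips whatever else the struct contains */
}

/* Returns the cursor offset after skipping; the buffer is 9 bytes long. */
static size_t demo_offset_after_skip(void)
{
	/* len = 2, two payload bytes, then three trailing bytes */
	static const unsigned char buf[] = { 2, 0, 0, 0, 0xaa, 0xbb, 1, 2, 3 };
	const unsigned char *p = buf;

	skip_rest(&p, buf + sizeof(buf));
	return (size_t)(p - buf);
}
```

Here p is a pointer to the caller's cursor, so "p += len" would only move the local pointer-to-pointer and leave the cursor untouched. The first write is redundant only because the second one overrides it, which matches the suggestion to drop it and skip to the end directly.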
[GIT PULL] Ceph fixes for 5.3-rc7
Hi Linus, The following changes since commit a55aa89aab90fae7c815b0551b07be37db359d76: Linux 5.3-rc6 (2019-08-25 12:01:23 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.3-rc7 for you to fetch changes up to d435c9a7b85be1e820668d2f3718c2d9f24d5548: rbd: restore zeroing past the overlap when reading from parent (2019-08-28 12:34:11 +0200) A fix for a -rc1 regression in rbd and a trivial static checker fix. Ilya Dryomov (1): rbd: restore zeroing past the overlap when reading from parent Jia-Ju Bai (1): libceph: don't call crypto_free_sync_skcipher() on a NULL tfm drivers/block/rbd.c | 11 +++ net/ceph/crypto.c | 6 -- 2 files changed, 15 insertions(+), 2 deletions(-)
Re: [PATCH AUTOSEL 5.2 66/76] ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
On Thu, Aug 29, 2019 at 11:16 PM Sasha Levin wrote: > > On Thu, Aug 29, 2019 at 10:51:04PM +0200, Ilya Dryomov wrote: > >On Thu, Aug 29, 2019 at 8:15 PM Sasha Levin wrote: > >> > >> From: Luis Henriques > >> > >> [ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ] > >> > >> Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the > >> i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be > >> fixed by postponing the call until later, when the lock is released. > >> > >> The following backtrace was triggered by fstests generic/117. > >> > >> BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 > >> in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress > >> 3 locks held by fsstress/650: > >>#0: 870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 > >>#1: ba0c4c74 (&type->i_mutex_dir_key#6){}, at: > >> vfs_setxattr+0x55/0xa0 > >>#2: 8dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: > >> __ceph_setxattr+0x297/0x810 > >> CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 > >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > >> rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 > >> Call Trace: > >>dump_stack+0x67/0x90 > >>___might_sleep.cold+0x9f/0xb1 > >>vfree+0x4b/0x60 > >>ceph_buffer_release+0x1b/0x60 > >>__ceph_setxattr+0x2b4/0x810 > >>__vfs_setxattr+0x66/0x80 > >>__vfs_setxattr_noperm+0x59/0xf0 > >>vfs_setxattr+0x81/0xa0 > >>setxattr+0x115/0x230 > >>? filename_lookup+0xc9/0x140 > >>? rcu_read_lock_sched_held+0x74/0x80 > >>? rcu_sync_lockdep_assert+0x2e/0x60 > >> ? __sb_start_write+0x142/0x1a0 > >>? 
mnt_want_write+0x20/0x50 > >>path_setxattr+0xba/0xd0 > >>__x64_sys_lsetxattr+0x24/0x30 > >>do_syscall_64+0x50/0x1c0 > >>entry_SYSCALL_64_after_hwframe+0x49/0xbe > >> RIP: 0033:0x7ff23514359a > >> > >> Signed-off-by: Luis Henriques > >> Reviewed-by: Jeff Layton > >> Signed-off-by: Ilya Dryomov > >> Signed-off-by: Sasha Levin > >> --- > >> fs/ceph/xattr.c | 8 ++-- > >> 1 file changed, 6 insertions(+), 2 deletions(-) > >> > >> diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c > >> index 0619adbcbe14c..8382299fc2d84 100644 > >> --- a/fs/ceph/xattr.c > >> +++ b/fs/ceph/xattr.c > >> @@ -1028,6 +1028,7 @@ int __ceph_setxattr(struct inode *inode, const char > >> *name, > >> struct ceph_inode_info *ci = ceph_inode(inode); > >> struct ceph_mds_client *mdsc = > >> ceph_sb_to_client(inode->i_sb)->mdsc; > >> struct ceph_cap_flush *prealloc_cf = NULL; > >> + struct ceph_buffer *old_blob = NULL; > >> int issued; > >> int err; > >> int dirty = 0; > >> @@ -1101,13 +1102,15 @@ int __ceph_setxattr(struct inode *inode, const > >> char *name, > >> struct ceph_buffer *blob; > >> > >> spin_unlock(&ci->i_ceph_lock); > >> - dout(" preaallocating new blob size=%d\n", > >> required_blob_size); > >> + ceph_buffer_put(old_blob); /* Shouldn't be required */ > >> + dout(" pre-allocating new blob size=%d\n", > >> required_blob_size); > >> blob = ceph_buffer_new(required_blob_size, GFP_NOFS); > >> if (!blob) > >> goto do_sync_unlocked; > >> spin_lock(&ci->i_ceph_lock); > >> + /* prealloc_blob can't be released while holding > >> i_ceph_lock */ > >> if (ci->i_xattrs.prealloc_blob) > >> - ceph_buffer_put(ci->i_xattrs.prealloc_blob); > >> + old_blob = ci->i_xattrs.prealloc_blob; > >> ci->i_xattrs.prealloc_blob = blob; > >> goto retry; > >> } > >> @@ -1123,6 +1126,7 @@ int __ceph_setxattr(struct inode *inode, const char > >> *name, > >> } > >> > >> spin_unlock(&ci->i_ceph_lock); > >> + ceph_buffer_put(old_blob); > >> if (lock_snap_rwsem) > >> up_read(&mdsc->snap_rwsem); > >> if (dirty) > > > >Hi Sasha, > 
> > >I didn't tag i_ceph_lock series for stable because this is a very old > >bug which no one ever hit in real life, at least to my knowledge. > > I can drop it if you prefer. Either is fine with me. I just wanted to explain my rationale for not tagging them for stable in the first place and point out that there is a prerequisite. Thanks, Ilya
Re: [PATCH AUTOSEL 5.2 66/76] ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr()
On Thu, Aug 29, 2019 at 8:15 PM Sasha Levin wrote: > > From: Luis Henriques > > [ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ] > > Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the > i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be > fixed by postponing the call until later, when the lock is released. > > The following backtrace was triggered by fstests generic/117. > > BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 > in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress > 3 locks held by fsstress/650: >#0: 870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 >#1: ba0c4c74 (&type->i_mutex_dir_key#6){}, at: > vfs_setxattr+0x55/0xa0 >#2: 8dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: > __ceph_setxattr+0x297/0x810 > CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 > Call Trace: >dump_stack+0x67/0x90 >___might_sleep.cold+0x9f/0xb1 >vfree+0x4b/0x60 >ceph_buffer_release+0x1b/0x60 >__ceph_setxattr+0x2b4/0x810 >__vfs_setxattr+0x66/0x80 >__vfs_setxattr_noperm+0x59/0xf0 >vfs_setxattr+0x81/0xa0 >setxattr+0x115/0x230 >? filename_lookup+0xc9/0x140 >? rcu_read_lock_sched_held+0x74/0x80 >? rcu_sync_lockdep_assert+0x2e/0x60 >? __sb_start_write+0x142/0x1a0 >? 
mnt_want_write+0x20/0x50 >path_setxattr+0xba/0xd0 >__x64_sys_lsetxattr+0x24/0x30 >do_syscall_64+0x50/0x1c0 >entry_SYSCALL_64_after_hwframe+0x49/0xbe > RIP: 0033:0x7ff23514359a > > Signed-off-by: Luis Henriques > Reviewed-by: Jeff Layton > Signed-off-by: Ilya Dryomov > Signed-off-by: Sasha Levin > --- > fs/ceph/xattr.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c > index 0619adbcbe14c..8382299fc2d84 100644 > --- a/fs/ceph/xattr.c > +++ b/fs/ceph/xattr.c > @@ -1028,6 +1028,7 @@ int __ceph_setxattr(struct inode *inode, const char > *name, > struct ceph_inode_info *ci = ceph_inode(inode); > struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc; > struct ceph_cap_flush *prealloc_cf = NULL; > + struct ceph_buffer *old_blob = NULL; > int issued; > int err; > int dirty = 0; > @@ -1101,13 +1102,15 @@ int __ceph_setxattr(struct inode *inode, const char > *name, > struct ceph_buffer *blob; > > spin_unlock(&ci->i_ceph_lock); > - dout(" preaallocating new blob size=%d\n", > required_blob_size); > + ceph_buffer_put(old_blob); /* Shouldn't be required */ > + dout(" pre-allocating new blob size=%d\n", > required_blob_size); > blob = ceph_buffer_new(required_blob_size, GFP_NOFS); > if (!blob) > goto do_sync_unlocked; > spin_lock(&ci->i_ceph_lock); > + /* prealloc_blob can't be released while holding i_ceph_lock > */ > if (ci->i_xattrs.prealloc_blob) > - ceph_buffer_put(ci->i_xattrs.prealloc_blob); > + old_blob = ci->i_xattrs.prealloc_blob; > ci->i_xattrs.prealloc_blob = blob; > goto retry; > } > @@ -1123,6 +1126,7 @@ int __ceph_setxattr(struct inode *inode, const char > *name, > } > > spin_unlock(&ci->i_ceph_lock); > + ceph_buffer_put(old_blob); > if (lock_snap_rwsem) > up_read(&mdsc->snap_rwsem); > if (dirty) Hi Sasha, I didn't tag i_ceph_lock series for stable because this is a very old bug which no one ever hit in real life, at least to my knowledge. 
Please note that each of these patches requires 5c498950f730 ("libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer"). Thanks, Ilya
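The shape of the fix, and the reason for the prerequisite, can be sketched in userspace: remember the old buffer while the lock is held, release it only after unlocking, and let the put helper accept NULL. This is a toy model with a flag instead of a real spinlock. toy_lock(), buffer_put() and struct holder are hypothetical names, not the ceph code:

```c
#include <assert.h>
#include <stdlib.h>

static int lock_held;	/* toy stand-in for a spinlock */
static int bad_frees;	/* frees attempted while the lock was held */

static void toy_lock(void)   { assert(!lock_held); lock_held = 1; }
static void toy_unlock(void) { assert(lock_held);  lock_held = 0; }

/* NULL-tolerant put; records a violation if called under the lock. */
static void buffer_put(void *b)
{
	if (!b)
		return;
	if (lock_held)
		bad_frees++;
	free(b);
}

struct holder {
	void *prealloc;
};

/* Swap in a new buffer; the old one is released only after unlocking. */
static void replace_prealloc(struct holder *h, size_t size)
{
	void *old_blob;
	void *blob = malloc(size);	/* allocate outside the lock */

	if (!blob)
		return;
	toy_lock();
	old_blob = h->prealloc;		/* just remember it, do not free here */
	h->prealloc = blob;
	toy_unlock();
	buffer_put(old_blob);		/* may be NULL on the first call */
}

/* Runs two replacements and reports how many frees happened under the lock. */
static int demo_bad_frees(void)
{
	struct holder h = { NULL };

	bad_frees = 0;
	replace_prealloc(&h, 16);
	replace_prealloc(&h, 32);
	buffer_put(h.prealloc);
	return bad_frees;
}
```

The unconditional buffer_put(old_blob) after the unlock is what makes a NULL-tolerant put necessary: on paths where no old buffer was saved, the pointer is still NULL.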
[GIT PULL] Ceph fixes for 5.3-rc6
Hi Linus, The following changes since commit d1abaeb3be7b5fa6d7a1fbbd2e14e3310005c4c1: Linux 5.3-rc5 (2019-08-18 14:31:08 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.3-rc6 for you to fetch changes up to a561372405cf6bc6f14239b3a9e57bb39f2788b0: libceph: fix PG split vs OSD (re)connect race (2019-08-22 10:47:41 +0200) Three important fixes tagged for stable (an indefinite hang, a crash on an assert and a NULL pointer dereference) plus a small series from Luis fixing instances of vfree() under spinlock. Erqi Chen (1): ceph: clear page dirty before invalidate page Ilya Dryomov (1): libceph: fix PG split vs OSD (re)connect race Jeff Layton (1): ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply Luis Henriques (4): libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr() ceph: fix buffer free while holding i_ceph_lock in __ceph_build_xattrs_blob() ceph: fix buffer free while holding i_ceph_lock in fill_inode() fs/ceph/addr.c | 5 +++-- fs/ceph/caps.c | 5 - fs/ceph/inode.c | 7 --- fs/ceph/locks.c | 3 +-- fs/ceph/snap.c | 4 +++- fs/ceph/super.h | 2 +- fs/ceph/xattr.c | 19 ++- include/linux/ceph/buffer.h | 3 ++- net/ceph/osd_client.c | 9 - 9 files changed, 36 insertions(+), 21 deletions(-)
Re: [PATCH] net/ceph replace ceph_kvmalloc with kvmalloc
On Mon, Aug 12, 2019 at 11:42 AM Marc Koderer wrote: > > There is nearly no difference between both implemenations. > ceph_kvmalloc existed before kvmalloc which makes me think it's > a leftover. > > Signed-off-by: Marc Koderer > --- > net/ceph/buffer.c | 3 +-- > net/ceph/ceph_common.c | 11 --- > net/ceph/crypto.c | 2 +- > net/ceph/messenger.c | 2 +- > 4 files changed, 3 insertions(+), 15 deletions(-) > > diff --git a/net/ceph/buffer.c b/net/ceph/buffer.c > index 5622763ad402..6ca273d2246a 100644 > --- a/net/ceph/buffer.c > +++ b/net/ceph/buffer.c > @@ -7,7 +7,6 @@ > > #include > #include > -#include /* for ceph_kvmalloc */ > > struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp) > { > @@ -17,7 +16,7 @@ struct ceph_buffer *ceph_buffer_new(size_t len, gfp_t gfp) > if (!b) > return NULL; > > - b->vec.iov_base = ceph_kvmalloc(len, gfp); > + b->vec.iov_base = kvmalloc(len, gfp); > if (!b->vec.iov_base) { > kfree(b); > return NULL; > diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c > index 4eeea4d5c3ef..6c1769a815af 100644 > --- a/net/ceph/ceph_common.c > +++ b/net/ceph/ceph_common.c > @@ -185,17 +185,6 @@ int ceph_compare_options(struct ceph_options *new_opt, > } > EXPORT_SYMBOL(ceph_compare_options); > > -void *ceph_kvmalloc(size_t size, gfp_t flags) > -{ > - if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) { > - void *ptr = kmalloc(size, flags | __GFP_NOWARN); > - if (ptr) > - return ptr; > - } > - > - return __vmalloc(size, flags, PAGE_KERNEL); > -} > - > > static int parse_fsid(const char *str, struct ceph_fsid *fsid) > { > diff --git a/net/ceph/crypto.c b/net/ceph/crypto.c > index 5d6724cee38f..a9deead1e4ff 100644 > --- a/net/ceph/crypto.c > +++ b/net/ceph/crypto.c > @@ -144,7 +144,7 @@ void ceph_crypto_key_destroy(struct ceph_crypto_key *key) > static const u8 *aes_iv = (u8 *)CEPH_AES_IV; > > /* > - * Should be used for buffers allocated with ceph_kvmalloc(). > + * Should be used for buffers allocated with kvmalloc(). 
> * Currently these are encrypt out-buffer (ceph_buffer) and decrypt > * in-buffer (msg front). > * > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index 962f521c863e..f1f2fcc6f780 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -3334,7 +3334,7 @@ struct ceph_msg *ceph_msg_new2(int type, int front_len, > int max_data_items, > > /* front */ > if (front_len) { > - m->front.iov_base = ceph_kvmalloc(front_len, flags); > + m->front.iov_base = kvmalloc(front_len, flags); > if (m->front.iov_base == NULL) { > dout("ceph_msg_new can't allocate %d bytes\n", > front_len); Hi Marc, I'm working on a patch for https://tracker.ceph.com/issues/40481 which changes ceph_kvmalloc() to properly deal with non-GFP_KERNEL contexts. We can't switch to kvmalloc() because it doesn't actually fall back to vmalloc() for GFP_NOFS or GFP_NOIO. Thanks, Ilya
Re: [PATCH] net: ceph: Fix a possible null-pointer dereference in ceph_crypto_key_destroy()
On Wed, Jul 24, 2019 at 11:43 AM Jia-Ju Bai wrote: > > In set_secret(), key->tfm is assigned to NULL on line 55, and then > ceph_crypto_key_destroy(key) is executed. > > ceph_crypto_key_destroy(key) > crypto_free_sync_skcipher(key->tfm) > crypto_skcipher_tfm(tfm) > return &tfm->base; > > Thus, a possible null-pointer dereference may occur. > > To fix this bug, key->tfm is checked before calling > crypto_free_sync_skcipher(). > > This bug is found by a static analysis tool STCheck written by us. > > Signed-off-by: Jia-Ju Bai > --- > net/ceph/crypto.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/ceph/crypto.c b/net/ceph/crypto.c > index 5d6724cee38f..ac28463bcfd8 100644 > --- a/net/ceph/crypto.c > +++ b/net/ceph/crypto.c > @@ -136,7 +136,8 @@ void ceph_crypto_key_destroy(struct ceph_crypto_key *key) > if (key) { > kfree(key->key); > key->key = NULL; > - crypto_free_sync_skcipher(key->tfm); > + if (key->tfm) > + crypto_free_sync_skcipher(key->tfm); > key->tfm = NULL; > } > } Hi Jia-Ju, Yeah, looks like the only reason this continued to work after 69d6302b65a8 ("libceph: Remove VLA usage of skcipher") is because crypto_sync_skcipher is a trivial wrapper around crypto_skcipher added just for type checking AFAICT. struct crypto_sync_skcipher { struct crypto_skcipher base; }; Before that ceph_crypto_key_destroy() used crypto_free_skcipher(), which is safe to call on a NULL tfm. Applied with a slight modification -- I moved key->tfm = NULL under the new if and amended the changelog. https://github.com/ceph/ceph-client/commit/b3d79916ff99074d289d66f1643b423ae0008c50 Thanks, Ilya
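For illustration only, here is a minimal userspace sketch of the point made in the reply above: a wrapper struct whose only member sits at offset 0, a NULL-safe free of the inner object, and a destroy routine that keeps the NULL assignment under the new check. The names (struct cipher, struct sync_cipher, cipher_free, key_destroy, nr_freed) are hypothetical stand-ins, not the kernel crypto API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for crypto_skcipher / crypto_sync_skcipher. */
struct cipher {
	int dummy;
};

struct sync_cipher {
	struct cipher base;	/* first and only member, so it sits at offset 0 */
};

static int nr_freed;	/* counts real frees, for demonstration only */

/* NULL-safe free of the inner object. */
static void cipher_free(struct cipher *c)
{
	if (!c)
		return;
	nr_freed++;
	free(c);	/* same address as the enclosing sync_cipher */
}

struct key {
	struct sync_cipher *tfm;
};

/* Destroy path with the check and the NULL assignment under one if. */
static void key_destroy(struct key *key)
{
	if (!key)
		return;
	if (key->tfm) {
		cipher_free(&key->tfm->base);
		key->tfm = NULL;
	}
}
```

With the explicit check, the destroy path no longer relies on base being the first member of the wrapper, and calling it twice on the same key is harmless.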
Re: fs/ceph/export.c:459:3-12: code aligned with following code on line 461 (fwd)
On Sat, Jun 29, 2019 at 4:09 PM Julia Lawall wrote: > > There is no bug here, but some code starting on line 461 seems to be > incorrectly indented. > > julia > > -- Forwarded message -- > Date: Sat, 29 Jun 2019 19:51:04 +0800 > From: kbuild test robot > To: kbu...@01.org > Cc: Julia Lawall > Subject: fs/ceph/export.c:459:3-12: code aligned with following code on line > 461 > > CC: kbuild-...@01.org > CC: linux-kernel@vger.kernel.org > TO: "Yan, Zheng" > CC: Ilya Dryomov > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > master > head: 249155c20f9b0754bc1b932a33344cfb4e0c2101 > commit: 570df4e9c23f861aa3f8f2954468c534a033bf1a ceph: snapshot nfs re-export > date: 8 weeks ago > :: branch date: 5 days ago > :: commit date: 8 weeks ago > > If you fix the issue, kindly add following tag > Reported-by: kbuild test robot > Reported-by: Julia Lawall > > >> fs/ceph/export.c:459:3-12: code aligned with following code on line 461 > > # > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=570df4e9c23f861aa3f8f2954468c534a033bf1a > git remote add linus > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > git remote update linus > git checkout 570df4e9c23f861aa3f8f2954468c534a033bf1a > vim +459 fs/ceph/export.c > > a8e63b7d Sage Weil 2009-10-06 401 > 570df4e9 Yan, Zheng 2017-11-15 402 static int __get_snap_name(struct dentry > *parent, char *name, > 570df4e9 Yan, Zheng 2017-11-15 403struct dentry > *child) > 570df4e9 Yan, Zheng 2017-11-15 404 { > 570df4e9 Yan, Zheng 2017-11-15 405 struct inode *inode = d_inode(child); > 570df4e9 Yan, Zheng 2017-11-15 406 struct inode *dir = d_inode(parent); > 570df4e9 Yan, Zheng 2017-11-15 407 struct ceph_fs_client *fsc = > ceph_inode_to_client(inode); > 570df4e9 Yan, Zheng 2017-11-15 408 struct ceph_mds_request *req = NULL; > 570df4e9 Yan, Zheng 2017-11-15 409 char *last_name = NULL; > 570df4e9 Yan, Zheng 2017-11-15 410 unsigned next_offset = 2; > 570df4e9 Yan, Zheng 
2017-11-15 411 int err = -EINVAL; > 570df4e9 Yan, Zheng 2017-11-15 412 > 570df4e9 Yan, Zheng 2017-11-15 413 if (ceph_ino(inode) != ceph_ino(dir)) > 570df4e9 Yan, Zheng 2017-11-15 414 goto out; > 570df4e9 Yan, Zheng 2017-11-15 415 if (ceph_snap(inode) == CEPH_SNAPDIR) > { > 570df4e9 Yan, Zheng 2017-11-15 416 if (ceph_snap(dir) == > CEPH_NOSNAP) { > 570df4e9 Yan, Zheng 2017-11-15 417 strcpy(name, > fsc->mount_options->snapdir_name); > 570df4e9 Yan, Zheng 2017-11-15 418 err = 0; > 570df4e9 Yan, Zheng 2017-11-15 419 } > 570df4e9 Yan, Zheng 2017-11-15 420 goto out; > 570df4e9 Yan, Zheng 2017-11-15 421 } > 570df4e9 Yan, Zheng 2017-11-15 422 if (ceph_snap(dir) != CEPH_SNAPDIR) > 570df4e9 Yan, Zheng 2017-11-15 423 goto out; > 570df4e9 Yan, Zheng 2017-11-15 424 > 570df4e9 Yan, Zheng 2017-11-15 425 while (1) { > 570df4e9 Yan, Zheng 2017-11-15 426 struct > ceph_mds_reply_info_parsed *rinfo; > 570df4e9 Yan, Zheng 2017-11-15 427 struct > ceph_mds_reply_dir_entry *rde; > 570df4e9 Yan, Zheng 2017-11-15 428 int i; > 570df4e9 Yan, Zheng 2017-11-15 429 > 570df4e9 Yan, Zheng 2017-11-15 430 req = > ceph_mdsc_create_request(fsc->mdsc, CEPH_MDS_OP_LSSNAP, > 570df4e9 Yan, Zheng 2017-11-15 431 > USE_AUTH_MDS); > 570df4e9 Yan, Zheng 2017-11-15 432 if (IS_ERR(req)) { > 570df4e9 Yan, Zheng 2017-11-15 433 err = PTR_ERR(req); > 570df4e9 Yan, Zheng 2017-11-15 434 req = NULL; > 570df4e9 Yan, Zheng 2017-11-15 435 goto out; > 570df4e9 Yan, Zheng 2017-11-15 436 } > 570df4e9 Yan, Zheng 2017-11-15 437 err = > ceph_alloc_readdir_reply_buffer(req, inode); > 570df4e9 Yan, Zheng 2017-11-15 438 if (err) > 570df4e9 Yan, Zheng 2017-11-15 439 goto out; > 570df4e9 Yan, Zheng 2017-11-15 440 > 570df4e9 Yan, Zheng 2017-11-15 441 req->r_direct_mode = > USE_AUTH_MDS; > 570df4e9 Yan, Zheng 2017-11-15 442 req->r_readdir_offset = > next_offset; > 570df4e9 Yan, Zheng 2017-11-15 443 req->r_args.readdir.flags = > 570df4e9 Yan, Zheng 2017-11-15 444 > cpu_to_le16(CEPH_READDIR_REPLY_BITFLAGS); > 570df4e9 Yan, Zheng 
2017-11-15 445 if (last_name) { > 570df4e9 Yan, Zheng 2017-11-15 446 req->r_p
Re: [PATCH 0/4] Sleeping functions in invalid context bug fixes
On Fri, Jul 19, 2019 at 5:20 PM Jeff Layton wrote: > > On Fri, 2019-07-19 at 15:32 +0100, Luis Henriques wrote: > > Hi, > > > > I'm sending three "sleeping function called from invalid context" bug > > fixes that I had on my TODO for a while. All of them are ceph_buffer_put > > related, and all the fixes follow the same pattern: delay the operation > > until the ci->i_ceph_lock is released. > > > > The first patch simply allows ceph_buffer_put to receive a NULL buffer so > > that the NULL check doesn't need to be performed in all the other patches. > > IOW, it's not really required, just convenient. > > > > (Note: maybe these patches should all be tagged for stable.) > > > > Luis Henriques (4): > > libceph: allow ceph_buffer_put() to receive a NULL ceph_buffer > > ceph: fix buffer free while holding i_ceph_lock in __ceph_setxattr() > > ceph: fix buffer free while holding i_ceph_lock in > > __ceph_build_xattrs_blob() > > ceph: fix buffer free while holding i_ceph_lock in fill_inode() > > > > fs/ceph/caps.c | 5 - > > fs/ceph/inode.c | 7 --- > > fs/ceph/snap.c | 4 +++- > > fs/ceph/super.h | 2 +- > > fs/ceph/xattr.c | 19 ++- > > include/linux/ceph/buffer.h | 3 ++- > > 6 files changed, 28 insertions(+), 12 deletions(-) > > This all looks good to me. I'll plan to merge these into the testing > branch soon, and tag them for stable. > > PS: On a related note (and more of a question for Ilya)... > > I'm wondering if we get any benefit from having our own ceph_kvmalloc > routine. Why are we not better off using the stock kvmalloc routine > instead? Forcing a vmalloc just because we've gone above 32k allocation > doesn't seem like the right thing to do. I don't remember off the top of my head and can't check right now. Could be that kvmalloc() didn't exist back then. I'll add that to my TODO list. Thanks, Ilya
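As a rough userspace sketch of the pattern described in the quoted cover letter (swap the pointer while the lock is held, drop the old buffer only after the lock is released, and let the put routine accept NULL), something along these lines could apply. The lock is modelled by a plain flag and all names (struct buffer, struct inode_info, buffer_put, replace_blob, puts_under_lock) are assumptions made for the example, not the ceph code.

```c
#include <stdbool.h>
#include <stdlib.h>

struct buffer {
	int refs;
};

struct inode_info {
	struct buffer *blob;
};

static bool lock_held;		/* stand-in for ci->i_ceph_lock being held */
static int puts_under_lock;	/* how many puts ran with the lock held */

/* Accepts NULL so that callers do not need their own check. */
static void buffer_put(struct buffer *b)
{
	if (!b)
		return;
	if (lock_held)
		puts_under_lock++;	/* would be the "invalid context" case */
	if (--b->refs == 0)
		free(b);
}

static void replace_blob(struct inode_info *ci, struct buffer *new_blob)
{
	struct buffer *old;

	lock_held = true;	/* lock */
	old = ci->blob;		/* only remember the old buffer here */
	ci->blob = new_blob;
	lock_held = false;	/* unlock */

	buffer_put(old);	/* the final put happens outside the lock */
}
```

The first call to replace_blob() on an inode with no blob passes NULL to buffer_put(), which is the convenience the first patch in the series is about.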
[GIT PULL] Ceph updates for 5.3-rc1
Hi Linus,

The following changes since commit 0ecfebd2b52404ae0c54a878c872bb93363ada36:

  Linux 5.2 (2019-07-07 15:41:56 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.3-rc1

for you to fetch changes up to d31d07b97a5e76f41e00eb81dcca740e84aa7782:

  ceph: fix end offset in truncate_inode_pages_range call (2019-07-08 14:01:45 +0200)

There is a trivial conflict caused by commit 9ffbe8ac05db ("locking/lockdep: Rename lockdep_assert_held_exclusive() -> lockdep_assert_held_write()"). I included the resolution in for-linus-merged.

Lots of exciting things this time!

- support for rbd object-map and fast-diff features (myself). This will speed up reads, discards and things like snap diffs on sparse images.

- ceph.snap.btime vxattr to expose snapshot creation time (David Disseldorp). This will be used to integrate with "Restore Previous Versions" feature added in Windows 7 for folks who reexport ceph through SMB.

- security xattrs for ceph (Zheng Yan). Only selinux is supported for now due to the limitations of ->dentry_init_security().

- support for MSG_ADDR2, FS_BTIME and FS_CHANGE_ATTR features (Jeff Layton). This is actually a single feature bit which was missing because of the filesystem pieces. With this in, the kernel client will finally be reported as "luminous" by "ceph features" -- it is still being reported as "jewel" even though all required Luminous features were implemented in 4.13.

- stop NULL-terminating ceph vxattrs (Jeff Layton). The convention with xattrs is to not terminate and this was causing inconsistencies with ceph-fuse.

- change filesystem time granularity from 1 us to 1 ns, again fixing an inconsistency with ceph-fuse (Luis Henriques).

On top of this there are some additional dentry name handling and cap flushing fixes from Zheng. Finally, Jeff is formally taking over for Zheng as the filesystem maintainer.

Andrea Parri (1):
      ceph: fix improper use of smp_mb__before_atomic()

Christoph Hellwig (1):
      libceph: remove ceph_get_direct_page_vector()

Dan Carpenter (1):
      ceph: silence a checker warning in mdsc_show()

David Disseldorp (6):
      ceph: clean up ceph.dir.pin vxattr name sizeof()
      ceph: carry snapshot creation time with inodes
      ceph: add ceph.snap.btime vxattr
      ceph: fix listxattr vxattr buffer length calculation
      ceph: remove unused vxattr length helpers
      ceph: fix "ceph.dir.rctime" vxattr value

Hariprasad Kelam (1):
      ceph: fix warning PTR_ERR_OR_ZERO can be used

Ilya Dryomov (21):
      rbd: get rid of obj_req->xferred, obj_req->result and img_req->xferred
      rbd: replace obj_req->tried_parent with obj_req->read_state
      rbd: get rid of RBD_OBJ_WRITE_{FLAT,GUARD}
      rbd: move OSD request submission into object request state machines
      rbd: introduce image request state machine
      libceph: rename r_unsafe_item to r_private_item
      rbd: introduce obj_req->osd_reqs list
      rbd: factor out rbd_osd_setup_copyup()
      rbd: factor out __rbd_osd_setup_discard_ops()
      rbd: move OSD request allocation into object request state machines
      rbd: rename rbd_obj_setup_*() to rbd_obj_init_*()
      rbd: introduce copyup state machine
      rbd: lock should be quiesced on reacquire
      rbd: quiescing lock should wait for image requests
      rbd: new exclusive lock wait/wake code
      libceph: bump CEPH_MSG_MAX_DATA_LEN (again)
      libceph: change ceph_osdc_call() to take page vector for response
      libceph: export osd_req_op_data() macro
      rbd: call rbd_dev_mapping_set() from rbd_dev_image_probe()
      rbd: support for object-map and fast-diff
      rbd: setallochint only if object doesn't exist

Jeff Layton (22):
      libceph: fix sa_family just after reading address
      libceph: add ceph_decode_entity_addr
      libceph: ADDR2 support for monmap
      libceph: switch osdmap decoding to use ceph_decode_entity_addr
      libceph: fix watch_item_t decoding to use ceph_decode_entity_addr
      libceph: correctly decode ADDR2 addresses in incremental OSD maps
      ceph: have MDS map decoding use entity_addr_t decoder
      ceph: fix decode_locker to use ceph_decode_entity_addr
      libceph: use TYPE_LEGACY for entity addrs instead of TYPE_NONE
      libceph: rename ceph_encode_addr to ceph_encode_banner_addr
      ceph: add btime field to ceph_inode_info
      ceph: handle btime in cap messages
      libceph: turn on CEPH_FEATURE_MSG_ADDR2
      ceph: allow querying of STATX_BTIME in ceph_getattr
      iversion: add a routine to update a raw value with a larger one
      ceph: add change_attr field to ceph_inode_info
      ceph: handle change_attr in cap messages
      ceph
Re: linux-next: build failure after merge of the tip tree
On Wed, Jul 10, 2019 at 2:01 AM Stephen Rothwell wrote: > > Hi all, > > On Tue, 9 Jul 2019 16:54:59 +1000 Stephen Rothwell > wrote: > > > > After merging the tip tree, today's linux-next build (x86_64 allmodconfig) > > failed like this: > > > > drivers/block/rbd.c: In function 'wake_lock_waiters': > > drivers/block/rbd.c:3933:2: error: implicit declaration of function > > 'lockdep_assert_held_exclusive'; did you mean 'lockdep_assert_held_write'? > > [-Werror=implicit-function-declaration] > > lockdep_assert_held_exclusive(&rbd_dev->lock_rwsem); > > ^ > > lockdep_assert_held_write > > > > Caused by commit > > > > 9ffbe8ac05db ("locking/lockdep: Rename lockdep_assert_held_exclusive() -> > > lockdep_assert_held_write()") > > > > interacting with commits > > > > 637cd060537d ("rbd: new exclusive lock wait/wake code") > > a2b1da09793d ("rbd: lock should be quiesced on reacquire") > > > > from the ceph tree. > > > > I have added the following merge fix patch for today. > > > > From: Stephen Rothwell > > Date: Tue, 9 Jul 2019 16:46:12 +1000 > > Subject: [PATCH] rbd: fix up for lockdep_assert_held_exclusive rename > > > > Signed-off-by: Stephen Rothwell > > --- > > drivers/block/rbd.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c > > index 723c3ef4bd59..02216fbdb854 100644 > > --- a/drivers/block/rbd.c > > +++ b/drivers/block/rbd.c > > @@ -3930,7 +3930,7 @@ static void wake_lock_waiters(struct rbd_device > > *rbd_dev, int result) > > struct rbd_img_request *img_req; > > > > dout("%s rbd_dev %p result %d\n", __func__, rbd_dev, result); > > - lockdep_assert_held_exclusive(&rbd_dev->lock_rwsem); > > + lockdep_assert_held_write(&rbd_dev->lock_rwsem); > > > > cancel_delayed_work(&rbd_dev->lock_dwork); > > if (!completion_done(&rbd_dev->acquire_wait)) { > > @@ -4209,7 +4209,7 @@ static bool rbd_quiesce_lock(struct rbd_device > > *rbd_dev) > > bool need_wait; > > > > dout("%s rbd_dev %p\n", __func__, 
rbd_dev); > > - lockdep_assert_held_exclusive(&rbd_dev->lock_rwsem); > > + lockdep_assert_held_write(&rbd_dev->lock_rwsem); > > > > if (rbd_dev->lock_state != RBD_LOCK_STATE_LOCKED) > > return false; > > This fix now needs to be applied to the merge of the ceph tree. Hi Stephen, Yes, that is what I figured would happen. I assume you would keep carrying this fixup until the ceph tree is merged. Thanks, Ilya
[GIT PULL] Ceph fix for 5.2-rc7
Hi Linus,

The following changes since commit 4b972a01a7da614b4796475f933094751a295a2f:

  Linux 5.2-rc6 (2019-06-22 16:01:36 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.2-rc7

for you to fetch changes up to d6b8bd679c9c8856fa04b80490765c43a4cb613b:

  ceph: fix ceph_mdsc_build_path to not stop on first component (2019-06-27 18:27:36 +0200)

A small fix for a potential -rc1 regression from Jeff.

Jeff Layton (1):
      ceph: fix ceph_mdsc_build_path to not stop on first component

 fs/ceph/mds_client.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Re: [PATCH v4 3/3] ceph: don't NULL terminate virtual xattrs
On Mon, Jun 24, 2019 at 6:27 PM Jeff Layton wrote: > > The convention with xattrs is to not store the termination with string > data, given that it returns the length. This is how setfattr/getfattr > operate. > > Most of ceph's virtual xattr routines use snprintf to plop the string > directly into the destination buffer, but snprintf always NULL > terminates the string. This means that if we send the kernel a buffer > that is the exact length needed to hold the string, it'll end up > truncated. > > Add a ceph_fmt_xattr helper function to format the string into an > on-stack buffer that is should always be large enough to hold the whole > thing and then memcpy the result into the destination buffer. If it does > turn out that the formatted string won't fit in the on-stack buffer, > then return -E2BIG and do a WARN_ONCE(). > > Change over most of the virtual xattr routines to use the new helper. A > couple of the xattrs are sourced from strings however, and it's > difficult to know how long they'll be. Just have those memcpy the result > in place after verifying the length. > > Signed-off-by: Jeff Layton > --- > fs/ceph/xattr.c | 84 ++--- > 1 file changed, 59 insertions(+), 25 deletions(-) > > diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c > index 9b77dca0b786..37b458a9af3a 100644 > --- a/fs/ceph/xattr.c > +++ b/fs/ceph/xattr.c > @@ -109,22 +109,49 @@ static ssize_t ceph_vxattrcb_layout(struct > ceph_inode_info *ci, char *val, > return ret; > } > > +/* > + * The convention with strings in xattrs is that they should not be NULL > + * terminated, since we're returning the length with them. snprintf always > + * NULL terminates however, so call it on a temporary buffer and then memcpy > + * the result into place. > + */ > +static int ceph_fmt_xattr(char *val, size_t size, const char *fmt, ...) > +{ > + int ret; > + va_list args; > + char buf[96]; /* NB: reevaluate size if new vxattrs are added */ > + > + va_start(args, fmt); > + ret = vsnprintf(buf, size ? 
sizeof(buf) : 0, fmt, args); > + va_end(args); > + > + /* Sanity check */ > + if (size && ret + 1 > sizeof(buf)) { > + WARN_ONCE(true, "Returned length too big (%d)", ret); > + return -E2BIG; > + } > + > + if (ret <= size) > + memcpy(val, buf, ret); > + return ret; > +} Nit: perhaps check size at the top and bail early instead of checking it at every step? Thanks, Ilya
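One possible reading of the nit above, as a hedged userspace sketch rather than the patch that was eventually merged: handle the size == 0 length query once at the top, so the remaining steps no longer need to test size. The function name fmt_xattr is made up for the example, the 96-byte buffer is taken from the quoted patch, and the WARN_ONCE() is left out because it has no userspace equivalent.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Format into an on-stack buffer and copy the result to val without the
 * trailing NUL. Returns the formatted length, or -E2BIG if the value does
 * not fit in the on-stack buffer.
 */
static int fmt_xattr(char *val, size_t size, const char *fmt, ...)
{
	char buf[96];	/* same on-stack size as in the quoted patch */
	va_list args;
	int ret;

	va_start(args, fmt);
	if (!size) {
		/* Length query: nothing is written, only counted. */
		ret = vsnprintf(NULL, 0, fmt, args);
		va_end(args);
		return ret;
	}
	ret = vsnprintf(buf, sizeof(buf), fmt, args);
	va_end(args);

	if (ret < 0)
		return ret;
	if ((size_t)ret >= sizeof(buf))
		return -E2BIG;	/* formatted value did not fit in buf */
	if ((size_t)ret <= size)
		memcpy(val, buf, ret);	/* exact-length buffers are fine */
	return ret;
}
```

As in the quoted helper, a destination that is too small is left untouched and the required length is returned, so the caller can decide how to report it.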
Re: [PATCH v3 1/2] ceph: fix buffer length handling in virtual xattrs
On Mon, Jun 24, 2019 at 12:26 PM Jeff Layton wrote: > > On Mon, 2019-06-24 at 12:00 +0200, Ilya Dryomov wrote: > > On Fri, Jun 21, 2019 at 4:18 PM Jeff Layton wrote: > > > The convention with xattrs is to not store the termination with string > > > data, given that it returns the length. This is how setfattr/getfattr > > > operate. > > > > > > Most of ceph's virtual xattr routines use snprintf to plop the string > > > directly into the destination buffer, but snprintf always NULL > > > terminates the string. This means that if we send the kernel a buffer > > > that is the exact length needed to hold the string, it'll end up > > > truncated. > > > > > > Add new routines to format the string into an on-stack buffer that is > > > always large enough to hold the whole thing and then memcpy the result > > > into the destination buffer. Then, change over the virtual xattr > > > routines to use the new helper functions as appropriate. > > > > > > Finally, make the code return ERANGE if the destination buffer size was > > > too small to hold the returned value. > > > > > > Signed-off-by: Jeff Layton > > > --- > > > fs/ceph/xattr.c | 103 > > > 1 file changed, 78 insertions(+), 25 deletions(-) > > > > > > diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c > > > index 6621d27e64f5..359d3cbbb37b 100644 > > > --- a/fs/ceph/xattr.c > > > +++ b/fs/ceph/xattr.c > > > @@ -112,22 +112,47 @@ static size_t ceph_vxattrcb_layout(struct > > > ceph_inode_info *ci, char *val, > > > return ret; > > > } > > > > > > +/* Enough to hold any possible expression of integer TYPE in base 10 */ > > > +#define INT_STR_SIZE(_type)3*sizeof(_type)+2 > > > + > > > +/* > > > + * snprintf always NULL terminates, but we need for xattrs to not be. For > > > + * the integer vxattrs, just create an on-stack buffer for snprintf's > > > + * destination, and just don't copy the termination to the actual buffer. 
> > > + */ > > > +#define GENERATE_XATTR_INT_FORMATTER(_lbl, _format, _type) > > >\ > > > +static size_t format_ ## _lbl ## _xattr(char *val, size_t size, _type > > > src) \ > > > +{ > > >\ > > > + size_t ret; > > >\ > > > + char buf[INT_STR_SIZE(_type)]; > > >\ > > > + > > >\ > > > + ret = snprintf(buf, size ? sizeof(buf) : 0, _format, src); > > >\ > > > + if (ret <= size) > > >\ > > > + memcpy(val, buf, ret); > > >\ > > > + return ret; > > >\ > > > +} > > > + > > > +GENERATE_XATTR_INT_FORMATTER(u, "%u", unsigned int) > > > +GENERATE_XATTR_INT_FORMATTER(d, "%d", int) > > > +GENERATE_XATTR_INT_FORMATTER(lld, "%lld", long long) > > > +GENERATE_XATTR_INT_FORMATTER(llu, "%llu", unsigned long long) > > > + > > > static size_t ceph_vxattrcb_layout_stripe_unit(struct ceph_inode_info > > > *ci, > > >char *val, size_t size) > > > { > > > - return snprintf(val, size, "%u", ci->i_layout.stripe_unit); > > > + return format_u_xattr(val, size, ci->i_layout.stripe_unit); > > > } > > > > > > static size_t ceph_vxattrcb_layout_stripe_count(struct ceph_inode_info > > > *ci, > > > char *val, size_t size) > > > { > > > - return snprintf(val, size, "%u", ci->i_layout.stripe_count); > > > + return format_u_xattr(val, size, ci->i_layout.stripe_count); > > > } > > > > > > static size_t ceph_vxattrcb_layout_object_size(struct ceph_inode_info > > > *ci, > > >char *val, size_t size) > > > { > > > - return snprintf(val, size, "%u", ci->i_layout.object_
Re: [PATCH v3 1/2] ceph: fix buffer length handling in virtual xattrs
On Fri, Jun 21, 2019 at 4:18 PM Jeff Layton wrote: > > The convention with xattrs is to not store the termination with string > data, given that it returns the length. This is how setfattr/getfattr > operate. > > Most of ceph's virtual xattr routines use snprintf to plop the string > directly into the destination buffer, but snprintf always NULL > terminates the string. This means that if we send the kernel a buffer > that is the exact length needed to hold the string, it'll end up > truncated. > > Add new routines to format the string into an on-stack buffer that is > always large enough to hold the whole thing and then memcpy the result > into the destination buffer. Then, change over the virtual xattr > routines to use the new helper functions as appropriate. > > Finally, make the code return ERANGE if the destination buffer size was > too small to hold the returned value. > > Signed-off-by: Jeff Layton > --- > fs/ceph/xattr.c | 103 > 1 file changed, 78 insertions(+), 25 deletions(-) > > diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c > index 6621d27e64f5..359d3cbbb37b 100644 > --- a/fs/ceph/xattr.c > +++ b/fs/ceph/xattr.c > @@ -112,22 +112,47 @@ static size_t ceph_vxattrcb_layout(struct > ceph_inode_info *ci, char *val, > return ret; > } > > +/* Enough to hold any possible expression of integer TYPE in base 10 */ > +#define INT_STR_SIZE(_type)3*sizeof(_type)+2 > + > +/* > + * snprintf always NULL terminates, but we need for xattrs to not be. For > + * the integer vxattrs, just create an on-stack buffer for snprintf's > + * destination, and just don't copy the termination to the actual buffer. > + */ > +#define GENERATE_XATTR_INT_FORMATTER(_lbl, _format, _type) \ > +static size_t format_ ## _lbl ## _xattr(char *val, size_t size, _type src) > \ > +{ \ > + size_t ret; \ > + char buf[INT_STR_SIZE(_type)]; \ > +\ > + ret = snprintf(buf, size ? 
sizeof(buf) : 0, _format, src); \ > + if (ret <= size) \ > + memcpy(val, buf, ret); \ > + return ret; \ > +} > + > +GENERATE_XATTR_INT_FORMATTER(u, "%u", unsigned int) > +GENERATE_XATTR_INT_FORMATTER(d, "%d", int) > +GENERATE_XATTR_INT_FORMATTER(lld, "%lld", long long) > +GENERATE_XATTR_INT_FORMATTER(llu, "%llu", unsigned long long) > + > static size_t ceph_vxattrcb_layout_stripe_unit(struct ceph_inode_info *ci, >char *val, size_t size) > { > - return snprintf(val, size, "%u", ci->i_layout.stripe_unit); > + return format_u_xattr(val, size, ci->i_layout.stripe_unit); > } > > static size_t ceph_vxattrcb_layout_stripe_count(struct ceph_inode_info *ci, > char *val, size_t size) > { > - return snprintf(val, size, "%u", ci->i_layout.stripe_count); > + return format_u_xattr(val, size, ci->i_layout.stripe_count); > } > > static size_t ceph_vxattrcb_layout_object_size(struct ceph_inode_info *ci, >char *val, size_t size) > { > - return snprintf(val, size, "%u", ci->i_layout.object_size); > + return format_u_xattr(val, size, ci->i_layout.object_size); > } > > static size_t ceph_vxattrcb_layout_pool(struct ceph_inode_info *ci, > @@ -141,10 +166,14 @@ static size_t ceph_vxattrcb_layout_pool(struct > ceph_inode_info *ci, > > down_read(&osdc->lock); > pool_name = ceph_pg_pool_name_by_id(osdc->osdmap, pool); > - if (pool_name) > - ret = snprintf(val, size, "%s", pool_name); > - else > - ret = snprintf(val, size, "%lld", (unsigned long long)pool); > + if (pool_name) { > + ret = strlen(pool_name); > + > + if (ret <= size) > + memcpy(val, pool_name, ret); > + } else { > + ret = format_lld_xattr(val, size, pool); > + } > up_read(&osdc->lock); > return ret; > } > @@ -155,7 +184,11 @@ static size_t ceph_vxattrcb_layout_pool_namespace(struct > ceph_inode_info *ci, > int ret = 0; > struct ceph_string *ns = ceph_try_get_string(ci->i_layout.pool_ns); > if (ns) { > - ret = snprintf(val, size, "%.*s", (int)ns->len, ns->str); > + ret = ns->len; > + > + if (ret <= size) > + memcpy(val, ns->str, 
ns->len); > + > ceph_put_string(ns); > } > return ret; > @@ -166,50 +199,61 @@ static size_t > ceph_vxattrcb_layout_pool_nam
[GIT PULL] Ceph fixes for 5.2-rc4
Hi Linus,

The following changes since commit f2c7c76c5d0a443053e94adb9f0918fa2fb85c3a:

  Linux 5.2-rc3 (2019-06-02 13:55:33 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.2-rc4

for you to fetch changes up to 7b2f936fc8282ab56d4d21247f2f9c21607c085c:

  ceph: fix error handling in ceph_get_caps() (2019-06-05 20:34:39 +0200)

A change to call iput() asynchronously to avoid a possible deadlock when iput_final() needs to wait for in-flight I/O (e.g. readahead) and a fixup for a cleanup that went into -rc1.

Yan, Zheng (3):
      ceph: single workqueue for inode related works
      ceph: avoid iput_final() while holding mutex or in dispatch thread
      ceph: fix error handling in ceph_get_caps()

 fs/ceph/caps.c       |  34 ++-
 fs/ceph/file.c       |   2 +-
 fs/ceph/inode.c      | 155 +++
 fs/ceph/mds_client.c |  28 ++
 fs/ceph/quota.c      |   9 ++-
 fs/ceph/snap.c       |  16 --
 fs/ceph/super.c      |  28 +++---
 fs/ceph/super.h      |  19 ---
 8 files changed, 156 insertions(+), 135 deletions(-)

[GIT PULL] Ceph updates for 5.2-rc1
Hi Linus,

The following changes since commit e93c9c99a629c61837d5a7fc2120cd2b6c70dbdd:

  Linux 5.1 (2019-05-05 17:42:58 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.2-rc1

for you to fetch changes up to 00abf69dd24fd185982379c5cc3bb7b6d1fc:

  ceph: flush dirty inodes before proceeding with remount (2019-05-07 19:43:05 +0200)

On the filesystem side we have:

- a fix to enforce quotas set above the mount point (Luis Henriques)

- support for exporting snapshots through NFS (Zheng Yan)

- proper statx implementation (Jeff Layton). statx flags are mapped to MDS caps, with AT_STATX_{DONT,FORCE}_SYNC taken into account.

- some follow-up dentry name handling fixes, in particular elimination of our hand-rolled helper and the switch to __getname() as suggested by Al (Jeff Layton)

- a set of MDS client cleanups in preparation for async MDS requests in the future (Jeff Layton)

- a fix to sync the filesystem before remounting (Jeff Layton)

On the rbd side, work is on-going on object-map and fast-diff image features.

Arnd Bergmann (3):
      rbd: avoid clang -Wuninitialized warning
      rbd: convert all rbd_assert(0) to BUG()
      libceph: fix clang warning for CEPH_DEFINE_OID_ONSTACK

Ilya Dryomov (2):
      rbd: client_mutex is never nested
      rbd: don't assert on writes to snapshots

Jeff Layton (20):
      ceph: remove superfluous inode_lock in ceph_fsync
      ceph: properly handle granular statx requests
      ceph: fix NULL pointer deref when debugging is enabled
      ceph: make iterate_session_caps a public symbol
      ceph: dump granular cap info in "caps" debugfs file
      ceph: fix potential use-after-free in ceph_mdsc_build_path
      ceph: use ceph_mdsc_build_path instead of clone_dentry_name
      ceph: use __getname/__putname in ceph_mdsc_build_path
      ceph: use pathlen values returned by set_request_path_attr
      ceph: after an MDS request, do callback and completions
      ceph: have ceph_mdsc_do_request call ceph_mdsc_submit_request
      ceph: move wait for mds request into helper function
      ceph: fix comment over ceph_drop_caps_for_unlink
      ceph: simplify arguments and return semantics of try_get_cap_refs
      ceph: just call get_session in __ceph_lookup_mds_session
      ceph: print inode number in __caps_issued_mask debugging messages
      libceph: fix unaligned accesses in ceph_entity_addr handling
      libceph: make ceph_pr_addr take an struct ceph_entity_addr pointer
      ceph: fix unaligned access in ceph_send_cap_releases
      ceph: flush dirty inodes before proceeding with remount

Luis Henriques (2):
      ceph: factor out ceph_lookup_inode()
      ceph: quota: fix quota subdir mounts

Yan, Zheng (1):
      ceph: snapshot nfs re-export

Zhi Zhang (1):
      ceph: remove duplicated filelock ref increase

 drivers/block/rbd.c            |  24 +--
 fs/ceph/caps.c                 |  93 +--
 fs/ceph/debugfs.c              |  40 -
 fs/ceph/export.c               | 356 ++---
 fs/ceph/file.c                 |   2 +-
 fs/ceph/inode.c                |  85 ++
 fs/ceph/locks.c                |  13 --
 fs/ceph/mds_client.c           | 205 ++--
 fs/ceph/mds_client.h           |  33 +++-
 fs/ceph/mdsmap.c               |   2 +-
 fs/ceph/quota.c                | 177 ++--
 fs/ceph/super.c                |   7 +
 fs/ceph/super.h                |   2 +
 include/linux/ceph/ceph_fs.h   |   6 +
 include/linux/ceph/messenger.h |   3 +-
 include/linux/ceph/osdmap.h    |  13 +-
 net/ceph/cls_lock_client.c     |   2 +-
 net/ceph/debugfs.c             |   4 +-
 net/ceph/messenger.c           | 121 +++---
 net/ceph/mon_client.c          |   6 +-
 net/ceph/osd_client.c          |   2 +-
 21 files changed, 845 insertions(+), 351 deletions(-)

[GIT PULL] Ceph fixes for 5.1-rc7
Hi Linus,

The following changes since commit 085b7755808aa11f78ab9377257e1dad2e6fa4bb:

  Linux 5.1-rc6 (2019-04-21 10:45:57 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc7

for you to fetch changes up to 37659182bff1eeaaeadcfc8f853c6d2b6dbc3f47:

  ceph: fix ci->i_head_snapc leak (2019-04-23 21:37:54 +0200)

dentry name handling fixes from Jeff and a memory leak fix from Zheng. Both are old issues, marked for stable.

Jeff Layton (3):
      ceph: only use d_name directly when parent is locked
      ceph: ensure d_name stability in ceph_dentry_hash()
      ceph: handle the case where a dentry has been renamed on outstanding req

Yan, Zheng (1):
      ceph: fix ci->i_head_snapc leak

 fs/ceph/dir.c        |  6 -
 fs/ceph/inode.c      | 16 +++-
 fs/ceph/mds_client.c | 70 +++-
 fs/ceph/snap.c       |  7 +-
 4 files changed, 85 insertions(+), 14 deletions(-)

[GIT PULL] Ceph fixes for 5.1-rc3
Hi Linus,

The following changes since commit 8c2ffd9174779014c3fe1f96d9dc3641d9175f00:

  Linux 5.1-rc2 (2019-03-24 14:02:26 -0700)

are available in the Git repository at:

  https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc3

for you to fetch changes up to daf5cc27eed99afdea8d96e71b89ba41f5406ef6:

  ceph: fix use-after-free on symlink traversal (2019-03-27 19:00:37 +0100)

A patch to avoid choking on multipage bvecs in the messenger and a small use-after-free fix.

Al Viro (1):
      ceph: fix use-after-free on symlink traversal

Ilya Dryomov (1):
      libceph: fix breakage caused by multipage bvecs

 fs/ceph/inode.c      | 2 +-
 net/ceph/messenger.c | 8 ++--
 2 files changed, 7 insertions(+), 3 deletions(-)

Re: [PATCH] [v2] ceph: fix clang warning for CEPH_DEFINE_OID_ONSTACK
On Mon, Mar 25, 2019 at 1:51 PM Arnd Bergmann wrote: > > clang complains about assigning a variable to itself during the > declaration: > > fs/ceph/ioctl.c:187:26: error: variable 'oid' is uninitialized when used > within its own initialization [-Werror,-Wuninitialized] > CEPH_DEFINE_OID_ONSTACK(oid); > ^~~ > include/linux/ceph/osdmap.h:122:52: note: expanded from macro > 'CEPH_DEFINE_OID_ONSTACK' > struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid) > ~~~ ^~~ > include/linux/ceph/osdmap.h:120:29: note: expanded from macro > 'CEPH_OID_INIT_ONSTACK' > ({ ceph_oid_init(&oid); oid; }) > ^~~ > > We use this trick in other places, but it is completely unnecessary > here, as we can just use a regular struct initializer. > > Signed-off-by: Arnd Bergmann > --- > v2: rearrange to only have one instance of the initializer > --- > include/linux/ceph/osdmap.h | 13 ++--- > 1 file changed, 6 insertions(+), 7 deletions(-) > > diff --git a/include/linux/ceph/osdmap.h b/include/linux/ceph/osdmap.h > index 5675b1f09bc5..8794cf0f0b39 100644 > --- a/include/linux/ceph/osdmap.h > +++ b/include/linux/ceph/osdmap.h > @@ -110,17 +110,16 @@ struct ceph_object_id { > int name_len; > }; > > +#define CEPH_OID_INITIALIZER(oid) { .name = (oid).inline_name } > + > +#define CEPH_DEFINE_OID_ONSTACK(oid) \ > + struct ceph_object_id oid = CEPH_OID_INITIALIZER(oid) > + > static inline void ceph_oid_init(struct ceph_object_id *oid) > { > - oid->name = oid->inline_name; > - oid->name_len = 0; > + *oid = (struct ceph_object_id)CEPH_OID_INITIALIZER(*oid); > } > > -#define CEPH_OID_INIT_ONSTACK(oid) \ > -({ ceph_oid_init(&oid); oid; }) > -#define CEPH_DEFINE_OID_ONSTACK(oid) \ > - struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid) > - > static inline bool ceph_oid_empty(const struct ceph_object_id *oid) > { > return oid->name == oid->inline_name && !oid->name_len; Applied. Thanks, Ilya
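For readers following the v2 change above, here is a minimal userspace sketch of the same idea: one initializer macro shared by the on-stack definition and the init helper. The struct is a simplified stand-in for struct ceph_object_id, the inline buffer size is an assumption, and the unprefixed names are hypothetical.

```c
#include <stddef.h>

/* Simplified stand-in for struct ceph_object_id; the buffer size is assumed. */
#define OID_INLINE_LEN 52

struct object_id {
	char *name;
	char inline_name[OID_INLINE_LEN];
	int name_len;
};

/*
 * Only the address of the inline buffer is taken here; the object being
 * initialized is never read, so there is no self-assignment to warn about.
 */
#define OID_INITIALIZER(oid) { .name = (oid).inline_name }

#define DEFINE_OID_ONSTACK(oid) \
	struct object_id oid = OID_INITIALIZER(oid)

static inline void oid_init(struct object_id *oid)
{
	/* The compound literal reuses the same initializer: one place to patch. */
	*oid = (struct object_id)OID_INITIALIZER(*oid);
}

static inline int oid_empty(const struct object_id *oid)
{
	return oid->name == oid->inline_name && !oid->name_len;
}
```

Members not named in the designated initializer are zero-initialized, which is why name_len needs no explicit assignment in either path.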
[PATCH] dm table: propagate BDI_CAP_STABLE_WRITES
Some devices don't use blk_integrity but still want stable pages because they do their own checksumming. Examples include rbd and iSCSI when data digests are negotiated. Stacking DM (and thus LVM) on top of these devices results in sporadic checksum errors. Set BDI_CAP_STABLE_WRITES if any underlying device has it set. Signed-off-by: Ilya Dryomov --- drivers/md/dm-table.c | 39 +++ 1 file changed, 39 insertions(+) diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index ba9481f1bf3c..cde3b49b2a91 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -1844,6 +1844,36 @@ static bool dm_table_supports_secure_erase(struct dm_table *t) return true; } +static int device_requires_stable_pages(struct dm_target *ti, + struct dm_dev *dev, sector_t start, + sector_t len, void *data) +{ + struct request_queue *q = bdev_get_queue(dev->bdev); + + return q && bdi_cap_stable_pages_required(q->backing_dev_info); +} + +/* + * If any underlying device requires stable pages, a table must require + * them as well. Only targets that support iterate_devices are considered: + * don't want error, zero, etc to require stable pages. + */ +static bool dm_table_requires_stable_pages(struct dm_table *t) +{ + struct dm_target *ti; + unsigned i; + + for (i = 0; i < dm_table_get_num_targets(t); i++) { + ti = dm_table_get_target(t, i); + + if (ti->type->iterate_devices && + ti->type->iterate_devices(ti, device_requires_stable_pages, NULL)) + return true; + } + + return false; +} + void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, struct queue_limits *limits) { @@ -1896,6 +1926,15 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, dm_table_verify_integrity(t); + /* +* Some devices don't use blk_integrity but still want stable pages +* because they do their own checksumming. 
+*/ + if (dm_table_requires_stable_pages(t)) + q->backing_dev_info->capabilities |= BDI_CAP_STABLE_WRITES; + else + q->backing_dev_info->capabilities &= ~BDI_CAP_STABLE_WRITES; + /* * Determine whether or not this queue's I/O timings contribute * to the entropy pool, Only request-based targets use this. -- 2.19.2
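The propagation rule in the patch above ("set the flag if any underlying device has it, clear it otherwise") can be sketched in plain C as follows. This is a userspace model under stated assumptions, not the dm code: struct dev, struct target and CAP_STABLE_WRITES are hypothetical stand-ins, and a NULL device list models a target without iterate_devices.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BDI_CAP_STABLE_WRITES; the value is illustrative. */
#define CAP_STABLE_WRITES 0x1u

struct dev {
	unsigned int capabilities;
};

/* devs == NULL models a target without an iterate_devices callback. */
struct target {
	const struct dev *devs;
	size_t ndevs;
};

static bool target_requires_stable_pages(const struct target *t)
{
	size_t i;

	if (!t->devs)
		return false;	/* error, zero, etc. never force stable pages */

	for (i = 0; i < t->ndevs; i++)
		if (t->devs[i].capabilities & CAP_STABLE_WRITES)
			return true;
	return false;
}

static bool table_requires_stable_pages(const struct target *targets, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (target_requires_stable_pages(&targets[i]))
			return true;
	return false;
}

/* Set or clear the flag, so reloading a table can also drop the requirement. */
static void table_set_restrictions(unsigned int *caps,
				   const struct target *targets, size_t n)
{
	if (table_requires_stable_pages(targets, n))
		*caps |= CAP_STABLE_WRITES;
	else
		*caps &= ~CAP_STABLE_WRITES;
}
```

Clearing the bit in the else branch matters as much as setting it: without it, a table that once sat on top of rbd would keep requiring stable pages after being reloaded onto a device that does not need them.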
Re: ceph: fix use-after-free on symlink traversal
On Tue, Mar 26, 2019 at 2:39 AM Al Viro wrote: > > free the symlink body after the same RCU delay we have for freeing the > struct inode itself, so that traversal during RCU pathwalk wouldn't step > into freed memory. > > Signed-off-by: Al Viro > --- > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c > index e3346628efe2..2d61ddda9bf5 100644 > --- a/fs/ceph/inode.c > +++ b/fs/ceph/inode.c > @@ -524,6 +524,7 @@ static void ceph_i_callback(struct rcu_head *head) > struct inode *inode = container_of(head, struct inode, i_rcu); > struct ceph_inode_info *ci = ceph_inode(inode); > > + kfree(ci->i_symlink); > kmem_cache_free(ceph_inode_cachep, ci); > } > > @@ -566,7 +567,6 @@ void ceph_destroy_inode(struct inode *inode) > } > } > > - kfree(ci->i_symlink); > while ((n = rb_first(&ci->i_fragtree)) != NULL) { > frag = rb_entry(n, struct ceph_inode_frag, node); > rb_erase(n, &ci->i_fragtree); Al, I see you directed this patch at Linus instead of ceph-devel. I can pick it up for -rc3 as I have an important libceph fix pending anyway. Let me know if you want me to handle it. Thanks, Ilya
Re: [PATCH] rbd: avoid clang -Wuninitialized warning
On Fri, Mar 22, 2019 at 5:55 PM Arnd Bergmann wrote: > > On Fri, Mar 22, 2019 at 5:33 PM Ilya Dryomov wrote: > > > > On Fri, Mar 22, 2019 at 3:36 PM Arnd Bergmann wrote: > > > > > > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c > > > index 4ba967d65cf9..cbcc3baf3807 100644 > > > --- a/drivers/block/rbd.c > > > +++ b/drivers/block/rbd.c > > > @@ -2399,7 +2399,7 @@ static int rbd_obj_read_from_parent(struct > > > rbd_obj_request *obj_req) > > > &obj_req->bvec_pos); > > > break; > > > default: > > > - rbd_assert(0); > > > + BUG(); > > > } > > > } else { > > > ret = rbd_img_fill_from_bvecs(child_img_req, > > > > Hi Arnd, > > > > You did a couple of these last year in commit c6244b3b2377 ("rbd: avoid > > Wreturn-type warnings"). > > Ah, I had completely forgotten about that. Different bug and different > compiler, but same solution ;-) > > > Let's change all of those default cases to BUG > > in one go. Do you want to do that or should I? > > I've prepared another patch now and sent it out, please > apply it on top. I'd like this one-line patch to stay separate though > since it captures the warning message and may need to > be backported to stable kernels later. Applied. Thanks, Ilya
[GIT PULL] Ceph fixes for 5.1-rc2
Hi Linus, The following changes since commit 9e98c678c2d6ae3a17cb2de55d17f69dddaa231b: Linux 5.1-rc1 (2019-03-17 14:22:26 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc2 for you to fetch changes up to 9d4a227f6ef189cf37eb22641f6ee788b7dc41bb: rbd: drop wait_for_latest_osdmap() (2019-03-20 16:27:40 +0100) A follow up for the new alloc_size logic and a blacklisting fix, marked for stable. Ilya Dryomov (3): rbd: set io_min, io_opt and discard_granularity to alloc_size libceph: wait for latest osdmap in ceph_monc_blacklist_add() rbd: drop wait_for_latest_osdmap() drivers/block/rbd.c | 28 ++-- include/linux/ceph/libceph.h | 2 ++ net/ceph/ceph_common.c | 18 +- net/ceph/mon_client.c| 9 + 4 files changed, 34 insertions(+), 23 deletions(-)
Re: [PATCH] ceph: fix clang warning for CEPH_DEFINE_OID_ONSTACK
On Fri, Mar 22, 2019 at 3:08 PM Arnd Bergmann wrote: > > clang complains about assigning a variable to itself during the > declaration: > > fs/ceph/ioctl.c:187:26: error: variable 'oid' is uninitialized when used > within its own initialization [-Werror,-Wuninitialized] > CEPH_DEFINE_OID_ONSTACK(oid); > ^~~ > include/linux/ceph/osdmap.h:122:52: note: expanded from macro > 'CEPH_DEFINE_OID_ONSTACK' > struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid) > ~~~ ^~~ > include/linux/ceph/osdmap.h:120:29: note: expanded from macro > 'CEPH_OID_INIT_ONSTACK' > ({ ceph_oid_init(&oid); oid; }) > ^~~ > > We use this trick in other places, but it is completely unnecessary > here, as we can just use a regular struct initializer. > > Signed-off-by: Arnd Bergmann > --- > include/linux/ceph/osdmap.h | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/include/linux/ceph/osdmap.h b/include/linux/ceph/osdmap.h > index 5675b1f09bc5..82f957a7a0d6 100644 > --- a/include/linux/ceph/osdmap.h > +++ b/include/linux/ceph/osdmap.h > @@ -116,10 +116,8 @@ static inline void ceph_oid_init(struct ceph_object_id > *oid) > oid->name_len = 0; > } > > -#define CEPH_OID_INIT_ONSTACK(oid) \ > -({ ceph_oid_init(&oid); oid; }) > #define CEPH_DEFINE_OID_ONSTACK(oid) \ > - struct ceph_object_id oid = CEPH_OID_INIT_ONSTACK(oid) > + struct ceph_object_id oid = { .name = oid.inline_name } > > static inline bool ceph_oid_empty(const struct ceph_object_id *oid) > { Hi Arnd, I don't like this because the initialization is no longer contained to ceph_oid_init(). Now there are two things to patch instead of one. How is this going to be fixed in other places? Thanks, Ilya
[GIT PULL] Ceph updates for 5.1-rc1
Hi Linus, The following changes since commit 1c163f4c7b3f621efff9b28a47abb36f7378d783: Linux 5.0 (2019-03-03 15:21:29 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.1-rc1 for you to fetch changes up to d11ae8e0a76afc506071831854348f2ea1f3290e: Documentation: modern versions of ceph are not backed by btrfs (2019-03-05 18:55:18 +0100) The highlights are: - rbd will now ignore discards that aren't aligned and big enough to actually free up some space (myself). This is controlled by the new alloc_size map option and can be disabled if needed. - support for rbd deep-flatten feature (myself). Deep-flatten allows "rbd flatten" to fully disconnect the clone image and its snapshots from the parent and make the parent snapshot removable. - a new round of cap handling improvements (Zheng Yan). The kernel client should now be much more prompt about releasing its caps and it is possible to put a limit on the number of caps held. - support for getting ceph.dir.pin extended attribute (Zheng Yan) Gustavo A. R. 
Silva (1): libceph: use struct_size() for kmalloc() in crush_decode() Ilya Dryomov (11): rbd: get rid of obj_req->obj_request_count rbd: handle DISCARD and WRITE_ZEROES separately rbd: round off and ignore discards that are too small rbd: remove experimental designation from kernel layering rbd: clear ->xferred on error from rbd_obj_issue_copyup() rbd: factor out __rbd_osd_req_create() rbd: stop copying num_osd_ops in rbd_obj_issue_copyup() rbd: introduce rbd_obj_issue_copyup_ops() rbd: copyup with an empty snapshot context (aka deep-copyup) rbd: whole-object write and zeroout should copyup when snapshots exist rbd: advertise support for RBD_FEATURE_DEEP_FLATTEN Jeff Layton (1): Documentation: modern versions of ceph are not backed by btrfs Yan, Zheng (12): ceph: set special inode's blocksize to page size ceph: decode feature bits in session message ceph: split large reconnect into multiple messages ceph: map snapid to anonymous bdev ID ceph: support versioned reply ceph: support getting ceph.dir.pin vxattr ceph: send cap releases more aggressively ceph: touch existing cap when handling reply ceph: remove dentry_lru file from debugfs ceph: delete stale dentry when last reference is dropped ceph: periodically trim stale dentries ceph: add mount option to limit caps count zhengbin (1): ceph: pass inclusive lend parameter to filemap_write_and_wait_range() Documentation/filesystems/ceph.txt | 14 +- drivers/block/rbd.c| 400 +++-- fs/ceph/caps.c | 72 ++-- fs/ceph/debugfs.c | 27 -- fs/ceph/dir.c | 455 +++- fs/ceph/file.c | 13 +- fs/ceph/inode.c| 52 +-- fs/ceph/mds_client.c | 698 ++--- fs/ceph/mds_client.h | 43 ++- fs/ceph/snap.c | 159 - fs/ceph/super.c| 21 +- fs/ceph/super.h| 43 ++- fs/ceph/xattr.c| 20 +- include/linux/ceph/types.h | 1 + net/ceph/osdmap.c | 5 +- 15 files changed, 1597 insertions(+), 426 deletions(-)
[GIT PULL] Ceph fixes for 5.0-rc8
Hi Linus, The following changes since commit a3b22b9f11d9fbc48b0291ea92259a5a810e9438: Linux 5.0-rc7 (2019-02-17 18:46:40 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.0-rc8 for you to fetch changes up to 04242ff3ac0abbaa4362f97781dac268e6c3541a: ceph: avoid repeatedly adding inode to mdsc->snap_flush_list (2019-02-18 18:08:29 +0100) Two bug fixes for old issues, both marked for stable. Ilya Dryomov (1): libceph: handle an empty authorize reply Yan, Zheng (1): ceph: avoid repeatedly adding inode to mdsc->snap_flush_list fs/ceph/snap.c | 3 ++- net/ceph/messenger.c | 15 +-- 2 files changed, 11 insertions(+), 7 deletions(-)
[GIT PULL] Ceph fixes for 5.0-rc4
Hi Linus, The following changes since commit 49a57857aeea06ca831043acbb0fa5e0f50602fd: Linux 5.0-rc3 (2019-01-21 13:14:44 +1300) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.0-rc4 for you to fetch changes up to 74827ee29565f86e2a64495a5e3e58d3371d74ee: ceph: quota: cleanup license mess (2019-01-21 14:53:23 +0100) A fix for a potential use-after-free, a patch to close a (mostly benign) race in the messenger and a licence clarification for quota.c. Ilya Dryomov (1): libceph: avoid KEEPALIVE_PENDING races in ceph_con_keepalive() Thomas Gleixner (1): ceph: quota: cleanup license mess Yan, Zheng (1): ceph: clear inode pointer when snap realm gets dropped by its inode fs/ceph/caps.c | 2 ++ fs/ceph/quota.c | 13 - net/ceph/messenger.c | 5 +++-- 3 files changed, 5 insertions(+), 15 deletions(-)
Re: [patch 6/9] ceph: quota: Cleanup license mess
On Fri, Jan 18, 2019 at 12:15 AM Thomas Gleixner wrote: > > Precise and non-ambiguous license information is important. The recently > added aegis header file has a SPDX license identifier, which is nice, but Looks like cut-and-paste from crypto/aegis.h patch? I'm changing this to say "recently added quota.c file". > at the same time it has a contradictionary license boiler plate text. > > SPDX-License-Identifier: GPL-2.0 > > versus > > * This program is free software; you can redistribute it and/or > * modify it under the terms of the GNU General Public License > * as published by the Free Software Foundation; either version 2 > * of the License, or (at your option) any later version. > > Oh well. > > As the other ceph related files are licensed under the GPL v2 only, it's > assumed that the SPDX id is correct and the boiler plate was randomly > copied into that patch. > > Remove the boiler plate as it is wrong and even if correct it is redundant. > > Fixes: fb18a57568c2 ("ceph: quota: add initial infrastructure to support > cephfs quotas") > Signed-off-by: Thomas Gleixner > Cc: Luis Henriques > Cc: Jiri Kosina > Cc: "Yan, Zheng" > Cc: Sage Weil > Cc: Ilya Dryomov > Cc: ceph-de...@vger.kernel.org > --- > > P.S.: This patch is part of a larger cleanup, but independent of other > patches and is intended to be picked up by the maintainer directly. > > --- > fs/ceph/quota.c | 13 - > 1 file changed, 13 deletions(-) > > --- a/fs/ceph/quota.c > +++ b/fs/ceph/quota.c > @@ -3,19 +3,6 @@ > * quota.c - CephFS quota > * > * Copyright (C) 2017-2018 SUSE > - * > - * This program is free software; you can redistribute it and/or > - * modify it under the terms of the GNU General Public License > - * as published by the Free Software Foundation; either version 2 > - * of the License, or (at your option) any later version. 
> - * > - * This program is distributed in the hope that it will be useful, > - * but WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > - * GNU General Public License for more details. > - * > - * You should have received a copy of the GNU General Public License > - * along with this program; if not, see <http://www.gnu.org/licenses/>. > */ > > #include Applied. Thanks, Ilya
Re: [PATCH net-next] libceph, ceph: use struct_size() in kmalloc()
On Tue, Jan 15, 2019 at 8:41 PM Gustavo A. R. Silva wrote: > > One of the more common cases of allocation size calculations is finding > the size of a structure that has a zero-sized array at the end, along > with memory for some number of elements for that array. For example: > > struct foo { > int stuff; > struct boo entry[]; > }; > > instance = kmalloc(sizeof(struct foo) + count * sizeof(struct boo), > GFP_KERNEL); > > Instead of leaving these open-coded and prone to type mistakes, we can > now use the new struct_size() helper: > > instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL); > > This code was detected with the help of Coccinelle. > > Signed-off-by: Gustavo A. R. Silva > --- > net/ceph/osdmap.c | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/net/ceph/osdmap.c b/net/ceph/osdmap.c > index 98c0ff3d6441..48a31dc9161c 100644 > --- a/net/ceph/osdmap.c > +++ b/net/ceph/osdmap.c > @@ -495,9 +495,8 @@ static struct crush_map *crush_decode(void *pbyval, void > *end) > / sizeof(struct crush_rule_step)) > goto bad; > #endif > - r = c->rules[i] = kmalloc(sizeof(*r) + > - yes*sizeof(struct crush_rule_step), > - GFP_NOFS); > + r = kmalloc(struct_size(r, steps, yes), GFP_NOFS); > + c->rules[i] = r; > if (r == NULL) > goto badmem; > dout(" rule %d is at %p\n", i, r); Applied. Thanks, Ilya
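To make the struct_size() conversion above concrete, here is a userspace approximation. It is a sketch only: STRUCT_SIZE and flex_size are hypothetical names, struct rule is a simplified stand-in for struct crush_rule, and the saturate-to-SIZE_MAX behaviour is modelled on my understanding of the kernel helper rather than copied from it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct rule_step {
	int op, arg1, arg2;
};

/* Simplified stand-in for struct crush_rule: header plus flexible array. */
struct rule {
	unsigned int len;
	struct rule_step steps[];
};

/* base + elem * count, saturating to SIZE_MAX instead of wrapping around. */
static size_t flex_size(size_t base, size_t elem, size_t count)
{
	if (elem != 0 && count > (SIZE_MAX - base) / elem)
		return SIZE_MAX;
	return base + elem * count;
}

/* Both sizeof operands are unevaluated, so p may still be uninitialized. */
#define STRUCT_SIZE(p, member, count) \
	flex_size(sizeof(*(p)), sizeof(*(p)->member), (count))

static struct rule *rule_alloc(unsigned int nsteps)
{
	struct rule *r;

	/* An overflowed size becomes SIZE_MAX, which malloc() will refuse. */
	r = malloc(STRUCT_SIZE(r, steps, nsteps));
	if (r)
		r->len = nsteps;
	return r;
}
```

The point of the helper is that the element type is derived from the pointer itself, so the open-coded sizeof(struct ...) cannot drift out of sync with the array's real type.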
Re: [PATCH] libceph: protect pending flags in ceph_con_keepalive()
On Tue, Jan 15, 2019 at 7:56 AM Myungho Jung wrote: > > On Mon, Jan 14, 2019 at 09:37:25PM +0100, Ilya Dryomov wrote: > > On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung wrote: > > > I reproduced on vm using syzkaller utils and verified the fix by syzbot. > > > > Hi Myungho, > > > > I think this might be a better fix: > > > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > > index d5718284db57..c5f5313e3537 100644 > > --- a/net/ceph/messenger.c > > +++ b/net/ceph/messenger.c > > @@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con) > > { > > dout("con_keepalive %p\n", con); > > mutex_lock(&con->mutex); > > + con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING); > > clear_standby(con); > > mutex_unlock(&con->mutex); > > - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && > > - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > > + > > + if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > > queue_con(con); > > } > > EXPORT_SYMBOL(ceph_con_keepalive); > > > > WRITE_PENDING can be set without con->mutex held from socket callbacks. > > This is the reason we use atomic bit ops here, so testing WRITE_PENDING > > under the lock didn't make sense to me. > > > > At the same time, KEEPALIVE_PENDING could have been a non-atomic flag. > > I spent some time trying to make sense of conditioning queue_con() call > > on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm > > setting it with con_flag_set(), making ceph_con_keepalive() symmetric > > with ceph_con_send(). > > > > Thanks, > > > > Ilya > > Hi Ilya, > > Yes, it looks clear and makes sense to have an atomic operation in if > statement > but it still triggers warning. KEEPALIVE_PENDING should be set after > clear_standby() because con_fault() can be called right before acquiring the > lock here which sets the flag in standby state. I tested the change with syzbot > and confirmed there was no warning. Right, it still triggers one of the warnings.
I was too focused on WRITE_PENDING and missed that in plain sight. I'll update the patch. Thanks for testing! Ilya
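The ordering this thread converges on can be modelled in userspace with C11 atomics. This is a sketch under assumptions, not the updated patch (which is not shown here): struct conn, the flag values and the queued counter are hypothetical, and the mutex and clear_standby() are represented only by comments.

```c
#include <stdatomic.h>
#include <stdbool.h>

enum {
	FLAG_KEEPALIVE_PENDING = 1u << 0,
	FLAG_WRITE_PENDING     = 1u << 1,
};

/* Hypothetical stand-in for struct ceph_connection. */
struct conn {
	atomic_uint flags;
	unsigned int queued;	/* how many times work was queued */
};

static void conn_init(struct conn *c)
{
	atomic_init(&c->flags, 0);
	c->queued = 0;
}

static void flag_set(struct conn *c, unsigned int f)
{
	atomic_fetch_or(&c->flags, f);
}

/* Atomically set the bit and report whether it was already set. */
static bool flag_test_and_set(struct conn *c, unsigned int f)
{
	return (atomic_fetch_or(&c->flags, f) & f) != 0;
}

static void conn_keepalive(struct conn *c)
{
	/* lock: the kernel code would hold con->mutex from here ... */
	/* clear_standby(con) runs first, then the keepalive flag is set */
	flag_set(c, FLAG_KEEPALIVE_PENDING);
	/* ... to here: unlock */

	/* Outside the lock: queue only on the 0 -> 1 transition. */
	if (!flag_test_and_set(c, FLAG_WRITE_PENDING))
		c->queued++;
}
```

Because only the caller that flips WRITE_PENDING from clear to set queues the connection, repeated keepalive requests collapse into a single queued work item until the writer clears the flag.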
Re: [PATCH] libceph: protect pending flags in ceph_con_keepalive()
On Thu, Jan 3, 2019 at 4:50 AM Myungho Jung wrote: > I reproduced on vm using syzkaller utils and verified the fix by syzbot. Hi Myungho, I think this might be a better fix: diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index d5718284db57..c5f5313e3537 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -3205,10 +3205,11 @@ void ceph_con_keepalive(struct ceph_connection *con) { dout("con_keepalive %p\n", con); mutex_lock(&con->mutex); + con_flag_set(con, CON_FLAG_KEEPALIVE_PENDING); clear_standby(con); mutex_unlock(&con->mutex); - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) + + if (con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) queue_con(con); } EXPORT_SYMBOL(ceph_con_keepalive); WRITE_PENDING can be set without con->mutex held from socket callbacks. This is the reason we use atomic bit ops here, so testing WRITE_PENDING under the lock didn't make sense to me. At the same time, KEEPALIVE_PENDING could have been a non-atomic flag. I spent some time trying to make sense of conditioning queue_con() call on the previous value of KEEPALIVE_PENDING and couldn't see any, so I'm setting it with con_flag_set(), making ceph_con_keepalive() symmetric with ceph_con_send(). Thanks, Ilya
[GIT PULL] Ceph updates for 5.0-rc2
Hi Linus, The following changes since commit bfeffd155283772bbe78c6a05dec7c0128ee500c: Linux 5.0-rc1 (2019-01-06 17:08:20 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-5.0-rc2 for you to fetch changes up to 85f5a4d666fd9be73856ed16bb36c5af5b406b29: rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set (2019-01-10 09:45:09 +0100) A patch to allow setting abort_on_full and a fix for an old "rbd unmap" edge case, marked for stable. Dongsheng Yang (1): libceph: allow setting abort_on_full for rbd Ilya Dryomov (1): rbd: don't return 0 on unmap if RBD_DEV_FLAG_REMOVING is set Souptick Joarder (1): ceph: use vmf_error() in ceph_filemap_fault() drivers/block/rbd.c | 9 - fs/ceph/addr.c | 5 + fs/ceph/super.c | 4 ++-- include/linux/ceph/libceph.h| 6 -- include/linux/ceph/osd_client.h | 1 - net/ceph/ceph_common.c | 11 ++- net/ceph/debugfs.c | 2 +- net/ceph/osd_client.c | 4 ++-- 8 files changed, 24 insertions(+), 18 deletions(-)
Re: [PATCH] fs/ceph/addr.c: Convert to use vmf_error()
On Fri, Jan 4, 2019 at 8:26 PM Souptick Joarder wrote: > > This code is converted to use vmf_error(). > > Signed-off-by: Souptick Joarder > --- > fs/ceph/addr.c | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c > index 8eade7a..fa2a85d 100644 > --- a/fs/ceph/addr.c > +++ b/fs/ceph/addr.c > @@ -1495,10 +1495,7 @@ static vm_fault_t ceph_filemap_fault(struct vm_fault > *vmf) > if (err < 0 || off >= i_size_read(inode)) { > unlock_page(page); > put_page(page); > - if (err == -ENOMEM) > - ret = VM_FAULT_OOM; > - else > - ret = VM_FAULT_SIGBUS; > + ret = vmf_error(err); > goto out_inline; > } > if (err < PAGE_SIZE) Applied. Thanks, Ilya
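The open-coded branch removed by the patch above amounts to the mapping below. This is a userspace sketch that mirrors only the lines shown in the diff; FAULT_OOM, FAULT_SIGBUS and fault_from_errno are hypothetical stand-ins for VM_FAULT_OOM, VM_FAULT_SIGBUS and vmf_error(), with illustrative values.

```c
#include <errno.h>

/* Hypothetical stand-ins for the kernel's VM_FAULT_* bits. */
enum {
	FAULT_OOM    = 0x1,
	FAULT_SIGBUS = 0x2,
};

/* -ENOMEM becomes an OOM fault; every other error becomes SIGBUS. */
static inline unsigned int fault_from_errno(int err)
{
	if (err == -ENOMEM)
		return FAULT_OOM;
	return FAULT_SIGBUS;
}
```

Centralizing the mapping in one helper means callers such as the fault handler no longer repeat the same four-line if/else.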
[GIT PULL] Ceph updates for 4.21-rc1
Hi Linus, The following changes since commit 8fe28cb58bcb235034b64cbbb7550a8a43fd88be: Linux 4.20 (2018-12-23 15:55:59 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.21-rc1 for you to fetch changes up to 5ccedf1ccd710ba32f36986b49eeb764e53e7ef1: ceph: don't encode inode pathes into reconnect message (2018-12-26 16:08:36 +0100) A fairly quiet round: a couple of messenger performance improvements from myself and a few cap handling fixes from Zheng. Chengguang Xu (1): ceph: remove redundant assignment Ilya Dryomov (4): libceph: drop last_piece logic from write_partial_message_data() libceph: use sock_no_sendpage() as a fallback in ceph_tcp_sendpage() libceph: use MSG_SENDPAGE_NOTLAST with ceph_tcp_sendpage() libceph: switch more to bool in ceph_tcp_sendmsg() Yan, Zheng (6): ceph: cleanup splice_dentry() ceph: don't update importing cap's mseq when handing cap export ceph: don't request excl caps when mount is readonly ceph: skip updating 'wanted' caps if caps are already issued ceph: update wanted caps after resuming stale session ceph: don't encode inode pathes into reconnect message fs/ceph/caps.c | 75 ++ fs/ceph/inode.c | 60 ++-- fs/ceph/mds_client.c | 129 ++- fs/ceph/mds_client.h | 16 --- fs/ceph/mdsmap.c | 1 - net/ceph/messenger.c | 55 +- 6 files changed, 174 insertions(+), 162 deletions(-)
Re: [PATCH] libceph: protect pending flags in ceph_con_keepalive()
On Thu, Dec 27, 2018 at 8:08 PM Myungho Jung wrote: > > con_flag_test_and_set() sets CON_FLAG_KEEPALIVE_PENDING and > CON_FLAG_WRITE_PENDING flags without protection in ceph_con_keepalive(). > It triggers WARN_ON() in clear_standby() if the flags are set after > con_fault() changes connection state to CON_STATE_STANDBY. Move > con_flag_test_and_set() to be called before releasing the lock and store > the condition to check after the critical section. > > Reported-by: syzbot+acdeb633f6211ccdf...@syzkaller.appspotmail.com > Signed-off-by: Myungho Jung > --- > net/ceph/messenger.c | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c > index 2f126eff275d..e15da22d4f37 100644 > --- a/net/ceph/messenger.c > +++ b/net/ceph/messenger.c > @@ -3216,12 +3216,16 @@ void ceph_msg_revoke_incoming(struct ceph_msg *msg) > */ > void ceph_con_keepalive(struct ceph_connection *con) > { > + bool pending; > + > dout("con_keepalive %p\n", con); > mutex_lock(&con->mutex); > clear_standby(con); > + pending = (con_flag_test_and_set(con, > +CON_FLAG_KEEPALIVE_PENDING) == 0 && > + con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0); > mutex_unlock(&con->mutex); > - if (con_flag_test_and_set(con, CON_FLAG_KEEPALIVE_PENDING) == 0 && > - con_flag_test_and_set(con, CON_FLAG_WRITE_PENDING) == 0) > + if (pending) > queue_con(con); > } > EXPORT_SYMBOL(ceph_con_keepalive); Hi Myungho, Were you able to reproduce? If so, did you use the syzkaller output or something else? Thanks, Ilya
Re: [PATCH 06/10] block: rbd: convert to use BUS_ATTR_WO and RO
On Fri, Dec 21, 2018 at 8:55 AM Greg Kroah-Hartman wrote: > > We are trying to get rid of BUS_ATTR() and the usage of that in rbd.c > can be trivially converted to use BUS_ATTR_WO and RO, so use those > macros instead. > > Cc: Ilya Dryomov > Cc: Sage Weil > Cc: Alex Elder > Cc: Jens Axboe > Signed-off-by: Greg Kroah-Hartman > --- > drivers/block/rbd.c | 45 +++-- > 1 file changed, 19 insertions(+), 26 deletions(-) > > diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c > index 8e5140bbf241..d871d364fdcf 100644 > --- a/drivers/block/rbd.c > +++ b/drivers/block/rbd.c > @@ -428,14 +428,13 @@ static bool single_major = true; > module_param(single_major, bool, 0444); > MODULE_PARM_DESC(single_major, "Use a single major number for all rbd > devices (default: true)"); > > -static ssize_t rbd_add(struct bus_type *bus, const char *buf, > - size_t count); > -static ssize_t rbd_remove(struct bus_type *bus, const char *buf, > - size_t count); > -static ssize_t rbd_add_single_major(struct bus_type *bus, const char *buf, > - size_t count); > -static ssize_t rbd_remove_single_major(struct bus_type *bus, const char *buf, > - size_t count); > +static ssize_t add_store(struct bus_type *bus, const char *buf, size_t > count); > +static ssize_t remove_store(struct bus_type *bus, const char *buf, > + size_t count); > +static ssize_t add_single_major_store(struct bus_type *bus, const char *buf, > + size_t count); > +static ssize_t remove_single_major_store(struct bus_type *bus, const char > *buf, > +size_t count); > static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth); > > static int rbd_dev_id_to_minor(int dev_id) > @@ -464,16 +463,16 @@ static bool rbd_is_lock_owner(struct rbd_device > *rbd_dev) > return is_lock_owner; > } > > -static ssize_t rbd_supported_features_show(struct bus_type *bus, char *buf) > +static ssize_t supported_features_show(struct bus_type *bus, char *buf) > { > return sprintf(buf, "0x%llx\n", RBD_FEATURES_SUPPORTED); > } > > -static BUS_ATTR(add, 
0200, NULL, rbd_add); > -static BUS_ATTR(remove, 0200, NULL, rbd_remove); > -static BUS_ATTR(add_single_major, 0200, NULL, rbd_add_single_major); > -static BUS_ATTR(remove_single_major, 0200, NULL, rbd_remove_single_major); > -static BUS_ATTR(supported_features, 0444, rbd_supported_features_show, NULL); > +static BUS_ATTR_WO(add); > +static BUS_ATTR_WO(remove); > +static BUS_ATTR_WO(add_single_major); > +static BUS_ATTR_WO(remove_single_major); > +static BUS_ATTR_RO(supported_features); > > static struct attribute *rbd_bus_attrs[] = { > &bus_attr_add.attr, > @@ -5934,9 +5933,7 @@ static ssize_t do_rbd_add(struct bus_type *bus, > goto out; > } > > -static ssize_t rbd_add(struct bus_type *bus, > - const char *buf, > - size_t count) > +static ssize_t add_store(struct bus_type *bus, const char *buf, size_t count) > { > if (single_major) > return -EINVAL; > @@ -5944,9 +5941,8 @@ static ssize_t rbd_add(struct bus_type *bus, > return do_rbd_add(bus, buf, count); > } > > -static ssize_t rbd_add_single_major(struct bus_type *bus, > - const char *buf, > - size_t count) > +static ssize_t add_single_major_store(struct bus_type *bus, const char *buf, > + size_t count) > { > return do_rbd_add(bus, buf, count); > } > @@ -6050,9 +6046,7 @@ static ssize_t do_rbd_remove(struct bus_type *bus, > return count; > } > > -static ssize_t rbd_remove(struct bus_type *bus, > - const char *buf, > - size_t count) > +static ssize_t remove_store(struct bus_type *bus, const char *buf, size_t > count) > { > if (single_major) > return -EINVAL; > @@ -6060,9 +6054,8 @@ static ssize_t rbd_remove(struct bus_type *bus, > return do_rbd_remove(bus, buf, count); > } > > -static ssize_t rbd_remove_single_major(struct bus_type *bus, > - const char *buf, > - size_t count) > +static ssize_t remove_single_major_store(struct bus_type *bus, const char > *buf, > +size_t count) > { > return do_rbd_remove(bus, buf, count); > } Acked-by: Ilya Dryomov Thanks, Ilya
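The function renames in the patch above follow from how the _WO/_RO macros build the callback name by token pasting. A minimal userspace sketch, assuming a simplified attribute struct (struct bus_attr, ATTR_WO, ATTR_RO and find_attr are hypothetical names, and the callbacks drop the bus argument for brevity):

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for struct bus_attribute; the layout is an assumption. */
struct bus_attr {
	const char *name;
	unsigned int mode;
	long (*show)(char *buf);
	long (*store)(const char *buf, size_t count);
};

/* Write-only: the macro pastes "_store" onto the attribute name. */
#define ATTR_WO(_name) \
	struct bus_attr bus_attr_##_name = { #_name, 0200, NULL, _name##_store }

/* Read-only: the macro pastes "_show" onto the attribute name. */
#define ATTR_RO(_name) \
	struct bus_attr bus_attr_##_name = { #_name, 0444, _name##_show, NULL }

static long add_store(const char *buf, size_t count)
{
	(void)buf;
	return (long)count;	/* pretend the whole buffer was consumed */
}

static long supported_features_show(char *buf)
{
	strcpy(buf, "0x0\n");	/* placeholder, not the real feature mask */
	return (long)strlen(buf);
}

static ATTR_WO(add);
static ATTR_RO(supported_features);

/* Look an attribute up by its sysfs-style name. */
static struct bus_attr *find_attr(const char *name)
{
	static struct bus_attr *attrs[] = {
		&bus_attr_add,
		&bus_attr_supported_features,
	};
	size_t i;

	for (i = 0; i < sizeof(attrs) / sizeof(attrs[0]); i++)
		if (strcmp(attrs[i]->name, name) == 0)
			return attrs[i];
	return NULL;
}
```

With this scheme the attribute name alone fixes the mode and the callback name, which is why rbd_add() has to become add_store() and rbd_supported_features_show() becomes supported_features_show().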
[GIT PULL] Ceph fix for 4.20-rc7
Hi Linus, The following changes since commit 40e020c129cfc991e8ab4736d2665351ffd1468d: Linux 4.20-rc6 (2018-12-09 15:31:00 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc7 for you to fetch changes up to 6f9718fe41c3a47e4362bddf145e2db6ad7d8e87: ceph: make 'nocopyfrom' a default mount option (2018-12-11 18:22:17 +0100) Luis discovered a problem with the new copyfrom offload on the server side. Disable it for now. Luis Henriques (1): ceph: make 'nocopyfrom' a default mount option fs/ceph/super.c | 4 ++-- fs/ceph/super.h | 4 +++- 2 files changed, 5 insertions(+), 3 deletions(-)
Re: [PATCH] ceph: make 'nocopyfrom' a default mount option
On Mon, Dec 10, 2018 at 11:23 AM Luis Henriques wrote: > > Since we found a problem with the 'copy-from' operation after objects have > been truncated, offloading object copies to OSDs should be discouraged > until the issue is fixed. > > Thus, this patch adds the 'nocopyfrom' mount option to the default mount > options which effectively means that remote copies won't be done in > copy_file_range unless they are explicitly enabled at mount time. > > Link: https://tracker.ceph.com/issues/37378 > Signed-off-by: Luis Henriques > --- > fs/ceph/super.h | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/fs/ceph/super.h b/fs/ceph/super.h > index c005a5400f2e..79a265ba9200 100644 > --- a/fs/ceph/super.h > +++ b/fs/ceph/super.h > @@ -42,7 +42,9 @@ > #define CEPH_MOUNT_OPT_NOQUOTADF (1<<13) /* no root dir quota in > statfs */ > #define CEPH_MOUNT_OPT_NOCOPYFROM (1<<14) /* don't use RADOS > 'copy-from' op */ > > -#define CEPH_MOUNT_OPT_DEFAULT CEPH_MOUNT_OPT_DCACHE > +#define CEPH_MOUNT_OPT_DEFAULT \ > + (CEPH_MOUNT_OPT_DCACHE | \ > +CEPH_MOUNT_OPT_NOCOPYFROM) > > #define ceph_set_mount_opt(fsc, opt) \ > (fsc)->mount_options->flags |= CEPH_MOUNT_OPT_##opt; Thanks Luis, I'll pick it up for 4.20. Ilya
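The effect of folding NOCOPYFROM into the default mask can be sketched as follows. This is a userspace model with assumptions: bit 14 matches the quoted header, but the DCACHE bit position, the "copyfrom" option string and the two helper functions are hypothetical and only illustrate how a default-on "no" flag is overridden at mount time.

```c
#include <string.h>

/* Bit 14 matches the quoted header; the DCACHE bit position is assumed. */
#define MOUNT_OPT_DCACHE	(1u << 9)
#define MOUNT_OPT_NOCOPYFROM	(1u << 14)

#define MOUNT_OPT_DEFAULT \
	(MOUNT_OPT_DCACHE | \
	 MOUNT_OPT_NOCOPYFROM)

/* Hypothetical parser for the two option strings; returns the new flags. */
static unsigned int apply_mount_opt(unsigned int flags, const char *opt)
{
	if (strcmp(opt, "copyfrom") == 0)
		flags &= ~MOUNT_OPT_NOCOPYFROM;
	else if (strcmp(opt, "nocopyfrom") == 0)
		flags |= MOUNT_OPT_NOCOPYFROM;
	return flags;
}

/* Offloaded copies are allowed only when the "no" bit is clear. */
static int copyfrom_enabled(unsigned int flags)
{
	return !(flags & MOUNT_OPT_NOCOPYFROM);
}
```

Starting from MOUNT_OPT_DEFAULT, the offload stays off until the user explicitly opts in, which matches the "unless they are explicitly enabled at mount time" wording of the commit message.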
Re: [PATCH 4.14 018/146] libceph: implement CEPHX_V2 calculation mode
On Tue, Dec 4, 2018 at 12:01 PM Greg Kroah-Hartman wrote: > > 4.14-stable review patch. If anyone has any objections, please let me know. > > -- > > commit cc255c76c70f7a87d97939621eae04b600d9f4a1 upstream. > > Derive the signature from the entire buffer (both AES cipher blocks) > instead of using just the first half of the first block, leaving out > data_crc entirely. > > This addresses CVE-2018-1129. > > Link: http://tracker.ceph.com/issues/24837 > Signed-off-by: Ilya Dryomov > Reviewed-by: Sage Weil > Signed-off-by: Ben Hutchings > Signed-off-by: Sasha Levin > --- > include/linux/ceph/ceph_features.h | 7 +-- > net/ceph/auth_x.c | 73 +++--- > 2 files changed, 60 insertions(+), 20 deletions(-) > > diff --git a/include/linux/ceph/ceph_features.h > b/include/linux/ceph/ceph_features.h > index 59042d5ac520..70f42eef813b 100644 > --- a/include/linux/ceph/ceph_features.h > +++ b/include/linux/ceph/ceph_features.h > @@ -165,9 +165,9 @@ DEFINE_CEPH_FEATURE(58, 1, FS_FILE_LAYOUT_V2) // overlap > DEFINE_CEPH_FEATURE(59, 1, FS_BTIME) > DEFINE_CEPH_FEATURE(59, 1, FS_CHANGE_ATTR) // overlap > DEFINE_CEPH_FEATURE(59, 1, MSG_ADDR2) // overlap > -DEFINE_CEPH_FEATURE(60, 1, BLKIN_TRACING) // *do not share this bit* > +DEFINE_CEPH_FEATURE(60, 1, OSD_RECOVERY_DELETES) // *do not share this bit* > +DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2) // *do not share this bit* > > -DEFINE_CEPH_FEATURE(61, 1, RESERVED2) // unused, but slow down! 
> DEFINE_CEPH_FEATURE(62, 1, RESERVED) // do not use; used as a > sentinal > DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // > client-facing > > @@ -209,7 +209,8 @@ DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, > LUMINOUS) // client-facin > CEPH_FEATURE_SERVER_JEWEL |\ > CEPH_FEATURE_MON_STATEFUL_SUB |\ > CEPH_FEATURE_CRUSH_TUNABLES5 | \ > -CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING) > +CEPH_FEATURE_NEW_OSDOPREPLY_ENCODING | \ > +CEPH_FEATURE_CEPHX_V2) > > #define CEPH_FEATURES_REQUIRED_DEFAULT \ > (CEPH_FEATURE_NOSRCADDR |\ > diff --git a/net/ceph/auth_x.c b/net/ceph/auth_x.c > index ce28bb07d8fd..10eb759bbcb4 100644 > --- a/net/ceph/auth_x.c > +++ b/net/ceph/auth_x.c > @@ -9,6 +9,7 @@ > > #include > #include > +#include > #include > #include > > @@ -803,26 +804,64 @@ static int calc_signature(struct ceph_x_authorizer *au, > struct ceph_msg *msg, > __le64 *psig) > { > void *enc_buf = au->enc_buf; > - struct { > - __le32 len; > - __le32 header_crc; > - __le32 front_crc; > - __le32 middle_crc; > - __le32 data_crc; > - } __packed *sigblock = enc_buf + ceph_x_encrypt_offset(); > int ret; > > - sigblock->len = cpu_to_le32(4*sizeof(u32)); > - sigblock->header_crc = msg->hdr.crc; > - sigblock->front_crc = msg->footer.front_crc; > - sigblock->middle_crc = msg->footer.middle_crc; > - sigblock->data_crc = msg->footer.data_crc; > - ret = ceph_x_encrypt(&au->session_key, enc_buf, CEPHX_AU_ENC_BUF_LEN, > -sizeof(*sigblock)); > - if (ret < 0) > - return ret; > + if (!CEPH_HAVE_FEATURE(msg->con->peer_features, CEPHX_V2)) { > + struct { > + __le32 len; > + __le32 header_crc; > + __le32 front_crc; > + __le32 middle_crc; > + __le32 data_crc; > + } __packed *sigblock = enc_buf + ceph_x_encrypt_offset(); > + > + sigblock->len = cpu_to_le32(4*sizeof(u32)); > + sigblock->header_crc = msg->hdr.crc; > + sigblock->front_crc = msg->footer.front_crc; > + sigblock->middle_crc = msg->footer.middle_crc; > + sigblock->data_crc = msg->footer.data_crc; > + > + ret = 
ceph_x_encrypt(&au->session_key, enc_buf, > +CEPHX_AU_ENC_BUF_LEN, sizeof(*sigblock)); > + if (ret < 0) > + return ret; > + > + *psig = *(__le64 *)(enc_buf + sizeof(u32)); > + } else { > + struct { > + __le32 header_crc; > + __le32 front_crc; > + __le32 front_len; > + __le32 middle_crc; > +
[GIT PULL] Ceph fix for 4.20-rc4
Hi Linus, The following changes since commit 9ff01193a20d391e8dbce4403dd5ef87c7eaaca6: Linux 4.20-rc3 (2018-11-18 13:33:44 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc4 for you to fetch changes up to 7e241f647dc7087a0401418a187f3f5b527cc690: libceph: fall back to sendmsg for slab pages (2018-11-19 17:59:47 +0100) A messenger fix, marked for stable. Ilya Dryomov (1): libceph: fall back to sendmsg for slab pages net/ceph/messenger.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-)
[GIT PULL] Ceph fixes for 4.20-rc2
Hi Linus, The following changes since commit 651022382c7f8da46cb4872a545ee1da6d097d2a: Linux 4.20-rc1 (2018-11-04 15:37:52 -0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc2 for you to fetch changes up to 23c625ce3065e40c933a4239efb9b11f1194a343: libceph: assume argonaut on the server side (2018-11-08 17:51:11 +0100) Two CephFS fixes (copy_file_range and quota) and a small feature bit cleanup. Ilya Dryomov (1): libceph: assume argonaut on the server side Luis Henriques (2): ceph: add destination file data sync before doing any remote copy ceph: quota: fix null pointer dereference in quota check fs/ceph/file.c | 11 +-- fs/ceph/mds_client.c | 12 +++- fs/ceph/quota.c| 3 ++- include/linux/ceph/ceph_features.h | 8 +--- 4 files changed, 15 insertions(+), 19 deletions(-)
[GIT PULL] Ceph updates for 4.20-rc1
Hi Linus, The following changes since commit 84df9525b0c27f3ebc2ebb1864fa62a97fdedb7d: Linux 4.19 (2018-10-22 07:37:37 +0100) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.20-rc1 for you to fetch changes up to ea4cdc548e5e74a529cdd1aea885d74b4aa8f1b3: ceph: new mount option to disable usage of copy-from op (2018-10-22 10:28:24 +0200) The highlights are: - a series that fixes some old memory allocation issues in libceph (myself). We no longer allocate memory in places where allocation failures cannot be handled and BUG when the allocation fails. - support for copy_file_range() syscall (Luis Henriques). If size and alignment conditions are met, it leverages RADOS copy-from operation. Otherwise, a local copy is performed. - a patch that reduces memory requirement of ceph_sync_read() from the size of the entire read to the size of one object (Zheng Yan). - fallocate() syscall is now restricted to FALLOC_FL_PUNCH_HOLE (Luis Henriques) Chengguang Xu (3): ceph: reset cap hold timeout only for requeued inode rbd: add __init/__exit annotations ceph: check snap first in ceph_set_acl() Ilya Dryomov (12): libceph: bump CEPH_MSG_MAX_DATA_LEN libceph: osd_req_op_cls_init() doesn't need to take opcode libceph: introduce ceph_pagelist_alloc() libceph: don't consume a ref on pagelist in ceph_msg_data_add_pagelist() libceph: no need to call osd_req_opcode_valid() in osd_req_encode_op() ceph: num_ops is off by one in ceph_aio_retry_work() libceph: enable fallback to ceph_msg_new() in ceph_msgpool_get() libceph: assign cookies in linger_submit() libceph: introduce alloc_watch_request() libceph, rbd, ceph: move ceph_osdc_alloc_messages() calls libceph: preallocate message data items libceph: check reply num_data_items in setup_request_data() Luis Henriques (5): ceph: only allow punch hole mode in fallocate ceph: add non-blocking parameter to ceph_try_get_caps() libceph: support the RADOS copy-from operation ceph: support copy_file_range 
file operation ceph: new mount option to disable usage of copy-from op Xuehan Xu (1): ceph: set timeout conditionally in __cap_delay_requeue Yan, Zheng (4): Revert "ceph: fix dentry leak in splice_dentry()" ceph: fix dentry leak in ceph_readdir_prepopulate ceph: check if LOOKUPNAME request was aborted when filling trace ceph: refactor ceph_sync_read() Documentation/filesystems/ceph.txt | 5 + drivers/block/rbd.c| 28 +- fs/ceph/acl.c | 13 +- fs/ceph/addr.c | 2 +- fs/ceph/caps.c | 21 +- fs/ceph/file.c | 573 +++-- fs/ceph/inode.c| 13 +- fs/ceph/mds_client.c | 9 +- fs/ceph/super.c| 13 + fs/ceph/super.h| 3 +- fs/ceph/xattr.c| 3 +- include/linux/ceph/libceph.h | 8 +- include/linux/ceph/messenger.h | 24 +- include/linux/ceph/msgpool.h | 11 +- include/linux/ceph/osd_client.h| 22 +- include/linux/ceph/pagelist.h | 11 +- include/linux/ceph/rados.h | 28 ++ net/ceph/messenger.c | 107 +++ net/ceph/msgpool.c | 27 +- net/ceph/osd_client.c | 363 +-- net/ceph/pagelist.c| 20 ++ 21 files changed, 900 insertions(+), 404 deletions(-)
Re: [PATCH] ceph: only allow punch hole mode in fallocate
On Wed, Oct 10, 2018 at 1:19 PM Luis Henriques wrote: > > Ilya Dryomov writes: > > > On Wed, Oct 10, 2018 at 6:21 AM Yan, Zheng wrote: > >> > >> On Wed, Oct 10, 2018 at 1:54 AM Luis Henriques wrote: > > > > >> Applied, thanks > > > > I don't think it should go to stable kernels. Strictly speaking it's > > a behaviour change -- it's been this way for many years and, unless you > > are close to ENOSPC, it sort of appears to work. I'll take off the > > stable tag unless I hear objections. > > Right, it can in fact break applications that rely on the previous > (bogus) behaviour. But it can also be claimed that it *will* break > applications anyway with an updated kernel, so backporting it to older > kernels will just allow a consistent behaviour. > > Anyway, I'm OK either way. But if you drop the stable tag make sure you > also remove the 'Fixes:' tag as I believe the stable folks will still > pick this patch if it includes a valid SHA1 in it. Yeah, we've run into this in the past. Thanks, Ilya
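For context on the behaviour change debated in this thread, the two checks the patch introduces (exact-match on the mode, then clamping the hole to EOF) can be sketched in plain C. This is a simplified model under stated assumptions, not the kernel function: ex_punch_range() and the EX_ constants are hypothetical names, the constants carry the values of FALLOC_FL_KEEP_SIZE and FALLOC_FL_PUNCH_HOLE from linux/falloc.h, and a clamped length of 0 stands in for the patch's early "goto unlock".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same values as FALLOC_FL_KEEP_SIZE / FALLOC_FL_PUNCH_HOLE. */
#define EX_FL_KEEP_SIZE  0x01
#define EX_FL_PUNCH_HOLE 0x02

/*
 * Hypothetical model of the patched checks:
 *  - any mode other than exactly KEEP_SIZE|PUNCH_HOLE is rejected;
 *  - a hole starting at or beyond EOF is a no-op (*length set to 0);
 *  - a hole crossing EOF is trimmed so that it ends at EOF.
 * Returns 0 on success or -EOPNOTSUPP.
 */
static int ex_punch_range(int mode, long long offset, long long *length,
			  long long size)
{
	assert(length != NULL);

	if (mode != (EX_FL_KEEP_SIZE | EX_FL_PUNCH_HOLE))
		return -EOPNOTSUPP;

	if (offset >= size) {
		*length = 0;		/* nothing to zero beyond EOF */
		return 0;
	}
	if (offset + *length > size)
		*length = size - offset;	/* trim the hole at EOF */

	return 0;
}
```

The difference from the old check is the "!=" comparison: the previous code masked off the two known flags and accepted any combination of them, including plain preallocation (mode 0), which is the case that could not actually reserve space in the cluster.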
Re: [PATCH] ceph: only allow punch hole mode in fallocate
On Wed, Oct 10, 2018 at 6:21 AM Yan, Zheng wrote: > > On Wed, Oct 10, 2018 at 1:54 AM Luis Henriques wrote: > > > > Current implementation of cephfs fallocate isn't correct as it doesn't > > really reserve the space in the cluster, which means that a subsequent > > call to a write may actually fail due to lack of space. In fact, it is > > currently possible to fallocate an amount space that is larger than the > > free space in the cluster. > > > > Since there's no easy solution to fix this at the moment, this patch > > simply removes support for all fallocate operations but > > FALLOC_FL_PUNCH_HOLE (which implies FALLOC_FL_KEEP_SIZE). > > > > Link: https://tracker.ceph.com/issues/36317 > > Cc: sta...@vger.kernel.org > > Fixes: ad7a60de882a ("ceph: punch hole support") > > Signed-off-by: Luis Henriques > > --- > > fs/ceph/file.c | 45 + > > 1 file changed, 9 insertions(+), 36 deletions(-) > > > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c > > index 92ab20433682..91a7ad259bcf 100644 > > --- a/fs/ceph/file.c > > +++ b/fs/ceph/file.c > > @@ -1735,7 +1735,6 @@ static long ceph_fallocate(struct file *file, int > > mode, > > struct ceph_file_info *fi = file->private_data; > > struct inode *inode = file_inode(file); > > struct ceph_inode_info *ci = ceph_inode(inode); > > - struct ceph_fs_client *fsc = ceph_inode_to_client(inode); > > struct ceph_cap_flush *prealloc_cf; > > int want, got = 0; > > int dirty; > > @@ -1743,10 +1742,7 @@ static long ceph_fallocate(struct file *file, int > > mode, > > loff_t endoff = 0; > > loff_t size; > > > > - if ((offset + length) > max(i_size_read(inode), fsc->max_file_size)) > > - return -EFBIG; > > - > > - if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) > > + if (mode != (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) > > return -EOPNOTSUPP; > > > > if (!S_ISREG(inode->i_mode)) > > @@ -1763,18 +1759,6 @@ static long ceph_fallocate(struct file *file, int > > mode, > > goto unlock; > > } > > > > - if (!(mode & (FALLOC_FL_PUNCH_HOLE 
| FALLOC_FL_KEEP_SIZE)) && > > - ceph_quota_is_max_bytes_exceeded(inode, offset + length)) { > > - ret = -EDQUOT; > > - goto unlock; > > - } > > - > > - if (ceph_osdmap_flag(&fsc->client->osdc, CEPH_OSDMAP_FULL) && > > - !(mode & FALLOC_FL_PUNCH_HOLE)) { > > - ret = -ENOSPC; > > - goto unlock; > > - } > > - > > if (ci->i_inline_version != CEPH_INLINE_NONE) { > > ret = ceph_uninline_data(file, NULL); > > if (ret < 0) > > @@ -1782,12 +1766,12 @@ static long ceph_fallocate(struct file *file, int > > mode, > > } > > > > size = i_size_read(inode); > > - if (!(mode & FALLOC_FL_KEEP_SIZE)) { > > - endoff = offset + length; > > - ret = inode_newsize_ok(inode, endoff); > > - if (ret) > > - goto unlock; > > - } > > + > > + /* Are we punching a hole beyond EOF? */ > > + if (offset >= size) > > + goto unlock; > > + if ((offset + length) > size) > > + length = size - offset; > > > > if (fi->fmode & CEPH_FILE_MODE_LAZY) > > want = CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO; > > @@ -1798,16 +1782,8 @@ static long ceph_fallocate(struct file *file, int > > mode, > > if (ret < 0) > > goto unlock; > > > > - if (mode & FALLOC_FL_PUNCH_HOLE) { > > - if (offset < size) > > - ceph_zero_pagecache_range(inode, offset, length); > > - ret = ceph_zero_objects(inode, offset, length); > > - } else if (endoff > size) { > > - truncate_pagecache_range(inode, size, -1); > > - if (ceph_inode_set_size(inode, endoff)) > > - ceph_check_caps(ceph_inode(inode), > > - CHECK_CAPS_AUTHONLY, NULL); > > - } > > + ceph_zero_pagecache_range(inode, offset, length); > > + ret = ceph_zero_objects(inode, offset, length); > > > > if (!ret) { > > spin_lock(&ci->i_ceph_lock); > > @@ -1817,9 +1793,6 @@ static long ceph_fallocate(struct file *file, int > > mode, > > spin_unlock(&ci->i_ceph_lock); > > if (dirty) > > __mark_inode_dirty(inode, dirty); > > - if ((endoff > size) && > > - ceph_quota_is_max_bytes_approaching(inode, endoff)) > > - ceph_check_caps(ci, CHECK_CAPS_NODELAY, NULL); > > } > > > > 
ceph_put_cap_refs(ci, got); > > Applied, thanks I don't think it should go to stable kernels. Strictly speaking it's a behaviour change -- it's been this way for many years an
Re: [PATCH] ceph: use an enum instead of 'static const' to define constants
On Mon, Oct 8, 2018 at 5:37 PM Arnd Bergmann wrote: > > On Mon, Oct 8, 2018 at 4:23 PM Ilya Dryomov wrote: > > On Fri, Oct 5, 2018 at 6:18 PM Arnd Bergmann wrote: > > > @@ -71,7 +71,7 @@ > > > * This ensures that no two versions who have different meanings for > > > * the bit ever speak to each other. > > > */ > > > - > > > +enum ceph_features { > > > DEFINE_CEPH_FEATURE( 0, 1, UID) > > > DEFINE_CEPH_FEATURE( 1, 1, NOSRCADDR) > > > DEFINE_CEPH_FEATURE_RETIRED( 2, 1, MONCLOCKCHECK, JEWEL, LUMINOUS) > > > @@ -170,13 +170,13 @@ DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2) // > > > *do not share this bit* > > > > > > DEFINE_CEPH_FEATURE(62, 1, RESERVED) // do not use; used as a > > > sentinal > > > DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // > > > client-facing > > > - > > > +}; > > > > I don't particularly like this because it looks like lower constants > > are actually ints and the rest are unsigned longs, even though they all > > have ULL suffixes. The standard seems to require that enum constants > > be representable as ints, is the non-pedantic behaviour documented > > somewhere? > > I had not realized that this is a gcc extension, or that it behaves slightly > differently from the standard C++ behavior that apparently adopted a > saner variant (all values in an enum have the same type). > > How about we just add a __maybe_unused to DEFINE_CEPH_FEATURE > then to shut up the warning? Fine with me. Thanks, Ilya
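A rough sketch of the alternative agreed on above, i.e. keeping the 'static const' definitions and marking them as possibly unused. It assumes gcc or clang; EX_MAYBE_UNUSED is a local stand-in for the kernel's __maybe_unused, the EX_ macros are simplified hypothetical versions that drop the incarnation argument, and only the bit numbers 0 (UID) and 61 (CEPHX_V2) come from the quoted header.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for the kernel's __maybe_unused (gcc/clang attribute). */
#define EX_MAYBE_UNUSED __attribute__((__unused__))

/*
 * Simplified feature definition: one uint64_t constant per feature.
 * The attribute silences -Wunused-const-variable for files that
 * include the header without using a given constant.
 */
#define EX_DEFINE_FEATURE(bit, name) \
	static const uint64_t EX_FEATURE_##name EX_MAYBE_UNUSED = (1ULL << (bit));

EX_DEFINE_FEATURE(0, UID)
EX_DEFINE_FEATURE(61, CEPHX_V2)

/* Simplified feature test: true if every bit of the feature is set in x. */
#define EX_HAVE_FEATURE(x, name) \
	(((x) & EX_FEATURE_##name) == EX_FEATURE_##name)
```

Unlike the enum variant, every constant here has the same 64-bit type regardless of its value, so sizeof(EX_FEATURE_UID) and sizeof(EX_FEATURE_CEPHX_V2) agree and nothing depends on how the compiler types large enumerators.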
Re: [PATCH] ceph: use an enum instead of 'static const' to define constants
On Fri, Oct 5, 2018 at 6:18 PM Arnd Bergmann wrote: > > Building with W=1 produces lots of warnings for files including > ceph_features.h: > > include/linux/ceph/ceph_features.h:15:24: error: 'CEPH_FEATUREMASK_SERVER_M' > defined but not used [-Werror=unused-const-variable=] > > The normal way to define compile-time constants in the kernel is > to use either macros or enums, and gcc does not warn about those. > > Converting to an enum is simple here and means we can still use > the names while debugging. > > Signed-off-by: Arnd Bergmann > --- > include/linux/ceph/ceph_features.h | 20 ++-- > 1 file changed, 10 insertions(+), 10 deletions(-) > > diff --git a/include/linux/ceph/ceph_features.h > b/include/linux/ceph/ceph_features.h > index 6b92b3395fa9..676908eca060 100644 > --- a/include/linux/ceph/ceph_features.h > +++ b/include/linux/ceph/ceph_features.h > @@ -11,15 +11,15 @@ > #define CEPH_FEATURE_INCARNATION_2 (1ull<<57) // CEPH_FEATURE_SERVER_JEWEL > > #define DEFINE_CEPH_FEATURE(bit, incarnation, name)\ > - static const uint64_t CEPH_FEATURE_##name = (1ULL< \ > - static const uint64_t CEPH_FEATUREMASK_##name = \ > - (1ULL< + CEPH_FEATURE_##name = (1ULL< + CEPH_FEATUREMASK_##name = \ > + (1ULL< > /* this bit is ignored but still advertised by release *when* */ > -#define DEFINE_CEPH_FEATURE_DEPRECATED(bit, incarnation, name, when) \ > - static const uint64_t DEPRECATED_CEPH_FEATURE_##name = (1ULL< - static const uint64_t DEPRECATED_CEPH_FEATUREMASK_##name = > \ > - (1ULL< +#define DEFINE_CEPH_FEATURE_DEPRECATED(bit, incarnation, name, when) \ > + DEPRECATED_CEPH_FEATURE_##name = (1ULL< + DEPRECATED_CEPH_FEATUREMASK_##name =\ > + (1ULL< > /* > * this bit is ignored by release *unused* and not advertised by > @@ -71,7 +71,7 @@ > * This ensures that no two versions who have different meanings for > * the bit ever speak to each other. 
> */ > - > +enum ceph_features { > DEFINE_CEPH_FEATURE( 0, 1, UID) > DEFINE_CEPH_FEATURE( 1, 1, NOSRCADDR) > DEFINE_CEPH_FEATURE_RETIRED( 2, 1, MONCLOCKCHECK, JEWEL, LUMINOUS) > @@ -170,13 +170,13 @@ DEFINE_CEPH_FEATURE(61, 1, CEPHX_V2) // *do > not share this bit* > > DEFINE_CEPH_FEATURE(62, 1, RESERVED) // do not use; used as a > sentinal > DEFINE_CEPH_FEATURE_DEPRECATED(63, 1, RESERVED_BROKEN, LUMINOUS) // > client-facing > - > +}; I don't particularly like this because it looks like lower constants are actually ints and the rest are unsigned longs, even though they all have ULL suffixes. The standard seems to require that enum constants be representable as ints, is the non-pedantic behaviour documented somewhere? Thanks, Ilya
[GIT PULL] Ceph updates for 4.19-rc3
Hi Linus, The following changes since commit 57361846b52bc686112da6ca5368d11210796804: Linux 4.19-rc2 (2018-09-02 14:37:30 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.19-rc3 for you to fetch changes up to e92c0eaf754310f9f31e9229a3f7274a67478f82: rbd: support cloning across namespaces (2018-09-06 16:18:04 +0200) Two rbd patches to complete support for images within namespaces that went into -rc1 and a use-after-free fix. The rbd changes have been sitting in a branch for quite a while but couldn't be included into the -rc1 pull request because of a pending wire protocol backwards compatibility fixup that only got committed early this week. Said fixup ended up being really trivial -- just an extra byte added, so I decided to send these changes for -rc3. If it's too late in the cycle for this follow-up to be pulled, let me know and I'll send the use-after-free fix separately; we will have the necessary stop gaps on the server side to prevent the current 4.19 code from doing anything unexpected. Ilya Dryomov (3): ceph: avoid a use-after-free in ceph_destroy_options() rbd: factor out get_parent_info() rbd: support cloning across namespaces drivers/block/rbd.c | 235 +++- fs/ceph/super.c | 16 ++-- 2 files changed, 189 insertions(+), 62 deletions(-)
[GIT PULL] Ceph updates for 4.19-rc1
Hi Linus, The following changes since commit acb1872577b346bd15ab3a3f8dff780d6cca4b70: Linux 4.18-rc7 (2018-07-29 14:44:52 -0700) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.19-rc1 for you to fetch changes up to 0fcf6c02b205f80f24eb548b236543ec151cb01c: ceph: don't drop message if it contains more data than expected (2018-08-13 17:55:44 +0200) The main things are support for cephx v2 authentication protocol and basic support for rbd images within namespaces (myself). Also included y2038 conversion patches from Arnd, a pile of miscellaneous fixes from Chengguang and Zheng's feature bit infrastructure for the filesystem. Arnd Bergmann (5): libceph: use timespec64 in for keepalive2 and ticket validity ceph: stop using current_kernel_time() ceph: use timespec64 for inode timestamp libceph: use timespec64 for r_mtime ceph: use timespec64 for r_stamp Chengguang Xu (14): ceph: add retry logic for error -ERANGE in ceph_get_acl() ceph: restore ctime as well in the case of restoring old mode libceph: stop parsing when a bad int arg is detected ceph: return errors from posix_acl_equiv_mode() correctly ceph: add d_drop for some error cases in ceph_mknod() ceph: add d_drop for some error cases in ceph_symlink() ceph: add new field max_file_size in ceph_fs_client ceph: add additional range check in ceph_fallocate() ceph: add additional offset check in ceph_write_iter() ceph: add additional size check in ceph_setattr() ceph: compare fsc->max_file_size and inode->i_size for max file size limit ceph: change to void return type for __do_request() ceph: refactor ceph_unreserve_caps() ceph: refactor error handling code in ceph_reserve_caps() Ilya Dryomov (14): libceph: make ceph_osdc_notify{,_ack}() payload_len u32 libceph: change ceph_pagelist_encode_string() to take u32 libceph: amend "bad option arg" error message rbd: pass rbd_spec into parse_rbd_opts_token() rbd: support for images within namespaces libceph: remove now unused 
ceph_{en,de}code_timespec() libceph: store ceph_auth_handshake pointer in ceph_connection libceph: factor out __prepare_write_connect() libceph: factor out __ceph_x_decrypt() libceph: factor out encrypt_authorizer() libceph: add authorizer challenge libceph: implement CEPHX_V2 calculation mode libceph: check authorizer reply/challenge length before reading libceph: weaken sizeof check in ceph_x_verify_authorizer_reply() Souptick Joarder (1): ceph: adding new return type vm_fault_t Stephen Hemminger (1): ceph: fix whitespace Yan, Zheng (3): ceph: fix incorrect use of strncpy ceph: support cephfs' own feature bits ceph: don't drop message if it contains more data than expected YueHaibing (2): libceph: remove unnecessary non NULL check for request_key crush: fix using plain integer as NULL warning drivers/block/rbd.c| 125 +-- fs/ceph/acl.c | 30 +++-- fs/ceph/addr.c | 74 ++-- fs/ceph/cache.c| 11 +- fs/ceph/caps.c | 138 ++--- fs/ceph/dir.c | 20 ++-- fs/ceph/file.c | 34 -- fs/ceph/inode.c| 83 ++--- fs/ceph/mds_client.c | 98 ++- fs/ceph/mds_client.h | 14 ++- fs/ceph/quota.c| 2 +- fs/ceph/snap.c | 6 +- fs/ceph/super.c| 6 +- fs/ceph/super.h| 12 +- fs/ceph/xattr.c| 4 +- include/linux/ceph/auth.h | 8 ++ include/linux/ceph/ceph_features.h | 7 +- include/linux/ceph/decode.h| 18 ++- include/linux/ceph/messenger.h | 8 +- include/linux/ceph/msgr.h | 2 +- include/linux/ceph/osd_client.h| 10 +- include/linux/ceph/pagelist.h | 2 +- net/ceph/Kconfig | 1 - net/ceph/Makefile | 1 - net/ceph/auth.c| 16 +++ net/ceph/auth_none.c | 1 - net/ceph/auth_none.h | 1 - net/ceph/auth_x.c | 239 + net/ceph/auth_x.h | 3 +- net/ceph/auth_x_protocol.h | 7 ++ net/ceph/ceph_common.c | 13 +- net/ceph/cls_lock_client.c | 4 +- net/ceph/crush/mapper.c| 4 +- net/ceph/messenger.c | 113 +++--- net/ceph/mon_client.c | 2 +- net/ceph/osd_client.c | 27 +++-- net/ceph/p
Re: Warning when using eMMC and partprobe: generic_make_request: Trying to write to read-only block-device
On Tue, Aug 14, 2018 at 4:41 PM Stefan Agner wrote: > > Hi, > > Using Linux 4.18 on a i.MX 6Q I see the following warning during > boot-up: > > [ 23.928916] [ cut here ] > [ 23.933795] WARNING: CPU: 1 PID: 527 at block/blk-core.c:2161 > generic_make_request_checks+0x868/0xa18 > [ 23.943306] generic_make_request: Trying to write to read-only > block-device mmcblk2boot0 (partno 0) > [ 23.952569] Modules linked in: joydev flexcan can_dev coda imx_vdoa > v4l2_mem2mem videobuf2_vmalloc dw_hdmi_ahb_audio evbug nhc_mobility > nhc_hop nhc_routing nhc_ipv6 nhc_dest nhc_fragment nhc_udp fuse > bluetooth_6lowpan 6lowpan > [ 23.973115] CPU: 1 PID: 527 Comm: partprobe Not tainted 4.18.0 #1 > [ 23.979336] Hardware name: Freescale i.MX6 Quad/DualLite (Device > Tree) > [ 23.985984] Backtrace: > [ 23.988513] [] (dump_backtrace) from [] > (show_stack+0x18/0x1c) > [ 23.996231] r7: r6:60060013 r5: r4:c118ca44 > [ 24.002009] [] (show_stack) from [] > (dump_stack+0xb4/0xec) > [ 24.009377] [] (dump_stack) from [] > (__warn+0xc4/0x108) > [ 24.016471] r10:c1108908 r9:c04864f0 r8:0871 r7:c0ea02dc > r6:0009 r5: > [ 24.024447] r4:d73d1d1c r3:abf2ba7b > [ 24.028111] [] (__warn) from [] > (warn_slowpath_fmt+0x4c/0x6c) > [ 24.035741] r9:d73d r8:c01011e4 r7:c04873cc r6:d6f27400 > r5:c0ea05b0 r4:c1108908 > [ 24.043641] [] (warn_slowpath_fmt) from [] > (generic_make_request_checks+0x868/0xa18) > [ 24.053296] r3:d73d1d74 r2:c0ea05b0 > [ 24.058425] r5:d6d4d0a0 r4:d6f94240 > [ 24.063538] [] (generic_make_request_checks) from > [] (generic_make_request+0xc0/0x480) > [ 24.076301] r10:d6f94240 r9:d73d r8:c01011e4 r7:c1108908 > r6:d73d1e98 r5:c1108908 > [ 24.087212] r4:d6d4d0a0 > [ 24.091251] [] (generic_make_request) from [] > (submit_bio+0x38/0x19c) > [ 24.102438] r10:0076 r9:d73d r8:c01011e4 r7:7fff > r6:d73d1e98 r5:c1108908 > [ 24.113265] r4:d6f94240 > [ 24.117278] [] (submit_bio) from [] > (submit_bio_wait+0x5c/0x98) > [ 24.127914] r10:0076 r9:d73d r8:c01011e4 r7:7fff > r6:d73d1e98 r5:c1108908 > [ 
24.138724] r4:d6f94240 > [ 24.142739] [] (submit_bio_wait) from [] > (blkdev_issue_flush+0x80/0xb0) > [ 24.154038] r6: r5:d4164340 r4:d6f94240 > [ 24.160143] [] (blkdev_issue_flush) from [] > (blkdev_fsync+0x3c/0x54) > [ 24.171143] r7:7fff r6:d4164428 r5:7fff r4: > [ 24.178322] [] (blkdev_fsync) from [] > (vfs_fsync_range+0x44/0x84) > [ 24.189080] r6: r5: r4:d66ce000 > [ 24.195173] [] (vfs_fsync_range) from [] > (do_fsync+0x44/0x78) > [ 24.205530] r7:0076 r6: r5:d66ce000 r4:d66ce000 > [ 24.212672] [] (do_fsync) from [] > (sys_fsync+0x14/0x18) > [ 24.221140] r6:01320248 r5:0001 r4:013201d0 > [ 24.227266] [] (sys_fsync) from [] > (ret_fast_syscall+0x0/0x28) > [ 24.237781] Exception stack(0xd73d1fa8 to 0xd73d1ff0) > [ 24.244320] 1fa0: 013201d0 0001 0004 > be863b1c 0064 > [ 24.255437] 1fc0: 013201d0 0001 01320248 0076 b6ef8aec > b6ef4c04 b6f37fa4 > [ 24.266593] 1fe0: 0076 be863a00 b6e61faf b6de8306 > [ 24.273302] irq event stamp: 13037 > [ 24.278202] hardirqs last enabled at (13045): [] > console_unlock+0x3e0/0x4e4 > [ 24.289163] hardirqs last disabled at (13054): [] > console_unlock+0x80/0x4e4 > [ 24.300085] softirqs last enabled at (12880): [] > __do_softirq+0x224/0x2d0 > [ 24.310934] softirqs last disabled at (12833): [] > irq_exit+0xc8/0x1ac > [ 24.321438] ---[ end trace 05a4aba40df38a0c ]--- > > The system I am using calls partprobe for some reason, which causes the > stack > trace to appear. > > The mmcblkXbootY partitions are hardware partitions on eMMC devices > which are > by default set to read only. Partition probing should really not lead to > a > write as far as I can tell... > > strace shows what partprobe is actually doing: > > ... > openat(AT_FDCWD, "/dev/mmcblk2boot0", O_RDONLY|O_LARGEFILE) = 4 > ... > ioctl(4, BLKFLSBUF) = 0 > ... > ioctl(4, BLKSSZGET, [512]) = 0 > fadvise64_64(4, 0, 0, POSIX_FADV_RANDOM) = 0 > fstat64(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(179, 64), ...}) = 0 > ioctl(4, BLKGETSIZE64, [2097152]) = 0 > ... 
> ioctl(4, CDROM_GET_CAPABILITY, 0) = -1 EINVAL (Invalid argument) > ioctl(4, BLKALIGNOFF, [0]) = 0 > ioctl(4, BLKIOMIN, [512]) = 0 > ioctl(4, BLKIOOPT, [0]) = 0 > ioctl(4, BLKPBSZGET, [512]) = 0 > ioctl(4, BLKSSZGET, [512]) = 0 > ioctl(4, BLKGETSIZE64, [2097152]) = 0 > ioctl(4, HDIO_GETGEO, {heads=4, sectors=16, cylinders=64, start=0}) = 0 > fsync(4)= 0 > close(4)= 0 > ... > > Any idea? Looks like it's coming from that fsync(): sys_fsync do_fsync vfs_fsync_range blkdev_fsync blkdev_issue_
[GIT PULL] Ceph fix for 4.18-rc3
Hi Linus, The following changes since commit 7daf201d7fe8334e2d2364d4e8ed3394ec9af819: Linux 4.18-rc2 (2018-06-24 20:54:29 +0800) are available in the Git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.18-rc3 for you to fetch changes up to 8b8f53af1ed9df88a4c0fbfdf3db58f62060edf3: ceph: fix dentry leak in splice_dentry() (2018-06-26 18:42:44 +0200) A trivial dentry leak fix from Zheng. Yan, Zheng (1): ceph: fix dentry leak in splice_dentry() fs/ceph/inode.c | 1 + 1 file changed, 1 insertion(+)
[GIT PULL] Ceph updates for 4.18-rc1
Hi Linus, The following changes since commit 29dcea88779c856c7dc92040a0c01233263101d4: Linux 4.17 (2018-06-03 14:15:21 -0700) are available in the git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.18-rc1 for you to fetch changes up to 23edca864951250af845a11da86bb3ea63522ed2: rbd: flush rbd_dev->watch_dwork after watch is unregistered (2018-06-04 20:46:02 +0200) The main piece is a set of libceph changes that revamps how OSD requests are aborted, improving CephFS ENOSPC handling and making "umount -f" actually work (Zheng and myself). The rest is mostly mount option handling cleanups from Chengguang and assorted fixes from Zheng, Luis and Dongsheng. Chengguang Xu (5): libceph, rbd: add error handling for osd_req_op_cls_init() ceph: fix alignment of rasize ceph: strengthen rsize/wsize/readdir_max_bytes validation ceph: show ino32 if the value is different with default ceph: update description of some mount options Dongsheng Yang (1): rbd: flush rbd_dev->watch_dwork after watch is unregistered Ilya Dryomov (13): libceph: get rid of more_kvec in try_write() libceph: use MSG_TRUNC for discarding received bytes ceph: show wsize only if non-default libceph: introduce ceph_osdc_abort_requests() libceph: no need to call flush_workqueue() before destruction libceph: move more code into __complete_request() libceph: defer __complete_request() to a workqueue libceph: use for_each_request() in ceph_osdc_abort_on_full() libceph: don't warn if req->r_abort_on_full is set libceph: avoid a use-after-free during map check libceph: don't abort reads in ceph_osdc_abort_on_full() libceph: make abort_on_full a per-osdc setting libceph: allocate the locator string with GFP_NOFAIL Luis Henriques (2): ceph: fix st_nlink stat for directories ceph: fix use-after-free in ceph_statfs() Yan, Zheng (10): ceph: use bit flags to define vxattr attributes ceph: always get rstat from auth mds ceph: update i_files/i_subdirs only when Fs cap is issued ceph: define argument 
structure for handle_cap_grant ceph: handle the new nfiles/nsubdirs fields in cap message ceph: support file lock on directory ceph: abort osd requests on force umount ceph: flush pending works before shutdown super ceph: fix wrong check for the case of updating link count ceph: prevent i_version from going back Documentation/filesystems/ceph.txt | 8 +- drivers/block/rbd.c| 11 +- fs/ceph/addr.c | 1 - fs/ceph/caps.c | 160 --- fs/ceph/dir.c | 2 + fs/ceph/file.c | 1 - fs/ceph/inode.c| 67 +++- fs/ceph/super.c| 35 -- fs/ceph/xattr.c| 60 ++- include/linux/ceph/ceph_fs.h | 1 + include/linux/ceph/osd_client.h| 8 +- include/linux/ceph/osdmap.h| 8 +- net/ceph/messenger.c | 31 ++ net/ceph/osd_client.c | 216 ++--- net/ceph/osdmap.c | 19 ++-- 15 files changed, 372 insertions(+), 256 deletions(-)
[GIT PULL] Ceph fixes for 4.17-rc5
Hi Linus, The following changes since commit 75bc37fefc4471e718ba8e651aa74673d4e0a9eb: Linux 4.17-rc4 (2018-05-06 16:57:38 -1000) are available in the git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.17-rc5 for you to fetch changes up to fc218544fbc800d1c91348ec834cacfb257348f7: ceph: fix iov_iter issues in ceph_direct_read_write() (2018-05-10 10:15:12 +0200) These patches fix two long-standing bugs in the DIO code path, one of which is a crash trivially triggerable with splice(). Ilya Dryomov (3): ceph: fix rsize/wsize capping in ceph_direct_read_write() libceph: add osd_req_op_extent_osd_data_bvecs() ceph: fix iov_iter issues in ceph_direct_read_write() drivers/block/rbd.c | 4 +- fs/ceph/file.c | 205 include/linux/ceph/osd_client.h | 12 ++- net/ceph/osd_client.c | 27 +- 4 files changed, 158 insertions(+), 90 deletions(-)
[PATCH 2/2] iov_iter: fix memory leak in pipe_get_pages_alloc()
Make n signed to avoid leaking the pages array if __pipe_get_pages() fails. Signed-off-by: Ilya Dryomov --- lib/iov_iter.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 4d5bf40d399d..fdae394172fa 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -1102,7 +1102,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i, size_t *start) { struct page **p; - size_t n; + ssize_t n; int idx; int npages; -- 2.4.3
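To illustrate why the signedness of n matters, here is a small standalone sketch. It assumes the caller decides whether to keep or free the pages array with an "n > 0" test, which is not visible in the hunk above; ex_get_pages_fail() is a hypothetical stand-in for a failing __pipe_get_pages().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical stand-in for a helper that fails with -EFAULT. */
static ssize_t ex_get_pages_fail(void)
{
	return -EFAULT;
}

/*
 * Unsigned n: the negative error wraps to a huge positive value,
 * so "n > 0" is true and the error path (free the array) is skipped.
 * Returns whether the array would have been freed.
 */
static bool ex_frees_on_error_unsigned(void)
{
	size_t n = (size_t)ex_get_pages_fail();

	return !(n > 0);
}

/*
 * Signed n: the error stays negative, "n > 0" is false and the
 * array is freed as intended.
 */
static bool ex_frees_on_error_signed(void)
{
	ssize_t n = ex_get_pages_fail();

	return !(n > 0);
}
```

Under that assumption the unsigned variant reports that nothing is freed on failure, while the caller still receives a negative return value and has no reason to free the array itself, which is how the leak would arise.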
[PATCH 1/2] iov_iter: fix return type of __pipe_get_pages()
It returns -EFAULT and happens to be a helper for pipe_get_pages() whose return type is ssize_t. Signed-off-by: Ilya Dryomov --- lib/iov_iter.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/iov_iter.c b/lib/iov_iter.c index 970212670b6a..4d5bf40d399d 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -1012,7 +1012,7 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i) } EXPORT_SYMBOL(iov_iter_gap_alignment); -static inline size_t __pipe_get_pages(struct iov_iter *i, +static inline ssize_t __pipe_get_pages(struct iov_iter *i, size_t maxsize, struct page **pages, int idx, -- 2.4.3
[GIT PULL] Ceph fixes for 4.17-rc3
Hi Linus, The following changes since commit 6d08b06e67cd117f6992c46611dfb4ce267cd71e: Linux 4.17-rc2 (2018-04-22 19:20:09 -0700) are available in the git repository at: https://github.com/ceph/ceph-client.git tags/ceph-for-4.17-rc3 for you to fetch changes up to 9c55ad1c214d9f8c4594ac2c3fa392c1c32431a7: libceph: validate con->state at the top of try_write() (2018-04-26 17:39:08 +0200) A CephFS quota follow-up and fixes for two older issues in the messenger layer, marked for stable. Ilya Dryomov (3): libceph: un-backoff on tick when we have a authenticated session libceph: reschedule a tick in finish_hunting() libceph: validate con->state at the top of try_write() Yan, Zheng (1): ceph: check if mds create snaprealm when setting quota fs/ceph/xattr.c | 28 +--- net/ceph/messenger.c | 7 +++ net/ceph/mon_client.c | 14 +++--- 3 files changed, 43 insertions(+), 6 deletions(-)
Re: [4.4,50/97] ext4: add validity checks for bitmap block numbers -- regression?
Hi Greg, Commit 7dac4a1726a9 ("ext4: add validity checks for bitmap block numbers") seems to be the cause of the regression reported here: https://marc.info/?l=linux-ext4&m=152416385122029&w=2 ext4 folks are probably busy at LSF, so no reply yet. Should this commit be held until we get word from Ted? Please excuse broken threading. Thanks, Ilya