Re: [GIT PULL] Please pull NFS client changes for 5.11

2020-12-17 Thread Trond Myklebust
On Thu, 2020-12-17 at 12:25 -0800, Linus Torvalds wrote:
> On Thu, Dec 17, 2020 at 10:05 AM Trond Myklebust
>  wrote:
> > 
> >   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-
> > for-
> > 5.11-1
> > 
> > for you to fetch changes up to
> > 52104f274e2d7f134d34bab11cada8913d4544e2:
> 
> pr-tracker-bot doesn't seem to have reacted to this email.
> 
> I suspect it's because of the odd and unfortunate line breaks in it,
> but who knows. Maybe it's just delayed (but the other filesystem
> pulls
> seem to have been tracked).
> 
> Anyway, it's pulled, even if the automation seems to not have
> noticed.
> 
>     Linus

Thanks! Much appreciated.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.mykleb...@hammerspace.com




Re: [GIT PULL] Please pull NFS client changes for 5.11

2020-12-17 Thread Linus Torvalds
On Thu, Dec 17, 2020 at 10:05 AM Trond Myklebust
 wrote:
>
>   git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-
> 5.11-1
>
> for you to fetch changes up to
> 52104f274e2d7f134d34bab11cada8913d4544e2:

pr-tracker-bot doesn't seem to have reacted to this email.

I suspect it's because of the odd and unfortunate line breaks in it,
but who knows. Maybe it's just delayed (but the other filesystem pulls
seem to have been tracked).

Anyway, it's pulled, even if the automation seems to not have noticed.

Linus


[GIT PULL] Please pull NFS client changes for 5.11

2020-12-17 Thread Trond Myklebust
Hi Linus,

The following changes since commit
63e2fffa59a9dd91e443b08832656399fd80b7f0:

  pNFS/flexfiles: Fix array overflow when flexfiles mirroring is
enabled (2020-11-30 10:52:22 -0500)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-
5.11-1

for you to fetch changes up to
52104f274e2d7f134d34bab11cada8913d4544e2:

  NFS/pNFS: Fix a typo in ff_layout_resend_pnfs_read() (2020-12-16
17:25:24 -0500)


Stephen Rothwell has noted that there are conflicts with the upstream
tree.
They are due to the last-minute fix to the NFSv4.2 READ_PLUS code in
5.10,
and due to the changes for the nfsd XDR code. The fixups should be
trivial
but please let me know if you'd prefer that I do so.

Cheers
  Trond


NFS client updates for Linux 5.11

Highlights include:

Features:
- NFSv3: Add emulation of lookupp() to improve open_by_filehandle()
  support.
- A series of patches to improve readdir performance, particularly with
  large directories.
- Basic support for using NFS/RDMA with the pNFS files and flexfiles
  drivers.
- Micro-optimisations for RDMA.
- RDMA tracing improvements.

Bugfixes:
- Fix a long standing bug with xs_read_xdr_buf() when receiving partial
  pages (Dan Aloni).
- Various fixes for getxattr and listxattr, when used over non-TCP
  transports.
- Fixes for containerised NFS from Sargun Dhillon.
- switch nfsiod to be an UNBOUND workqueue (Neil Brown).
- READDIR should not ask for security label information if there is no
  LSM policy. (Olga Kornievskaia)
- Avoid using interval-based rebinding with TCP in lockd (Calum
Mackay).
- A series of RPC and NFS layer fixes to support the NFSv4.2 READ_PLUS
code.
- A couple of fixes for pnfs/flexfiles read failover

Cleanups:
- Various cleanups for the SUNRPC xdr code in conjunction with the
  READ_PLUS fixes.


Calum Mackay (1):
  lockd: don't use interval-based rebinding over TCP

Chuck Lever (16):
  xprtrdma: Replace dprintk call sites in ERR_CHUNK path
  xprtrdma: Introduce Receive completion IDs
  xprtrdma: Introduce Send completion IDs
  xprtrdma: Introduce FRWR completion IDs
  xprtrdma: Clean up trace_xprtrdma_post_linv
  xprtrdma: Clean up reply parsing error tracepoints
  xprtrdma: Clean up tracepoints in the reply path
  xprtrdma: Clean up xprtrdma callback tracepoints
  xprtrdma: Clean up trace_xprtrdma_nomrs()
  xprtrdma: Display the task ID when reporting MR events
  xprtrdma: Trace unmap_sync calls
  xprtrdma: Move rpcrdma_mr_put()
  xprtrdma: Micro-optimize MR DMA-unmapping
  SUNRPC: Remove XDRBUF_SPARSE_PAGES flag in gss_proxy upcall
  NFS: Fix rpcrdma_inline_fixup() crash with new LISTXATTRS
operation
  xprtrdma: Fix XDRBUF_SPARSE_PAGES support

Colin Ian King (1):
  NFSv4.2: fix error return on memory allocation failure

Dai Ngo (1):
  NFSv4.2: Fix 5 seconds delay when doing inter server copy

Dan Aloni (1):
  sunrpc: fix xs_read_xdr_buf for partial pages receive

Fedor Tokarev (1):
  net: sunrpc: Fix 'snprintf' return value check in
'do_xprt_debugfs'

Frank van der Linden (1):
  NFSv4.2: improve page handling for GETXATTR

Geliang Tang (1):
  NFSv4.1: use BITS_PER_LONG macro in nfs4session.h

NeilBrown (1):
  NFS: switch nfsiod to be an UNBOUND workqueue.

Olga Kornievskaia (1):
  NFSv4.2: condition READDIR's mask for security label based on LSM
state

Sargun Dhillon (2):
  NFS: NFSv2/NFSv3: Use cred from fs_context during mount
  NFSv4: Refactor to use user namespaces for nfs4idmap

Trond Myklebust (63):
  NFSv3: Refactor nfs3_proc_lookup() to split out the dentry
  NFSv3: Add emulation of the lookupp() operation
  NFSv4: Observe the NFS_MOUNT_SOFTREVAL flag in _nfs4_proc_lookupp
  SUNRPC: rpc_wake_up() should wake up tasks in the correct order
  NFS: Ensure contents of struct nfs_open_dir_context are
consistent
  NFS: Clean up readdir struct nfs_cache_array
  NFS: Clean up nfs_readdir_page_filler()
  NFS: Clean up directory array handling
  NFS: Don't discard readdir results
  NFS: Remove unnecessary kmap in nfs_readdir_xdr_to_array()
  NFS: Replace kmap() with kmap_atomic() in
nfs_readdir_search_array()
  NFS: Simplify struct nfs_cache_array_entry
  NFS: Support larger readdir buffers
  NFS: More readdir cleanups
  NFS: nfs_do_filldir() does not return a value
  NFS: Reduce readdir stack usage
  NFS: Cleanup to remove nfs_readdir_descriptor_t typedef
  NFS: Allow the NFS generic code to pass in a verifier to readdir
  NFS: Handle NFS4ERR_NOT_SAME and NFSERR_BADCOOKIE from readdir
calls
  NFS: Improve handling of directory verifiers
  NFS: Optimisations for monotonically increasing readdir cookies
  NFS: Reduce number of RPC calls when doing uncached readdir
  

Re: [GIT PULL] Please pull NFS client changes

2018-12-07 Thread pr-tracker-bot
The pull request you sent on Thu, 6 Dec 2018 21:55:54 +:

> git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-5

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7f80c7325be49db3fb8b5f343f47691b7999fda7

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


Re: [GIT PULL] Please pull NFS client changes

2018-12-07 Thread pr-tracker-bot
The pull request you sent on Thu, 6 Dec 2018 21:55:54 +:

> git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-5

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7f80c7325be49db3fb8b5f343f47691b7999fda7

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


[GIT PULL] Please pull NFS client changes

2018-12-06 Thread Trond Myklebust
Hi Linus,

Apologies for the relatively high volume this late in -rc. This is mainly
fallout from the updates to the SUNRPC code that is being triggered from
less common combinations of NFS mount options.

Cheers
  Trond

The following changes since commit 4b78317679c4f3782a3cff0ddb269c1fcfde7621:

  Merge branch 'x86-pti-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip (2018-12-01 12:35:48 
-0800)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-5

for you to fetch changes up to 79462857eb547e5d17fc8445b9768615e02dc1cf:

  SUNRPC: Don't force a redundant disconnection in xs_read_stream() (2018-12-05 
07:11:12 -0500)


NFS client bugfixes for Linux 4.20

Highlights include:

Stable fixes:
 - Fix a page leak when using RPCSEC_GSS/krb5p to encrypt data.

Bugfixes:
 - Fix a regression that causes the RPC receive code to hang
 - Fix call_connect_status() so that it handles tasks that got transmitted
   while queued waiting for the socket lock.
 - Fix a memory leak in call_encode()
 - Fix several other connect races.
 - Fix receive code error handling.
 - Use the discard iterator rather than MSG_TRUNC for compatibility with
   AF_UNIX/AF_LOCAL sockets.
 - nfs: don't dirty kernel pages read by direct-io
 - pnfs/Flexfiles fix to enforce per-mirror stateid only for NFSv4 data
   servers


Chuck Lever (1):
  SUNRPC: Fix leak of krb5p encode pages

Dave Kleikamp (1):
  nfs: don't dirty kernel pages read by direct-io

Tigran Mkrtchyan (1):
  flexfiles: enforce per-mirror stateid only for v4 DSes

Trond Myklebust (9):
  SUNRPC: call_connect_status() must handle tasks that got transmitted
  SUNRPC: Fix a memory leak in call_encode()
  SUNRPC: Fix a potential race in xprt_connect()
  SUNRPC: Fix RPC receive hangs
  SUNRPC: Fix up handling of the XDRBUF_SPARSE_PAGES flag
  SUNRPC: Treat EFAULT as a truncated message in xs_read_stream_request()
  SUNRPC: Use the discard iterator rather than MSG_TRUNC
  SUNRPC: Fix up socket polling
  SUNRPC: Don't force a redundant disconnection in xs_read_stream()

 fs/nfs/direct.c|  9 +++-
 fs/nfs/flexfilelayout/flexfilelayout.c |  6 ++-
 include/linux/sunrpc/xdr.h |  1 -
 net/sunrpc/auth_gss/auth_gss.c |  4 ++
 net/sunrpc/clnt.c  |  8 
 net/sunrpc/xprt.c  | 13 +-
 net/sunrpc/xprtsock.c  | 81 --
 7 files changed, 73 insertions(+), 49 deletions(-)
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.mykleb...@hammerspace.com




[GIT PULL] Please pull NFS client changes

2018-12-06 Thread Trond Myklebust
Hi Linus,

Apologies for the relatively high volume this late in -rc. This is mainly
fallout from the updates to the SUNRPC code that is being triggered from
less common combinations of NFS mount options.

Cheers
  Trond

The following changes since commit 4b78317679c4f3782a3cff0ddb269c1fcfde7621:

  Merge branch 'x86-pti-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip (2018-12-01 12:35:48 
-0800)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-5

for you to fetch changes up to 79462857eb547e5d17fc8445b9768615e02dc1cf:

  SUNRPC: Don't force a redundant disconnection in xs_read_stream() (2018-12-05 
07:11:12 -0500)


NFS client bugfixes for Linux 4.20

Highlights include:

Stable fixes:
 - Fix a page leak when using RPCSEC_GSS/krb5p to encrypt data.

Bugfixes:
 - Fix a regression that causes the RPC receive code to hang
 - Fix call_connect_status() so that it handles tasks that got transmitted
   while queued waiting for the socket lock.
 - Fix a memory leak in call_encode()
 - Fix several other connect races.
 - Fix receive code error handling.
 - Use the discard iterator rather than MSG_TRUNC for compatibility with
   AF_UNIX/AF_LOCAL sockets.
 - nfs: don't dirty kernel pages read by direct-io
 - pnfs/Flexfiles fix to enforce per-mirror stateid only for NFSv4 data
   servers


Chuck Lever (1):
  SUNRPC: Fix leak of krb5p encode pages

Dave Kleikamp (1):
  nfs: don't dirty kernel pages read by direct-io

Tigran Mkrtchyan (1):
  flexfiles: enforce per-mirror stateid only for v4 DSes

Trond Myklebust (9):
  SUNRPC: call_connect_status() must handle tasks that got transmitted
  SUNRPC: Fix a memory leak in call_encode()
  SUNRPC: Fix a potential race in xprt_connect()
  SUNRPC: Fix RPC receive hangs
  SUNRPC: Fix up handling of the XDRBUF_SPARSE_PAGES flag
  SUNRPC: Treat EFAULT as a truncated message in xs_read_stream_request()
  SUNRPC: Use the discard iterator rather than MSG_TRUNC
  SUNRPC: Fix up socket polling
  SUNRPC: Don't force a redundant disconnection in xs_read_stream()

 fs/nfs/direct.c|  9 +++-
 fs/nfs/flexfilelayout/flexfilelayout.c |  6 ++-
 include/linux/sunrpc/xdr.h |  1 -
 net/sunrpc/auth_gss/auth_gss.c |  4 ++
 net/sunrpc/clnt.c  |  8 
 net/sunrpc/xprt.c  | 13 +-
 net/sunrpc/xprtsock.c  | 81 --
 7 files changed, 73 insertions(+), 49 deletions(-)
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.mykleb...@hammerspace.com




Re: [GIT PULL] Please pull NFS client changes

2018-11-25 Thread pr-tracker-bot
The pull request you sent on Sun, 25 Nov 2018 03:04:11 +:

> git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-4

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/17c2f540863a6c0faa3f0ede3c785d9427bcaf80

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


Re: [GIT PULL] Please pull NFS client changes

2018-11-25 Thread pr-tracker-bot
The pull request you sent on Sun, 25 Nov 2018 03:04:11 +:

> git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-4

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/17c2f540863a6c0faa3f0ede3c785d9427bcaf80

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


[GIT PULL] Please pull NFS client changes

2018-11-24 Thread Trond Myklebust
Hi Linus,

The following changes since commit 9ff01193a20d391e8dbce4403dd5ef87c7eaaca6:

  Linux 4.20-rc3 (2018-11-18 13:33:44 -0800)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-4

for you to fetch changes up to bb21ce0ad227b69ec0f83279297ee44232105d96:

  flexfiles: use per-mirror specified stateid for IO (2018-11-22 14:04:55 -0500)


NFS client bugfixes for Linux 4.20

Highlights include:

Bugfixes:
 - Fix a NFSv4 state manager deadlock when returning a delegation
 - NFSv4.2 copy do not allocate memory under the lock
 - flexfiles: Use the correct stateid for IO in the tightly coupled case


Olga Kornievskaia (1):
  NFSv4.2 copy do not allocate memory under the lock

Tigran Mkrtchyan (1):
  flexfiles: use per-mirror specified stateid for IO

Trond Myklebust (1):
  NFSv4: Fix a NFSv4 state manager deadlock

 fs/nfs/callback_proc.c| 22 +++---
 fs/nfs/flexfilelayout/flexfilelayout.c| 21 +
 fs/nfs/flexfilelayout/flexfilelayout.h|  4 
 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 19 +++
 fs/nfs/nfs42proc.c| 19 ++-
 fs/nfs/nfs4_fs.h  |  2 ++
 fs/nfs/nfs4state.c| 16 +++-
 7 files changed, 66 insertions(+), 37 deletions(-)
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.mykleb...@hammerspace.com




[GIT PULL] Please pull NFS client changes

2018-11-24 Thread Trond Myklebust
Hi Linus,

The following changes since commit 9ff01193a20d391e8dbce4403dd5ef87c7eaaca6:

  Linux 4.20-rc3 (2018-11-18 13:33:44 -0800)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-4

for you to fetch changes up to bb21ce0ad227b69ec0f83279297ee44232105d96:

  flexfiles: use per-mirror specified stateid for IO (2018-11-22 14:04:55 -0500)


NFS client bugfixes for Linux 4.20

Highlights include:

Bugfixes:
 - Fix a NFSv4 state manager deadlock when returning a delegation
 - NFSv4.2 copy do not allocate memory under the lock
 - flexfiles: Use the correct stateid for IO in the tightly coupled case


Olga Kornievskaia (1):
  NFSv4.2 copy do not allocate memory under the lock

Tigran Mkrtchyan (1):
  flexfiles: use per-mirror specified stateid for IO

Trond Myklebust (1):
  NFSv4: Fix a NFSv4 state manager deadlock

 fs/nfs/callback_proc.c| 22 +++---
 fs/nfs/flexfilelayout/flexfilelayout.c| 21 +
 fs/nfs/flexfilelayout/flexfilelayout.h|  4 
 fs/nfs/flexfilelayout/flexfilelayoutdev.c | 19 +++
 fs/nfs/nfs42proc.c| 19 ++-
 fs/nfs/nfs4_fs.h  |  2 ++
 fs/nfs/nfs4state.c| 16 +++-
 7 files changed, 66 insertions(+), 37 deletions(-)
-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.mykleb...@hammerspace.com




Re: [GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Linus Torvalds
On Fri, Oct 26, 2018 at 1:21 PM Trond Myklebust  wrote:
>
> We agreed about a year or so ago to take turns maintaining the linux-
> next branches.

Heh. That was a simpler pattern than I thought. It was just hidden by
the fact that there are sometimes more than one pull request during
the merge window, and then a variable number of "fixes" pull requests
during the rc's..

But now that you mention it I can see that yeah, it just alternates by
releases..

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Linus Torvalds
On Fri, Oct 26, 2018 at 1:21 PM Trond Myklebust  wrote:
>
> We agreed about a year or so ago to take turns maintaining the linux-
> next branches.

Heh. That was a simpler pattern than I thought. It was just hidden by
the fact that there are sometimes more than one pull request during
the merge window, and then a variable number of "fixes" pull requests
during the rc's..

But now that you mention it I can see that yeah, it just alternates by
releases..

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Trond Myklebust
On Fri, 2018-10-26 at 13:10 -0700, Linus Torvalds wrote:
> On Fri, Oct 26, 2018 at 8:45 AM Trond Myklebust <
> tron...@hammerspace.com> wrote:
> > 
> > NFS client updates for Linux 4.20
> 
> Pulled.

Thank you!

> Btw, just out of curiosity - is there some pattern to when I get the
> pull requests from you vs Anna? Or is it just "whoever happens to
> have
> the baton"? Not that it matters, but I was wondering if there was
> some
> logic to it that I've missed..

We agreed about a year or so ago to take turns maintaining the linux-
next branches. So in practice that means Anna has been sending you the
pull requests for the odd kernel minor numbers, and I've been sending
you the ones for the even numbers.
We still share the burden of reviewing the code, though, so if
something slips through the cracks, we get to share the blame... ☺

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.mykleb...@hammerspace.com




Re: [GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Trond Myklebust
On Fri, 2018-10-26 at 13:10 -0700, Linus Torvalds wrote:
> On Fri, Oct 26, 2018 at 8:45 AM Trond Myklebust <
> tron...@hammerspace.com> wrote:
> > 
> > NFS client updates for Linux 4.20
> 
> Pulled.

Thank you!

> Btw, just out of curiosity - is there some pattern to when I get the
> pull requests from you vs Anna? Or is it just "whoever happens to
> have
> the baton"? Not that it matters, but I was wondering if there was
> some
> logic to it that I've missed..

We agreed about a year or so ago to take turns maintaining the linux-
next branches. So in practice that means Anna has been sending you the
pull requests for the odd kernel minor numbers, and I've been sending
you the ones for the even numbers.
We still share the burden of reviewing the code, though, so if
something slips through the cracks, we get to share the blame... ☺

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.mykleb...@hammerspace.com




Re: [GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Linus Torvalds
On Fri, Oct 26, 2018 at 8:45 AM Trond Myklebust  wrote:
>
> NFS client updates for Linux 4.20

Pulled.

Btw, just out of curiosity - is there some pattern to when I get the
pull requests from you vs Anna? Or is it just "whoever happens to have
the baton"? Not that it matters, but I was wondering if there was some
logic to it that I've missed..

Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Linus Torvalds
On Fri, Oct 26, 2018 at 8:45 AM Trond Myklebust  wrote:
>
> NFS client updates for Linux 4.20

Pulled.

Btw, just out of curiosity - is there some pattern to when I get the
pull requests from you vs Anna? Or is it just "whoever happens to have
the baton"? Not that it matters, but I was wondering if there was some
logic to it that I've missed..

Linus


[GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Trond Myklebust
Hi Linus,

The following changes since commit 17b57b1883c1285f3d0dc2266e8f79286a7bef38:

  Linux 4.19-rc6 (2018-09-30 07:15:35 -0700)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-1

for you to fetch changes up to 331bc71cb1751d78f6807ad8e6162b07c67cdd1b:

  SUNRPC: Convert the auth cred cache to use refcount_t (2018-10-23 12:24:33 
-0400)

Cheers
  Trond


NFS client updates for Linux 4.20

Highlights include:

Stable fixes:
- Fix the NFSv4.1 r/wsize sanity checking
- Reset the RPC/RDMA credit grant properly after a disconnect
- Fix a missed page unlock after pg_doio()

Features and optimisations:
- Overhaul of the RPC client socket code to eliminate a locking bottleneck
  and reduce the latency when transmitting lots of requests in parallel.
- Allow parallelisation of the RPCSEC_GSS encoding of an RPC request.
- Convert the RPC client socket receive code to use iovec_iter() for
  improved efficiency.
- Convert several NFS and RPC lookup operations to use RCU instead of
  taking global locks.
- Avoid the need for BH-safe locks in the RPC/RDMA back channel.

Bugfixes and cleanups:
- Fix lock recovery during NFSv4 delegation recalls
- Fix the NFSv4 + NFSv4.1 "lookup revalidate + open file" case.
- Fixes for the RPC connection metrics
- Various RPC client layer cleanups to consolidate stream based sockets
- RPC/RDMA connection cleanups
- Simplify the RPC/RDMA cleanup after memory operation failures
- Clean ups for NFS v4.2 copy completion and NFSv4 open state reclaim.


Anna Schumaker (4):
  NFS: Split out the body of nfs4_reclaim_open_state()
  NFS: Reduce indentation of the switch statement in 
nfs4_reclaim_open_state()
  NFS: Reduce indentation of nfs4_recovery_handle_error()
  NFSv4: Split out NFS v4.2 copy completion functions

Arnd Bergmann (1):
  SUNRPC: use cmpxchg64() in gss_seq_send64_fetch_and_inc()

Benjamin Coddington (2):
  nfs: Fix a missed page unlock after pg_doio()
  nfs: remove redundant call to nfs_context_set_write_error()

Chuck Lever (20):
  xprtrdma: xprt_release_rqst_cong is called outside of transport_lock
  xprtrdma: Reset credit grant properly after a disconnect
  xprtrdma: Create more MRs at a time
  xprtrdma: Explicitly resetting MRs is no longer necessary
  xprtrdma: Name MR trace events consistently
  sunrpc: Fix connect metrics
  sunrpc: Report connect_time in seconds
  xprtrdma: Rename rpcrdma_conn_upcall
  xprtrdma: Conventional variable names in rpcrdma_conn_upcall
  xprtrdma: Eliminate "connstate" variable from rpcrdma_conn_upcall()
  xprtrdma: Re-organize the switch() in rpcrdma_conn_upcall
  xprtrdma: Simplify RPC wake-ups on connect
  xprtrdma: Rename rpcrdma_qp_async_error_upcall
  xprtrdma: Remove memory address of "ep" from an error message
  xprtrdma: Don't disable BH's in backchannel server
  xprtrdma: Move rb_flags initialization
  xprtrdma: Report when there were zero posted Receives
  xprtrdma: Add documenting comments
  xprtrdma: Clean up xprt_rdma_disconnect_inject
  xprtrdma: Squelch a sparse warning

Frank Sorenson (1):
  NFS: change sign of nfs_fh length

J. Bruce Fields (1):
  sunrpc: safely reallow resvport min/max inversion

Olga Kornievskaia (1):
  NFSv4.x: fix lock recovery during delegation recall

Tigran Mkrtchyan (1):
  nfs4: flex_file: ignore synthetic uid/gid for tightly coupled DSes

Trond Myklebust (67):
  SUNRPC: Clean up initialisation of the struct rpc_rqst
  SUNRPC: If there is no reply expected, bail early from call_decode
  SUNRPC: The transmitted message must lie in the RPCSEC window of validity
  SUNRPC: Simplify identification of when the message send/receive is 
complete
  SUNRPC: Avoid holding locks across the XDR encoding of the RPC message
  SUNRPC: Rename TCP receive-specific state variables
  SUNRPC: Move reset of TCP state variables into the reconnect code
  SUNRPC: Add socket transmit queue offset tracking
  SUNRPC: Simplify dealing with aborted partially transmitted messages
  SUNRPC: Refactor the transport request pinning
  SUNRPC: Add a helper to wake up a sleeping rpc_task and set its status
  SUNRPC: Test whether the task is queued before grabbing the queue 
spinlocks
  SUNRPC: Don't wake queued RPC calls multiple times in xprt_transmit
  SUNRPC: Rename xprt->recv_lock to xprt->queue_lock
  SUNRPC: Refactor xprt_transmit() to remove the reply queue code
  SUNRPC: Refactor xprt_transmit() to remove wait for reply code
  SUNRPC: Minor cleanup for call_transmit()
  SUNRPC: Distinguish between the slot allocation list and receive queue
  SUNRPC: Add a transmission queue for RPC requests
  SUNRPC: Refactor RPC call encoding
  

[GIT PULL] Please pull NFS client changes for Linux 4.20.

2018-10-26 Thread Trond Myklebust
Hi Linus,

The following changes since commit 17b57b1883c1285f3d0dc2266e8f79286a7bef38:

  Linux 4.19-rc6 (2018-09-30 07:15:35 -0700)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.20-1

for you to fetch changes up to 331bc71cb1751d78f6807ad8e6162b07c67cdd1b:

  SUNRPC: Convert the auth cred cache to use refcount_t (2018-10-23 12:24:33 
-0400)

Cheers
  Trond


NFS client updates for Linux 4.20

Highlights include:

Stable fixes:
- Fix the NFSv4.1 r/wsize sanity checking
- Reset the RPC/RDMA credit grant properly after a disconnect
- Fix a missed page unlock after pg_doio()

Features and optimisations:
- Overhaul of the RPC client socket code to eliminate a locking bottleneck
  and reduce the latency when transmitting lots of requests in parallel.
- Allow parallelisation of the RPCSEC_GSS encoding of an RPC request.
- Convert the RPC client socket receive code to use iovec_iter() for
  improved efficiency.
- Convert several NFS and RPC lookup operations to use RCU instead of
  taking global locks.
- Avoid the need for BH-safe locks in the RPC/RDMA back channel.

Bugfixes and cleanups:
- Fix lock recovery during NFSv4 delegation recalls
- Fix the NFSv4 + NFSv4.1 "lookup revalidate + open file" case.
- Fixes for the RPC connection metrics
- Various RPC client layer cleanups to consolidate stream based sockets
- RPC/RDMA connection cleanups
- Simplify the RPC/RDMA cleanup after memory operation failures
- Clean ups for NFS v4.2 copy completion and NFSv4 open state reclaim.


Anna Schumaker (4):
  NFS: Split out the body of nfs4_reclaim_open_state()
  NFS: Reduce indentation of the switch statement in 
nfs4_reclaim_open_state()
  NFS: Reduce indentation of nfs4_recovery_handle_error()
  NFSv4: Split out NFS v4.2 copy completion functions

Arnd Bergmann (1):
  SUNRPC: use cmpxchg64() in gss_seq_send64_fetch_and_inc()

Benjamin Coddington (2):
  nfs: Fix a missed page unlock after pg_doio()
  nfs: remove redundant call to nfs_context_set_write_error()

Chuck Lever (20):
  xprtrdma: xprt_release_rqst_cong is called outside of transport_lock
  xprtrdma: Reset credit grant properly after a disconnect
  xprtrdma: Create more MRs at a time
  xprtrdma: Explicitly resetting MRs is no longer necessary
  xprtrdma: Name MR trace events consistently
  sunrpc: Fix connect metrics
  sunrpc: Report connect_time in seconds
  xprtrdma: Rename rpcrdma_conn_upcall
  xprtrdma: Conventional variable names in rpcrdma_conn_upcall
  xprtrdma: Eliminate "connstate" variable from rpcrdma_conn_upcall()
  xprtrdma: Re-organize the switch() in rpcrdma_conn_upcall
  xprtrdma: Simplify RPC wake-ups on connect
  xprtrdma: Rename rpcrdma_qp_async_error_upcall
  xprtrdma: Remove memory address of "ep" from an error message
  xprtrdma: Don't disable BH's in backchannel server
  xprtrdma: Move rb_flags initialization
  xprtrdma: Report when there were zero posted Receives
  xprtrdma: Add documenting comments
  xprtrdma: Clean up xprt_rdma_disconnect_inject
  xprtrdma: Squelch a sparse warning

Frank Sorenson (1):
  NFS: change sign of nfs_fh length

J. Bruce Fields (1):
  sunrpc: safely reallow resvport min/max inversion

Olga Kornievskaia (1):
  NFSv4.x: fix lock recovery during delegation recall

Tigran Mkrtchyan (1):
  nfs4: flex_file: ignore synthetic uid/gid for tightly coupled DSes

Trond Myklebust (67):
  SUNRPC: Clean up initialisation of the struct rpc_rqst
  SUNRPC: If there is no reply expected, bail early from call_decode
  SUNRPC: The transmitted message must lie in the RPCSEC window of validity
  SUNRPC: Simplify identification of when the message send/receive is 
complete
  SUNRPC: Avoid holding locks across the XDR encoding of the RPC message
  SUNRPC: Rename TCP receive-specific state variables
  SUNRPC: Move reset of TCP state variables into the reconnect code
  SUNRPC: Add socket transmit queue offset tracking
  SUNRPC: Simplify dealing with aborted partially transmitted messages
  SUNRPC: Refactor the transport request pinning
  SUNRPC: Add a helper to wake up a sleeping rpc_task and set its status
  SUNRPC: Test whether the task is queued before grabbing the queue 
spinlocks
  SUNRPC: Don't wake queued RPC calls multiple times in xprt_transmit
  SUNRPC: Rename xprt->recv_lock to xprt->queue_lock
  SUNRPC: Refactor xprt_transmit() to remove the reply queue code
  SUNRPC: Refactor xprt_transmit() to remove wait for reply code
  SUNRPC: Minor cleanup for call_transmit()
  SUNRPC: Distinguish between the slot allocation list and receive queue
  SUNRPC: Add a transmission queue for RPC requests
  SUNRPC: Refactor RPC call encoding
  

Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-13 Thread Paul E. McKenney
On Wed, Jun 13, 2018 at 11:42:51AM +1000, NeilBrown wrote:
> On Tue, Jun 12 2018, Paul E. McKenney wrote:
> 
> > On Tue, Jun 12, 2018 at 06:11:50PM -0700, Linus Torvalds wrote:
> >> On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
> >>  wrote:
> >> >
> >> > We did review this one, and Neil did make requested changes to the 
> >> > comment
> >> > header and documentation.
> >> 
> >> Oh, I checked the commit for "acked-by" from you and didn't see any,
> >> so I was assuming this was just Neil and Trond on their own..
> >
> > Hmmm...  I gave them a Reviewed-by, but perhaps it got lost.
> >
> > http://lkml.kernel.org/r/20180501141404.gd26...@linux.vnet.ibm.com
> 
> Sorry about that.  I think I planned to add it after I heard back from
> Trond what he thought of the patch, but I didn't hear anything until it
> landed.  I'll try to be more proactive next time.

Not a problem from my viewpoint.  There may come a time when I need to
care about contribution metrics, but I don't need to at the moment.  ;-)

Thanx, Paul



Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-13 Thread Paul E. McKenney
On Wed, Jun 13, 2018 at 11:42:51AM +1000, NeilBrown wrote:
> On Tue, Jun 12 2018, Paul E. McKenney wrote:
> 
> > On Tue, Jun 12, 2018 at 06:11:50PM -0700, Linus Torvalds wrote:
> >> On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
> >>  wrote:
> >> >
> >> > We did review this one, and Neil did make requested changes to the 
> >> > comment
> >> > header and documentation.
> >> 
> >> Oh, I checked the commit for "acked-by" from you and didn't see any,
> >> so I was assuming this was just Neil and Trond on their own..
> >
> > Hmmm...  I gave them a Reviewed-by, but perhaps it got lost.
> >
> > http://lkml.kernel.org/r/20180501141404.gd26...@linux.vnet.ibm.com
> 
> Sorry about that.  I think I planned to add it after I heard back from
> Trond what he thought of the patch, but I didn't hear anything until it
> landed.  I'll try to be more proactive next time.

Not a problem from my viewpoint.  There may come a time when I need to
care about contribution metrics, but I don't need to at the moment.  ;-)

Thanx, Paul



Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread NeilBrown
On Tue, Jun 12 2018, Paul E. McKenney wrote:

> On Tue, Jun 12, 2018 at 06:11:50PM -0700, Linus Torvalds wrote:
>> On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
>>  wrote:
>> >
>> > We did review this one, and Neil did make requested changes to the comment
>> > header and documentation.
>> 
>> Oh, I checked the commit for "acked-by" from you and didn't see any,
>> so I was assuming this was just Neil and Trond on their own..
>
> Hmmm...  I gave them a Reviewed-by, but perhaps it got lost.
>
> http://lkml.kernel.org/r/20180501141404.gd26...@linux.vnet.ibm.com
>
>   Thanx, Paul

Sorry about that.  I think I planned to add it after I heard back from
Trond what he thought of the patch, but I didn't hear anything until it
landed.  I'll try to be more proactive next time.

Thanks,
NeilBrown


signature.asc
Description: PGP signature


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread NeilBrown
On Tue, Jun 12 2018, Paul E. McKenney wrote:

> On Tue, Jun 12, 2018 at 06:11:50PM -0700, Linus Torvalds wrote:
>> On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
>>  wrote:
>> >
>> > We did review this one, and Neil did make requested changes to the comment
>> > header and documentation.
>> 
>> Oh, I checked the commit for "acked-by" from you and didn't see any,
>> so I was assuming this was just Neil and Trond on their own..
>
> Hmmm...  I gave them a Reviewed-by, but perhaps it got lost.
>
> http://lkml.kernel.org/r/20180501141404.gd26...@linux.vnet.ibm.com
>
>   Thanx, Paul

Sorry about that.  I think I planned to add it after I heard back from
Trond what he thought of the patch, but I didn't hear anything until it
landed.  I'll try to be more proactive next time.

Thanks,
NeilBrown


signature.asc
Description: PGP signature


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Paul E. McKenney
On Tue, Jun 12, 2018 at 06:11:50PM -0700, Linus Torvalds wrote:
> On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
>  wrote:
> >
> > We did review this one, and Neil did make requested changes to the comment
> > header and documentation.
> 
> Oh, I checked the commit for "acked-by" from you and didn't see any,
> so I was assuming this was just Neil and Trond on their own..

Hmmm...  I gave them a Reviewed-by, but perhaps it got lost.

http://lkml.kernel.org/r/20180501141404.gd26...@linux.vnet.ibm.com

Thanx, Paul



Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Paul E. McKenney
On Tue, Jun 12, 2018 at 06:11:50PM -0700, Linus Torvalds wrote:
> On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
>  wrote:
> >
> > We did review this one, and Neil did make requested changes to the comment
> > header and documentation.
> 
> Oh, I checked the commit for "acked-by" from you and didn't see any,
> so I was assuming this was just Neil and Trond on their own..

Hmmm...  I gave them a Reviewed-by, but perhaps it got lost.

http://lkml.kernel.org/r/20180501141404.gd26...@linux.vnet.ibm.com

Thanx, Paul



Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
 wrote:
>
> We did review this one, and Neil did make requested changes to the comment
> header and documentation.

Oh, I checked the commit for "acked-by" from you and didn't see any,
so I was assuming this was just Neil and Trond on their own..

Linus


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
On Tue, Jun 12, 2018 at 6:02 PM Paul E. McKenney
 wrote:
>
> We did review this one, and Neil did make requested changes to the comment
> header and documentation.

Oh, I checked the commit for "acked-by" from you and didn't see any,
so I was assuming this was just Neil and Trond on their own..

Linus


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Paul E. McKenney
On Tue, Jun 12, 2018 at 11:08:31AM -0700, Linus Torvalds wrote:
> Final note (for now) on this: I've merged the nfs code, but I really
> am obviously not happy with these crazy random ad-hoc
> cursor-not-cursor list games.

We did review this one, and Neil did make requested changes to the comment
header and documentation.  However, as you say, I will push back harder
on future RCU list traversal macros.

Thanx, Paul



Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Paul E. McKenney
On Tue, Jun 12, 2018 at 11:08:31AM -0700, Linus Torvalds wrote:
> Final note (for now) on this: I've merged the nfs code, but I really
> am obviously not happy with these crazy random ad-hoc
> cursor-not-cursor list games.

We did review this one, and Neil did make requested changes to the comment
header and documentation.  However, as you say, I will push back harder
on future RCU list traversal macros.

Thanx, Paul



Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread NeilBrown
On Tue, Jun 12 2018, Linus Torvalds wrote:

> Final note (for now) on this: I've merged the nfs code, but I really
> am obviously not happy with these crazy random ad-hoc
> cursor-not-cursor list games.
>
> Linus
Hi Linus,
 thanks for merging the code despite your reservations.

 Yes, we could create a generic rcu-list cursor.  I have given it some
 thought but didn't like the added complexity.  As there were existing
 objects in the list that could be used as a cursor, that seemed to me to
 be the better solution.

 As you say, and cursor would need to be allocated from a slab, not on
 the stack. We could use a SLAB_TYPESAFE_BY_RCU and not need to use rcu
 to delay the freeing.
 The lsb in the next pointer of the cursor would be 1 to indicate the
 cursor.

 Any iteration of the list would need to look out for this flag.
 When found it would need to skip over any cursors to the next
 non-cursor, then repeat the skip and make sure it found the same
 non-cursor.  This guards against the cursor moving while it is being
 examined.

 Any walk that needed to place a cursor would need to get an exclusive
 lock on the list from the start.  This is more locking overhead than
 just grabbing the lock to optimistically take a reference on the
 "current" item which I did in the NFS patch.  If the lists were
 normally short that might not be a problem. In this case the list can
 get quite long so the extra locking might be noticeable.

 Deleting objects from the list would need to be careful to preserve the
 flag bit, but that is the least difficult part.

 FYI I have an open proposal to improve the cursor used by rhashtable
 for rhashtable_walk - it sometimes needs to drop out of RCU in the
 middle of a bucket chain.  In that case the chain is normally short (16 is
 considered so long that the hash must have been compromised) and I
 propose an insertion sort to keep the addresses of objects in numerical
 order.  This way the address of the last object found can work as a stable
 cursor - we just search through the list until an object has a larger
 address.

 So my perspective is that while an rcu_cursor_list could be developed,
 I'm not sure it would always (or ever?) be the best solution to a
 given problem.
 I can turn these thoughts into a patch if you like and see what people
 think.

Thanks,
NeilBrown


signature.asc
Description: PGP signature


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread NeilBrown
On Tue, Jun 12 2018, Linus Torvalds wrote:

> Final note (for now) on this: I've merged the nfs code, but I really
> am obviously not happy with these crazy random ad-hoc
> cursor-not-cursor list games.
>
> Linus
Hi Linus,
 thanks for merging the code despite your reservations.

 Yes, we could create a generic rcu-list cursor.  I have given it some
 thought but didn't like the added complexity.  As there were existing
 objects in the list that could be used as a cursor, that seemed to me to
 be the better solution.

 As you say, and cursor would need to be allocated from a slab, not on
 the stack. We could use a SLAB_TYPESAFE_BY_RCU and not need to use rcu
 to delay the freeing.
 The lsb in the next pointer of the cursor would be 1 to indicate the
 cursor.

 Any iteration of the list would need to look out for this flag.
 When found it would need to skip over any cursors to the next
 non-cursor, then repeat the skip and make sure it found the same
 non-cursor.  This guards against the cursor moving while it is being
 examined.

 Any walk that needed to place a cursor would need to get an exclusive
 lock on the list from the start.  This is more locking overhead than
 just grabbing the lock to optimistically take a reference on the
 "current" item which I did in the NFS patch.  If the lists were
 normally short that might not be a problem. In this case the list can
 get quite long so the extra locking might be noticeable.

 Deleting objects from the list would need to be careful to preserve the
 flag bit, but that is the least difficult part.

 FYI I have an open proposal to improve the cursor used by rhashtable
 for rhashtable_walk - it sometimes needs to drop out of RCU in the
 middle of a bucket chain.  In that case the chain is normally short (16 is
 considered so long that the hash must have been compromised) and I
 propose an insertion sort to keep the addresses of objects in numerical
 order.  This way the address of the last object found can work as a stable
 cursor - we just search through the list until an object has a larger
 address.

 So my perspective is that while an rcu_cursor_list could be developed,
 I'm not sure it would always (or ever?) be the best solution to a
 given problem.
 I can turn these thoughts into a patch if you like and see what people
 think.

Thanks,
NeilBrown


signature.asc
Description: PGP signature


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
Final note (for now) on this: I've merged the nfs code, but I really
am obviously not happy with these crazy random ad-hoc
cursor-not-cursor list games.

Linus


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
Final note (for now) on this: I've merged the nfs code, but I really
am obviously not happy with these crazy random ad-hoc
cursor-not-cursor list games.

Linus


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
On Tue, Jun 12, 2018 at 10:42 AM Linus Torvalds
 wrote:
>
> So could we please have a cursor entry for RCU walking, and actually
> have a agreed-upon way to do this? Maybe even using the low bit in the
> "next" field to mark a cursor entry generically - the same way we
> already do for the HEAD field in the bit-locked lists, just on the
> actual entries _on_ the list instead.

The real problem with a RCU cursor is that the lifetime of the cursor
is also RCU extended, so you can't do the traditional "just allocate
the cursor entry on the stack" that you can with synchronous locked
lists.

That makes it slightly more inconvenient to do simple cursors.

The dcache code has a fairly clean "dcache cursor", but it does use a
whole "struct dentry" for it (and it depends on "next_positive()" to
just skip them because dursors are never _positive_ dentries, so
there's some subtlety there).

But at least it's a very explicit cursor model, not that odd
delegation inode that seems to have a life as a real inode too.

So maybe a generic rcu-list cursor is too hard to do sanely, but can
we at least encourage that very explicit cursor model that is *only* a
cursor, and might not be touched/moved/reallocated by something else?

Linus


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
On Tue, Jun 12, 2018 at 10:42 AM Linus Torvalds
 wrote:
>
> So could we please have a cursor entry for RCU walking, and actually
> have a agreed-upon way to do this? Maybe even using the low bit in the
> "next" field to mark a cursor entry generically - the same way we
> already do for the HEAD field in the bit-locked lists, just on the
> actual entries _on_ the list instead.

The real problem with a RCU cursor is that the lifetime of the cursor
is also RCU extended, so you can't do the traditional "just allocate
the cursor entry on the stack" that you can with synchronous locked
lists.

That makes it slightly more inconvenient to do simple cursors.

The dcache code has a fairly clean "dcache cursor", but it does use a
whole "struct dentry" for it (and it depends on "next_positive()" to
just skip them because dursors are never _positive_ dentries, so
there's some subtlety there).

But at least it's a very explicit cursor model, not that odd
delegation inode that seems to have a life as a real inode too.

So maybe a generic rcu-list cursor is too hard to do sanely, but can
we at least encourage that very explicit cursor model that is *only* a
cursor, and might not be touched/moved/reallocated by something else?

Linus


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
On Tue, Jun 12, 2018 at 6:39 AM Trond Myklebust  wrote:
>
> NeilBrown (4):
>   rculist: add list_for_each_entry_from_rcu()

Oh christ, people. Not another one of these.

We need to start having a real rule in place where people DO NOT PLAY
GAMES WITH RCU LISTS.

Adding Paul McKenney to the list, since he clearly wasn't for the
actual development, and the commit has no sign that it was sanity
checked.

This whole "let's re-start RCU walking in the middle after we've
dropped the RCU read lock" needs a hell of a lot more thought than
people seem to actually be giving it.

Maybe the code is actually correct - it looks ok to me. But every time
I see it I just shudder.

What people actually want is to just have a cursor entry in an RCU
list. I'd much rather have *that*, than have people make up these
insane ad-hoc cursor entries that are insane and have horrible
semantics.

For example, looking at that commit e04bbf6b1bbe ("NFS: Avoid
quadratic search when freeing delegations"), it even talks about the
ad-hoc cursor entry not actually being reliable, and how that's ok
because the code doesn't care. No, random odd behavior that is
basically never ever going to be testable is *NOT* ok.

So could we please have a cursor entry for RCU walking, and actually
have a agreed-upon way to do this? Maybe even using the low bit in the
"next" field to mark a cursor entry generically - the same way we
already do for the HEAD field in the bit-locked lists, just on the
actual entries _on_ the list instead.

Because we now have *two* of these odd special "restart the loop in
the middle" come in during the same merge window. That really implies
to me that we should have a _proper_ solution for the problem, not
this horribly nasty case where people make up their own pseudo-cursors
that have odd and questionable semantics.

Paul, comments?

 Linus


Re: [GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Linus Torvalds
On Tue, Jun 12, 2018 at 6:39 AM Trond Myklebust  wrote:
>
> NeilBrown (4):
>   rculist: add list_for_each_entry_from_rcu()

Oh christ, people. Not another one of these.

We need to start having a real rule in place where people DO NOT PLAY
GAMES WITH RCU LISTS.

Adding Paul McKenney to the list, since he clearly wasn't for the
actual development, and the commit has no sign that it was sanity
checked.

This whole "let's re-start RCU walking in the middle after we've
dropped the RCU read lock" needs a hell of a lot more thought than
people seem to actually be giving it.

Maybe the code is actually correct - it looks ok to me. But every time
I see it I just shudder.

What people actually want is to just have a cursor entry in an RCU
list. I'd much rather have *that*, than have people make up these
insane ad-hoc cursor entries that are insane and have horrible
semantics.

For example, looking at that commit e04bbf6b1bbe ("NFS: Avoid
quadratic search when freeing delegations"), it even talks about the
ad-hoc cursor entry not actually being reliable, and how that's ok
because the code doesn't care. No, random odd behavior that is
basically never ever going to be testable is *NOT* ok.

So could we please have a cursor entry for RCU walking, and actually
have a agreed-upon way to do this? Maybe even using the low bit in the
"next" field to mark a cursor entry generically - the same way we
already do for the HEAD field in the bit-locked lists, just on the
actual entries _on_ the list instead.

Because we now have *two* of these odd special "restart the loop in
the middle" come in during the same merge window. That really implies
to me that we should have a _proper_ solution for the problem, not
this horribly nasty case where people make up their own pseudo-cursors
that have odd and questionable semantics.

Paul, comments?

 Linus


[GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Trond Myklebust
Hi Linus,

The following changes since commit 786b71f5b754273ccef6d9462e52062b3e1f9877:

  Merge tag 'nds32-for-linus-4.17-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux (2018-05-28 
05:25:57 -0700)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.18-1

for you to fetch changes up to 93b7f7ad2018d2037559b1d0892417864c78b371:

  skip LAYOUTRETURN if layout is invalid (2018-06-12 08:48:04 -0400)

Cheers,
  Trond


NFS client updates for Linux 4.18

Highlights include:

Stable fixes:
- Fix a 1-byte stack overflow in nfs_idmap_read_and_verify_message
- Fix a hang due to incorrect error returns in rpcrdma_convert_iovs()
- Revert an incorrect change to the NFSv4.1 callback channel
- Fix a bug in the NFSv4.1 sequence error handling

Features and optimisations:
- Support for piggybacking a LAYOUTGET operation to the OPEN compound
- RDMA performance enhancements to deal with transport congestion
- Add proper SPDX tags for NetApp-contributed RDMA source
- Do not request delegated file attributes (size+change) from the server
- Optimise away a GETATTR in the lookup revalidate code when doing NFSv4 OPEN
- Optimise away unnecessary lookups for rename targets
- Misc performance improvements when freeing NFSv4 delegations

Bugfixes and cleanups:
- Try to fail quickly if proto=rdma
- Clean up RDMA receive trace points
- Fix sillyrename to return the delegation when appropriate
- Misc attribute revalidation fixes
- Immediately clear the pNFS layout on a file when the server returns ESTALE
- Return NFS4ERR_DELAY when delegation/layout recalls fail due to igrab()
- Fix the client behaviour on NFS4ERR_SEQ_FALSE_RETRY


Anna Schumaker (4):
  NFS: Move call to nfs4_state_protect_write() to nfs4_write_setup()
  NFS: Move call to nfs4_state_protect() to nfs4_commit_setup()
  NFS: Pass "privileged" value to nfs4_init_sequence()
  NFS: Merge nfs41_free_stateid() with _nfs41_free_stateid()

Benjamin Coddington (1):
  NFSv4: Don't add a new lock on an interrupted wait for LOCK

Chuck Lever (21):
  xprtrdma: Add proper SPDX tags for NetApp-contributed source
  xprtrdma: Try to fail quickly if proto=rdma
  xprtrdma: Create transport's CM ID in the correct network namespace
  xprtrdma: Fix max_send_wr computation
  SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  SUNRPC: Add a ->free_slot transport callout
  xprtrdma: Introduce ->alloc_slot call-out for xprtrdma
  xprtrdma: Make rpc_rqst part of rpcrdma_req
  xprtrdma: Clean up Receive trace points
  xprtrdma: Move Receive posting to Receive handler
  xprtrdma: Remove rpcrdma_ep_{post_recv, post_extra_recv}
  xprtrdma: Remove rpcrdma_buffer_get_req_locked()
  xprtrdma: Remove rpcrdma_buffer_get_rep_locked()
  xprtrdma: Make rpcrdma_sendctx_put_locked() a static function
  xprtrdma: Return -ENOBUFS when no pages are available
  xprtrdma: Move common wait_for_buffer_space call to parent function
  xprtrdma: Wait on empty sendctx queue
  xprtrdma: Add trace_xprtrdma_dma_map(mr)
  xprtrdma: Remove transfertypes array
  NFSv4.0: Remove cl_ipaddr from non-UCS client ID
  NFSv4.0: Remove transport protocol name from non-UCS client ID

Dave Wysochanski (1):
  NFSv4: Fix possible 1-byte stack overflow in 
nfs_idmap_read_and_verify_message

Fred Isaman (14):
  pnfs: Remove redundant assignment from nfs4_proc_layoutget().
  pnfs: Store return value of decode_layoutget for later processing
  NFS4: move ctx into nfs4_run_open_task
  pnfs: Add layout driver flag PNFS_LAYOUTGET_ON_OPEN
  pnfs: refactor send_layoutget
  pnfs: move allocations out of nfs4_proc_layoutget
  pnfs: Add conditional encode/decode of LAYOUTGET within OPEN compound
  pnfs: Move nfs4_opendata into nfs4_fs.h
  pnfs: Change pnfs_alloc_init_layoutget_args call signature
  pnfs: Add LAYOUTGET to OPEN of a new file
  pnfs: Add LAYOUTGET to OPEN of an existing file
  pnfs: Stop attempting LAYOUTGET on OPEN on failure
  pnfs: Add barrier to prevent lgopen using LAYOUTGET during recall
  pnfs: Fix manipulation of NFS_LAYOUT_FIRST_LAYOUTGET

NeilBrown (4):
  NFS: slight optimization for walking list for delegations
  NFS: use cond_resched() when restarting walk of delegation list.
  rculist: add list_for_each_entry_from_rcu()
  NFS: Avoid quadratic search when freeing delegations.

Olga Kornievskaia (1):
  skip LAYOUTRETURN if layout is invalid

Trond Myklebust (35):
  NFS: Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
  NFS: If the VFS sets LOOKUP_REVAL then force a lookup of the dentry
  NFS: Optimise away lookups for rename targets
  NFSv4: Only pass the delegation to setattr if we're sending a 

[GIT PULL] Please pull NFS client changes for 4.18

2018-06-12 Thread Trond Myklebust
Hi Linus,

The following changes since commit 786b71f5b754273ccef6d9462e52062b3e1f9877:

  Merge tag 'nds32-for-linus-4.17-fixes' of 
git://git.kernel.org/pub/scm/linux/kernel/git/greentime/linux (2018-05-28 
05:25:57 -0700)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.18-1

for you to fetch changes up to 93b7f7ad2018d2037559b1d0892417864c78b371:

  skip LAYOUTRETURN if layout is invalid (2018-06-12 08:48:04 -0400)

Cheers,
  Trond


NFS client updates for Linux 4.18

Highlights include:

Stable fixes:
- Fix a 1-byte stack overflow in nfs_idmap_read_and_verify_message
- Fix a hang due to incorrect error returns in rpcrdma_convert_iovs()
- Revert an incorrect change to the NFSv4.1 callback channel
- Fix a bug in the NFSv4.1 sequence error handling

Features and optimisations:
- Support for piggybacking a LAYOUTGET operation to the OPEN compound
- RDMA performance enhancements to deal with transport congestion
- Add proper SPDX tags for NetApp-contributed RDMA source
- Do not request delegated file attributes (size+change) from the server
- Optimise away a GETATTR in the lookup revalidate code when doing NFSv4 OPEN
- Optimise away unnecessary lookups for rename targets
- Misc performance improvements when freeing NFSv4 delegations

Bugfixes and cleanups:
- Try to fail quickly if proto=rdma
- Clean up RDMA receive trace points
- Fix sillyrename to return the delegation when appropriate
- Misc attribute revalidation fixes
- Immediately clear the pNFS layout on a file when the server returns ESTALE
- Return NFS4ERR_DELAY when delegation/layout recalls fail due to igrab()
- Fix the client behaviour on NFS4ERR_SEQ_FALSE_RETRY


Anna Schumaker (4):
  NFS: Move call to nfs4_state_protect_write() to nfs4_write_setup()
  NFS: Move call to nfs4_state_protect() to nfs4_commit_setup()
  NFS: Pass "privileged" value to nfs4_init_sequence()
  NFS: Merge nfs41_free_stateid() with _nfs41_free_stateid()

Benjamin Coddington (1):
  NFSv4: Don't add a new lock on an interrupted wait for LOCK

Chuck Lever (21):
  xprtrdma: Add proper SPDX tags for NetApp-contributed source
  xprtrdma: Try to fail quickly if proto=rdma
  xprtrdma: Create transport's CM ID in the correct network namespace
  xprtrdma: Fix max_send_wr computation
  SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
  SUNRPC: Add a ->free_slot transport callout
  xprtrdma: Introduce ->alloc_slot call-out for xprtrdma
  xprtrdma: Make rpc_rqst part of rpcrdma_req
  xprtrdma: Clean up Receive trace points
  xprtrdma: Move Receive posting to Receive handler
  xprtrdma: Remove rpcrdma_ep_{post_recv, post_extra_recv}
  xprtrdma: Remove rpcrdma_buffer_get_req_locked()
  xprtrdma: Remove rpcrdma_buffer_get_rep_locked()
  xprtrdma: Make rpcrdma_sendctx_put_locked() a static function
  xprtrdma: Return -ENOBUFS when no pages are available
  xprtrdma: Move common wait_for_buffer_space call to parent function
  xprtrdma: Wait on empty sendctx queue
  xprtrdma: Add trace_xprtrdma_dma_map(mr)
  xprtrdma: Remove transfertypes array
  NFSv4.0: Remove cl_ipaddr from non-UCS client ID
  NFSv4.0: Remove transport protocol name from non-UCS client ID

Dave Wysochanski (1):
  NFSv4: Fix possible 1-byte stack overflow in 
nfs_idmap_read_and_verify_message

Fred Isaman (14):
  pnfs: Remove redundant assignment from nfs4_proc_layoutget().
  pnfs: Store return value of decode_layoutget for later processing
  NFS4: move ctx into nfs4_run_open_task
  pnfs: Add layout driver flag PNFS_LAYOUTGET_ON_OPEN
  pnfs: refactor send_layoutget
  pnfs: move allocations out of nfs4_proc_layoutget
  pnfs: Add conditional encode/decode of LAYOUTGET within OPEN compound
  pnfs: Move nfs4_opendata into nfs4_fs.h
  pnfs: Change pnfs_alloc_init_layoutget_args call signature
  pnfs: Add LAYOUTGET to OPEN of a new file
  pnfs: Add LAYOUTGET to OPEN of an existing file
  pnfs: Stop attempting LAYOUTGET on OPEN on failure
  pnfs: Add barrier to prevent lgopen using LAYOUTGET during recall
  pnfs: Fix manipulation of NFS_LAYOUT_FIRST_LAYOUTGET

NeilBrown (4):
  NFS: slight optimization for walking list for delegations
  NFS: use cond_resched() when restarting walk of delegation list.
  rculist: add list_for_each_entry_from_rcu()
  NFS: Avoid quadratic search when freeing delegations.

Olga Kornievskaia (1):
  skip LAYOUTRETURN if layout is invalid

Trond Myklebust (35):
  NFS: Optimise away the close-to-open GETATTR when we have NFSv4 OPEN
  NFS: If the VFS sets LOOKUP_REVAL then force a lookup of the dentry
  NFS: Optimise away lookups for rename targets
  NFSv4: Only pass the delegation to setattr if we're sending a 

[GIT PULL] Please pull NFS client changes

2018-01-30 Thread Trond Myklebust
Hi Linus,

The following changes since commit a8750ddca918032d6349adbf9a4b6555e7db20da:

  Linux 4.15-rc8 (2018-01-14 15:32:30 -0800)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.16-1

for you to fetch changes up to e231c6879cfd44e4fffd384bb6dd7d313249a523:

  NFS: Fix a race between mmap() and O_DIRECT (2018-01-28 22:00:15 -0500)


NFS client updates for Linux 4.16

Highlights include:

Stable bugfixes:
- Fix breakages in the nfsstat utility due to the inclusion of the NFSv4
  LOOKUPP operation.
- Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall() due to
  nfs_idmap_legacy_upcall() being called without an 'aux' parameter.
- Fix a refcount leak in the standard O_DIRECT error path.
- Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path.
- Fix CPU latency issues with nfs_commit_release_pages()
- Fix the LAYOUTUNAVAILABLE error case in the file layout type.
- NFS: Fix a race between mmap() and O_DIRECT

Features:
- Support the statx() mask and query flags to enable optimisations when
  the user is requesting only attributes that are already up to date in
  the inode cache, or is specifying the AT_STATX_DONT_SYNC flag.
- Add a module alias for the SCSI pNFS layout type.

Bugfixes:
- Automounting when resolving a NFSv4 referral should preserve the RDMA
  transport protocol settings.
- Various other RDMA bugfixes from Chuck.
- pNFS block layout fixes.
- Always set NFS_LOCK_LOST when a lock is lost.


Arnd Bergmann (1):
  nfs: remove unused label in nfs_encode_fh()

Benjamin Coddington (7):
  NFS: remove unused offset arg in nfs_pgio_rpcsetup
  pnfs/blocklayout: Add module alias for LAYOUT4_SCSI
  pnfs/blocklayout: set PNFS_LAYOUTRETURN_ON_ERROR
  pnfs/blocklayout: handle transient devices
  SUNRPC: Fix null rpc_clnt dereference in rpc_task_queued tracepoint
  pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors
  pnfs/blocklayout: Ensure disk address in block device map

Chuck Lever (39):
  nfs: Define NFS_RDMA_PORT
  nfs: Referrals should use the same proto setting as their parent
  nfs: Update server port after referral or migration
  SUNRPC: Remove rpc_protocol()
  xprtrdma: Fix buffer leak after transport set up failure
  xprtrdma: Fix backchannel allocation of extra rpcrdma_reps
  xprtrdma: Eliminate unnecessary lock cycle in xprt_rdma_send_request
  xprtrdma: Per-mode handling for Remote Invalidation
  xprtrdma: Remove ri_reminv_expected
  xprtrdma: Remove unused padding variables
  xprtrdma: Initialize the xprt address string array earlier
  xprtrdma: Remove another sockaddr_storage field (cdata::addr)
  xprtrdma: Support IPv6 in xprt_rdma_set_port
  xprtrdma: Move unmap-safe logic to rpcrdma_marshal_req
  xprtrdma: buf_free not called for CB replies
  xprtrdma: Split xprt_rdma_send_request
  xprtrdma: Don't clear RPC_BC_PA_IN_USE on pre-allocated rpc_rqst's
  xprtrdma: Replace all usage of "frmr" with "frwr"
  xprtrdma: Remove usage of "mw"
  xprtrdma: Introduce rpcrdma_mw_unmap_and_put
  nfs: Use proper enum definitions for nfs_show_stable
  rdma/ib: Add trace point macros to display human-readable values
  rpcrdma: infrastructure for static trace points in rpcrdma.ko
  xprtrdma: Add trace points in RPC Call transmit paths
  xprtrdma: Add trace points in the RPC Reply handler paths
  xprtrdma: Add trace points to instrument memory registration
  xprtrdma: Add trace points in reply decoder path
  xprtrdma: Add trace points to instrument memory invalidation
  xprtrdma: Add trace points to instrument MR allocation and recovery
  xprtrdma: Add trace points for connect events
  xprtrdma: Add trace points in the client-side backchannel code paths
  xprtrdma: Add trace points to instrument QP and CQ access upcalls
  xprtrdma: Instrument allocation/release of rpcrdma_req/rep objects
  xprtrdma: Fix "bytes registered" accounting
  xprtrdma: Correct some documenting comments
  SUNRPC: Trace xprt_timer events
  sunrpc: Format RPC events consistently for display
  SUNRPC: task_run_action should display tk_callback
  SUNRPC: Micro-optimize __rpc_execute

Elena Reshetova (4):
  lockd: convert nlm_host.h_count from atomic_t to refcount_t
  lockd: convert nsm_handle.sm_count from atomic_t to refcount_t
  lockd: convert nlm_lockowner.count from atomic_t to refcount_t
  lockd: convert nlm_rqst.a_count from atomic_t to refcount_t

Eric Biggers (1):
  NFS: reject request for id_legacy key without auxdata

J. Bruce Fields (1):
  NFS: commit direct writes even if they fail partially

Jan Chochol (1):
  nfs: Do not convert nfs_idmap_cache_timeout to jiffies

NeilBrown (2):
  nfs: 

[GIT PULL] Please pull NFS client changes

2018-01-30 Thread Trond Myklebust
Hi Linus,

The following changes since commit a8750ddca918032d6349adbf9a4b6555e7db20da:

  Linux 4.15-rc8 (2018-01-14 15:32:30 -0800)

are available in the Git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.16-1

for you to fetch changes up to e231c6879cfd44e4fffd384bb6dd7d313249a523:

  NFS: Fix a race between mmap() and O_DIRECT (2018-01-28 22:00:15 -0500)


NFS client updates for Linux 4.16

Highlights include:

Stable bugfixes:
- Fix breakages in the nfsstat utility due to the inclusion of the NFSv4
  LOOKUPP operation.
- Fix a NULL pointer dereference in nfs_idmap_prepare_pipe_upcall() due to
  nfs_idmap_legacy_upcall() being called without an 'aux' parameter.
- Fix a refcount leak in the standard O_DIRECT error path.
- Fix a refcount leak in the pNFS O_DIRECT fallback to MDS path.
- Fix CPU latency issues with nfs_commit_release_pages()
- Fix the LAYOUTUNAVAILABLE error case in the file layout type.
- NFS: Fix a race between mmap() and O_DIRECT

Features:
- Support the statx() mask and query flags to enable optimisations when
  the user is requesting only attributes that are already up to date in
  the inode cache, or is specifying the AT_STATX_DONT_SYNC flag.
- Add a module alias for the SCSI pNFS layout type.

Bugfixes:
- Automounting when resolving a NFSv4 referral should preserve the RDMA
  transport protocol settings.
- Various other RDMA bugfixes from Chuck.
- pNFS block layout fixes.
- Always set NFS_LOCK_LOST when a lock is lost.


Arnd Bergmann (1):
  nfs: remove unused label in nfs_encode_fh()

Benjamin Coddington (7):
  NFS: remove unused offset arg in nfs_pgio_rpcsetup
  pnfs/blocklayout: Add module alias for LAYOUT4_SCSI
  pnfs/blocklayout: set PNFS_LAYOUTRETURN_ON_ERROR
  pnfs/blocklayout: handle transient devices
  SUNRPC: Fix null rpc_clnt dereference in rpc_task_queued tracepoint
  pnfs/blocklayout: pnfs_block_dev_map uses bytes, not sectors
  pnfs/blocklayout: Ensure disk address in block device map

Chuck Lever (39):
  nfs: Define NFS_RDMA_PORT
  nfs: Referrals should use the same proto setting as their parent
  nfs: Update server port after referral or migration
  SUNRPC: Remove rpc_protocol()
  xprtrdma: Fix buffer leak after transport set up failure
  xprtrdma: Fix backchannel allocation of extra rpcrdma_reps
  xprtrdma: Eliminate unnecessary lock cycle in xprt_rdma_send_request
  xprtrdma: Per-mode handling for Remote Invalidation
  xprtrdma: Remove ri_reminv_expected
  xprtrdma: Remove unused padding variables
  xprtrdma: Initialize the xprt address string array earlier
  xprtrdma: Remove another sockaddr_storage field (cdata::addr)
  xprtrdma: Support IPv6 in xprt_rdma_set_port
  xprtrdma: Move unmap-safe logic to rpcrdma_marshal_req
  xprtrdma: buf_free not called for CB replies
  xprtrdma: Split xprt_rdma_send_request
  xprtrdma: Don't clear RPC_BC_PA_IN_USE on pre-allocated rpc_rqst's
  xprtrdma: Replace all usage of "frmr" with "frwr"
  xprtrdma: Remove usage of "mw"
  xprtrdma: Introduce rpcrdma_mw_unmap_and_put
  nfs: Use proper enum definitions for nfs_show_stable
  rdma/ib: Add trace point macros to display human-readable values
  rpcrdma: infrastructure for static trace points in rpcrdma.ko
  xprtrdma: Add trace points in RPC Call transmit paths
  xprtrdma: Add trace points in the RPC Reply handler paths
  xprtrdma: Add trace points to instrument memory registration
  xprtrdma: Add trace points in reply decoder path
  xprtrdma: Add trace points to instrument memory invalidation
  xprtrdma: Add trace points to instrument MR allocation and recovery
  xprtrdma: Add trace points for connect events
  xprtrdma: Add trace points in the client-side backchannel code paths
  xprtrdma: Add trace points to instrument QP and CQ access upcalls
  xprtrdma: Instrument allocation/release of rpcrdma_req/rep objects
  xprtrdma: Fix "bytes registered" accounting
  xprtrdma: Correct some documenting comments
  SUNRPC: Trace xprt_timer events
  sunrpc: Format RPC events consistently for display
  SUNRPC: task_run_action should display tk_callback
  SUNRPC: Micro-optimize __rpc_execute

Elena Reshetova (4):
  lockd: convert nlm_host.h_count from atomic_t to refcount_t
  lockd: convert nsm_handle.sm_count from atomic_t to refcount_t
  lockd: convert nlm_lockowner.count from atomic_t to refcount_t
  lockd: convert nlm_rqst.a_count from atomic_t to refcount_t

Eric Biggers (1):
  NFS: reject request for id_legacy key without auxdata

J. Bruce Fields (1):
  NFS: commit direct writes even if they fail partially

Jan Chochol (1):
  nfs: Do not convert nfs_idmap_cache_timeout to jiffies

NeilBrown (2):
  nfs: 

[GIT PULL] Please pull NFS client changes

2017-10-09 Thread Trond Myklebust
Hi Linus,

The following changes since commit 9e66317d3c92ddaab330c125dfe9d06eee268aff:

  Linux 4.14-rc3 (2017-10-01 14:54:54 -0700)

are available in the git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.14-3

for you to fetch changes up to e8fa33a6f6c7688591542db955794b69b8cecc55:

  NFSv4/pnfs: Fix an infinite layoutget loop (2017-10-04 14:06:54 -0400)


NFS client bugfixes for Linux 4.14

Hightlights include:

stable fixes:
- nfs/filelayout: fix oops when freeing filelayout segment
- NFS: Fix uninitialized rpc_wait_queue

bugfixes:
- NFSv4/pnfs: Fix an infinite layoutget loop
- nfs: RPC_MAX_AUTH_SIZE is in bytes


Benjamin Coddington (1):
  NFS: Fix uninitialized rpc_wait_queue

Colin Ian King (1):
  sunrpc: remove redundant initialization of sock

Dan Carpenter (1):
  NFS: Cleanup error handling in nfs_idmap_request_key()

J. Bruce Fields (1):
  nfs: RPC_MAX_AUTH_SIZE is in bytes

Scott Mayhew (1):
  nfs/filelayout: fix oops when freeing filelayout segment

Trond Myklebust (1):
  NFSv4/pnfs: Fix an infinite layoutget loop

 fs/nfs/client.c| 2 +-
 fs/nfs/filelayout/filelayout.c | 3 ++-
 fs/nfs/nfs4idmap.c | 2 +-
 fs/nfs/nfs4proc.c  | 3 +--
 fs/nfs/nfs4xdr.c   | 4 ++--
 net/sunrpc/xprtsock.c  | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


[GIT PULL] Please pull NFS client changes

2017-10-09 Thread Trond Myklebust
Hi Linus,

The following changes since commit 9e66317d3c92ddaab330c125dfe9d06eee268aff:

  Linux 4.14-rc3 (2017-10-01 14:54:54 -0700)

are available in the git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.14-3

for you to fetch changes up to e8fa33a6f6c7688591542db955794b69b8cecc55:

  NFSv4/pnfs: Fix an infinite layoutget loop (2017-10-04 14:06:54 -0400)


NFS client bugfixes for Linux 4.14

Hightlights include:

stable fixes:
- nfs/filelayout: fix oops when freeing filelayout segment
- NFS: Fix uninitialized rpc_wait_queue

bugfixes:
- NFSv4/pnfs: Fix an infinite layoutget loop
- nfs: RPC_MAX_AUTH_SIZE is in bytes


Benjamin Coddington (1):
  NFS: Fix uninitialized rpc_wait_queue

Colin Ian King (1):
  sunrpc: remove redundant initialization of sock

Dan Carpenter (1):
  NFS: Cleanup error handling in nfs_idmap_request_key()

J. Bruce Fields (1):
  nfs: RPC_MAX_AUTH_SIZE is in bytes

Scott Mayhew (1):
  nfs/filelayout: fix oops when freeing filelayout segment

Trond Myklebust (1):
  NFSv4/pnfs: Fix an infinite layoutget loop

 fs/nfs/client.c| 2 +-
 fs/nfs/filelayout/filelayout.c | 3 ++-
 fs/nfs/nfs4idmap.c | 2 +-
 fs/nfs/nfs4proc.c  | 3 +--
 fs/nfs/nfs4xdr.c   | 4 ++--
 net/sunrpc/xprtsock.c  | 2 +-
 6 files changed, 8 insertions(+), 8 deletions(-)
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


[GIT PULL] Please pull NFS client changes for 4.14

2017-09-11 Thread Trond Myklebust
Hi Linus,

The following changes since commit ef954844c7ace62f773f4f23e28d2d915adc419f:

  Linux 4.13-rc5 (2017-08-13 16:01:32 -0700)

are available in the git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.14-1

for you to fetch changes up to 1bd5d6d08ea7ed0794c8a3908383d6d6fc202cdd:

  NFS: Count the bytes of skipped subrequests in nfs_lock_and_join_requests() 
(2017-09-09 16:43:09 -0400)


NFS client updates for Linux 4.14

Hightlights include:

Stable bugfixes:
- Fix mirror allocation in the writeback code to avoid a use after free
- Fix the O_DSYNC writes to use the correct byte range
- Fix 2 use after free issues in the I/O code

Features:
- Writeback fixes to split up the inode->i_lock in order to reduce contention
- RPC client receive fixes to reduce the amount of time the
  xprt->transport_lock is held when receiving data from a socket into am
  XDR buffer.
- Ditto fixes to reduce contention between call side users of the rdma
  rb_lock, and its use in rpcrdma_reply_handler.
- Re-arrange rdma stats to reduce false cacheline sharing.
- Various rdma cleanups and optimisations.
- Refactor the NFSv4.1 exchange id code and clean up the code.
- Const-ify all instances of struct rpc_xprt_ops

Bugfixes:
- Fix the NFSv2 'sec=' mount option.
- NFSv4.1: don't use machine credentials for CLOSE when using 'sec=sys'
- Fix the NFSv3 GRANT callback when the port changes on the server.
- Fix livelock issues with COMMIT
- NFSv4: Use correct inode in _nfs4_opendata_to_nfs4_state() when doing
  and NFSv4.1 open by filehandle.


Chuck Lever (18):
  sunrpc: Const-ify all instances of struct rpc_xprt_ops
  xprtrdma: Add xdr_init_decode to rpcrdma_reply_handler()
  xprtrdma: Harden backchannel call decoding
  xprtrdma: Refactor rpcrdma_reply_handler()
  xprtrdma: Replace rpcrdma_count_chunks()
  xprtrdma: Remove opcode check in Receive completion handler
  xprtrdma: Remove rpcrdma_rep::rr_len
  xprtrdma: Clean up XDR decoding in rpcrdma_update_granted_credits()
  xprtrdma: Clean up rpcrdma_marshal_req() synopsis
  xprtrdma: Remove rpclen from rpcrdma_marshal_req
  xprtrdma: Set up an xdr_stream in rpcrdma_marshal_req()
  xprtrdma: Harden chunk list encoding against send buffer overflow
  xprtrdma: Clean up rpcrdma_bc_marshal_reply()
  xprtrdma: Remove imul instructions from rpcrdma_convert_iovs()
  xprtrdma: Remove imul instructions from chunk list encoders
  NFS: Fix NFSv2 security settings
  xprtrdma: Re-arrange struct rx_stats
  xprtrdma: Use xprt_pin_rqst in rpcrdma_reply_handler

Markus Elfring (1):
  lockd: Delete an error message for a failed memory allocation in 
reclaimer()

NeilBrown (6):
  SUNRPC: ECONNREFUSED should cause a rebind.
  NFSv4.1: don't use machine credentials for CLOSE when using 'sec=sys'
  NFS: don't expect errors from mempool_alloc().
  SUNRPC: remove some dead code.
  NFS: flush data when locking a file to ensure cache coherence for mmap.
  NFS: remove jiffies field from access cache

Trond Myklebust (48):
  NFSv4: Refactor _nfs4_proc_exchange_id()
  NFSv4.1: Ensure we clear the SP4_MACH_CRED flags in nfs4_sp4_select_mode()
  NFSv4: Cleanup setting of the migration flags.
  Merge branch 'exchange_id'
  NFSv4: Use correct inode in _nfs4_opendata_to_nfs4_state()
  NFSv4: Use the nfs4_state being recovered in 
_nfs4_opendata_to_nfs4_state()
  Merge branch 'open_state'
  NFS: Simplify page writeback
  NFS: Reduce lock contention in nfs_page_find_head_request()
  NFS: Reduce lock contention in nfs_try_to_update_request()
  NFS: Ensure we always dereference the page head last
  NFS: Fix a reference and lock leak in nfs_lock_and_join_requests()
  NFS: Fix an ABBA issue in nfs_lock_and_join_requests()
  NFS: Don't check request offset and size without holding a lock
  NFS: Don't unlock writebacks before declaring PG_WB_END
  NFS: Fix the inode request accounting when pages have subrequests
  NFS: Teach nfs_try_to_update_request() to deal with request page_groups
  NFS: Remove page group limit in nfs_flush_incompatible()
  NFS: Reduce inode->i_lock contention in nfs_lock_and_join_requests()
  NFS: Further optimise nfs_lock_and_join_requests()
  NFS: Fix nfs_page_group_destroy() and nfs_lock_and_join_requests() race 
cases
  NFS: Remove nfs_page_group_clear_bits()
  NFS: Remove unuse function nfs_page_group_lock_wait()
  NFS: Remove unused parameter from nfs_page_group_lock()
  NFS: Fix up nfs_page_group_covers_page()
  NFSv4: Convert nfs_lock_and_join_requests() to use 
nfs_page_find_head_request()
  NFS: Refactor nfs_page_find_head_request()
  NFSv4: Use a mutex to protect the per-inode commit lists
  NFS: Use an 

[GIT PULL] Please pull NFS client changes for 4.14

2017-09-11 Thread Trond Myklebust
Hi Linus,

The following changes since commit ef954844c7ace62f773f4f23e28d2d915adc419f:

  Linux 4.13-rc5 (2017-08-13 16:01:32 -0700)

are available in the git repository at:

  git://git.linux-nfs.org/projects/trondmy/linux-nfs.git tags/nfs-for-4.14-1

for you to fetch changes up to 1bd5d6d08ea7ed0794c8a3908383d6d6fc202cdd:

  NFS: Count the bytes of skipped subrequests in nfs_lock_and_join_requests() 
(2017-09-09 16:43:09 -0400)


NFS client updates for Linux 4.14

Hightlights include:

Stable bugfixes:
- Fix mirror allocation in the writeback code to avoid a use after free
- Fix the O_DSYNC writes to use the correct byte range
- Fix 2 use after free issues in the I/O code

Features:
- Writeback fixes to split up the inode->i_lock in order to reduce contention
- RPC client receive fixes to reduce the amount of time the
  xprt->transport_lock is held when receiving data from a socket into am
  XDR buffer.
- Ditto fixes to reduce contention between call side users of the rdma
  rb_lock, and its use in rpcrdma_reply_handler.
- Re-arrange rdma stats to reduce false cacheline sharing.
- Various rdma cleanups and optimisations.
- Refactor the NFSv4.1 exchange id code and clean up the code.
- Const-ify all instances of struct rpc_xprt_ops

Bugfixes:
- Fix the NFSv2 'sec=' mount option.
- NFSv4.1: don't use machine credentials for CLOSE when using 'sec=sys'
- Fix the NFSv3 GRANT callback when the port changes on the server.
- Fix livelock issues with COMMIT
- NFSv4: Use correct inode in _nfs4_opendata_to_nfs4_state() when doing
  and NFSv4.1 open by filehandle.


Chuck Lever (18):
  sunrpc: Const-ify all instances of struct rpc_xprt_ops
  xprtrdma: Add xdr_init_decode to rpcrdma_reply_handler()
  xprtrdma: Harden backchannel call decoding
  xprtrdma: Refactor rpcrdma_reply_handler()
  xprtrdma: Replace rpcrdma_count_chunks()
  xprtrdma: Remove opcode check in Receive completion handler
  xprtrdma: Remove rpcrdma_rep::rr_len
  xprtrdma: Clean up XDR decoding in rpcrdma_update_granted_credits()
  xprtrdma: Clean up rpcrdma_marshal_req() synopsis
  xprtrdma: Remove rpclen from rpcrdma_marshal_req
  xprtrdma: Set up an xdr_stream in rpcrdma_marshal_req()
  xprtrdma: Harden chunk list encoding against send buffer overflow
  xprtrdma: Clean up rpcrdma_bc_marshal_reply()
  xprtrdma: Remove imul instructions from rpcrdma_convert_iovs()
  xprtrdma: Remove imul instructions from chunk list encoders
  NFS: Fix NFSv2 security settings
  xprtrdma: Re-arrange struct rx_stats
  xprtrdma: Use xprt_pin_rqst in rpcrdma_reply_handler

Markus Elfring (1):
  lockd: Delete an error message for a failed memory allocation in 
reclaimer()

NeilBrown (6):
  SUNRPC: ECONNREFUSED should cause a rebind.
  NFSv4.1: don't use machine credentials for CLOSE when using 'sec=sys'
  NFS: don't expect errors from mempool_alloc().
  SUNRPC: remove some dead code.
  NFS: flush data when locking a file to ensure cache coherence for mmap.
  NFS: remove jiffies field from access cache

Trond Myklebust (48):
  NFSv4: Refactor _nfs4_proc_exchange_id()
  NFSv4.1: Ensure we clear the SP4_MACH_CRED flags in nfs4_sp4_select_mode()
  NFSv4: Cleanup setting of the migration flags.
  Merge branch 'exchange_id'
  NFSv4: Use correct inode in _nfs4_opendata_to_nfs4_state()
  NFSv4: Use the nfs4_state being recovered in 
_nfs4_opendata_to_nfs4_state()
  Merge branch 'open_state'
  NFS: Simplify page writeback
  NFS: Reduce lock contention in nfs_page_find_head_request()
  NFS: Reduce lock contention in nfs_try_to_update_request()
  NFS: Ensure we always dereference the page head last
  NFS: Fix a reference and lock leak in nfs_lock_and_join_requests()
  NFS: Fix an ABBA issue in nfs_lock_and_join_requests()
  NFS: Don't check request offset and size without holding a lock
  NFS: Don't unlock writebacks before declaring PG_WB_END
  NFS: Fix the inode request accounting when pages have subrequests
  NFS: Teach nfs_try_to_update_request() to deal with request page_groups
  NFS: Remove page group limit in nfs_flush_incompatible()
  NFS: Reduce inode->i_lock contention in nfs_lock_and_join_requests()
  NFS: Further optimise nfs_lock_and_join_requests()
  NFS: Fix nfs_page_group_destroy() and nfs_lock_and_join_requests() race 
cases
  NFS: Remove nfs_page_group_clear_bits()
  NFS: Remove unuse function nfs_page_group_lock_wait()
  NFS: Remove unused parameter from nfs_page_group_lock()
  NFS: Fix up nfs_page_group_covers_page()
  NFSv4: Convert nfs_lock_and_join_requests() to use 
nfs_page_find_head_request()
  NFS: Refactor nfs_page_find_head_request()
  NFSv4: Use a mutex to protect the per-inode commit lists
  NFS: Use an 

Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Trond Myklebust
On Tue, 2017-08-01 at 13:50 -0400, da...@codemonkey.org.uk wrote:
> On Tue, Aug 01, 2017 at 10:20:31AM -0700, Linus Torvalds wrote:
> 
>  > So I think the 'pathname' part may actually be entirely a red
> herring,
>  > and it's the underlying access itself that just picks up a random
>  > pointer from a stack that now contains something different. And
> KASAN
>  > didn't notice the stale stack access itself, because the stack
> slot is
>  > still valid - it's just no longer the original 'verifier'
> allocation.
>  > 
>  > Or *something* like that.
>  > 
>  > None of this looks even remotely new, though - the code seems to
> go
>  > back to 2009. Have you just changed what you're testing to trigger
>  > these things?
> 
> No idea why it only just showed up, but it isn't 100% reproducable
> either.  A month or so ago I did disable the V4 code on the server
> completely (as I was using v3 everywhere else), so maybe I started
> hitting
> a fallback path somewhere.  *shrug*
> 

I would only expect you too see it if you interrupt the wait on the
asynchronous EXCHANGE_ID call (which would allow the RPC call to
continue while the caller stack is trashed). Prior to commit
8d89bd70bc939, that code path was fully synchronous, so there was no
issue with interrupting the call.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Trond Myklebust
On Tue, 2017-08-01 at 13:50 -0400, da...@codemonkey.org.uk wrote:
> On Tue, Aug 01, 2017 at 10:20:31AM -0700, Linus Torvalds wrote:
> 
>  > So I think the 'pathname' part may actually be entirely a red
> herring,
>  > and it's the underlying access itself that just picks up a random
>  > pointer from a stack that now contains something different. And
> KASAN
>  > didn't notice the stale stack access itself, because the stack
> slot is
>  > still valid - it's just no longer the original 'verifier'
> allocation.
>  > 
>  > Or *something* like that.
>  > 
>  > None of this looks even remotely new, though - the code seems to
> go
>  > back to 2009. Have you just changed what you're testing to trigger
>  > these things?
> 
> No idea why it only just showed up, but it isn't 100% reproducable
> either.  A month or so ago I did disable the V4 code on the server
> completely (as I was using v3 everywhere else), so maybe I started
> hitting
> a fallback path somewhere.  *shrug*
> 

I would only expect you too see it if you interrupt the wait on the
asynchronous EXCHANGE_ID call (which would allow the RPC call to
continue while the caller stack is trashed). Prior to commit
8d89bd70bc939, that code path was fully synchronous, so there was no
issue with interrupting the call.

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Linus Torvalds
On Tue, Aug 1, 2017 at 10:20 AM, Linus Torvalds
 wrote:
>
> So I think the 'pathname' part may actually be entirely a red herring,
> and it's the underlying access itself that just picks up a random
> pointer from a stack that now contains something different. And KASAN
> didn't notice the stale stack access itself, because the stack slot is
> still valid - it's just no longer the original 'verifier' allocation.
>
> Or *something* like that.

I think the "something like that" is actually just reading the
cdata->args.verifier->data pointer itself, and it *is* the stack
access - but the stack page has been free'd (because of the same fatal
signal that interrupted the rpc_wait_for_completion_task() call), and
then re-allocated (and free'd again) as a pathname page.

Maybe.

Regardless, my patch still looks conceptually correct, even if it
might have bugs due to total lack of testing.

Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Linus Torvalds
On Tue, Aug 1, 2017 at 10:20 AM, Linus Torvalds
 wrote:
>
> So I think the 'pathname' part may actually be entirely a red herring,
> and it's the underlying access itself that just picks up a random
> pointer from a stack that now contains something different. And KASAN
> didn't notice the stale stack access itself, because the stack slot is
> still valid - it's just no longer the original 'verifier' allocation.
>
> Or *something* like that.

I think the "something like that" is actually just reading the
cdata->args.verifier->data pointer itself, and it *is* the stack
access - but the stack page has been free'd (because of the same fatal
signal that interrupted the rpc_wait_for_completion_task() call), and
then re-allocated (and free'd again) as a pathname page.

Maybe.

Regardless, my patch still looks conceptually correct, even if it
might have bugs due to total lack of testing.

Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread da...@codemonkey.org.uk
On Tue, Aug 01, 2017 at 10:20:31AM -0700, Linus Torvalds wrote:

 > So I think the 'pathname' part may actually be entirely a red herring,
 > and it's the underlying access itself that just picks up a random
 > pointer from a stack that now contains something different. And KASAN
 > didn't notice the stale stack access itself, because the stack slot is
 > still valid - it's just no longer the original 'verifier' allocation.
 > 
 > Or *something* like that.
 > 
 > None of this looks even remotely new, though - the code seems to go
 > back to 2009. Have you just changed what you're testing to trigger
 > these things?

No idea why it only just showed up, but it isn't 100% reproducable
either.  A month or so ago I did disable the V4 code on the server
completely (as I was using v3 everywhere else), so maybe I started hitting
a fallback path somewhere.  *shrug*

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread da...@codemonkey.org.uk
On Tue, Aug 01, 2017 at 10:20:31AM -0700, Linus Torvalds wrote:

 > So I think the 'pathname' part may actually be entirely a red herring,
 > and it's the underlying access itself that just picks up a random
 > pointer from a stack that now contains something different. And KASAN
 > didn't notice the stale stack access itself, because the stack slot is
 > still valid - it's just no longer the original 'verifier' allocation.
 > 
 > Or *something* like that.
 > 
 > None of this looks even remotely new, though - the code seems to go
 > back to 2009. Have you just changed what you're testing to trigger
 > these things?

No idea why it only just showed up, but it isn't 100% reproducable
either.  A month or so ago I did disable the V4 code on the server
completely (as I was using v3 everywhere else), so maybe I started hitting
a fallback path somewhere.  *shrug*

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Trond Myklebust
On Tue, 2017-08-01 at 10:20 -0700, Linus Torvalds wrote:
> On Tue, Aug 1, 2017 at 8:51 AM, da...@codemonkey.org.uk
>  wrote:
> > On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
> >  > Any chance of getting the output from
> >  >
> >  >./scripts/faddr2line vmlinux
> > nfs4_exchange_id_done+0x3d7/0x8e0
> > 
> > 
> > Hm, that points to this..
> > 
> > 7463 /* Save the EXCHANGE_ID verifier session trunk
> > tests */
> > 7464 memcpy(clp->cl_confirm.data, cdata-
> > >args.verifier->data,
> > 7465sizeof(clp->cl_confirm.data));
> 
> Ok, that certainly made no sense to me, because the KASAN report made
> it look like a stale pathname access (allocated in getname, freed in
> putname), but I think the issue is more fundamental than that.
> 
> That cdata->args.verifier seems to be entirely broken. AT least for
> the "xprt == NULL" case, it does the following:
> 
>  - use the address of a local variable ("")
> 
>  - wait for the rpc completion using rpc_wait_for_completion_task().
> 
> That's unacceptably buggy crap. rpc_wait_for_completion_task() will
> happily exit on a deadly signal even if the rpc hasn't been
> completed,
> so now you'll have a stale pointer to a stack that has been freed.
> 
> So I think the 'pathname' part may actually be entirely a red
> herring,
> and it's the underlying access itself that just picks up a random
> pointer from a stack that now contains something different. And KASAN
> didn't notice the stale stack access itself, because the stack slot
> is
> still valid - it's just no longer the original 'verifier' allocation.
> 
> Or *something* like that.
> 
> None of this looks even remotely new, though - the code seems to go
> back to 2009. Have you just changed what you're testing to trigger
> these things?
> 
> I'm not even sure why it does that stupid stack allocation. It does a
> *real* allocation just a few lines later:
> 
> struct nfs41_exchange_id_data *calldata
> ...
> calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
> 
> and the whole verifier structure could easily have been part of that
> same allocation as far as I can tell.
> 
> And that really might seem to be the right thing to do.
> 
> TOTALLY UNTESTED PROBABLY COMPLETE CRAP patch attatched.
> 
> That patch compiles for me. It *might* even work. Or it might just be
> the ramblings of a diseased mind.
> 
> Anna? Trond?
> 

I came to the same conclusion yesterday, and have a stable patch that
does something similar. I just got distracted with the other bugs that
were introduced by the exchangeid patch series in Linux-4.9 (including
what looks like a duplicate free issue in nfs4_test_session_trunk()).

I can pass a few of the more critical patches on to Anna for merging in
this cycle, then I've got some clean ups ready for the 4.14 merge
window.

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Trond Myklebust
On Tue, 2017-08-01 at 10:20 -0700, Linus Torvalds wrote:
> On Tue, Aug 1, 2017 at 8:51 AM, da...@codemonkey.org.uk
>  wrote:
> > On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
> >  > Any chance of getting the output from
> >  >
> >  >./scripts/faddr2line vmlinux
> > nfs4_exchange_id_done+0x3d7/0x8e0
> > 
> > 
> > Hm, that points to this..
> > 
> > 7463 /* Save the EXCHANGE_ID verifier session trunk
> > tests */
> > 7464 memcpy(clp->cl_confirm.data, cdata-
> > >args.verifier->data,
> > 7465sizeof(clp->cl_confirm.data));
> 
> Ok, that certainly made no sense to me, because the KASAN report made
> it look like a stale pathname access (allocated in getname, freed in
> putname), but I think the issue is more fundamental than that.
> 
> That cdata->args.verifier seems to be entirely broken. AT least for
> the "xprt == NULL" case, it does the following:
> 
>  - use the address of a local variable ("")
> 
>  - wait for the rpc completion using rpc_wait_for_completion_task().
> 
> That's unacceptably buggy crap. rpc_wait_for_completion_task() will
> happily exit on a deadly signal even if the rpc hasn't been
> completed,
> so now you'll have a stale pointer to a stack that has been freed.
> 
> So I think the 'pathname' part may actually be entirely a red
> herring,
> and it's the underlying access itself that just picks up a random
> pointer from a stack that now contains something different. And KASAN
> didn't notice the stale stack access itself, because the stack slot
> is
> still valid - it's just no longer the original 'verifier' allocation.
> 
> Or *something* like that.
> 
> None of this looks even remotely new, though - the code seems to go
> back to 2009. Have you just changed what you're testing to trigger
> these things?
> 
> I'm not even sure why it does that stupid stack allocation. It does a
> *real* allocation just a few lines later:
> 
> struct nfs41_exchange_id_data *calldata
> ...
> calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
> 
> and the whole verifier structure could easily have been part of that
> same allocation as far as I can tell.
> 
> And that really might seem to be the right thing to do.
> 
> TOTALLY UNTESTED PROBABLY COMPLETE CRAP patch attatched.
> 
> That patch compiles for me. It *might* even work. Or it might just be
> the ramblings of a diseased mind.
> 
> Anna? Trond?
> 

I came to the same conclusion yesterday, and have a stable patch that
does something similar. I just got distracted with the other bugs that
were introduced by the exchangeid patch series in Linux-4.9 (including
what looks like a duplicate free issue in nfs4_test_session_trunk()).

I can pass a few of the more critical patches on to Anna for merging in
this cycle, then I've got some clean ups ready for the 4.14 merge
window.

Cheers
  Trond

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Linus Torvalds
On Tue, Aug 1, 2017 at 8:51 AM, da...@codemonkey.org.uk
 wrote:
> On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
>  > Any chance of getting the output from
>  >
>  >./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
>
>
> Hm, that points to this..
>
> 7463 /* Save the EXCHANGE_ID verifier session trunk tests */
> 7464 memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
> 7465sizeof(clp->cl_confirm.data));

Ok, that certainly made no sense to me, because the KASAN report made
it look like a stale pathname access (allocated in getname, freed in
putname), but I think the issue is more fundamental than that.

That cdata->args.verifier seems to be entirely broken. AT least for
the "xprt == NULL" case, it does the following:

 - use the address of a local variable ("")

 - wait for the rpc completion using rpc_wait_for_completion_task().

That's unacceptably buggy crap. rpc_wait_for_completion_task() will
happily exit on a deadly signal even if the rpc hasn't been completed,
so now you'll have a stale pointer to a stack that has been freed.

So I think the 'pathname' part may actually be entirely a red herring,
and it's the underlying access itself that just picks up a random
pointer from a stack that now contains something different. And KASAN
didn't notice the stale stack access itself, because the stack slot is
still valid - it's just no longer the original 'verifier' allocation.

Or *something* like that.

None of this looks even remotely new, though - the code seems to go
back to 2009. Have you just changed what you're testing to trigger
these things?

I'm not even sure why it does that stupid stack allocation. It does a
*real* allocation just a few lines later:

struct nfs41_exchange_id_data *calldata
...
calldata = kzalloc(sizeof(*calldata), GFP_NOFS);

and the whole verifier structure could easily have been part of that
same allocation as far as I can tell.

And that really might seem to be the right thing to do.

TOTALLY UNTESTED PROBABLY COMPLETE CRAP patch attatched.

That patch compiles for me. It *might* even work. Or it might just be
the ramblings of a diseased mind.

Anna? Trond?

So caveat probatorem,

  Linus
 fs/nfs/nfs4proc.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 18ca6879d8de..0712af3d38f8 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7490,6 +7490,11 @@ static const struct rpc_call_ops 
nfs4_exchange_id_call_ops = {
.rpc_release = nfs4_exchange_id_release,
 };
 
+struct verifier_and_calldata {
+   struct nfs41_exchange_id_data calldata;
+   nfs4_verifier verifier;
+};
+
 /*
  * _nfs4_proc_exchange_id()
  *
@@ -7498,7 +7503,8 @@ static const struct rpc_call_ops 
nfs4_exchange_id_call_ops = {
 static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred 
*cred,
u32 sp4_how, struct rpc_xprt *xprt)
 {
-   nfs4_verifier verifier;
+   struct verifier_and_calldata *vna;
+   nfs4_verifier *verifier;
struct rpc_message msg = {
.rpc_proc = _procedures[NFSPROC4_CLNT_EXCHANGE_ID],
.rpc_cred = cred,
@@ -7516,14 +7522,17 @@ static int _nfs4_proc_exchange_id(struct nfs_client 
*clp, struct rpc_cred *cred,
if (!atomic_inc_not_zero(>cl_count))
return -EIO;
 
-   calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
-   if (!calldata) {
+   vna = kzalloc(sizeof(*vna), GFP_NOFS);
+   if (!vna) {
nfs_put_client(clp);
return -ENOMEM;
}
+   /* kfree() of calldata will also free the verifier */
+   calldata = >calldata;
+   verifier = >verifier;
 
if (!xprt)
-   nfs4_init_boot_verifier(clp, );
+   nfs4_init_boot_verifier(clp, verifier);
 
status = nfs4_init_uniform_client_string(clp);
if (status)
@@ -7566,7 +7575,7 @@ static int _nfs4_proc_exchange_id(struct nfs_client *clp, 
struct rpc_cred *cred,
RPC_TASK_SOFT|RPC_TASK_SOFTCONN|RPC_TASK_ASYNC;
calldata->args.verifier = >cl_confirm;
} else {
-   calldata->args.verifier = 
+   calldata->args.verifier = verifier;
}
calldata->args.client = clp;
 #ifdef CONFIG_NFS_V4_1_MIGRATION


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread Linus Torvalds
On Tue, Aug 1, 2017 at 8:51 AM, da...@codemonkey.org.uk
 wrote:
> On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
>  > Any chance of getting the output from
>  >
>  >./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
>
>
> Hm, that points to this..
>
> 7463 /* Save the EXCHANGE_ID verifier session trunk tests */
> 7464 memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
> 7465sizeof(clp->cl_confirm.data));

Ok, that certainly made no sense to me, because the KASAN report made
it look like a stale pathname access (allocated in getname, freed in
putname), but I think the issue is more fundamental than that.

That cdata->args.verifier seems to be entirely broken. AT least for
the "xprt == NULL" case, it does the following:

 - use the address of a local variable ("")

 - wait for the rpc completion using rpc_wait_for_completion_task().

That's unacceptably buggy crap. rpc_wait_for_completion_task() will
happily exit on a deadly signal even if the rpc hasn't been completed,
so now you'll have a stale pointer to a stack that has been freed.

So I think the 'pathname' part may actually be entirely a red herring,
and it's the underlying access itself that just picks up a random
pointer from a stack that now contains something different. And KASAN
didn't notice the stale stack access itself, because the stack slot is
still valid - it's just no longer the original 'verifier' allocation.

Or *something* like that.

None of this looks even remotely new, though - the code seems to go
back to 2009. Have you just changed what you're testing to trigger
these things?

I'm not even sure why it does that stupid stack allocation. It does a
*real* allocation just a few lines later:

struct nfs41_exchange_id_data *calldata
...
calldata = kzalloc(sizeof(*calldata), GFP_NOFS);

and the whole verifier structure could easily have been part of that
same allocation as far as I can tell.

And that really might seem to be the right thing to do.

TOTALLY UNTESTED PROBABLY COMPLETE CRAP patch attatched.

That patch compiles for me. It *might* even work. Or it might just be
the ramblings of a diseased mind.

Anna? Trond?

So caveat probatorem,

  Linus
 fs/nfs/nfs4proc.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 18ca6879d8de..0712af3d38f8 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -7490,6 +7490,11 @@ static const struct rpc_call_ops 
nfs4_exchange_id_call_ops = {
.rpc_release = nfs4_exchange_id_release,
 };
 
+struct verifier_and_calldata {
+   struct nfs41_exchange_id_data calldata;
+   nfs4_verifier verifier;
+};
+
 /*
  * _nfs4_proc_exchange_id()
  *
@@ -7498,7 +7503,8 @@ static const struct rpc_call_ops 
nfs4_exchange_id_call_ops = {
 static int _nfs4_proc_exchange_id(struct nfs_client *clp, struct rpc_cred 
*cred,
u32 sp4_how, struct rpc_xprt *xprt)
 {
-   nfs4_verifier verifier;
+   struct verifier_and_calldata *vna;
+   nfs4_verifier *verifier;
struct rpc_message msg = {
.rpc_proc = _procedures[NFSPROC4_CLNT_EXCHANGE_ID],
.rpc_cred = cred,
@@ -7516,14 +7522,17 @@ static int _nfs4_proc_exchange_id(struct nfs_client 
*clp, struct rpc_cred *cred,
if (!atomic_inc_not_zero(>cl_count))
return -EIO;
 
-   calldata = kzalloc(sizeof(*calldata), GFP_NOFS);
-   if (!calldata) {
+   vna = kzalloc(sizeof(*vna), GFP_NOFS);
+   if (!vna) {
nfs_put_client(clp);
return -ENOMEM;
}
+   /* kfree() of calldata will also free the verifier */
+   calldata = >calldata;
+   verifier = >verifier;
 
if (!xprt)
-   nfs4_init_boot_verifier(clp, );
+   nfs4_init_boot_verifier(clp, verifier);
 
status = nfs4_init_uniform_client_string(clp);
if (status)
@@ -7566,7 +7575,7 @@ static int _nfs4_proc_exchange_id(struct nfs_client *clp, 
struct rpc_cred *cred,
RPC_TASK_SOFT|RPC_TASK_SOFTCONN|RPC_TASK_ASYNC;
calldata->args.verifier = >cl_confirm;
} else {
-   calldata->args.verifier = 
+   calldata->args.verifier = verifier;
}
calldata->args.client = clp;
 #ifdef CONFIG_NFS_V4_1_MIGRATION


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread da...@codemonkey.org.uk
On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
 > On Mon, Jul 31, 2017 at 8:43 AM, da...@codemonkey.org.uk
 >  wrote:
 > > Another NFSv4 KASAN splat, this time from rc3.
 > >
 > > BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 > 
 > Ugh. It's really hard to tell what access that it - KASAN doesn't
 > actually give enough information. There's lots of 8-byte accesses
 > there in that function.
 > 
 > Any chance of getting the output from
 > 
 >./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
 

Hm, that points to this..

7463 /* Save the EXCHANGE_ID verifier session trunk tests */
7464 memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
7465sizeof(clp->cl_confirm.data));

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-08-01 Thread da...@codemonkey.org.uk
On Mon, Jul 31, 2017 at 10:35:45PM -0700, Linus Torvalds wrote:
 > On Mon, Jul 31, 2017 at 8:43 AM, da...@codemonkey.org.uk
 >  wrote:
 > > Another NFSv4 KASAN splat, this time from rc3.
 > >
 > > BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 > 
 > Ugh. It's really hard to tell what access that it - KASAN doesn't
 > actually give enough information. There's lots of 8-byte accesses
 > there in that function.
 > 
 > Any chance of getting the output from
 > 
 >./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0
 

Hm, that points to this..

7463 /* Save the EXCHANGE_ID verifier session trunk tests */
7464 memcpy(clp->cl_confirm.data, cdata->args.verifier->data,
7465sizeof(clp->cl_confirm.data));

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-31 Thread Linus Torvalds
On Mon, Jul 31, 2017 at 8:43 AM, da...@codemonkey.org.uk
 wrote:
> Another NFSv4 KASAN splat, this time from rc3.
>
> BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]

Ugh. It's really hard to tell what access that it - KASAN doesn't
actually give enough information. There's lots of 8-byte accesses
there in that function.

Any chance of getting the output from

   ./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0

or something? That would be extremely useful in general for
stacktraces, but it's doubly useful for KASAN because most *other*
stacktraces tend to have a very limited number of things that can warn
(ie there's one or two WARN_ON() calls in a function), but KASAN can
have tens or hundreds..

   Linus


> Read of size 8 at addr 8804508af528 by task kworker/2:1/34
>
> CPU: 2 PID: 34 Comm: kworker/2:1 Not tainted 4.13.0-rc3-think+ #1
> Workqueue: rpciod rpc_async_schedule [sunrpc]
> Call Trace:
>  dump_stack+0x68/0xa1
>  print_address_description+0xd9/0x270
>  kasan_report+0x257/0x370
>  ? nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
>  check_memory_region+0x13a/0x1a0
>  __asan_loadN+0xf/0x20
>  nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
>  ? nfs4_exchange_id_release+0xb0/0xb0 [nfsv4]
>  rpc_exit_task+0x69/0x110 [sunrpc]
>  ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  __rpc_execute+0x1a0/0x840 [sunrpc]
>  ? rpc_wake_up_queued_task+0x50/0x50 [sunrpc]
>  ? __lock_is_held+0x9a/0x100
>  ? debug_lockdep_rcu_enabled.part.16+0x1a/0x30
>  rpc_async_schedule+0x12/0x20 [sunrpc]
>  process_one_work+0x4d5/0xa70
>  ? flush_delayed_work+0x70/0x70
>  ? lock_acquire+0xfc/0x220
>  worker_thread+0x88/0x630
>  ? pci_mmcfg_check_reserved+0xc0/0xc0
>  kthread+0x1a6/0x1f0
>  ? process_one_work+0xa70/0xa70
>  ? kthread_create_on_node+0xc0/0xc0
>  ret_from_fork+0x27/0x40
>
> Allocated by task 1:
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_kmalloc+0xad/0xe0
>  kasan_slab_alloc+0x12/0x20
>  kmem_cache_alloc+0xe0/0x2f0
>  getname_flags+0x43/0x220
>  getname+0x12/0x20
>  do_sys_open+0x14c/0x2b0
>  SyS_open+0x1e/0x20
>  do_syscall_64+0xea/0x260
>  return_from_SYSCALL_64+0x0/0x7a
>
> Freed by task 1:
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_slab_free+0x72/0xc0
>  kmem_cache_free+0xa8/0x300
>  putname+0x80/0x90
>  do_sys_open+0x22f/0x2b0
>  SyS_open+0x1e/0x20
>  do_syscall_64+0xea/0x260
>  return_from_SYSCALL_64+0x0/0x7a
>
> The buggy address belongs to the object at 8804508aeac0\x0a which belongs 
> to the cache names_cache of size 4096
> The buggy address is located 2664 bytes inside of\x0a 4096-byte region 
> [8804508aeac0, 8804508afac0)
> The buggy address belongs to the page:
> page:ea0011422a00 count:1 mapcount:0 mapping:  (null) index:0x0
> [CONT START]  compound_mapcount: 0
> flags: 0x80008100(slab|head)
> raw: 80008100   000100070007
> raw: ea00113d6020 ea001136e220 8804664f8040 
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  8804508af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8804508af480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>8804508af500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   ^
>  8804508af580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8804508af600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==
>


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-31 Thread Linus Torvalds
On Mon, Jul 31, 2017 at 8:43 AM, da...@codemonkey.org.uk
 wrote:
> Another NFSv4 KASAN splat, this time from rc3.
>
> BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]

Ugh. It's really hard to tell what access that it - KASAN doesn't
actually give enough information. There's lots of 8-byte accesses
there in that function.

Any chance of getting the output from

   ./scripts/faddr2line vmlinux nfs4_exchange_id_done+0x3d7/0x8e0

or something? That would be extremely useful in general for
stacktraces, but it's doubly useful for KASAN because most *other*
stacktraces tend to have a very limited number of things that can warn
(ie there's one or two WARN_ON() calls in a function), but KASAN can
have tens or hundreds..

   Linus


> Read of size 8 at addr 8804508af528 by task kworker/2:1/34
>
> CPU: 2 PID: 34 Comm: kworker/2:1 Not tainted 4.13.0-rc3-think+ #1
> Workqueue: rpciod rpc_async_schedule [sunrpc]
> Call Trace:
>  dump_stack+0x68/0xa1
>  print_address_description+0xd9/0x270
>  kasan_report+0x257/0x370
>  ? nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
>  check_memory_region+0x13a/0x1a0
>  __asan_loadN+0xf/0x20
>  nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
>  ? nfs4_exchange_id_release+0xb0/0xb0 [nfsv4]
>  rpc_exit_task+0x69/0x110 [sunrpc]
>  ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
>  __rpc_execute+0x1a0/0x840 [sunrpc]
>  ? rpc_wake_up_queued_task+0x50/0x50 [sunrpc]
>  ? __lock_is_held+0x9a/0x100
>  ? debug_lockdep_rcu_enabled.part.16+0x1a/0x30
>  rpc_async_schedule+0x12/0x20 [sunrpc]
>  process_one_work+0x4d5/0xa70
>  ? flush_delayed_work+0x70/0x70
>  ? lock_acquire+0xfc/0x220
>  worker_thread+0x88/0x630
>  ? pci_mmcfg_check_reserved+0xc0/0xc0
>  kthread+0x1a6/0x1f0
>  ? process_one_work+0xa70/0xa70
>  ? kthread_create_on_node+0xc0/0xc0
>  ret_from_fork+0x27/0x40
>
> Allocated by task 1:
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_kmalloc+0xad/0xe0
>  kasan_slab_alloc+0x12/0x20
>  kmem_cache_alloc+0xe0/0x2f0
>  getname_flags+0x43/0x220
>  getname+0x12/0x20
>  do_sys_open+0x14c/0x2b0
>  SyS_open+0x1e/0x20
>  do_syscall_64+0xea/0x260
>  return_from_SYSCALL_64+0x0/0x7a
>
> Freed by task 1:
>  save_stack_trace+0x1b/0x20
>  save_stack+0x46/0xd0
>  kasan_slab_free+0x72/0xc0
>  kmem_cache_free+0xa8/0x300
>  putname+0x80/0x90
>  do_sys_open+0x22f/0x2b0
>  SyS_open+0x1e/0x20
>  do_syscall_64+0xea/0x260
>  return_from_SYSCALL_64+0x0/0x7a
>
> The buggy address belongs to the object at 8804508aeac0\x0a which belongs 
> to the cache names_cache of size 4096
> The buggy address is located 2664 bytes inside of\x0a 4096-byte region 
> [8804508aeac0, 8804508afac0)
> The buggy address belongs to the page:
> page:ea0011422a00 count:1 mapcount:0 mapping:  (null) index:0x0
> [CONT START]  compound_mapcount: 0
> flags: 0x80008100(slab|head)
> raw: 80008100   000100070007
> raw: ea00113d6020 ea001136e220 8804664f8040 
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  8804508af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8804508af480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>8804508af500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>   ^
>  8804508af580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  8804508af600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==
>


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-31 Thread da...@codemonkey.org.uk
Another NFSv4 KASAN splat, this time from rc3.


==
BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
Read of size 8 at addr 8804508af528 by task kworker/2:1/34

CPU: 2 PID: 34 Comm: kworker/2:1 Not tainted 4.13.0-rc3-think+ #1 
Workqueue: rpciod rpc_async_schedule [sunrpc]
Call Trace:
 dump_stack+0x68/0xa1
 print_address_description+0xd9/0x270
 kasan_report+0x257/0x370
 ? nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 check_memory_region+0x13a/0x1a0
 __asan_loadN+0xf/0x20
 nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 ? nfs4_exchange_id_release+0xb0/0xb0 [nfsv4]
 rpc_exit_task+0x69/0x110 [sunrpc]
 ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 __rpc_execute+0x1a0/0x840 [sunrpc]
 ? rpc_wake_up_queued_task+0x50/0x50 [sunrpc]
 ? __lock_is_held+0x9a/0x100
 ? debug_lockdep_rcu_enabled.part.16+0x1a/0x30
 rpc_async_schedule+0x12/0x20 [sunrpc]
 process_one_work+0x4d5/0xa70
 ? flush_delayed_work+0x70/0x70
 ? lock_acquire+0xfc/0x220
 worker_thread+0x88/0x630
 ? pci_mmcfg_check_reserved+0xc0/0xc0
 kthread+0x1a6/0x1f0
 ? process_one_work+0xa70/0xa70
 ? kthread_create_on_node+0xc0/0xc0
 ret_from_fork+0x27/0x40

Allocated by task 1:
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_kmalloc+0xad/0xe0
 kasan_slab_alloc+0x12/0x20
 kmem_cache_alloc+0xe0/0x2f0
 getname_flags+0x43/0x220
 getname+0x12/0x20
 do_sys_open+0x14c/0x2b0
 SyS_open+0x1e/0x20
 do_syscall_64+0xea/0x260
 return_from_SYSCALL_64+0x0/0x7a

Freed by task 1:
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_slab_free+0x72/0xc0
 kmem_cache_free+0xa8/0x300
 putname+0x80/0x90
 do_sys_open+0x22f/0x2b0
 SyS_open+0x1e/0x20
 do_syscall_64+0xea/0x260
 return_from_SYSCALL_64+0x0/0x7a

The buggy address belongs to the object at 8804508aeac0\x0a which belongs 
to the cache names_cache of size 4096
The buggy address is located 2664 bytes inside of\x0a 4096-byte region 
[8804508aeac0, 8804508afac0)
The buggy address belongs to the page:
page:ea0011422a00 count:1 mapcount:0 mapping:  (null) index:0x0
[CONT START]  compound_mapcount: 0
flags: 0x80008100(slab|head)
raw: 80008100   000100070007
raw: ea00113d6020 ea001136e220 8804664f8040 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 8804508af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8804508af480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>8804508af500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ^
 8804508af580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8804508af600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-31 Thread da...@codemonkey.org.uk
Another NFSv4 KASAN splat, this time from rc3.


==
BUG: KASAN: use-after-free in nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
Read of size 8 at addr 8804508af528 by task kworker/2:1/34

CPU: 2 PID: 34 Comm: kworker/2:1 Not tainted 4.13.0-rc3-think+ #1 
Workqueue: rpciod rpc_async_schedule [sunrpc]
Call Trace:
 dump_stack+0x68/0xa1
 print_address_description+0xd9/0x270
 kasan_report+0x257/0x370
 ? nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 check_memory_region+0x13a/0x1a0
 __asan_loadN+0xf/0x20
 nfs4_exchange_id_done+0x3d7/0x8e0 [nfsv4]
 ? nfs4_exchange_id_release+0xb0/0xb0 [nfsv4]
 rpc_exit_task+0x69/0x110 [sunrpc]
 ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
 __rpc_execute+0x1a0/0x840 [sunrpc]
 ? rpc_wake_up_queued_task+0x50/0x50 [sunrpc]
 ? __lock_is_held+0x9a/0x100
 ? debug_lockdep_rcu_enabled.part.16+0x1a/0x30
 rpc_async_schedule+0x12/0x20 [sunrpc]
 process_one_work+0x4d5/0xa70
 ? flush_delayed_work+0x70/0x70
 ? lock_acquire+0xfc/0x220
 worker_thread+0x88/0x630
 ? pci_mmcfg_check_reserved+0xc0/0xc0
 kthread+0x1a6/0x1f0
 ? process_one_work+0xa70/0xa70
 ? kthread_create_on_node+0xc0/0xc0
 ret_from_fork+0x27/0x40

Allocated by task 1:
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_kmalloc+0xad/0xe0
 kasan_slab_alloc+0x12/0x20
 kmem_cache_alloc+0xe0/0x2f0
 getname_flags+0x43/0x220
 getname+0x12/0x20
 do_sys_open+0x14c/0x2b0
 SyS_open+0x1e/0x20
 do_syscall_64+0xea/0x260
 return_from_SYSCALL_64+0x0/0x7a

Freed by task 1:
 save_stack_trace+0x1b/0x20
 save_stack+0x46/0xd0
 kasan_slab_free+0x72/0xc0
 kmem_cache_free+0xa8/0x300
 putname+0x80/0x90
 do_sys_open+0x22f/0x2b0
 SyS_open+0x1e/0x20
 do_syscall_64+0xea/0x260
 return_from_SYSCALL_64+0x0/0x7a

The buggy address belongs to the object at 8804508aeac0\x0a which belongs 
to the cache names_cache of size 4096
The buggy address is located 2664 bytes inside of\x0a 4096-byte region 
[8804508aeac0, 8804508afac0)
The buggy address belongs to the page:
page:ea0011422a00 count:1 mapcount:0 mapping:  (null) index:0x0
[CONT START]  compound_mapcount: 0
flags: 0x80008100(slab|head)
raw: 80008100   000100070007
raw: ea00113d6020 ea001136e220 8804664f8040 
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 8804508af400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8804508af480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>8804508af500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ^
 8804508af580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 8804508af600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-17 Thread Linus Torvalds
On Sun, Jul 16, 2017 at 8:05 PM, da...@codemonkey.org.uk
 wrote:
> On Sun, Jul 16, 2017 at 10:57:27PM +, Trond Myklebust wrote:
>
>  > > BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
>  > > Read of size 8 at addr 8d582588 by task kworker/0:1/22
>  >
>  > Does the following patch fix it?
>
> Yep, seems to do the trick!

I'm assuming I'll get this fix through a future pull request, and am
not applying the patch as-is. Just FYI.

 Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-17 Thread Linus Torvalds
On Sun, Jul 16, 2017 at 8:05 PM, da...@codemonkey.org.uk
 wrote:
> On Sun, Jul 16, 2017 at 10:57:27PM +, Trond Myklebust wrote:
>
>  > > BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
>  > > Read of size 8 at addr 8d582588 by task kworker/0:1/22
>  >
>  > Does the following patch fix it?
>
> Yep, seems to do the trick!

I'm assuming I'll get this fix through a future pull request, and am
not applying the patch as-is. Just FYI.

 Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread da...@codemonkey.org.uk
On Sun, Jul 16, 2017 at 10:57:27PM +, Trond Myklebust wrote:
 
 > > BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
 > > Read of size 8 at addr 8d582588 by task kworker/0:1/22
 > 
 > Does the following patch fix it?

Yep, seems to do the trick!

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread da...@codemonkey.org.uk
On Sun, Jul 16, 2017 at 10:57:27PM +, Trond Myklebust wrote:
 
 > > BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
 > > Read of size 8 at addr 8d582588 by task kworker/0:1/22
 > 
 > Does the following patch fix it?

Yep, seems to do the trick!

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread Trond Myklebust
Hi Dave,

On Sun, 2017-07-16 at 17:15 -0400, Dave Jones wrote:
> On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
>  > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  >  > Hi Linus,
>  >  > 
>  >  > The following changes since commit
> 32c1431eea4881a6b17bd7c639315010aeefa452:
>  >  > 
>  >  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
>  >  > 
>  >  > are available in the git repository at:
>  >  > 
>  >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-
> for-4.13-1
>  >  > 
>  >  > for you to fetch changes up to
> b4f937cffa66b3d56eb8f586e620d0b223a281a3:
>  >  > 
>  >  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-
> 07-13 16:57:18 -0400)
>  > 
>  > Since this landed, I'm seeing this during boot..
>  > 
>  >  =
> =
>  >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  >  Read of size 8 at addr b4eeaf20 by task nfsd/688
> 
> Now that this one got fixed, this one fell out instead..
> Will dig deeper tomorrow.
> 
> ==
> BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
> Read of size 8 at addr 8d582588 by task kworker/0:1/22
> 
> CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 4.13.0-rc1-firewall+ #1 
> Workqueue: rpciod rpc_async_schedule
> Call Trace:
>  dump_stack+0x68/0x94
>  print_address_description+0x2c/0x270
>  ? call_start+0x93/0x100
>  kasan_report+0x239/0x350
>  __asan_load8+0x55/0x90
>  call_start+0x93/0x100
>  ? rpc_default_callback+0x10/0x10
>  ? rpc_default_callback+0x10/0x10
>  __rpc_execute+0x170/0x740
>  ? rpc_wake_up_queued_task+0x50/0x50
>  ? __lock_is_held+0x9f/0x110
>  rpc_async_schedule+0x12/0x20
>  process_one_work+0x4ba/0xb10
>  ? process_one_work+0x401/0xb10
>  ? pwq_dec_nr_in_flight+0x120/0x120
>  worker_thread+0x91/0x670
>  ? __sched_text_start+0x8/0x8
>  kthread+0x1ab/0x200
>  ? process_one_work+0xb10/0xb10
>  ? __kthread_create_on_node+0x340/0x340
>  ret_from_fork+0x27/0x40
> 
> The buggy address belongs to the variable:
>  nfs_cb_version+0x8/0x740

Does the following patch fix it?

Cheers
  Trond

8<--
From b9230cdfbbee90178a1318d20cd3373ffb758788 Mon Sep 17 00:00:00 2001
From: Trond Myklebust 
Date: Sun, 16 Jul 2017 18:52:18 -0400
Subject: [PATCH] nfsd: Fix a memory scribble in the callback channel

The offset of the entry in struct rpc_version has to match the version
number.

Reported-by: Dave Jones 
Fixes: 1c5876ddbdb4 ("sunrpc: move p_count out of struct rpc_procinfo")
Signed-off-by: Trond Myklebust 
---
 fs/nfsd/nfs4callback.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index b45083c0f9ae..49b0a9e7ff18 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -720,8 +720,8 @@ static const struct rpc_version nfs_cb_version4 = {
.counts = nfs4_cb_counts,
 };
 
-static const struct rpc_version *nfs_cb_version[] = {
-   _cb_version4,
+static const struct rpc_version *nfs_cb_version[2] = {
+   [1] = _cb_version4,
 };
 
 static const struct rpc_program cb_program;
@@ -795,7 +795,7 @@ static int setup_callback_client(struct nfs4_client *clp, 
struct nfs4_cb_conn *c
.saddress   = (struct sockaddr *) >cb_saddr,
.timeout= ,
.program= _program,
-   .version= 0,
+   .version= 1,
.flags  = (RPC_CLNT_CREATE_NOPING | 
RPC_CLNT_CREATE_QUIET),
};
struct rpc_clnt *client;
-- 
2.13.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread Trond Myklebust
Hi Dave,

On Sun, 2017-07-16 at 17:15 -0400, Dave Jones wrote:
> On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
>  > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  >  > Hi Linus,
>  >  > 
>  >  > The following changes since commit
> 32c1431eea4881a6b17bd7c639315010aeefa452:
>  >  > 
>  >  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
>  >  > 
>  >  > are available in the git repository at:
>  >  > 
>  >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-
> for-4.13-1
>  >  > 
>  >  > for you to fetch changes up to
> b4f937cffa66b3d56eb8f586e620d0b223a281a3:
>  >  > 
>  >  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-
> 07-13 16:57:18 -0400)
>  > 
>  > Since this landed, I'm seeing this during boot..
>  > 
>  >  =
> =
>  >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  >  Read of size 8 at addr b4eeaf20 by task nfsd/688
> 
> Now that this one got fixed, this one fell out instead..
> Will dig deeper tomorrow.
> 
> ==
> BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
> Read of size 8 at addr 8d582588 by task kworker/0:1/22
> 
> CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 4.13.0-rc1-firewall+ #1 
> Workqueue: rpciod rpc_async_schedule
> Call Trace:
>  dump_stack+0x68/0x94
>  print_address_description+0x2c/0x270
>  ? call_start+0x93/0x100
>  kasan_report+0x239/0x350
>  __asan_load8+0x55/0x90
>  call_start+0x93/0x100
>  ? rpc_default_callback+0x10/0x10
>  ? rpc_default_callback+0x10/0x10
>  __rpc_execute+0x170/0x740
>  ? rpc_wake_up_queued_task+0x50/0x50
>  ? __lock_is_held+0x9f/0x110
>  rpc_async_schedule+0x12/0x20
>  process_one_work+0x4ba/0xb10
>  ? process_one_work+0x401/0xb10
>  ? pwq_dec_nr_in_flight+0x120/0x120
>  worker_thread+0x91/0x670
>  ? __sched_text_start+0x8/0x8
>  kthread+0x1ab/0x200
>  ? process_one_work+0xb10/0xb10
>  ? __kthread_create_on_node+0x340/0x340
>  ret_from_fork+0x27/0x40
> 
> The buggy address belongs to the variable:
>  nfs_cb_version+0x8/0x740

Does the following patch fix it?

Cheers
  Trond

8<--
From b9230cdfbbee90178a1318d20cd3373ffb758788 Mon Sep 17 00:00:00 2001
From: Trond Myklebust 
Date: Sun, 16 Jul 2017 18:52:18 -0400
Subject: [PATCH] nfsd: Fix a memory scribble in the callback channel

The offset of the entry in struct rpc_version has to match the version
number.

Reported-by: Dave Jones 
Fixes: 1c5876ddbdb4 ("sunrpc: move p_count out of struct rpc_procinfo")
Signed-off-by: Trond Myklebust 
---
 fs/nfsd/nfs4callback.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c
index b45083c0f9ae..49b0a9e7ff18 100644
--- a/fs/nfsd/nfs4callback.c
+++ b/fs/nfsd/nfs4callback.c
@@ -720,8 +720,8 @@ static const struct rpc_version nfs_cb_version4 = {
.counts = nfs4_cb_counts,
 };
 
-static const struct rpc_version *nfs_cb_version[] = {
-   _cb_version4,
+static const struct rpc_version *nfs_cb_version[2] = {
+   [1] = _cb_version4,
 };
 
 static const struct rpc_program cb_program;
@@ -795,7 +795,7 @@ static int setup_callback_client(struct nfs4_client *clp, 
struct nfs4_cb_conn *c
.saddress   = (struct sockaddr *) >cb_saddr,
.timeout= ,
.program= _program,
-   .version= 0,
+   .version= 1,
.flags  = (RPC_CLNT_CREATE_NOPING | 
RPC_CLNT_CREATE_QUIET),
};
struct rpc_clnt *client;
-- 
2.13.3

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.mykleb...@primarydata.com


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread Dave Jones
On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
 > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 >  > Hi Linus,
 >  > 
 >  > The following changes since commit 
 > 32c1431eea4881a6b17bd7c639315010aeefa452:
 >  > 
 >  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
 >  > 
 >  > are available in the git repository at:
 >  > 
 >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
 >  > 
 >  > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
 >  > 
 >  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
 > 16:57:18 -0400)
 > 
 > Since this landed, I'm seeing this during boot..
 > 
 >  ==
 >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 >  Read of size 8 at addr b4eeaf20 by task nfsd/688

Now that this one got fixed, this one fell out instead..
Will dig deeper tomorrow.

==
BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
Read of size 8 at addr 8d582588 by task kworker/0:1/22

CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 4.13.0-rc1-firewall+ #1 
Workqueue: rpciod rpc_async_schedule
Call Trace:
 dump_stack+0x68/0x94
 print_address_description+0x2c/0x270
 ? call_start+0x93/0x100
 kasan_report+0x239/0x350
 __asan_load8+0x55/0x90
 call_start+0x93/0x100
 ? rpc_default_callback+0x10/0x10
 ? rpc_default_callback+0x10/0x10
 __rpc_execute+0x170/0x740
 ? rpc_wake_up_queued_task+0x50/0x50
 ? __lock_is_held+0x9f/0x110
 rpc_async_schedule+0x12/0x20
 process_one_work+0x4ba/0xb10
 ? process_one_work+0x401/0xb10
 ? pwq_dec_nr_in_flight+0x120/0x120
 worker_thread+0x91/0x670
 ? __sched_text_start+0x8/0x8
 kthread+0x1ab/0x200
 ? process_one_work+0xb10/0xb10
 ? __kthread_create_on_node+0x340/0x340
 ret_from_fork+0x27/0x40

The buggy address belongs to the variable:
 nfs_cb_version+0x8/0x740

Memory state around the buggy address:
 8d582480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 8d582500: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
>8d582580: 00 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ^
 8d582600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 8d582680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-16 Thread Dave Jones
On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
 > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 >  > Hi Linus,
 >  > 
 >  > The following changes since commit 
 > 32c1431eea4881a6b17bd7c639315010aeefa452:
 >  > 
 >  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
 >  > 
 >  > are available in the git repository at:
 >  > 
 >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
 >  > 
 >  > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
 >  > 
 >  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
 > 16:57:18 -0400)
 > 
 > Since this landed, I'm seeing this during boot..
 > 
 >  ==
 >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 >  Read of size 8 at addr b4eeaf20 by task nfsd/688

Now that this one got fixed, this one fell out instead..
Will dig deeper tomorrow.

==
BUG: KASAN: global-out-of-bounds in call_start+0x93/0x100
Read of size 8 at addr 8d582588 by task kworker/0:1/22

CPU: 0 PID: 22 Comm: kworker/0:1 Not tainted 4.13.0-rc1-firewall+ #1 
Workqueue: rpciod rpc_async_schedule
Call Trace:
 dump_stack+0x68/0x94
 print_address_description+0x2c/0x270
 ? call_start+0x93/0x100
 kasan_report+0x239/0x350
 __asan_load8+0x55/0x90
 call_start+0x93/0x100
 ? rpc_default_callback+0x10/0x10
 ? rpc_default_callback+0x10/0x10
 __rpc_execute+0x170/0x740
 ? rpc_wake_up_queued_task+0x50/0x50
 ? __lock_is_held+0x9f/0x110
 rpc_async_schedule+0x12/0x20
 process_one_work+0x4ba/0xb10
 ? process_one_work+0x401/0xb10
 ? pwq_dec_nr_in_flight+0x120/0x120
 worker_thread+0x91/0x670
 ? __sched_text_start+0x8/0x8
 kthread+0x1ab/0x200
 ? process_one_work+0xb10/0xb10
 ? __kthread_create_on_node+0x340/0x340
 ret_from_fork+0x27/0x40

The buggy address belongs to the variable:
 nfs_cb_version+0x8/0x740

Memory state around the buggy address:
 8d582480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 8d582500: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
>8d582580: 00 fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ^
 8d582600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 8d582680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).

Agree, it's very important for this code to be correct and the string
functions have some subtleties so it needs scrutiny. I messed up strcpy
between v1 and v2 trying to add a proper read overflow check. My fault
for not looking more closely at strscpy before adopting it based on my
misinterpretation of the API.

This is primarily a bug finding feature right now and it has gotten a
few fixed that actually matter (most were unimportant memcpy read past
end of string constant but not all). I don't think it has another bug
like this strscpy misuse itself, but there will need to be some more
fixes for minor read overflows, etc. elsewhere in the tree before it'll
actually make sense as a hardening feature because it can turn a benign
read overflow into a DoS via BUG(). I think it will be fine for 4.13,
but I definitely wouldn't propose 'default y' for a while, even if there
was no performance cost (and there is).

Fix for this issue is here in case anyone just looks only at this thread
(realized I should have passed send-email a reply id):

http://marc.info/?l=linux-fsdevel=150006772418003=2


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).

Agree, it's very important for this code to be correct and the string
functions have some subtleties so it needs scrutiny. I messed up strcpy
between v1 and v2 trying to add a proper read overflow check. My fault
for not looking more closely at strscpy before adopting it based on my
misinterpretation of the API.

This is primarily a bug finding feature right now and it has gotten a
few fixed that actually matter (most were unimportant memcpy read past
end of string constant but not all). I don't think it has another bug
like this strscpy misuse itself, but there will need to be some more
fixes for minor read overflows, etc. elsewhere in the tree before it'll
actually make sense as a hardening feature because it can turn a benign
read overflow into a DoS via BUG(). I think it will be fine for 4.13,
but I definitely wouldn't propose 'default y' for a while, even if there
was no performance cost (and there is).

Fix for this issue is here in case anyone just looks only at this thread
(realized I should have passed send-email a reply id):

http://marc.info/?l=linux-fsdevel=150006772418003=2


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> The reason q_size isn't used is because it doesn't yet prevent read
> overflow. The commit message mentions that among the current
> limitations
> along with __builtin_object_size(ptr, 1).

Er rather, in strlcat, the q_size is unused after the fast path is
because strnlen obtains the constant again itself.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> The reason q_size isn't used is because it doesn't yet prevent read
> overflow. The commit message mentions that among the current
> limitations
> along with __builtin_object_size(ptr, 1).

Er rather, in strlcat, the q_size is unused after the fast path is
because strnlen obtains the constant again itself.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 13:50 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 1:38 PM, Daniel Micay 
> wrote:
> > 
> > If strscpy treats the count parameter as a *guarantee* of the dest
> > size
> > rather than a limit,
> 
> No, it's a *limit*.
> 
> And by a *limit*, I mean that we know that we can access both source
> and destination within that limit.

FORTIFY_SOURCE needs to be able to pass a limit without implying that
there's a minimum. That's the distinction I was trying to make. It's
wrong to use anything where it's interpreted as a minimum here. Using
__builtin_object_size(ptr, 0) vs. __builtin_object_size(ptr, 1) doesn't
avoid the problem. __builtin_object_size(ptr, 1) returns a maximum among
the possible buffer sizes just like 0. It's just stricter, i.e. catches
intra-object overflow, which isn't desirable for the first take since it
will cause compatibility issues. There's code using memcpy,
copy_to_user, etc. to read / write multiple fields with a pointer to the
first one passed as the source / destination.

> > My initial patch used strlcpy there, because I wasn't aware of
> > strscpy
> > before it was suggested:
> 
> Since I'm looking at this, I note that the "strlcpy()" code is
> complete garbage too, and has that same
> 
>  p_size == (size_t)-1 && q_size == (size_t)-1
> 
> check which is wrong.  Of course, in strlcpy, q_size is never actually
> *used*, so the whole check seems bogus.

That check is only an optimization. __builtin_object_size always returns
a constant, and it's (size_t)-1 when no limit could be found.

The reason q_size isn't used is because it doesn't yet prevent read
overflow. The commit message mentions that among the current limitations
along with __builtin_object_size(ptr, 1).

> But no, strlcpy() is complete garbage, and should never be used. It is
> truly a shit interface, and anybody who uses it is by definition
> buggy.
> 
> Why? Because the return value of "strlcpy()" is defined to be ignoring
> the limit, so you FUNDAMENTALLY must not use that thing on untrusted
> source strings.
> 
> But since the whole *point* of people using it is for untrusted
> sources, it by definition is garbage.
> 
> Ergo: don't use strlcpy(). It's unbelievable crap. It's wrong. There's
> a reason we defined "strscpy()" as the way to do safe copies
> (strncpy(), of course, is broken for both lack of NUL termination
> _and_ for excessive NUL termination when a NUL did exist).

Sure, it doesn't prevent read overflow (but it's not worse than strcpy,
which is the purpose) which is why I said this:

"The fortified string functions should place a limit on reads from the
source. For strcat/strcpy, these could just be a variant of strlcat /
strlcpy using the size limit as a bound on both the source and
destination, with the return value only reporting whether truncation
occurred rather than providing the source length. It would be an easier
API for developers to use too and not that it really matters but it
would be more efficient for the case where truncation is intended."

That's why strscpy was suggested and I switched to that + updated the
commit message to only mention strcat, but it's wrong to use it here
because __builtin_object_size(p, 0) / __builtin_object_size(p, 1) are
only a guaranteed maximum length with no minimum guarantee.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 13:50 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 1:38 PM, Daniel Micay 
> wrote:
> > 
> > If strscpy treats the count parameter as a *guarantee* of the dest
> > size
> > rather than a limit,
> 
> No, it's a *limit*.
> 
> And by a *limit*, I mean that we know that we can access both source
> and destination within that limit.

FORTIFY_SOURCE needs to be able to pass a limit without implying that
there's a minimum. That's the distinction I was trying to make. It's
wrong to use anything where it's interpreted as a minimum here. Using
__builtin_object_size(ptr, 0) vs. __builtin_object_size(ptr, 1) doesn't
avoid the problem. __builtin_object_size(ptr, 1) returns a maximum among
the possible buffer sizes just like 0. It's just stricter, i.e. catches
intra-object overflow, which isn't desirable for the first take since it
will cause compatibility issues. There's code using memcpy,
copy_to_user, etc. to read / write multiple fields with a pointer to the
first one passed as the source / destination.

> > My initial patch used strlcpy there, because I wasn't aware of
> > strscpy
> > before it was suggested:
> 
> Since I'm looking at this, I note that the "strlcpy()" code is
> complete garbage too, and has that same
> 
>  p_size == (size_t)-1 && q_size == (size_t)-1
> 
> check which is wrong.  Of course, in strlcpy, q_size is never actually
> *used*, so the whole check seems bogus.

That check is only an optimization. __builtin_object_size always returns
a constant, and it's (size_t)-1 when no limit could be found.

The reason q_size isn't used is because it doesn't yet prevent read
overflow. The commit message mentions that among the current limitations
along with __builtin_object_size(ptr, 1).

> But no, strlcpy() is complete garbage, and should never be used. It is
> truly a shit interface, and anybody who uses it is by definition
> buggy.
> 
> Why? Because the return value of "strlcpy()" is defined to be ignoring
> the limit, so you FUNDAMENTALLY must not use that thing on untrusted
> source strings.
> 
> But since the whole *point* of people using it is for untrusted
> sources, it by definition is garbage.
> 
> Ergo: don't use strlcpy(). It's unbelievable crap. It's wrong. There's
> a reason we defined "strscpy()" as the way to do safe copies
> (strncpy(), of course, is broken for both lack of NUL termination
> _and_ for excessive NUL termination when a NUL did exist).

Sure, it doesn't prevent read overflow (but it's not worse than strcpy,
which is the purpose) which is why I said this:

"The fortified string functions should place a limit on reads from the
source. For strcat/strcpy, these could just be a variant of strlcat /
strlcpy using the size limit as a bound on both the source and
destination, with the return value only reporting whether truncation
occurred rather than providing the source length. It would be an easier
API for developers to use too and not that it really matters but it
would be more efficient for the case where truncation is intended."

That's why strscpy was suggested and I switched to that + updated the
commit message to only mention strcat, but it's wrong to use it here
because __builtin_object_size(p, 0) / __builtin_object_size(p, 1) are
only a guaranteed maximum length with no minimum guarantee.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> My initial patch used strlcpy there, because I wasn't aware of strscpy
> before it was suggested:
> 
> http://www.openwall.com/lists/kernel-hardening/2017/05/04/11
> 
> I was wrong to move it to strscpy. It could be switched back to
> strlcpy
> again unless the kernel considers the count parameter to be a
> guarantee
> that could be leveraged in the future. Using the fortified strlen +
> memcpy would provide the improvement that strscpy was meant to provide
> there over strlcpy.

Similarly, the FORTIFY_SOURCE strcat uses strlcat with the assumption
that the count parameter is a limit, not a guarantee for a optimization.
There's only a C implementation and it currently doesn't, but if it's
meant to be a guarantee then the strcat needs to be changed too.

I'll make a fix moving away both from the existing functions.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
> My initial patch used strlcpy there, because I wasn't aware of strscpy
> before it was suggested:
> 
> http://www.openwall.com/lists/kernel-hardening/2017/05/04/11
> 
> I was wrong to move it to strscpy. It could be switched back to
> strlcpy
> again unless the kernel considers the count parameter to be a
> guarantee
> that could be leveraged in the future. Using the fortified strlen +
> memcpy would provide the improvement that strscpy was meant to provide
> there over strlcpy.

Similarly, the FORTIFY_SOURCE strcat uses strlcat with the assumption
that the count parameter is a limit, not a guarantee for a optimization.
There's only a C implementation and it currently doesn't, but if it's
meant to be a guarantee then the strcat needs to be changed too.

I'll make a fix moving away both from the existing functions.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 1:38 PM, Daniel Micay  wrote:
>
> If strscpy treats the count parameter as a *guarantee* of the dest size
> rather than a limit,

No, it's a *limit*.

And by a *limit*, I mean that we know that we can access both source
and destination within that limit.

> My initial patch used strlcpy there, because I wasn't aware of strscpy
> before it was suggested:

Since I'm looking at this, I note that the "strlcpy()" code is
complete garbage too, and has that same

 p_size == (size_t)-1 && q_size == (size_t)-1

check which is wrong.  Of course, in strlcpy, q_size is never actually
*used*, so the whole check seems bogus.

But no, strlcpy() is complete garbage, and should never be used. It is
truly a shit interface, and anybody who uses it is by definition
buggy.

Why? Because the return value of "strlcpy()" is defined to be ignoring
the limit, so you FUNDAMENTALLY must not use that thing on untrusted
source strings.

But since the whole *point* of people using it is for untrusted
sources, it by definition is garbage.

Ergo: don't use strlcpy(). It's unbelievable crap. It's wrong. There's
a reason we defined "strscpy()" as the way to do safe copies
(strncpy(), of course, is broken for both lack of NUL termination
_and_ for excessive NUL termination when a NUL did exist).

So quite frankly, this hardening code needs to be looked at again. And
no, if it uses "strlcpy()", then it's not hardering, it's just a pile
of crap.

Yes, I'm annoyed. I really get very very annoyed by "hardening" code
that does nothing of the kind.

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 1:38 PM, Daniel Micay  wrote:
>
> If strscpy treats the count parameter as a *guarantee* of the dest size
> rather than a limit,

No, it's a *limit*.

And by a *limit*, I mean that we know that we can access both source
and destination within that limit.

> My initial patch used strlcpy there, because I wasn't aware of strscpy
> before it was suggested:

Since I'm looking at this, I note that the "strlcpy()" code is
complete garbage too, and has that same

 p_size == (size_t)-1 && q_size == (size_t)-1

check which is wrong.  Of course, in strlcpy, q_size is never actually
*used*, so the whole check seems bogus.

But no, strlcpy() is complete garbage, and should never be used. It is
truly a shit interface, and anybody who uses it is by definition
buggy.

Why? Because the return value of "strlcpy()" is defined to be ignoring
the limit, so you FUNDAMENTALLY must not use that thing on untrusted
source strings.

But since the whole *point* of people using it is for untrusted
sources, it by definition is garbage.

Ergo: don't use strlcpy(). It's unbelievable crap. It's wrong. There's
a reason we defined "strscpy()" as the way to do safe copies
(strncpy(), of course, is broken for both lack of NUL termination
_and_ for excessive NUL termination when a NUL did exist).

So quite frankly, this hardening code needs to be looked at again. And
no, if it uses "strlcpy()", then it's not hardering, it's just a pile
of crap.

Yes, I'm annoyed. I really get very very annoyed by "hardening" code
that does nothing of the kind.

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 12:58 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
>  wrote:
> > 
> > > yet when I look at the generated code for __ip_map_lookup, I see
> > > 
> > >movl$32, %edx   #,
> > >movq%r13, %rsi  # class,
> > >leaq48(%rax), %rdi  #, tmp126
> > >callstrscpy #
> > > 
> > > what's the bug here? Look at that third argume8nt - %rdx. It is
> > > initialized to 32.
> > 
> > It's not a compiler bug, it's a bug in our strcpy().
> > Whoever wrote this strcpy() into strscpy() code apparently didn't
> > read carefully
> > enough gcc manual about __builtin_object_size().
> > 
> > Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking
> > .html :
> > 
> > __builtin_object_size(ptr, type) returns a constant number
> > of bytes from 'ptr' to the end of the object 'ptr'
> > pointer points to. "type" is an integer constant from 0 to
> > 3. If the least significant bit is clear, objects
> > are whole variables, if it is set, a closest surrounding
> > subobject is considered the object a pointer points to.
> > The second bit determines if maximum or minimum of remaining
> > bytes is computed.
> > 
> > We have type = 0 in strcpy(), so the least significant bit is clear.
> > So the 'ptr' is considered as a pointer to the whole
> > variable i.e. pointer to struct ip_map ip;
> > And the number of bytes from 'ip.m_class' to the end of the ip
> > object is exactly 32.
> > 
> > I suppose that changing the type to 1 should fix this bug.
> 
> Oh, that absolutely needs to be done.
> 
> Because that "strcpy() -> strscpy()" conversion really depends on that
> size being the right size (well, in this case minimal safe size) for
> the actual accesses, exactly because "strscpy()" is perfectly willing
> to write *past* the end of the destination string within that given
> size limit (ie it reads and writes in the same 8-byte chunks).
> 
> So if you have a small target string that is contained in a big
> object, then the "hardened" strcpy() code can actually end up
> overwriting things past the end of the strring, even if the string
> itself were to have fit in the buffer.
> 
> I note that every single use in string.h is buggy, and it worries me
> that __compiletime_object_size() does this too. The only user of that
> seems to be check_copy_size(), and now I'm a bit worried what that bug
> may have hidden.
> 
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).
> 
> There is something actively *evil* about it. Daniel, Kees, please jump
> on this.
> 
> Andrey, thanks for noticing this thing,
> 
>   Linus

The issue is the usage of strscpy then, not the __builtin_object_size
type parameter. The type is set 0 rather than 1 to be more lenient by
not detecting intra-object overflow, which is going to come later.

If strscpy treats the count parameter as a *guarantee* of the dest size
rather than a limit, it's wrong to use it there, whether or not the type
parameter for __builtin_object_size is 0 or 1 since it can still return
a larger size. It's a limit with no guaranteed minimum.

My initial patch used strlcpy there, because I wasn't aware of strscpy
before it was suggested:

http://www.openwall.com/lists/kernel-hardening/2017/05/04/11

I was wrong to move it to strscpy. It could be switched back to strlcpy
again unless the kernel considers the count parameter to be a guarantee
that could be leveraged in the future. Using the fortified strlen +
memcpy would provide the improvement that strscpy was meant to provide
there over strlcpy.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Daniel Micay
On Fri, 2017-07-14 at 12:58 -0700, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
>  wrote:
> > 
> > > yet when I look at the generated code for __ip_map_lookup, I see
> > > 
> > >movl$32, %edx   #,
> > >movq%r13, %rsi  # class,
> > >leaq48(%rax), %rdi  #, tmp126
> > >callstrscpy #
> > > 
> > > what's the bug here? Look at that third argume8nt - %rdx. It is
> > > initialized to 32.
> > 
> > It's not a compiler bug, it's a bug in our strcpy().
> > Whoever wrote this strcpy() into strscpy() code apparently didn't
> > read carefully
> > enough gcc manual about __builtin_object_size().
> > 
> > Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking
> > .html :
> > 
> > __builtin_object_size(ptr, type) returns a constant number
> > of bytes from 'ptr' to the end of the object 'ptr'
> > pointer points to. "type" is an integer constant from 0 to
> > 3. If the least significant bit is clear, objects
> > are whole variables, if it is set, a closest surrounding
> > subobject is considered the object a pointer points to.
> > The second bit determines if maximum or minimum of remaining
> > bytes is computed.
> > 
> > We have type = 0 in strcpy(), so the least significant bit is clear.
> > So the 'ptr' is considered as a pointer to the whole
> > variable i.e. pointer to struct ip_map ip;
> > And the number of bytes from 'ip.m_class' to the end of the ip
> > object is exactly 32.
> > 
> > I suppose that changing the type to 1 should fix this bug.
> 
> Oh, that absolutely needs to be done.
> 
> Because that "strcpy() -> strscpy()" conversion really depends on that
> size being the right size (well, in this case minimal safe size) for
> the actual accesses, exactly because "strscpy()" is perfectly willing
> to write *past* the end of the destination string within that given
> size limit (ie it reads and writes in the same 8-byte chunks).
> 
> So if you have a small target string that is contained in a big
> object, then the "hardened" strcpy() code can actually end up
> overwriting things past the end of the strring, even if the string
> itself were to have fit in the buffer.
> 
> I note that every single use in string.h is buggy, and it worries me
> that __compiletime_object_size() does this too. The only user of that
> seems to be check_copy_size(), and now I'm a bit worried what that bug
> may have hidden.
> 
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).
> 
> There is something actively *evil* about it. Daniel, Kees, please jump
> on this.
> 
> Andrey, thanks for noticing this thing,
> 
>   Linus

The issue is the usage of strscpy then, not the __builtin_object_size
type parameter. The type is set 0 rather than 1 to be more lenient by
not detecting intra-object overflow, which is going to come later.

If strscpy treats the count parameter as a *guarantee* of the dest size
rather than a limit, it's wrong to use it there, whether or not the type
parameter for __builtin_object_size is 0 or 1 since it can still return
a larger size. It's a limit with no guaranteed minimum.

My initial patch used strlcpy there, because I wasn't aware of strscpy
before it was suggested:

http://www.openwall.com/lists/kernel-hardening/2017/05/04/11

I was wrong to move it to strscpy. It could be switched back to strlcpy
again unless the kernel considers the count parameter to be a guarantee
that could be leveraged in the future. Using the fortified strlen +
memcpy would provide the improvement that strscpy was meant to provide
there over strlcpy.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Andrey Rybainin


On 07/14/2017 10:58 PM, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
>  wrote:
>>
>>> yet when I look at the generated code for __ip_map_lookup, I see
>>>
>>>movl$32, %edx   #,
>>>movq%r13, %rsi  # class,
>>>leaq48(%rax), %rdi  #, tmp126
>>>callstrscpy #
>>>
>>> what's the bug here? Look at that third argume8nt - %rdx. It is
>>> initialized to 32.
>>
>> It's not a compiler bug, it's a bug in our strcpy().
>> Whoever wrote this strcpy() into strscpy() code apparently didn't read 
>> carefully
>> enough gcc manual about __builtin_object_size().
>>
>> Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
>>
>> __builtin_object_size(ptr, type) returns a constant number of bytes 
>> from 'ptr' to the end of the object 'ptr'
>> pointer points to. "type" is an integer constant from 0 to 3. If the 
>> least significant bit is clear, objects
>> are whole variables, if it is set, a closest surrounding subobject 
>> is considered the object a pointer points to.
>> The second bit determines if maximum or minimum of remaining bytes 
>> is computed.
>>
>> We have type = 0 in strcpy(), so the least significant bit is clear. So the 
>> 'ptr' is considered as a pointer to the whole
>> variable i.e. pointer to struct ip_map ip;
>> And the number of bytes from 'ip.m_class' to the end of the ip object is 
>> exactly 32.
>>
>> I suppose that changing the type to 1 should fix this bug.
> 
> Oh, that absolutely needs to be done.
> 
> Because that "strcpy() -> strscpy()" conversion really depends on that
> size being the right size (well, in this case minimal safe size) for
> the actual accesses, exactly because "strscpy()" is perfectly willing
> to write *past* the end of the destination string within that given
> size limit (ie it reads and writes in the same 8-byte chunks).
> 
> So if you have a small target string that is contained in a big
> object, then the "hardened" strcpy() code can actually end up
> overwriting things past the end of the strring, even if the string
> itself were to have fit in the buffer.
> 
> I note that every single use in string.h is buggy, and it worries me
> that __compiletime_object_size() does this too. The only user of that
> seems to be check_copy_size(), and now I'm a bit worried what that bug
> may have hidden.
> 
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).
> 

A have some more news to make you even more "happier" :)
strcpy() choose to copy 32-bytes instead of smaller 5-bytes because it has one 
more bug :)

GCC couldn't determine size of class (which is 5-byte string):
strcpy(ip.m_class, class);

So, p_size = 32 and q_size  = -1, this "if (p_size == (size_t)-1 && q_size == 
(size_t)-1)" is false
(because of bogus '&&', obviously we should have '||' here)

and since (32 < (size_t)-1) 
if (strscpy(p, q, p_size < q_size ? p_size : q_size) < 0)

we end up with 32-bytes strscpy().

Enjoy :)


> There is something actively *evil* about it. Daniel, Kees, please jump on 
> this.
> 
> Andrey, thanks for noticing this thing,
> 
>   Linus
> 


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Andrey Rybainin


On 07/14/2017 10:58 PM, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
>  wrote:
>>
>>> yet when I look at the generated code for __ip_map_lookup, I see
>>>
>>>movl$32, %edx   #,
>>>movq%r13, %rsi  # class,
>>>leaq48(%rax), %rdi  #, tmp126
>>>callstrscpy #
>>>
>>> what's the bug here? Look at that third argume8nt - %rdx. It is
>>> initialized to 32.
>>
>> It's not a compiler bug, it's a bug in our strcpy().
>> Whoever wrote this strcpy() into strscpy() code apparently didn't read 
>> carefully
>> enough gcc manual about __builtin_object_size().
>>
>> Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
>>
>> __builtin_object_size(ptr, type) returns a constant number of bytes 
>> from 'ptr' to the end of the object 'ptr'
>> pointer points to. "type" is an integer constant from 0 to 3. If the 
>> least significant bit is clear, objects
>> are whole variables, if it is set, a closest surrounding subobject 
>> is considered the object a pointer points to.
>> The second bit determines if maximum or minimum of remaining bytes 
>> is computed.
>>
>> We have type = 0 in strcpy(), so the least significant bit is clear. So the 
>> 'ptr' is considered as a pointer to the whole
>> variable i.e. pointer to struct ip_map ip;
>> And the number of bytes from 'ip.m_class' to the end of the ip object is 
>> exactly 32.
>>
>> I suppose that changing the type to 1 should fix this bug.
> 
> Oh, that absolutely needs to be done.
> 
> Because that "strcpy() -> strscpy()" conversion really depends on that
> size being the right size (well, in this case minimal safe size) for
> the actual accesses, exactly because "strscpy()" is perfectly willing
> to write *past* the end of the destination string within that given
> size limit (ie it reads and writes in the same 8-byte chunks).
> 
> So if you have a small target string that is contained in a big
> object, then the "hardened" strcpy() code can actually end up
> overwriting things past the end of the strring, even if the string
> itself were to have fit in the buffer.
> 
> I note that every single use in string.h is buggy, and it worries me
> that __compiletime_object_size() does this too. The only user of that
> seems to be check_copy_size(), and now I'm a bit worried what that bug
> may have hidden.
> 
> I find "hardening" code that adds bugs to be particularly bad and
> ugly, the same way that I absolutely *hate* debugging code that turns
> out to make debugging impossible (we had that with the "better" stack
> tracing code that caused kernel panics to kill the machine entirely
> rather than show the backtrace, and I'm still bitter about it a decade
> after the fact).
> 

A have some more news to make you even more "happier" :)
strcpy() choose to copy 32-bytes instead of smaller 5-bytes because it has one 
more bug :)

GCC couldn't determine size of class (which is 5-byte string):
strcpy(ip.m_class, class);

So, p_size = 32 and q_size  = -1, this "if (p_size == (size_t)-1 && q_size == 
(size_t)-1)" is false
(because of bogus '&&', obviously we should have '||' here)

and since (32 < (size_t)-1) 
if (strscpy(p, q, p_size < q_size ? p_size : q_size) < 0)

we end up with 32-bytes strscpy().

Enjoy :)


> There is something actively *evil* about it. Daniel, Kees, please jump on 
> this.
> 
> Andrey, thanks for noticing this thing,
> 
>   Linus
> 


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
 wrote:
>
>> yet when I look at the generated code for __ip_map_lookup, I see
>>
>>movl$32, %edx   #,
>>movq%r13, %rsi  # class,
>>leaq48(%rax), %rdi  #, tmp126
>>callstrscpy #
>>
>> what's the bug here? Look at that third argume8nt - %rdx. It is
>> initialized to 32.
>
> It's not a compiler bug, it's a bug in our strcpy().
> Whoever wrote this strcpy() into strscpy() code apparently didn't read 
> carefully
> enough gcc manual about __builtin_object_size().
>
> Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
>
> __builtin_object_size(ptr, type) returns a constant number of bytes 
> from 'ptr' to the end of the object 'ptr'
> pointer points to. "type" is an integer constant from 0 to 3. If the 
> least significant bit is clear, objects
> are whole variables, if it is set, a closest surrounding subobject is 
> considered the object a pointer points to.
> The second bit determines if maximum or minimum of remaining bytes is 
> computed.
>
> We have type = 0 in strcpy(), so the least significant bit is clear. So the 
> 'ptr' is considered as a pointer to the whole
> variable i.e. pointer to struct ip_map ip;
> And the number of bytes from 'ip.m_class' to the end of the ip object is 
> exactly 32.
>
> I suppose that changing the type to 1 should fix this bug.

Oh, that absolutely needs to be done.

Because that "strcpy() -> strscpy()" conversion really depends on that
size being the right size (well, in this case minimal safe size) for
the actual accesses, exactly because "strscpy()" is perfectly willing
to write *past* the end of the destination string within that given
size limit (ie it reads and writes in the same 8-byte chunks).

So if you have a small target string that is contained in a big
object, then the "hardened" strcpy() code can actually end up
overwriting things past the end of the strring, even if the string
itself were to have fit in the buffer.

I note that every single use in string.h is buggy, and it worries me
that __compiletime_object_size() does this too. The only user of that
seems to be check_copy_size(), and now I'm a bit worried what that bug
may have hidden.

I find "hardening" code that adds bugs to be particularly bad and
ugly, the same way that I absolutely *hate* debugging code that turns
out to make debugging impossible (we had that with the "better" stack
tracing code that caused kernel panics to kill the machine entirely
rather than show the backtrace, and I'm still bitter about it a decade
after the fact).

There is something actively *evil* about it. Daniel, Kees, please jump on this.

Andrey, thanks for noticing this thing,

  Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 12:43 PM, Andrey Ryabinin
 wrote:
>
>> yet when I look at the generated code for __ip_map_lookup, I see
>>
>>movl$32, %edx   #,
>>movq%r13, %rsi  # class,
>>leaq48(%rax), %rdi  #, tmp126
>>callstrscpy #
>>
>> what's the bug here? Look at that third argume8nt - %rdx. It is
>> initialized to 32.
>
> It's not a compiler bug, it's a bug in our strcpy().
> Whoever wrote this strcpy() into strscpy() code apparently didn't read 
> carefully
> enough gcc manual about __builtin_object_size().
>
> Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :
>
> __builtin_object_size(ptr, type) returns a constant number of bytes 
> from 'ptr' to the end of the object 'ptr'
> pointer points to. "type" is an integer constant from 0 to 3. If the 
> least significant bit is clear, objects
> are whole variables, if it is set, a closest surrounding subobject is 
> considered the object a pointer points to.
> The second bit determines if maximum or minimum of remaining bytes is 
> computed.
>
> We have type = 0 in strcpy(), so the least significant bit is clear. So the 
> 'ptr' is considered as a pointer to the whole
> variable i.e. pointer to struct ip_map ip;
> And the number of bytes from 'ip.m_class' to the end of the ip object is 
> exactly 32.
>
> I suppose that changing the type to 1 should fix this bug.

Oh, that absolutely needs to be done.

Because that "strcpy() -> strscpy()" conversion really depends on that
size being the right size (well, in this case minimal safe size) for
the actual accesses, exactly because "strscpy()" is perfectly willing
to write *past* the end of the destination string within that given
size limit (ie it reads and writes in the same 8-byte chunks).

So if you have a small target string that is contained in a big
object, then the "hardened" strcpy() code can actually end up
overwriting things past the end of the strring, even if the string
itself were to have fit in the buffer.

I note that every single use in string.h is buggy, and it worries me
that __compiletime_object_size() does this too. The only user of that
seems to be check_copy_size(), and now I'm a bit worried what that bug
may have hidden.

I find "hardening" code that adds bugs to be particularly bad and
ugly, the same way that I absolutely *hate* debugging code that turns
out to make debugging impossible (we had that with the "better" stack
tracing code that caused kernel panics to kill the machine entirely
rather than show the backtrace, and I'm still bitter about it a decade
after the fact).

There is something actively *evil* about it. Daniel, Kees, please jump on this.

Andrey, thanks for noticing this thing,

  Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Dave Jones
On Fri, Jul 14, 2017 at 12:05:02PM -0700, Linus Torvalds wrote:
 > On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
 > > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 > >  >
 > >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git 
 > > tags/nfs-for-4.13-1
 > >
 > > Since this landed, I'm seeing this during boot..
 > >
 > >  ==
 > >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 > >  Read of size 8 at addr b4eeaf20 by task nfsd/688
 > 
 > Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
 > of the sources?
 > 
 > The problem may be that the source is initialized from the global
 > string "nfsd", and KASAN may be unhappy abotu the fact that we read 8
 > bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
 > strscpy..
 > 
 > That said, we do check the size first (because we also *write* 8 bytes
 > at a time), so maybe KASAN shouldn't even need to care.
 > 
 > Hmm. it really looks to me like this is actually a compiler bug (I'm
 > using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
 > the same).

Debian's 6.4.0

 > This is the source code in __ip_map_lookup:
 > 
 > struct ip_map ip;
 >  .
 > strcpy(ip.m_class, class);
 > 
 > and "m_class" is 8 bytes in size:
 > 
 > struct ip_map {
 > ...
 > charm_class[8]; /* e.g. "nfsd" */
 > ...
 > 
 > yet when I look at the generated code for __ip_map_lookup, I see
 > 
 > movl$32, %edx   #,
 > movq%r13, %rsi  # class,
 > leaq48(%rax), %rdi  #, tmp126
 > callstrscpy #
 > 
 > what's the bug here? Look at that third argument - %rdx. It is
 > initialized to 32.
 > 
 > WTF?
 > 
 > The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
 > of the two object sizes as the size argument. How the hell is that
 > size argument 32?
 > 
 > Am I missing something? DaveJ, do you see the same?

My compiler seems to have replaced the call with an inlined copy afaics.

0be0 <__ip_map_lookup>:
{
 be0:   e8 00 00 00 00  callq  be5 <__ip_map_lookup+0x5>
 be5:   55  push   %rbp
 be6:   48 b8 00 00 00 00 00movabs $0xdc00,%rax
 bed:   fc ff df 
 bf0:   48 89 e5mov%rsp,%rbp
 bf3:   41 57   push   %r15
 bf5:   41 56   push   %r14
 bf7:   4c 8d 32lea(%rdx),%r14
if (strscpy(p, q, p_size < q_size ? p_size : q_size) < 0)
 bfa:   ba 20 00 00 00  mov$0x20,%edx
 bff:   41 55   push   %r13
 c01:   4c 8d 2elea(%rsi),%r13
 c04:   41 54   push   %r12
 c06:   53  push   %rbx
 c07:   48 8d 1flea(%rdi),%rbx
 c0a:   48 8d a4 24 60 ff fflea-0xa0(%rsp),%rsp
 c11:   ff 
 c12:   49 89 e4mov%rsp,%r12
 c15:   49 c1 ec 03 shr$0x3,%r12
 c19:   48 c7 04 24 b3 8a b5movq   $0x41b58ab3,(%rsp)
 c20:   41 
 c21:   48 c7 44 24 08 00 00movq   $0x0,0x8(%rsp)
 c28:   00 00 
 c2a:   48 c7 44 24 10 00 00movq   $0x0,0x10(%rsp)
 c31:   00 00 
 c33:   48 8d 7c 24 50  lea0x50(%rsp),%rdi
 c38:   4d 8d 24 04 lea(%r12,%rax,1),%r12
 c3c:   41 c7 04 24 f1 f1 f1movl   $0xf1f1f1f1,(%r12)
 c43:   f1 
 c44:   41 c7 44 24 0c 00 00movl   $0xf4f4,0xc(%r12)
 c4b:   f4 f4 
 c4d:   65 48 8b 04 25 28 00mov%gs:0x28,%rax
 c54:   00 00 
 c56:   48 89 84 24 98 00 00mov%rax,0x98(%rsp)
 c5d:   00 
 c5e:   31 c0   xor%eax,%eax
 c60:   e8 00 00 00 00  callq  c65 <__ip_map_lookup+0x85>
 c65:   48 85 c0test   %rax,%rax
 c68:   0f 88 a0 00 00 00   js d0e <__ip_map_lookup+0x12e>
ip.m_addr = *addr;
 c6e:   be 10 00 00 00  mov$0x10,%esi
 c73:   49 8d 3elea(%r14),%rdi
 c76:   e8 00 00 00 00  callq  c7b <__ip_map_lookup+0x9b>
 c7b:   49 8b 56 08 mov0x8(%r14),%rdx


But that mov $0x20,%edx looks like it might be the same value we're talking 
about.

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Dave Jones
On Fri, Jul 14, 2017 at 12:05:02PM -0700, Linus Torvalds wrote:
 > On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
 > > On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 > >  >
 > >  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git 
 > > tags/nfs-for-4.13-1
 > >
 > > Since this landed, I'm seeing this during boot..
 > >
 > >  ==
 > >  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 > >  Read of size 8 at addr b4eeaf20 by task nfsd/688
 > 
 > Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
 > of the sources?
 > 
 > The problem may be that the source is initialized from the global
 > string "nfsd", and KASAN may be unhappy abotu the fact that we read 8
 > bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
 > strscpy..
 > 
 > That said, we do check the size first (because we also *write* 8 bytes
 > at a time), so maybe KASAN shouldn't even need to care.
 > 
 > Hmm. it really looks to me like this is actually a compiler bug (I'm
 > using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
 > the same).

Debian's 6.4.0

 > This is the source code in __ip_map_lookup:
 > 
 > struct ip_map ip;
 >  .
 > strcpy(ip.m_class, class);
 > 
 > and "m_class" is 8 bytes in size:
 > 
 > struct ip_map {
 > ...
 > charm_class[8]; /* e.g. "nfsd" */
 > ...
 > 
 > yet when I look at the generated code for __ip_map_lookup, I see
 > 
 > movl$32, %edx   #,
 > movq%r13, %rsi  # class,
 > leaq48(%rax), %rdi  #, tmp126
 > callstrscpy #
 > 
 > what's the bug here? Look at that third argument - %rdx. It is
 > initialized to 32.
 > 
 > WTF?
 > 
 > The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
 > of the two object sizes as the size argument. How the hell is that
 > size argument 32?
 > 
 > Am I missing something? DaveJ, do you see the same?

My compiler seems to have replaced the call with an inlined copy afaics.

0be0 <__ip_map_lookup>:
{
 be0:   e8 00 00 00 00  callq  be5 <__ip_map_lookup+0x5>
 be5:   55  push   %rbp
 be6:   48 b8 00 00 00 00 00movabs $0xdc00,%rax
 bed:   fc ff df 
 bf0:   48 89 e5mov%rsp,%rbp
 bf3:   41 57   push   %r15
 bf5:   41 56   push   %r14
 bf7:   4c 8d 32lea(%rdx),%r14
if (strscpy(p, q, p_size < q_size ? p_size : q_size) < 0)
 bfa:   ba 20 00 00 00  mov$0x20,%edx
 bff:   41 55   push   %r13
 c01:   4c 8d 2elea(%rsi),%r13
 c04:   41 54   push   %r12
 c06:   53  push   %rbx
 c07:   48 8d 1flea(%rdi),%rbx
 c0a:   48 8d a4 24 60 ff fflea-0xa0(%rsp),%rsp
 c11:   ff 
 c12:   49 89 e4mov%rsp,%r12
 c15:   49 c1 ec 03 shr$0x3,%r12
 c19:   48 c7 04 24 b3 8a b5movq   $0x41b58ab3,(%rsp)
 c20:   41 
 c21:   48 c7 44 24 08 00 00movq   $0x0,0x8(%rsp)
 c28:   00 00 
 c2a:   48 c7 44 24 10 00 00movq   $0x0,0x10(%rsp)
 c31:   00 00 
 c33:   48 8d 7c 24 50  lea0x50(%rsp),%rdi
 c38:   4d 8d 24 04 lea(%r12,%rax,1),%r12
 c3c:   41 c7 04 24 f1 f1 f1movl   $0xf1f1f1f1,(%r12)
 c43:   f1 
 c44:   41 c7 44 24 0c 00 00movl   $0xf4f4,0xc(%r12)
 c4b:   f4 f4 
 c4d:   65 48 8b 04 25 28 00mov%gs:0x28,%rax
 c54:   00 00 
 c56:   48 89 84 24 98 00 00mov%rax,0x98(%rsp)
 c5d:   00 
 c5e:   31 c0   xor%eax,%eax
 c60:   e8 00 00 00 00  callq  c65 <__ip_map_lookup+0x85>
 c65:   48 85 c0test   %rax,%rax
 c68:   0f 88 a0 00 00 00   js d0e <__ip_map_lookup+0x12e>
ip.m_addr = *addr;
 c6e:   be 10 00 00 00  mov$0x10,%esi
 c73:   49 8d 3elea(%r14),%rdi
 c76:   e8 00 00 00 00  callq  c7b <__ip_map_lookup+0x9b>
 c7b:   49 8b 56 08 mov0x8(%r14),%rdx


But that mov $0x20,%edx looks like it might be the same value we're talking 
about.

Dave



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Andrey Ryabinin


On 07/14/2017 10:05 PM, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
>> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>>  >
>>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>>
>> Since this landed, I'm seeing this during boot..
>>
>>  ==
>>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>>  Read of size 8 at addr b4eeaf20 by task nfsd/688
> 
> Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
> of the sources?
> 

Nope.

> The problem may be that the source is initialized from the global
> string "nfsd", and KASAN may be unhappy abotu the fact that we read 8
> bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
> strscpy..
> 

Right.

> That said, we do check the size first (because we also *write* 8 bytes
> at a time), so maybe KASAN shouldn't even need to care.
>

Perhaps we could fallback to unoptimzed copy for KASAN case by setting max = 0
in strscpy().

 
> Hmm. it really looks to me like this is actually a compiler bug (I'm
> using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
> the same).
> 
> This is the source code in __ip_map_lookup:
> 
> struct ip_map ip;
>  .
> strcpy(ip.m_class, class);
> 
> and "m_class" is 8 bytes in size:
> 
> struct ip_map {
> ...
> charm_class[8]; /* e.g. "nfsd" */
> ...
> 
> yet when I look at the generated code for __ip_map_lookup, I see
> 
> movl$32, %edx   #,
> movq%r13, %rsi  # class,
> leaq48(%rax), %rdi  #, tmp126
> callstrscpy #
> 
> what's the bug here? Look at that third argume8nt - %rdx. It is
> initialized to 32.
> 
> WTF?
> 


It's not a compiler bug, it's a bug in our strcpy().
Whoever wrote this strcpy() into strscpy() code apparently didn't read carefully
enough gcc manual about __builtin_object_size().

Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :

__builtin_object_size(ptr, type) returns a constant number of bytes 
from 'ptr' to the end of the object 'ptr'
pointer points to. "type" is an integer constant from 0 to 3. If the 
least significant bit is clear, objects
are whole variables, if it is set, a closest surrounding subobject is 
considered the object a pointer points to.
The second bit determines if maximum or minimum of remaining bytes is 
computed. 

We have type = 0 in strcpy(), so the least significant bit is clear. So the 
'ptr' is considered as a pointer to the whole
variable i.e. pointer to struct ip_map ip;
And the number of bytes from 'ip.m_class' to the end of the ip object is 
exactly 32.

I suppose that changing the type to 1 should fix this bug.



> The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
> of the two object sizes as the size argument. How the hell is that
> size argument 32?
> 
> Am I missing something? DaveJ, do you see the same?
> 
>Linus
> 


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Andrey Ryabinin


On 07/14/2017 10:05 PM, Linus Torvalds wrote:
> On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
>> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>>  >
>>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>>
>> Since this landed, I'm seeing this during boot..
>>
>>  ==
>>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>>  Read of size 8 at addr b4eeaf20 by task nfsd/688
> 
> Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
> of the sources?
> 

Nope.

> The problem may be that the source is initialized from the global
> string "nfsd", and KASAN may be unhappy abotu the fact that we read 8
> bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
> strscpy..
> 

Right.

> That said, we do check the size first (because we also *write* 8 bytes
> at a time), so maybe KASAN shouldn't even need to care.
>

Perhaps we could fallback to unoptimzed copy for KASAN case by setting max = 0
in strscpy().

 
> Hmm. it really looks to me like this is actually a compiler bug (I'm
> using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
> the same).
> 
> This is the source code in __ip_map_lookup:
> 
> struct ip_map ip;
>  .
> strcpy(ip.m_class, class);
> 
> and "m_class" is 8 bytes in size:
> 
> struct ip_map {
> ...
> charm_class[8]; /* e.g. "nfsd" */
> ...
> 
> yet when I look at the generated code for __ip_map_lookup, I see
> 
> movl$32, %edx   #,
> movq%r13, %rsi  # class,
> leaq48(%rax), %rdi  #, tmp126
> callstrscpy #
> 
> what's the bug here? Look at that third argume8nt - %rdx. It is
> initialized to 32.
> 
> WTF?
> 


It's not a compiler bug, it's a bug in our strcpy().
Whoever wrote this strcpy() into strscpy() code apparently didn't read carefully
enough gcc manual about __builtin_object_size().

Summary from https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html :

__builtin_object_size(ptr, type) returns a constant number of bytes 
from 'ptr' to the end of the object 'ptr'
pointer points to. "type" is an integer constant from 0 to 3. If the 
least significant bit is clear, objects
are whole variables, if it is set, a closest surrounding subobject is 
considered the object a pointer points to.
The second bit determines if maximum or minimum of remaining bytes is 
computed. 

We have type = 0 in strcpy(), so the least significant bit is clear. So the 
'ptr' is considered as a pointer to the whole
variable i.e. pointer to struct ip_map ip;
And the number of bytes from 'ip.m_class' to the end of the ip object is 
exactly 32.

I suppose that changing the type to 1 should fix this bug.



> The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
> of the two object sizes as the size argument. How the hell is that
> size argument 32?
> 
> Am I missing something? DaveJ, do you see the same?
> 
>Linus
> 


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  >
>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>
> Since this landed, I'm seeing this during boot..
>
>  ==
>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  Read of size 8 at addr b4eeaf20 by task nfsd/688

Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
of the sources?

The problem may be that the source is initialized from the global
string "nfsd", and KASAN may be unhappy abotu the fact that we read 8
bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
strscpy..

That said, we do check the size first (because we also *write* 8 bytes
at a time), so maybe KASAN shouldn't even need to care.

Hmm. it really looks to me like this is actually a compiler bug (I'm
using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
the same).

This is the source code in __ip_map_lookup:

struct ip_map ip;
 .
strcpy(ip.m_class, class);

and "m_class" is 8 bytes in size:

struct ip_map {
...
charm_class[8]; /* e.g. "nfsd" */
...

yet when I look at the generated code for __ip_map_lookup, I see

movl$32, %edx   #,
movq%r13, %rsi  # class,
leaq48(%rax), %rdi  #, tmp126
callstrscpy #

what's the bug here? Look at that third argument - %rdx. It is
initialized to 32.

WTF?

The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
of the two object sizes as the size argument. How the hell is that
size argument 32?

Am I missing something? DaveJ, do you see the same?

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Linus Torvalds
On Fri, Jul 14, 2017 at 7:25 AM, Dave Jones  wrote:
> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  >
>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>
> Since this landed, I'm seeing this during boot..
>
>  ==
>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  Read of size 8 at addr b4eeaf20 by task nfsd/688

Is KASAN aware that strscpy() does the word-at-a-time optimistic reads
of the sources?

The problem may be that the source is initialized from the global
string "nfsd", and KASAN may be unhappy abotu the fact that we read 8
bytes from a 5-byte string (four plus NUL) as we do the word-at-a-time
strscpy..

That said, we do check the size first (because we also *write* 8 bytes
at a time), so maybe KASAN shouldn't even need to care.

Hmm. it really looks to me like this is actually a compiler bug (I'm
using current gcc in F26, which is gcc-7.1.1 - I'm assuming DaveJ is
the same).

This is the source code in __ip_map_lookup:

struct ip_map ip;
 .
strcpy(ip.m_class, class);

and "m_class" is 8 bytes in size:

struct ip_map {
...
charm_class[8]; /* e.g. "nfsd" */
...

yet when I look at the generated code for __ip_map_lookup, I see

movl$32, %edx   #,
movq%r13, %rsi  # class,
leaq48(%rax), %rdi  #, tmp126
callstrscpy #

what's the bug here? Look at that third argument - %rdx. It is
initialized to 32.

WTF?

The code to turn "strcpy()" into "strscpy()" should pick the *smaller*
of the two object sizes as the size argument. How the hell is that
size argument 32?

Am I missing something? DaveJ, do you see the same?

   Linus


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread J. Bruce Fields
On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  > Hi Linus,
>  > 
>  > The following changes since commit 
> 32c1431eea4881a6b17bd7c639315010aeefa452:
>  > 
>  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
>  > 
>  > are available in the git repository at:
>  > 
>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>  > 
>  > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
>  > 
>  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
> 16:57:18 -0400)
> 
> Since this landed, I'm seeing this during boot..

__ip_map_lookup does have a strcpy, and it looks like that can be
implemented in terms of strscpy.

Based on that backtrace, it should just be copying from
nfsd_program->pg_class, which is initialized to "nfsd" and never
changed.

I spent a few minutes trying to figure out the tracing macros that
define str__nfsd__trace_system_name+0x3a0/0x3e0 and gave up.

So I have no idea what's going on

--b.

> 
>  ==
>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  Read of size 8 at addr b4eeaf20 by task nfsd/688
>  
>  CPU: 0 PID: 688 Comm: nfsd Not tainted 4.12.0-firewall+ #14 
>  Call Trace:
>   dump_stack+0x68/0x94
>   print_address_description+0x2c/0x270
>   ? strscpy+0x4a/0x230
>   kasan_report+0x239/0x350
>   __asan_load8+0x55/0x90
>   strscpy+0x4a/0x230
>   __ip_map_lookup+0x85/0x150
>   ? ip_map_init+0x50/0x50
>   ? lock_acquire+0x270/0x270
>   svcauth_unix_set_client+0x9f3/0xdc0
>   ? svcauth_unix_set_client+0x5/0xdc0
>   ? unix_gid_parse+0x340/0x340
>   ? kasan_kmalloc+0xbb/0xf0
>   ? groups_alloc+0x29/0x80
>   ? __kmalloc+0x13b/0x360
>   ? groups_alloc+0x29/0x80
>   ? groups_alloc+0x48/0x80
>   ? svcauth_unix_accept+0x3a5/0x3c0
>   svc_set_client+0x50/0x60
>   svc_process+0x901/0x10b0
>   ? svc_register+0x430/0x430
>   ? __might_sleep+0x78/0xf0
>   ? preempt_count_sub+0xaf/0x120
>   ? __validate_process_creds+0x9e/0x160
>   nfsd+0x250/0x380
>   ? nfsd+0x5/0x380
>   kthread+0x1ab/0x200
>   ? nfsd_destroy+0x1f0/0x1f0
>   ? __kthread_create_on_node+0x340/0x340
>   ret_from_fork+0x27/0x40
>  
>  The buggy address belongs to the variable:
>   str__nfsd__trace_system_name+0x3a0/0x3e0
>  
>  Memory state around the buggy address:
>   b4eeae00: 00 00 00 01 fa fa fa fa 00 00 00 00 00 04 fa fa
>   b4eeae80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
>  >b4eeaf00: fa fa fa fa 05 fa fa fa fa fa fa fa 00 00 00 00
> ^   
>   b4eeaf80: 00 fa fa fa fa fa fa fa 00 00 05 fa fa fa fa fa
>   b4eeb000: 00 03 fa fa fa fa fa fa 00 07 fa fa fa fa fa fa
>  ==


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread J. Bruce Fields
On Fri, Jul 14, 2017 at 10:25:43AM -0400, Dave Jones wrote:
> On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
>  > Hi Linus,
>  > 
>  > The following changes since commit 
> 32c1431eea4881a6b17bd7c639315010aeefa452:
>  > 
>  >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
>  > 
>  > are available in the git repository at:
>  > 
>  >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>  > 
>  > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
>  > 
>  >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
> 16:57:18 -0400)
> 
> Since this landed, I'm seeing this during boot..

__ip_map_lookup does have a strcpy, and it looks like that can be
implemented in terms of strscpy.

Based on that backtrace, it should just be copying from
nfsd_program->pg_class, which is initialized to "nfsd" and never
changed.

I spent a few minutes trying to figure out the tracing macros that
define str__nfsd__trace_system_name+0x3a0/0x3e0 and gave up.

So I have no idea what's going on

--b.

> 
>  ==
>  BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
>  Read of size 8 at addr b4eeaf20 by task nfsd/688
>  
>  CPU: 0 PID: 688 Comm: nfsd Not tainted 4.12.0-firewall+ #14 
>  Call Trace:
>   dump_stack+0x68/0x94
>   print_address_description+0x2c/0x270
>   ? strscpy+0x4a/0x230
>   kasan_report+0x239/0x350
>   __asan_load8+0x55/0x90
>   strscpy+0x4a/0x230
>   __ip_map_lookup+0x85/0x150
>   ? ip_map_init+0x50/0x50
>   ? lock_acquire+0x270/0x270
>   svcauth_unix_set_client+0x9f3/0xdc0
>   ? svcauth_unix_set_client+0x5/0xdc0
>   ? unix_gid_parse+0x340/0x340
>   ? kasan_kmalloc+0xbb/0xf0
>   ? groups_alloc+0x29/0x80
>   ? __kmalloc+0x13b/0x360
>   ? groups_alloc+0x29/0x80
>   ? groups_alloc+0x48/0x80
>   ? svcauth_unix_accept+0x3a5/0x3c0
>   svc_set_client+0x50/0x60
>   svc_process+0x901/0x10b0
>   ? svc_register+0x430/0x430
>   ? __might_sleep+0x78/0xf0
>   ? preempt_count_sub+0xaf/0x120
>   ? __validate_process_creds+0x9e/0x160
>   nfsd+0x250/0x380
>   ? nfsd+0x5/0x380
>   kthread+0x1ab/0x200
>   ? nfsd_destroy+0x1f0/0x1f0
>   ? __kthread_create_on_node+0x340/0x340
>   ret_from_fork+0x27/0x40
>  
>  The buggy address belongs to the variable:
>   str__nfsd__trace_system_name+0x3a0/0x3e0
>  
>  Memory state around the buggy address:
>   b4eeae00: 00 00 00 01 fa fa fa fa 00 00 00 00 00 04 fa fa
>   b4eeae80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
>  >b4eeaf00: fa fa fa fa 05 fa fa fa fa fa fa fa 00 00 00 00
> ^   
>   b4eeaf80: 00 fa fa fa fa fa fa fa 00 00 05 fa fa fa fa fa
>   b4eeb000: 00 03 fa fa fa fa fa fa 00 07 fa fa fa fa fa fa
>  ==


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Dave Jones
On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 > Hi Linus,
 > 
 > The following changes since commit 32c1431eea4881a6b17bd7c639315010aeefa452:
 > 
 >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
 > 
 > are available in the git repository at:
 > 
 >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
 > 
 > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
 > 
 >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
 > 16:57:18 -0400)

Since this landed, I'm seeing this during boot..

 ==
 BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 Read of size 8 at addr b4eeaf20 by task nfsd/688
 
 CPU: 0 PID: 688 Comm: nfsd Not tainted 4.12.0-firewall+ #14 
 Call Trace:
  dump_stack+0x68/0x94
  print_address_description+0x2c/0x270
  ? strscpy+0x4a/0x230
  kasan_report+0x239/0x350
  __asan_load8+0x55/0x90
  strscpy+0x4a/0x230
  __ip_map_lookup+0x85/0x150
  ? ip_map_init+0x50/0x50
  ? lock_acquire+0x270/0x270
  svcauth_unix_set_client+0x9f3/0xdc0
  ? svcauth_unix_set_client+0x5/0xdc0
  ? unix_gid_parse+0x340/0x340
  ? kasan_kmalloc+0xbb/0xf0
  ? groups_alloc+0x29/0x80
  ? __kmalloc+0x13b/0x360
  ? groups_alloc+0x29/0x80
  ? groups_alloc+0x48/0x80
  ? svcauth_unix_accept+0x3a5/0x3c0
  svc_set_client+0x50/0x60
  svc_process+0x901/0x10b0
  ? svc_register+0x430/0x430
  ? __might_sleep+0x78/0xf0
  ? preempt_count_sub+0xaf/0x120
  ? __validate_process_creds+0x9e/0x160
  nfsd+0x250/0x380
  ? nfsd+0x5/0x380
  kthread+0x1ab/0x200
  ? nfsd_destroy+0x1f0/0x1f0
  ? __kthread_create_on_node+0x340/0x340
  ret_from_fork+0x27/0x40
 
 The buggy address belongs to the variable:
  str__nfsd__trace_system_name+0x3a0/0x3e0
 
 Memory state around the buggy address:
  b4eeae00: 00 00 00 01 fa fa fa fa 00 00 00 00 00 04 fa fa
  b4eeae80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
 >b4eeaf00: fa fa fa fa 05 fa fa fa fa fa fa fa 00 00 00 00
^   
  b4eeaf80: 00 fa fa fa fa fa fa fa 00 00 05 fa fa fa fa fa
  b4eeb000: 00 03 fa fa fa fa fa fa 00 07 fa fa fa fa fa fa
 ==



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Dave Jones
On Thu, Jul 13, 2017 at 05:16:24PM -0400, Anna Schumaker wrote:
 > Hi Linus,
 > 
 > The following changes since commit 32c1431eea4881a6b17bd7c639315010aeefa452:
 > 
 >   Linux 4.12-rc5 (2017-06-11 16:48:20 -0700)
 > 
 > are available in the git repository at:
 > 
 >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
 > 
 > for you to fetch changes up to b4f937cffa66b3d56eb8f586e620d0b223a281a3:
 > 
 >   NFS: Don't run wake_up_bit() when nobody is waiting... (2017-07-13 
 > 16:57:18 -0400)

Since this landed, I'm seeing this during boot..

 ==
 BUG: KASAN: global-out-of-bounds in strscpy+0x4a/0x230
 Read of size 8 at addr b4eeaf20 by task nfsd/688
 
 CPU: 0 PID: 688 Comm: nfsd Not tainted 4.12.0-firewall+ #14 
 Call Trace:
  dump_stack+0x68/0x94
  print_address_description+0x2c/0x270
  ? strscpy+0x4a/0x230
  kasan_report+0x239/0x350
  __asan_load8+0x55/0x90
  strscpy+0x4a/0x230
  __ip_map_lookup+0x85/0x150
  ? ip_map_init+0x50/0x50
  ? lock_acquire+0x270/0x270
  svcauth_unix_set_client+0x9f3/0xdc0
  ? svcauth_unix_set_client+0x5/0xdc0
  ? unix_gid_parse+0x340/0x340
  ? kasan_kmalloc+0xbb/0xf0
  ? groups_alloc+0x29/0x80
  ? __kmalloc+0x13b/0x360
  ? groups_alloc+0x29/0x80
  ? groups_alloc+0x48/0x80
  ? svcauth_unix_accept+0x3a5/0x3c0
  svc_set_client+0x50/0x60
  svc_process+0x901/0x10b0
  ? svc_register+0x430/0x430
  ? __might_sleep+0x78/0xf0
  ? preempt_count_sub+0xaf/0x120
  ? __validate_process_creds+0x9e/0x160
  nfsd+0x250/0x380
  ? nfsd+0x5/0x380
  kthread+0x1ab/0x200
  ? nfsd_destroy+0x1f0/0x1f0
  ? __kthread_create_on_node+0x340/0x340
  ret_from_fork+0x27/0x40
 
 The buggy address belongs to the variable:
  str__nfsd__trace_system_name+0x3a0/0x3e0
 
 Memory state around the buggy address:
  b4eeae00: 00 00 00 01 fa fa fa fa 00 00 00 00 00 04 fa fa
  b4eeae80: fa fa fa fa 04 fa fa fa fa fa fa fa 04 fa fa fa
 >b4eeaf00: fa fa fa fa 05 fa fa fa fa fa fa fa 00 00 00 00
^   
  b4eeaf80: 00 fa fa fa fa fa fa fa 00 00 05 fa fa fa fa fa
  b4eeb000: 00 03 fa fa fa fa fa fa 00 07 fa fa fa fa fa fa
 ==



Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Anna Schumaker


On 07/14/2017 03:09 AM, Christoph Hellwig wrote:
> On Thu, Jul 13, 2017 at 02:43:14PM -0700, Linus Torvalds wrote:
>> On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
>>  wrote:
>>>
>>>   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>>
>> Btw, your key seems to have expired, and doing a refresh on it doesn't fix 
>> it.
>>
>> I'm sure you've refreshed your key, but apparently that refresh hasn't
>> been percolated to the public keyservers?

I assumed I had refreshed it once git let me sign things again, but maybe I 
missed a step.

> 
> As someone who has run into an issue in that area recently:  I manually
> had to refresh and re-upload my signing subkey and not just the primary
> key, which wa rather confusing and took a long time to sort out.
> 

I'll start here.  Thanks for the tip!

Anna


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Anna Schumaker


On 07/14/2017 03:09 AM, Christoph Hellwig wrote:
> On Thu, Jul 13, 2017 at 02:43:14PM -0700, Linus Torvalds wrote:
>> On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
>>  wrote:
>>>
>>>   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
>>
>> Btw, your key seems to have expired, and doing a refresh on it doesn't fix 
>> it.
>>
>> I'm sure you've refreshed your key, but apparently that refresh hasn't
>> been percolated to the public keyservers?

I assumed I had refreshed it once git let me sign things again, but maybe I 
missed a step.

> 
> As someone who has run into an issue in that area recently:  I manually
> had to refresh and re-upload my signing subkey and not just the primary
> key, which wa rather confusing and took a long time to sort out.
> 

I'll start here.  Thanks for the tip!

Anna


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Christoph Hellwig
On Thu, Jul 13, 2017 at 02:43:14PM -0700, Linus Torvalds wrote:
> On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
>  wrote:
> >
> >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
> 
> Btw, your key seems to have expired, and doing a refresh on it doesn't fix it.
> 
> I'm sure you've refreshed your key, but apparently that refresh hasn't
> been percolated to the public keyservers?

As someone who has run into an issue in that area recently:  I manually
had to refresh and re-upload my signing subkey and not just the primary
key, which wa rather confusing and took a long time to sort out.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-14 Thread Christoph Hellwig
On Thu, Jul 13, 2017 at 02:43:14PM -0700, Linus Torvalds wrote:
> On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
>  wrote:
> >
> >   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1
> 
> Btw, your key seems to have expired, and doing a refresh on it doesn't fix it.
> 
> I'm sure you've refreshed your key, but apparently that refresh hasn't
> been percolated to the public keyservers?

As someone who has run into an issue in that area recently:  I manually
had to refresh and re-upload my signing subkey and not just the primary
key, which wa rather confusing and took a long time to sort out.


Re: [GIT PULL] Please pull NFS client changes for Linux 4.13

2017-07-13 Thread Linus Torvalds
On Thu, Jul 13, 2017 at 2:16 PM, Anna Schumaker
 wrote:
>
>   git://git.linux-nfs.org/projects/anna/linux-nfs.git tags/nfs-for-4.13-1

Btw, your key seems to have expired, and doing a refresh on it doesn't fix it.

I'm sure you've refreshed your key, but apparently that refresh hasn't
been percolated to the public keyservers?

   Linus


  1   2   >