Re: 9.1-stable crashes while copying data from a NFS mounted directory

2013-01-28 Thread Christian Gusenbauer
On Monday 28 January 2013 07:35:31 YongHyeon PYUN wrote:
> On Fri, Jan 25, 2013 at 06:09:50PM +0100, Christian Gusenbauer wrote:
> > On Friday 25 January 2013 05:50:48 YongHyeon PYUN wrote:
> > > On Fri, Jan 25, 2013 at 01:30:43PM +0900, YongHyeon PYUN wrote:
> > > > On Thu, Jan 24, 2013 at 05:21:50PM -0500, John Baldwin wrote:
> > > > > On Thursday, January 24, 2013 4:22:12 pm Konstantin Belousov wrote:
> > > > > > On Thu, Jan 24, 2013 at 09:50:52PM +0100, Christian Gusenbauer wrote:
> > > > > > > On Thursday 24 January 2013 20:37:09 Konstantin Belousov wrote:
> > > > > > > > On Thu, Jan 24, 2013 at 07:50:49PM +0100, Christian Gusenbauer wrote:
> > > > > > > > > On Thursday 24 January 2013 19:07:23 Konstantin Belousov wrote:
> > > > > > > > > > On Thu, Jan 24, 2013 at 08:03:59PM +0200, Konstantin Belousov wrote:
> > > > > > > > > > > On Thu, Jan 24, 2013 at 06:05:57PM +0100, Christian Gusenbauer wrote:
> > > > > > > > > > > > Hi!
> > > > > > > > > > > > 
> > > > > > > > > > > > I'm using 9.1 stable svn revision 245605 and I get
> > > > > > > > > > > > the panic below if I execute the following commands
> > > > > > > > > > > > (as single user):
> > > > > > > > > > > > 
> > > > > > > > > > > > # swapon -a
> > > > > > > > > > > > # dumpon /dev/ada0s3b
> > > > > > > > > > > > # mount -u /
> > > > > > > > > > > > # ifconfig age0 inet 192.168.2.2 mtu 6144 up
> > > > > > > > > > > > # mount -t nfs -o rsize=32768 data:/multimedia /mnt
> > > > > > > > > > > > # cp /mnt/Movies/test/a.m2ts /tmp
> > > > > > > > > > > > 
> > > > > > > > > > > > then the system panics almost immediately. I'll
> > > > > > > > > > > > attach the stack trace.
> > > > > > > > > > > > 
> > > > > > > > > > > > Note, that I'm using jumbo frames (6144 byte) on a
> > > > > > > > > > > > 1Gbit network, maybe that's the cause for the panic,
> > > > > > > > > > > > because the bcopy (see stack frame #15) fails.
> > > > > > > > > > > > 
> > > > > > > > > > > > Any clues?
> > > > > > > > > > > 
> > > > > > > > > > > I tried a similar operation with the nfs mount of
> > > > > > > > > > > rsize=32768 and mtu 6144, but the machine runs HEAD and
> > > > > > > > > > > em instead of age. I was unable to reproduce the panic
> > > > > > > > > > > on the copy of the 5GB file from nfs mount.
> > > > > > > > > 
> > > > > > > > > Hmmm, I did a quick test. If I do not change the MTU, so
> > > > > > > > > just configuring age0 with
> > > > > > > > > 
> > > > > > > > > # ifconfig age0 inet 192.168.2.2 up
> > > > > > > > > 
> > > > > > > > > then I can copy all files from the mounted directory
> > > > > > > > > without any problems, too. So it's probably age0 related?
> > > > > > > > 
> > > > > > > > From your backtrace and the buffer printout, I see a somewhat
> > > > > > > > strange thing. The buffer data address is 0xff8171418000,
> > > > > > > > while the kernel faulted on an attempt to write at
> > > > > > > > 0xff8171413000, which is lower than the buffer data
> > > > > > > > pointer, during the bcopy into the buffer.
> > > > > > > > 
> > > > > > > > The other data suggests that there was no overflow of the
> > > > > > > > data from the server response. So it might be that
> > 

Re: 9.1-stable crashes while copying data from a NFS mounted directory

2013-01-25 Thread Christian Gusenbauer
On Friday 25 January 2013 05:50:48 YongHyeon PYUN wrote:
> On Fri, Jan 25, 2013 at 01:30:43PM +0900, YongHyeon PYUN wrote:
> > On Thu, Jan 24, 2013 at 05:21:50PM -0500, John Baldwin wrote:
> > > On Thursday, January 24, 2013 4:22:12 pm Konstantin Belousov wrote:
> > > > On Thu, Jan 24, 2013 at 09:50:52PM +0100, Christian Gusenbauer wrote:
> > > > > On Thursday 24 January 2013 20:37:09 Konstantin Belousov wrote:
> > > > > > On Thu, Jan 24, 2013 at 07:50:49PM +0100, Christian Gusenbauer wrote:
> > > > > > > On Thursday 24 January 2013 19:07:23 Konstantin Belousov wrote:
> > > > > > > > On Thu, Jan 24, 2013 at 08:03:59PM +0200, Konstantin Belousov wrote:
> > > > > > > > > On Thu, Jan 24, 2013 at 06:05:57PM +0100, Christian Gusenbauer wrote:
> > > > > > > > > > Hi!
> > > > > > > > > > 
> > > > > > > > > > I'm using 9.1 stable svn revision 245605 and I get the
> > > > > > > > > > panic below if I execute the following commands (as
> > > > > > > > > > single user):
> > > > > > > > > > 
> > > > > > > > > > # swapon -a
> > > > > > > > > > # dumpon /dev/ada0s3b
> > > > > > > > > > # mount -u /
> > > > > > > > > > # ifconfig age0 inet 192.168.2.2 mtu 6144 up
> > > > > > > > > > # mount -t nfs -o rsize=32768 data:/multimedia /mnt
> > > > > > > > > > # cp /mnt/Movies/test/a.m2ts /tmp
> > > > > > > > > > 
> > > > > > > > > > then the system panics almost immediately. I'll attach
> > > > > > > > > > the stack trace.
> > > > > > > > > > 
> > > > > > > > > > Note, that I'm using jumbo frames (6144 byte) on a 1Gbit
> > > > > > > > > > network, maybe that's the cause for the panic, because
> > > > > > > > > > the bcopy (see stack frame #15) fails.
> > > > > > > > > > 
> > > > > > > > > > Any clues?
> > > > > > > > > 
> > > > > > > > > I tried a similar operation with the nfs mount of
> > > > > > > > > rsize=32768 and mtu 6144, but the machine runs HEAD and em
> > > > > > > > > instead of age. I was unable to reproduce the panic on the
> > > > > > > > > copy of the 5GB file from nfs mount.
> > > > > > > 
> > > > > > > Hmmm, I did a quick test. If I do not change the MTU, so just
> > > > > > > configuring age0 with
> > > > > > > 
> > > > > > > # ifconfig age0 inet 192.168.2.2 up
> > > > > > > 
> > > > > > > then I can copy all files from the mounted directory without
> > > > > > > any problems, too. So it's probably age0 related?
> > > > > > 
> > > > > > From your backtrace and the buffer printout, I see a somewhat
> > > > > > strange thing. The buffer data address is 0xff8171418000,
> > > > > > while the kernel faulted on an attempt to write at
> > > > > > 0xff8171413000, which is lower than the buffer data
> > > > > > pointer, during the bcopy into the buffer.
> > > > > > 
> > > > > > The other data suggests that there was no overflow of the data
> > > > > > from the server response. So it might be that mbuf_len(mp)
> > > > > > returned a negative number? I am not sure whether that is
> > > > > > possible at all.
> > > > > > 
> > > > > > Try this debugging patch, please. You need to add INVARIANTS etc.
> > > > > > to the kernel config.
> > > > > > 
> > > > > > diff --git a/sys/fs/nfs/nfs_commonsubs.c b/sys/fs/nfs/nfs_commonsubs.c
> > > > > > index efc0786..9a6bda5 100644
> > > > > > --- a/sys/fs/nfs/nfs_commonsubs.c
> > > > > > +++ b/sys/fs/nfs/nfs_commonsubs.c
> > > > > > @@ -218,6 +218,7 @@ nfsm_mbufuio(struct nfsrv_descript *nd, struct uio *uiop, int siz)
> > > > > > }
> > > > > > 
> > > > > > mbufcp = NFSMTOD(mp, ca