11.0-CURRENT panic (nfsd?)

2014-01-05 Thread Markiyan Kushnir
I started to see a reliable panic on a recent CURRENT:

$ uname -a
FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
r260296: Sun Jan  5 07:14:50 EET 2014
r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64

The panic is always triggered by the first request to the nfs service
(this machine runs a PXE server).

The core.txt is attached. Please let me know if I can help more.

--
Markiyan.
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread Markiyan Kushnir
Please ignore the previously attached core.txt.1.gz and use the new
core.txt.2.gz attached to this message instead. I mixed up the files.

--
Markiyan.




Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread John-Mark Gurney
Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 10:57 +0200:
> I started to see a reliable panic on a recent CURRENT:
> 
> $ uname -a
> FreeBSD mkushnir.mooo.com 11.0-CURRENT FreeBSD 11.0-CURRENT #0
> r260296: Sun Jan  5 07:14:50 EET 2014
> r...@vm.mkushnir.mooo.com:/usr/obj/usr/src.svnup/sys/MAREK  amd64
> 
> The panic is always triggered by the first request to the nfs service
> (this machine runs a PXE server).
> 
> The core.txt is attached. Please let me know if I can help more.

Apparently the mime-type on the attachment was bad and got scrubbed...

Maybe include it inline if it isn't too long?

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 "All that I will do, has been done, All that I have, has not."


Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread Markiyan Kushnir
2014/1/5 John-Mark Gurney :
> Apparently the mime-type on the attachment was bad and got scrubbed...
>
> Maybe include it inline if it isn't too long?
>

It's 144KB long. I will share it via Google Drive:

https://drive.google.com/file/d/0B9Q-zpUXxqCnNVhBY0M5ZzU4d1k/edit?usp=sharing

--
Markiyan.




Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread John-Mark Gurney
Markiyan Kushnir wrote this message on Sun, Jan 05, 2014 at 11:06 +0200:
> It's 144KB long. I will share it via Google Drive:
> 
> https://drive.google.com/file/d/0B9Q-zpUXxqCnNVhBY0M5ZzU4d1k/edit?usp=sharing

Looks like a NULL function pointer was called:
Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0x0
stack pointer           = 0x28:0xfe00d9a2bea0
frame pointer           = 0x28:0xfe00d9a2c010
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1323 (nfsd: master)
trap number             = 12
panic: page fault

--- trap 0xc, rip = 0, rsp = 0xfe00d9a2bea0, rbp = 0xfe00d9a2c010 ---
uart_sab82532_class() at 0/frame 0xfe00d9a2c010
svc_run_internal() at svc_run_internal+0x9c9/frame 0xfe00d9a2c1b0
svc_run() at svc_run+0xed/frame 0xfe00d9a2c1f0
nfsrvd_nfsd() at nfsrvd_nfsd+0x19a/frame 0xfe00d9a2c350
nfssvc_nfsd() at nfssvc_nfsd+0x11a/frame 0xfe00d9a2c970
sys_nfssvc() at sys_nfssvc+0xd2/frame 0xfe00d9a2c9a0
amd64_syscall() at amd64_syscall+0x265/frame 0xfe00d9a2cab0
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe00d9a2cab0
--- syscall (155, FreeBSD ELF64, sys_nfssvc), rip = 0x80088c13a, rsp = 0x7fffd438, rbp = 0x7fffd6e0 ---

uart_sab82532_class is just the closest symbol to address 0, so the real
problem is in svc_run_internal...  Could you run:
nm /boot/kernel/kernel | grep svc_run_internal

This should return a line with a large hex number at the front. Then run:
addr2line -e /boot/kernel/kernel $( expr 0x<hex number> + 0x9c9)

This will give you a file name and line number, and can you copy/paste
the lines around and including that line number?  This will help make
sure we get the correct code...
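
The address arithmetic in the two steps above can be sketched as a small
shell helper. This is a minimal sketch and not part of the original
exchange: the function name sym_plus_off is hypothetical, and it assumes a
POSIX shell whose arithmetic expansion accepts hexadecimal constants. The
example values (base 80714db0, offset 0x9c9) are the ones that appear in
this thread.

```shell
#!/bin/sh
# Hypothetical helper: add a backtrace offset to a symbol's base address.
#   $1 = base address as printed by nm (hex digits, no 0x prefix)
#   $2 = offset taken from the backtrace (hex, with 0x prefix)
sym_plus_off() {
    # Shell arithmetic understands 0x-prefixed constants; print the sum as hex.
    printf '0x%x\n' "$(( 0x$1 + $2 ))"
}

# Values from this thread: svc_run_internal at 80714db0, offset +0x9c9.
sym_plus_off 80714db0 0x9c9    # prints 0x80715779
```

The printed value is what gets passed to addr2line -e /boot/kernel/kernel
to obtain the file name and line number.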

Thanks.

-- 
  John-Mark Gurney  Voice: +1 415 225 5579

 "All that I will do, has been done, All that I have, has not."


Re: 11.0-CURRENT panic (nfsd?)

2014-01-05 Thread Markiyan Kushnir
$ nm /boot/kernel/kernel | grep svc_run_internal
80714db0 t svc_run_internal
$ addr2line -e /boot/kernel/kernel 0x80715779
/usr/src.svnup/sys/rpc/svc.c:971

   949  static void
   950  svc_executereq(struct svc_req *rqstp)
   951  {
   952          SVCXPRT *xprt = rqstp->rq_xprt;
   953          SVCPOOL *pool = xprt->xp_pool;
   954          int prog_found;
   955          rpcvers_t low_vers;
   956          rpcvers_t high_vers;
   957          struct svc_callout *s;
   958
   959          /* now match message with a registered service*/
   960          prog_found = FALSE;
   961          low_vers = (rpcvers_t) -1L;
   962          high_vers = (rpcvers_t) 0L;
   963          TAILQ_FOREACH(s, &pool->sp_callouts, sc_link) {
   964                  if (s->sc_prog == rqstp->rq_prog) {
   965                          if (s->sc_vers == rqstp->rq_vers) {
   966                                  /*
   967                                   * We hand ownership of r to the
   968                                   * dispatch method - they must call
   969                                   * svc_freereq.
   970                                   */
   971                                  (*s->sc_dispatch)(rqstp, xprt);
   972                                  return;
   973                          }  /* found correct version */
   974                          prog_found = TRUE;
   975                          if (s->sc_vers < low_vers)
   976                                  low_vers = s->sc_vers;
   977                          if (s->sc_vers > high_vers)
   978                                  high_vers = s->sc_vers;
   979                  }   /* found correct program */
   980          }
   981
   982          /*
   983           * if we got here, the program or version
   984           * is not served ...
   985           */
   986          if (prog_found)
   987                  svcerr_progvers(rqstp, low_vers, high_vers);
   988          else
   989                  svcerr_noprog(rqstp);
   990
   991          svc_freereq(rqstp);
   992  }
   993


Re: [markiyan.kush...@gmail.com: Re: 11.0-CURRENT panic (nfsd?)]

2014-01-06 Thread Alexander Motin

Thank you for the report. Bug fixed at r260367.


- Forwarded message from Markiyan Kushnir  -

Date: Sun, 5 Jan 2014 19:47:37 +0200
Subject: Re: 11.0-CURRENT panic (nfsd?)
From: Markiyan Kushnir 
To: Markiyan Kushnir , freebsd-current@freebsd.org

$ nm /boot/kernel/kernel | grep svc_run_internal
80714db0 t svc_run_internal
$ addr2line -e /boot/kernel/kernel 0x80715779
/usr/src.svnup/sys/rpc/svc.c:971

   949  static void
   950  svc_executereq(struct svc_req *rqstp)
   951  {
   952          SVCXPRT *xprt = rqstp->rq_xprt;
   953          SVCPOOL *pool = xprt->xp_pool;
   954          int prog_found;
   955          rpcvers_t low_vers;
   956          rpcvers_t high_vers;
   957          struct svc_callout *s;
   958
   959          /* now match message with a registered service*/
   960          prog_found = FALSE;
   961          low_vers = (rpcvers_t) -1L;
   962          high_vers = (rpcvers_t) 0L;
   963          TAILQ_FOREACH(s, &pool->sp_callouts, sc_link) {
   964                  if (s->sc_prog == rqstp->rq_prog) {
   965                          if (s->sc_vers == rqstp->rq_vers) {
   966                                  /*
   967                                   * We hand ownership of r to the
   968                                   * dispatch method - they must call
   969                                   * svc_freereq.
   970                                   */
   971                                  (*s->sc_dispatch)(rqstp, xprt);
   972                                  return;
   973                          }  /* found correct version */
   974                          prog_found = TRUE;
   975                          if (s->sc_vers < low_vers)
   976                                  low_vers = s->sc_vers;
   977                          if (s->sc_vers > high_vers)
   978                                  high_vers = s->sc_vers;
   979                  }   /* found correct program */
   980          }
   981
   982          /*
   983           * if we got here, the program or version
   984           * is not served ...
   985           */
   986          if (prog_found)
   987                  svcerr_progvers(rqstp, low_vers, high_vers);
   988          else
   989                  svcerr_noprog(rqstp);
   990
   991          svc_freereq(rqstp);
   992  }
   993
