Re: if_sge related panics

2010-06-04 Thread Pyun YongHyeon
On Fri, Jun 04, 2010 at 07:52:19AM +0300, Nikolay Denev wrote:
 
 On Jun 4, 2010, at 3:35 AM, Pyun YongHyeon wrote:
 
  On Thu, Jun 03, 2010 at 09:29:20AM +0300, Nikolay Denev wrote:
  On May 24, 2010, at 8:12 PM, Pyun YongHyeon wrote:
  
  On Mon, May 24, 2010 at 09:48:33AM -0400, John Baldwin wrote:
  On Monday 24 May 2010 6:35:01 am Nikolay Denev wrote:
  On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:
  
  Hi,
  
  Recently I started to experience a if_sge(4) related panic.
  It happens almost every time I try to download a torrent file for 
  example.
  Copying of large files over NFS seem not to trigger it, but I haven't 
  tested extensively.
  
  Here is the panic message :
  
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 00
  fault virtual address  = 0x8
  fault code = supervisor write data, page 
  not present
  instruction pointer= 0x20:0x80230413
  stack pointer  = 0x28:0xff80001e9280
  frame pointer  = 0x28:0xff80001e9510
  code segment   = base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 
  1, def32 0, gran 1
  processor eflags   = interrupt enabled, resume, 
  IOPL = 0
  current process= 12 (irq19: sge0)
  trap number= 12
  panic: page fault
  cpuid = 0
  Uptime: 1d20h56m20s
  Cannot dump. Device not defined or unavailable
  Automatic reboot in 15 seconds - press a key on the console to abort
  Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
  
  My swap is on a zvol, so I don't have dump. I'll try to attach a disk 
  on the eSATA port and dump there if needed.
  
  Here is some info from the crashdump :
  
  (kgdb) #0  doadump () at pcpu.h:223
  #1  0x802fb149 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
  #2  0x802fb57c in panic (fmt=0x8055d564 %s)
at /usr/src/sys/kern/kern_shutdown.c:590
  #3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, 
  eva=Variable eva is not available.
  )
at /usr/src/sys/amd64/amd64/trap.c:777
  #4  0x805059dc in trap_pfault (frame=0xff80001e91d0, 
  usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:693
  #5  0x805061c5 in trap (frame=0xff80001e91d0)
at /usr/src/sys/amd64/amd64/trap.c:451
  #6  0x804eb977 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:223
  #7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
at /usr/src/sys/dev/sge/if_sge.c:1591
  
  Try this.  sge_encap() can sometimes return an error with m_head set to 
  NULL:
  
  
  Thanks John. Committed in r208512.
  
  Index: if_sge.c
  ===================================================================
  --- if_sge.c	(revision 208375)
  +++ if_sge.c	(working copy)
  @@ -1588,7 +1588,8 @@
   		if (m_head == NULL)
   			break;
   		if (sge_encap(sc, &m_head)) {
  -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  +			if (m_head != NULL)
  +				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
   			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
   			break;
   		}
  
  -- 
  John Baldwin
  
  After the patch I experienced several network outages (ping reporting no 
  buffer space available)
  that were resolved by ifconfig down/up of the sge(4) interface.
  
  
  Because I don't have access to sge(4) controllers I never had a chance
  to run it. Does ping(8) generate "No buffer space available" when
  the system is idle? Could you show me more information on
  how you checked for network outages?
  
 
 It happened 4-5 times recently. I didn't do an extensive investigation, but yes,
 ping returned "No buffer space available" when I tried pinging from the machine itself.
 It was unreachable from other hosts on the network.
 I'm not sure what you mean by idle state, but there was a torrent client running
 on the machine, which printed errors about being unable to reach peers.
 

If the system is under heavy TX load (e.g. a 64-byte UDP test), ping(8)
may show that message.

 
  I can see that most of the other drivers that handle XXX_encap() returning 
  m_head pointing NULL, break when this condition
  
  Yes, most drivers written/touched by me behave like that.
  
  is hit: i.e. :
  
  Index: if_sge.c
  ===================================================================
  --- if_sge.c	(revision 208375)
  +++ if_sge.c	(working copy)
  @@ -1588,7 +1588,8 @@
   		if (m_head == NULL)
   			break;
   		if (sge_encap(sc, &m_head)) {
  -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  +			if (m_head == NULL)
  +				break;
   			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
   			ifp->if_drv_flags |= 

Re: if_sge related panics

2010-06-03 Thread Nikolay Denev
On May 24, 2010, at 8:12 PM, Pyun YongHyeon wrote:

 On Mon, May 24, 2010 at 09:48:33AM -0400, John Baldwin wrote:
 On Monday 24 May 2010 6:35:01 am Nikolay Denev wrote:
 On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:
 
 Hi,
 
 Recently I started to experience a if_sge(4) related panic.
 It happens almost every time I try to download a torrent file for example.
 Copying of large files over NFS seem not to trigger it, but I haven't 
 tested extensively.
 
 Here is the panic message :
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address  = 0x8
 fault code = supervisor write data, page not 
 present
 instruction pointer= 0x20:0x80230413
 stack pointer  = 0x28:0xff80001e9280
 frame pointer  = 0x28:0xff80001e9510
 code segment   = base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 
 0, gran 1
 processor eflags   = interrupt enabled, resume, IOPL = 0
 current process= 12 (irq19: sge0)
 trap number= 12
 panic: page fault
 cpuid = 0
 Uptime: 1d20h56m20s
 Cannot dump. Device not defined or unavailable
 Automatic reboot in 15 seconds - press a key on the console to abort
 Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
 
 My swap is on a zvol, so I don't have dump. I'll try to attach a disk on 
 the eSATA port and dump there if needed.
 
 Here is some info from the crashdump :
 
 (kgdb) #0  doadump () at pcpu.h:223
 #1  0x802fb149 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
 #2  0x802fb57c in panic (fmt=0x8055d564 %s)
at /usr/src/sys/kern/kern_shutdown.c:590
 #3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, 
 eva=Variable eva is not available.
 )
at /usr/src/sys/amd64/amd64/trap.c:777
 #4  0x805059dc in trap_pfault (frame=0xff80001e91d0, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:693
 #5  0x805061c5 in trap (frame=0xff80001e91d0)
at /usr/src/sys/amd64/amd64/trap.c:451
 #6  0x804eb977 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:223
 #7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
at /usr/src/sys/dev/sge/if_sge.c:1591
 
 Try this.  sge_encap() can sometimes return an error with m_head set to NULL:
 
 
 Thanks John. Committed in r208512.
 
 Index: if_sge.c
 ===================================================================
 --- if_sge.c	(revision 208375)
 +++ if_sge.c	(working copy)
 @@ -1588,7 +1588,8 @@
  		if (m_head == NULL)
  			break;
  		if (sge_encap(sc, &m_head)) {
 -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 +			if (m_head != NULL)
 +				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
  			break;
  		}
 
 -- 
 John Baldwin

After the patch I experienced several network outages (ping reporting
"No buffer space available") that were resolved by an ifconfig down/up
of the sge(4) interface.

I can see that most of the other drivers that handle XXX_encap() returning
with m_head set to NULL break out of the loop when this condition is hit, i.e.:

Index: if_sge.c
===================================================================
--- if_sge.c	(revision 208375)
+++ if_sge.c	(working copy)
@@ -1588,7 +1588,8 @@
 		if (m_head == NULL)
 			break;
 		if (sge_encap(sc, &m_head)) {
-			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
+			if (m_head == NULL)
+				break;
 			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
 			break;
 		}

But here in sge(4) we always set IFF_DRV_OACTIVE.
Do you think this could be the source of the problem?

Regards,
Niki
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: if_sge related panics

2010-06-03 Thread Pyun YongHyeon
On Thu, Jun 03, 2010 at 09:29:20AM +0300, Nikolay Denev wrote:
 On May 24, 2010, at 8:12 PM, Pyun YongHyeon wrote:
 
  On Mon, May 24, 2010 at 09:48:33AM -0400, John Baldwin wrote:
  On Monday 24 May 2010 6:35:01 am Nikolay Denev wrote:
  On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:
  
  Hi,
  
  Recently I started to experience a if_sge(4) related panic.
  It happens almost every time I try to download a torrent file for 
  example.
  Copying of large files over NFS seem not to trigger it, but I haven't 
  tested extensively.
  
  Here is the panic message :
  
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 00
  fault virtual address= 0x8
  fault code   = supervisor write data, page 
  not present
  instruction pointer  = 0x20:0x80230413
  stack pointer= 0x28:0xff80001e9280
  frame pointer= 0x28:0xff80001e9510
  code segment = base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, long 1, def32 
  0, gran 1
  processor eflags = interrupt enabled, resume, IOPL = 0
  current process  = 12 (irq19: sge0)
  trap number  = 12
  panic: page fault
  cpuid = 0
  Uptime: 1d20h56m20s
  Cannot dump. Device not defined or unavailable
  Automatic reboot in 15 seconds - press a key on the console to abort
  Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
  
  My swap is on a zvol, so I don't have dump. I'll try to attach a disk on 
  the eSATA port and dump there if needed.
  
  Here is some info from the crashdump :
  
  (kgdb) #0  doadump () at pcpu.h:223
  #1  0x802fb149 in boot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:416
  #2  0x802fb57c in panic (fmt=0x8055d564 %s)
 at /usr/src/sys/kern/kern_shutdown.c:590
  #3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, 
  eva=Variable eva is not available.
  )
 at /usr/src/sys/amd64/amd64/trap.c:777
  #4  0x805059dc in trap_pfault (frame=0xff80001e91d0, 
  usermode=0)
 at /usr/src/sys/amd64/amd64/trap.c:693
  #5  0x805061c5 in trap (frame=0xff80001e91d0)
 at /usr/src/sys/amd64/amd64/trap.c:451
  #6  0x804eb977 in calltrap ()
 at /usr/src/sys/amd64/amd64/exception.S:223
  #7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
 at /usr/src/sys/dev/sge/if_sge.c:1591
  
  Try this.  sge_encap() can sometimes return an error with m_head set to 
  NULL:
  
  
  Thanks John. Committed in r208512.
  
  Index: if_sge.c
  ===================================================================
  --- if_sge.c	(revision 208375)
  +++ if_sge.c	(working copy)
  @@ -1588,7 +1588,8 @@
   		if (m_head == NULL)
   			break;
   		if (sge_encap(sc, &m_head)) {
  -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  +			if (m_head != NULL)
  +				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
   			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
   			break;
   		}
  
  -- 
  John Baldwin
 
 After the patch I experienced several network outages (ping reporting no 
 buffer space available)
 that were resolved by ifconfig down/up of the sge(4) interface.
 

Because I don't have access to sge(4) controllers I never had a chance
to run it. Does ping(8) generate "No buffer space available" when
the system is idle? Could you show me more information on
how you checked for network outages?

 I can see that most of the other drivers that handle XXX_encap() returning 
 m_head pointing NULL, break when this condition

Yes, most drivers written/touched by me behave like that.

 is hit: i.e. :
 
 Index: if_sge.c
 ===================================================================
 --- if_sge.c	(revision 208375)
 +++ if_sge.c	(working copy)
 @@ -1588,7 +1588,8 @@
  		if (m_head == NULL)
  			break;
  		if (sge_encap(sc, &m_head)) {
 -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 +			if (m_head == NULL)
 +				break;
  			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
  			break;
  		}
 
 But here in sge(4) we always set IFF_DRV_OACTIVE.
 Do you think this can be the source of the problem ?
 

A more correct way to set IFF_DRV_OACTIVE would be to check the number
of queued frames, or to just exit the transmit loop. If there are no
queued frames, IFF_DRV_OACTIVE would never be cleared, which in turn
causes ENOBUFS in ping(8). Your change looks more reasonable
to me. Do you still see the same issue with the change you suggested?
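
The stall described above can be sketched with a small userland model. This is only an illustration under stated assumptions, not the sge(4) code: struct toy_if and every toy_* function are hypothetical names, and the model assumes the flag is cleared only by a TX completion, which only happens when frames are actually in flight.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of an interface; not the real struct ifnet. */
struct toy_if {
	bool oactive;	/* stands in for IFF_DRV_OACTIVE */
	int inflight;	/* frames handed to the hardware, not yet completed */
	int sndq_len;	/* frames waiting in the software send queue */
	int sndq_max;	/* capacity of the send queue */
};

/* Enqueue from the stack: fails with ENOBUFS once the queue is full. */
int
toy_enqueue(struct toy_if *ifp)
{
	if (ifp->sndq_len >= ifp->sndq_max)
		return (ENOBUFS);
	ifp->sndq_len++;
	return (0);
}

/*
 * Transmit routine. It does nothing while the flag is set. When
 * encap_drops is true, the first frame is treated as freed by a
 * failed encap; always_set_oactive selects which policy is modelled.
 */
void
toy_start(struct toy_if *ifp, bool encap_drops, bool always_set_oactive)
{
	if (ifp->oactive)
		return;
	while (ifp->sndq_len > 0) {
		ifp->sndq_len--;		/* dequeue one frame */
		if (encap_drops) {
			if (always_set_oactive)
				ifp->oactive = true;
			break;
		}
		ifp->inflight++;		/* frame given to hardware */
	}
}

/* TX completion: assumed to be the only place the flag is cleared. */
void
toy_txeof(struct toy_if *ifp)
{
	if (ifp->inflight == 0)
		return;		/* nothing in flight, so no completion */
	ifp->inflight = 0;
	ifp->oactive = false;
}

/* Send n frames; only the first encap fails. Returns the last error. */
int
toy_send_burst(struct toy_if *ifp, int n, bool always_set_oactive)
{
	int error = 0;

	for (int i = 0; i < n; i++) {
		error = toy_enqueue(ifp);
		toy_start(ifp, i == 0, always_set_oactive);
		toy_txeof(ifp);
	}
	return (error);
}
```

In this model, setting the flag while nothing is in flight leaves the queue to fill until toy_enqueue() returns ENOBUFS, whereas breaking out of the loop without setting the flag lets later frames go out normally.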

Re: if_sge related panics

2010-06-03 Thread Nikolay Denev

On Jun 4, 2010, at 3:35 AM, Pyun YongHyeon wrote:

 On Thu, Jun 03, 2010 at 09:29:20AM +0300, Nikolay Denev wrote:
 On May 24, 2010, at 8:12 PM, Pyun YongHyeon wrote:
 
 On Mon, May 24, 2010 at 09:48:33AM -0400, John Baldwin wrote:
 On Monday 24 May 2010 6:35:01 am Nikolay Denev wrote:
 On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:
 
 Hi,
 
 Recently I started to experience a if_sge(4) related panic.
 It happens almost every time I try to download a torrent file for 
 example.
 Copying of large files over NFS seem not to trigger it, but I haven't 
 tested extensively.
 
 Here is the panic message :
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address= 0x8
 fault code   = supervisor write data, page 
 not present
 instruction pointer  = 0x20:0x80230413
 stack pointer= 0x28:0xff80001e9280
 frame pointer= 0x28:0xff80001e9510
 code segment = base 0x0, limit 0xf, type 0x1b
  = DPL 0, pres 1, long 1, def32 
 0, gran 1
 processor eflags = interrupt enabled, resume, IOPL = 0
 current process  = 12 (irq19: sge0)
 trap number  = 12
 panic: page fault
 cpuid = 0
 Uptime: 1d20h56m20s
 Cannot dump. Device not defined or unavailable
 Automatic reboot in 15 seconds - press a key on the console to abort
 Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
 
 My swap is on a zvol, so I don't have dump. I'll try to attach a disk on 
 the eSATA port and dump there if needed.
 
 Here is some info from the crashdump :
 
 (kgdb) #0  doadump () at pcpu.h:223
 #1  0x802fb149 in boot (howto=260)
   at /usr/src/sys/kern/kern_shutdown.c:416
 #2  0x802fb57c in panic (fmt=0x8055d564 %s)
   at /usr/src/sys/kern/kern_shutdown.c:590
 #3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, 
 eva=Variable eva is not available.
 )
   at /usr/src/sys/amd64/amd64/trap.c:777
 #4  0x805059dc in trap_pfault (frame=0xff80001e91d0, 
 usermode=0)
   at /usr/src/sys/amd64/amd64/trap.c:693
 #5  0x805061c5 in trap (frame=0xff80001e91d0)
   at /usr/src/sys/amd64/amd64/trap.c:451
 #6  0x804eb977 in calltrap ()
   at /usr/src/sys/amd64/amd64/exception.S:223
 #7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
   at /usr/src/sys/dev/sge/if_sge.c:1591
 
 Try this.  sge_encap() can sometimes return an error with m_head set to 
 NULL:
 
 
 Thanks John. Committed in r208512.
 
 Index: if_sge.c
 ===================================================================
 --- if_sge.c	(revision 208375)
 +++ if_sge.c	(working copy)
 @@ -1588,7 +1588,8 @@
  		if (m_head == NULL)
  			break;
  		if (sge_encap(sc, &m_head)) {
 -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 +			if (m_head != NULL)
 +				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
  			break;
  		}
 
 -- 
 John Baldwin
 
 After the patch I experienced several network outages (ping reporting no 
 buffer space available)
 that were resolved by ifconfig down/up of the sge(4) interface.
 
 
 Because I don't have access to sge(4) controllers I never had chance
 to run it. Does ping(8) generates no buffer space available when
 the system is in idle state? Could you show me more information on
 how you checked network outages?
 

It happened 4-5 times recently. I didn't do an extensive investigation, but yes,
ping returned "No buffer space available" when I tried pinging from the machine itself.
It was unreachable from other hosts on the network.
I'm not sure what you mean by idle state, but there was a torrent client running
on the machine, which printed errors about being unable to reach peers.


 I can see that most of the other drivers that handle XXX_encap() returning 
 m_head pointing NULL, break when this condition
 
 Yes, most drivers written/touched by me behaves like that.
 
 is hit: i.e. :
 
 Index: if_sge.c
 ===================================================================
 --- if_sge.c	(revision 208375)
 +++ if_sge.c	(working copy)
 @@ -1588,7 +1588,8 @@
  		if (m_head == NULL)
  			break;
  		if (sge_encap(sc, &m_head)) {
 -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 +			if (m_head == NULL)
 +				break;
  			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
  			break;
  		}
 
 But here in sge(4) we always set IFF_DRV_OACTIVE.
 Do you think this can be the source of the problem ?
 
 
 More correct way to set IFF_DRV_OACTIVE would be check the number
 of queued frames or just exit the transmit loop. 

Re: if_sge related panics

2010-05-24 Thread Nikolay Denev
On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:

 Hi,
 
 Recently I started to experience a if_sge(4) related panic.
 It happens almost every time I try to download a torrent file for example.
 Copying of large files over NFS seem not to trigger it, but I haven't tested 
 extensively.
 
 Here is the panic message :
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address = 0x8
 fault code= supervisor write data, page not 
 present
 instruction pointer   = 0x20:0x80230413
 stack pointer = 0x28:0xff80001e9280
 frame pointer = 0x28:0xff80001e9510
 code segment  = base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, long 1, def32 
 0, gran 1
 processor eflags  = interrupt enabled, resume, IOPL = 0
 current process   = 12 (irq19: sge0)
 trap number   = 12
 panic: page fault
 cpuid = 0
 Uptime: 1d20h56m20s
 Cannot dump. Device not defined or unavailable
 Automatic reboot in 15 seconds - press a key on the console to abort
 Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
 
 My swap is on a zvol, so I don't have dump. I'll try to attach a disk on the 
 eSATA port and dump there if needed.

Here is some info from the crashdump :

(kgdb) #0  doadump () at pcpu.h:223
#1  0x802fb149 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
#2  0x802fb57c in panic (fmt=0x8055d564 %s)
at /usr/src/sys/kern/kern_shutdown.c:590
#3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, eva=Variable 
eva is not available.
)
at /usr/src/sys/amd64/amd64/trap.c:777
#4  0x805059dc in trap_pfault (frame=0xff80001e91d0, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:693
#5  0x805061c5 in trap (frame=0xff80001e91d0)
at /usr/src/sys/amd64/amd64/trap.c:451
#6  0x804eb977 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:223
#7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
at /usr/src/sys/dev/sge/if_sge.c:1591
#8  0x80231044 in sge_start (ifp=0xff000270d800)
at /usr/src/sys/dev/sge/if_sge.c:1562
#9  0x803a8b1a in if_transmit (ifp=0xff000270d800, m=Variable m 
is not available.
)
at /usr/src/sys/net/if.c:3355
#10 0x803adbbb in ether_output_frame (ifp=0xff000270d800, 
m=0xff0007fdd600) at /usr/src/sys/net/if_ethersubr.c:452
#11 0x803ae206 in ether_output (ifp=0xff000270d800, 
m=0xff0007fdd600, dst=Variable dst is not available.
) at /usr/src/sys/net/if_ethersubr.c:423
#12 0x803e1798 in ip_output (m=0xff0007fdd600, opt=Variable opt 
is not available.
)
at /usr/src/sys/netinet/ip_output.c:634
#13 0x803ec6d0 in tcp_output (tp=0xff0007ea56e0)
at /usr/src/sys/netinet/tcp_output.c:1190
#14 0x803e7243 in tcp_do_segment (m=0xff00cd095900, 
th=0xff0007743834, so=0xff0007e9e550, tp=0xff0007ea56e0, 
drop_hdrlen=52, tlen=0, iptos=0 '\0', ti_locked=2)
at /usr/src/sys/netinet/tcp_input.c:1484
#15 0x803ea565 in tcp_input (m=0xff00cd095900, off0=Variable off0 
is not available.
)
at /usr/src/sys/netinet/tcp_input.c:1029
#16 0x803de6c1 in ip_input (m=0xff00cd095900)
at /usr/src/sys/netinet/ip_input.c:793
#17 0x803b637e in netisr_dispatch_src (proto=1, source=Variable 
source is not available.
)
at /usr/src/sys/net/netisr.c:917
#18 0x803ada6d in ether_demux (ifp=0xff000270d800, 
m=0xff00cd095900) at /usr/src/sys/net/if_ethersubr.c:901
#19 0x803ae3f0 in ether_input (ifp=0xff000270d800, 
m=0xff00cd095900) at /usr/src/sys/net/if_ethersubr.c:760
#20 0x802317db in sge_intr (arg=Variable arg is not available.
) at /usr/src/sys/dev/sge/if_sge.c:1220
#21 0x802d202d in intr_event_execute_handlers (p=Variable p is not 
available.
)
at /usr/src/sys/kern/kern_intr.c:1220
#22 0x802d36de in ithread_loop (arg=0xff000286f4c0)
at /usr/src/sys/kern/kern_intr.c:1233
#23 0x802cf978 in fork_exit (
callout=0x802d3650 ithread_loop, arg=0xff000286f4c0, 
frame=0xff80001e9c80) at /usr/src/sys/kern/kern_fork.c:844
#24 0x804ebe4e in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:562
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x0001 in ?? ()
#28 0x in ?? ()
#29 0x in ?? ()
#30 0x in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x in ?? ()
#35 0x in ?? ()
#36 0x in ?? ()
#37 0x in ?? ()
#38 0x in ?? ()
#39 0x in ?? ()
#40 0x 

Re: if_sge related panics

2010-05-24 Thread John Baldwin
On Monday 24 May 2010 6:35:01 am Nikolay Denev wrote:
 On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:
 
  Hi,
  
  Recently I started to experience a if_sge(4) related panic.
  It happens almost every time I try to download a torrent file for example.
  Copying of large files over NFS seem not to trigger it, but I haven't 
  tested extensively.
  
  Here is the panic message :
  
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 00
  fault virtual address   = 0x8
  fault code  = supervisor write data, page not 
  present
  instruction pointer = 0x20:0x80230413
  stack pointer   = 0x28:0xff80001e9280
  frame pointer   = 0x28:0xff80001e9510
  code segment= base 0x0, limit 0xf, type 0x1b
  = DPL 0, pres 1, long 1, def32 
  0, gran 1
  processor eflags= interrupt enabled, resume, IOPL = 0
  current process = 12 (irq19: sge0)
  trap number = 12
  panic: page fault
  cpuid = 0
  Uptime: 1d20h56m20s
  Cannot dump. Device not defined or unavailable
  Automatic reboot in 15 seconds - press a key on the console to abort
  Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
  
  My swap is on a zvol, so I don't have dump. I'll try to attach a disk on 
  the eSATA port and dump there if needed.
 
 Here is some info from the crashdump :
 
 (kgdb) #0  doadump () at pcpu.h:223
 #1  0x802fb149 in boot (howto=260)
 at /usr/src/sys/kern/kern_shutdown.c:416
 #2  0x802fb57c in panic (fmt=0x8055d564 %s)
 at /usr/src/sys/kern/kern_shutdown.c:590
 #3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, eva=Variable 
 eva is not available.
 )
 at /usr/src/sys/amd64/amd64/trap.c:777
 #4  0x805059dc in trap_pfault (frame=0xff80001e91d0, usermode=0)
 at /usr/src/sys/amd64/amd64/trap.c:693
 #5  0x805061c5 in trap (frame=0xff80001e91d0)
 at /usr/src/sys/amd64/amd64/trap.c:451
 #6  0x804eb977 in calltrap ()
 at /usr/src/sys/amd64/amd64/exception.S:223
 #7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
 at /usr/src/sys/dev/sge/if_sge.c:1591

Try this.  sge_encap() can sometimes return an error with m_head set to NULL:

Index: if_sge.c
===================================================================
--- if_sge.c	(revision 208375)
+++ if_sge.c	(working copy)
@@ -1588,7 +1588,8 @@
 		if (m_head == NULL)
 			break;
 		if (sge_encap(sc, &m_head)) {
-			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
+			if (m_head != NULL)
+				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
 			break;
 		}

-- 
John Baldwin
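
For readers who want to see the failure mode outside the kernel, here is a minimal userland sketch of the guarded loop. It is not the driver code: struct pkt, struct sendq, toy_encap() and the other names are hypothetical stand-ins for the mbuf, the interface send queue, and sge_encap(). The toy encap routine can fail either with the packet left intact or with it freed and the caller's pointer set to NULL, which is the case the patch guards against.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct mbuf and the send queue. */
struct pkt {
	struct pkt *next;
};

struct sendq {
	struct pkt *head;
	int len;
};

/* Outcomes the toy encap routine can be asked to produce. */
enum encap_result { ENCAP_OK, ENCAP_FAIL_KEEP, ENCAP_FAIL_FREED };

void
sendq_prepend(struct sendq *q, struct pkt *p)
{
	assert(p != NULL);	/* a kernel macro would fault here instead */
	p->next = q->head;
	q->head = p;
	q->len++;
}

struct pkt *
sendq_dequeue(struct sendq *q)
{
	struct pkt *p = q->head;

	if (p != NULL) {
		q->head = p->next;
		p->next = NULL;
		q->len--;
	}
	return (p);
}

/* May free the packet and NULL the caller's pointer on failure. */
int
toy_encap(struct pkt **pp, enum encap_result r)
{
	if (r == ENCAP_FAIL_KEEP)
		return (1);
	free(*pp);		/* consumed on success, dropped on failure */
	*pp = NULL;
	return (r == ENCAP_OK ? 0 : 1);
}

/* Transmit loop with the NULL guard. Returns the number of frames sent. */
int
toy_start(struct sendq *q, enum encap_result r, int *oactive)
{
	struct pkt *m_head;
	int sent = 0;

	while ((m_head = sendq_dequeue(q)) != NULL) {
		if (toy_encap(&m_head, r) != 0) {
			if (m_head != NULL)
				sendq_prepend(q, m_head);
			*oactive = 1;
			break;
		}
		sent++;
	}
	return (sent);
}

/* Helper: build a queue holding n freshly allocated packets. */
struct sendq
sendq_make(int n)
{
	struct sendq q = { NULL, 0 };

	while (n-- > 0) {
		struct pkt *p = calloc(1, sizeof(*p));

		assert(p != NULL);
		sendq_prepend(&q, p);
	}
	return (q);
}
```

Calling toy_start() with ENCAP_FAIL_FREED leaves the queue one packet shorter and never touches the freed packet. Without the m_head != NULL check, the prepend would be handed a NULL pointer.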


Re: if_sge related panics

2010-05-24 Thread Nikolay Denev
On May 24, 2010, at 4:48 PM, John Baldwin wrote:

 On Monday 24 May 2010 6:35:01 am Nikolay Denev wrote:
 On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:
 
 Hi,
 
 Recently I started to experience a if_sge(4) related panic.
 It happens almost every time I try to download a torrent file for example.
 Copying of large files over NFS seem not to trigger it, but I haven't 
 tested extensively.
 
 Here is the panic message :
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x8
 fault code  = supervisor write data, page not 
 present
 instruction pointer = 0x20:0x80230413
 stack pointer   = 0x28:0xff80001e9280
 frame pointer   = 0x28:0xff80001e9510
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 
 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 12 (irq19: sge0)
 trap number = 12
 panic: page fault
 cpuid = 0
 Uptime: 1d20h56m20s
 Cannot dump. Device not defined or unavailable
 Automatic reboot in 15 seconds - press a key on the console to abort
 Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
 
 My swap is on a zvol, so I don't have dump. I'll try to attach a disk on 
 the eSATA port and dump there if needed.
 
 Here is some info from the crashdump :
 
 (kgdb) #0  doadump () at pcpu.h:223
 #1  0x802fb149 in boot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:416
 #2  0x802fb57c in panic (fmt=0x8055d564 %s)
at /usr/src/sys/kern/kern_shutdown.c:590
 #3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, eva=Variable 
 eva is not available.
 )
at /usr/src/sys/amd64/amd64/trap.c:777
 #4  0x805059dc in trap_pfault (frame=0xff80001e91d0, usermode=0)
at /usr/src/sys/amd64/amd64/trap.c:693
 #5  0x805061c5 in trap (frame=0xff80001e91d0)
at /usr/src/sys/amd64/amd64/trap.c:451
 #6  0x804eb977 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:223
 #7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
at /usr/src/sys/dev/sge/if_sge.c:1591
 
 Try this.  sge_encap() can sometimes return an error with m_head set to NULL:
 
 Index: if_sge.c
 ===================================================================
 --- if_sge.c	(revision 208375)
 +++ if_sge.c	(working copy)
 @@ -1588,7 +1588,8 @@
  		if (m_head == NULL)
  			break;
  		if (sge_encap(sc, &m_head)) {
 -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 +			if (m_head != NULL)
 +				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
  			break;
  		}
 
 -- 
 John Baldwin

Thanks, patch applied. Will let you know how it goes.

--
Niki


Re: if_sge related panics

2010-05-24 Thread Pyun YongHyeon
On Mon, May 24, 2010 at 09:48:33AM -0400, John Baldwin wrote:
 On Monday 24 May 2010 6:35:01 am Nikolay Denev wrote:
  On May 24, 2010, at 8:57 AM, Nikolay Denev wrote:
  
   Hi,
   
   Recently I started to experience a if_sge(4) related panic.
   It happens almost every time I try to download a torrent file for example.
   Copying of large files over NFS seem not to trigger it, but I haven't 
   tested extensively.
   
   Here is the panic message :
   
   Fatal trap 12: page fault while in kernel mode
   cpuid = 0; apic id = 00
   fault virtual address = 0x8
   fault code= supervisor write data, page 
   not present
   instruction pointer   = 0x20:0x80230413
   stack pointer = 0x28:0xff80001e9280
   frame pointer = 0x28:0xff80001e9510
   code segment  = base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 
   0, gran 1
   processor eflags  = interrupt enabled, resume, IOPL = 0
   current process   = 12 (irq19: sge0)
   trap number   = 12
   panic: page fault
   cpuid = 0
   Uptime: 1d20h56m20s
   Cannot dump. Device not defined or unavailable
   Automatic reboot in 15 seconds - press a key on the console to abort
   Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock
   
   My swap is on a zvol, so I don't have dump. I'll try to attach a disk on 
   the eSATA port and dump there if needed.
  
  Here is some info from the crashdump :
  
  (kgdb) #0  doadump () at pcpu.h:223
  #1  0x802fb149 in boot (howto=260)
  at /usr/src/sys/kern/kern_shutdown.c:416
  #2  0x802fb57c in panic (fmt=0x8055d564 %s)
  at /usr/src/sys/kern/kern_shutdown.c:590
  #3  0x805055b8 in trap_fatal (frame=0xff000288a3e0, 
  eva=Variable eva is not available.
  )
  at /usr/src/sys/amd64/amd64/trap.c:777
  #4  0x805059dc in trap_pfault (frame=0xff80001e91d0, usermode=0)
  at /usr/src/sys/amd64/amd64/trap.c:693
  #5  0x805061c5 in trap (frame=0xff80001e91d0)
  at /usr/src/sys/amd64/amd64/trap.c:451
  #6  0x804eb977 in calltrap ()
  at /usr/src/sys/amd64/amd64/exception.S:223
  #7  0x80230413 in sge_start_locked (ifp=0xff000270d800)
  at /usr/src/sys/dev/sge/if_sge.c:1591
 
 Try this.  sge_encap() can sometimes return an error with m_head set to NULL:
 

Thanks John. Committed in r208512.

 Index: if_sge.c
 ===================================================================
 --- if_sge.c	(revision 208375)
 +++ if_sge.c	(working copy)
 @@ -1588,7 +1588,8 @@
  		if (m_head == NULL)
  			break;
  		if (sge_encap(sc, &m_head)) {
 -			IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
 +			if (m_head != NULL)
 +				IFQ_DRV_PREPEND(&ifp->if_snd, m_head);
  			ifp->if_drv_flags |= IFF_DRV_OACTIVE;
  			break;
  		}
 
 -- 
 John Baldwin


if_sge related panics

2010-05-23 Thread Nikolay Denev
Hi,

Recently I started to experience an if_sge(4) related panic.
It happens almost every time I try to download a torrent file, for example.
Copying large files over NFS does not seem to trigger it, but I haven't tested
extensively.

Here is the panic message :

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x8
fault code              = supervisor write data, page not present
instruction pointer     = 0x20:0x80230413
stack pointer           = 0x28:0xff80001e9280
frame pointer           = 0x28:0xff80001e9510
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (irq19: sge0)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 1d20h56m20s
Cannot dump. Device not defined or unavailable
Automatic reboot in 15 seconds - press a key on the console to abort
Sleeping thread (tid 100039, pid 12) owns a non-sleepable lock

My swap is on a zvol, so I don't have a dump. I'll try to attach a disk on the
eSATA port and dump there if needed.
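
A note on reading the panic message: a supervisor write to fault virtual address 0x8 is what one would expect from a store through a NULL structure pointer to a member located 8 bytes into the structure. The sketch below only illustrates that arithmetic. struct toy_hdr is a hypothetical layout with two leading pointer members, and whether the faulting store in sge(4) hit such a member is an assumption, not something established in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical header whose first two members are pointers. On a
 * platform with 8-byte pointers the second member sits at offset 8.
 */
struct toy_hdr {
	struct toy_hdr *next;		/* offset 0 */
	struct toy_hdr *nextpkt;	/* offset sizeof(void *) */
};

/*
 * Address that a store to base->nextpkt would touch, computed
 * arithmetically so that nothing is dereferenced.
 */
uintptr_t
nextpkt_store_addr(uintptr_t base)
{
	return (base + offsetof(struct toy_hdr, nextpkt));
}
```

With a NULL base, nextpkt_store_addr(0) equals sizeof(void *), which is 8 on amd64.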