Re: -current sudden panics :(

2000-03-23 Thread Matthew Dillon

This problem should now be fixed, it's probably the problem I just fixed
a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
m_pullup() NULL check in arpintr() was broken, resulting in a NULL
pointer dereference.  
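
For readers following along, here is a minimal userland sketch of the pattern in question: the pointer returned by a pullup-style call is tested before anything is read through it. The names fake_mbuf, fake_alloc, fake_pullup and handle_packet are hypothetical stand-ins invented for this illustration. This is not the arpintr() code and not the actual fix, only the general shape of a correct NULL check, assuming (as with m_pullup()) that the buffer is freed on failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf: only the contiguous length is modelled. */
struct fake_mbuf {
    size_t len;
};

/* Allocate a buffer that claims to hold `len` contiguous bytes. */
static struct fake_mbuf *
fake_alloc(size_t len)
{
    struct fake_mbuf *m = malloc(sizeof(*m));

    if (m != NULL)
        m->len = len;
    return m;
}

/*
 * Hypothetical stand-in for m_pullup(): returns the buffer when at least
 * `need` bytes are contiguous; otherwise frees it and returns NULL.
 */
static struct fake_mbuf *
fake_pullup(struct fake_mbuf *m, size_t need)
{
    if (m->len >= need)
        return m;
    free(m);
    return NULL;
}

/* Returns the number of bytes that are safe to read, or 0 if the packet was dropped. */
static size_t
handle_packet(struct fake_mbuf *m, size_t need)
{
    size_t usable;

    if (m == NULL)
        return 0;
    m = fake_pullup(m, need);
    if (m == NULL)              /* test the pointer before any dereference */
        return 0;               /* the buffer was already freed on failure */
    assert(m->len >= need);
    usable = m->len;
    free(m);
    return usable;
}
```

With this sketch, a 64-byte buffer passes a 28-byte requirement, while a 10-byte buffer is dropped without ever being dereferenced after the failed call.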

-Matt



To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: -current sudden panics :(

2000-03-23 Thread Warner Losh

In message [EMAIL PROTECTED] Matthew Dillon writes:
: This problem should now be fixed, it's probably the problem I just fixed
: a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
: m_pullup() NULL check in arpintr() was broken, resulting in a NULL
: pointer dereference.  

inoue-san's patch survived the night.  I'll check into your patch and
give it a try instead.

Warner





Re: -current sudden panics :(

2000-03-23 Thread Warner Losh

In message [EMAIL PROTECTED] Yoshinobu Inoue writes:
: I would like to narrow down the problem more and could you
: please try if this patch stop the problem or not?
: (The m_pullup() is recently added to if_rl.c. It should not be
: harmful, but I suspect that this might have invoked another
: hidden bug.)

This survived overnight.  I see that Matt Dillon has another patch,
I'll try that tonight.

Warner





Re: -current sudden panics :(

2000-03-23 Thread Yoshinobu Inoue

> : This problem should now be fixed, it's probably the problem I just fixed
> : a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
> : m_pullup() NULL check in arpintr() was broken, resulting in a NULL
> : pointer dereference.
>
> inoue-san's patch survived the night.  I'll check into your patch and
> give it a try instead.

My patch is just a workaround that avoids calling m_pullup()
when it is not necessary; his fix seems to be the real one for
the problem.
Still, I think my patch to if_rl.c is worth applying as well,
for performance reasons.
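
The idea of skipping the pullup when the header is already contiguous can be sketched in userland as follows. The names fake_mbuf, fake_pullup, ensure_header and pullup_calls are hypothetical and exist only for this illustration; the real patch is the if_rl.c diff posted in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer: `len` bytes are contiguous, `chain_len` bytes exist in total. */
struct fake_mbuf {
    size_t len;
    size_t chain_len;
};

/* Counts how often the comparatively expensive pullup path is taken. */
static int pullup_calls;

/*
 * Hypothetical stand-in for m_pullup(): makes `need` bytes contiguous when
 * the chain holds enough data, otherwise returns NULL. Unlike the kernel
 * routine it frees nothing, because the caller owns the buffers here.
 */
static struct fake_mbuf *
fake_pullup(struct fake_mbuf *m, size_t need)
{
    pullup_calls++;
    if (m->chain_len < need)
        return NULL;
    if (m->len < need)
        m->len = need;
    return m;
}

/* Call the pullup only when the header is not already contiguous. */
static struct fake_mbuf *
ensure_header(struct fake_mbuf *m, size_t need)
{
    assert(m != NULL);
    if (m->len < need)
        m = fake_pullup(m, need);
    return m;                   /* may be NULL; the caller must check */
}
```

In the common case the length test succeeds and the pullup is never called, which is where the performance benefit would come from. The caller still has to check the result for NULL.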

Cheers,
Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-23 Thread Ilmar S. Habibulin

On Thu, 23 Mar 2000, Matthew Dillon wrote:

> This problem should now be fixed, it's probably the problem I just fixed
> a moment ago in netinet/if_ether.c based on a thread in -hackers.  The
> m_pullup() NULL check in arpintr() was broken, resulting in a NULL
> pointer dereference.
OK. Uptime is now more than 8 hours; I'll continue testing.






Re: -current sudden panics :(

2000-03-22 Thread Warner Losh

In message [EMAIL PROTECTED] "Ilmar S. Habibulin" writes:
: This is driver for ed(ne2000) cards. I have realtek(rl driver). I took a
: look at his source and didn't find such strings. There is comment there
: about cutting off mbuf header before passing it to ether_input - what's
: this?

I applied a similar patch to the end of the rl packet handling
routine.  It didn't solve my arp crashes, however.   It is almost as
if sometimes the rl driver passes a packet to ether_input and then
does bad things to it behind the scenes...  I've not had a lot of time
to try to track down why this does what it does.

Warner





Re: -current sudden panics :(

2000-03-22 Thread Yoshinobu Inoue

Hi,

> : This is driver for ed(ne2000) cards. I have realtek(rl driver). I took a
> : look at his source and didn't find such strings. There is comment there
> : about cutting off mbuf header before passing it to ether_input - what's
> : this?
>
> I applied a similar patch to the end of the rl packet handling
> routine.  It didn't solve my arp crashes, however.   It is almost as
> if sometimes the rl driver passes a packet to ether_input and then
> does bad things to it behind the scenes...  I've not had a lot of time
> to try to track down why this does what it does.
>
> Warner

I would like to narrow the problem down further. Could you
please try this patch and see whether it stops the problem?
(The m_pullup() call was recently added to if_rl.c. It should
not be harmful, but I suspect that it may have exposed another
hidden bug.)

Yoshinobu Inoue

Index: if_rl.c
===================================================================
RCS file: /home/ncvs/src/sys/pci/if_rl.c,v
retrieving revision 1.38
diff -u -r1.38 if_rl.c
--- if_rl.c 1999/12/28 06:04:29 1.38
+++ if_rl.c 2000/03/23 01:35:02
@@ -1130,7 +1130,8 @@
             m_adj(m, RL_ETHER_ALIGN);
             m_copyback(m, wrap, total_len - wrap,
                 sc->rl_cdata.rl_rx_buf);
-            m = m_pullup(m, sizeof(struct ether_header));
+            if (m->m_len < sizeof(struct ether_header))
+                m = m_pullup(m, sizeof(struct ether_header));
             if (m == NULL) {
                 printf("rl%d: m_pullup failed",
                     sc->rl_unit);





Re: -current sudden panics :(

2000-03-22 Thread Ilmar S. Habibulin

On Tue, 21 Mar 2000, Warner Losh wrote:

> : But why there is such a sudden change? Everything worked just fine a week
> : before 5-current.
> No it didn't.  I've been seeing panics like this for about two weeks,
OK, it worked for me.

> but it hadn't been a priority until this week for me.  And I'm not
> seeing it on lightly loaded networks, but am on heavily loaded ones.
My PC is not on a lightly loaded network. This network's load is
approaching zero. ;-)

> Since our product's network port is just for debugging, it isn't a big
> deal to me
And I'm using FreeBSD as my desktop OS, so this has become a VERY BIG problem
for me. :(

> It is definitely a load related problem for me.  It usually works just
> fine, but sometimes there's a packet that gets to arp that arp barfs
> on.
I can't track this situation down. Everything seems to be fine, then
- BBBOOOMMM - page fault. :(






Next thought: -current sudden panics :(

2000-03-22 Thread Ilmar S. Habibulin


We have a DHCP server on our network, which configures Windows clients. Maybe
DHCP requests are somehow involved in my panics?






Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

Hello,

> Fatal 12 trap: page fault while in kernel mode
> fault virtual address   = 0x8
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc01843fc
> stack pointer           = 0x10:0xc026bd64
> frame pointer           = 0x10:0xc026bd64
> code segment            = base 0x0, limit 0xf, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = Idle
> interrupt mask          =
> kernel: type 12 trap, code=0
> Stopped at  arpintr+0x9c:  movl  0x8(%ebx),%ecx
>
> trace gave this:
> arpint(c022537b,0,10,10,c0220010) at arpintr+0x9c
> swi_net_next() at awi_net_next
>
> I'm sending kernel config and dmesg in the attachment. I have INET6 there,
> but it is not configured by ifconfig.
>
> What's this and how can i avoid this panics?

Do you have any other hints about the problem? At least I
couldn't reproduce it on my 4.0 and 5.0 machines.

  -Any kernel crash dump?
  -Is there any typical situation or condition where the
   problem happens?
  -What is your LAN card?


Thanks,
Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Nikolai Saoukh

On Wed, Mar 22, 2000 at 12:51:36AM +0900, Yoshinobu Inoue wrote:

> > trace gave this:
> > arpint(c022537b,0,10,10,c0220010) at arpintr+0x9c
> > swi_net_next() at awi_net_next
> >
> > I'm sending kernel config and dmesg in the attachment. I have INET6 there,
> > but it is not configured by ifconfig.
> >
> > What's this and how can i avoid this panics?
>
> Do you have any other hints for the problem?, because at least
> I couldn't reproduce it in my 4.0 and 5.0 machines.
>
>   -Any kernel crash dump?
>   -Is there any typical situation or condition where the
>    problem happens?
>   -What is your LAN card?

The driver for his card does not set the packet header pointer, so the
ARP code sees a NULL pointer. A small patch will cure this problem
(at least I hope so).

*** if_ed.c.old Tue Mar 21 19:21:40 2000
--- if_ed.c Tue Mar 21 19:23:27 2000
***************
*** 2728,2733 ****
--- 2728,2734 ----
       */
      m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
      m->m_data += sizeof(struct ether_header);
+     m->m_pkthdr.header = (void *)eh;
  
      ether_input(&sc->arpcom.ac_if, eh, m);
      return;





Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

>   -What is your LAN card?

Whoops, that was a needless question. According to the kernel
log, it is using the rl driver.

> The driver for his card does not set packet header pointer, thus
> arp stuff see NULL pointer. small patch will cure this problem
> (at least I hope so).
>
> *** if_ed.c.old Tue Mar 21 19:21:40 2000
> --- if_ed.c Tue Mar 21 19:23:27 2000
> ***************
> *** 2728,2733 ****
> --- 2728,2734 ----
>        */
>       m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
>       m->m_data += sizeof(struct ether_header);
> +     m->m_pkthdr.header = (void *)eh;
>
>       ether_input(&sc->arpcom.ac_if, eh, m);
>       return;

But shouldn't it be sys/pci/if_rl.c ?

Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Nikolai Saoukh

On Wed, Mar 22, 2000 at 01:51:53AM +0900, Yoshinobu Inoue wrote:

> But shouldn't it be sys/pci/if_rl.c ?

Sorry, mea culpa. I mixed up his case with mine (token ring).





Re: -current sudden panics :(

2000-03-21 Thread Warner Losh

In message [EMAIL PROTECTED] Nikolai Saoukh writes:
:  But shouldn't it be sys/pci/if_rl.c ?
: 
: Sorry,
: it is mea culpa. I mixed his case with my (token ring).

Do you have the patch to if_rl.c?  I looked at it for all of 10
seconds and it wasn't immediately obvious to me.

Warner





Re: -current sudden panics :(

2000-03-21 Thread Ilmar S. Habibulin

On Wed, 22 Mar 2000, Yoshinobu Inoue wrote:

> Do you have any other hints for the problem?, because at least
> I couldn't reproduce it in my 4.0 and 5.0 machines.
>   -Any kernel crash dump?
Can you tell me the ddb command to make a kernel dump?

>   -Is there any typical situation or condition where the
>    problem happens?
I don't know. Uptime between panics ranges from 5 minutes to 10 hours. They
are sudden, as I said. :(

>   -What is your LAN card?
Something based on a Realtek chipset (rl8139), maybe Acorp. I don't remember. The
card has worked fine for about one year.






Re: -current sudden panics :(

2000-03-21 Thread Ilmar S. Habibulin

On Tue, 21 Mar 2000, Nikolai Saoukh wrote:

> The driver for his card does not set packet header pointer, thus
> arp stuff see NULL pointer. small patch will cure this problem
> (at least I hope so).
>
> *** if_ed.c.old Tue Mar 21 19:21:40 2000
> --- if_ed.c Tue Mar 21 19:23:27 2000
> ***************
> *** 2728,2733 ****
> --- 2728,2734 ----
>        */
>       m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
>       m->m_data += sizeof(struct ether_header);
> +     m->m_pkthdr.header = (void *)eh;
>
>       ether_input(&sc->arpcom.ac_if, eh, m);
>       return;
This is the driver for ed (NE2000) cards. I have a Realtek card (rl driver). I
looked at its source and didn't find such lines. There is a comment there
about cutting off the mbuf header before passing it to ether_input - what is
that?






Re: -current sudden panics :(

2000-03-21 Thread Ilmar S. Habibulin

On Tue, 21 Mar 2000, Warner Losh wrote:

> In message [EMAIL PROTECTED] Nikolai Saoukh writes:
> :  But shouldn't it be sys/pci/if_rl.c ?
> :
> : Sorry,
> : it is mea culpa. I mixed his case with my (token ring).
>
> Do you have the patch to if_rl.c.  I looked at it for all of 10
> seconds and it wasn't immediately obvious to me.

But why is there such a sudden change? Everything worked just fine a week
before 5-current.






Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

> > The driver for his card does not set packet header pointer, thus
> > arp stuff see NULL pointer. small patch will cure this problem
> > (at least I hope so).
> >
> > *** if_ed.c.old Tue Mar 21 19:21:40 2000
> > --- if_ed.c Tue Mar 21 19:23:27 2000
> > ***************
> > *** 2728,2733 ****
> > --- 2728,2734 ----
> >        */
> >       m->m_pkthdr.len = m->m_len = len - sizeof(struct ether_header);
> >       m->m_data += sizeof(struct ether_header);
> > +     m->m_pkthdr.header = (void *)eh;
> >
> >       ether_input(&sc->arpcom.ac_if, eh, m);
> >       return;
> This is driver for ed(ne2000) cards. I have realtek(rl driver). I took a
> look at his source and didn't find such strings. There is comment there
> about cutting off mbuf header before passing it to ether_input - what's
> this?

I think this fix is only necessary for the token-ring case (as he
says in his follow-up mail) and is not related to Ethernet.

Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Yoshinobu Inoue

> > -Any kernel crash dump?
> Can you tell me ddb command to make a kernel dump?

 -Please confirm that your /var/crash has enough space for your
  machine's memory.

 -Please check your swap device using "swapinfo" etc.
  In the case of my machine:

   % swapinfo
   Device       1K-blocks     Used    Avail Capacity  Type
   /dev/wd0s2b     262144    75612   186404    29%    Interleaved

 -Please specify it as dumpdev in your /etc/rc.conf:

   dumpdev="/dev/wd0s2b"

Then, at the reboot after a panic, the crash dump will be
written to files under /var/crash/.

Cheers,
Yoshinobu Inoue





Re: -current sudden panics :(

2000-03-21 Thread Warner Losh

In message [EMAIL PROTECTED] "Ilmar S. Habibulin" writes:
: But why there is such a sudden change? Everything worked just fine a week
: before 5-current.

No it didn't.  I've been seeing panics like this for about two weeks,
but it hadn't been a priority until this week for me.  And I'm not
seeing it on lightly loaded networks, but am on heavily loaded ones.
Since our product's network port is just for debugging, it isn't a big
deal to me

It is definitely a load related problem for me.  It usually works just
fine, but sometimes there's a packet that gets to arp that arp barfs
on.

Warner





-current sudden panics :(

2000-03-20 Thread Ilmar S. Habibulin


After upgrading from 4.0-current (09.03) to 5.0-current (16.03, 17.03) I've
been getting sudden panics. The machine panics and reboots, and I was not
always near it. Finally I traced it:

Fatal 12 trap: page fault while in kernel mode
fault virtual address   = 0x8
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xc01843fc
stack pointer           = 0x10:0xc026bd64
frame pointer           = 0x10:0xc026bd64
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = Idle
interrupt mask          =
kernel: type 12 trap, code=0
Stopped at  arpintr+0x9c:  movl  0x8(%ebx),%ecx

trace gave this:
arpint(c022537b,0,10,10,c0220010) at arpintr+0x9c
swi_net_next() at awi_net_next

I'm sending kernel config and dmesg in the attachment. I have INET6 there,
but it is not configured by ifconfig.

What is this, and how can I avoid these panics?
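
A note on reading the trap output above: a fault virtual address of 0x8 together with an instruction of the form movl 0x8(%ebx),%ecx is what one would expect when a field at offset 8 is read through a NULL base pointer. The sketch below shows only the arithmetic. The struct fake_hdr layout and the field_address helper are hypothetical, with 32-bit fields chosen to mimic i386 pointer width; they are not taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record with 32-bit fields, mimicking i386 pointer width. */
struct fake_hdr {
    uint32_t next;      /* offset 0 */
    uint32_t nextpkt;   /* offset 4 */
    uint32_t data;      /* offset 8 */
};

/* Compile-time check that the illustrative field really sits at offset 8. */
static_assert(offsetof(struct fake_hdr, data) == 8,
    "data is expected at offset 8 in this sketch");

/* Address that would be touched when reading `data` through a pointer whose value is `base`. */
static uintptr_t
field_address(uintptr_t base)
{
    return base + offsetof(struct fake_hdr, data);
}
```

Under these assumptions, a base of 0 (a NULL pointer) yields the address 0x8, matching the reported fault address.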


Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #1: Sat Mar 18 10:21:21 MSK 2000
[EMAIL PROTECTED]:/usr/src/sys/compile/WS_ILMAR
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium II/Pentium II Xeon/Celeron (367.50-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x660  Stepping = 0
  
  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>

real memory  = 134152192 (131008K bytes)
avail memory = 127029248 (124052K bytes)
Preloaded elf kernel "kernel" at 0xc02f4000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: Intel 82443BX (440 BX) host to PCI bridge on motherboard
pci0: PCI bus on pcib0
pcib1: Intel 82443BX (440 BX) PCI-PCI (AGP) bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
isab0: Intel 82371AB PCI to ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel PIIX4 ATA33 controller port 0xf000-0xf00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
pci0: Intel 82371AB/EB (PIIX4) USB controller at 7.2 irq 11
chip1: Intel 82371AB Power management controller port 0x5000-0x500f at device 7.3 on pci0
pci0: Matrox MGA Millennium 2064W graphics accelerator at 9.0 irq 9
rl0: RealTek 8139 10/100BaseTX port 0xe400-0xe47f mem 0xe700-0xe77f irq 11 at device 11.0 on pci0
rl0: Ethernet address: 00:c0:df:23:60:e2
miibus0: MII bus on rl0
rlphy0: RealTek internal media interface on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
rl0: supplying EUI64: 00:c0:df:ff:fe:23:60:e2
fdc0: NEC 72065B or clone at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5" drive on fdc0 drive 0
atkbdc0: keyboard controller (i8042) at port 0x60-0x6f on isa0
atkbd0: AT Keyboard irq 1 on atkbdc0
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
sc0: System console on isa0
sc0: VGA 16 virtual consoles, flags=0x200
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: Parallel port at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppi0: Parallel I/O on ppbus0
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
plip0: PLIP network interface on ppbus0
unknown0: ESS ES1868 Plug and Play AudioDrive at port 0x800-0x807 on isa0
unknown1: ESS ES1868 Plug and Play AudioDrive at port 0x220-0x22f,0x388-0x38b,0x330-0x331 irq 5 drq 1,0 on isa0
unknown2: ESS ES1868 Plug and Play AudioDrive at port 0x201 on isa0
unknown3: Generic ESDI/IDE/ATA controller at port 0x168-0x16f,0x36e-0x36f irq 12 on isa0
ad0: 3077MB ST33232A [6253/16/63] at ata0-master using UDMA33
ad2: 19574MB IBM-DPTA-372050 [39770/16/63] at ata1-master using UDMA33
acd0: CDROM HITACHI CDR-8335 at ata1-slave using WDMA2
Mounting root from ufs:/dev/wd0s2a
WARNING: / was not properly dismounted
rl0: starting DAD for fe80:0001::02c0:dfff:fe23:60e2
rl0: DAD complete for fe80:0001::02c0:dfff:fe23:60e2 - no duplicates found


#
# GENERIC -- Generic kernel configuration file for FreeBSD/i386
#
# For more information on this file, please read the handbook section on
# Kernel Configuration Files:
#
#http://www.freebsd.org/handbook/kernelconfig-config.html
#
# The handbook is also available locally in /usr/share/doc/handbook
# if you've installed the doc distribution, otherwise always see the
# FreeBSD World Wide Web server (http://www.FreeBSD.ORG/) for the
# latest information.
#
# An exhaustive list of options and more detailed explanations of the
# device lines is also present in the ./LINT configuration file. If you are
# in doubt as to the purpose or necessity of a line, check first in LINT.
#
# $FreeBSD: