date:20070107

Re: Panic in 6.2-PRERELEASE with bge on amd64

2007-01-07 Thread Bruce Evans


On Sun, 7 Jan 2007, Sven Willenberger wrote:


> I am starting a new thread on this as what I had assumed was a panic in
> nfsd turns out to be an issue with the bge driver. This is an amd64 box,
> dual processor (SMP kernel) that happens to be running nfsd. About every
> 3-5 days the kernel panics and I have finally managed to get a core
> dump.
> The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan  2 10:57:39 EST 2007


Like most NIC drivers, bge unlocks and re-locks around its call to
ether_input() in its interrupt handler.  This isn't very safe, and it
certainly causes panics for bge.  I often see it panic when bringing
the interface down and up while input is arriving, on a non-SMP non-amd64
(actually i386) non-6.x (actually -current) system.  Bringing the
interface down is probably the worst case.  It creates a null pointer
for bge_intr() to follow.
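
The hazard described above can be modelled outside the kernel. What follows is a toy Python sketch, not the bge(4) code: ToyDriver, rx_ring, stop() and the recheck flag are hypothetical names, and re-checking driver state after re-locking is shown only as one assumed mitigation, not as the fix that was committed.

```python
import threading


class ToyDriver:
    """Hypothetical stand-in for a NIC softc; not the real bge(4) structures."""

    def __init__(self):
        self.lock = threading.Lock()
        # Becomes None when the interface is stopped (models a freed RX ring).
        self.rx_ring = ["pkt0", "pkt1", "pkt2"]

    def stop(self):
        # Models "ifconfig down": tear down receive state under the driver lock.
        with self.lock:
            self.rx_ring = None

    def rxeof(self, ether_input, recheck):
        """Deliver queued packets, dropping the lock around each upcall."""
        delivered = 0
        i = 0
        self.lock.acquire()
        try:
            while True:
                if recheck and self.rx_ring is None:
                    break  # state vanished while the lock was dropped
                # Without the re-check, len(None) raises TypeError here;
                # that stands in for following a null pointer.
                if i >= len(self.rx_ring):
                    break
                pkt = self.rx_ring[i]
                i += 1
                self.lock.release()      # unlock around the upcall
                try:
                    ether_input(pkt)
                finally:
                    self.lock.acquire()  # re-lock afterwards
                delivered += 1
        finally:
            self.lock.release()
        return delivered


# The upcall stops the interface while the handler has the lock dropped.
safe = ToyDriver()
safe_count = safe.rxeof(lambda pkt: safe.stop(), recheck=True)

unsafe = ToyDriver()
try:
    unsafe.rxeof(lambda pkt: unsafe.stop(), recheck=False)
    unsafe_crashed = False
except TypeError:
    unsafe_crashed = True
```

The model is single-threaded on purpose: the upcall itself triggers the teardown, so the outcome is deterministic rather than timing-dependent.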


> The short and dirty of the dump:
> ...
> --- trap 0xc, rip = 0x801d5f17, rsp = 0xb371ab50, rbp = 0xb371aba0 ---
> bge_rxeof() at bge_rxeof+0x3b7


What is the instruction here?


> bge_intr() at bge_intr+0x1c8
> ithread_loop() at ithread_loop+0x14c
> fork_exit() at fork_exit+0xbb
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xb371ad00, rbp = 0 ---
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; apic id = 01
> fault virtual address   = 0x28


Looks like a null pointer panic anyway.  I guess the instruction is
movl to/from 0x28(%reg) where %reg is a null pointer.
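
The reasoning here (a tiny fault address means a null base plus a member offset) can be written down as a rough heuristic. This is only a sketch: the 4 KB page size is an assumption, likely_null_plus_offset is a hypothetical helper, and the second address is taken from the unp_gc panic reported elsewhere in this digest.

```python
PAGE_SIZE = 4096  # assumption: 4 KB base pages


def likely_null_plus_offset(fault_addr):
    """Heuristic: a fault inside the first page suggests NULL + small offset."""
    return 0 <= fault_addr < PAGE_SIZE


# Fault addresses quoted in the two panic reports.
for where, addr in (("bge_rxeof", 0x28), ("_mtx_lock_sleep", 0x18C)):
    if likely_null_plus_offset(addr):
        print(f"{where}: fault at {addr:#x}, i.e. offset {addr} from a null base")
```

The heuristic says nothing about which pointer was null; that still needs the faulting instruction or source line.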


> ...
> #8  0x801db818 in bge_intr (xsc=0x0) at /usr/src/sys/dev/bge/if_bge.c:2707


What is the statement here?  It presumably follows a null pointer, and only
the expression for the pointer is interesting.  xsc is already null, but
that is probably a bug in gdb or the result of excessive optimization.
Compiling kernels with -O2 has little effect except to break debugging.

I rarely use gdb on kernels and haven't looked closely enough using ddb
to see where the null pointer for the panic on down/up came from.

BTW, the sbdrop panic in -current isn't bge-only or SMP-only.  I saw
it once for sk on a non-SMP system.  It rarely happens for non-SMP
(much more rarely than the panic in bge_intr()).  Under -current, on
an SMP amd64 system with bge, it happens almost every time on close
of the socket for a ttcp server if input is arriving at the time of
the close.  I haven't seen it for 6.x.

Bruce
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Re: kernel panic on 6.2-RC2 with GENERIC.

2007-01-07 Thread Scott Long


Jan Mikkelsen wrote:

(Scott:  I should have emailed you this earlier, but Christmas and various
other things got in the way.)

Ian West wrote:

On Sun, Jan 07, 2007 at 02:25:02PM -0500, Mike Tancsa wrote:

At 11:43 AM 1/7/2007, Craig Rodrigues wrote:

On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
[ Areca kernel panic, IO failures ... ]

I have seen this identical fault with the new areca driver, my machine
is opteron hardware, but running a regular i386/SMP kernel/world. With
everything at 6.2RC2 (as of 29th of December) except the areca driver
the machine is rock solid, with the 29th of December version of the
areca driver the box will crash on extract of a large tar file, removal
of a large directory structure, or pretty much anything that does a lot
of disk io to different files/locations. There is no error log prior to
seeing the following messages..

Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=433078272, length=8192)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=433111040, length=16384)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=433209344, length=16384)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=433242112, length=32768)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=437612544, length=4096)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=437616640, length=12288)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=437633024, length=6144)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=437639168, length=2048)]error = 5
Dec 29 14:26:44 aleph kernel: 
g_vfs_done():da0s1g[WRITE(offset=437641216, length=6144)]error = 5


There are a string of these, followed by a crash and reboot. The file
system state can be left very dirty to the point where background fsck
seems unable to recover it.
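
For anyone sifting through many of these messages, the g_vfs_done lines can be parsed mechanically. A minimal sketch, assuming each wrapped log entry has first been rejoined into a single line; parse_g_vfs_done is a hypothetical helper, and the errno lookup uses the host's table (errno 5 is EIO on FreeBSD).

```python
import errno
import re

# Matches e.g. "g_vfs_done():da0s1g[WRITE(offset=433078272, length=8192)]error = 5"
LINE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>\w+)\[(?P<op>\w+)"
    r"\(offset=(?P<offset>\d+), length=(?P<length>\d+)\)\]error = (?P<err>\d+)"
)


def parse_g_vfs_done(line):
    """Return a dict describing one g_vfs_done message, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"))
    return {
        "dev": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        # Symbolic name from the host's errno table; falls back to the number.
        "errno": errno.errorcode.get(err, str(err)),
    }


sample = ("Dec 29 14:26:44 aleph kernel: "
          "g_vfs_done():da0s1g[WRITE(offset=433078272, length=8192)]error = 5")
record = parse_g_vfs_done(sample)
```

Every message in the excerpt is a WRITE to da0s1g failing with the same error number, which the parsed records make easy to confirm.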

The areca card in question is running the latest firmware/boot and
has shown no problems either before, or since backing out the areca
driver.

The volume I ran the tests on was a 250G volume on a raid6 raid set.


I have seen various problems with various Areca drivers.  All on
6.2-RC1/amd64 with an Areca RAID-6 volume.

Areca 1.20.00.02 seems to work fine.

Areca 1.20.00.12 (from the Areca website) seems to have data corruption
problems.  My tests involve doing a "diff -r" on a filesystem with 2GB of
data.  It will occasionally find differences in files.  On examination, the
last 640 bytes of the first block of the affected file contain data from
another file "nearby" in the filesystem.  Unmounting and remounting the
filesystems and rerunning the test shows no problem, or a difference in
another file entirely.  I think this is the cause of the g_vfs_done failures
with this version of the driver;  the offsets are wrong because the data is
corrupted.

Areca 1.20.00.13 (as currently in the tree) does not seem to have data
corruption problems, but I can trigger g_vfs_done failures under heavy I/O.

I have raised this with Areca support, and I'm waiting to hear back from
Erich Chen.

Regards,

Jan Mikkelsen



I discussed this issue at length with the release engineering team
today, and we're going to go ahead with keeping the 1.20.00.13 version in
6.2 since it has been working very reliably for a number of other
testers, and reverting it at this late stage of the release represents
more risk.  A note about this issue will likely be put into the 6.2
errata document as well.

I plan to dig into this problem next week unless Areca fixes it first.
Please let me know if you hear anything from them.

Scott



Re: Fatal trap 12: page fault while in kernel mode

2007-01-07 Thread Marc G. Fournier


I'm up and running on the patch now as well ...

--On Sunday, January 07, 2007 17:02:40 -0800 Kevin Oberman <[EMAIL PROTECTED]>
wrote:

>> Date: Sun, 7 Jan 2007 14:03:41 + (GMT)
>> From: Robert Watson <[EMAIL PROTECTED]>
>> Sender: [EMAIL PROTECTED]
>>
>> On Sat, 6 Jan 2007, Marc G. Fournier wrote:
>>
>> > Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17
>> > 01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core
>> > if there is information that I can provide out of it ...
>> >
>> > Fatal trap 12: page fault while in kernel mode
>> > cpuid = 0; apic id = 00
>> > fault virtual address   = 0x18c
>> > fault code  = supervisor read, page not present
>> > instruction pointer = 0x8:0x801f9053
>> > stack pointer   = 0x10:0xb5c78b30
>> > frame pointer   = 0x10:0xb5c78b60
>> > code segment= base 0x0, limit 0xf, type 0x1b
>> >= DPL 0, pres 1, long 1, def32 0, gran 1
>> > processor eflags= resume, IOPL = 0
>> > current process = 5 (thread taskq)
>> > trap number = 12
>> > panic: page fault
>> > cpuid = 0
>> > Uptime: 8d22h25m40s
>> >
>> > (kgdb) where
>> > #0  doadump () at pcpu.h:172
>> > #1  0x80203955 in boot (howto=260) at
>> > /usr/src/sys/kern/kern_shutdown.c:409
>> > #2  0x80204065 in panic (fmt=0xff019b667720
>> > "X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ") at
>> > /usr/src/sys/kern/kern_shutdown.c:565
>> > #3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at
>> > /usr/src/sys/amd64/amd64/trap.c:660
>> > #4  0x80328cd8 in trap (frame=
>> >  {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730,
>> > tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx =
>> > -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0,
>> > tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12,
>> > tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085,
>> > tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at
>> > /usr/src/sys/amd64/amd64/trap.c:238
>> > #5  0x80313c6b in calltrap () at
>> > /usr/src/sys/amd64/amd64/exception.S:168
>> > #6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0,
>> > tid=18446742981100074784, opts=6, file=0xc102  02
>> > out of bounds>, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
>> > #7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at
>> > /usr/src/sys/kern/uipc_usrreq.c:1714
>> > #8  0x8022c314 in taskqueue_run (queue=0xff844800) at
>> > /usr/src/sys/kern/subr_taskqueue.c:257
>> > #9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at
>> > /usr/src/sys/kern/subr_taskqueue.c:376
>> > #10 0x801e7b76 in fork_exit (callout=0x8022d060
>> > , arg=0x805030d0, frame=0xb5c78c50) at /usr/src/sys/kern/kern_fork.c:821
>> > #11 0x80313fce in fork_trampoline () at
>> > /usr/src/sys/amd64/amd64/exception.S:394
>>
>> This is a NULL pointer dereference in the UNIX domain socket code.  John
>> Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT,
>> with an MFC planned in the near future.  The fix won't make 6.2-RELEASE,
>> but assuming it tests out well over the next few weeks, we will cut an
>> errata patch/announcement for it.  I believe you can pull down his
>> 6-STABLE version at:
>>
>>     http://people.FreeBSD.org/~jhb/patches/unp_gc.patch
>>
>> This same patch is currently in testing on mx1.FreeBSD.org.
>>
>> (John CC'd)
>>
>> Robert N M Watson
>> Computer Laboratory
>> University of Cambridge
>
> I have installed this on my system, but the panics have always been very
> erratic, so it may be a while before I am sure whether this fixes it. At
> the moment the system has been up for 7 days, although I have had
> multiple crashes in a single day.
> --
> R. Kevin Oberman, Network Engineer
> Energy Sciences Network (ESnet)
> Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
> E-mail: [EMAIL PROTECTED] Phone: +1 510 486-8634
> Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


Re: kernel panic on 6.2-RC2 with GENERIC.

2007-01-07 Thread Jeff Royle


Craig Rodrigues wrote:

On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:

Hello folks.
I have kernel panic on GENERIC kernel while executing postmark.


The sequence of steps that Nikolay used to produce this panic was:

- install benchmarks/postmark from ports

root# postmark
PostMark v1.5 : 3/27/01
pm>set number=1
pm>set transactions=1
pm>set subdirectories=1
pm>show
pm>run



I was able to perform this test without issue on my latest test server 
using an Adaptec 2130SLP using Raid1 and 2 Ultra320 Scsi-3 drives. 
Plain install, GENERIC-SMP kernel:


su-2.05b# postmark
PostMark v1.5 : 3/27/01
pm>set number=1
pm>set transactions=1
pm>set subdirectories=1
pm>set location /tmp
pm>run
Creating subdirectories...Done
Creating files...Done
Performing transactions..Done
Deleting files...Done
Deleting subdirectories...Done
Time:
102 seconds total
69 seconds of transactions (144 per second)

Files:
15027 created (147 per second)
Creation alone: 1 files (588 per second)
Mixed with transactions: 5027 files (72 per second)
4990 read (72 per second)
5009 appended (72 per second)
15027 deleted (147 per second)
Deletion alone: 10054 files (628 per second)
Mixed with transactions: 4973 files (72 per second)

Data:
27.14 megabytes read (272.46 kilobytes per second)
85.08 megabytes written (854.14 kilobytes per second)
pm>quit
su-2.05b# uname -a
FreeBSD testserv1.aci 6.2-RC2 FreeBSD 6.2-RC2 #0: Sun Dec 24 23:42:30 
UTC 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386



Cheers,

Jeff

Re: Fatal trap 12: page fault while in kernel mode

2007-01-07 Thread Kevin Oberman

> Date: Sun, 7 Jan 2007 14:03:41 + (GMT)
> From: Robert Watson <[EMAIL PROTECTED]>
> Sender: [EMAIL PROTECTED]
> 
> On Sat, 6 Jan 2007, Marc G. Fournier wrote:
> 
> > Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17
> > 01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core
> > if there is information that I can provide out of it ...
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address   = 0x18c
> > fault code  = supervisor read, page not present
> > instruction pointer = 0x8:0x801f9053
> > stack pointer   = 0x10:0xb5c78b30
> > frame pointer   = 0x10:0xb5c78b60
> > code segment= base 0x0, limit 0xf, type 0x1b
> >= DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags= resume, IOPL = 0
> > current process = 5 (thread taskq)
> > trap number = 12
> > panic: page fault
> > cpuid = 0
> > Uptime: 8d22h25m40s
> >
> > (kgdb) where
> > #0  doadump () at pcpu.h:172
> > #1  0x80203955 in boot (howto=260) at
> > /usr/src/sys/kern/kern_shutdown.c:409
> > #2  0x80204065 in panic (fmt=0xff019b667720
> > "X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ") at
> > /usr/src/sys/kern/kern_shutdown.c:565
> > #3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at
> > /usr/src/sys/amd64/amd64/trap.c:660
> > #4  0x80328cd8 in trap (frame=
> >  {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730,
> > tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx =
> > -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0,
> > tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12,
> > tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085,
> > tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at
> > /usr/src/sys/amd64/amd64/trap.c:238
> > #5  0x80313c6b in calltrap () at
> > /usr/src/sys/amd64/amd64/exception.S:168
> > #6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0,
> > tid=18446742981100074784, opts=6, file=0xc102  02 out 
> > of
> > bounds>, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
> > #7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at
> > /usr/src/sys/kern/uipc_usrreq.c:1714
> > #8  0x8022c314 in taskqueue_run (queue=0xff844800) at
> > /usr/src/sys/kern/subr_taskqueue.c:257
> > #9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at
> > /usr/src/sys/kern/subr_taskqueue.c:376
> > #10 0x801e7b76 in fork_exit (callout=0x8022d060
> > , arg=0x805030d0, frame=0xb5c78c50) at
> > /usr/src/sys/kern/kern_fork.c:821
> > #11 0x80313fce in fork_trampoline () at
> > /usr/src/sys/amd64/amd64/exception.S:394
> 
> This is a NULL pointer dereference in the UNIX domain socket code.  John
> Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT,
> with an MFC planned in the near future.  The fix won't make 6.2-RELEASE, but
> assuming it tests out well over the next few weeks, we will cut an errata
> patch/announcement for it.  I believe you can pull down his 6-STABLE version
> at:
>
>     http://people.FreeBSD.org/~jhb/patches/unp_gc.patch
>
> This same patch is currently in testing on mx1.FreeBSD.org.
> 
> (John CC'd)
> 
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

I have installed this on my system, but the panics have always been very
erratic, so it may be a while before I am sure whether this fixes it. At
the moment the system has been up for 7 days, although I have had
multiple crashes in a single day.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751



Re: Intel PRO/1000 PT Desktop NIC, PCIe 1x supported in FreeBSD 6.2?

2007-01-07 Thread Kevin Oberman

On 1/7/07,"Jack Vogel" <[EMAIL PROTECTED]> wrote:

> Yes, all released Intel PCI-E wired NICs are supported,  sometimes not all
> features of the hardware are supported, but my job at Intel is to keep
> the driver
> working with new hardware, and to add support for features that don't have
> such now.

And my sincere thanks to both Intel and you for the efforts to make Intel wired 
Ethernet devices work well with FreeBSD.

Now, if someone could just explain why Intel is so good about this area of OSS 
support but still refuses to even discuss issues with OSS support for wireless 
cards.

Ah, well. Intel still does far better than many hardware suppliers, in no small 
part due to Jack's efforts.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: [EMAIL PROTECTED]   Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751


Re: Source MAC addresses when bridge(4) used

2007-01-07 Thread Sten Daniel Sørsdal

Peter Jeremy wrote:
> I've just noticed a number of unexpected "IP address changed MAC"
> messages on one of the hosts in my network.  It is connected via a
> FreeBSD bridge to the rest of my network (there aren't enough network
> ports in my son's bedroom).  The configuration looks like:
> 
>   +---------+         +---------+
>   |         |         |         |
>   | laptop1 |---------| desktop |--> Rest of network
>   |         |dc0   tl0|         |rl0 via dumb switch
>   +---------+         +---------+
> 
> The desktop network configuration is:
> tl0: flags=8943 mtu 1500
> ether 00:00:24:28:98:9a
> media: Ethernet autoselect (100baseTX )
> status: active
> rl0: flags=8943 mtu 1500
> options=8
> inet 192.168.123.36 netmask 0xff00 broadcast 192.168.123.255
> ether 00:20:ed:78:9c:a3
> media: Ethernet autoselect (100baseTX )
> status: active
> lo0: flags=8049 mtu 16384
> inet 127.0.0.1 netmask 0xff00 
> bridge0: flags=8043 mtu 1500
> ether ca:a9:aa:1e:71:32
> priority 32768 hellotime 2 fwddelay 15 maxage 20
> member: tl0 flags=3
> member: rl0 flags=3
> 
> laptop1 is regularly reporting that 192.168.123.36 (the IP address of
> the desktop) is switching between the two adapters in it:
> Jan  6 07:27:09 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 08:09:45 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 08:46:11 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 09:29:00 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 12:12:12 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 12:15:31 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 13:06:42 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 16:48:45 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 17:32:22 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 17:33:33 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 17:53:45 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 17:57:05 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 18:17:20 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 18:24:48 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 18:45:08 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 18:48:19 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 19:08:45 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 19:11:50 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 19:32:15 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 19:33:07 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 19:56:34 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  6 22:44:24 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:20:ed:78:9c:a3 to 00:00:24:28:98:9a on dc0
> Jan  6 23:04:26 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> 
> Even more unexpectedly, laptop1 is repeating the same "moved" message:
> Jan  7 00:46:55 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  7 01:38:09 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  7 02:29:26 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  7 03:20:39 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  7 04:28:59 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  7 05:18:50 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  7 06:28:31 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> Jan  7 07:16:05 laptop1 kernel: arp: 192.168.123.36 moved from 
> 00:00:24:28:98:9a to 00:20:ed:78:9c:a3 on dc0
> 
> Both hosts are running 6.1-STABLE:
> laptop1: FreeBSD lapt
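
The pattern in the arp messages above (mostly alternating, sometimes the same move twice in a row) can be tallied with a short script. A minimal sketch, assuming each wrapped log entry has been rejoined into one line; summarize_moves is a hypothetical helper and the four sample lines are the first four entries from the log.

```python
import re
from collections import Counter

MOVE_RE = re.compile(
    r"arp: (\S+) moved from ([0-9a-f:]{17}) to ([0-9a-f:]{17}) on (\w+)"
)


def summarize_moves(lines):
    """Count moves per (old MAC, new MAC) and back-to-back identical moves."""
    moves = [m.groups() for m in map(MOVE_RE.search, lines) if m]
    directions = Counter((old, new) for _ip, old, new, _ifname in moves)
    repeats = sum(1 for a, b in zip(moves, moves[1:]) if a == b)
    return directions, repeats


TL0 = "00:00:24:28:98:9a"  # tl0 on the desktop
RL0 = "00:20:ed:78:9c:a3"  # rl0 on the desktop
PREFIX = "laptop1 kernel: arp: 192.168.123.36 moved from"
sample = [
    f"Jan  6 07:27:09 {PREFIX} {TL0} to {RL0} on dc0",
    f"Jan  6 08:09:45 {PREFIX} {RL0} to {TL0} on dc0",
    f"Jan  6 08:46:11 {PREFIX} {TL0} to {RL0} on dc0",
    f"Jan  6 09:29:00 {PREFIX} {TL0} to {RL0} on dc0",
]
directions, repeats = summarize_moves(sample)
```

A non-zero repeat count flags the "same moved message" case: the entry must have changed back (or expired) without a logged move in between, though the log alone does not say which.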

RE: kernel panic on 6.2-RC2 with GENERIC.

2007-01-07 Thread Jan Mikkelsen

(Scott:  I should have emailed you this earlier, but Christmas and various
other things got in the way.)

Ian West wrote:
> On Sun, Jan 07, 2007 at 02:25:02PM -0500, Mike Tancsa wrote:
> > At 11:43 AM 1/7/2007, Craig Rodrigues wrote:
> > >On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
>>> [ Areca kernel panic, IO failures ... ]
> I have seen this identical fault with the new areca driver, my machine
> is opteron hardware, but running a regular i386/SMP kernel/world. With
> everything at 6.2RC2 (as of 29th of December) except the areca driver
> the machine is rock solid, with the 29th of December version of the
> areca driver the box will crash on extract of a large tar file, removal
> of a large directory structure, or pretty much anything that does a lot
> of disk io to different files/locations. There is no error log prior to
> seeing the following messages..
> 
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=433078272, length=8192)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=433111040, length=16384)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=433209344, length=16384)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=433242112, length=32768)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=437612544, length=4096)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=437616640, length=12288)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=437633024, length=6144)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=437639168, length=2048)]error = 5
> Dec 29 14:26:44 aleph kernel: 
> g_vfs_done():da0s1g[WRITE(offset=437641216, length=6144)]error = 5
> 
> There are a string of these, followed by a crash and reboot. The file
> system state can be left very dirty to the point where background fsck
> seems unable to recover it.
> 
> The areca card in question is running the latest firmware/boot and
> has shown no problems either before, or since backing out the areca
> driver.
> 
> The volume I ran the tests on was a 250G volume on a raid6 raid set.

I have seen various problems with various Areca drivers.  All on
6.2-RC1/amd64 with an Areca RAID-6 volume.

Areca 1.20.00.02 seems to work fine.

Areca 1.20.00.12 (from the Areca website) seems to have data corruption
problems.  My tests involve doing a "diff -r" on a filesystem with 2GB of
data.  It will occasionally find differences in files.  On examination, the
last 640 bytes of the first block of the affected file contain data from
another file "nearby" in the filesystem.  Unmounting and remounting the
filesystems and rerunning the test shows no problem, or a difference in
another file entirely.  I think this is the cause of the g_vfs_done failures
with this version of the driver;  the offsets are wrong because the data is
corrupted.

Areca 1.20.00.13 (as currently in the tree) does not seem to have data
corruption problems, but I can trigger g_vfs_done failures under heavy I/O.

I have raised this with Areca support, and I'm waiting to hear back from
Erich Chen.

Regards,

Jan Mikkelsen


Panic in 6.2-PRERELEASE with bge on amd64

2007-01-07 Thread Sven Willenberger

I am starting a new thread on this as what I had assumed was a panic in
nfsd turns out to be an issue with the bge driver. This is an amd64 box,
dual processor (SMP kernel) that happens to be running nfsd. About every
3-5 days the kernel panics and I have finally managed to get a core
dump. 
The system: FreeBSD 6.2-PRERELEASE #8: Tue Jan  2 10:57:39 EST 2007

The short and dirty of the dump:

# kgdb /usr/obj/usr/src/sys/MSPOOL/kernel.debug /var/crash/vmcore.0
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:
lock order reversal: (sleepable after non-sleepable)
 1st 0x8836b010 bge0 (network driver) @ 
/usr/src/sys/dev/bge/if_bge.c:2675
 2nd 0x805f26b0 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
KDB: stack backtrace:
witness_checkorder() at witness_checkorder+0x4da
_sx_xlock() at _sx_xlock+0x51
vm_map_lookup() at vm_map_lookup+0x44
vm_fault() at vm_fault+0xba
trap_pfault() at trap_pfault+0x13c
trap() at trap+0x1f9
calltrap() at calltrap+0x5
--- trap 0xc, rip = 0x801d5f17, rsp = 0xb371ab50, rbp = 
0xb371aba0 ---
bge_rxeof() at bge_rxeof+0x3b7
bge_intr() at bge_intr+0x1c8
ithread_loop() at ithread_loop+0x14c
fork_exit() at fork_exit+0xbb
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xb371ad00, rbp = 0 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x28
fault code  = supervisor write, page not present
instruction pointer = 0x8:0x801d5f17
stack pointer   = 0x10:0xb371ab50
frame pointer   = 0x10:0xb371aba0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 28 (irq24: bge0)
trap number = 12
panic: page fault
cpuid = 1
Uptime: 3d4h18m42s

#0  doadump () at pcpu.h:172
172 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:172
#1  0x802771b9 in boot (howto=260) at 
/usr/src/sys/kern/kern_shutdown.c:409
#2  0x80276c4b in panic (fmt=0x8044160c "%s") at 
/usr/src/sys/kern/kern_shutdown.c:565
#3  0x803ebba6 in trap_fatal (frame=0xc, eva=18446742978291675136) at 
/usr/src/sys/amd64/amd64/trap.c:660
#4  0x803ebee3 in trap_pfault (frame=0xb371aaa0, usermode=0) at 
/usr/src/sys/amd64/amd64/trap.c:573
#5  0x803ec0f9 in trap (frame=
  {tf_rdi = 0, tf_rsi = 0, tf_rdx = 1, tf_rcx = 499, tf_r8 = 2521427970, 
tf_r9 = -1099500152320, tf_rax = 0, tf_rbx = -1263948192, tf_rbp = -1284396128, 
tf_r10 = 0, tf_r11 = 0, tf_r12 = -2009681920, tf_r13 = 0, tf_r14 = 0, tf_r15 = 
-1099499984896, tf_trapno = 12, tf_addr = 40, tf_flags = -1263948192, tf_err = 
2, tf_rip = -2145558761, tf_cs = 8, tf_rflags = 66071, tf_rsp = -1284396192, 
tf_ss = 16})
at /usr/src/sys/amd64/amd64/trap.c:352
#6  0x803d779b in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:168
#7  0x801d5f17 in bge_rxeof (sc=0x8836b000) at 
/usr/src/sys/dev/bge/if_bge.c:2528
#8  0x801db818 in bge_intr (xsc=0x0) at 
/usr/src/sys/dev/bge/if_bge.c:2707
#9  0x8025f2bc in ithread_loop (arg=0xffb1b320) at 
/usr/src/sys/kern/kern_intr.c:682
#10 0x8025e00b in fork_exit (callout=0x8025f170 , 
arg=0xffb1b320, frame=0xb371ac50)
at /usr/src/sys/kern/kern_fork.c:821
#11 0x803d7afe in fork_trampoline () at 
/usr/src/sys/amd64/amd64/exception.S:394
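
The trap frame in frame #5 prints registers as signed decimals, which makes them hard to match against the hexadecimal values in the trap message. A minimal sketch for converting them and for decoding tf_err, assuming 64-bit two's-complement registers and the standard low three bits of the x86 page-fault error code; reg_hex and decode_pf_err are hypothetical helper names.

```python
MASK64 = (1 << 64) - 1


def reg_hex(value):
    """Show a signed register value from kgdb as 64-bit two's-complement hex."""
    return f"0x{value & MASK64:016x}"


def decode_pf_err(err):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "present": bool(err & 0x1),  # 0: page not present, 1: protection fault
        "write": bool(err & 0x2),    # 0: read access, 1: write access
        "user": bool(err & 0x4),     # 0: supervisor mode, 1: user mode
    }


# Values copied from the trap frame above.
frame = {"tf_rip": -2145558761, "tf_rbp": -1284396128, "tf_addr": 40, "tf_err": 2}

rip = reg_hex(frame["tf_rip"])       # low 32 bits match the instruction pointer
rbp = reg_hex(frame["tf_rbp"])       # low 32 bits match the frame pointer
fault = hex(frame["tf_addr"])        # matches the fault virtual address
cause = decode_pf_err(frame["tf_err"])
```

With tf_err = 2 the decoded cause is a supervisor-mode write to a non-present page, which agrees with the "fault code" line in the trap message.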

If more information is needed (disassemble, etc) please let me know. In
the interim I may switch to either using the base100 ethernet port (fxp)
or turn off SMP.

Sven


Re: kernel panic on 6.2-RC2 with GENERIC.

2007-01-07 Thread Ian West

On Sun, Jan 07, 2007 at 02:25:02PM -0500, Mike Tancsa wrote:
> At 11:43 AM 1/7/2007, Craig Rodrigues wrote:
> >On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
> >> Hello folks.
> >> I have kernel panic on GENERIC kernel while executing postmark.
> >
> >The sequence of steps that Nikolay used to produce this panic was:
> >
> >- install benchmarks/postmark from ports
> >
> >root# postmark
> >PostMark v1.5 : 3/27/01
> >pm>set number=1
> >pm>set transactions=1
> >pm>set subdirectories=1
> >pm>show
> >pm>run
> 
> I am able to do this on an AMD64 on an Areca RAID6 file system and on 
> a plain old ata drive on i386 without issue.
> 
> the i386 is a few weeks old but I will cvsup and re-try to confirm on 
> both today
> 
> [tyan-1u]# postmark
> PostMark v1.5 : 3/27/01
> pm>set number=1
> pm>set transactions=1
> pm>set subdirectories=1
> pm>set location /tmp
> pm>run
> Creating subdirectories...Done
> Creating files...Done
> Performing transactions..Done
> Deleting files...Done
> Deleting subdirectories...Done
> Time:
> 481 seconds total
> 233 seconds of transactions (42 per second)
> 
> Files:
> 15027 created (31 per second)
> Creation alone: 1 files (62 per second)
> Mixed with transactions: 5027 files (21 per second)
> 4990 read (21 per second)
> 5009 appended (21 per second)
> 15027 deleted (31 per second)
> Deletion alone: 10054 files (115 per second)
> Mixed with transactions: 4973 files (21 per second)
> 
> Data:
> 27.14 megabytes read (57.78 kilobytes per second)
> 85.08 megabytes written (181.13 kilobytes per second)
> pm>quit
> [tyan-1u]# uname -a
> FreeBSD tyan-1u.sentex.ca 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: 
> Mon Dec 11 17:45:45 EST 
> 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/tyan  i386
> [tyan-1u]#
> 
> 
> and amd64
> 
> pm>show
> Current configuration is:
> The base number of files is 1
> Transactions: 1
> Files range between 500 bytes and 9.77 kilobytes in size
> Working directory:
> /mnt (weight=1)
> 1 subdirectories will be used
> Block sizes are: read=512 bytes, write=512 bytes
> Biases are: read/append=5, create/delete=5
> Using Unix buffered file I/O
> Random number generator seed is 42
> Report format is verbose.
> pm>run
> Creating subdirectories...Done
> Creating files...Done
> Performing transactions..Done
> Deleting files...Done
> Deleting subdirectories...Done
> Time:
> 310 seconds total
> 155 seconds of transactions (64 per second)
> 
> Files:
> 15027 created (48 per second)
> Creation alone: 1 files (103 per second)
> Mixed with transactions: 5027 files (32 per second)
> 4990 read (32 per second)
> 5009 appended (32 per second)
> 15027 deleted (48 per second)
> Deletion alone: 10054 files (173 per second)
> Mixed with transactions: 4973 files (32 per second)
> 
> Data:
> 27.14 megabytes read (89.65 kilobytes per second)
> 85.08 megabytes written (281.04 kilobytes per second)
> pm>quit
> [r2-releng6-64]# uname -a
> FreeBSD r2-releng6-64.sentex.ca 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE 
> #0: Thu Dec 28 23:13:18 EST 
> 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/router  amd64
> [r2-releng6-64]#
> 
> 
> both file systems have normal newfs options and fairly standard 
> kernels with default /etc/make.conf and both are SMP

I have seen this identical fault with the new areca driver. My machine
is Opteron hardware, but running a regular i386/SMP kernel and world. With
everything at 6.2-RC2 (as of 29 December) except the areca driver,
the machine is rock solid. With the 29 December version of the
areca driver, the box will crash on extraction of a large tar file, removal
of a large directory structure, or pretty much anything that does a lot
of disk I/O to different files/locations. There is no error log prior to
seeing the following messages:

Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433078272, 
length=8192)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433111040, 
length=16384)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433209344, 
length=16384)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=433242112, 
length=32768)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437612544, 
length=4096)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437616640, 
length=12288)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437633024, 
length=6144)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437639168, 
length=2048)]error = 5
Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE(offset=437641216, 
length=6144)]error = 5
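
The `error = 5` in these messages is an errno value. As a minimal sketch, assuming each wrapped log entry has been re-joined into a single line, the fields can be pulled out and the errno decoded as below. The regular expression and the `parse_g_vfs_done` helper are illustrative names, not part of any FreeBSD tool.

```python
import errno
import re

# Pattern for the g_vfs_done() lines quoted above (illustrative only).
LINE_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>\w+)\[(?P<op>\w+)\(offset=(?P<offset>\d+), "
    r"length=(?P<length>\d+)\)\]error = (?P<error>\d+)"
)

def parse_g_vfs_done(line):
    """Return a dict describing one g_vfs_done() message, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    err = int(m.group("error"))
    return {
        "dev": m.group("dev"),
        "op": m.group("op"),
        "offset": int(m.group("offset")),
        "length": int(m.group("length")),
        "error": err,
        # Symbolic errno name; 5 maps to EIO (input/output error).
        "errno_name": errno.errorcode.get(err, "unknown"),
    }

# First entry from the log above, with the mail wrapping undone.
sample = ("Dec 29 14:26:44 aleph kernel: g_vfs_done():da0s1g[WRITE("
          "offset=433078272, length=8192)]error = 5")
print(parse_g_vfs_done(sample))
```

EIO only says that the write failed below the filesystem; it does not by itself say whether the driver, the controller or the disks are at fault.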

There are a string of these,

Re: Intel PRO/1000 PT Desktop NIC, PCIe 1x supported in FreeBSD 6.2?

2007-01-07 Thread Jack Vogel

On 1/7/07, Erik Trulsson <[EMAIL PROTECTED]> wrote:

On Sun, Jan 07, 2007 at 04:56:26PM +0100, O. Hartmann wrote:
> Hello,
> the company I work for is about to purchase some additional NICs to
> replace the built-in NICs of nForce 405-based desktop PCs. I would
> like to purchase the above-mentioned NICs from Intel, hoping the em()
> driver is capable of handling the NICs.

I have one of those cards (Intel PRO/1000 PT Desktop NIC) myself, and it
works just fine under 6-STABLE.  So far I have not had any problems at all
with it.

Yes, all released Intel PCI-E wired NICs are supported. Sometimes not all
features of the hardware are supported, but my job at Intel is to keep
the driver working with new hardware, and to add support for features
that do not have it yet.

Happy New Year,

Jack

Re: kernel panic on 6.2-RC2 with GENERIC.

2007-01-07 Thread Mike Tancsa


At 11:43 AM 1/7/2007, Craig Rodrigues wrote:

On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
> Hello folks.
> I have kernel panic on GENERIC kernel while executing postmark.

The sequence of steps that Nikolay used to produce this panic was:

- install benchmarks/postmark from ports

root# postmark
PostMark v1.5 : 3/27/01
pm>set number=1
pm>set transactions=1
pm>set subdirectories=1
pm>show
pm>run


I am able to do this on an AMD64 on an Areca RAID6 file system and on 
a plain old ata drive on i386 without issue.


the i386 is a few weeks old but I will cvsup and re-try to confirm on 
both today


[tyan-1u]# postmark
PostMark v1.5 : 3/27/01
pm>set number=1
pm>set transactions=1
pm>set subdirectories=1
pm>set location /tmp
pm>run
Creating subdirectories...Done
Creating files...Done
Performing transactions..Done
Deleting files...Done
Deleting subdirectories...Done
Time:
481 seconds total
233 seconds of transactions (42 per second)

Files:
15027 created (31 per second)
Creation alone: 1 files (62 per second)
Mixed with transactions: 5027 files (21 per second)
4990 read (21 per second)
5009 appended (21 per second)
15027 deleted (31 per second)
Deletion alone: 10054 files (115 per second)
Mixed with transactions: 4973 files (21 per second)

Data:
27.14 megabytes read (57.78 kilobytes per second)
85.08 megabytes written (181.13 kilobytes per second)
pm>quit
[tyan-1u]# uname -a
FreeBSD tyan-1u.sentex.ca 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: 
Mon Dec 11 17:45:45 EST 
2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/tyan  i386

[tyan-1u]#


and amd64

pm>show
Current configuration is:
The base number of files is 1
Transactions: 1
Files range between 500 bytes and 9.77 kilobytes in size
Working directory:
/mnt (weight=1)
1 subdirectories will be used
Block sizes are: read=512 bytes, write=512 bytes
Biases are: read/append=5, create/delete=5
Using Unix buffered file I/O
Random number generator seed is 42
Report format is verbose.
pm>run
Creating subdirectories...Done
Creating files...Done
Performing transactions..Done
Deleting files...Done
Deleting subdirectories...Done
Time:
310 seconds total
155 seconds of transactions (64 per second)

Files:
15027 created (48 per second)
Creation alone: 1 files (103 per second)
Mixed with transactions: 5027 files (32 per second)
4990 read (32 per second)
5009 appended (32 per second)
15027 deleted (48 per second)
Deletion alone: 10054 files (173 per second)
Mixed with transactions: 4973 files (32 per second)

Data:
27.14 megabytes read (89.65 kilobytes per second)
85.08 megabytes written (281.04 kilobytes per second)
pm>quit
[r2-releng6-64]# uname -a
FreeBSD r2-releng6-64.sentex.ca 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE 
#0: Thu Dec 28 23:13:18 EST 
2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/router  amd64

[r2-releng6-64]#


both file systems have normal newfs options and fairly standard 
kernels with default /etc/make.conf and both are SMP
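
For a side-by-side view of the two runs, here is a small sketch that recomputes the write throughput from the reported totals and compares elapsed times. The numbers are copied from the two reports above; the dictionary layout and function name are just for illustration.

```python
# Figures copied from the two PostMark reports above.
runs = {
    "i386 (tyan-1u)": {"total_s": 481, "txn_s": 233, "written_mb": 85.08},
    "amd64 (r2-releng6-64)": {"total_s": 310, "txn_s": 155, "written_mb": 85.08},
}

def write_rate_kb_s(run):
    """Write throughput in kilobytes per second (1 MB = 1024 KB)."""
    return run["written_mb"] * 1024 / run["total_s"]

for name, run in runs.items():
    print(f"{name}: {write_rate_kb_s(run):.2f} KB/s written")

i386, amd64 = runs["i386 (tyan-1u)"], runs["amd64 (r2-releng6-64)"]
speedup = i386["total_s"] / amd64["total_s"]
print(f"amd64 run finished {speedup:.2f}x faster overall")
```

The recomputed rates agree with the figures PostMark printed, so the reports are at least internally consistent. The two machines use different storage, so the ratio says nothing about i386 versus amd64 as such.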





Re: (audit?) Panic in 6.2-PRERELEASE

2007-01-07 Thread Ceri Davies

On Sun, Jan 07, 2007 at 06:05:39PM +, Robert Watson wrote:
> 
> On Sun, 7 Jan 2007, Ceri Davies wrote:
> 
> >>Could you try printing *td->td_ar?  Maybe this will give us a clue as to 
> >>how far it got.  In particular, this may be able to more reliably give us 
> >>the file descriptor number, which is audited early in the system call. 
> >>You might find that 'td' is corrupted in many layers of the stack, keep 
> >>going up until you find one where it's good.  It may well be that 
> >>td->td_ar->k_ar.ar_arg_fd is correct, and might confirm that uap->fd is 
> >>correct still.  We'd like also to know if ARG_SOCKINFO, ARG_VNODE1, or 
> >>ARG_VNODE2 is set in the k_ar.ar_valid_arg field.  This may tell us some 
> >>more about the file descriptor even though it appears to have vanished.
> >
> >*td->td_ar is null (0x0) in both cases...
> 
> I'm actually beginning to wonder if this is actually audit-related at all. 
> Something is clearly not right, and the audit code should not actually have 
> been entered at all there.  Perhaps we're being misled by the stack trace 
> corruption into thinking audit is involved.

I've wondered the same.

> >>I'm quite worried by the fact that the file descriptor seems not to be 
> >>present any more -- this suggests a file descriptor related race of the 
> >>sort that is both quite difficult to figure out and also quite a risk. 
> >>It's strange that it would only trigger with audit, however--perhaps 
> >>audit stretches out the race.  Is this an SMP box?
> >
> >It's certainly looking quite nasty.  This system is UP hardware without 
> >options SMP.
> >
> >...
> >
> >If it's at all useful, I can provide access to this system and the dumps.
> 
> Yeah, I think at this point that would probably be the most helpful thing.

OK, you should be able to log in as [EMAIL PROTECTED] with your
freefall key.  Details in ~rwatson/README once you're logged in.

> Could you confirm that the kernel.debug you're using definitely matches the 
> version of the kernel in the core dump?

Yes, definitely.

Thanks again,

Ceri
-- 
That must be wonderful!  I don't understand it at all.
  -- Moliere



Re: make buildworld is always breaking at various points

2007-01-07 Thread Wolfgang Zenker

Hello,

> I keep having troubles compiling either 6.1-RELEASE and 6.2-RC2.
> I downloaded sources, extracted them with install.sh and did a cvsup.
> [..]
> # make buildworld
> it breaks with these last lines:
> 
> [.. lines suppressed ..]

> building shared library libpmc.so.3
> ===> lib/libpthread (all)

> [.. lines suppressed ..]
> cc -O2 -fno-strict-aliasing -pipe -march=pentium4 -DPTHREAD_KERNEL 
> -I/usr/src/lib/libpthread/../libc/include 
> -I/usr/src/lib/libpthread/thread 
> -I/usr/src/lib/libpthread/../../include 
> -I/usr/src/lib/libpthread/arch/i386/include 
> -I/usr/src/lib/libpthread/sys 
> -I/usr/src/lib/libpthread/../../libexec/rtld-elf 
> -I/usr/src/lib/libpthread/../../libexec/rtld-elf/i386 -fno-builtin 
> -D_LOCK_DEBUG -D_PTHREADS_INVARIANTS -Wall 
> -I/usr/src/lib/libpthread/../libc/i386  -c 
> /usr/src/lib/libpthread/thread/thr_condattr_pshared.c

> make: don't know how to make 
> /usr/src.lib/libpthread/arch/i386/include/pthread_md.h. Stop
> *** Error code 2

When make tries to rebuild source files, this is often an indication
of mis-set system clocks. Check the date/time settings on your machine.
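
One quick way to test the clock theory is to look for files in the source tree whose modification time lies in the future. This is a rough sketch only; the function name and the one-second `slack` are arbitrary choices of mine, not anything the build system provides.

```python
import os
import time

def files_from_the_future(root, now=None, slack=1.0):
    """Yield (path, seconds_ahead) for files whose mtime is later than now."""
    if now is None:
        now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if mtime > now + slack:
                yield path, mtime - now
```

Calling `list(files_from_the_future("/usr/src"))` on a machine with a sane clock should return an empty list. Any hits suggest the tree was written while the clock was ahead, or that the clock has since jumped backwards.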

Wolfgang

Re: make buildworld is always breaking at various points

2007-01-07 Thread Peter Jeremy

On Sun, 2007-Jan-07 17:44:24 +0100, Christoph Illnar wrote:
>I keep having troubles compiling either 6.1-RELEASE and 6.2-RC2.
>I downloaded sources, extracted them with install.sh and did a cvsup.
>
>My installed system is 6.1-RELEASE and I keep trying to compile it on my 
>own.

Is the failure consistent?  I suspect you may have bad RAM.

>===> lib/libpthread (all)
>
>[.. lines suppressed ..]
>
>cc -O2 -fno-strict-aliasing -pipe -march=pentium4 -DPTHREAD_KERNEL 
>-I/usr/src/lib/libpthread/../libc/include 
>-I/usr/src/lib/libpthread/thread 
>-I/usr/src/lib/libpthread/../../include 
>-I/usr/src/lib/libpthread/arch/i386/include 
>-I/usr/src/lib/libpthread/sys 
>-I/usr/src/lib/libpthread/../../libexec/rtld-elf 
>-I/usr/src/lib/libpthread/../../libexec/rtld-elf/i386 -fno-builtin 
>-D_LOCK_DEBUG -D_PTHREADS_INVARIANTS -Wall 
>-I/usr/src/lib/libpthread/../libc/i386  -c 
>/usr/src/lib/libpthread/thread/thr_condattr_init.c
>
>cc -O2 -fno-strict-aliasing -pipe -march=pentium4 -DPTHREAD_KERNEL 
>-I/usr/src/lib/libpthread/../libc/include 
>-I/usr/src/lib/libpthread/thread 
>-I/usr/src/lib/libpthread/../../include 
>-I/usr/src/lib/libpthread/arch/i386/include 
>-I/usr/src/lib/libpthread/sys 
>-I/usr/src/lib/libpthread/../../libexec/rtld-elf 
>-I/usr/src/lib/libpthread/../../libexec/rtld-elf/i386 -fno-builtin 
>-D_LOCK_DEBUG -D_PTHREADS_INVARIANTS -Wall 
>-I/usr/src/lib/libpthread/../libc/i386  -c 
>/usr/src/lib/libpthread/thread/thr_condattr_pshared.c
>
>make: don't know how to make 
>/usr/src.lib/libpthread/arch/i386/include/pthread_md.h. Stop
>*** Error code 2

Note that:
1) "/usr/src.lib/" does not normally exist;
2)  "." and "/" differ by 1 bit;
3) The cc line shows "-I/usr/src/lib/libpthread/arch/i386/include";
4) Compiling thr_condattr_init.c uses the same #include sequence to
   successfully load "pthread_md.h";
5) None of the test build boxes are reporting any problems.

Please try running a memory test, or swapping your RAM.
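
Point 2 is easy to check. A small sketch (helper name is mine) that compares the two characters and locates the position where the failing path differs from the expected one:

```python
def differing_bits(a, b):
    """Return the number of bit positions in which two characters differ."""
    return bin(ord(a) ^ ord(b)).count("1")

print(f"'.' = {ord('.'):#04x} = {ord('.'):08b}")
print(f"'/' = {ord('/'):#04x} = {ord('/'):08b}")
print("bits that differ:", differing_bits(".", "/"))

# Path from the make error versus the path from the cc -I option.
bad = "/usr/src.lib/libpthread/arch/i386/include/pthread_md.h"
good = "/usr/src/lib/libpthread/arch/i386/include/pthread_md.h"
flipped = [(i, g, b) for i, (g, b) in enumerate(zip(good, bad)) if g != b]
print(flipped)
```

Only one character differs between the two paths, and only in its lowest bit, which is the pattern a single-bit memory error would produce.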

-- 
Peter Jeremy



Re: (audit?) Panic in 6.2-PRERELEASE

2007-01-07 Thread Robert Watson



On Sun, 7 Jan 2007, Ceri Davies wrote:

Could you try printing *td->td_ar?  Maybe this will give us a clue as to 
how far it got.  In particular, this may be able to more reliably give us 
the file descriptor number, which is audited early in the system call. 
You might find that 'td' is corrupted in many layers of the stack, keep 
going up until you find one where it's good.  It may well be that 
td->td_ar->k_ar.ar_arg_fd is correct, and might confirm that uap->fd is 
correct still.  We'd like also to know if ARG_SOCKINFO, ARG_VNODE1, or 
ARG_VNODE2 is set in the k_ar.ar_valid_arg field.  This may tell us some 
more about the file descriptor even though it appears to have vanished.


*td->td_ar is null (0x0) in both cases...


I'm actually beginning to wonder if this is actually audit-related at all. 
Something is clearly not right, and the audit code should not actually have 
been entered at all there.  Perhaps we're being misled by the stack trace 
corruption into thinking audit is involved.


I'm quite worried by the fact that the file descriptor seems not to be 
present any more -- this suggests a file descriptor related race of the 
sort that is both quite difficult to figure out and also quite a risk. It's 
strange that it would only trigger with audit, however--perhaps audit 
stretches out the race.  Is this an SMP box?


It's certainly looking quite nasty.  This system is UP hardware without 
options SMP.


...

If it's at all useful, I can provide access to this system and the dumps.


Yeah, I think at this point that would probably be the most helpful thing.

Could you confirm that the kernel.debug you're using definitely matches the 
version of the kernel in the core dump?


Robert N M Watson
Computer Laboratory
University of Cambridge

make buildworld is always breaking at various points

2007-01-07 Thread Christoph Illnar


Hello list,

I keep having trouble compiling both 6.1-RELEASE and 6.2-RC2.
I downloaded the sources, extracted them with install.sh and did a cvsup.

My installed system is 6.1-RELEASE and I keep trying to compile it on my 
own.


Playing with various parameters did not help, and neither did switching to 
the sources of 6.2-RC2.


My compilation system is a P4-550 with 2G RAM running on an Asus P5P800.

This is my /etc/make.conf:

PERL_VER=5.8.8
PERL_VERSION=5.8.8
SUP_UPDATE=yes
SUP=/usr/local/bin/cvsup
SUPFLAGS=-L 1
SUPHOST=cvsup.at.freebsd.org
SUPFILE=/home/franz/cvsupfile
CPUTYPE=pentium4
KERNCONF=MYKERNEL
NO_ATM=true             # do not build ATM related programs and libraries
NO_BLUETOOTH=true       # do not build Bluetooth related stuff
NO_FORTRAN=true         # do not build g77 and related libraries
NO_GAMES=true           # do not build games (games/ subdir)
NO_GDB=true             # do not build GDB
NO_I4B=true             # do not build isdn4bsd package
NO_INET6=true           # do not build IPv6 related programs and
NO_IPFILTER=true        # do not build IP Filter package
NO_KERBEROS=true        # do not build and install Kerberos 5 (KTH
NO_NIS=true
NO_PROFILE=true         # Avoid compiling profiled libraries
NO_SENDMAIL=true        # do not build sendmail and related programs
PPP_NO_NAT=true         # do not build with NAT support (see
PPP_NO_NETGRAPH=true    # do not build with Netgraph support
PPP_NO_RADIUS=true      # do not build with RADIUS support
NO_BIND_LIBS_LWRES=true # Do not install the lwres library


Some of the exclusions were added because buildworld broke at those points.

This is my /usr/src/sys/i386/conf/MYKERNEL:

machine i386
cpu I686_CPU
ident   MY-P4-SMP
options SMP # Symmetric MultiProcessor
options MPTABLE_FORCE_HTT   # Enable HTT CPUs with the MP
options IPI_PREEMPTION
options SCHED_4BSD  # 4BSD scheduler
options PREEMPTION  # Enable kernel thread
options INET # InterNETworking
options FFS # Berkeley Fast Filesystem
options SOFTUPDATES # Enable FFS soft updates
options UFS_ACL # Support for access control
options UFS_DIRHASH # Improve performance on big
options MD_ROOT # MD is a potential root device
options NFSCLIENT   # Network Filesystem Client
options NFSSERVER   # Network Filesystem Server
options NFS_ROOT # NFS usable as /, requires
options PROCFS  # Process filesystem (requires
options PSEUDOFS # Pseudo-filesystem framework
options GEOM_GPT # GUID Partition Tables.
options COMPAT_43   # Compatible with BSD 4.3 [KEEP
options COMPAT_FREEBSD4 # Compatible with FreeBSD4
options COMPAT_FREEBSD5 # Compatible with FreeBSD5
options SCSI_DELAY=5000
options SYSVSHM # SYSV-style shared memory
options SYSVMSG # SYSV-style message queues
options SYSVSEM # SYSV-style semaphores
options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions

options KBD_INSTALL_CDEV # install a CDEV entry in /dev
options AHC_REG_PRETTY_PRINT # Print register bitfields in
options AHD_REG_PRETTY_PRINT # Print register bitfields in
options ADAPTIVE_GIANT  # Giant mutex is adaptive.
options SC_DISABLE_REBOOT
device  apic # I/O APIC
device  eisa
device  pci
device  ata
device  atadisk # ATA disk drives
device  atapicd # ATAPI CDROM drives
options ATA_STATIC_ID
device  atkbdc  # AT keyboard controller
device  atkbd   # AT keyboard
device  psm # PS/2 mouse
device  kbdmux  # keyboard multiplexer
device  vga # VGA video card driver
device  splash  # Splash screen and screen saver support
device  sc
device  agp # support several AGP chipsets
device  pmtimer
device  sio # 8250, 16[45]50 based serial ports
device  ppc
device  ppbus   # Parallel port bus (required)
device  lpt # Printer
device  plip # TCP/IP over parallel
device  ppi # Parallel port interface device
device  nve # nVidia nForce MCP on-board Ethernet
device  sk  # SysKonnect SK-984x & SK-982x gigabit
device  loop# Network loopback
device  random  # Entropy device
device  ether   # Ethernet support
device  pty # Pseudo-t

Re: (audit?) Panic in 6.2-PRERELEASE

2007-01-07 Thread Ceri Davies

On Sun, Jan 07, 2007 at 11:49:56AM +, Robert Watson wrote:
> On Sat, 6 Jan 2007, Ceri Davies wrote:
> 
> >>>So far it's happened this morning and yesterday morning.  I haven't seen 
> >>>it before that.  I don't know the cause so I can't reproduce it at will, 
> >>>but the logs don't give any indication.  Chances are that it will happen 
> >>>again tomorrow, but we'll see.
> >>
> >>Hmm.  It looks like you printf *(td->td_proc->p_fd->fd_ofiles) without 
> >>the array index.  Could you repeat that, but with the array index -- 
> >>i.e., td->td_proc->p_fd->fd_ofiles[uap->fd]?  Also, it would probably be 
> >>useful to print uap->fd.  Right now you're printing stdin (index 0), but 
> >>if the index is non-0, we want a different file.
> >
> >Very tactfully put :)  Sorry about that.
> >
> >None of the uap->fd's seem to be valid. In the first case, uap->fd is way 
> >too high for the length of fd_ofiles, which only has 21 elements:
> >
> >(kgdb) up 8
> >#8  0xc04c470d in fstat (td=0xc2eeb180, uap=0xd610dc74) at 
> >/usr/src/sys/kern/kern_descrip.c:1075
> >1075            error = kern_fstat(td, uap->fd, &ub);
> >(kgdb) p uap->fd
> >$1 = 89
> >(kgdb) p *td->td_proc->p_fd->fd_ofiles[uap->fd]
> >Cannot access memory at address 0x0
> >
> >In the second, uap->fd is nonsense:
> >
> >(kgdb) up 8
> >#8  0xc04c470d in fstat (td=0xc3109300, uap=0xd617ec74) at 
> >/usr/src/sys/kern/kern_descrip.c:1075
> >1075            error = kern_fstat(td, uap->fd, &ub);
> >(kgdb) p uap->fd
> >$1 = -1023449232
> >(kgdb)
> 
> Hmm.  So, I reviewed audit_arg_file() closely, and after staring at the 
> code a lot, couldn't see anything obvious in either the socket or the 
> vnode/fifo case.  I did fix one other bug there, however, which can never 
> actually be exercised in 7-CURRENT, and is fairly unlikely in 6-STABLE, and 
> will MFC that in a week.

OK, thanks.

> Could you try printing *td->td_ar?  Maybe this will give us a clue as to 
> how far it got.  In particular, this may be able to more reliably give us 
> the file descriptor number, which is audited early in the system call.  You 
> might find that 'td' is corrupted in many layers of the stack, keep going 
> up until you find one where it's good.  It may well be that 
> td->td_ar->k_ar.ar_arg_fd is correct, and might confirm that uap->fd is 
> correct still.  We'd like also to know if ARG_SOCKINFO, ARG_VNODE1, or 
> ARG_VNODE2 is set in the k_ar.ar_valid_arg field.  This may tell us some 
> more about the file descriptor even though it appears to have vanished.

*td->td_ar is null (0x0) in both cases...

> I'm quite worried by the fact that the file descriptor seems not to be 
> present any more -- this suggests a file descriptor related race of the 
> sort that is both quite difficult to figure out and also quite a risk.  
> It's strange that it would only trigger with audit, however--perhaps audit 
> stretches out the race.  Is this an SMP box?

It's certainly looking quite nasty.  This system is UP hardware without
options SMP.

> Could you print the entire contents of *td->td_proc->p_fd?

First case:

(kgdb) p *td->td_proc->p_fd
$2 = {fd_ofiles = 0xc3441000, fd_ofileflags = 0xc3441100 "", fd_cdir = 
0xc367f110, 
  fd_rdir = 0xc2ce2bb0, fd_jdir = 0x0, fd_nfiles = 64, fd_map = 0xc3b65970, 
fd_lastfile = 20, 
  fd_freefile = 16, fd_cmask = 63, fd_refcnt = 1, fd_holdcnt = 1, fd_mtx = 
{mtx_object = {
  lo_class = 0xc06ad4c4, lo_name = 0xc067c0fd "filedesc structure", 
  lo_type = 0xc067c0fd "filedesc structure", lo_flags = 196608, lo_list = 
{tqe_next = 0x0, 
tqe_prev = 0x0}, lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, 
fd_locked = 0, 
  fd_wanted = 0, fd_kqlist = {slh_first = 0x0}, fd_holdleaderscount = 0, 
fd_holdleaderswakeup = 0}

Second case:

(kgdb) p *td->td_proc->p_fd
$2 = {fd_ofiles = 0xc2d23600, fd_ofileflags = 0xc2d23700 "", fd_cdir = 
0xc31b8660, 
  fd_rdir = 0xc2ce2bb0, fd_jdir = 0x0, fd_nfiles = 64, fd_map = 0xc2e9c1c0, 
fd_lastfile = 20, 
  fd_freefile = 17, fd_cmask = 63, fd_refcnt = 1, fd_holdcnt = 1, fd_mtx = 
{mtx_object = {
  lo_class = 0xc06ad4c4, lo_name = 0xc067c0fd "filedesc structure", 
  lo_type = 0xc067c0fd "filedesc structure", lo_flags = 196608, lo_list = 
{tqe_next = 0x0, 
tqe_prev = 0x0}, lo_witness = 0x0}, mtx_lock = 4, mtx_recurse = 0}, 
fd_locked = 0, 
  fd_wanted = 0, fd_kqlist = {slh_first = 0x0}, fd_holdleaderscount = 0, 
fd_holdleaderswakeup = 0}
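
As a rough cross-check of the two `uap->fd` values against these dumps, here is a sketch that classifies a descriptor number against `fd_nfiles` and `fd_lastfile`, and reinterprets the negative value as an unsigned 32-bit quantity. The function names are mine and this only models the range check, not the kernel's actual lookup code.

```python
# Values copied from the kgdb output quoted above.
FD_NFILES = 64      # fd_nfiles: slots allocated in fd_ofiles
FD_LASTFILE = 20    # fd_lastfile: highest descriptor in use

def classify_fd(fd, nfiles=FD_NFILES, lastfile=FD_LASTFILE):
    """Rough classification of a descriptor number against the table."""
    if fd < 0 or fd >= nfiles:
        return "outside the table"
    if fd > lastfile:
        return "inside the table but above fd_lastfile"
    return "plausible"

def as_u32_hex(value):
    """Reinterpret a signed 32-bit int as unsigned and format it in hex."""
    return f"{value & 0xFFFFFFFF:#010x}"

for fd in (89, -1023449232):
    print(fd, classify_fd(fd), as_u32_hex(fd))
```

Both values fall outside the table. Read as unsigned, the second one is 0xc2ff6770, which sits in the same range as the kernel addresses in the backtrace (for example td=0xc3109300). That it is a stray pointer rather than a descriptor is only a guess, but it would fit with the argument structure having been overwritten.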

If it's at all useful, I can provide access to this system and the
dumps.

Ceri
-- 
That must be wonderful!  I don't understand it at all.
  -- Moliere



Re: kernel panic on 6.2-RC2 with GENERIC.

2007-01-07 Thread Craig Rodrigues

On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
> Hello folks.
> I have kernel panic on GENERIC kernel while executing postmark.

The sequence of steps that Nikolay used to produce this panic was:

- install benchmarks/postmark from ports

root# postmark
PostMark v1.5 : 3/27/01
pm>set number=1
pm>set transactions=1
pm>set subdirectories=1
pm>show
pm>run

-- 
Craig Rodrigues
[EMAIL PROTECTED]

Re: Intel PRO/1000 PT Desktop NIC, PCIe 1x supported in FreeBSD 6.2?

2007-01-07 Thread Erik Trulsson

On Sun, Jan 07, 2007 at 04:56:26PM +0100, O. Hartmann wrote:
> Hello,
> the company I work for is about to purchase some additional NICs to
> replace the built-in NICs of nForce 405-based desktop PCs. I would
> like to purchase the above-mentioned NICs from Intel, hoping the em()
> driver is capable of handling the NICs.

I have one of those cards (Intel PRO/1000 PT Desktop NIC) myself, and it
works just fine under 6-STABLE.  So far I have not had any problems at all
with it.


> I found in the list of supported hardware that Intel's PRO/1000 series is
> supported by the em() driver. The nVidia 4XX chipset is also said to be
> supported, but not the specific Realtek PHY chip used on some
> ASRock boards (AM2NF6G-VSTA), so I would like to ask here first.
> 
> Other suggestions for good, stable and fast NICs are welcome. Thanks.
> 
> Regards,
> Oliver
> 
> P.S. I have no problem moving to 6-STABLE after 6.2 is
> released, so if you plan to integrate support for the nVidia nForce 405
> and/or this specific Intel PRO/1000 NIC shortly after the
> launch of 6.2-RELEASE, I would also welcome positive answers about this.
> 
> -
> Intel PRO/1000 PT Desktop Adapter, 1x 1000Base-T, PCIe x1, low profile 
> (EXPI9300PTL)



-- 

Erik Trulsson
[EMAIL PROTECTED]

Re: Intel PRO/1000 PT Desktop NIC, PCIe 1x supported in FreeBSD 6.2?

2007-01-07 Thread O. Hartmann


Erik Trulsson wrote:

On Sun, Jan 07, 2007 at 04:56:26PM +0100, O. Hartmann wrote:

Hello,
the company I work for is about to purchase some additional NICs to
replace the built-in NICs of nForce 405-based desktop PCs. I would
like to purchase the above-mentioned NICs from Intel, hoping the em()
driver is capable of handling the NICs.


I have one of those cards (Intel PRO/1000 PT Desktop NIC) myself, and it
works just fine under 6-STABLE.  So far I have not had any problems at all
with it.


I found in the list of supported hardware that Intel's PRO/1000 series is
supported by the em() driver. The nVidia 4XX chipset is also said to be
supported, but not the specific Realtek PHY chip used on some
ASRock boards (AM2NF6G-VSTA), so I would like to ask here first.


Other suggestions for good, stable and fast NICs are welcome. Thanks.

Regards,
Oliver

P.S. I have no problem moving to 6-STABLE after 6.2 is
released, so if you plan to integrate support for the nVidia nForce 405
and/or this specific Intel PRO/1000 NIC shortly after the
launch of 6.2-RELEASE, I would also welcome positive answers about this.


-
Intel PRO/1000 PT Desktop Adapter, 1x 1000Base-T, PCIe x1, low profile 
(EXPI9300PTL)





Thank you,
that helps a lot :-)

Regards,
Oliver


Re: kernel panic on 6.2-RC2 with GENERIC.

2007-01-07 Thread Nikolay Pavlov

On Friday,  5 January 2007 at 18:00:29 -0500, Craig Rodrigues wrote:
> On Fri, Jan 05, 2007 at 06:59:10PM +0200, Nikolay Pavlov wrote:
> > Hello folks.
> > I have kernel panic on GENERIC kernel while executing postmark.
> 
> What is postmark?
> Can you give the exact sequence of steps used to produce this panic?

Sure:

benchmarks/postmark

PostMark is the benchmark used in the NetApp Technical Report TR-3022,
"PostMark: A New File System Benchmark".  The paper fully explains how
to use this tool.

From the paper's Abstract:
Existing file system benchmarks are deficient in portraying
performance in the ephemeral small-file regime used by Internet
software, especially:
* electronic mail
* netnews
* web-based commerce

PostMark is a new benchmark to measure performance for this class of
application.

WWW: http://www.netapp.com/tech_library/3022.html

root# postmark
PostMark v1.5 : 3/27/01
pm>set number=1
pm>set transactions=1
pm>set subdirectories=1
pm>show
Current configuration is:
The base number of files is 1
Transactions: 1
Files range between 500 bytes and 9.77 kilobytes in size
Working directory: /usr/home/quetzal
1 subdirectories will be used
Block sizes are: read=512 bytes, write=512 bytes
Biases are: read/append=5, create/delete=5
Using Unix buffered file I/O
Random number generator seed is 42
Report format is verbose.

And then:
pm>run

Actually, I can trigger this panic even with rm -rf on "some dir with many
files", or with background fsck after a crash. I can also trigger it with
rsync with many (~100G) files.
My system is very unstable with the 6.2-RC2 kernel, but with the 6.1 kernel
I can't crash it.

Here are the successful postmark results for 6.1:

Creating subdirectories...Done
Creating files...Done
Performing transactions..Done
Deleting files...Done
Deleting subdirectories...Done
Time:
1196 seconds total
556 seconds of transactions (17 per second)

Files:
15027 created (12 per second)
Creation alone: 1 files (32 per second)
Mixed with transactions: 5027 files (9 per second)
4990 read (8 per second)
5009 appended (9 per second)
15027 deleted (12 per second)
Deletion alone: 10054 files (30 per second)
Mixed with transactions: 4973 files (8 per second)

Data:
27.14 megabytes read (23.24 kilobytes per second)
85.08 megabytes written (72.84 kilobytes per second)
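
As a sanity check on this report, here is a short sketch that recomputes the per-second rates from the totals and derives the files handled outside the transaction phase by subtraction. All figures are copied from the 6.1 output above; the variable and function names are only for illustration.

```python
# Figures copied from the 6.1 report above.
total_s = 1196
read_mb, written_mb = 27.14, 85.08
created, deleted = 15027, 15027
created_in_txn, deleted_in_txn = 5027, 4973

def kb_per_s(megabytes, seconds):
    """Convert a PostMark megabyte total into kilobytes per second."""
    return megabytes * 1024 / seconds

print(f"read:    {kb_per_s(read_mb, total_s):.2f} KB/s")
print(f"written: {kb_per_s(written_mb, total_s):.2f} KB/s")
print(f"created: {created // total_s} per second")
# Files created/deleted outside the transaction phase follow by subtraction.
print("created alone:", created - created_in_txn)
print("deleted alone:", deleted - deleted_in_txn)
```

The recomputed rates match the report. The subtraction gives 10054 for "Deletion alone", as printed, and 10000 for "Creation alone", where the report as archived shows 1. That difference may be an artifact of how the message was archived, but I cannot confirm that from the text alone.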

> 
> -- 
> Craig Rodrigues
> [EMAIL PROTECTED]

-- 
==  
- Best regards, Nikolay Pavlov. <<<---
==  


Intel PRO/1000 PT Desktop NIC, PCIe 1x supported in FreeBSD 6.2?

2007-01-07 Thread O. Hartmann


Hello,
the company I work for is about to purchase some additional NICs to
replace the built-in NICs of nForce 405-based desktop PCs. I would
like to purchase the above-mentioned NICs from Intel, hoping the em()
driver is capable of handling the NICs.
I found in the list of supported hardware that Intel's PRO/1000 series is
supported by the em() driver. The nVidia 4XX chipset is also said to be
supported, but not the specific Realtek PHY chip used on some
ASRock boards (AM2NF6G-VSTA), so I would like to ask here first.


Other suggestions for good, stable and fast NICs are welcome. Thanks.

Regards,
Oliver

P.S. I have no problem moving to 6-STABLE after 6.2 is
released, so if you plan to integrate support for the nVidia nForce 405
and/or this specific Intel PRO/1000 NIC shortly after the
launch of 6.2-RELEASE, I would also welcome positive answers about this.


-
Intel PRO/1000 PT Desktop Adapter, 1x 1000Base-T, PCIe x1, low profile 
(EXPI9300PTL)

--

O. Hartmann


fxp(4) and lockups on RELENG_6_x

2007-01-07 Thread Krzysztof Kowalik

Hello.

We are running an (IRC) server that stops responding on the network under
high-rate traffic (i.e. a DDoS attack). The network remains locked up
even after the original attack stops. However, running tcpdump (which
switches the interface into promiscuous mode) unlocks networking and things
work again.

At the moment, we are running 6.2-RC1 cvsupped on Dec 10, with if_fxp.c
from Nov 11. (Previously we ran 6.1 for a while, with the same issues.)

if_fxp.c,v 1.240.2.10.2.1 2006/11/20 16:21:12

The same machine used to run FreeBSD 4.11 without any problems.

Any help/pointers/suggestions would be appreciated.

More hardware details:

[EMAIL PROTECTED]:3:0:  class=0x02 card=0x10408086 chip=0x12298086 rev=0x0c hdr=0x00
    vendor   = 'Intel Corporation'
    device   = '82550/1/7/8/9 EtherExpress PRO/100(B) Ethernet Adapter'
    class    = network
    subclass = ethernet

fxp0:  port 0xc800-0xc83f mem
0xd902-0xd9020fff,0xd900-0xd901 irq 11 at device 3.0 on pci2
miibus0:  on fxp0
fxp0: Ethernet address: 00:02:b3:90:65:86

interrupt       total   rate
irq11: fxp0     67322      0

-- 
() ASCII Ribbon Campaign
/\ Support plain text e-mail

Re: Fatal trap 12: page fault while in kernel mode

2007-01-07 Thread Marc G. Fournier


Working on upgrading and applying patch right now ... thanks ...

--On Sunday, January 07, 2007 14:03:41 + Robert Watson 
<[EMAIL PROTECTED]> wrote:

>
> On Sat, 6 Jan 2007, Marc G. Fournier wrote:
>
>> Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17
>> 01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core if
>> there is information that I can provide out of it ...
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address   = 0x18c
>> fault code  = supervisor read, page not present
>> instruction pointer = 0x8:0x801f9053
>> stack pointer   = 0x10:0xb5c78b30
>> frame pointer   = 0x10:0xb5c78b60
>> code segment= base 0x0, limit 0xf, type 0x1b
>>= DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags= resume, IOPL = 0
>> current process = 5 (thread taskq)
>> trap number = 12
>> panic: page fault
>> cpuid = 0
>> Uptime: 8d22h25m40s
>>
>> (kgdb) where
>> # 0  doadump () at pcpu.h:172
>> # 1  0x80203955 in boot (howto=260) at
>> /usr/src/sys/kern/kern_shutdown.c:409
>> # 2  0x80204065 in panic (fmt=0xff019b667720
>> "X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ") at
>> /usr/src/sys/kern/kern_shutdown.c:565
>> # 3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at
>> /usr/src/sys/amd64/amd64/trap.c:660
>> # 4  0x80328cd8 in trap (frame=
>>  {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730,
>> tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx =
>> -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0,
>> tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12,
>> tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085,
>> tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at
>> /usr/src/sys/amd64/amd64/trap.c:238
>> # 5  0x80313c6b in calltrap () at
>> /usr/src/sys/amd64/amd64/exception.S:168
>> # 6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0,
>> tid=18446742981100074784, opts=6, file=0xc102 > bounds>, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
>> # 7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at
>> /usr/src/sys/kern/uipc_usrreq.c:1714
>> # 8  0x8022c314 in taskqueue_run (queue=0xff844800) at
>> /usr/src/sys/kern/subr_taskqueue.c:257
>> # 9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at
>> /usr/src/sys/kern/subr_taskqueue.c:376
>> # 10 0x801e7b76 in fork_exit (callout=0x8022d060
>> , arg=0x805030d0, frame=0xb5c78c50) at
>> /usr/src/sys/kern/kern_fork.c:821
>> # 11 0x80313fce in fork_trampoline () at
>> /usr/src/sys/amd64/amd64/exception.S:394
>
> This is a NULL pointer dereference in the UNIX domain socket code.  John
> Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT,
> with an MFC planned in the near future.  The fix won't make 6.2-RELEASE, but
> assuming it tests out well over the next few weeks, we will cut an errata
> patch/announcement for it.  I believe you can pull down his 6-STABLE version
> at:
>
>http://people.FreeBSD.org/~jhb/patches/unp_gc.patch
>
> This same patch is currently in testing on mx1.FreeBSD.org.
>
> (John CC'd)
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge



--
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


Re: Fatal trap 12: page fault while in kernel mode

2007-01-07 Thread Robert Watson



On Sat, 6 Jan 2007, Marc G. Fournier wrote:

Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17 
01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core if 
there is information that I can provide out of it ...


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x18c
fault code  = supervisor read, page not present
instruction pointer = 0x8:0x801f9053
stack pointer   = 0x10:0xb5c78b30
frame pointer   = 0x10:0xb5c78b60
code segment= base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= resume, IOPL = 0
current process = 5 (thread taskq)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 8d22h25m40s

(kgdb) where
#0  doadump () at pcpu.h:172
#1  0x80203955 in boot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:409
#2  0x80204065 in panic (fmt=0xff019b667720
"X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ") at
/usr/src/sys/kern/kern_shutdown.c:565
#3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at
/usr/src/sys/amd64/amd64/trap.c:660
#4  0x80328cd8 in trap (frame=
 {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730,
tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx =
-1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0, tf_r12
= 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12, tf_addr =
396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085, tf_cs = 8,
tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at
/usr/src/sys/amd64/amd64/trap.c:238
#5  0x80313c6b in calltrap () at
/usr/src/sys/amd64/amd64/exception.S:168
#6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0,
tid=18446742981100074784, opts=6, file=0xc102 , line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
#7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at
/usr/src/sys/kern/uipc_usrreq.c:1714
#8  0x8022c314 in taskqueue_run (queue=0xff844800) at
/usr/src/sys/kern/subr_taskqueue.c:257
#9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at
/usr/src/sys/kern/subr_taskqueue.c:376
#10 0x801e7b76 in fork_exit (callout=0x8022d060
, arg=0x805030d0, frame=0xb5c78c50) at
/usr/src/sys/kern/kern_fork.c:821
#11 0x80313fce in fork_trampoline () at
/usr/src/sys/amd64/amd64/exception.S:394


This is a NULL pointer dereference in the UNIX domain socket code.  John 
Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT, 
with an MFC planned in the near future.  The fix won't make 6.2-RELEASE, but 
assuming it tests out well over the next few weeks, we will cut an errata 
patch/announcement for it.  I believe you can pull down his 6-STABLE version 
at:


  http://people.FreeBSD.org/~jhb/patches/unp_gc.patch

This same patch is currently in testing on mx1.FreeBSD.org.
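
A side note on reading the panic output quoted above: kgdb prints the trap 
frame registers as signed decimals, while the panic header uses hex.  The 
conversion can be sketched in Python as below.  The helper names and the 
4096-byte page size are illustrative assumptions, not anything taken from the 
kernel sources, and treating a fault inside the first page as "NULL plus a 
field offset" is only a heuristic.

```python
# Decode signed-decimal trap frame values (as printed by kgdb) into hex.
# Helper names and PAGE_SIZE are illustrative assumptions.

PAGE_SIZE = 4096  # assumed page size


def to_u64(value):
    """Reinterpret a signed 64-bit integer as an unsigned one."""
    return value & 0xFFFFFFFFFFFFFFFF


def low32_hex(value):
    """Return the low 32 bits of a value as a hex string."""
    return hex(value & 0xFFFFFFFF)


def looks_like_null_deref(fault_addr):
    """Heuristic: a fault in the first page suggests a NULL base pointer
    plus a small structure field offset."""
    return 0 <= fault_addr < PAGE_SIZE


# Values copied from the trap frame in the backtrace above.
frame = {"tf_addr": 396, "tf_rip": -2145415085, "tf_rbp": -1245213856}

print(hex(frame["tf_addr"]))                    # fault virtual address
print(low32_hex(frame["tf_rip"]))               # instruction pointer
print(low32_hex(frame["tf_rbp"]))               # frame pointer
print(looks_like_null_deref(frame["tf_addr"]))  # first-page fault?
```

The low 32 bits computed this way agree with the fault address, instruction 
pointer and frame pointer lines of the panic message as they appear in this 
archive.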

(John CC'd)

Robert N M Watson
Computer Laboratory
University of Cambridge

Re: Livelock in 6.2-RC1

2007-01-07 Thread Robert Watson



On Sat, 6 Jan 2007, Frode Nordahl wrote:

I am experiencing a rare livelock on four of my backend mail servers running 
6.1-STABLE, 6.2-BETA2 and 6.2-RC1. They are running OpenLDAP slapd, postfix 
and UW-IMAPD.


The servers can run for months without any problem, but nevertheless I have 
experienced this problem on multiple versions and different hardware 
configurations about 5 times since September/October 2006.


Server is responding to pings, but all other activity halts.

On one occasion when one of the servers displayed this behaviour it managed 
to recover from the situation by itself after being gone for 20-30 minutes.


Recovery is a sign of possible livelock, but otherwise this description sounds 
more like deadlock than livelock.  Note that deadlock can be in a specific 
subsystem, so other services may still keep running -- for example, interrupts 
and the in-bound network stack generally have no interaction with the file 
system, so a file system deadlock can leave ping and the keyboard working. 
The first step in diagnosing both livelock and deadlock is to figure out what 
the system is actually doing.  I'd start out with the following commands:


show pcpu
show allpcpu
trace
alltrace
ps
show lockedvnods
show locks
show alllocks

(The last two won't work unless you have WITNESS compiled in).  The fact that 
you can get into the debugger and run debugging commands is a good sign; the 
fact that the debugger breaks into the idle thread suggests that the system 
has at least one idle CPU.


Robert N M Watson
Computer Laboratory
University of Cambridge



Typical hardware configuration:
CPU  2x Xeon 3.06GHz or 1x Core2Duo 2.00GHz (SMP)
RAM  4 GB RAM
DISK Intel SRCU42X (amr) or Dell PERC 5/i (mfi)

Kernel config:
include GENERIC
options KDB # Enable kernel debugger support.
options BREAK_TO_DEBUGGER
options DDB # Support DDB.
options GDB # Support remote GDB.
options QUOTA
options SMP

On the last crash I collected the following info from DDB:
db> tr
Tracing pid 11 tid 15 td 0xc8f90780
kdb_enter(c092f08b) at kdb_enter+0x2b
siointr1(c9120800) at siointr1+0xce
siointr(c9120800) at siointr+0x5e
intr_execute_handlers(c8f864c8,e7b14c94,4,e7b14cd8,c0889503,...) at intr_execute_handlers+0xe1
lapic_handle_intr(3d) at lapic_handle_intr+0x2e
Xapic_isr1() at Xapic_isr1+0x33
--- interrupt, eip = 0xc0b5b0e5, esp = 0xe7b14cd8, ebp = 0xe7b14cd8 ---
acpi_cpu_c1(0,0,e7b14cf8,c8f90780,1,...) at acpi_cpu_c1+0x5
acpi_cpu_idle(e7b14d10,c066a779,c8f8fa78,c066a6e4,e7b14d24,...) at acpi_cpu_idle+0x152
cpu_idle(c8f8fa78,c066a6e4,e7b14d24,c066a465,0,...) at cpu_idle+0x28
idle_proc(0,e7b14d38) at idle_proc+0x95
fork_exit(c066a6e4,0,e7b14d38) at fork_exit+0x71
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe7b14d6c, ebp = 0 ---


db> show lockedbufs
buf at 0xdd08cbd0
b_flags = 0x2000
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_bufobj = (0xc937ed80), b_data = 0xdea14000, b_blkno = 14386688
b_npages = 4, pages(OBJ, IDX, PA): (0xc1045210, 0x1b70c0, 
0xdbe35000),(0xc1045210, 0x1b70c1, 0xc17d6000),(0xc1045210, 0x1b70c2, 
0x582d7000),(0xc1045210, 0x1b70c3, 0x84498000)


I have a crashdump or two available for further investigation.

--
Frode Nordahl




Re: debugging kernel options

2007-01-07 Thread Robert Watson

On Sat, 30 Dec 2006, Karol Kwiatkowski wrote:

Robert Watson wrote:

On Sat, 30 Dec 2006, Karol Kwiatkowski wrote:

Robert Watson wrote:

P.S. out of curiosity - now that I have configured kernel with DDB and
KDB options, is there any performance penalty of running such kernel?

No, it shouldn't really have any effect on performance. The one thing to
watch out for is that your system will no longer reboot automatically on
a panic, as it will drop to the debugger, by default. You can change
this by setting debug.debugger_on_panic to 0, in which case you will
likely want to set debug.trace_on_panic to 1 so it prints a stack trace
before rebooting (which is often sufficient, combined with the trap frame
and panic message to debug the problem).

Right now these are sysctls, not tunables, but you can change the default
using options KDB_UNATTENDED (which flips the default to not entering the
debugger and rebooting) and options KDB_TRACE (which causes a trace to be
printed on panic by default). Probably they should also be tunables so
that loader.conf entries will work.

Great explanation, thank you. I turned on debugging on my desktop computer
which, apart from normal every day use, is 'testing' STABLE by running it
:) I'm perfectly fine with the defaults, at least for now.

BTW, I have added some new documentation to the Developer's Handbook on the
various compile-time kernel debugging options, what their impact is, etc:

http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-options.html

The kernel debugging section of the Developer's Handbook seems to be getting a
bit long in the tooth; I may have a chance to do some further updating of it
this weekend. In particular, it seems to focus mostly on crash dumps, and
many problems are more easily debugged using information in DDB.

BTW, if you're running X on your desktop, be aware that it's X that does
all the video mode management. If your box enters the debugger while in X,
the debugger doesn't know how to switch back to text mode (and X isn't
running, obviously), so while you'll be talking to the debugger, the
chances you'll see anything comprehensible are actually quite low. For this
reason, I normally also use a serial console when debugging desktop boxes:
I can always plug my notebook in with a serial cable to see why it's
entered the debugger.

Right, I haven't thought about that. I guess without a serial console my
best option is to set debug.debugger_on_panic to 0, debug.trace_on_panic to
1 and keep crash dump with kernel.debug for later examination, isn't it? The
whole point of doing this, as I am not really experienced in debugging, is
to have the information saved somewhere in case of a panic.

Yes -- if you have no firewire/serial console option (i.e., no extra notebook
and null modem cable, or no serial port), then crash dumps are the best way to
go. Setting the sysctls as above is good.
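
For reference, the settings discussed above could be written down roughly as 
follows.  This is a sketch that assumes the sysctls are applied at boot from 
/etc/sysctl.conf and that the options go into a custom kernel configuration 
file; it uses only the names mentioned in this thread.

```
# /etc/sysctl.conf: on panic, print a stack trace and reboot
# instead of dropping into the debugger
debug.debugger_on_panic=0
debug.trace_on_panic=1

# Kernel configuration file: make the same behaviour the compiled-in default
options KDB_UNATTENDED
options KDB_TRACE
```

Either half is enough on its own: the sysctls change the behaviour of a 
running system, while the kernel options only change the defaults.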

Something I've been thinking of doing for a while is adding a scripting
facility to DDB, which would allow you to have a script of DDB commands run on
crash but before the dump, displaying useful debugging information which would
then appear in the dump itself...

Robert N M Watson
Computer Laboratory
University of Cambridge

Re: (audit?) Panic in 6.2-PRERELEASE

2007-01-07 Thread Robert Watson



On Sat, 6 Jan 2007, Ceri Davies wrote:

So far it's happened this morning and yesterday morning.  I haven't seen 
it before that.  I don't know the cause so I can't reproduce it at will, 
but the logs don't give any indication.  Chances are that it will happen 
again tomorrow, but we'll see.


Hmm.  It looks like you printed *(td->td_proc->p_fd->fd_ofiles) without the 
array index.  Could you repeat that, but with the array index -- i.e., 
td->td_proc->p_fd->fd_ofiles[uap->fd]?  Also, it would probably be useful 
to print uap->fd.  Right now you're printing stdin (index 0), but if the 
index is non-0, we want a different file.


Very tactfully put :)  Sorry about that.

None of the uap->fd's seem to be valid. In the first case, uap->fd is way 
too high for the length of fd_ofiles, which only has 21 elements:


(kgdb) up 8
#8  0xc04c470d in fstat (td=0xc2eeb180, uap=0xd610dc74) at 
/usr/src/sys/kern/kern_descrip.c:1075
1075        error = kern_fstat(td, uap->fd, &ub);
(kgdb) p uap->fd
$1 = 89
(kgdb) p *td->td_proc->p_fd->fd_ofiles[uap->fd]
Cannot access memory at address 0x0

In the second, uap->fd is nonsense:

(kgdb) up 8
#8  0xc04c470d in fstat (td=0xc3109300, uap=0xd617ec74) at 
/usr/src/sys/kern/kern_descrip.c:1075
1075        error = kern_fstat(td, uap->fd, &ub);
(kgdb) p uap->fd
$1 = -1023449232
(kgdb)
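
The two sessions above show why a descriptor has to be range-checked before 
it is used as an index into fd_ofiles.  A minimal Python sketch of such a 
check follows; `lookup_fd` and the list standing in for fd_ofiles are 
hypothetical illustrations, not the kernel's code.  The explicit lower bound 
matters in Python in particular, where a negative index would otherwise 
either wrap around to the end of the list or raise an error.

```python
# Hypothetical stand-in for indexing fd_ofiles safely; not kernel code.

def lookup_fd(ofiles, fd):
    """Return the open-file entry for fd, or None when fd is out of
    range or the slot is empty."""
    if fd < 0 or fd >= len(ofiles):
        return None
    return ofiles[fd]


# A 21-slot table like the one in the first core dump.  Only the first
# three slots are filled, with placeholder strings assumed for illustration.
ofiles = ["stdin", "stdout", "stderr"] + [None] * 18

print(lookup_fd(ofiles, 0))            # valid descriptor
print(lookup_fd(ofiles, 89))           # first dump: past the end of the table
print(lookup_fd(ofiles, -1023449232))  # second dump: nonsense value
```

Both values seen in the dumps fail the range check, which is consistent with 
kgdb being unable to read the corresponding table entry.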


Hmm.  So, I reviewed audit_arg_file() closely, and after staring at the code a 
lot, couldn't see anything obvious in either the socket or the vnode/fifo 
case.  I did fix one other bug there, however, which can never actually be 
exercised in 7-CURRENT, and is fairly unlikely in 6-STABLE, and will MFC that 
in a week.


Could you try printing *td->td_ar?  Maybe this will give us a clue as to how 
far it got.  In particular, this may be able to more reliably give us the file 
descriptor number, which is audited early in the system call.  You might find 
that 'td' is corrupted in many layers of the stack; keep going up until you 
find one where it's good.  It may well be that td->td_ar->k_ar.ar_arg_fd is 
correct, and might confirm that uap->fd is still correct.  We'd also like to 
know if ARG_SOCKINFO, ARG_VNODE1, or ARG_VNODE2 is set in the 
k_ar.ar_valid_arg field.  This may tell us some more about the file descriptor 
even though it appears to have vanished.


I'm quite worried by the fact that the file descriptor seems not to be present 
any more -- this suggests a file descriptor related race of the sort that is 
both quite difficult to figure out and also quite a risk.  It's strange that 
it would only trigger with audit, however--perhaps audit stretches out the 
race.  Is this an SMP box?


Could you print the entire contents of *td->td_proc->p_fd?

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge
