Re: LOR in mpr(4)

2016-10-23 Thread geoffroy desvernay
On 10/19/2016 06:39 PM, Pete Wright wrote:
> 
> the issue you are seeing is most likely not related to the LOR from the
> original email and PR I filed.  This looks like a media error with the
> disk device on your RAID controller.  A quick Google search turns up
> quite a few threads on this - ranging from bad RAID/JBOD controllers to
> out of date firmware.
> 
> Cheers,
> -pete
> 

Thank you for your response. I'll take more time to check the
controller. I just fear that this Dell-repackaged Avago controller may
have some crappy 'Dell' firmware… I'll have to look closer, then :)

Cheers,

dgeo.
-- 
*geoffroy desvernay*
C.R.I - Administration systèmes et réseaux
Ecole Centrale de Marseille
Tel: (+33|0)4 91 05 45 24
Fax: (+33|0)4 91 05 45 98
d...@centrale-marseille.fr

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: LOR in mpr(4)

2016-10-19 Thread geoffroy desvernay
On 11/17/2015 21:43, Pete Wright wrote:
> 
> 
> On 11/12/15 09:44, Pete Wright wrote:
>> Hi All,
>> Just wanted a sanity check before filing a PR.  I am running r290688 and
>> am seeing a LOR being triggered in the mpr(4) device:
>>
>> $ uname -ar
>> FreeBSD srd0013 11.0-CURRENT FreeBSD 11.0-CURRENT #1 r290688: Wed Nov 11
>> 21:28:26 PST 2015 root@srd0013:/usr/obj/usr/src/sys/GENERIC  amd64
>>
>> 
>> lock order reversal:
>>  1st 0xf8000d26bc60 CAM device lock (CAM device lock) @
>> /usr/src/sys/cam/cam_xpt.c:784
>>  2nd 0xfe00012811c0 MPR lock (MPR lock) @
>> /usr/src/sys/cam/cam_xpt.c:2620
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfe04608ee890
>> witness_checkorder() at witness_checkorder+0xe79/frame 0xfe04608ee910
>> __mtx_lock_flags() at __mtx_lock_flags+0xa4/frame 0xfe04608ee960
>> xpt_action_default() at xpt_action_default+0xb6c/frame 0xfe04608ee9b0
>> scsi_scan_bus() at scsi_scan_bus+0x1d5/frame 0xfe04608eea20
>> xpt_scanner_thread() at xpt_scanner_thread+0x15c/frame 0xfe04608eea70
>> fork_exit() at fork_exit+0x84/frame 0xfe04608eeab0
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe04608eeab0
>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>> 
> 
> FWIW I filed the following PR as I can still reproduce this on boot:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204614
> 
> cheers,
> -pete
> 
Hi all,

Sorry for cross-posting; please let me know where this should go, I
couldn't figure it out :(

I'm on 11-RELEASE-p1 here (but replying on current@, where I found
something about mpr(4)).

Not sure if it's related, but this is a brand-new machine with an Avago
SAS3008 and a 24-disk enclosure (single-attached).

I see a bunch of:

mpr0: Found device <401,End Device> <12.0Gbps> handle<0x001b>
enclosureHandle<0x0002> slot 8
(da0:mpr0:0:8:0): UNMAPPED
(da0:mpr0:0:8:0): CAM status: SCSI Status Error
(da0:mpr0:0:8:0): SCSI status: Check Condition
(da0:mpr0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command
operation code)
(da0:mpr0:0:8:0): Error 22, Unretryable error
10:0): UNMAPPED
(da0:mpr0:0:8:0): READ(10). CDB: 28 00 e8 e0 88 71 00 00 04 00
(da0:mpr0:0:8:0): CAM status: SCSI Status Error
(da0:mpr0:0:8:0): SCSI status: Check Condition
(da0:mpr0:0:8:0): SCSI sense: ILLEGAL REQUEST asc:20,0 (Invalid command
operation code)
(da0:mpr0:0:8:0): Error 22, Unretryable error
ses0: da0: Element descriptor: 'Drive Slot 0'
ses0: da0: SAS Device Slot Element: 2 Phys at Slot 0
ses0:  phy 0: SAS device type 1 id 0
ses0:  phy 0: protocols: Initiator( None ) Target( SSP )
ses0:  phy 0: parent 520474729974b57f addr 5000c50097ce8215
ses0:  phy 1: SAS device type 1 id 1
ses0:  phy 1: protocols: Initiator( None ) Target( SSP )
ses0:  phy 1: parent 520474729974b5ff addr 5000c50097ce8216

(more complete dmesg.boot here: http://dgeo.perso.ec-m.fr/dmesg.boot )
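
The READ(10) line in the log can be unpacked by hand. A minimal sketch, assuming the standard 10-byte READ(10) layout (opcode 0x28, big-endian LBA in bytes 2-5, transfer length in bytes 7-8); the helper name decode_read10 is made up for this illustration:

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # logical block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }

# CDB copied from the da0 error above.
fields = decode_read10("28 00 e8 e0 88 71 00 00 04 00")
print(fields)
```

This only shows which blocks the rejected request asked for (4 blocks at LBA 3907029105); it says nothing about why the device answered ILLEGAL REQUEST. If the disk uses 512-byte sectors (an assumption, the log does not say), that LBA is roughly 2.0 TB into the device.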

Later, there is no way to use these disks with ZFS:
# zpool create tank da0
cannot create 'tank': invalid argument for this pool operation

I can run dd if=/dev/zero of=/dev/da0, though I have not let it run until
the disk is full…

Can this be related? Should I open a PR? How can I help debug this?

I'm not a kernel/driver hacker, but I'd like to help figure this out :)

Yours,
-- 
*geoffroy desvernay*
C.R.I - Administration systèmes et réseaux
Ecole Centrale de Marseille






Re: ahcich reset -> cannot mount zfs root in 9.1-PRE

2012-10-03 Thread geoffroy desvernay
On 10/02/2012 17:40, Alexander Motin wrote:
> On 02.10.2012 16:51, Andriy Gapon wrote:
>> on 02/10/2012 16:16 geoffroy desvernay said the following:
>>> Hi all,
>>>
>>> Trying to upgrade a system from 9.0-RELEASE to 9.1-PRE from yesterday on
>>> my machine (GEOM+ZFS mirror setup on ada[01]p3), the new kernel becomes
>>> unable to mount root... The only way to recover is to boot from 9.0
>>> kernel.
>>> The disks were already named ada[01] in 9.0, so I suspect nothing
>>> there...
>>>
>>> I tried
>>>   - disabling AHCI in bios (no change seen)
>>>   - change cables, check PSU, test disks with smartctl
>>>
>>> Here are some bits (via serial console):
>>> ahci0:  port
>>> 0xc000-0xc007,0xb000-0xb003,0xa000-0xa007,0x9000-0x9003,0x8000-0x800f
>>> mem 0xfe9ff800-0xfe9ffbff irq 22 at device 18.0 on pci0
>>> ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported
>>> ahci0: Caps: 64bit NCQ SNTF MPS AL CLO 3Gbps PM PMD SSC PSC 32cmd CCC
>>> 4ports
>>> ahcich0:  at channel 0 on ahci0
>>> ahcich0: Caps: HPCP
>>> ahcich1:  at channel 1 on ahci0
>>> ahcich1: Caps: HPCP
>>> ahcich2:  at channel 2 on ahci0
>>> ahcich2: Caps: HPCP
>>> ahcich3:  at channel 3 on ahci0
>>> ahcich3: Caps: HPCP
>>> ahcich0: AHCI reset...
>>> ahcich0: SATA connect time=100us status=0123
>>> ahcich0: AHCI reset: device found
>>> ahcich0: AHCI reset: device ready after 0ms
>>>
>>> The difference with 9.0 is after that: here is 9.0's next lines: (same
>>> for ahcich1)
>>> (aprobe0:ahcich0:0:15:0): Command timed out
>>> (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted
>>> (aprobe0:ahcich0:0:0:0): SIGNATURE: 
>>>
>>> And 9.1-PRE's:
>>> (aprobe0:ahcich0:0:15:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
>>> (aprobe0:ahcich0:0:15:0): CAM status: Command timeout
>>> (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted
>>>
>>> In both cases ada[01] are detected and available, but with 9.1-PRE I
>>> see:
>>> GEOM_RAID: Promise: Disk ada0 state changed from NONE to SPARE.
>>> GEOM_RAID: Promise: Disk ada1 state changed from NONE to SPARE.
>>>
>>> (I see the same when I # kldload geom_raid # from running 9.0, it doesn't
>>> break anything...)
>>>
>>> I attach the full boot log with 9.1-PRE (bios with NO-raid nor AHCI
>>> enabled, but this changes nothing in the output)
>>>
>>> I could test patches or try any command required to debug this… But for
>>> the moment I don't know where to search (and kernel code is far away
>>> from my current skills in debugging…)
>>
>> You probably need to clear RAID metadata on the disks as I think that
>> disabling
>> geom_raid is not possible in 9.1-PRE.
>> I think that Alexander can help you more here.
> 
> The right way is to clear RAID metadata on disks. If it is possible to
> boot from any other source, you can just do `graid delete Promise` and
> then reboot.
> 
> Alternatively it is possible to disable geom_raid module using recently
> added loader tunable kern.geom.raid.enable=0. After that your system
> should boot and run fine. I would still recommend you to erase metadata,
> but after setting that tunable it will be impossible to do it via graid
> tool, only with manual dd surgery. In case of Promise format metadata
> use up to 63 last sectors of the disk. You can identify respective
> sectors to erase by signature "Promise Technology, Inc." in the
> beginning of the sector.
> 
I tried clearing the metadata, but it had no effect. It seems to work: the
first 'geom raid delete Promise' returns 0 and the second one complains with
something like 'Promise array doesn't exist', but it didn't solve the problem.

But adding kern.geom.raid.enable=0 did ;)

I still haven't tried to locate the last sectors manually...
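
Locating those sectors could be sketched as follows. This is a read-only illustration, not a tested recovery tool: it assumes 512-byte sectors, assumes that seeking to the end of the target reports its size (which may not hold for every device node), and uses the 63-sector tail and the "Promise Technology, Inc." signature mentioned above. The function name find_promise_sectors is hypothetical.

```python
import os
import tempfile

SIGNATURE = b"Promise Technology, Inc."

def find_promise_sectors(path, sector_size=512, tail_sectors=63):
    """Return the sector numbers, among the last `tail_sectors` sectors of
    `path`, that begin with the Promise signature. Nothing is written."""
    hits = []
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)          # assumed to report the size
        total = size // sector_size
        first = max(0, total - tail_sectors)
        dev.seek(first * sector_size)
        for sector in range(first, total):
            if dev.read(sector_size).startswith(SIGNATURE):
                hits.append(sector)
    return hits

# Demo on a scratch image file, never on a real disk.
with tempfile.NamedTemporaryFile(delete=False) as image:
    image.write(bytes(100 * 512))    # 100 zeroed sectors
    image.seek(90 * 512)
    image.write(SIGNATURE)           # fake metadata inside the tail
    image.seek(10 * 512)
    image.write(SIGNATURE)           # outside the last 63 sectors, ignored
demo_hits = find_promise_sectors(image.name)
os.unlink(image.name)
print(demo_hits)
```

The sketch only reports candidates; erasing them is a separate and destructive step that should be checked against the actual disk size before any dd is run.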

Thanks a lot !
-- 
*geoffroy desvernay*
C.R.I - Administration systèmes et réseaux
Ecole Centrale de Marseille
Tel: (+33|0)4 91 05 45 24
Fax: (+33|0)4 91 05 45 98
d...@centrale-marseille.fr



Re: problem with LSI MegaRAID on 8.2-RELEASE

2011-10-23 Thread geoffroy desvernay

On 13/09/2011 13:22, Johan Hendriks wrote:

Maciej Jan Broniarz wrote:

Message written by Jeremy Chadwick on 13 Sep 2011, at 12:33:


On Tue, Sep 13, 2011 at 11:43:29AM +0200, Maciej Jan Broniarz wrote:

I'm having some trouble with LSI MegaRAID on FreeBSD 8.2-RELEASE-p2.
My storage starts to freeze and the following message appears in the
log:

mfi0: COMMAND 0xff8000b6be58 TIMEOUT AFTER 3005 SECONDS
mfi0: COMMAND 0xff8000b6be58 TIMEOUT AFTER 3035 SECONDS
mfi0: COMMAND 0xff8000b6be58 TIMEOUT AFTER 3065 SECONDS
mfi0: COMMAND 0xff8000b6be58 TIMEOUT AFTER 3095 SECONDS
mfi0: COMMAND 0xff8000b6be58 TIMEOUT AFTER 3125 SECONDS
mfi0: COMMAND 0xff8000b6be58 TIMEOUT AFTER 3156 SECONDS
mfi0: COMMAND 0xff8000b6be58 TIMEOUT AFTER 3186 SECONDS

What might be the issue here?

http://lists.freebsd.org/pipermail/freebsd-stable/2011-August/063808.html

http://lists.freebsd.org/pipermail/freebsd-stable/2011-August/063809.html

http://lists.freebsd.org/pipermail/freebsd-stable/2011-August/063810.html

http://lists.freebsd.org/pipermail/freebsd-stable/2011-August/063811.html

http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063816.html

http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063817.html

http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063821.html

http://lists.freebsd.org/pipermail/freebsd-stable/2011-September/063823.html


--

Thanks. But there is still no solution for the problem. I haven't
applied any patch
and yet the problem occurs.

All best,
mjb





Maybe I understand your comment the wrong way, but I think the patch is
there to prevent the problem.
So by applying the following patch
www.freebsd.org/~jhb/patches/mfi.patch
your issue should be SOLVED.

regards
Johan Hendriks

Same issue here with Dell's PERC H700 (LSI repackaged by Dell).
The patch referenced here solves the problem for me (8.2-STABLE and
9.0-RC1 on amd64):

http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/140416

Could someone commit this, or is this problem solved by other means? (I
don't follow freebsd-fs@ or freebsd-scsi@ for now...)


--
*Geoffroy Desvernay*
C.R.I - Administration systèmes et réseaux
Ecole Centrale de Marseille



Re: bin/136073: recent nscd(8) changes cause client processes to die with SIGPIPE

2011-04-29 Thread geoffroy desvernay
This change is not so recent now... But I'm still experiencing this bug
with 8.2p1 :(

This bug happens with nss_ldap-1.265_6 and nss-pam-ldapd, using one or
more ldap:// and|or ldaps:// servers, and cache enabled in nsswitch.conf.

Some symptoms:
# id dgeo; echo $?
141

cron jobs are logged as executed but are not actually run!
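
The 141 above is consistent with the SIGPIPE in the PR title: shells report 128 plus the signal number when a command is killed by a signal, and 141 - 128 = 13, which is SIGPIPE on FreeBSD and Linux. A small sketch of that decoding (the helper name describe_exit_status is made up here):

```python
import signal

def describe_exit_status(status: int) -> str:
    """Name the signal behind a shell exit status above 128, if any."""
    if status > 128:
        try:
            return signal.Signals(status - 128).name
        except ValueError:
            return f"unknown signal {status - 128}"
    return f"exit code {status}"

print(describe_exit_status(141))
```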

I can test any patch, or anything else that helps.
-- 
*geoffroy desvernay*
C.R.I - Administration systèmes et réseaux
Ecole Centrale de Marseille





Re: RELENG_7_1: bce driver change generating too much interrupts ?

2008-12-03 Thread geoffroy desvernay
Xin LI wrote:
> Hi guys,
> 
> I think I got a real fix.
> 
It seems to "work for me®" too

Server under normal load (smtp/imap/Maildir for ~1000 users, NFS
filer); everything seems OK... (1h uptime for now)

Thank you !
-- 
geoffroy desvernay





Re: RELENG_7_1: bce driver change generating too much interrupts ?

2008-12-02 Thread geoffroy desvernay
Xin LI wrote:
> Can anyone try reverting the changeset itself?  There are two recent
> changesets:
> 
>   http://www.delphij.net/bce-185161.diff.bz2
>   http://www.delphij.net/bce-184826.diff.bz2
> 
> You can revert the change by doing this:
> 
> cd /usr/src
> fetch http://www.delphij.net/bce-185161.diff.bz2
> fetch http://www.delphij.net/bce-184826.diff.bz2
> bzcat bce-185161.diff.bz2 | patch -R
> bzcat bce-184826.diff.bz2 | patch -R
> 
> I'll check what's happening ASAP.
> 
Done:

I'd say it seems to be related...

Before applying your patches:
# vmstat -i
interrupt                      total    rate
irq1: atkbd0                      18       0
irq14: ata0                       58       0
irq20: uhci1                      96       0
irq21: uhci0 uhci+                 5       0
irq78: mfi0                   539747       3
cpu0: timer                350029937    1999
irq256: bce0              6757905080   38611
irq259: bce1              8296789513   47403
cpu1: timer                350029945    1999
cpu2: timer                350030010    1999
cpu3: timer                350030025    1999
Total                    16455354434   94018


After patch, make buildkernel && make reinstallkernel and reboot
interrupt                      total    rate
irq1: atkbd0                      18       0
irq14: ata0                       58       0
irq20: uhci1                       2       0
irq21: uhci0 uhci+                 5       0
irq78: mfi0                     3947      24
cpu0: timer                   320361    1989
irq256: bce0                    6658      41
irq259: bce1                    1428       8
cpu1: timer                   320320    1989
cpu2: timer                   320380    1989
cpu3: timer                   320507    1990
Total                        1293684    8035
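
To put a number on the difference, the bce and Total rows of the two vmstat -i listings can be compared. A rough sketch; the parser and the helper names are illustrative only, and the figures are copied from the listings above:

```python
def parse_vmstat_i(text: str) -> dict:
    """Map each interrupt source to (total, rate) from `vmstat -i` output."""
    rows = {}
    for line in text.strip().splitlines():
        parts = line.rsplit(None, 2)   # name may contain spaces
        if len(parts) != 3 or not (parts[1].isdigit() and parts[2].isdigit()):
            continue                   # header or unparsable line
        name, total, rate = parts
        rows[name.strip()] = (int(total), int(rate))
    return rows

def bce_rate(rows: dict) -> int:
    """Sum the per-second rate of all bce interrupt sources."""
    return sum(rate for name, (_, rate) in rows.items() if "bce" in name)

BEFORE = """
interrupt     total        rate
irq256: bce0  6757905080  38611
irq259: bce1  8296789513  47403
Total         16455354434 94018
"""
AFTER = """
interrupt     total   rate
irq256: bce0  6658    41
irq259: bce1  1428    8
Total         1293684 8035
"""

before, after = parse_vmstat_i(BEFORE), parse_vmstat_i(AFTER)
print(bce_rate(before), bce_rate(after))
print(round(bce_rate(before) / before["Total"][1], 3))
```

Before the revert the two bce sources account for about 91% of the overall interrupt rate; afterwards they are a negligible share. The "after" sample covers only a few minutes of uptime, so the rates are indicative rather than exact.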

-- 
geoffroy desvernay





RELENG_7_1: bce driver change generating too much interrupts ?

2008-12-02 Thread Geoffroy Desvernay
Since the last upgrade, I see much more CPU time "eaten" by interrupts (at
least 10% CPU in top)
(see http://dgeo.perso.ec-marseille.fr/cpu-week.png)

The server behaves correctly (or seems to…), and the high interrupt count
seems to come from the bce cards (source: systat -vmstat)

I just upgraded from
"RELENG_7 Mon Sep  8 12:33:06 CEST 2008"
to
"RELENG_7_1 Sat Nov 29 16:20:35 CET 2008"

We have an identical machine (Dell PE 1950) which has not been upgraded
(production use; the two machines are carp(4)-redundant)

I don't know if it is related to the "SVN rev 184826 on 2008-11-10 22:40:16Z
by delphij" patch to sys/dev/bce/if_bce.c


If I can help debug something… These are production machines, but I
can test patches or anything else on the faulty system.



Some clues:

Under the very same load (carp interfaces down on other machine), vmstat
shows:
for newer system:

 procs     memory       page                     disk   faults            cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr mf0    in    sy    cs us sy id
 0 1 1   4806M   460M   649   0   0   0   582   2   0 21770  1270 13653  1 15 85

and for older:

 procs     memory       page                     disk   faults            cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr mf0    in    sy    cs us sy id
 0 1 0   3694M   414M   236   0   0   0   199  17   0   286   317   386  1  1 97


bce-related part of dmesg for the newer system:

bce0:  mem
0xf400-0xf5ff irq 16 at device 0.0 on pci9
miibus0:  on bce0
bce0: Ethernet address: 00:15:c5:f1:56:f4
bce0: [ITHREAD]
bce0: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); F/W
(0x02090105); Flags( SPLT MFW MSI )
bce1:  mem
0xf800-0xf9ff irq 16 at device 0.0 on pci5
miibus1:  on bce1
bce1: Ethernet address: 00:15:c5:f1:56:f2
bce1: [ITHREAD]
bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); F/W
(0x02090105); Flags( SPLT MFW MSI )

And on the older system:

bce0:  mem
0xf400-0xf5ff irq 16 at device 0.0 on pci9
miibus0:  on bce0
bce0: Ethernet address: 00:15:c5:f1:6a:47
bce0: [ITHREAD]
bce0: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); F/W
(0x02090105); Flags( MFW MSI )
bce1:  mem
0xf800-0xf9ff irq 16 at device 0.0 on pci5
miibus1:  on bce1
bce1: Ethernet address: 00:15:c5:f1:6a:45
bce1: [ITHREAD]
bce1: ASIC (0x57081020); Rev (B2); Bus (PCI-X, 64-bit, 133MHz); F/W
(0x02090105); Flags( MFW MSI )

-- 
Geoffroy Desvernay
Ecole Centrale de Marseille





page fault on RELENG_6_1

2006-11-06 Thread Geoffroy DESVERNAY

I'm experiencing kernel panics and trying to understand what is going on
(I'm not a real kernel hacker... more of a 'Hello World' programmer :)

I think there is something like a null-pointer dereference each time, in
nd6_output (crashes 2 and 3).


I'm not sure crash 4 is the same (it looks like
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/96413 )


Here are my dmesg and some kgdb logs; I hope I didn't forget anything
important...


The machine (VIA C7) hosts some websites and some mail, is
IPv6-enabled via a gif tunnel, and runs 2 OpenVPN instances.


Please cc my mail address.
--
 ___
    / Geoffroy DESVERNAY   |\
   /\`Service info`| Tel: (+33|0)4 91 05 45 24  /\
   \/ Ecole Centrale de Marseille  | Fax: (+33|0)4 91 05 45 98  \/
\ (ex-EGIM)| Mail: [EMAIL PROTECTED] /
 ---


Dump header from device /dev/ad4s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 1056505856B (1007 MB)
  Blocksize: 512
  Dumptime: Sun Oct  8 01:03:32 2006
  Hostname: box.dgeos.net
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 6.1-RELEASE-p10 #0: Wed Oct  4 09:30:30 CEST 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/BOX
  Panic String: page fault
  Dump Parity: 103186717
  Bounds: 2
  Dump Status: good

[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x24
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc0515778
stack pointer   = 0x28:0xe338d828
frame pointer   = 0x28:0xe338d848
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= resume, IOPL = 0
current process = 12 (swi1: net)
trap number = 12
panic: page fault
Uptime: 2d22h25m31s
Dumping 1007 MB (2 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 1007MB (257776 pages) 991 975 959 943 927 911 895 879 863 847 831 
815 799 783 767 751 735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 
495 479 463 447 431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 
175 159 143 127 111 95 79 63 47 31 15

#0  doadump () at pcpu.h:165
165 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) list *0xc0515778
0xc0515778 is in propagate_priority (/usr/src/sys/kern/subr_turnstile.c:241).
236 /*
237  * Pick up the lock that td is blocked on.
238  */
239 ts = td->td_blocked;
240 MPASS(ts != NULL);
241 tc = TC_LOOKUP(ts->ts_lockobj);
242 mtx_lock_spin(&tc->tc_lock);
243 
244 /* Resort td on the list if needed. */
245 if (!turnstile_adjust_thread(ts, td)) {
(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc04edbb7 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0xc04edef9 in panic (fmt=0xc06c92d8 "%s") at 
/usr/src/sys/kern/kern_shutdown.c:558
#3  0xc06ac32c in trap_fatal (frame=0xe338d7e8, eva=0) at 
/usr/src/sys/i386/i386/trap.c:836
#4  0xc06ab9c4 in trap (frame=
  {tf_fs = 8, tf_es = 40, tf_ds = 40, tf_edi = -995882752, tf_esi = 
-995882368, tf_ebp = -482813880, tf_isp = -482813932, tf_ebx = -995882752, 
tf_edx = -995882368, tf_ecx = -992324084, tf_eax = 0, tf_trapno = 12, tf_err = 
0, tf_eip = -1068411016, tf_cs = 32, tf_eflags = 589954, tf_esp = -995882368, 
tf_ss = 40})
at /usr/src/sys/i386/i386/trap.c:269
#5  0xc0698a7a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#6  0xc0515778 in propagate_priority (td=0xc4a40a80) at 
/usr/src/sys/kern/subr_turnstile.c:239
#7  0xc0515ff3 in turnstile_wait (lock=0xc4da560c, owner=0x0) at 
/usr/src/sys/kern/subr_turnstile.c:634
#8  0xc04e2ba4 in _mtx_lock_sleep (m=0xc4da560c, tid=3299084544, opts=0, 
file=0x0, line=0) at /usr/src/sys/kern/kern_mutex.c:565
#9  0xc05cc183 in nd6_output (ifp=0xc4b1d000, origifp=0x0, m0=0xc4f1f700, 
dst=0xc5185e1c, rt0=0xc4da59cc) at /usr/src/sys/netinet6/nd6.c:2004
#10 0xc05c505b in ip6_output (m0=0xe338da44, opt=0x0, ro=0xe338da44, flags=0, 
im6o=0x0, ifpp=0x0, inp=0xc4f81870) a
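
kgdb prints the trap frame registers in frame #4 as signed decimals, which makes them hard to compare with the hexadecimal addresses elsewhere in the trace. A small conversion sketch (the helper name to_u32_hex is made up for this illustration):

```python
def to_u32_hex(value: int) -> str:
    """Render a signed 32-bit integer the way gdb prints an i386 address."""
    return hex(value & 0xFFFFFFFF)

# tf_eip as shown in the trap frame of frame #4.
tf_eip = to_u32_hex(-1068411016)
print(tf_eip)
```

The converted tf_eip is the same address as the "instruction pointer" line of the panic message and as frame #6 (propagate_priority), so the frame and the panic message agree on where the fault happened.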

Re: i386/86880: [hang] 6.0 hangs or reboots whilst 5.4 is stable (ASUS-A7NX motherboard with nforce2 chipset)

2006-02-09 Thread Geoffroy Desvernay

Quoting "Mars G. Miro" <[EMAIL PROTECTED]>:


On 2/8/06, Geoffroy Desvernay <[EMAIL PROTECTED]> wrote:


>>>I've got the same problem with an A7N8X-X (Athlon 2000+) motherboard and
>>>6-STABLE (build of Feb 2, 2006).
>>>Booting with kernel.debug says nothing; it seems to be a hardware hang, but
>>>it doesn't happen with Linux or OpenBSD. I haven't tried 5.3 yet.
>>>It hangs after detection of the ATA devices (the floppy's light turns on,
>>>then it hangs).
>
> I've experienced this myself. Happens w/ nForce-based mobos and
> certain shuttles. My fix has been to always set the BIOS setting of the HD
> to LBA instead of Auto or CHS.
>
> Try this and report back ;-)
>
Tried this unsuccessfully, but setting the CPU frequency to 100MHz (instead
of 133MHz) seems to work...

I've read something about disabling firewire in the bios, but I have it
on a separate card (not in the MB), and I can't remove it for the moment...



Also try disabling APIC (not ACPI) as I've encountered several mobos
that have this implemented poorly w/c results in weird behaviors of
the OS.


I'm not a kernel developer, but I can try patches or anything else.



I'm not aware of any patches but I think this is just a hardware
config problem tho YMMV.


Working at 133MHz with hint.apic.0.disabled="1" in loader.conf.

Thanks for that :)

I saw at http://acpi.sf.net/dsdt/view.php?id=233 that a DSDT specific
to this board is available (it doesn't fix everything); could this fix
anything in my case? I think I'll give it a try one of these days...


Geoffroy






Re: i386/86880: [hang] 6.0 hangs or reboots whilst 5.4 is stable (ASUS-A7NX motherboard with nforce2 chipset)

2006-02-07 Thread Geoffroy Desvernay



>> I've got the same problem with an A7N8X-X (Athlon 2000+) motherboard and
>> 6-STABLE (build of Feb 2, 2006).
>> Booting with kernel.debug says nothing; it seems to be a hardware hang, but
>> it doesn't happen with Linux or OpenBSD. I haven't tried 5.3 yet.
>> It hangs after detection of the ATA devices (the floppy's light turns on,
>> then it hangs).
>
> I've experienced this myself. Happens w/ nForce-based mobos and
> certain shuttles. My fix has been to always set the BIOS setting of the HD
> to LBA instead of Auto or CHS.
>
> Try this and report back ;-)


Tried this unsuccessfully, but setting the CPU frequency to 100MHz (instead
of 133MHz) seems to work...

I've read something about disabling FireWire in the BIOS, but I have it
on a separate card (not on the motherboard), and I can't remove it for the
moment...

I'm not a kernel developer, but I can try patches or anything else.

Geoffroy.




Re: i386/86880: [hang] 6.0 hangs or reboots whilst 5.4 is stable (ASUS-A7NX motherboard with nforce2 chipset) (regression)

2006-02-03 Thread Geoffroy Desvernay
I've got the same problem with an A7N8X-X (Athlon 2000+) motherboard and
6-STABLE (build of Feb 2, 2006).

Booting with kernel.debug says nothing; it seems to be a hardware hang, but
it doesn't happen with Linux or OpenBSD. I haven't tried 5.3 yet.

It hangs after detection of the ATA devices (the floppy's light turns on,
then it hangs).





Re: 5.4 Installer + Promise FT100TX2 = Loader crash

2005-08-05 Thread Geoffroy DESVERNAY
Daniel O'Connor wrote:
> Hi,
> I am updating an old 4.x system to 5.4 here and it has a Promise FT100TX2 
> RAID 
> controller (in mirror).
> 
> The problem is that when I boot the CD the loader crashes (does a reg dump) 
> but only if the card is present and an array is defined. I can't record what 
> the dump is because it continually sprays the dump down the screen which 
> makes it unreadable :(
> 
> I have seen this on an AMD64 system (I used the same RAID card in it) and got 
> the same problem. To work around it on that system I installed via the 
> motherboard IDE controller and then moved the disk over to the RAID 
> controller.
> 
> It only seems to affect booting the installer - once the system is installed 
> it boots from the RAID card just fine (!)
> 
> I just tried booting from floppy and that works (?!) although that method 
> doesn't probe my PS/2 keyboard for some reason :-/
> 
> Does anyone have any suggestions for fixing the RAID + CD boot problem?
> 
> Thanks.
> 
I have exactly the same issue: on a PIII 866 with only one IDE CD drive, a
floppy and a RAID1 array on the FT100TX2, there is no way to boot from the
CD...

It works when there is no array on the card.
(No array defined, Ctrl-F for BIOS or  to continue).
-> I just installed on the 'ad4' disk (primary master on the RAID card),
edited fstab to mount ar0 instead of ad4, then built (in the card's BIOS) a
RAID1 with ad6 (the copy takes a long time...), and then it boots from the
RAID. It still doesn't boot from a FreeBSD 5.4-RELEASE CD, nor from a
FreeSBIE 1.1 CD (based on 5.3).

I can do some tests if needed; even if I don't really understand more
than 'hello world', I know how to apply patches ;)

-- 
 ---
/ Geoffroy DESVERNAY|   \
   /`Service info`  | Tel: (+33|0)4 91 05 45 24  \
   \  Ecole Généraliste d'Ingénieurs| Fax: (+33|0)4 91 05 45 98  /
\ ...de MARSEILLE   |  dgeo _AT_ egim-mrs.fr/
 ---





Re: kernel bug (ufs2?) on a dell 2600

2005-06-14 Thread Geoffroy Desvernay
It's not related to the more-than-full filesystem: it occurred one more
time without it :(

Does someone have an idea?

Geoffroy Desvernay wrote:
> This server (FreeBSD 5.4 RELENG) has been crashing once a week or more
> since 5.4 (maybe before).
> 
> It may be related to a full filesystem:
> I'm using snapshots on this server (using
> http://people.freebsd.org/~rse/snapshot/), and the crash occurred about
> 30 minutes after a snapshot that filled the filesystem to 100%.
> 
> The dmesg and kgdb logs are attached.
> 
> I'm not much of a hacker, but I hope this can help resolve the bug.
> 
> 
> 
> 
> 
> 
> [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
> Undefined symbol "ps_pglobal_lookup"]
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "i386-marcel-freebsd".
> #0  doadump () at pcpu.h:160
> 160   __asm __volatile("movl %%fs:0,%0" : "=r" (td));
> (kgdb) bt full
> #0  doadump () at pcpu.h:160
> No locals.
> #1  0xc06878d6 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
>   first_buf_printf = 1
> #2  0xc0687cc4 in panic (fmt=0xc091a1ae "initiate_write_inodeblock_ufs2: 
> already started")
> at /usr/src/sys/kern/kern_shutdown.c:566
>   td = (struct thread *) 0xc3641c00
>   bootopt = 260
>   newpanic = 0
>   ap = 0xc3641c00 "\\\214\220Ã -XÃ"
>   buf = "initiate_write_inodeblock_ufs2: already started", '\0' <repeats 208 times>
> #3  0xc080ef5f in initiate_write_inodeblock_ufs2 (inodedep=0xc5be0280, bp=0x0)
> at /usr/src/sys/ufs/ffs/ffs_softdep.c:3781
>   adp = (struct allocdirect *) 0xd75bd13c
>   lastadp = (struct allocdirect *) 0x1000
>   dp = (struct ufs2_dinode *) 0x0
>   fs = (struct fs *) 0xc21f6730
>   i = Unhandled dwarf expression opcode 0x93
> (kgdb) quit
> 
> 
> 
> 
> Copyright (c) 1992-2005 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>   The Regents of the University of California. All rights reserved.
> FreeBSD 5.4-STABLE #0: Mon Jun  6 18:51:49 CEST 2005
> [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ZLIP
> Timecounter "i8254" frequency 1193182 Hz quality 0
> CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2392.29-MHz 686-class CPU)
>   Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
>   
> Features=0xbfebfbff
>   Hyperthreading: 2 logical CPUs
> real memory  = 2147287040 (2047 MB)
> avail memory = 2095828992 (1998 MB)
> ACPI APIC Table: 
> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
>  cpu2 (AP): APIC ID:  6
>  cpu3 (AP): APIC ID:  7
> ioapic0: Changing APIC ID to 8
> ioapic1: Changing APIC ID to 9
> ioapic2: Changing APIC ID to 10
> ioapic2: WARNING: intbase 72 != expected base 48
> ioapic3: Changing APIC ID to 11
> ioapic3: WARNING: intbase 120 != expected base 96
> ioapic4: Changing APIC ID to 12
> ioapic0  irqs 0-23 on motherboard
> ioapic1  irqs 24-47 on motherboard
> ioapic2  irqs 72-95 on motherboard
> ioapic3  irqs 120-143 on motherboard
> ioapic4  irqs 144-167 on motherboard
> npx0:  on motherboard
> npx0: INT 16 interface
> acpi0:  on motherboard
> acpi0: Power Button (fixed)
> Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> cpu0:  on acpi0
> cpu1:  on acpi0
> cpu2:  on acpi0
> cpu3:  on acpi0
> pcib0:  port 0xcf8-0xcff on acpi0
> pci0:  on pcib0
> pcib1:  at device 2.0 on pci0
> pci1:  on pcib1
> pci1:  at device 28.0 (no driver 
> attached)
> pcib2:  at device 29.0 on pci1
> pci2:  on pcib2
> em0:  port 
> 0xece0-0xecff mem 0xfdec-0xfded,0xfdee-0xfdef irq 24 at 
> device 2.0 on pci2
> em0: Ethernet address: 00:02:b3:d4:d3:a2
> em0:  Speed:N/A  Duplex:N/A
> pci1:  at device 30.0 (no driver 
> attached)
> pcib3:  at device 31.0 on pci1
> pci3:  on pcib3
> em1:  port 
> 0xdce0-0xdcff mem 0xfdcc-0xfdcd,0xfdce-0xfdcf irq 28 at 
> device 1.0 on pci3
> em1: Ethernet address: 00:0b:db:92:0a:e4
> em1:  Speed:N/A  Duplex:N/A
> pcib4:  at device

kernel bug (ufs2?) on a dell 2600

2005-06-13 Thread Geoffroy Desvernay
This server (FreeBSD 5.4 RELENG) has been crashing once a week or more
since 5.4 (maybe before).

It may be related to a full filesystem:
I'm using snapshots on this server (using
http://people.freebsd.org/~rse/snapshot/), and the crash occurred about
30 minutes after a snapshot that filled the filesystem to 100%.

The dmesg and kgdb logs are attached.

I'm not much of a hacker, but I hope this can help resolve the bug.


[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
#0  doadump () at pcpu.h:160
160 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt full
#0  doadump () at pcpu.h:160
No locals.
#1  0xc06878d6 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:410
first_buf_printf = 1
#2  0xc0687cc4 in panic (fmt=0xc091a1ae "initiate_write_inodeblock_ufs2: 
already started")
at /usr/src/sys/kern/kern_shutdown.c:566
td = (struct thread *) 0xc3641c00
bootopt = 260
newpanic = 0
ap = 0xc3641c00 "\\\214\220Ã -XÃ"
buf = "initiate_write_inodeblock_ufs2: already started", '\0' <repeats 208 times>
#3  0xc080ef5f in initiate_write_inodeblock_ufs2 (inodedep=0xc5be0280, bp=0x0)
at /usr/src/sys/ufs/ffs/ffs_softdep.c:3781
adp = (struct allocdirect *) 0xd75bd13c
lastadp = (struct allocdirect *) 0x1000
dp = (struct ufs2_dinode *) 0x0
fs = (struct fs *) 0xc21f6730
i = Unhandled dwarf expression opcode 0x93
(kgdb) quit
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.4-STABLE #0: Mon Jun  6 18:51:49 CEST 2005
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/ZLIP
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.40GHz (2392.29-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  
Features=0xbfebfbff
  Hyperthreading: 2 logical CPUs
real memory  = 2147287040 (2047 MB)
avail memory = 2095828992 (1998 MB)
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  6
 cpu3 (AP): APIC ID:  7
ioapic0: Changing APIC ID to 8
ioapic1: Changing APIC ID to 9
ioapic2: Changing APIC ID to 10
ioapic2: WARNING: intbase 72 != expected base 48
ioapic3: Changing APIC ID to 11
ioapic3: WARNING: intbase 120 != expected base 96
ioapic4: Changing APIC ID to 12
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-47 on motherboard
ioapic2  irqs 72-95 on motherboard
ioapic3  irqs 120-143 on motherboard
ioapic4  irqs 144-167 on motherboard
npx0:  on motherboard
npx0: INT 16 interface
acpi0:  on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0:  on acpi0
cpu1:  on acpi0
cpu2:  on acpi0
cpu3:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  at device 2.0 on pci0
pci1:  on pcib1
pci1:  at device 28.0 (no driver 
attached)
pcib2:  at device 29.0 on pci1
pci2:  on pcib2
em0:  port 
0xece0-0xecff mem 0xfdec-0xfded,0xfdee-0xfdef irq 24 at device 
2.0 on pci2
em0: Ethernet address: 00:02:b3:d4:d3:a2
em0:  Speed:N/A  Duplex:N/A
pci1:  at device 30.0 (no driver 
attached)
pcib3:  at device 31.0 on pci1
pci3:  on pcib3
em1:  port 
0xdce0-0xdcff mem 0xfdcc-0xfdcd,0xfdce-0xfdcf irq 28 at device 
1.0 on pci3
em1: Ethernet address: 00:0b:db:92:0a:e4
em1:  Speed:N/A  Duplex:N/A
pcib4:  at device 3.0 on pci0
pci4:  on pcib4
pci4:  at device 28.0 (no driver 
attached)
pcib5:  at device 29.0 on pci4
pci5:  on pcib5
pci4:  at device 30.0 (no driver 
attached)
pcib6:  at device 31.0 on pci4
pci6:  on pcib6
bge0:  mem 
0xfd8f-0xfd8f irq 80 at device 3.0 on pci6
miibus0:  on bge0
brgphy0:  on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 
1000baseTX-FDX, auto
bge0: Ethernet address: 00:10:18:06:39:68
pcib7:  at device 4.0 on pci0
pci7:  on pcib7
pci7:  at device 28.0 (no driver 
attached)
pcib8:  at device 29.0 on pci7
pci8:  on pcib8
amr0:  mem 0xfebf-0xfebf irq 120 at device 8.0 
on pci8
amr0:  Firmware 250O, BIOS 1.06, 128MB RAM
pci7:  at device 30.0 (no driver 
attached)
pcib9:  at device 31.0 on pci7
pci10:  on pcib9
ahc0:  port 0xbc00-0xbcff mem 
0xfd3ff000-0xfd3f irq 148 at device 7.0 on pci10
aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
ahc1:  port 0xb800-0xb8ff mem 
0xfd3fe000-0xfd3fefff irq 149 at device 7.1 on pci10
aic7