date:20051118

Re: Why device sk (PCI Gigabit Ethernet) cannot use polling?

2005-11-18 Thread Xin LI

On 11/18/05, Rob [EMAIL PROTECTED] wrote:

 Hi,

 The sk device has no polling support,
 neither in 5 nor in 6.
 Is there a particular reason (maybe
 because it's a Gigabit device) ?

 Or is polling not supported because it
 simply has not yet been coded? If so, would
 it be straightforward to add the code?

That would not be very hard if you own some hardware, so if you have
the hardware, give it a try! :-)

BTW, since glebius@ has done some recent work on polling(4), you may want
to ask him for some in-depth advice.

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]

Re: 4.8 alternate system clock has died error

2005-11-18 Thread Uwe Doering


Charles Sprickman wrote:

Hello all,

I've been digging through Google for more information on this.  I have a
4.8 box that's been up for about 430 days.  In the last week or so, top
and ps have started reporting all CPU usage numbers as zero, and running
systat -vmstat results in the message "The alternate system clock has
died! Reverting to ``pigs'' display."


I've found instances of this message in the archives for some 3.x users, 
some pre 4.8 users and some 5.3 users.


There were a number of suggestions including a patch if pre-4.8, sending 
init a HUP, and setting the following sysctl mib: 
kern.timecounter.method: 1.


I'm already at 4.8-p24, so I did not look into patching anything, and 
HUP'ing init and setting the sysctl mib does not seem to have any effect.


I'm not quite ready to believe that some hardware has actually failed. 
Perhaps due to the long uptime something has rolled over?


We had this once at work, quite a while ago.  The alternate system 
clock is in fact the Real Time Clock (RTC) on the mainboard.  In our 
case we were lucky in that it was just the quartz device that failed due 
to an improperly soldered lead which finally came off.  We fixed the 
soldering and the problem was gone.


Now, there are of course plenty of other hardware reasons why the RTC 
can fail, even temporarily like in your case.  Perhaps it is really time 
for a new mainboard.


   Uwe
--
Uwe Doering |  EscapeBox - Managed On-Demand UNIX Servers
[EMAIL PROTECTED]  |  http://www.escapebox.net
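
On the rollover question raised in the original report, a rough sketch of how long a tick counter takes to wrap. The counter widths and tick rates below are illustrative assumptions only; nothing in this thread identifies which counter, if any, is involved on the 4.8 box.

```sh
#!/bin/sh
# Days until an N-bit counter wraps when incremented RATE times per second.
# Width and rate are illustrative assumptions, not read from any kernel.
days_to_wrap() {
    bits=$1
    rate=$2
    # 2^bits ticks / rate = seconds; 86400 seconds per day (integer division).
    echo $(( (1 << bits) / rate / 86400 ))
}

days_to_wrap 32 100   # unsigned 32-bit counter at 100 ticks/s
days_to_wrap 31 100   # signed 32-bit counter at 100 ticks/s
days_to_wrap 32 128   # unsigned 32-bit counter at 128 ticks/s
```

Comparing the printed figures with the reported uptime of about 430 days only shows whether a rollover is arithmetically plausible for a given width and rate; it does not rule out the hardware explanation given above.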

Re: Page fault, GEOM problem??

2005-11-18 Thread Xin LI

On 11/18/05, Johan Ström [EMAIL PROTECTED] wrote:
 Ok, just got this not so very nice error on a RELENG_6_0 box (built
 from sources this morning, GENERIC kernel minus drivers I don't use):
 The network card is the exact same model as the one I used in the
 test machine, didn't have any problems there..
[...]
 So, any ideas what this can be? If there were a disk crash, which I
 have a hard time believing since I ran powermax (maxtor test program)
 on both of these disks 3 weeks ago and they have been running fine w/o
 a single problem since I started using them, why didn't GEOM just
 kick in and run on the other disk? Page-faulting is no way to react
 if a disk goes dead..

 Hope someone can help me/this problem doesn't occur any more... but I
 suppose that is too much to hope for...

Would you please consider trying to obtain a crashdump and send the
backtrace so we can investigate more?

(Hints can be found at
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html#KERNELDEBUG-OBTAIN)

Thanks,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net

Re: Page fault, GEOM problem??

2005-11-18 Thread Johan Ström



On 18 nov 2005, at 10.17, Xin LI wrote:


On 11/18/05, Johan Ström [EMAIL PROTECTED] wrote:

Ok, just got this not so very nice error on a RELENG_6_0 box (built
from sources this morning, GENERIC kernel minus drivers I don't use):
The network card is the exact same model as the one I used in the
test machine, didn't have any problems there..

[...]

So, any ideas what this can be? If there were a disk crash, which I
have a hard time believing since I ran powermax (maxtor test program)
on both of these disks 3 weeks ago and they have been running fine w/o
a single problem since I started using them, why didn't GEOM just
kick in and run on the other disk? Page-faulting is no way to react
if a disk goes dead..

Hope someone can help me/this problem doesn't occur any more... but I
suppose that is too much to hope for...


Would you please consider trying to obtain a crashdump and send the
backtrace so we can investigate more?

(Hints can be found at
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html#KERNELDEBUG-OBTAIN)




Thanks for the answer.

It doesn't look like I have any usable dump devices.
When booting I get:

GEOM_MIRROR: Device gm0s1 created (id=4118114647).
GEOM_MIRROR: Device gm0s1: provider ad6s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad6s1 activated.
GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 launched.
GEOM_MIRROR: Device gm0s1: rebuilding provider ad10s1.
Trying to mount root from ufs:/dev/mirror/gm0s1a
WARNING: / was not properly dismounted
Loading configuration files.
No suitable dump device was found.
Entropy harvesting:
interrupts
ethernet
point_to_point
kickstart
.
swapon: adding /dev/mirror/gm0s1b as swap device

Then naturally:
/etc/rc: WARNING: Dump device does not exist.  Savecore not run.

I looked around in the rc scripts to figure out what happens: the dumpon
script tries to auto-detect a suitable dump device but finds none.
According to the page you linked to, the dumpon command has to be
executed AFTER swapon. Why are the rc scripts trying to run it before
swapon then?

Anyway, tried to do dumpon manually on my swap drive:

$ dumpon -v /dev/mirror/gm0s1b
dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported

That didn't work too well.
I also tried savecore manually:

$ savecore /var/crash/ /dev/mirror/gm0s1b
savecore: no dumps found

That didn't work very well either (but that was probably to be expected,
since there were no working dumps).
Google turned up another thread on this list about dumping to gmirror
swap, but it was just a question (whether it is supported) without any
answers. Same error as I got.


Hope this helps.
Thanks again

Johan


Thanks,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net

Re: Why device sk (PCI Gigabit Ethernet) cannot use polling?

2005-11-18 Thread Rob

Xin LI wrote:
 On 11/18/05, Rob [EMAIL PROTECTED] wrote:
 
Hi,

The sk device has no polling support,
neither in 5 nor in 6.
Is there a particular reason (maybe
because it's a Gigabit device) ?

Or is polling not supported because it
simply has not yet been coded? If so, would
it be straightforward to add the code?
 
 
 That would not be very hard if you own some hardware,
 so if you have the hardware, give it a try! :-)

I do have the hardware: an sk integrated on the
motherboard, but this is on a production server.

Also, I don't know anything about coding the
polling stuff; I use it on other PCs (rl, xl)
and was wondering why not with sk.

If I have a piece of code that is 99.9 % sure
to work, then in that case I could try it out on
the production server in a test over the weekend.

 BTW, since glebius@ has done some recent work on
 polling(4), you may want to ask him for some
 in-depth advice.

Who is glebius?

Rob.







Subscribe request result (usagi-users ML)

2005-11-18 Thread usagi-users-admin

Hi, I am the fml mailing list manager for [EMAIL PROTECTED].
Hmm, you may be not a member.
1. Your mail may come from a bad address which is not 
   registered in this mailing list

2. Your mail has a syntax error.
   If you would like to subscribe this mailing list

subscribe YOUR NAME

 For example
subscribe Hayakawa Aoi


Hi, I am the fml ML manager for the ML [EMAIL PROTECTED].


[EMAIL PROTECTED], Be Seeing You!


If you have any questions or problems,
   please contact [EMAIL PROTECTED]





Re[2]: Why device sk (PCI Gigabit Ethernet) cannot use polling?

2005-11-18 Thread Hendry Sarumpaet

Hello Rob,

Friday, November 18, 2005, 3:23:10 PM, you wrote:

 Xin LI wrote:
 On 11/18/05, Rob [EMAIL PROTECTED] wrote:
 
Hi,

The sk device has no polling support,
neither in 5 nor in 6.
Is there a particular reason (maybe
because it's a Gigabit device) ?

Or is polling not supported because it
simply has not yet been coded? If so, would
it be straightforward to add the code?
 
 
 That would not be very hard if you own some
 hardware,
 so if you have the hardware, give it a try! :-)

 I do have the hardware: an sk integrated on the
 motherboard, but this is on a production server.

 Also, I don't know anything about coding the
 polling stuff; I use it on other PCs (rl, xl)
 and was wondering why not with sk.

 If I have a piece of code that is 99.9 % sure
 to work, then in that case I could try it out on
 the production server in a test over the weekend.


  Well, if you want to make it work, feel free to ask any developer
  who is willing to spend time writing the code; it would be great if
  you could donate the hardware as well.
  I think ru@ and glebius@ are the people currently actively developing
  the polling(4) code.
  The most recent polling work was the added support for bge(4), which
  has been merged to RELENG_6 (work done by glebius). We have been using
  it on our production router and it seems pretty robust.



 BTW, since glebius@ has done some recent work on
 polling(4), you may want to ask him for some
 in-depth advice.

 Who is glebius?

  finger [EMAIL PROTECTED]
  Honestly, he is my FreeBSD network hero :)
  netgraph improvements, CARP work, the recent em(4) fix, the new design
  of the polling code and much more. Thanks a lot, Gleb!




 Rob.




 
 


-- 
cheers,
hsa


ssh not working behind firewall

2005-11-18 Thread Its Azfar

I am using FreeBSD 5.4-STABLE. SSH works fine in a
normal environment, but as soon as I put a firewall in
front of the FreeBSD box, ssh stops working. I have allowed
TCP port 22 to the FreeBSD box, and the rule works if I
replace the machine with a Linux box, so I don't know why ssh
is not working on FreeBSD. I am also getting an error
in /var/log/messages:

SSH port bind error tcp port 22 is already used by
another application.

But I have checked, and no application is using it. As soon as I
remove the firewall, ssh starts working and the error
is gone as well. I also allowed TCP port 722 (which I read about
in an article), but it made no difference.

When the problem occurs I only get the ssh login prompt; as soon as
I enter the username, the screen gets stuck.

How can I resolve this issue?


Update from 5.4 to 6.0

2005-11-18 Thread Yann Golanski

I removed /usr/src completely and did an update this afternoon.  Make
buildworld fails with the following error.  Does anyone have any idea?

# time -h make buildworld
[...]
/usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/cryptlib.c:111:3: #error Inconsistency between crypto.h and cryptlib.c
In file included from /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/evp/e_rc5.c:66:
/usr/obj/usr/src/tmp/usr/include/openssl/rc5.h:67:2: #error RC5 is disabled.
In file included from /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/evp/m_mdc2.c:65:
/usr/obj/usr/src/tmp/usr/include/openssl/mdc2.h:69:2: #error MDC2 is disabled.
In file included from /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/mdc2/mdc2_one.c:61:
/usr/obj/usr/src/tmp/usr/include/openssl/mdc2.h:69:2: #error MDC2 is disabled.
In file included from /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/mdc2/mdc2dgst.c:63:
/usr/obj/usr/src/tmp/usr/include/openssl/mdc2.h:69:2: #error MDC2 is disabled.
mkdep: compile failed
*** Error code 1

Stop in /usr/src/secure/lib/libcrypto.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
13m40.55s real  9m53.87s user   1m41.55s sys

-- 
[EMAIL PROTECTED]  -=*=-  www.kierun.org
PGP:   009D 7287 C4A7 FD4F 1680  06E4 F751 7006 9DE2 6318



Re: Update from 5.4 to 6.0

2005-11-18 Thread Iulian M

On Friday 18 November 2005 18:51, Yann Golanski wrote:
 /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/cryptlib.c:111:3: #error Inconsistency between crypto.h and cryptlib.c
 In file included from /usr/src/secure/lib/libcrypto/../../../crypto/openssl/crypto/evp/e_rc5.c:66:
 /usr/obj/usr/src/tmp/usr/include/openssl/rc5.h:67:2: #error RC5 is

Just a guess: try removing /usr/obj

-- 
Harris's Lament:
All the good ones are taken.
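
The "try removing /usr/obj" suggestion above could be carried out roughly as follows. This is a sketch under assumptions: the object tree and source tree live in the default /usr/obj and /usr/src, and the `run` helper and `DRY_RUN` switch are hypothetical conveniences so that nothing is deleted by accident.

```sh
#!/bin/sh
# Sketch: clear a possibly stale object tree, then rebuild world.
# DRY_RUN defaults to 1, so commands are only printed. Set DRY_RUN=0 and run
# as root on the FreeBSD box to execute them for real.
DRY_RUN=${DRY_RUN:-1}
OBJDIR=${OBJDIR:-/usr/obj}   # assumed default object directory
SRCDIR=${SRCDIR:-/usr/src}   # assumed source tree location

# run CMDSTRING: print the command, and execute it unless in dry-run mode.
run() {
    echo "+ $1"
    [ "$DRY_RUN" = 1 ] || sh -c "$1"
}

clean_obj_and_rebuild() {
    # Files below the object tree may have the immutable (schg) flag set,
    # which would make rm fail, so clear it first.
    run "chflags -R noschg $OBJDIR" &&
    run "rm -rf $OBJDIR/*" &&
    run "cd $SRCDIR && make buildworld"
}

clean_obj_and_rebuild
```

With the default dry run, the script only lists the three commands, which makes it easy to check the paths before running anything destructive.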



Re: RELENG_6 vm_fault panic on filesystem mount

2005-11-18 Thread Xin LI

Hi, Joerg,

On 11/19/05, Joerg Pernfuss [EMAIL PROTECTED] wrote:
[...]
 From the output of the backtrace (below) I figure the problem for me
 lies in /usr/src/sys/ufs/ufs/ufs_dirhash.c line 232; this block:

  if (ep->d_reclen == 0 || ep->d_reclen >
      DIRBLKSIZ - (pos & (DIRBLKSIZ - 1))) {
          /* Corrupted directory. */
          brelse(bp);
          goto fail;
  }

 Which is consistent with the very first panic message I got saying
 something about 'Fatal Trap 12, page fault while in kernel mode,
  ufs_dirhash: bad dir' - sadly, I didn't write down that one.
 Hasn't reappeared since then.
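
The arithmetic of the check quoted above can be tried out in isolation. The following is a shell rendition for illustration only, not the kernel code: `reclen_ok` is a hypothetical helper name, and DIRBLKSIZ=512 is an assumed value (the kernel headers define the real one).

```sh
#!/bin/sh
# Shell rendition of the record-length sanity check quoted above.
# DIRBLKSIZ=512 is an assumed value for illustration; the kernel headers
# define the real one.
DIRBLKSIZ=512

# reclen_ok POS RECLEN
# Succeeds if a directory entry of RECLEN bytes starting at byte offset POS
# is non-empty and does not run past the end of its DIRBLKSIZ-sized block.
reclen_ok() {
    pos=$1
    reclen=$2
    # pos & (DIRBLKSIZ - 1) is the offset inside the current block
    # (valid because DIRBLKSIZ is a power of two).
    room=$(( DIRBLKSIZ - (pos & (DIRBLKSIZ - 1)) ))
    [ "$reclen" -ne 0 ] && [ "$reclen" -le "$room" ]
}

for args in "0 12" "500 12" "500 16" "500 0"; do
    # $args is deliberately unquoted so it splits into POS and RECLEN.
    if reclen_ok $args; then
        echo "pos/reclen $args: ok"
    else
        echo "pos/reclen $args: corrupted"
    fi
done
```

An entry at offset 500 has 12 bytes left in its block, so a record length of 12 passes while 16 or 0 is treated as a corrupted directory.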

Unfortunately, "ufs_dirhash: bad dir" in most cases indicates a bug
elsewhere, or a hardware problem.  I have recently upgraded
several production boxes in our lab and suggested that many others
upgrade to 6.0-RELEASE, so I am very eager to see whether there are
problems that we can catch and fix.

If this is a production box, my suggestion would be to remove
anything you do not need for everyday use from the kernel
configuration; if it is not, my suggestion would be to enable
INVARIANTS and makeoptions DEBUG=-g to see if something strange happens.
Possibly some backtraces would help us catch the bug.

Thanks,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net

Re: Update from 5.4 to 6.0

2005-11-18 Thread Xin LI

Maybe also ntpdate -bu time-nw.nist.gov

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net

RELENG_6 LOR: vnode interlock / system map

2005-11-18 Thread Joerg Pernfuss

Hi.

While hunting down the panic described in the other mail, I got an LOR
I couldn't find on the lists so far.

lock order reversal:
1st 0xc50147ec vnode interlock (vnode interlock) @ /usr/src/sys/kern/vfs_subr.c:2185
2nd 0xc1060144 system map (system map) @ /usr/src/sys/vm/vm_kern.c:295
KDB: stack backtrace:
witness_checkorder(c1060144,9,c0822173,127,e5e08a3c) at 0xc060fb1f = witness_checkorder+0x5bf
_mtx_lock_flags(c1060144,0,c0822173,127,321) at 0xc05da7d4 = _mtx_lock_flags+0x54
_vm_map_lock(c10600c0,c0822173,127,8,c106c468) at 0xc072d425 = _vm_map_lock+0x35
kmem_malloc(c10600c0,1000,101,101,8) at 0xc072c81c = kmem_malloc+0x3c
slab_zalloc(9,c08211fb,8a2,c4c93a80,c4c93af8) at 0xc0722d12 = slab_zalloc+0x82
uma_zone_slab(c106c468,8,c08211fb,8a2,0) at 0xc07231ac = uma_zone_slab+0xac
uma_zalloc_internal(1,0,0,0,e5e08b6c) at 0xc072325e = uma_zalloc_internal+0x3e
bucket_alloc(c10440a8,0,c08211fb,95e,c10440a0) at 0xc0723749 = bucket_alloc+0x29
uma_zfree_arg(c1042000,c5039208,0,e5e08b90,c06e9b38) at 0xc07248c6 = uma_zfree_arg+0x2d6
mac_labelzone_free(c5039208,c5014770,e5e08bac,c064bc3e,c5014770) at 0xc06e0710 = mac_labelzone_free+0x20
mac_destroy_vnode(c5014770,0,c0812767,2ff,c50147ec) at 0xc06e9b38 = mac_destroy_vnode+0x18
vdropl(c084cf60,e5e08bc8,c0812767,819,d64) at 0xc064bc3e = vdropl+0x11e
vput(c5014770,0,c081e644,d64,3b7) at 0xc064d6e8 = vput+0x168
handle_workitem_remove(0,c4f47400,2,32e,0) at 0xc0703c91 = handle_workitem_remove+0x161
process_worklist_item(c08dbd40,8,c081e644,2a3,437a7572) at 0xc0704181 = process_worklist_item+0x201
softdep_process_worklist(0,0,c0812767,68a,0) at 0xc0709543 = softdep_process_worklist+0x93
sched_sync(0,e5e08d38,c080693c,30d,0) at 0xc064d167 = sched_sync+0x5c7
fork_exit(c064cba0,0,e5e08d38) at 0xc05c9a34 = fork_exit+0xa4
fork_trampoline() at 0xc079152c = fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xe5e08d6c, ebp = 0 ---


Regards, Joerg.

-- 
| /\   ASCII ribbon   |  GnuPG Key ID | c7e4 d91d 64e2 6321 9988 |
| \ / campaign against |0xb248b614 | f27a 4e5b 06ce b248 b614 |
|  XHTML in email  |   .the next sentence is a lie.   |
| / \ and news | .the previous sentence was true. |



Re: RELENG_6: ACPI-0698: *** Warning: Type override:

2005-11-18 Thread Xin LI

Hi,

On 11/19/05, Ricardo A. Reis [EMAIL PROTECTED] wrote:
 Hi all,


  While updating a proxy server running 5.4, I decided to enable an
 ACPI/SMP kernel to use HTT.
[snip]
Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x400<CNTX-ID>
  Hyperthreading: 2 logical CPUs

JFYI: disabling hyperthreading in BIOS could actually improve your performance.

[snip]
  cpu0 (BSP): APIC ID:  0
  cpu1 (AP): APIC ID:  1
ACPI-0698: *** Warning: Type override - [DEB_] had invalid type
 (Integer) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [MLIB] had invalid type
 (Integer) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [DATA] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [SIO_] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [LEDP] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [GPEN] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [GPST] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [WUES] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [WUSE] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [SBID] had invalid type
 (String) for Scope operator, changed to (Scope)
ACPI-0698: *** Warning: Type override - [SWCE] had invalid type
 (String) for Scope operator, changed to (Scope)

This looks like a bug in either ACPICA or your BIOS.  Would you please
consider downloading a November snapshot of 7.0-CURRENT's disc1 and
trying to boot it on your box to see if the problem persists?  Another
thing worth trying is updating your BIOS to the latest release.

BTW, it might be good to cc this to freebsd-acpi@ as well if you have
any findings.

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net

Re: RELENG_6 vm_fault panic on filesystem mount

2005-11-18 Thread Joerg Pernfuss

On Sat, 19 Nov 2005 01:58:46 +0800
Xin LI [EMAIL PROTECTED] wrote:

 Hi, Joerg,
 
 If this is a production box, my suggestion would be to remove
 anything you do not need for everyday use from the kernel
 configuration; if it is not, my suggestion would be to enable
 INVARIANTS and makeoptions DEBUG=-g to see if something strange happens.
 Possibly some backtraces would help us catch the bug.
 
 Thanks,

Hi,

Interestingly, it doesn't show up in the verbose dmesg, but currently
my kernel is configured with:

makeoptions DEBUG=-g
options ADAPTIVE_GIANT
options MUTEX_DEBUG
options WITNESS
options GDB
options SYSCTL_DEBUG
options DEBUG_MEMGUARD
options KTRACE
options KTRACE_REQUEST_POOL=101
options INVARIANTS
options INVARIANT_SUPPORT
options DIAGNOSTIC
options KDB_STOP_NMI

I also do boot /boot/kernel/kernel.debug. As for the backtraces, I have
a few of them, but they are all identical to the one I sent in with my
first mail. 

I'll add KDB, KDB_TRACE and DDB and look if I get different output.

Joerg

-- 
| /\   ASCII ribbon   |  GnuPG Key ID | c7e4 d91d 64e2 6321 9988 |
| \ / campaign against |0xb248b614 | f27a 4e5b 06ce b248 b614 |
|  XHTML in email  |   .the next sentence is a lie.   |
| / \ and news | .the previous sentence was true. |



Re: Page fault, GEOM problem??

2005-11-18 Thread Johan Ström


Hi!

On 18 nov 2005, at 18.43, Xin LI wrote:


Hi, Johan,

On 11/18/05, Johan Ström [EMAIL PROTECTED] wrote:

On 18 nov 2005, at 10.17, Xin LI wrote:

[snip]

Doesnt look like I got any usable dump devices..
When booting i get

[...]

Loading configuration files.
No suitable dump device was found.
Entropy harvesting:
interrupts
ethernet
point_to_point
kickstart
.
swapon: adding /dev/mirror/gm0s1b as swap device


I see, so both of your SATA disks are in the same mirror group...


Then naturally:
/etc/rc: WARNING: Dump device does not exist.  Savecore not run.

Looked around in the rc-scripts and tried to figure out what it did,
the dumpon script
tries to autolookup a good dump device but finds none..


Unfortunately, kernel dumps currently do not support every device,
for some technical reasons (probably to simplify the crash code so
it does not make more mistakes^Wdamage)


According to the page you linked to, the dumpon command has to be
executed AFTER swapon.. Why is the rc scripts trying to run it before
swapon then?


I guess this is because dumpon can now detect the dump device
automatically, but I'm not quite sure about this.  I will look for the
reason.  I think either the Handbook should be updated, or the code should
be corrected.

What I am very curious about is why dumpon runs BEFORE savecore.  Maybe
I have some misunderstanding...


Sorry, partly my mistake. I think I misunderstood how savecore
works below (when I tried it manually in my last mail).
But the messages above are taken directly from boot; it seems it tries
dumpon before savecore? Relevant boot log from the last boot:



ad0: 2441MB <WDC AC22500L 32.41N35> at ata0-master UDMA33
acd0: CDROM <CD-ROM CDU701-F/1.0q> at ata1-master PIO4
ad6: 286188MB <Maxtor 7L300S0 BANC1G10> at ata3-master SATA150
ad10: 286188MB <Maxtor 7L300S0 BANC1G10> at ata5-master SATA150
GEOM_MIRROR: Device gm0s1 created (id=4118114647).
GEOM_MIRROR: Device gm0s1: provider ad6s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 detected.
GEOM_MIRROR: Device gm0s1: provider ad10s1 activated.
GEOM_MIRROR: Device gm0s1: provider ad6s1 activated.
GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 launched.
Trying to mount root from ufs:/dev/mirror/gm0s1a
Loading configuration files.
dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported
(This DIOCSKERNELDUMP message is probably because I specified dumpdev in
rc.conf, which forced the use of gm0s1b instead of letting the scripts
autodetect.)
Entropy harvesting:
interrupts
ethernet
point_to_point
kickstart
.
swapon: adding /dev/mirror/gm0s1b as swap device
Starting file system checks:
/dev/mirror/gm0s1a: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1a: clean, 213811 free (771 frags, 26630 blocks, 0.3% fragmentation)
/dev/mirror/gm0s1e: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1e: clean, 1012917 free (85 frags, 126604 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1f: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1f: clean, 115955787 free (40747 frags, 14489380 blocks, 0.0% fragmentation)
/dev/mirror/gm0s1d: FILE SYSTEM CLEAN; SKIPPING CHECKS
/dev/mirror/gm0s1d: clean, 1983354 free (4834 frags, 247315 blocks, 0.2% fragmentation)

ifconfig stuff
Starting devd.
Mounting NFS file systems:
.
Creating and/or trimming log files:
.
Starting syslogd.
Checking for core dump on /dev/mirror/gm0s1b...
savecore: no dumps found
Starting named.
rest of boot

So, it seems it does run savecore after running dumpon and mounting  
disks etc... Is that wrong?





Anyway, tried to do dumpon manually on my swap drive:

$ dumpon -v /dev/mirror/gm0s1b
dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported

Didn't work too good..
Also tried savecore manually:

$ savecore /var/crash/ /dev/mirror/gm0s1b
savecore: no dumps found


(This was my mistake; of course there are no dumps to find when none was
written at the time of the crash.)




That didn't work very well either (but that was probably to be expected,
since there were no working dumps).
Google turned up another thread on this list about dumping to gmirror
swap, but it was just a question (whether it is supported) without any
answers. Same error as I got.


It seems that this cannot be worked around easily.  If possible, my
suggestion is that you attach a third disk and create a swap partition
on it for the crash dump.  If this is not feasible, then adding DDB
and KDB may give us a chance to catch the panic, and you can use the
"trace" command at the ddb prompt to obtain a simplified backtrace;
there is a good chance that it would reveal what is happening.

I have cc'ed to Pawel who is very knowledgeable in this area, and
let's see whether he has some better suggestions :-)
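
The dedicated dump disk suggested above could be made persistent with rc.conf settings along these lines. This is a minimal sketch: the device name /dev/ad0s1b is an assumption and should be replaced by the swap partition of whatever plain (non-gmirror) disk is added; dumpdev and dumpdir are the rc.conf(5) variables consulted at boot.

```sh
# /etc/rc.conf fragment (sketch). The device name is an assumption:
# substitute the swap partition of whatever plain disk is added.
dumpdev="/dev/ad0s1b"   # non-mirrored swap partition to receive kernel dumps
dumpdir="/var/crash"    # directory where savecore(8) stores the dump on reboot
```

The directory named in dumpdir needs enough free space to hold a dump roughly the size of physical memory.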


Okay, I just added an old but working 2 GB disk to the system, set it
up as swap, ran swapon on it, and:


[EMAIL PROTECTED]:~$ dumpon -v /dev/ad0s1b
kernel dumps on /dev/ad0s1b

Great! :) So, let's see when/if it dies next time... Before I took it
down to add the dump disk, it had been running fine
for 1d 1h (since the boot after the crash), however probably

Re: RELENG_6 vm_fault panic on filesystem mount

2005-11-18 Thread Robert Watson



On Fri, 18 Nov 2005, Joerg Pernfuss wrote:


#6  0xc078806a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc070c00c in ufsdirhash_build (ip=0xc5426948) at /usr/src/sys/ufs/ufs/ufs_dirhash.c:232
#8  0xc070f5c3 in ufs_lookup (ap=0xeabb6824) at /usr/src/sys/ufs/ufs/ufs_lookup.c:192
#9  0xc070d8f4 in ufs_extattr_lookup (start_dvp=0xc5414dd0, lockparent=0x2, dirname=0xd71f3000 \002, vp=0xd71f3000, td=0xc5558780) at /usr/src/sys/ufs/ufs/ufs_extattr.c:274
#10 0xc070dfd6 in ufs_extattr_autostart (mp=0xc4ca3000, td=0xc5558780) at /usr/src/sys/ufs/ufs/ufs_extattr.c:463
#11 0xc0706fa6 in ffs_mount (mp=0xc4ca3000, td=0xc5558780) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:779
#12 0xc0640d57 in vfs_donmount (td=0xc5558780, fsflags=0x8008, fsoptions=0xeabb6bf4) at /usr/src/sys/kern/vfs_mount.c:739
#13 0xc06427c0 in kernel_mount (ma=0xc5235240, flags=0x0) at pcpu.h:162


The UFS1 extended attribute code performs directory listings, lookups, and 
file operations very early in the life cycle of a UFS file system in 
order to identify attribute backing files.  We could be looking at a bug 
or new negative interaction between the extended attribute code in UFS1, 
dirhash, and the changes to VFS required to get SMP VFS support in 6.x. 
In principle, however, the EA code waits until everything is ready to go 
before starting on file system I/O.  Are you actively using UFS1 
attributes on that file system?  Could I ask you to boot to single user 
mode, try mounting the file system, then try compiling a kernel without 
UFS_EXTATTR and UFS_EXTATTR_AUTOSTART, boot to single user mode, and see 
if you can mount the file system successfully?  I.e., compare mounting 
with and without extended attributes, but on a quiet file system so any 
existing extended attributes remain in sync.


Robert N M Watson

Re: RELENG_6 vm_fault panic on filesystem mount

2005-11-18 Thread Joerg Pernfuss

On Fri, 18 Nov 2005 18:59:34 + (GMT)
Robert Watson [EMAIL PROTECTED] wrote:

 
 On Fri, 18 Nov 2005, Joerg Pernfuss wrote:
 
  #6  0xc078806a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
  #7  0xc070c00c in ufsdirhash_build (ip=0xc5426948) at /usr/src/sys/ufs/ufs/ufs_dirhash.c:232
  #8  0xc070f5c3 in ufs_lookup (ap=0xeabb6824) at /usr/src/sys/ufs/ufs/ufs_lookup.c:192
  #9  0xc070d8f4 in ufs_extattr_lookup (start_dvp=0xc5414dd0, lockparent=0x2, dirname=0xd71f3000 \002, vp=0xd71f3000, td=0xc5558780) at /usr/src/sys/ufs/ufs/ufs_extattr.c:274
  #10 0xc070dfd6 in ufs_extattr_autostart (mp=0xc4ca3000, td=0xc5558780) at /usr/src/sys/ufs/ufs/ufs_extattr.c:463
  #11 0xc0706fa6 in ffs_mount (mp=0xc4ca3000, td=0xc5558780) at /usr/src/sys/ufs/ffs/ffs_vfsops.c:779
  #12 0xc0640d57 in vfs_donmount (td=0xc5558780, fsflags=0x8008, fsoptions=0xeabb6bf4) at /usr/src/sys/kern/vfs_mount.c:739
  #13 0xc06427c0 in kernel_mount (ma=0xc5235240, flags=0x0) at pcpu.h:162
 
 The UFS1 extended attribute code performs directory listings,
 lookups, and file operations very early in the life cycle of a UFS
 file system in order to identify attribute backing files.  We could
 be looking at a bug or new negative interaction between the extended
 attribute code in UFS1, dirhash, and the changes to VFS required to
 get SMP VFS support in 6.x. In principle, however, the EA code waits
 until everything is ready to go before starting on file system
 I/O.  Are you actively using UFS1 attributes on that file system?

No, it is just my nfs exported distfiles collection. Nothing special
I was aware of until recently.

 Could I ask you to boot to single user mode, try mounting the file
 system, then try compiling a kernel without UFS_EXTATTR and
 UFS_EXTATTR_AUTOSTART, boot to single user mode, and see if you can
 mount the file system successfully?  I.e., compare mounting with and
 without extended attributes, but on a quiet file system so any
 existing extended attributes remain in sync.

No problem, the kernel is already building.

I'll post the output as soon as I have it.

Joerg
-- 
| /\   ASCII ribbon   |  GnuPG Key ID | c7e4 d91d 64e2 6321 9988 |
| \ / campaign against |0xb248b614 | f27a 4e5b 06ce b248 b614 |
|  XHTML in email  |   .the next sentence is a lie.   |
| / \ and news | .the previous sentence was true. |



Panic: ad0: WARNING - removed from configuration

2005-11-18 Thread Tom Jensen

Hi
 
I have seen this panic twice. There seems to be no pattern and I have no
idea what triggers it. Is it failing hardware?
 
System: FreeBSD 5.4-STABLE FreeBSD 5.4-STABLE #10: Sat Nov  5 17:14:46 CET
2005
 
db> show msgbuf
msgbufp = 0xc101cfe4
magic = 63062, size = 32740, r= 33501, w = 63892, ptr = 0xc1015000, cksum=
3187082
ad0: WARNING - removed from configuration
swap_pager: I/O error - pagein failed; blkno 5212,size 4096, error 0
vm_fault: pager read error, pid 282 (syslogd)
<6>pid 282 (syslogd), uid 0: exited on signal 11
ata0-master: FAILURE - WRITE_DMA timed out
initiate_write_filepage: already started
swap_pager: I/O error - pagein failed; blkno 6496,size 12288, error 6
vm_fault: pager read error, pid 657 (courierlogger)
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
.
. [snip] 
.
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
initiate_write_filepage: already started
panic: newdirrem: not ATTACHED
KDB: enter: panic
 
db> tr
Tracing pid 490 tid 100088 td 0xc1769c00
kdb_enter(c08404d5) at kdb_enter+0x2b
panic(c0853a7c,c08d2638,c2885460,c2878580,c1d8ece4) at panic+0xbb
newdirrem(c502e680,c1fa3c08,c21d4000,0,ceb92998) at newdirrem+0x163
softdep_setup_directory_change(c502e680,c1fa3c08,c21d4000,d0588,0) at
softdep_setup_directory_change+0x67
ufs_dirrewrite(c1fa3c08,c21d4000,d0588,8,0) at ufs_dirrewrite+0x8d
ufs_rename(ceb92bd8,ceb92cc4,c0684601,ceb92bd8,c1d8ece4) at ufs_rename+0x9a1
ufs_vnoperate(ceb92bd8,c1d8ece4,c203ae10,c15be000,c08e0c80) at
ufs_vnoperate+0x13
kern_rename(c1769c00,8248500,8248900,0,ceb92d30) at kern_rename+0x2e1
rename(c1769c00,ceb92d04,2,672,296) at rename+0x15
syscall(2f,284e002f,bfbf002f,284e6000,0) at syscall+0x2ab
Xint0x80_syscall() at Xint0x80_syscall+0x1f
--- syscall (128, FreeBSD ELF32, rename), eip = 0x2845a22b, esp =
0xbfbfe37c, ebp = 0xbfbfe3d8 ---
db> call doadump
Cannot dump. No dump device defined.
0x25
db> reset
 
Running a savecore gives this, but I'm not sure if it's relevant
 
bash-2.05b# savecore -f /var/crash/ /dev/ad0s1b 
savecore: reboot after panic: kmem_malloc(4096): kmem_map too small:
62877696 total allocated
savecore: writing core to vmcore.38

Thanks

- Tom


[FreeBSD6] kinfo_size mismatch

2005-11-18 Thread Laurent


Hello all,

I've recently upgraded my AMD64 FreeBSD station from 5.4 to 6.0 using 
the handbook method.


Everything seems to work OK, but the following message repeats every 
second on tty1 after I log in to X:


kvm_open : kinfo_proc size mismatch (expected 912 got 1088).

After some searching on the Internet I had a look at /usr/src/sys/sys/user.h 
for the kinfo_proc_size values, but the right value (1088) is already set 
for amd64.


So I don't know how to solve this problem, and I'm quite afraid to touch 
any source code on my system.


%uname -a
FreeBSD wks02.chez.oim 6.0-STABLE Thu Nov 17 19:40:44 CET 2005 amd64

Thanks.

--
Laurent

Re: RELENG_6 vm_fault panic on filesystem mount

2005-11-18 Thread Joerg Pernfuss

On Fri, 18 Nov 2005 18:59:34 + (GMT)
Robert Watson [EMAIL PROTECTED] wrote:

 Could I ask you to boot to single user mode, try mounting the file
 system, then try compiling a kernel without UFS_EXTATTR and
 UFS_EXTATTR_AUTOSTART, boot to single user mode, and see if you can
 mount the file system successfully?  I.e., compare mounting with and
 without extended attributes, but on a quiet file system so any
 existing extended attributes remain in sync.

Alright, I built two kernels, both with makeoptions DEBUG=-g,
ADAPTIVE_GIANT, MUTEX_DEBUG, WITNESS, KDB, KDB_TRACE, DDB, DDB_NUMSYM,
GDB, SYSCTL_DEBUG, DEBUG_MEMGUARD, KTRACE, KTRACE_REQUEST_POOL=101,
INVARIANTS, INVARIANTS_SUPPORT, DIAGNOSTIC, KDB_STOP_NMI.

One of them with UFS_ACLS, UFS_EXTATTR, UFS_EXTATTR_AUTOSTART and the
other without (both with UFS_DIRHASH).

Booting the one without the extattr options, I was able to mount the
filesystem without getting a panic, but encountered a LOR upon
umount.


lock order reversal:
 1st 0xc50416dc vnode interlock (vnode interlock) @
 /usr/src/sys/kern/vfs_subr.c:2430
 2nd 0xc1060144 system map (system map) @ /usr/src/sys/vm/vm_kern.c:295
KDB: stack backtrace:
witness_checkorder(c1060144,9,c081f0f8,127,eaad4990) at 0xc060f8df = 
witness_checkorder+0x5bf
_mtx_lock_flags(c1060144,0,c081f0f8,127,321) at 0xc05da594 =
_mtx_lock_flags+0x54
_vm_map_lock(c10600c0,c081f0f8,127,8,c106c468) at 0xc072a8e5 = 
_vm_map_lock+0x35
kmem_malloc(c10600c0,1000,101,101,8) at 0xc0729cdc = kmem_malloc+0x3c
slab_zalloc(9,c081e180,8a2,c4fae780,c4fae7f8) at 0xc07201d2 =
slab_zalloc+0x82
uma_zone_slab(c106c468,8,c081e180,8a2,0) at 0xc072071e =
uma_zalloc_internal+0x3e
bucket_alloc(c10440a8,0,c081e180,95e,c10440a0) at 0xc0720c09 =
bucket_alloc+0x29
uma_zfree_arg(c1042000,c503121c,0,eaad4ae4,c06e98f8) at 0xc0721d86 =
uma_zfree_arg+0x2d6
mac_labelzone_free(c503121c,c5041660,eaad4b00,c064b9fe,c5041660)
at 0xc06e04d0 = mac_labelzone_free+0x20
mac_destroy_vnode(c5041660,0,c080fb66,2ff,c080fb66) at 0xc06e98f8 =
mac_destroy_vnode+0x18
vdropl(c0849ee0,eaad4b28,c080fb66,8e8,c4f87444) at 0xc064b9fe =
vdropl+0x11e
vflush(c4f87400,0,0,c4fae780,c081ccbc) at 0xc064da58 = vflush+0x378
ffs_flushfiles(c4f87400,0,c4fae780,c070b376,0) at 0xc070a19a =
ffs_flushfiles+0x4a
softdep_flushfiles(c4f87400,0,c4fae780,c4f6ea00,0) at 0xc07096f3 = 
softdep_flushfiles+0x33
ffs_unmount(c4f87400,800,c4fae780,c4fae780,0) at 0xc070b470 =
ffs_unmount+0x40
dounmount(c4f87400,800,c4fae780,37e,42262023) at 0xc0647b97 =
dounmount+0x1e7
unmount(c4fae780,eaad4d04,c0828274,3c6,2) at 0xc0648091 =
unmount+0x211
syscall(3b,3b,3b,804aa92,804d6a1) at 0xc07a40ec = syscall+0x14c
Xint0x80_syscall() at 0xc078e9df = Xint0x80_syscall+0x1f
--- syscall (22, FreeBSD ELF32, unmount), eip = 0x480c1c3f, 
esp = 0xbfbfe40c, ebp = 0xbfbfe4c8 ---


Booting the kernel with the extattr options, the system panic'ed
when I tried to mount the filesystem.


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xd76f7004
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc0712adc
stack pointer           = 0x28:0xe5e10680
frame pointer           = 0x28:0xe5e106d8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor flags         = interrupt enabled, resume, IOPL = 0
current process         = 657 (mount)
[thread pid 657 tid 100055 ]
Stopped at  0xc0712adc = ufsdirhash_build+0x6ac:
movzwl  0x4(%ecx),%ebx
db> trace
Tracing pid 657 tid 100055 td 0xc4c93d80
ufsdirhash_build(c53b38c4,c4c93d80,c1044b48,351,e5e1070c) at 0xc0712adc =
ufsdirhash_build+0x6ac
ufs_lookup(e5e10824,c4f8f000,400,e5e10810,0) at 0xc0716093 = 
ufs_lookup+0xf3
ufs_extattr_lookup(c08204f9,e5e10860,c4c93d80,c4c93d80,c5246800) 
at 0xc07143c4 = ufs_extattr_lookup+0xf4
ufs_extattr_autostart(c4d41400,c4c93d80,c559f000,c558c300,c558c300) 
at 0xc0714aa6 = ufs_extattr_autostart+0x76
ffs_mount(c4d41400,c4c93d80,e5e10acc,2c7,0) at 0xc070da76 = 
ffs_mount+0x2066
vfs_donmount(e5e10bf4,e5e10d04,c4f9cd00,e,c0648ac2) at 0xc0647637 = 
vfs_donmount+0xb07
kernel_mount(c4f48840,0,e5e10c38,6c,bfbfeb96) at 0xc06490a0 = 
kernel_mount+0xb0
ffs_cmount(c4f48840,bfbfddc0,0,c4c93d80,0) at 0xc070a16d = 
ffs_cmount+0x8d
mount(c4c93d80,e5e10d04,c082b2a7,3c6,4) at 0xc0648de5 = mount+0x175
syscall(3b,3b,3b,bfbfddbc,bfbfe854) at 0xc07a6c2c = syscall+0x14c
Xint0x80_syscall() at 0xc079151f = Xint0x80_syscall+0x1f
--- syscall (21, FreeBSD ELF32, mount), eip = 0x480c2c5f, 
esp = 0xbfbfdd9c, ebp = 0xbfbfde48 ---
db> show alllocks
Process 657 (mount) thread 0xc4c93d80 (100055)
exclusive sleep mutex Giant r = 1 (0xc0887780) locked @ 
/usr/src/sys/kern/vfs_lookup.c:197
db> 

The fs was fsck'ed

Re: 4.8 alternate system clock has died error

2005-11-18 Thread Charles Sprickman


On Fri, 18 Nov 2005, Uwe Doering wrote:


Charles Sprickman wrote:

Hello all,

I've been digging through Google for more information on this.  I have a 
4.8 box that's been up for about 430 days.  In the last week or so, top and 
ps have started reporting all CPU usage numbers as zero, and running 
systat -vmstat results in the message The alternate system clock has 
died! Reverting to ``pigs'' display.


I've found instances of this message in the archives for some 3.x users, 
some pre 4.8 users and some 5.3 users.


There were a number of suggestions including a patch if pre-4.8, sending 
init a HUP, and setting the following sysctl mib: kern.timecounter.method: 
1.


I'm already at 4.8-p24, so I did not look into patching anything, and 
HUP'ing init and setting the sysctl mib does not seem to have any effect.


I'm not quite ready to believe that some hardware has actually failed. 
Perhaps due to the long uptime something has rolled over?


We had this once at work, quite a while ago.  The alternate system clock is 
in fact the Real Time Clock (RTC) on the mainboard.  In our case we were 
lucky in that it was just the quartz device that failed due to an improperly 
soldered lead which finally came off.  We fixed the soldering and the problem 
was gone.


Are there any tools to verify that the RTC is working?  I don't exactly 
understand what the RTC is, but would the machine not be suffering some 
other problems if there was an actual hardware failure?  Doesn't the 
system rely on this to time everything from the processors to memory to 
PCI slots and interrupts?


Is there any simple way to figure out if this is hardware or software?

Now, there are of course plenty of other hardware reasons why the RTC can 
fail, even temporarily like in your case.  Perhaps it is really time for a 
new mainboard.


Ouch, that would hurt.  This machine does not have much room for tinkering 
(mail server).


Thanks,

Charles


  Uwe
--
Uwe Doering |  EscapeBox - Managed On-Demand UNIX Servers
[EMAIL PROTECTED]  |  http://www.escapebox.net



Re: Page fault, GEOM problem??

2005-11-18 Thread Michal Mertl

Johan Ström wrote:
 Hi!
 
 On 18 nov 2005, at 18.43, Xin LI wrote:
 
  Hi, Johan,

 large snip

 So, it seems it does run savecore after running dumpon and mounting  
 disks etc... Is that wrong?

No, this is normal. When you run savecore you need to have the
filesystems mounted. In order to mount the filesystems they may have to
be checked. The fsck program requires a large amount of memory to check
larger filesystems, so the swap has to be enabled. Core dumps are written
to the dump device (swap) from the end, whereas the swap is normally used
from the beginning (or the other way around). Therefore there's quite a
good chance that, even when the swap has to be used for fsck, the core
dump is intact and usable. If the use of the swap by fsck corrupts the
core dump, you may boot into single user mode after the next crash and
run the commands manually (without enabling swap).
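
The layout argument above can be written down as a small check. This is
only a rough sketch in Python under simplifying assumptions that are not
from this thread: swap is consumed contiguously from the start of the
device, the dump occupies the tail of the device, and the dump is about
as large as physical memory. The function name and the example sizes are
made up for illustration.

```python
def dump_survives(swap_bytes, dump_bytes, swap_used_bytes):
    """Return True if swap usage from the start never reaches the dump at the end.

    Simplified model (an assumption, not the kernel's real allocator):
    swap pages are handed out from offset 0 upwards, and the dump is
    stored in the last dump_bytes of the device.
    """
    if dump_bytes > swap_bytes:
        return False  # the dump did not fit on the device in the first place
    dump_start = swap_bytes - dump_bytes
    return swap_used_bytes <= dump_start


MIB = 1024 * 1024

if __name__ == "__main__":
    # Hypothetical machine: 2 GiB swap, 512 MiB dump, fsck pushes 300 MiB to swap.
    print(dump_survives(2048 * MIB, 512 * MIB, 300 * MIB))
    # Hypothetical machine: 1 GiB swap, 768 MiB dump, same fsck usage.
    print(dump_survives(1024 * MIB, 768 * MIB, 300 * MIB))
```

In this model the dump stays intact as long as swap is at least the dump
size plus whatever fsck ends up swapping out.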

As to why you can write kernel core dumps only to certain devices: at
the time the kernel is dumping core, it is usually in a pretty bad
state, kernel internals may be corrupted and so on. The dumping code is
therefore written to be quite low level, so that even a wedged kernel
can be dumped. The dumping code is part of the hard disk controller
drivers. gmirror is quite a high-level device and GEOM itself needs a
working scheduler, so there will probably never be a way to dump onto
gmirror-provided swap. When you issue the dumpon command, a check is
performed whether the driver for the disk you want to dump on supports
kernel core dumps.

Michal


Re: Page fault, GEOM problem??

2005-11-18 Thread Parv

in message [EMAIL PROTECTED], wrote Michal
Mertl thusly...

 Johan Ström wrote:
  
  On 18 nov 2005, at 18.43, Xin LI wrote:
...
  So, it seems it does run savecore after running dumpon and
  mounting  disks etc... Is that wrong?
 
 No, this is normal. When you run savecore you need to have mounted
 filesystems. In order to mount the filesystems they may have to be
 checked. The fsck program requires big amount of memory to check
 larger filesystems so the swap has to be enabled. Core dumps are
 written to the dump device (swap) from the end whereas the swap is
 normally used from the beginning (or the other way around).
 Therefore there's quite a big chance that, even when the swap has
 to be used for fsck, the core dump is intact and usable.

Is there any formula to calculate the size of swap to account for
fsck and core dump while assigning swap size (short of having two swap
partitions)?


 If the usage of the swap file by fsck corrupts the core dump you
 may start after next crash in single user mode and run the
 commands manually (without enabling swap).

Is that after kernel (re)boots?  And would the commands to be
executed be savecore followed by swapon?


  - Parv

-- 


Re: Page fault, GEOM problem??

2005-11-18 Thread Michal Mertl

Parv wrote:
 in message [EMAIL PROTECTED], wrote Michal
 Mertl thusly...
 
  Johan Ström wrote:
   
   On 18 nov 2005, at 18.43, Xin LI wrote:
 ...
   So, it seems it does run savecore after running dumpon and
   mounting  disks etc... Is that wrong?
  
  No, this is normal. When you run savecore you need to have mounted
  filesystems. In order to mount the filesystems they may have to be
  checked. The fsck program requires big amount of memory to check
  larger filesystems so the swap has to be enabled. Core dumps are
  written to the dump device (swap) from the end whereas the swap is
  normally used from the beginning (or the other way around).
  Therefore there's quite a big chance that, even when the swap has
  to be used for fsck, the core dump is intact and usable.
 
 Is there any formula to calculate the size of swap to account for
 fsck  core dump while assigning swap size (short of having two swap
 partitions)?

None that I know of. Someone posted some figures about fsck's memory
consumption to one of the FreeBSD mailing lists. I really don't remember
them, but I think it was something like a few MB of memory per quite a
lot of GB of file system space. That is, fsck on normally sized file
systems (at most a couple of hundred GB) doesn't normally consume all of
a normally sized memory (>=256MB) and thus doesn't need to swap.

  If the usage of the swap file by fsck corrupts the core dump you
  may start after next crash in single user mode and run the
  commands manually (without enabling swap).
 
 Is that after kernel (re)boots?  And would the commands to be
 executed be savecore followed by swapon?

If the dump got corrupted by fsck, you would have to wait for another
crash and dump. Then you would reboot into single user mode, repair the
file systems without swap enabled (fsck would crash on the large file
system(s)) and then run savecore. Swapon is then irrelevant; you
probably don't need swap for savecore. After running savecore you can
continue to a normal multi-user boot (exit from the single user shell).

I didn't try all of that but I believe it should work.

Michal


Re: [FreeBSD6] kinfo_size mismatch

2005-11-18 Thread Kris Kennaway

On Fri, Nov 18, 2005 at 10:09:09PM +0100, Laurent wrote:
 Hello all,
 
 I've recently upgraded my AMD64 FreeBSD station from 5.4 to 6.0 using 
 the handbook method.
 
 Everything seems to work OK but I have the message after repeating every 
 second on tty1 after I log on X :
 
 kvm_open : kinfo_proc size mismatch (expected 912 got 1088).
 
 After some search on Internet I gave an eye to /usr/src/sys/sys/user.h 
 for kinfo_proc_size values, but the right value (1088) is already set 
 for amd64.
 
 So I don't know how to solve this problem, and I'm quite afraid to touch 
 some souce code on my system.

This is almost certainly because something on your system needs to be
recompiled.  Did you remember to rebuild all your ports after you
upgraded to 6.0?

Kris



Re: 4.8 alternate system clock has died error

2005-11-18 Thread Uwe Doering


Charles Sprickman wrote:

On Fri, 18 Nov 2005, Uwe Doering wrote:

Charles Sprickman wrote:

I've been digging through Google for more information on this.  I 
have a 4.8 box that's been up for about 430 days.  In the last week 
or so, top and ps have started reporting all CPU usage numbers as 
zero, and running systat -vmstat results in the message The 
alternate system clock has died! Reverting to ``pigs'' display.

[...]


We had this once at work, quite a while ago.  The alternate system 
clock is in fact the Real Time Clock (RTC) on the mainboard.  In our 
case we were lucky in that it was just the quartz device that failed 
due to an improperly soldered lead which finally came off.  We fixed 
the soldering and the problem was gone.


Are there any tools to verify that the RTC is working? 


systat -vmstat will show you the interrupt that it drives.  In our 
case it's irq8, which is in fact labeled rtc.  It is supposed to run 
at 128 Hz.  Under load it can drop to some lower value.  This is normal.
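
As a rough illustration of the check described above, here is a small
Python sketch that turns two samples of the rtc interrupt counter into a
rate and compares it with the 128 Hz figure. Where the counter values
come from (for example the totals printed by vmstat -i), the function
names and the 50% threshold are assumptions made for this example only.

```python
EXPECTED_RTC_HZ = 128  # nominal rate of the rtc interrupt mentioned above


def interrupt_rate(count_before, count_after, interval_seconds):
    """Average interrupts per second between two counter samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (count_after - count_before) / interval_seconds


def rtc_looks_alive(count_before, count_after, interval_seconds, min_fraction=0.5):
    """True if the measured rate is at least min_fraction of 128 Hz.

    min_fraction is a hypothetical tolerance: the rate may drop under
    load, but a dead RTC yields a rate of zero.
    """
    rate = interrupt_rate(count_before, count_after, interval_seconds)
    return rate >= EXPECTED_RTC_HZ * min_fraction


if __name__ == "__main__":
    # Two made-up samples taken 5 seconds apart.
    print(interrupt_rate(1000, 1640, 5.0))
    print(rtc_looks_alive(1000, 1000, 5.0))
```

A counter that does not move at all between the two samples corresponds
to the "alternate system clock has died" situation.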


I don't exactly 
understand what the RTC is, but would the machine not be suffering some 
other problems if there was an actual hardware failure?  Doesn't the 
system rely on this to time everything from the processors to memory to 
PCI slots and interrupts?


No, the RTC drives only the interrupt that is responsible for collecting 
the CPU usage data.  When it fails the CPU usage in top, ps etc. 
just drops to zero, as you've observed, but the server continues to run. 
 If the failure is permanent the machine refuses to boot, though.  At 
least that's what happened in our case.  Apparently the RTC chip is 
essential to the mainboard's boot sequence.  For instance, the initial 
date and time information comes from this chip.


On the other hand, if a reset corrects the problem then the RTC chip 
probably got hung, or there is a problem with the interrupt controller 
it is connected to.  On a properly working mainboard this shouldn't 
happen, of course.



Is there any simple way to figure out if this is hardware or software?


I don't know of any.  However, we have been running FreeBSD since about 
4.0, on various mainboards, UP and SMP, and we've never seen these 
symptoms except in the one case mentioned above.  So I suppose it's not 
a kernel bug.  I haven't looked at the PR database, though.


   Uwe
--
Uwe Doering |  EscapeBox - Managed On-Demand UNIX Servers
[EMAIL PROTECTED]  |  http://www.escapebox.net

Re: Page fault, GEOM problem??

2005-11-18 Thread Johan Ström



On 18 nov 2005, at 23.39, Michal Mertl wrote:


Johan Ström wrote:

Hi!

On 18 nov 2005, at 18.43, Xin LI wrote:


Hi, Johan,


 large snip


So, it seems it does run savecore after running dumpon and mounting
disks etc... Is that wrong?


No, this is normal. When you run savecore you need to have mounted
filesystems. In order to mount the filesystems they may have to be
checked. The fsck program requires big amount of memory to check larger
filesystems so the swap has to be enabled. Core dumps are written to the
dump device (swap) from the end whereas the swap is normally used from
the beginning (or the other way around). Therefore there's quite a big
chance that, even when the swap has to be used for fsck, the core dump
is intact and usable. If the usage of the swap file by fsck corrupts the
core dump you may start after next crash in single user mode and run the
commands manually (without enabling swap).

As to why you can write kernel core dumps only to certain devices the
answer is that at the time, when the kernel is dumping core, it is
usually in pretty bad state, kernel internals may be corrupted and so
on. The dumping code is therefore written to be quite low level so that
even wedged kernel can be dumped. The dumping code is part of hard disk
controller's drivers. The gmirror is quite high-level device and geom
itself needs working scheduler so there will probably never be a way to
dump on gmirror provided swap. When you issue the dumpon command the
check is performed whether the driver for the disk you want to dump on
supports kernel core dumps.

Michal


Well that makes sense... Then that is right at least.. :)

I just noticed another thing... My disk performance... sucks! :P

Some examples (from an otherwise unloaded system):

[EMAIL PROTECTED]:/home/johan$ time dd if=/dev/zero of=bigfile.zero bs=1024 count=1000000
1000000+0 records in
1000000+0 records out
1024000000 bytes transferred in 77.014797 secs (13296146 bytes/sec)

real    1m17.100s
user    0m0.244s
sys     0m10.140s

13MB/s from /dev/zero?? This was to my home dir (gm0s1f, the last label
on the slice/disk).
When I try to open a new window in screen (ctrl-a c) while the above dd
is running, it takes forever (or rather, bash takes forever) to
initialize.

Well, iostat during dd:

[EMAIL PROTECTED]:~$ iostat
      tty             ad0              ad6             ad10             cpu
 tin tout  KB/t tps  MB/s   KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
   0  164  2.19   0  0.00  50.52   3  0.17  50.99   3  0.17   1  0  1  1 97



0.17MB/s?? Am I misreading these iostat numbers or something?
The load averages directly after the dd completes are 0.36, 0.15, 0.05,
so the dd doesn't put enough load on the machine to make bash that
slow. It's got to be something else...



Running diskinfo -t gives me good values (for /dev/ad6 and /dev/ad10)

Transfer rates:
        outside:       102400 kbytes in   1.846578 sec =    55454 kbytes/sec
        middle:        102400 kbytes in   1.879855 sec =    54472 kbytes/sec
        inside:        102400 kbytes in   3.147158 sec =    32537 kbytes/sec


So it shouldn't be the disk itself. Those values are the same as when I
had the disk in the temp system. However, I never tried any dd speed
tests there.
By the way, I tried a regular cp on a directory tree of a few
gigabytes: same slow speed.


Maybe my custom kernel is messed up or something? It's just a GENERIC
with some unused device drivers removed, so that would be strange...

I'll recompile during the night and test GENERIC tomorrow, then report
back.

I did try moving the cards (network/VGA/SATA) around in the PCI slots,
in case there were any strange conflicts... No difference, except that I
have only had one txerror from xl since the last boot (wooh!)


No crash so far.

--
Johan


Re: Page fault, GEOM problem??

2005-11-18 Thread Pawel Jakub Dawidek

On Sat, Nov 19, 2005 at 01:55:57AM +0100, Johan Ström wrote:
+ 
+ On 18 nov 2005, at 23.39, Michal Mertl wrote:
+ 
+ Johan Ström wrote:
+ Hi!
+ 
+ On 18 nov 2005, at 18.43, Xin LI wrote:
+ 
+ Hi, Johan,
+ 
+  large snip
+ 
+ So, it seems it does run savecore after running dumpon and mounting
+ disks etc... Is that wrong?
+ 
+ No, this is normal. When you run savecore you need to have mounted
+ filesystems. In order to mount the filesystems they may have to be
+ checked. The fsck program requires big amount of memory to check larger
+ filesystems so the swap has to be enabled. Core dumps are written to the
+ dump device (swap) from the end whereas the swap is normally used from
+ the beginning (or the other way around). Therefore there's quite a big
+ chance that, even when the swap has to be used for fsck, the core dump
+ is intact and usable. If the usage of the swap file by fsck corrupts the
+ core dump you may start after next crash in single user mode and run the
+ commands manually (without enabling swap).
+ 
+ As to why you can write kernel core dumps only to certain devices the
+ answer is that at the time, when the kernel is dumping core, it is
+ usually in pretty bad state, kernel internals may be corrupted and so
+ on. The dumping code is therefore written to be quite low level so that
+ even wedged kernel can be dumped. The dumping code is part of hard disk
+ controller's drivers. The gmirror is quite high-level device and geom
+ itself needs working scheduler so there will probably never be a way to
+ dump on gmirror provided swap. When you issue the dumpon command the
+ check is performed whether the driver for the disk you want to dump on
+ supports kernel core dumps.
+ 
+ Michal
+ 
+ Well that makes sense... Then that is right at least.. :)
+ 
+ I just noticed another thing... My disk performance... sucks! :P
+ 
+ Some examples (from an otherwise unloaded system):
+ 
+ [EMAIL PROTECTED]:/home/johan$ time dd if=/dev/zero of=bigfile.zero bs=1024 count=1000000
+ 1000000+0 records in
+ 1000000+0 records out
+ 1024000000 bytes transferred in 77.014797 secs (13296146 bytes/sec)

You won't get more with such small block size. Try bs=128k.
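
To see why the block size matters, a rough Python sketch along these
lines may help. It is not dd, only an illustration: it writes the same
amount of zeros with one write() call per block and counts the calls.
The function name, the 16 MB test size and the temporary file are
assumptions made for the example; actual timings depend on the disk and
filesystem, so no numbers are claimed here.

```python
import os
import tempfile
import time


def write_zeros(path, block_size, total_bytes):
    """Write total_bytes of zeros in block_size chunks.

    Returns (number of write() calls, elapsed seconds). The file is
    opened unbuffered so each block becomes its own write() call,
    which is roughly what dd does with bs=block_size.
    """
    block = bytes(block_size)
    calls = 0
    remaining = total_bytes
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        while remaining > 0:
            chunk = block if remaining >= block_size else block[:remaining]
            written = f.write(chunk)  # may be short in theory; account for it
            remaining -= written
            calls += 1
    return calls, time.perf_counter() - start


if __name__ == "__main__":
    total = 16 * 1024 * 1024  # small test size so the demo finishes quickly
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        for bs in (1024, 128 * 1024):
            calls, secs = write_zeros(path, bs, total)
            print(f"bs={bs}: {calls} write() calls, {total / secs / 1e6:.1f} MB/s")
    finally:
        os.remove(path)
```

With bs=1024 every kilobyte costs a system call, so the per-call
overhead is paid 128 times more often than with bs=128k.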

-- 
Pawel Jakub Dawidek   http://www.wheel.pl
[EMAIL PROTECTED]   http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!



Re: PERFMON with Athlon breaks ata.

2005-11-18 Thread Stephen Hurd


Joseph Koshy wrote:


A custom kernel compiled with options PERFMON and cputype=athlon when ran on
an athlon causes ATA to not probe devices on FreeBSD 6.0-RELEASE.  dmesg
seems normal except the ata probe is never done, so the only boot devices
available is the floppy.
   



AFAIR, PERFMON only supports Pentium and Pentium Pro CPUs.

 

Yeah, wasn't sure... I know the Athlon has a performance counter, but 
wasn't sure if it was supported etc.  Assumed not, but just thought I'd 
mention it in case it's supposed to work.


Re: PERFMON with Athlon breaks ata.

2005-11-18 Thread Joseph Koshy

jk> AFAIR, PERFMON only supports Pentium and Pentium Pro CPUs.

sh> Yeah, wasn't sure... I know the Athlon has a performance
sh> counter, but wasn't sure if it was supported etc.  Assumed
sh> not, but just thought I'd mention it in case it's supposed to
sh> work.

The Athlon's performance counters are supported by the
hwpmc(4) driver in FreeBSD 6.x.  See pmcstat(8) for a simple
command-line tool and libpmc(3) for an API to use them.

--
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy

Few questions.

2005-11-18 Thread Its Azfar

I want to move my servers from Linux to FreeBSD, and I
need some information regarding FreeBSD
compatibility.

1. What is the current status of FreeBSD compatibility
with Java?
2. What is the current status of FreeBSD compatibility
with MySQL?
3. What is the current status of FreeBSD compatibility
with SMP and threading?

I am asking with respect to the FreeBSD 5.4 or 6.0
releases.

These conflicts are keeping me from making a decision.
I am looking for a detailed response.
Thanks.





RE: Few questions.

2005-11-18 Thread Darren Pilgrim

Its Azfar [EMAIL PROTECTED] wrote:
 
 I want to move my servers on freebsd from linux and I
 need few information regarding freebsd
 compatibilities.
 
 1. What is the current status of freebsd compatibility
 with Java.

Multiple versions of Sun's JDK are available in the java/ directory of the
Ports Collection.  The FreeBSD Java Project, which is responsible for
porting Java to FreeBSD, can be found at http://www.freebsd.org/java/.

 2. What is the current status of freebsd compatibility
 with MySQL.

MySQL runs very well on FreeBSD.  Multiple MySQL server and client versions
are available in the databases/ directory of the Ports Collection.  There is
no authoritative site for MySQL on FreeBSD to my knowledge, but installation
is very straightforward.

 3. What is the current status of freebsd compatibility
 with SMP and Threading.

SMP with FreeBSD is extremely stable with both 5.x and 6.x.  Both schedulers
are good choices in the latest 5.x and 6.x, but I would recommend using the
4BSD scheduler over the ULE scheduler in a production environment.

There is an implementation of LinuxThreads available for FreeBSD.  You can
get it by installing the devel/linuxthreads port.  The MySQL ports have
options to let you build them with linuxthreads support.



