Re: reason for "magic" crashes.

2012-06-25 Thread Wojciech Puchar

First, thanks for all the suggestions.

Now the (not really) funny part begins.
After turning on all these checks in the kernel, the system crashes repeatably,
even before fully completing the boot sequence.


Before that, I got tons of warnings about lock order reversals.

One seems strange; here it is:

lock order reversal:
 1st 0xff80f5c4ba80 bufwait (bufwait) @ kern/vfs_bio.c:2636
 2nd 0xff0005cb0600 dirhash (dirhash) @ ufs/ufs/ufs_dirhash.c:285
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x80a
_sx_xlock() at _sx_xlock+0x5d
ufsdirhash_acquire() at ufsdirhash_acquire+0x33
ufsdirhash_remove() at ufsdirhash_remove+0x16
ufs_dirremove() at ufs_dirremove+0x181
ufs_remove() at ufs_remove+0x85
VOP_REMOVE_APV() at VOP_REMOVE_APV+0x93
kern_unlinkat() at kern_unlinkat+0x211
amd64_syscall() at amd64_syscall+0x2e0
Xfast_syscall() at Xfast_syscall+0xfc
--- syscall (10, FreeBSD ELF64, unlink), rip = 0xeede070c, rsp = 0x7fffdb08, rbp = 0x7fffef58 ---



This appears every now and then when files are deleted.

The rest seems to be a problem with third-party kernel add-ons that are not
really good and not in sync with the kernel.


After fixing that part I will post again.

Still, what may be the cause of these messages?
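
WITNESS prints a report like the one above when it sees two locks taken in the
opposite order to one it recorded earlier. When the reports arrive by the ton,
it can help to reduce them to the distinct lock pairs before asking which ones
matter. The following is only a minimal sketch: the function name lor_summary
is made up for illustration, and it assumes the two-line "1st ... / 2nd ..."
layout shown in the report above.

```shell
#!/bin/sh
# lor_summary: read kernel messages on stdin and print one line per
# distinct (1st lock -> 2nd lock) pair, prefixed by how often it occurred.
# Assumption: each report has the "1st"/"2nd" lines in the layout shown above.
lor_summary() {
    awk '
        # Drop the leading "1st"/"2nd" tag and the lock address, keep the rest.
        function strip(s) {
            sub(/^[ \t]*(1st|2nd)[ \t]+0x[0-9a-fA-F]+[ \t]+/, "", s)
            return s
        }
        /lock order reversal:/ { first = ""; next }
        /^[ \t]*1st[ \t]/      { first = strip($0); next }
        /^[ \t]*2nd[ \t]/ && first != "" {
            pairs[first " -> " strip($0)]++
            first = ""
        }
        END { for (p in pairs) printf "%d\t%s\n", pairs[p], p }
    ' | sort -rn
}
```

A possible invocation is "dmesg | lor_summary". Lines carrying a syslog prefix,
as in /var/log/messages, would not match these patterns without adjustment.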
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscr...@freebsd.org"


Re: reason for "magic" crashes.

2012-06-24 Thread Wojciech Puchar


> Are you asking about overhead of DDB or all debug options?

All of them: INVARIANTS, WITNESS, etc.

> I don't think DDB support accounts for any slowdown.
>
> As for the rest, it's hard to say. I guess it depends on your workload;
> I have never run any benchmarks to compare this, and I'm unaware of
> any.
>
> In other words, you have to test it yourself.

We will see.


Re: reason for "magic" crashes.

2012-06-24 Thread Mateusz Guzik
On Sun, Jun 24, 2012 at 08:50:41PM +0200, Wojciech Puchar wrote:
> >>There is nothing in cron that runs only on Sunday.
> >>
> >>I don't run "periodic" stuff in /etc/crontab.
> >>
> >
> >Compile the kernel with the following:
> >
> >makeoptions DEBUG="-O0 -g"
> >
> >options KDB                 # Enable kernel debugger support.
> >options DDB                 # Support DDB.
> >options GDB                 # Support remote GDB.
> >options DEADLKRES           # Enable the deadlock resolver
> >options INVARIANTS          # Enable calls of extra sanity checking
> >options INVARIANT_SUPPORT   # Extra sanity checks of internal structures, required by INVARIANTS
> >options WITNESS             # Enable checks to detect deadlocks and cycles
> >options WITNESS_SKIPSPIN    # Don't run witness on spinlocks for speed
> >options DIAGNOSTIC
> >
> >After a kernel panic, the ddb prompt will be waiting for you. Type in:
> >dump
> >reset
> >
> >Make sure you have swap that can handle crashdumps.
>
> I already did this part and the debug part, but not DDB. As you can see,
> it hangs rather than producing a crash dump.
>
> How much would it slow the whole thing down?
>
> If it is less than a factor of 2, that is acceptable: the CPUs are rarely
> even half loaded.

Are you asking about overhead of DDB or all debug options?

I don't think DDB support accounts for any slowdown.

As for the rest, it's hard to say. I guess it depends on your workload;
I have never run any benchmarks to compare this, and I'm unaware of
any.

In other words, you have to test it yourself.

-- 
Mateusz Guzik 


Re: reason for "magic" crashes.

2012-06-24 Thread Wojciech Puchar


> Have you proven beyond reasonable doubt that there is no filesystem
> corruption or silent filesystem corruption due to bad hardware?

After the last crash, fsck_ffs found nothing suggesting such a case.

Actually, the only change I made to this system (which ran flawlessly for
close to two years) was upgrading to the latest 8.* from sources less than a
month ago.


But still, even assuming that the system update introduced a bug, I cannot
understand why it happens on Sunday, when the system is least loaded. At that
time there are only rare WWW visits and a few mails, and none of the load
present on workdays.


Since the last crash I have inserted a pendrive and set it as the dump device
(FreeBSD doesn't seem to crashdump reliably to gmirrored+geli devices, and I
don't have others available).


But as you can see, it halted with no crash dump.


Re: reason for "magic" crashes.

2012-06-24 Thread Wojciech Puchar

>> I've had a third crash, the third week in a row.
>>
>> Every time on Sunday after 18:00, every time with the rsync process (which
>> means the rsync-based backup that runs every day, not just on Sunday!).
>
> Is it the same rsync every day, including Sundays, or is the Sunday rsync
> different?

The funny part is that it is exactly the same.

> Perhaps some part of the filesystem is corrupted or the hard disk has a
> damaged zone, and the Sunday rsync is the only one that backs up or touches
> that part.

A full fsck_ffs was done a week ago.


Re: reason for "magic" crashes.

2012-06-24 Thread Wojciech Puchar



>> There is nothing in cron that runs only on Sunday.
>>
>> I don't run "periodic" stuff in /etc/crontab.
>
> Compile the kernel with the following:
>
> makeoptions DEBUG="-O0 -g"
>
> options KDB                 # Enable kernel debugger support.
> options DDB                 # Support DDB.
> options GDB                 # Support remote GDB.
> options DEADLKRES           # Enable the deadlock resolver
> options INVARIANTS          # Enable calls of extra sanity checking
> options INVARIANT_SUPPORT   # Extra sanity checks of internal structures, required by INVARIANTS
> options WITNESS             # Enable checks to detect deadlocks and cycles
> options WITNESS_SKIPSPIN    # Don't run witness on spinlocks for speed
> options DIAGNOSTIC
>
> After a kernel panic, the ddb prompt will be waiting for you. Type in:
> dump
> reset
>
> Make sure you have swap that can handle crashdumps.

I already did this part and the debug part, but not DDB. As you can see, it
hangs rather than producing a crash dump.


How much would it slow the whole thing down?

If it is less than a factor of 2, that is acceptable: the CPUs are rarely even
half loaded.


Re: reason for "magic" crashes.

2012-06-24 Thread Vincent Hoffman
On 24/06/2012 18:05, Wojciech Puchar wrote:
> I've had a third crash, the third week in a row.
>
> Every time on Sunday after 18:00, every time with the rsync process (which
> means the rsync-based backup that runs every day, not just on Sunday!).
>
> You can see the crash (viewed from KVM) at
>
> http://www.tensor.gdynia.pl/~wojtek/crash.png
>
> What is important: syncing disks doesn't proceed; the system hangs here.
>
> I am 99% sure the system is not overheating on Sunday, but I will be 100% sure,
> as I added ipmitool sensor logging from cron every 5 minutes.
>
> Please give me an idea of what to check.
From the FAQ:
http://www.freebsd.org/doc/en/books/faq/troubleshoot.html#TRAP-12-PANIC
and
http://www.freebsd.org/doc/en/books/faq/advanced.html#KERNEL-PANIC-TROUBLESHOOTING

Hope that helps.

Vince
>
>
> There is nothing in cron that runs only on Sunday.
>
> I don't run "periodic" stuff in /etc/crontab.
>


Re: reason for "magic" crashes.

2012-06-24 Thread Mark Felder
Have you proven beyond reasonable doubt that there is no filesystem  
corruption or silent filesystem corruption due to bad hardware?



Re: reason for "magic" crashes.

2012-06-24 Thread Mateusz Guzik
On Sun, Jun 24, 2012 at 07:05:35PM +0200, Wojciech Puchar wrote:
> I've had a third crash, the third week in a row.
> 
> Every time on Sunday after 18:00, every time with the rsync process
> (which means the rsync-based backup that runs every day, not just on
> Sunday!).
> 
> You can see the crash (viewed from KVM) at
> 
> http://www.tensor.gdynia.pl/~wojtek/crash.png
> 
> What is important: syncing disks doesn't proceed; the system hangs here.
> 
> I am 99% sure the system is not overheating on Sunday, but I will be 100% sure,
> as I added ipmitool sensor logging from cron every 5 minutes.
> 
> Please give me an idea of what to check.
> 
> 
> There is nothing in cron that runs only on Sunday.
> 
> I don't run "periodic" stuff in /etc/crontab.
> 

Compile the kernel with the following:

makeoptions DEBUG="-O0 -g"

options KDB                 # Enable kernel debugger support.
options DDB                 # Support DDB.
options GDB                 # Support remote GDB.
options DEADLKRES           # Enable the deadlock resolver
options INVARIANTS          # Enable calls of extra sanity checking
options INVARIANT_SUPPORT   # Extra sanity checks of internal structures, required by INVARIANTS
options WITNESS             # Enable checks to detect deadlocks and cycles
options WITNESS_SKIPSPIN    # Don't run witness on spinlocks for speed
options DIAGNOSTIC

After a kernel panic, the ddb prompt will be waiting for you. Type in:
dump
reset

Make sure you have swap that can handle crashdumps.

See this for more details:
http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html

You can check whether everything works correctly by triggering a panic manually:
sysctl debug.kdb.panic=1

and then typing the aforementioned ddb commands. After the reboot you should
get a core in /var/crash.
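
The two checks implied here (is the dump device large enough, and did a core
actually appear after the reboot) can be sketched as small shell helpers. This
is only an illustration under stated assumptions: the function names dump_fits
and has_core are made up, and comparing against the full physical memory size
is deliberately conservative, since a minidump is usually much smaller.

```shell
#!/bin/sh
# dump_fits PHYSMEM_BYTES DEVICE_BYTES
# Prints "ok" if the dump device is at least as large as physical memory,
# otherwise "too small". Conservative: assumes a full-memory dump.
dump_fits() {
    if [ "$2" -ge "$1" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}

# has_core DIR
# Succeeds if DIR holds at least one vmcore.* file, which is the name
# savecore(8) gives to a saved kernel dump.
has_core() {
    for f in "$1"/vmcore.*; do
        [ -e "$f" ] && return 0
    done
    return 1
}
```

On FreeBSD the first argument could come from "sysctl -n hw.physmem" and the
second from the media size in bytes reported by diskinfo(8) for the dump
device. After the test panic and reboot, "has_core /var/crash" should succeed.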

Also provide the following:
- system version
- filesystems involved in rsync with mount details (e.g. UFS with SU+J)
- dmesg

Hopefully this will be enough for someone to help.

-- 
Mateusz Guzik 


Re: reason for "magic" crashes.

2012-06-24 Thread Eduardo Morras

At 19:05 24/06/2012, Wojciech Puchar wrote:

> I've had a third crash, the third week in a row.
>
> Every time on Sunday after 18:00, every time with the rsync process
> (which means the rsync-based backup that runs every day, not just on Sunday!).


Is it the same rsync every day, including Sundays, or is the Sunday rsync
different? Perhaps some part of the filesystem is corrupted or the hard disk
has a damaged zone, and the Sunday rsync is the only one that backs up or
touches that part.






Re: reason for "magic" crashes.

2012-06-24 Thread Wojciech Puchar


> [1]
> http://freebsd.1045724.n5.nabble.com/Replacing-rc-8-Was-FreeBSD-Boot-Times-td5718636.html


This <2 minute boot time that will follow doesn't matter, as it doesn't
crash every now and then; it is nothing compared to the fact that you have to
travel there.


>> Please give me an idea of what to check.
>>
>> There is nothing in cron that runs only on Sunday.
>>
>> I don't run "periodic" stuff in /etc/crontab.


Any ideas that could help?



Re: reason for "magic" crashes.

2012-06-24 Thread Fernando Apesteguía
On Sun, Jun 24, 2012 at 7:05 PM, Wojciech Puchar
 wrote:
> I've had a third crash, the third week in a row.


From you 5 days ago [1]:

"it is unimportant as FreeBSD don't crash."
Man, I really don't understand a thing...

[1] 
http://freebsd.1045724.n5.nabble.com/Replacing-rc-8-Was-FreeBSD-Boot-Times-td5718636.html


>
> Every time on Sunday after 18:00, every time with the rsync process (which means
> the rsync-based backup that runs every day, not just on Sunday!).
>
> You can see the crash (viewed from KVM) at
>
> http://www.tensor.gdynia.pl/~wojtek/crash.png
>
> What is important: syncing disks doesn't proceed; the system hangs here.
>
> I am 99% sure the system is not overheating on Sunday, but I will be 100% sure,
> as I added ipmitool sensor logging from cron every 5 minutes.
>
> Please give me an idea of what to check.
>
>
> There is nothing in cron that runs only on Sunday.
>
> I don't run "periodic" stuff in /etc/crontab.
>


reason for "magic" crashes.

2012-06-24 Thread Wojciech Puchar

I've had a third crash, the third week in a row.

Every time on Sunday after 18:00, every time with the rsync process (which
means the rsync-based backup that runs every day, not just on Sunday!).


You can see the crash (viewed from KVM) at

http://www.tensor.gdynia.pl/~wojtek/crash.png

What is important: syncing disks doesn't proceed; the system hangs here.

I am 99% sure the system is not overheating on Sunday, but I will be 100% sure,
as I added ipmitool sensor logging from cron every 5 minutes.
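
For such a log to be comparable against the time of a crash, each reading
needs a timestamp. A minimal sketch of one way to do that follows; the helper
name stamp_lines and the log path mentioned afterwards are assumptions, not
the setup actually used on this machine.

```shell
#!/bin/sh
# stamp_lines: copy stdin to stdout, prefixing every line with a single
# UTC timestamp taken when the function starts.
stamp_lines() {
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    while IFS= read -r line; do
        printf '%s %s\n' "$ts" "$line"
    done
}
```

Run from cron on a "*/5 * * * *" schedule, the pipeline would look like
"ipmitool sensor | stamp_lines >> /var/log/ipmi-sensors.log", with the helper
kept in a script file so that no "%" characters appear in the crontab line.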


Please give me an idea of what to check.


There is nothing in cron that runs only on Sunday.

I don't run "periodic" stuff in /etc/crontab.
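
The claim about cron can be double-checked mechanically by listing crontab
entries whose day-of-week field names Sunday. The sketch below is only an
illustration: the function name sunday_jobs is made up, and it matches plain
values and comma lists (0, 7, sun) but not ranges such as 5-7 or the "*" wildcard.

```shell
#!/bin/sh
# sunday_jobs: read crontab text on stdin and print the entries whose
# fifth (day-of-week) field lists Sunday explicitly as 0, 7 or "sun".
# Limitation: ranges (e.g. 5-7) and "*" are not treated as matches.
sunday_jobs() {
    awk '
        /^[ \t]*#/ { next }    # skip comments
        NF < 6     { next }    # skip blank lines and VAR=value settings
        {
            n = split(tolower($5), days, ",")
            for (i = 1; i <= n; i++)
                if (days[i] == "0" || days[i] == "7" || days[i] == "sun") {
                    print
                    break
                }
        }
    '
}
```

Typical use would be "sunday_jobs < /etc/crontab", and the same filter works on
the output of "crontab -l" for individual users.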
