Re: 7.2-release/amd64: panic, spin lock held too long

2009-07-16 Thread C. C. Tang

Attilio Rao wrote:

2009/7/8 Dan Naumov dan.nau...@gmail.com:

On Wed, Jul 8, 2009 at 3:57 AM, Dan Naumov dan.nau...@gmail.com wrote:

On Tue, Jul 7, 2009 at 4:27 AM, Attilio Rao atti...@freebsd.org wrote:

2009/7/7 Dan Naumov dan.nau...@gmail.com:

On Tue, Jul 7, 2009 at 4:18 AM, Attilio Rao atti...@freebsd.org wrote:

2009/7/7 Dan Naumov dan.nau...@gmail.com:

I just got a panic followed by a reboot a few seconds after running
portsnap update. /var/log/messages shows the following:

Jul  7 03:49:38 atom syslogd: kernel boot file is /boot/kernel/kernel
Jul  7 03:49:38 atom kernel: spin lock 0x80b3edc0 (sched lock
1) held by 0xff00017d8370 (tid 100054) too long
Jul  7 03:49:38 atom kernel: panic: spin lock held too long

That's a known bug, affecting -CURRENT as well.
The cpustop IPI is handled through an NMI, which means it can
interrupt a CPU at any moment, even while it is holding a spinlock,
violating one well-known FreeBSD rule.
That means the CPU can stop itself while its thread is holding
the sched lock spinlock, without releasing it (there is no way, short
of something highly hackish, to fix that).
In the meantime hardclock() wants to schedule something else to run and
gets stuck on the thread lock.

The ideal fix would involve not using an NMI to serve the cpustop, while
having a cheap way (one that does not make the common path too expensive)
to tell hardclock() to avoid scheduling while a cpustop is in flight.
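
To make the failure mode described above concrete, here is a toy user-space model in Python. It is only an illustration, not FreeBSD kernel code: the names ToySpinLock, SpinTimeout and demo, and the max_spins bound, are all made up for this sketch, and threads stand in for CPUs. One "CPU" takes the lock and is then frozen before it can release it, while the other spins a bounded number of times and gives up, which is roughly the situation the panic message reports.

```python
import threading
import time


class SpinTimeout(Exception):
    """Raised when a spinner gives up; stands in for the kernel panic."""


class ToySpinLock:
    """Hypothetical spin lock with a bounded number of acquisition attempts."""

    def __init__(self):
        # threading.Lock is used only as an atomic test-and-set flag.
        self._flag = threading.Lock()
        self.owner = None

    def acquire(self, who, max_spins):
        spins = 0
        while not self._flag.acquire(blocking=False):
            spins += 1
            if spins > max_spins:
                raise SpinTimeout(f"spin lock held by {self.owner} too long")
            time.sleep(0)  # give other threads a chance to run
        self.owner = who

    def release(self):
        self.owner = None
        self._flag.release()


def demo(max_spins=1000):
    lock = ToySpinLock()
    holding = threading.Event()  # set once "cpu1" owns the lock
    resume = threading.Event()   # set to let the frozen "cpu1" continue

    def cpu1():
        lock.acquire("cpu1", max_spins)
        holding.set()
        resume.wait()            # frozen while still holding the lock
        lock.release()

    worker = threading.Thread(target=cpu1)
    worker.start()
    holding.wait()
    try:
        # "cpu0" now wants the same lock, as hardclock() does in the report.
        lock.acquire("cpu0", max_spins)
        lock.release()
        outcome = "acquired"
    except SpinTimeout as exc:
        outcome = str(exc)
    finally:
        resume.set()
        worker.join()
    return outcome


if __name__ == "__main__":
    print(demo())
```

In this model the second thread can never succeed, because the holder is only resumed after the spinner has already given up. That matches the point above that the stopped CPU does not release the lock.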

Thanks,
Attilio

Any idea if a fix is being worked on and how unlucky must one be to
run into this issue, should I expect it to happen again? Is it
basically completely random?

I'd like to work on that issue before BETA3 (and backport to
STABLE_7), I'm just time-constrained right now.
And yes, it is completely random.

Thanks,
Attilio

Ok, this is getting pretty bad, 23 hours later, I get the same kind of
panic, the only difference is that instead of portsnap update, this
was triggered by portsnap cron which I have running between 3 and 4
am every day:

Jul  8 03:03:49 atom kernel: ssppiinn  lloocckk
00xx8800bb33eeddc400  ((sscchheedd  lloocck k1 )0 )h
ehledl db yb y 0x0xfff0f1081735339760e 0( t(itdi d
1016070)5 )t otoo ol olnogng
Jul  8 03:03:49 atom kernel: p
Jul  8 03:03:49 atom kernel: anic: spin lock held too long
Jul  8 03:03:49 atom kernel: cpuid = 0
Jul  8 03:03:49 atom kernel: Uptime: 23h2m38s

I have now tried reproducing the problem by running stress --cpu 8 --io
8 --vm 4 --vm-bytes 1024M --timeout 600s --verbose, which pushed
system load into the 15.50 ballpark, while simultaneously running
portsnap fetch and portsnap update, but I couldn't manually trigger
the panic. It seems that this problem is indeed random (although it
baffles me why it is specifically portsnap triggering it). I have now
disabled powerd to check whether that makes any difference to system
stability.


But is that happening at reboot time?

Thanks,
Attilio



I think I am also having a similar problem on my Atom machine
(FreeBSD-7.2-Release-p1).

It does not happen at boot/reboot, but the machine panics randomly.
I have found that it has remained stable for more than a month now, since I
disabled powerd... (although I would like to have it enabled)


--
C.C.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: 7.2-release/amd64: panic, spin lock held too long

2009-07-16 Thread Dan Naumov
 But is that happening at reboot time?

 Thanks,
 Attilio


 I think I am also having a similar problem on my Atom machine
 (FreeBSD-7.2-Release-p1).
 It does not happen at boot/reboot, but the machine panics randomly.
 I have found that it has remained stable for more than a month now, since I
 disabled powerd... (although I would like to have it enabled)

I hope you can get some crash dumps for the developers to look at.
Attilio was trying to help me, but sadly the machine had to be put into
active use, so I could no longer play with FreeBSD due to the unsolved
instability.

- Sincerely,
Dan Naumov


coretemp(4) build broken in recent STABLE?

2009-07-16 Thread Marat N.Afanasyev
I have 7.2-STABLE csupped as of midnight today, and while trying to build the kernel
I get the following error:


=== coretemp (all)
cc -O2 -fno-strict-aliasing -pipe -O2 -pipe -msse2 -msse3 -m3dnow 
-march=athlon64  -D_KERNEL -DKLD_MODULE -std=c99 -nostdinc 
-DHAVE_KERNEL_OPTION_HEADERS -include 
/usr/obj/usr/src/sys/ZEALOT/opt_global.h -I. -I@ -I@/contrib/altq 
-finline-limit=8000 --param inline-unit-growth=100 --param 
large-function-growth=1000 -fno-common -g -fno-omit-frame-pointer 
-I/usr/obj/usr/src/sys/ZEALOT -mcmodel=kernel -mno-red-zone 
-mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow  -msoft-float 
-fno-asynchronous-unwind-tables -ffreestanding -Wall -Wredundant-decls 
-Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes 
-Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign 
-fformat-extensions -c 
/usr/src/sys/modules/coretemp/../../dev/coretemp/coretemp.c
/usr/src/sys/modules/coretemp/../../dev/coretemp/coretemp.c: In function 
'coretemp_identify':
/usr/src/sys/modules/coretemp/../../dev/coretemp/coretemp.c:98: error: 
'cpu_vendor_id' undeclared (first use in this function)
/usr/src/sys/modules/coretemp/../../dev/coretemp/coretemp.c:98: error: 
(Each undeclared identifier is reported only once
/usr/src/sys/modules/coretemp/../../dev/coretemp/coretemp.c:98: error: 
for each function it appears in.)
/usr/src/sys/modules/coretemp/../../dev/coretemp/coretemp.c:98: error: 
'CPU_VENDOR_INTEL' undeclared (first use in this function)

*** Error code 1

Stop in /usr/src/sys/modules/coretemp.
*** Error code 1

I've tried to find either cpu_vendor_id or CPU_VENDOR_INTEL in the source
tree, but didn't succeed. Can anybody reproduce this error? And can
anybody tell me how to fix it? :)
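
For anyone who wants to repeat the search described above, here is a minimal sketch in Python. The function name find_identifiers and the choice to scan only .c and .h files are assumptions made for this example, and it does plain substring matching with no understanding of C. If the only hits are the uses in coretemp.c and no header mentions the names, that would be consistent with a source tree that is only partly updated.

```python
import os


def find_identifiers(root, names, suffixes=(".c", ".h")):
    """Return {name: [(path, line_number), ...]} for each name found under root.

    Hypothetical helper: a plain substring search over C sources and headers.
    """
    hits = {name: [] for name in names}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(suffixes):
                continue
            path = os.path.join(dirpath, filename)
            try:
                with open(path, "r", errors="replace") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        for name in names:
                            if name in line:
                                hits[name].append((path, lineno))
            except OSError:
                # Unreadable file: skip it rather than abort the whole scan.
                continue
    return hits


if __name__ == "__main__":
    # /usr/src/sys is the kernel source location shown in the build log above.
    found = find_identifiers("/usr/src/sys", ["cpu_vendor_id", "CPU_VENDOR_INTEL"])
    for name, places in found.items():
        print(name, len(places), "occurrence(s)")
        for path, lineno in places:
            print("   ", path, lineno)
```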


--
SY, Marat




Re: coretemp(4) build broken in recent STABLE?

2009-07-16 Thread Andriy Gapon
on 16/07/2009 12:42 Marat N.Afanasyev said the following:
 I have 7.2-STABLE csupped as of midnight today, and while trying to build the kernel
 I get the following error:
[snip]
 I've tried to find either cpu_vendor_id or CPU_VENDOR_INTEL in the source
 tree, but didn't succeed. Can anybody reproduce this error? And can
 anybody tell me how to fix it? :)

It seems that you are using a CVSup server with out-of-sync data.
This has already been discussed, but not on the public mailing lists.
In that case the offending server was cvsup7.ru.FreeBSD.org.

-- 
Andriy Gapon


Re: coretemp(4) build broken in recent STABLE?

2009-07-16 Thread Marat N.Afanasyev

Andriy Gapon wrote:

on 16/07/2009 12:42 Marat N.Afanasyev said the following:

I have 7.2-STABLE csupped as of midnight today, and while trying to build the kernel
I get the following error:

[snip]

I've tried to find either cpu_vendor_id or CPU_VENDOR_INTEL in the source
tree, but didn't succeed. Can anybody reproduce this error? And can
anybody tell me how to fix it? :)


It seems that you are using a CVSup server with out-of-sync data.
This has already been discussed, but not on the public mailing lists.
In that case the offending server was cvsup7.ru.FreeBSD.org.


it seems so

--
SY, Marat




Posts regarding 8.0-beta

2009-07-16 Thread Kevin Oberman
Just a reminder...there is no 8-stable as of today nor is there a
RELENG_8 branch in CVS. Posts regarding 8.0 should go to current@. The
folks who are most likely able to help with 8.0 problems are more likely
to see it there and, if you are watching current@, you may see that the
issue you are reporting has already been discussed.
-- 
R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
E-mail: ober...@es.net  Phone: +1 510 486-8634
Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751


Re: FreeBSD 7.2 USB stack info needed

2009-07-16 Thread Bruce Simpson

Sagara Wijetunga wrote:


1. Could I know exactly which program prints the above line on /dev/devctl?


The kernel...

2. I want to print another line with daN as the device name, where N
is 0 to 9, with at least the vendor and product IDs, once the allocated
device name is known for USB Mass Storage devices. Any additional
ideas/feedback/help are most welcome.


rwatson had a patch for something like this somewhere.

cheers
BMS




gstripe problem

2009-07-16 Thread Nenhum_de_Nos
hail,

I have a problem with gstripe on today's stable. I created this stripe using a
slightly older stable (two weeks old at most) and it can't be read on an old stable (from
30/12/2008). So I recreated it in 8-BETA1 and I could mount it and see the files. When I
tried again on the 30/12/2008 stable and on today's, on a PII machine (i386):

[r...@xxx ~]# gstripe status
          Name  Status  Components
stripe/stripe0      UP  ad4s2
                        ad6s2
[r...@xxx ~]# gstripe list  
Geom name: stripe0
State: UP
Status: Total=2, Online=2
Type: MANUAL
Stripesize: 4096
ID: 2302026851
Providers:
1. Name: stripe/stripe0
   Mediasize: 1242615693312 (1.1T)
   Sectorsize: 512
   Mode: r0w0e0
Consumers:
1. Name: ad4s2
   Mediasize: 621307846656 (579G)
   Sectorsize: 512
   Mode: r0w0e0
   Number: 0
2. Name: ad6s2
   Mediasize: 621307846656 (579G)
   Sectorsize: 512
   Mode: r0w0e0
   Number: 1
[r...@xxx ~]# ls /dev/stripe/
stripe0 stripe0c
[r...@xxx ~]# mount /dev/stripe/stripe0 /null 
mount: /dev/stripe/stripe0 : Invalid argument
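
As a sanity check on the numbers in the listing above, here is a small Python sketch. It assumes the conventional round-robin RAID0 layout (stripe unit i goes to component i mod n); that layout is an assumption of this example and was not checked against the geom_stripe source, and the function name stripe_map is made up. It shows that the provider size is exactly the sum of the two consumers, and where a given byte offset would land with a 4096-byte stripe size.

```python
# Values copied from the "gstripe list" output above.
STRIPESIZE = 4096
CONSUMER_BYTES = 621307846656    # ad4s2 and ad6s2 each
PROVIDER_BYTES = 1242615693312   # stripe/stripe0


def stripe_map(offset, stripesize=STRIPESIZE, ncomponents=2):
    """Map a provider byte offset to (component index, offset on that component).

    Assumes a plain round-robin layout; hypothetical helper for illustration.
    """
    unit, within = divmod(offset, stripesize)
    component = unit % ncomponents
    component_offset = (unit // ncomponents) * stripesize + within
    return component, component_offset


if __name__ == "__main__":
    # The provider should be exactly twice one consumer.
    print(2 * CONSUMER_BYTES == PROVIDER_BYTES)
    for off in (0, 4096, 8192, 12288):
        print(off, stripe_map(off))
```

The sizes being consistent suggests the stripe geometry itself is being read correctly on this box, so the "Invalid argument" from mount may come from a later step, although that is only a guess from these numbers.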

On 8-BETA1 it works, but I can't create the stripe there and then use it on this stable box.
The stripe already has files, so anything weird could make me lose my
data...

the 8-BETA1 is amd64 and core 2 quad cpu.

any hints ?

thanks,

matheus

-- 
We will call you cygnus,
The God of balance you shall be

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

http://en.wikipedia.org/wiki/Posting_style