date:20110504

Re: Clang error make buildworld

2011-05-04 Thread Dimitry Andric


On 2011-05-04 03:07, Manfred Antar wrote:

I get this error when trying to buildworld on current i386.
It's been this way for a while. Any ideas?

===> boot/i386/boot0 (all)
clang -O2 -pipe  -DVOLUME_SERIAL -DPXE -DFLAGS=0x8f  -DTICKS=0xb6  -DCOMSPEED=7  
5 + 3 -ffreestanding -mpreferred-stack-boundary=2  -mno-mmx -mno-3dnow -mno-sse -mno-sse2 
-mno-sse3 -msoft-float -std=gnu99-c /usr/src/sys/boot/i386/boot0/boot0.S
clang: warning: argument unused during compilation: 
'-mpreferred-stack-boundary=2'
/tmp/cc-4SXZt8.s:42:11: error: .code16 not supported yet


For some reason, on your system, it does not compile boot0.S with
-no-integrated-as.  It works fine here though, so it must be something
local to your system.  Can you please post:

- Your full make.conf and src.conf
- The first 30 lines of sys/boot/i386/boot0/Makefile
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Garrett Cooper

On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
 Date: Tue, 3 May 2011 22:40:26 -0700
 Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
  partition when filesystem full
 From: Garrett Cooper yaneg...@gmail.com
 To: Jeff Roberson j...@freebsd.org,
         Marshall Kirk McKusick mckus...@mckusick.com
 Cc: FreeBSD Current freebsd-current@freebsd.org

 Hi Jeff and Dr. McKusick,
     Ran into this panic when /usr ran out of space doing a make
 universe on amd64/r221219 (it took ~15 minutes for the panic to occur
 after the filesystem ran out of space -- wasn't quite sure what it was
 doing at the time):

 ...

     Let me know what other commands you would like for me to run in kgdb.
 Thanks,
 -Garrett

 You did not indicate whether you are running an 8.X system or a 9-current
 system. It would be helpful to know that.

I've actually been running CURRENT for a few years now, but you're right --
I didn't mention that part.

 Jeff thinks that there may be a potential race in the locking code for
 softdep_request_cleanup. If so, this patch for 9-current should fix it:

 Index: ffs_softdep.c
 ===
 --- ffs_softdep.c       (revision 221385)
 +++ ffs_softdep.c       (working copy)
 @@ -11380,7 +11380,8 @@
                                continue;
                        }
                        MNT_IUNLOCK(mp);
 -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
 curthread)) {
 +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
 +                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }

 If you are running an 8.X system, hopefully you will be able to apply it.

I've applied it, rebuilt and installed the kernel, and am trying to
repro the case again. Will let you know how things go!
Thanks!
-Garrett

Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Garrett Cooper

On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote:
 On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
 Date: Tue, 3 May 2011 22:40:26 -0700
 Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
  partition when filesystem full
 From: Garrett Cooper yaneg...@gmail.com
 To: Jeff Roberson j...@freebsd.org,
         Marshall Kirk McKusick mckus...@mckusick.com
 Cc: FreeBSD Current freebsd-current@freebsd.org

 Hi Jeff and Dr. McKusick,
     Ran into this panic when /usr ran out of space doing a make
 universe on amd64/r221219 (it took ~15 minutes for the panic to occur
 after the filesystem ran out of space -- wasn't quite sure what it was
 doing at the time):

 ...

     Let me know what other commands you would like for me to run in kgdb.
 Thanks,
 -Garrett

 You did not indicate whether you are running an 8.X system or a 9-current
 system. It would be helpful to know that.

 I've actually been running CURRENT for a few years now, but you're right --
 I didn't mention that part.

 Jeff thinks that there may be a potential race in the locking code for
 softdep_request_cleanup. If so, this patch for 9-current should fix it:

 Index: ffs_softdep.c
 ===
 --- ffs_softdep.c       (revision 221385)
 +++ ffs_softdep.c       (working copy)
 @@ -11380,7 +11380,8 @@
                                continue;
                        }
                        MNT_IUNLOCK(mp);
 -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
 curthread)) {
 +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
 LK_INTERLOCK,
 +                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }

 If you are running an 8.X system, hopefully you will be able to apply it.

    I've applied it, rebuilt and installed the kernel, and trying to
 repro the case again. Will let you know how things go!

Happened again with the change. It's really easy to repro:

1. Get a filesystem with UFS+SU
2. Execute something that does a large number of small writes to a partition.
3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition

The kernel will panic with the issue I discussed above.
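
The three steps above can be sketched as a small script. This is only a sketch under assumptions: the helper names small_writes and fill_partition, the /mnt/test path and the file count are made up for illustration, and nothing here is taken from the original report beyond the dd invocation. It should only be tried on a scratch UFS+SU partition of a test machine, since the whole point is to provoke the panic.

```sh
#!/bin/sh
# Hypothetical helpers; names, paths and counts are illustrative only.

# Step 2: generate many small writes under directory $1 ($2 files).
small_writes() {
    dir=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "x" > "$dir/small.$i"
        i=$((i + 1))
    done
}

# Step 3: fill the partition holding directory $1 (dd runs until the
# filesystem is full). bs=10m is the FreeBSD dd spelling used above.
fill_partition() {
    dd if=/dev/zero of="$1/FOO" bs=10m
}

# Example, commented out on purpose (may panic an affected kernel):
# small_writes /mnt/test 100000 &
# fill_partition /mnt/test
```

Running the small writes in the background while dd fills the partition keeps both kinds of load active at the moment the filesystem runs out of space.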
Thanks!
-Garrett

Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails

2011-05-04 Thread Dimitry Andric


On 2011-05-03 20:53, O. Hartmann wrote:
...

ld -m elf_i386_fbsd -Y P,/usr/obj/usr/src/lib32/usr/lib32 -o gcrt1.o -r
crt1_s.o gcrt1_c.o
ld: Relocatable linking with relocations from format elf64-x86-64-freebsd
(crt1_s.o) to format elf32-i386-freebsd (gcrt1.o) is not supported

...

Today, I tried again, after CLANG/LLVM has been updated to version 3.0.
Same error.

This is the addendum I made to the /etc/make.conf:

##
##  CLANG
##
.if defined(USE_CLANG)
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
# Don't die on warnings
NO_WERROR=
WERROR=
# Don't forget this when using Jails!
NO_FSCHG=
.endif


Ok, that looks good, I use a similar construction.  However, in my case
it works fine, so there must be something special on your system that
breaks the build.

What happens here, is that the 32-bit stage on amd64 fails, because it
tries to link together 64-bit and 32-bit object files, which is not
allowed.  This can occur if Makefile.inc1 cannot set CC to the correct
value, but there might also be something else going on.

To debug this further, can you please post:
- Your full /etc/make.conf
- Your full /etc/src.conf
- Any modifications you made to your source tree
- The specific procedure you use for buildworld
- A URL to a full build log (don't post it to the list, because it will
  be rather large)


Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Olivier Smedts

2011/5/4 Jack Vogel jfvo...@gmail.com:
 It has nothing to do with load; it has to do with the prerequisites to init
 your interfaces.
 The amount you need is fixed; it doesn't vary with load. Every RX descriptor
 needs one,
 so it's simple math: number-of-interfaces x number-of-queues x size of the
 ring.

 If you have other network interfaces besides Intel, remember that they also
 consume mbufs.
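
The arithmetic described above can be written down as a one-line helper. This is a minimal sketch: the function name rx_clusters_needed and the example numbers (one interface, one queue, a 1024-entry ring) are assumptions for illustration, not values reported in this thread.

```sh
# Hypothetical helper: clusters needed just to initialise the RX rings.
# Arguments: number of interfaces, RX queues per interface, ring size.
rx_clusters_needed() {
    echo $(( $1 * $2 * $3 ))
}

# Example with made-up numbers: 1 interface, 1 queue, 1024 descriptors.
rx_clusters_needed 1 1 1024
```

The result is the minimum that kern.ipc.nmbclusters has to cover for the interfaces to initialise; anything else on the system that consumes clusters comes on top of it.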

Only one network interface.
# kldunload if_em.ko
(the old one)
# sysctl kern.ipc.nmbclusters=512000
(I also tried with lower and more meaningful values)
# kldload ./if_em.ko
(the new one)
# dmesg
em0: detached
pci0: network, ethernet at device 25.0 (no driver attached)
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0x2100-0x211f
mem 0xf000-0xf001,0xf0025000-0xf0025fff irq 19 at device 25.0
on pci0
em0: Using an MSI interrupt
em0: Ethernet address: d4:85:64:b2:aa:f5
em0: Could not setup receive structures
em0: Could not setup receive structures

What can we do to help you debug this ?

-- 
Olivier Smedts                                                 _
                                        ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org        - against HTML email  vCards  X
www: http://www.gid0.org    - against proprietary attachments / \

  Il y a seulement 10 sortes de gens dans le monde :
  ceux qui comprennent le binaire,
  et ceux qui ne le comprennent pas.

Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails

2011-05-04 Thread Olivier Smedts

2011/5/4 O. Hartmann ohart...@mail.zedat.fu-berlin.de:

 But when I tried to compile essential ports (essential to me), like
 x11-wm/windowmaker and multimedia/ffmpeg, for instance, I ran into serious
 compiler/assembler errors with LLVM/CLANG. I guess the ports tree isn't
 mature for clang. So am I right in thinking this: leaving /etc/make.conf
 untouched, i.e. not putting the CLANG build construct there, and
 putting it into /etc/src.conf instead will only cause the OS source tree
 to be built by clang, while all ports are built by the system's antique gcc
 4.2.1?

A lot of ports can't be built with clang. You can add something like
this to your /etc/make.conf (modifying paths accordingly):
.if ${.CURDIR:M/usr/src*} || \
    ${.CURDIR:M*/usr/ports/emulators/virtualbox-ose-kmod*}
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
NO_WERROR=
WERROR=
.endif

That's what I use. Note the first if statement.

Cheers

-- 
Olivier Smedts                                                 _
                                        ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org        - against HTML email  vCards  X
www: http://www.gid0.org    - against proprietary attachments / \

  Il y a seulement 10 sortes de gens dans le monde :
  ceux qui comprennent le binaire,
  et ceux qui ne le comprennent pas.

Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails

2011-05-04 Thread Dimitry Andric


On 2011-05-04 09:17, O. Hartmann wrote:
...

But when I tried to compile essential ports (essential to me), like
x11-wm/windowmaker and multimedia/ffmpeg, for instance, I ran into serious
compiler/assembler errors with LLVM/CLANG. I guess the ports tree isn't
mature for clang.


Several patches for this are available at:

http://rainbow-runner.nl/clang/patches/

but getting these into the ports tree itself is proving to be rather
slow, for some reason.

I see an ffmpeg patch in there, but no windowmaker one.  I will have a
look at what the problem is.

Note that if you run into problems with clang's integrated assembler,
you can always add -no-integrated-as to CFLAGS, then it will use GNU
as instead.  It will just be a bit slower.

On the other hand, if there is a way to actually correct the assembly,
or if it is really a bug in the integrated assembler, we would rather
fix it properly. :)



So am I right in thinking this: leaving /etc/make.conf
untouched, i.e. not putting the CLANG build construct there, and
putting it into /etc/src.conf instead will only cause the OS source
tree to be built by clang, while all ports are built by the system's
antique gcc 4.2.1?


No, you really *must* put any CC= definitions in make.conf; if you put
them in src.conf, they will not be picked up early enough in some cases.

If you only want to build /usr/src with clang, and ports with gcc, it is
probably best to surround the CC=clang definitions with:

.if !empty(.CURDIR:M/usr/src*)
# ...clang definitions here...
.endif
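
To get a feel for which directories a pattern such as /usr/src* selects, the match can be imitated in the shell. This is only a rough sketch: it assumes that a shell case pattern behaves like make's :M modifier for a simple trailing *, and the function name in_src_tree and the sample directories are made up for illustration.

```sh
# Hypothetical helper: succeeds if directory $1 matches the glob /usr/src*.
in_src_tree() {
    case $1 in
        /usr/src*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Show which compiler choice each sample directory would fall under.
for d in /usr/src /usr/src/sys/boot/i386/boot0 \
    /usr/ports/multimedia/ffmpeg /usr/src.old; do
    if in_src_tree "$d"; then
        echo "$d: clang"
    else
        echo "$d: system compiler"
    fi
done
```

Note that a sibling directory such as /usr/src.old also matches this pattern, which may or may not be what you want.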

Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Sergey Kandaurov

On 4 May 2011 10:42, Garrett Cooper yaneg...@gmail.com wrote:
 On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
 Date: Tue, 3 May 2011 22:40:26 -0700
 Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
  partition when filesystem full
 From: Garrett Cooper yaneg...@gmail.com
 To: Jeff Roberson j...@freebsd.org,
         Marshall Kirk McKusick mckus...@mckusick.com
 Cc: FreeBSD Current freebsd-current@freebsd.org

 Hi Jeff and Dr. McKusick,
     Ran into this panic when /usr ran out of space doing a make
 universe on amd64/r221219 (it took ~15 minutes for the panic to occur
 after the filesystem ran out of space -- wasn't quite sure what it was
 doing at the time):

 ...

     Let me know what other commands you would like for me to run in kgdb.
 Thanks,
 -Garrett

 You did not indicate whether you are running an 8.X system or a 9-current
 system. It would be helpful to know that.

 I've actually been running CURRENT for a few years now, but you're right --
 I didn't mention that part.

 Jeff thinks that there may be a potential race in the locking code for
 softdep_request_cleanup. If so, this patch for 9-current should fix it:

 Index: ffs_softdep.c
 ===
 --- ffs_softdep.c       (revision 221385)
 +++ ffs_softdep.c       (working copy)
 @@ -11380,7 +11380,8 @@
                                continue;
                        }
                        MNT_IUNLOCK(mp);
 -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
 curthread)) {
 +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
 LK_INTERLOCK,
 +                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }


FYI,
I was playing with head (w/o the above patch) to reproduce the panic and got
this LOR when the filesystem eventually filled up.
I'm not sure the patch would fix the panic, but I think it should at
least fix the LOR.

kernel: pid 66153 (dd), uid 0 inumber 4 on /mnt: filesystem full
lock order reversal:
 1st 0xfe001d7d3310 ufs (ufs) @ /usr/src/sys/kern/vfs_vnops.c:614
 2nd 0xff807ba8a800 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2658
 3rd 0xfe001ade7588 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2126
KDB: stack backtrace:
db_trace_self_wrapper() at 0x802d9eba = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0x80475d17 = kdb_backtrace+0x37
_witness_debugger() at 0x8048b4fe = _witness_debugger+0x2e
witness_checkorder() at 0x8048c7a7 = witness_checkorder+0x807
__lockmgr_args() at 0x80427553 = __lockmgr_args+0xd63
ffs_lock() at 0x806578fc = ffs_lock+0x9c
VOP_LOCK1_APV() at 0x806f285f = VOP_LOCK1_APV+0xbf
_vn_lock() at 0x804e87c7 = _vn_lock+0x57
vget() at 0x804dbb5b = vget+0x7b
softdep_request_cleanup() at 0x80649f31 = softdep_request_cleanup+0x311
ffs_alloc() at 0x80630b64 = ffs_alloc+0x134
ffs_balloc_ufs2() at 0x8063426c = ffs_balloc_ufs2+0x11ac
ffs_write() at 0x8065889f = ffs_write+0x22f
VOP_WRITE_APV() at 0x806f33dd = VOP_WRITE_APV+0x14d
vn_write() at 0x804e9a42 = vn_write+0x2a2
dofilewrite() at 0x8048df25 = dofilewrite+0x85
kern_writev() at 0x8048f740 = kern_writev+0x60
write() at 0x8048f845 = write+0x55
syscallenter() at 0x80483cbb = syscallenter+0x1cb
syscall() at 0x806abaf0 = syscall+0x60
Xfast_syscall() at 0x8069670d = Xfast_syscall+0xdd
--- syscall (4, FreeBSD ELF64, write), rip = 0x8009438fc, rsp =
0x7fffda68, rbp = 0xa0 ---

-- 
wbr,
pluknet

Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Kostik Belousov

On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
 On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote:
  On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com 
  wrote:
  Date: Tue, 3 May 2011 22:40:26 -0700
  Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
   partition when filesystem full
  From: Garrett Cooper yaneg...@gmail.com
  To: Jeff Roberson j...@freebsd.org,
          Marshall Kirk McKusick mckus...@mckusick.com
  Cc: FreeBSD Current freebsd-current@freebsd.org

  Hi Jeff and Dr. McKusick,
      Ran into this panic when /usr ran out of space doing a make
  universe on amd64/r221219 (it took ~15 minutes for the panic to occur
  after the filesystem ran out of space -- wasn't quite sure what it was
  doing at the time):

  ...

      Let me know what other commands you would like for me to run in kgdb.
  Thanks,
  -Garrett

  You did not indicate whether you are running an 8.X system or a 9-current
  system. It would be helpful to know that.

  I've actually been running CURRENT for a few years now, but you're right --
  I didn't mention that part.

  Jeff thinks that there may be a potential race in the locking code for
  softdep_request_cleanup. If so, this patch for 9-current should fix it:

  Index: ffs_softdep.c
  ===
  --- ffs_softdep.c       (revision 221385)
  +++ ffs_softdep.c       (working copy)
  @@ -11380,7 +11380,8 @@
                                 continue;
                         }
                         MNT_IUNLOCK(mp);
  -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, 
  curthread)) {
  +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | 
  LK_INTERLOCK,
  +                           curthread)) {
                                 MNT_ILOCK(mp);
                                 continue;
                         }

  If you are running an 8.X system, hopefully you will be able to apply it.

     I've applied it, rebuilt and installed the kernel, and trying to
  repro the case again. Will let you know how things go!

 Happened again with the change. It's really easy to repro:

 1. Get a filesystem with UFS+SU
 2. Execute something that does a large number of small writes to a partition.
 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition

 The kernel will panic with the issue I discussed above.
 Thanks!

Jeff's change is required to avoid LORs, but it is not sufficient to
prevent recursion. We must skip the vnode supplied as a parameter to
softdep_request_cleanup(). Theoretically, other vnodes might also be
locked by curthread, thus I think the change below is needed. Try this.

diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
index a6d4441..25fa5d6 100644
--- a/sys/ufs/ffs/ffs_softdep.c
+++ b/sys/ufs/ffs/ffs_softdep.c
@@ -11380,7 +11380,9 @@ retry:
                                continue;
                        }
                        MNT_IUNLOCK(mp);
-                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+                       if (VOP_ISLOCKED(lvp) ||
+                           vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
+                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }


Re: firefox4+html5

2011-05-04 Thread Vitaly Liaschuk

Yes, I understand this, but the same issue occurs on the 8.x branch...
I also turned off all debugging options and rebuilt the kernel.

The issue has not disappeared.


On Tue, 3 May 2011 15:59:23 +
Holger Kipp holger.k...@alogis.com wrote:

 Dear Vitaly,
 
 I'm usually not using FreeBSD for accessing youtube, but
 as you're using FreeBSD 9.0-current, please note that this
 presumably has Witness enabled (because FreeBSD 9.0-current
 is still development branch), which will reduce performance
 and hence might give the problems you described.
 
 
 from
 http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-options.html
 
 options WITNESS: this option enables run-time lock order tracking and
 verification, and is an invaluable tool for deadlock diagnosis.
 WITNESS maintains a graph of acquired lock orders by lock type, and
 checks the graph at each acquire for cycles (implicit or explicit).
 If a cycle is detected, a warning and stack trace are generated to
 the console, indicating that a potential deadlock might have
 occurred. WITNESS is required in order to use the show locks, show
 witness and show alllocks DDB commands. This debug option has
 significant performance overhead, which may be somewhat mitigated
 through the use of options WITNESS_SKIPSPIN. Detailed documentation
 may be found in witness(4).
 
 => http://www.freebsd.org/cgi/man.cgi?query=witness&sektion=4
 
 Best regards,
 Holger
 
 
 From: owner-freebsd-curr...@freebsd.org
 [owner-freebsd-curr...@freebsd.org] on behalf of Vitaly Liaschuk
 [lari...@gmail.com] Sent: 03 May 2011 16:49 To: FreeBSD current
 mailing list Subject: firefox4+html5
 
 Hi, list!
 I do not know which part of the forum to write in, so I decided to write
 in General. I'm trying to use HTML5 on youtube.com. I get the
 video stream, but the audio stutters on most video files. I tried
 the Chrome browser and it works fine. I also tried booting
 from a USB flash drive with Ubuntu and Firefox 4 installed, and that
 works too. So I believe the trouble is in my FreeBSD. [QUOTE]
  uname -a
 FreeBSD laptop 9.0-CURRENT FreeBSD 9.0-CURRENT #0 r221296M: Sun May
 1 20:13:15 EEST 2011 larin@laptop:/usr/obj/usr/src/sys/GENERIC
 amd64
 
 [/QUOTE]
 
 My box is: ASUS UL30A laptop, 3 GB RAM, Intel CPU U2300 1.2 GHz.
 
 Thanks in advance.
 
 
 --
 Holger Kipp
 Diplom-Mathematiker
 Senior Consultant
 
 Tel. : +49 30 436 58 114
 Fax. : +49 30 436 58 214
 Mobil: +49 178 36 58 114
 Email: holger.k...@alogis.com
 
 alogis AG
 Alt-Moabit 90b
 D-10559 Berlin
 
 web : http://www.alogis.com
 
 --
 
 alogis AG
 Sitz/Registergericht: Berlin/AG Charlottenburg, HRB 71484
 Vorstand: Arne Friedrichs, Joern Samuelson
 Aufsichtsratsvorsitzender: Reinhard Mielke


Re: firefox4+html5

2011-05-04 Thread Guido Falsi

On Tue, May 03, 2011 at 08:13:41PM +0200, Guido Falsi wrote:
 On 05/03/11 19:19, Francois Gerodez wrote:
 Hi all,
 
 I'm currently running FreeBSD 8.1 (latest update) and I'm experiencing a
 similar issue. HTML5 videos are very laggy (both image and sound) with
 Firefox 4. I ended up installing the flash player to watch youtube
 streaming.
 
 I didn't spot any particular warning/error messages so I don't know where to
 start...
 
 
 I have these symptoms too, but if I pause the video, drag the slider
 back to the start and then start playing again, it usually plays
 smoothly.
 
 This is quite strange, I know. Perhaps someone else should check if
 this is the same for everyone or just something that is happening to
 me.

Forgot to mention: I'm using 8-stable, not -current. I did not check which
list I was writing to. Sorry.

-- 
Guido Falsi m...@madpilot.net

Re: newnfs NFS client testing

2011-05-04 Thread Peter Jeremy

On 2011-Apr-25 20:33:14 -0400, Rick Macklem rmack...@uoguelph.ca wrote:
I believe that the new/experimental NFS client in head is now
compatible with the old/regular NFS client.

Possibly even too compatible...

Both the old and new NFS clients assume a 1:1 mapping between NFS
error codes (NFSERR_* macros defined in nfs/nfsproto.h) and the E*
macros in sys/errno.h: In the old client, the NFS status is copied
from the RPC response by nfsclient/nfs_krpc.c:nfs_request(), returned
and passed back up the call chain.  In the new client, the status is
copied from the RPC response into struct nfsrv_descript.nd_repstat
by fs/nfs/nfs_commonkrpc.c:newnfs_request() and then moved into an
error return in fs/nfsclient/nfs_clrpcops.c:nfsrpc_*().

This is not currently a problem but it would seem useful to include
notes in nfs/nfsproto.h and sys/errno.h warning of this assumption
in case of future changes.

Note that both NFS servers do include code for error code mapping.

-- 
Peter Jeremy



Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails

2011-05-04 Thread O. Hartmann


On 05/04/11 09:00, Dimitry Andric wrote:

On 2011-05-03 20:53, O. Hartmann wrote:
...

ld -m elf_i386_fbsd -Y P,/usr/obj/usr/src/lib32/usr/lib32 -o
gcrt1.o -r
crt1_s.o gcrt1_c.o
ld: Relocatable linking with relocations from format
elf64-x86-64-freebsd
(crt1_s.o) to format elf32-i386-freebsd (gcrt1.o) is not supported

...

Today, I tried again, after CLANG/LLVM has been updated to version 3.0.
Same error.

This is the addendum I made to the /etc/make.conf:

##
## CLANG
##
.if defined(USE_CLANG)
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
# Don't die on warnings
NO_WERROR=
WERROR=
# Don't forget this when using Jails!
NO_FSCHG=
.endif


Ok, that looks good, I use a similar construction. However, in my case
it works fine, so there must be something special on your system that
breaks the build.

What happens here, is that the 32-bit stage on amd64 fails, because it
tries to link together 64-bit and 32-bit object files, which is not
allowed. This can occur if Makefile.inc1 cannot set CC to the correct
value, but there might also be something else going on.

To debug this further, can you please post:
- Your full /etc/make.conf
- Your full /etc/src.conf
- Any modifications you made to your source tree
- The specific procedure you use for buildworld
- A URL to a full build log (don't post it to the list, because it will
be rather large)




When commenting out the wrapping
.if defined(USE_CLANG)
.endif
construct, as suggested by Olivier, it works. I guess I found my mistake:
From an earlier attempt of building FreeBSD with clang, I placed the
WIKI suggestions into /etc/src.conf and I never recalled this. I'll delete
it and try again ...
On my lab's box, same OS, same revision, nearly same hardware, building 
world/kernel worked fine even with the 'switch' - but there wasn't 
/etc/src.conf.
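
A quick way to spot such a leftover definition is to search both files for CC or CXX assignments. This is a minimal sketch, and the function name find_cc_defs is made up for illustration; it only looks at lines that start with an assignment, so definitions pulled in from other included files would not be found.

```sh
# Hypothetical helper: list lines that assign CC or CXX in the given files.
# Missing files are skipped silently.
find_cc_defs() {
    for f in "$@"; do
        # /dev/null as a second file makes grep always print the file name.
        [ -f "$f" ] && grep -nE '^[[:space:]]*(CC|CXX)[[:space:]]*[?+:]?=' \
            "$f" /dev/null
    done
    return 0
}

find_cc_defs /etc/make.conf /etc/src.conf
```

Each hit is printed as file:line:text, so a CC=clang that only appears under /etc/src.conf stands out immediately.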


Thanks for the hints. I'll report again.

Regards,
Oliver


Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails

2011-05-04 Thread O. Hartmann


On 05/04/11 09:00, Dimitry Andric wrote:

On 2011-05-03 20:53, O. Hartmann wrote:
...

ld -m elf_i386_fbsd -Y P,/usr/obj/usr/src/lib32/usr/lib32 -o
gcrt1.o -r
crt1_s.o gcrt1_c.o
ld: Relocatable linking with relocations from format
elf64-x86-64-freebsd
(crt1_s.o) to format elf32-i386-freebsd (gcrt1.o) is not supported

...

Today, I tried again, after CLANG/LLVM has been updated to version 3.0.
Same error.

This is the addendum I made to the /etc/make.conf:

##
## CLANG
##
.if defined(USE_CLANG)
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
# Don't die on warnings
NO_WERROR=
WERROR=
# Don't forget this when using Jails!
NO_FSCHG=
.endif


Ok, that looks good, I use a similar construction. However, in my case
it works fine, so there must be something special on your system that
breaks the build.

What happens here, is that the 32-bit stage on amd64 fails, because it
tries to link together 64-bit and 32-bit object files, which is not
allowed. This can occur if Makefile.inc1 cannot set CC to the correct
value, but there might also be something else going on.

To debug this further, can you please post:
- Your full /etc/make.conf
- Your full /etc/src.conf
- Any modifications you made to your source tree
- The specific procedure you use for buildworld
- A URL to a full build log (don't post it to the list, because it will
be rather large)




Sorry for the noise.
But when I tried to compile essential ports (essential to me), like
x11-wm/windowmaker and multimedia/ffmpeg, for instance, I ran into serious
compiler/assembler errors with LLVM/CLANG. I guess the ports tree isn't
mature for clang. So am I right in thinking this: leaving /etc/make.conf
untouched, i.e. not putting the CLANG build construct there, and
putting it into /etc/src.conf instead will only cause the OS source
tree to be built by clang, while all ports are built by the system's
antique gcc 4.2.1?


Oliver


Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Kirk McKusick

 Date: Tue, 3 May 2011 22:40:26 -0700
 Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
  partition when filesystem full
 From: Garrett Cooper yaneg...@gmail.com
 To: Jeff Roberson j...@freebsd.org,
 Marshall Kirk McKusick mckus...@mckusick.com
 Cc: FreeBSD Current freebsd-current@freebsd.org

 Hi Jeff and Dr. McKusick,
 Ran into this panic when /usr ran out of space doing a make
 universe on amd64/r221219 (it took ~15 minutes for the panic to occur
 after the filesystem ran out of space -- wasn't quite sure what it was
 doing at the time):

 ...

 Let me know what other commands you would like for me to run in kgdb.
 Thanks,
 -Garrett

You did not indicate whether you are running an 8.X system or a 9-current
system. It would be helpful to know that.

Jeff thinks that there may be a potential race in the locking code for
softdep_request_cleanup. If so, this patch for 9-current should fix it:

Index: ffs_softdep.c
===
--- ffs_softdep.c   (revision 221385)
+++ ffs_softdep.c   (working copy)
@@ -11380,7 +11380,8 @@
                                continue;
                        }
                        MNT_IUNLOCK(mp);
-                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
+                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }

If you are running an 8.X system, hopefully you will be able to apply it.

Kirk McKusick

Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Alastair Hogge

On Tue, May 03, 2011 at 11:50:53PM +0200, Michael Schmiedgen wrote:
 On 03.05.2011 23:24, Jack Vogel wrote:
  If you get the setup receive structures fail, then increase the nmbclusters.
 
  If you use standard MTU then what you need are mbufs, and standard size
  clusters (2K).
  Only when you use jumbo frames will you need larger.
 
  You must configure enough; it's that simple.
 
 I doubled the nmbclusters as well. But nothing happened.
 
 I have no load on this machine and nothing special
 configured.
 
 Thanks,
Michael

Just a me too. I've been following -CURRENT (r221415), but I keep
/sys/dev/e1000 at r218909 to keep em0 working without problems on -CURRENT.

I also tried 2x and 4x 25600 for max mbuf clusters via kern.ipc.nmbclusters.
This didn't help.

-alastair


FreeBSD 9.0/CUR/amd64, built with LLVM/CLANG fails starting LibreOffice 3.3

2011-05-04 Thread O. Hartmann

After building FreeBSD 9.0/amd64 (FreeBSD 9.0-CURRENT #182 r221413: Tue 
May  3 23:30:11 CEST 2011), the opensource office software LibreOffice 
(libreoffice-3.3.2) won't start and crashes with


pid 53801 (soffice.bin), uid 8152: exited on signal 8 (core dumped)

(shown in console).

What's wrong?

Oliver

Re: Clang error make buildworld

2011-05-04 Thread Manfred Antar

At 11:38 PM 5/3/2011, Dimitry Andric wrote:
On 2011-05-04 03:07, Manfred Antar wrote:
I get this error when trying to buildworld on current i386.
It's been this way for a while. Any ideas?

===  boot/i386/boot0 (all)
clang -O2 -pipe  -DVOLUME_SERIAL -DPXE -DFLAGS=0x8f  -DTICKS=0xb6  
-DCOMSPEED=7  5 + 3 -ffreestanding -mpreferred-stack-boundary=2  -mno-mmx 
-mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float -std=gnu99 -c 
/usr/src/sys/boot/i386/boot0/boot0.S
clang: warning: argument unused during compilation: '-mpreferred-stack-boundary=2'
/tmp/cc-4SXZt8.s:42:11: error: .code16 not supported yet

For some reason, on your system, it does not compile boot0.S with
-no-integrated-as.  It works fine here though, so it must be something
local to your system.  Can you please post:

- Your full make.conf and src.conf
- The first 30 lines of sys/boot/i386/boot0/Makefile

Ok: 
src.conf:

WITHOUT_DYNAMICROOT=yes 
   
WITH_IDEA=yes
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
#Don't die on warnings
NO_WERROR=
WERROR=

make.conf:

# $FreeBSD: src/share/examples/etc/make.conf,v 1.276 2006/03/21 09:49:05 ru Exp $
#
# NOTE:  Please would any committer updating this file also update the
# make.conf(5) manual page, if necessary, which is located in
# src/share/man/man5/make.conf.5.
#
# /etc/make.conf, if present, will be read by make (see
# /usr/share/mk/sys.mk).  It allows you to override macro definitions
# to make without changing your source tree, or anything the source
# tree installs.
#
# This file must be in valid Makefile syntax.
#
# There are additional things you can put into /etc/make.conf.
# You have to find those in the Makefiles and documentation of
# the source tree.
#
# Note, that you should not set MAKEOBJDIRPREFIX or MAKEOBJDIR
# from make.conf (or as command line variables to make).
# Both variables are environment variables for make and must be used as:
#
# env MAKEOBJDIRPREFIX=/big/directory make
#
#
# The CPUTYPE variable controls which processor should be targeted for
# generated code.  This controls processor-specific optimizations in
# certain code (currently only OpenSSL) as well as modifying the value
# of CFLAGS to contain the appropriate optimization directive to gcc.
# The automatic setting of CFLAGS may be overridden using the
# NO_CPU_CFLAGS variable below.
# Currently the following CPU types are recognized:
#   Intel x86 architecture:
#   (AMD CPUs)  opteron athlon64 athlon-mp athlon-xp athlon-4
#   athlon-tbird athlon k8 k6-3 k6-2 k6 k5
#   (Intel CPUs)nocona pentium4[m] prescott pentium3[m] pentium-m
#   pentium2 pentiumpro pentium-mmx pentium i486 i386
#   Alpha/AXP architecture: ev67 ev6 pca56 ev56 ev5 ev45 ev4
#   AMD64 architecture: opteron, athlon64, nocona
#   Intel ia64 architecture: itanium2, itanium
#
# (?= allows to buildworld for a different CPUTYPE.)
#
#CPUTYPE?=pentium3
#NO_CPU_CFLAGS= # Don't add -march=cpu to CFLAGS automatically
#NO_CPU_COPTFLAGS=  # Don't add -march=cpu to COPTFLAGS automatically
#
# CFLAGS controls the compiler settings used when compiling C code.
# Note that optimization settings other than -O and -O2 are not recommended
# or supported for compiling the world or the kernel - please revert any
# nonstandard optimization settings to -O or -O2 before submitting bug
# reports without patches to the developers.
#
#CFLAGS= -O -pipe -Wl,--export-dynamic #FOR APACHE#
#CFLAGS= -O -pipe -no-integrated-as
#
# CXXFLAGS controls the compiler settings used when compiling C++ code.
# Note that CXXFLAGS is initially set to the value of CFLAGS.  If you wish
# to add to CXXFLAGS value, += must be used rather than =.  Using =
# alone will remove the often needed contents of CFLAGS from CXXFLAGS.
#
#CXXFLAGS+= -fconserve-space
#
# MAKE_SHELL controls the shell used internally by make(1) to process the
# command scripts in makefiles.  Three shells are supported, sh, ksh, and
# csh.  Using sh is most common, and advised.  Using ksh *may* work, but is
# not guaranteed to.  Using csh is absurd.  The default is to use sh.
#
#MAKE_SHELL?=sh
#
# BDECFLAGS are a set of gcc warning settings that Bruce Evans has suggested
# for use in developing FreeBSD and testing changes.  They can be used by
# putting CFLAGS+=${BDECFLAGS} in /etc/make.conf.  -Wconversion is not
# included here due to compiler bugs, e.g., mkdir()'s mode_t argument.
#
#BDECFLAGS= -W -Wall -ansi -pedantic -Wbad-function-cast -Wcast-align \
#   -Wcast-qual -Wchar-subscripts -Winline \
#   -Wmissing-prototypes -Wnested-externs -Wpointer-arith \
#   -Wredundant-decls -Wshadow -Wstrict-prototypes -Wwrite-strings
#
# To compile just the kernel with special optimizations, you should use
# this instead of CFLAGS (which is not applicable to kernel builds

Re: Clang error make buildworld

2011-05-04 Thread Dimitry Andric


On 2011-05-04 15:44, Manfred Antar wrote:
...

src.conf:

WITHOUT_DYNAMICROOT=yes
WITH_IDEA=yes
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
#Don't die on warnings
NO_WERROR=
WERROR=


Aha.  Please move the clang-related stuff to make.conf instead, e.g.
this fragment:

.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
#Don't die on warnings
NO_WERROR=
WERROR=

It will not work properly if you put it in src.conf.  (You can really
only use src.conf for WITH_FOO and WITHOUT_BAR type settings.)

Re: cardbus memory allocation problem

2011-05-04 Thread John Baldwin

On Tuesday, May 03, 2011 7:49:29 pm Michael Butler wrote:
  I have WIP patches to fix this but they aren't ready yet.
 
  pcib4:   I/O decode0x4000-0x4fff
  pcib4:   memory decode 0xf090-0xf09f
   *** this memory window is what I expected all children to allocate from
 
  pcib4:   no prefetched decode
  pcib4:   Subtractively decoded bridge.
 
  It's a subtractive bridge, so the resources do not have to be allocated from
  the window.  That said, I'm committing the last of my patches to HEAD today
  to rework how PCI-PCI bridges handle I/O windows to support growing windows,
  etc., and the new PCI-PCI bridge driver will attempt to grow the memory window
  to allocate a new range before falling back to depending on the subtractive
  decode.
 
 You might be pleased to hear that, without any special arrangements in
 loader.conf, the new PCI-PCI code does The Right Thing with memory
 allocation :-)
 
 Parent bridge:
 
 I fixed the subordinate bus using setpci -s 07:06.2 4c.b=02

I believe it should work even if you don't disable subtractive decoding.

-- 
John Baldwin

Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken

Hi All,

I've just updated a machine to -current (r221321) and since then I'm seeing an 
interrupt storm on irq 16. The storm goes away when I disable MSI/MSIX with 
the following lines in loader.conf :

hw.pci.enable_msix=0
hw.pci.enable_msi=0

According to dmesg the following devices share IRQ 16 :

pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f
   mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd
   irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07
   mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on
   pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff
   irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f
   mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd
   irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

During a storm vmstat -i shows a rate of about 220,000 interrupts/sec. MSI 
interrupt delivery to both 'em0' and 'em1' seems to work correctly during a 
storm, as I see their counters increase normally in the vmstat -i output.

As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the 
e1000 driver is causing this problem. Could it be that the driver forgets to 
clear/mask legacy interrupts when attaching the MSI interrupts perhaps?

Any tips on how to debug and/or fix this?


The full output of dmesg can be found here :
http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt

And the full output of pciconf -lv is here :
http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
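
The hypothesis in the message above is that the legacy (INTx) interrupt stays unmasked after the MSI interrupts are attached. As a rough illustration only, and not the e1000 driver's actual code, the PCI command register has an Interrupt Disable bit (bit 10) that stops a function from asserting INTx. The demo_* names below are hypothetical, and the sketch works on a plain 16-bit value rather than real configuration space.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bit 10 of the PCI command register is "Interrupt Disable":
 * when set, the function must not assert its legacy INTx line.
 * The macro name is made up for this sketch.
 */
#define DEMO_PCI_CMD_INTX_DISABLE 0x0400u

/* Return the command-register value with legacy INTx masked. */
static uint16_t demo_mask_intx(uint16_t command)
{
    return (uint16_t)(command | DEMO_PCI_CMD_INTX_DISABLE);
}

/* Return nonzero if legacy INTx is masked in this command value. */
static int demo_intx_masked(uint16_t command)
{
    return (command & DEMO_PCI_CMD_INTX_DISABLE) != 0;
}
```

Reading the command register of em1 with MSI enabled and checking this bit would be one way to test the hypothesis; whether the driver or the PCI bus code is expected to set it is not something this thread establishes.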

Re: I am very confused and would appreciate some help on device renameing or on renumbering on current fstab.

2011-05-04 Thread krad

On 4 May 2011 04:13, Jason Hellenthal jh...@dataix.net wrote:


 Edwin,

/dev/acd0  /cdrom  cd9660  ro,noauto   0  0
/dev/acd1  /cdrom1 cd9660  ro,noauto   0  0

 As a side note, these are also now useless and can be sent to /dev/null for
 extra padding ;)

 They shouldn't cause any harm being there, but just for reference.

 --

  Regards, (jhell)
  Jason Hellenthal


Just a sanity check here, people, but if the machine was built with FreeBSD
6.x, I would guess the machine is a few years old. If so, I doubt the hardware
would support AHCI, and therefore it wouldn't have the ada-type devices; it
would have the old ad-style ATA ones, and therefore no fstab twiddling
should be necessary.

Forgive me if I'm missing something here.

Re: I am very confused and would appreciate some help on device renameing or on renumbering on current fstab.

2011-05-04 Thread Freddie Cash

On Wed, May 4, 2011 at 8:16 AM, krad kra...@gmail.com wrote:
 On 4 May 2011 04:13, Jason Hellenthal jh...@dataix.net wrote:
 Edwin,

    /dev/acd0              /cdrom          cd9660  ro,noauto       0  0
    /dev/acd1              /cdrom1         cd9660  ro,noauto       0  0

 As a side note, these are also now useless and can be sent to /dev/null for
 extra padding ;)

 They shouldn't cause any harm being there, but just for reference.

 Just a sanity check here, people, but if the machine was built with FreeBSD
 6.x, I would guess the machine is a few years old. If so, I doubt the hardware
 would support AHCI, and therefore it wouldn't have the ada-type devices; it
 would have the old ad-style ATA ones, and therefore no fstab twiddling
 should be necessary.

 Forgive me if I'm missing something here.

If you enable options ATA_CAM in the kernel, which uses the old
ata(4) driver via some cam(4) shims, then you also get the adaX device
nodes.

There are currently four ways to access PATA/SATA disks:
  - old-style ata(4) using adX device nodes
  - old-style ata(4) using ataahci(4) for AHCI-like access to
    PATA/SATA disks, I believe using adX
  - old-style ata(4) via ATA_CAM using adaX device nodes
  - new-style ahci(4)/siis(4)/another(4) using adaX device nodes

I forget the name of the other AHCI-style driver.

The first two options use atacontrol to manage the disks.  The last
two options use camcontrol to manage the disks.

I believe the plan in 9.0 is to have everything accessed via
ATA_CAM/ahci(4) so all PATA/SATA drives show up the same, as adaX,
with everything being managed via camcontrol, finally unifying all
PATA/SATA/SCSI/SAS disk access via cam(4).

-- 
Freddie Cash
fjwc...@gmail.com
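
The four access paths listed in the message above can be summarized as a small lookup table. This is only a sketch of that summary: the enum and struct names (demo_*) are hypothetical, and the adX entry for ataahci(4) reflects the poster's stated belief rather than a verified fact.

```c
#include <assert.h>
#include <string.h>

/* The four access paths named in the message; identifiers are made up. */
enum demo_disk_path {
    DEMO_ATA_LEGACY,   /* old-style ata(4) */
    DEMO_ATA_ATAAHCI,  /* old-style ata(4) with ataahci(4) */
    DEMO_ATA_CAM,      /* old-style ata(4) via "options ATA_CAM" */
    DEMO_AHCI_NATIVE   /* new-style ahci(4)/siis(4) */
};

struct demo_disk_info {
    const char *node_prefix; /* "ad" gives ad0, ad1...; "ada" gives ada0... */
    const char *tool;        /* management utility for that path */
};

/* Map an access path to its device-node prefix and management tool. */
static struct demo_disk_info demo_disk_lookup(enum demo_disk_path path)
{
    switch (path) {
    case DEMO_ATA_LEGACY:
    case DEMO_ATA_ATAAHCI:
        return (struct demo_disk_info){ "ad", "atacontrol" };
    case DEMO_ATA_CAM:
    case DEMO_AHCI_NATIVE:
        return (struct demo_disk_info){ "ada", "camcontrol" };
    }
    return (struct demo_disk_info){ "", "" };
}
```

The practical consequence for fstab is visible in the table: only a move from one of the first two paths to one of the last two changes the node prefix from ad to ada.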

Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full

2011-05-04 Thread Garrett Cooper

2011/5/4 Kostik Belousov kostik...@gmail.com:
 On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
 On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote:
  On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
  Date: Tue, 3 May 2011 22:40:26 -0700
  Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS
   partition when filesystem full
  From: Garrett Cooper yaneg...@gmail.com
  To: Jeff Roberson j...@freebsd.org,
          Marshall Kirk McKusick mckus...@mckusick.com
  Cc: FreeBSD Current freebsd-current@freebsd.org

  Hi Jeff and Dr. McKusick,
      Ran into this panic when /usr ran out of space doing a make
  universe on amd64/r221219 (it took ~15 minutes for the panic to occur
  after the filesystem ran out of space -- wasn't quite sure what it was
  doing at the time):

  ...

      Let me know what other commands you would like for me to run in kgdb.
  Thanks,
  -Garrett

  You did not indicate whether you are running an 8.X system or a 9-current
  system. It would be helpful to know that.

  I've actually been running CURRENT for a few years now, but you're right --
  I didn't mention that part.

  Jeff thinks that there may be a potential race in the locking code for
  softdep_request_cleanup. If so, this patch for 9-current should fix it:

  Index: ffs_softdep.c
  ===
  --- ffs_softdep.c       (revision 221385)
  +++ ffs_softdep.c       (working copy)
  @@ -11380,7 +11380,8 @@
                                 continue;
                         }
                         MNT_IUNLOCK(mp);
  -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
  +                       if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
  +                           curthread)) {
                                 MNT_ILOCK(mp);
                                 continue;
                         }

  If you are running an 8.X system, hopefully you will be able to apply it.

     I've applied it, rebuilt and installed the kernel, and trying to
  repro the case again. Will let you know how things go!

     Happened again with the change. It's really easy to repro:

 1. Get a filesystem with UFS+SU
 2. Execute something that does a large number of small writes to a partition.
 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition

     The kernel will panic with the issue I discussed above.
 Thanks!

 Jeff's change is required to avoid LORs, but it is not sufficient to
 prevent recursion. We must skip the vnode supplied as a parameter to
 softdep_request_cleanup(). Theoretically, other vnodes might be also
 locked by curthread, thus I think the change below is needed. Try this.

 diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
 index a6d4441..25fa5d6 100644
 --- a/sys/ufs/ffs/ffs_softdep.c
 +++ b/sys/ufs/ffs/ffs_softdep.c
 @@ -11380,7 +11380,9 @@ retry:
                                continue;
                        }
                        MNT_IUNLOCK(mp);
 -                       if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
 +                       if (VOP_ISLOCKED(lvp) ||
 +                           vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
 +                           curthread)) {
                                MNT_ILOCK(mp);
                                continue;
                        }

Ok. I'll let the make universe I have going run to completion, and
once I get back home later on, I'll take a look at repro'ing this
again with the above patch applied.
Thanks!
-Garrett
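
The idea behind the second patch quoted above (skip vnodes that are already locked, and never block on the rest) can be illustrated with a userspace analogy. This is a sketch under stated assumptions, not kernel code: POSIX mutexes stand in for vnode locks, pthread_mutex_trylock() stands in for vget() with LK_NOWAIT, and demo_cleanup_pass() is a hypothetical name.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * Walk an array of non-recursive locks while the caller may already hold
 * one of them (`held`, or NULL). Two rules keep the walk safe:
 *   1. skip the lock we know we hold, so we never recurse on it;
 *   2. use trylock for everything else, so we never sleep waiting for a
 *      lock held by anyone (including ourselves).
 * Returns the number of entries that were locked, "cleaned", and unlocked.
 */
static size_t demo_cleanup_pass(pthread_mutex_t *locks, size_t n,
                                const pthread_mutex_t *held)
{
    size_t processed = 0;

    for (size_t i = 0; i < n; i++) {
        if (&locks[i] == held)
            continue;                   /* rule 1: known to be ours */
        if (pthread_mutex_trylock(&locks[i]) != 0)
            continue;                   /* rule 2: busy, move on */
        processed++;                    /* cleanup work would go here */
        pthread_mutex_unlock(&locks[i]);
    }
    return processed;
}
```

With a blocking lock call in place of the trylock, the walk would deadlock on a lock the caller already holds, which is the userspace cousin of the non-recursive lockmgr panic discussed in this thread.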

Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Arnaud Lacombe

Hi,

On Wed, May 4, 2011 at 3:58 AM, Olivier Smedts oliv...@gid0.org wrote:
 em0: Using an MSI interrupt
 em0: Ethernet address: d4:85:64:b2:aa:f5
 em0: Could not setup receive structures
 em0: Could not setup receive structures

 What can we do to help you debug this ?

At some point in late February, I had the same issue on a
6-interface machine. I tracked this down to the fact that the main
loop in em_setup_receive_ring() was not being entered. This resulted
in junk being returned, as `error' is not explicitly initialized. At
the time, the following patch worked for me. Without it, the driver was
unable to initialize with an RX/TX ring size of 512. With it, a ring
size of 1024 initialized fine.

diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index fb6ed67..f02059a 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -3901,7 +3901,7 @@ em_setup_receive_ring(struct rx_ring *rxr)
        struct  adapter *adapter = rxr->adapter;
        struct em_buffer        *rxbuf;
        bus_dma_segment_t       seg[1];
-       int i, j, nsegs, error;
+       int i, j, nsegs, error = 0;

I did not dig much more at the time, but I was definitely seeing
odd behavior. Anyhow, I am no longer able to reproduce this with
7.2.3, so I cannot dig into more detail.

Btw, I wish you all luck, it took me nearly two full months to
convince Jack (and other FreeBSD devs) that there was a bug in the
mbuf refresh code.

 - Arnaud
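
The failure mode described in the message above (a loop that is never entered, leaving the return value uninitialized) can be shown in a few lines. This is a simplified sketch, not the em(4) code: demo_setup_ring() and demo_setup_one_buffer() are hypothetical stand-ins, and the only point is the effect of the `= 0` initializer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for setting up one receive buffer.
 * Fails with ENOMEM when index == fail_at; pass (size_t)-1 for "never fail".
 */
static int demo_setup_one_buffer(size_t index, size_t fail_at)
{
    return (index == fail_at) ? ENOMEM : 0;
}

/*
 * Set up `count` buffers and return 0 or the first error.
 * If count is 0 the loop body never runs, so without the initializer
 * the function would return an indeterminate value.
 */
static int demo_setup_ring(size_t count, size_t fail_at)
{
    int error = 0;  /* the initializer is the whole fix */

    for (size_t i = 0; i < count; i++) {
        error = demo_setup_one_buffer(i, fail_at);
        if (error != 0)
            break;
    }
    return error;
}
```

Why the loop in the real driver was skipped is a separate question that the message leaves open; the initializer only makes the skipped case return success deterministically.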

Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Arnaud Lacombe

Hi,

On Tue, May 3, 2011 at 7:25 PM, Olivier Smedts oliv...@gid0.org wrote:
 2011/5/4 Arnaud Lacombe lacom...@gmail.com:
 A more rude version might be Why the frak my network adapter stopped
 working with the default setting ? :)

 ...on a -STABLE branch

Maybe you should not have picked the rude version, Jack has a
relatively low cut-off frequency :-)

 - Arnaud

Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Jack Vogel

No, I do not, Arnaud. But I refuse to rise to rude and uncivil behavior.  It's
your behavior again and again which causes you to not get responses, not my
willingness to help and respond to those that behave like respectful
customers and adults.

Jack


On Wed, May 4, 2011 at 10:20 AM, Arnaud Lacombe lacom...@gmail.com wrote:

 Hi,

 On Tue, May 3, 2011 at 7:25 PM, Olivier Smedts oliv...@gid0.org wrote:
  2011/5/4 Arnaud Lacombe lacom...@gmail.com:
  A more rude version might be Why the frak my network adapter stopped
  working with the default setting ? :)
 
  ...on a -STABLE branch
 
 Maybe you should not have picked the rude version, Jack has a
 relatively low cut-off frequency :-)

  - Arnaud


Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Arnaud Lacombe

Hi,

On Wed, May 4, 2011 at 1:24 PM, Jack Vogel jfvo...@gmail.com wrote:
 No, I do not, Arnaud. But I refuse to rise to rude and uncivil behavior.  It's
 your behavior again and again which causes you to not get responses, not my
 willingness to help and respond to those that behave like respectful
 customers and adults.

Obviously, I am no longer the only one finding that em(4) has
unacceptable issues; this thread is the proof.

 - Arnaud

 Jack


 On Wed, May 4, 2011 at 10:20 AM, Arnaud Lacombe lacom...@gmail.com wrote:

 Hi,

 On Tue, May 3, 2011 at 7:25 PM, Olivier Smedts oliv...@gid0.org wrote:
  2011/5/4 Arnaud Lacombe lacom...@gmail.com:
  A more rude version might be Why the frak my network adapter stopped
  working with the default setting ? :)
 
  ...on a -STABLE branch
 
 Maybe you should not have picked the rude version, Jack has a
 relatively low cut-off frequency :-)

  - Arnaud



Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel

Who makes your motherboard? The problem you are having is that MSIX AND
MSI are both failing as em0 comes up, so it falls back to Legacy interrupt
mode, and must be having some issue with sharing the line, causing the storm.

This is the second report in a matter of a week perhaps about a problematic
motherboard, I would like to know who makes them.

Thanks,

Jack



Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken

Hi Jack,

Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
 Who makes your motherboard? The problem you are having is that MSIX AND
 MSI are both failing as em0 comes up, so it falls back to Legacy interrupt
 mode,
 and must be having some issue with sharing the line, causing the storm.

The motherboard is an Asus P7H55-M.
Sorry, I should have mentioned that the dmesg output is from booting with :

 hw.pci.enable_msix=0
 hw.pci.enable_msi=0

.. in loader.conf.

With those lines in loader.conf, MSI and MSIX are disabled, both cards work 
like they should and there is no interrupt storm.

With MSI/MSIX enabled, both cards work like they should and I see the counters 
of the MSI interrupts increase (in small amounts, like they should), but at 
boot-time an interrupt storm starts on 'legacy' IRQ 16.

Because the only difference between disabling/enabling MSI/MSIX seems to be in 
the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the 
dmesg, I'm suspecting 'em1' is causing the storm.
(But please correct me if I'm wrong :)

What can I do to help track this problem down?

 


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380

Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel

Will you please set it back to a default and then boot and capture the
message for me?

Thank you,

Jack



Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Olivier Smedts

2011/5/4 Arnaud Lacombe lacom...@gmail.com:
 Obviously, I am no longer the only one finding that em(4) has
 unacceptable issue, this thread is the proof.

Right, and Jack seems to be willing to help, he asked something (I'll
reply tomorrow when I'll be in front of the hardware) and is trying to
find the same hardware.

Now please, let's not drift off the topic of the thread; everybody back to work :)

-- 
Olivier Smedts                                                 _
                                        ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org        - against HTML email  vCards  X
www: http://www.gid0.org    - against proprietary attachments / \

  There are only 10 kinds of people in the world:
  those who understand binary,
  and those who don't.

Re: newnfs NFS client testing

2011-05-04 Thread Rick Macklem

 On 2011-Apr-25 20:33:14 -0400, Rick Macklem rmack...@uoguelph.ca
 wrote:
 I believe that the new/experimental NFS client in head is now
 compatible with the old/regular NFS client.
 
 Possibly even too compatible...
 
 Both the old and new NFS clients assume a 1:1 mapping between NFS
 error codes (NFSERR_* macros defined in nfs/nfsproto.h) and the E*
 macros in sys/errno.h: In the old client, the NFS status is copied
 from the RPC response by nfsclient/nfs_krpc.c:nfs_request(), returned
 and passed back up the call chain. In the new client, the status is
 copied from the RPC response into struct nfsrv_descript.nd_repstat
 by fs/nfs/nfs_commonkrpc.c:newnfs_request() and then moved into an
 error return in fs/nfsclient/nfs_clrpcops.c:nfsrpc_*().
 
 This is not currently a problem but it would seem useful to include
 notes in nfs/nfsproto.h and sys/errno.h warning of this assumption
 in case of future changes.
 
 Note that both NFS servers do include code for error code mapping.
 
I guess that a comment might be in order. I know that the NFS ones will
never change, since they're wired into the RFCs. I doubt anyone has an
urge to renumber errno.h (the ones up to about 70), but a comment w.r.t.
that in nfsproto.h could be useful.

Thanks for the good suggestion, rick

Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Jack Vogel

I have had my validation engineer busy all day, we have tried both
a 9 kernel as well as 8.2,  using the code from HEAD, and we
cannot reproduce this problem.

The data your netstat -m shows suggests to me that what's happening
is somehow setup of the receive ring is running more than once maybe??

You asked at one point how this could go into STABLE, well, because
not only here at Intel, but at lots of external customers this code has been
used and tested thoroughly.

I am not calling into question your problem, but until I understand what it
is I cannot fix it :)

The thing I am guessing right now is the culprit is the setup code, the
reason is that when I ported to the igb driver I found that it did not
work on our newer hardware, and so I went back to the older version of
setup for igb. Now, even though I have not seen hardware fail with em,
maybe there is some.

To help me, please give me a complete pciconf -lv, and if it's a name-brand
system tell me that, including all hardware in it.

If you like, Olivier, I can make a version of em for you that also reverts
the setup code the way I did for igb, to see if that fixes it for you.

Thanks for your patience,

Jack

Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Arnaud Lacombe

Hi,

On Wed, May 4, 2011 at 5:38 PM, Jack Vogel jfvo...@gmail.com wrote:
> I have had my validation engineer busy all day, we have tried both
> a 9 kernel as well as 8.2,  using the code from HEAD, and we
> cannot reproduce this problem.
>
> The data your netstat -m shows suggests to me that what's happening
> is somehow setup of the receive ring is running more than once maybe??

That would be consistent with what I reported back in February. I'll
try to see if I can have a look at that on our platform tonight.

 - Arnaud


Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken

Hi,

On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:
> Will you please set it back to a default and then boot and capture the
> message for me?

No problem. Here's the output with MSI/MSIX enabled :
http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt

I've also added the output of vmstat -i a couple of minutes after a reboot 
with MSI enabled :
http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt

Note that in the above vmstat -i dump the interrupt storm hasn't started 
yet. For some reason the storm doesn't always start directly at boot. I 
haven't been able (yet) to pinpoint what's triggering it to start.


> On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote:
> > Hi Jack,
> >
> > Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
> > > Who makes your motherboard? The problem you are having is that MSIX AND
> > > MSI are both failing as em0 comes up, so it falls back to Legacy
> > > interrupt mode, and must be having some issue with sharing the line,
> > > causing the storm.
> >
> > The motherboard is an Asus P7H55-M.
> >
> > Sorry, I should have mentioned that the dmesg output is from booting
> > with :
> >     hw.pci.enable_msix=0
> >     hw.pci.enable_msi=0
> >
> > .. in loader.conf.
> >
> > With those lines in loader.conf, MSI and MSIX is disabled, both cards
> > work like they should and there is no interrupt storm.
> >
> > With MSI/MSIX enabled, both cards work like they should and I see the
> > counters of the MSI interrupts increase (in small amounts, like they
> > should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.
> >
> > Because the only difference between disabling/enabling MSI/MSIX seems to
> > be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according
> > to the dmesg, I'm suspecting 'em1' is causing the storm.
> > (But please correct me if I'm wrong :)
> >
> > What can I do to help track this problem down?
> >
> > > > According to dmesg the following devices share IRQ 16 :
> > > >
> > > >    pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
> > > >    em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f
> > > >      mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd
> > > >      irq 16 at device 0.0 on pci1
> > > >    vgapci0: VGA-compatible display port 0xbc00-0xbc07
> > > >      mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
> > > >    ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff
> > > >      irq 16 at device 26.0 on pci0
> > > >    em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f
> > > >      mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd
> > > >      irq 16 at device 0.0 on pci4
> > > >    pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
> > > >
> > > > During a storm vmstat -i shows a rate of about 220.000 interrupts/sec.
> > > > MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly
> > > > during a storm, as I see their counters increase normally in the
> > > > vmstat -i output.
> > > >
> > > > As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is
> > > > that the e1000 driver is causing this problem. Could it be that the
> > > > driver forgets to clear/mask legacy interrupts when attaching the MSI
> > > > interrupts perhaps?
> > > >
> > > > Any tips on how to debug and/or fix this?
> > > >
> > > >
> > > > The full output of dmesg can be found here :
> > > >    http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
> > > >
> > > > And the full output of pciconf -lv is here :
> > > >    http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt
> >
> > Regards,
> > --
> > Daan Vreeken
> > VEHosting
> > http://VEHosting.nl
> > tel: +31-(0)40-7113050 / +31-(0)6-46210825
> > KvK nr: 17174380


Regards,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380

Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel

This all looks completely kosher,  what IRQ is the storm on??

Jack



Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel

Right, it was you Wiktor :)  Oh, so yours is sort of a special case.

Thanks,

Jack


On Wed, May 4, 2011 at 3:27 PM, Wiktor Niesiobedzki b...@vink.pl wrote:

> 2011/5/4 Jack Vogel jfvo...@gmail.com:
> > This is the second report in a matter of a week perhaps about a
> > problematic motherboard, I would like to know who makes them.
>
> Just for the record, the motherboard with which I had problems (I
> guess my problem is here referred) is VIA EPIA SN1. It's nothing
> new, and probably rarely used with additional PCIe cards, as this is
> embedded-like creature.
>
> Cheers,
>
> Wiktor Niesiobedzki


Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Wiktor Niesiobedzki

2011/5/4 Jack Vogel jfvo...@gmail.com:
> This is the second report in a matter of a week perhaps about a problematic
> motherboard, I would like to know who makes them.

Just for the record, the motherboard with which I had problems (I
guess my problem is here referred) is VIA EPIA SN1. It's nothing
new, and probably rarely used with additional PCIe cards, as this is
embedded-like creature.

Cheers,

Wiktor Niesiobedzki

Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Daan Vreeken

On Thursday 05 May 2011 00:15:43 you wrote:
> This all looks completely kosher,  what IRQ is the storm on??

IRQ 16. Further down this email there is a list of devices that share the IRQ
according to 'dmesg'.


> On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken d...@vehosting.nl wrote:
> > Hi,
> >
> > On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:
> > > Will you please set it back to a default and then boot and capture the
> > > message for me?
> >
> > No problem. Here's the output with MSI/MSIX enabled :
> >
> > http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt
> >
> > I've also added the output of vmstat -i a couple of minutes after a
> > reboot with MSI enabled :
> > http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt
> >
> > Note that in the above vmstat -i dump the interrupt storm hasn't
> > started yet. For some reason the storm doesn't always start directly at
> > boot. I haven't been able (yet) to pinpoint what's triggering it to
> > start.
> >
> > > On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote:
> > > > Hi Jack,
> > > >
> > > > Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
> > > > > Who makes your motherboard? The problem you are having is that MSIX
> > > > > AND MSI are both failing as em0 comes up, so it falls back to Legacy
> > > > > interrupt mode, and must be having some issue with sharing the line,
> > > > > causing the storm.
> > > >
> > > > The motherboard is an Asus P7H55-M.
> > > >
> > > > Sorry, I should have mentioned that the dmesg output is from booting
> > > > with :
> > > >     hw.pci.enable_msix=0
> > > >     hw.pci.enable_msi=0
> > > > .. in loader.conf.
> > > >
> > > > With those lines in loader.conf, MSI and MSIX is disabled, both
> > > > cards work like they should and there is no interrupt storm.
> > > >
> > > > With MSI/MSIX enabled, both cards work like they should and I see the
> > > > counters of the MSI interrupts increase (in small amounts, like they
> > > > should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.
> > > >
> > > > Because the only difference between disabling/enabling MSI/MSIX seems
> > > > to be in the way em0/em1 are used, and because 'em1' shares IRQ 16
> > > > according to the dmesg, I'm suspecting 'em1' is causing the storm.
> > > > (But please correct me if I'm wrong :)
> > > >
> > > > What can I do to help track this problem down?
> > > >
> > > > > > According to dmesg the following devices share IRQ 16 :
> > > > > >
> > > > > >    pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
> > > > > >    em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f
> > > > > >      mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd
> > > > > >      irq 16 at device 0.0 on pci1
> > > > > >    vgapci0: VGA-compatible display port 0xbc00-0xbc07
> > > > > >      mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
> > > > > >    ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff
> > > > > >      irq 16 at device 26.0 on pci0
> > > > > >    em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f
> > > > > >      mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd
> > > > > >      irq 16 at device 0.0 on pci4
> > > > > >    pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
> > > > > >
> > > > > > During a storm vmstat -i shows a rate of about 220.000
> > > > > > interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1'
> > > > > > seems to work correctly during a storm, as I see their counters
> > > > > > increase normally in the vmstat -i output.
> > > > > >
> > > > > > As only 'em0' and 'em1' seem to be using MSI interrupts, my guess
> > > > > > is that the e1000 driver is causing this problem. Could it be that
> > > > > > the driver forgets to clear/mask legacy interrupts when attaching
> > > > > > the MSI interrupts perhaps?
> > > > > >
> > > > > > Any tips on how to debug and/or fix this?
> > > > > >
> > > > > >
> > > > > > The full output of dmesg can be found here :
> > > > > >
> > > > > > http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
> > > > > >
> > > > > > And the full output of pciconf -lv is here :
> > > > > >
> > > > > > http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt


Thanks,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380

Re: Interrupt storm with MSI in combination with em1

2011-05-04 Thread Jack Vogel

OK, but the reason you see the multiple cases of irq 16 is that that's the
bridge; once you are using MSIX, as vmstat shows, it's using other vectors.

Can you capture the messages file with the actual storm happening?

I noticed some complaints about checksums in the dmesg, have you
checked on BIOS upgrades or something like that on your motherboard?

Regards,

Jack



Re: problems with em(4) since update to driver 7.2.2

2011-05-04 Thread Arnaud Lacombe

Hi,

On Wed, May 4, 2011 at 5:38 PM, Jack Vogel jfvo...@gmail.com wrote:
> I have had my validation engineer busy all day, we have tried both
> a 9 kernel as well as 8.2,  using the code from HEAD, and we
> cannot reproduce this problem.

Actually, it can be trivially reproduced by tainting `error'. As it is
uninitialized in HEAD, its value can be _anything_, so let's mark it
as explicitly invalid.

diff -u ./if_em.c /data/src/freebsd/em-7.2.2/src/if_em.c
--- ./if_em.c   2011-02-18 01:18:23.0 -0500
+++ /data/src/freebsd/em-7.2.2/src/if_em.c  2011-05-05 01:12:01.0 -0400
@@ -3912,7 +3912,7 @@
    struct  adapter *adapter = rxr->adapter;
    struct em_buffer    *rxbuf;
    bus_dma_segment_t   seg[1];
-   int i, j, nsegs, error;
+   int i, j, nsegs, error = -1;

The error pointed out in this thread pops up in the next boot.
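
Why tainting helps can be shown with a small stand-alone sketch. Everything below is hypothetical (the sketch_* names, the slot structure, the always-succeeding loader) and does not reproduce the real if_em.c code; it only illustrates the general pattern of a setup loop with a path that never assigns `error', so the function returns whatever value `error' started with.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive slot; `loaded' stands in for "already has a buffer". */
struct sketch_slot {
	int loaded;
};

/* Hypothetical loader: always succeeds and marks the slot as loaded. */
int
sketch_load_slot(struct sketch_slot *slot)
{
	slot->loaded = 1;
	return (0);
}

/*
 * Set up `count' slots. `initial_error' models the starting value of
 * `error': pass -1 to taint it as in the patch above, or 0 to model a
 * correct initialization.
 */
int
sketch_setup_ring(struct sketch_slot *slots, size_t count, int initial_error)
{
	size_t i;
	int error = initial_error;

	assert(slots != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (slots[i].loaded)
			continue;	/* this path never assigns `error' */
		error = sketch_load_slot(&slots[i]);
		if (error != 0)
			break;
	}
	/* If every slot took the `continue' path, this is initial_error. */
	return (error);
}
```

With all slots already loaded, sketch_setup_ring(slots, n, -1) returns -1 while sketch_setup_ring(slots, n, 0) returns 0: the taint makes the unassigned path fail every time instead of only when stack garbage happens to be non-zero.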

 - Arnaud
