Re: Clang error make buildworld
On 2011-05-04 03:07, Manfred Antar wrote:
> I get this error when trying to buildworld on current i386.
> It's been this way for a while. Any ideas?
>
> ===> boot/i386/boot0 (all)
> clang -O2 -pipe -DVOLUME_SERIAL -DPXE -DFLAGS=0x8f -DTICKS=0xb6 -DCOMSPEED="7 << 5 + 3" -ffreestanding -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float -std=gnu99 -c /usr/src/sys/boot/i386/boot0/boot0.S
> clang: warning: argument unused during compilation: '-mpreferred-stack-boundary=2'
> /tmp/cc-4SXZt8.s:42:11: error: .code16 not supported yet

For some reason, on your system, it does not compile boot0.S with -no-integrated-as. It works fine here though, so it must be something local to your system. Can you please post:

- Your full make.conf and src.conf
- The first 30 lines of sys/boot/i386/boot0/Makefile

_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
>> Date: Tue, 3 May 2011 22:40:26 -0700
>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
>> From: Garrett Cooper yaneg...@gmail.com
>> To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com
>> Cc: FreeBSD Current freebsd-current@freebsd.org
>>
>> Hi Jeff and Dr. McKusick,
>> Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time):
>> ...
>> Let me know what other commands you would like for me to run in kgdb.
>> Thanks,
>> -Garrett
>
> You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.

I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part.

> Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:
>
> Index: ffs_softdep.c
> ===================================================================
> --- ffs_softdep.c	(revision 221385)
> +++ ffs_softdep.c	(working copy)
> @@ -11380,7 +11380,8 @@
>  			continue;
>  		}
>  		MNT_IUNLOCK(mp);
> -		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> +		if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
> +		    curthread)) {
>  			MNT_ILOCK(mp);
>  			continue;
>  		}
>
> If you are running an 8.X system, hopefully you will be able to apply it.

I've applied it, rebuilt and installed the kernel, and trying to repro the case again. Will let you know how things go!

Thanks!
-Garrett
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote:
> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
>>> From: Garrett Cooper yaneg...@gmail.com
>>> To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com
>>> Cc: FreeBSD Current freebsd-current@freebsd.org
>>>
>>> Hi Jeff and Dr. McKusick,
>>> Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time):
>>> ...
>>> Let me know what other commands you would like for me to run in kgdb.
>>> Thanks,
>>> -Garrett
>>
>> You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.
>
> I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part.
>
>> Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>
>> Index: ffs_softdep.c
>> ===================================================================
>> --- ffs_softdep.c	(revision 221385)
>> +++ ffs_softdep.c	(working copy)
>> @@ -11380,7 +11380,8 @@
>>  			continue;
>>  		}
>>  		MNT_IUNLOCK(mp);
>> -		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>> +		if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>> +		    curthread)) {
>>  			MNT_ILOCK(mp);
>>  			continue;
>>  		}
>>
>> If you are running an 8.X system, hopefully you will be able to apply it.
>
> I've applied it, rebuilt and installed the kernel, and trying to repro the case again. Will let you know how things go!

Happened again with the change. It's really easy to repro:

1. Get a filesystem with UFS+SU
2. Execute something that does a large number of small writes to a partition.
3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition

The kernel will panic with the issue I discussed above.

Thanks!
-Garrett
Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails
On 2011-05-03 20:53, O. Hartmann wrote:
> ...
> ld -m elf_i386_fbsd -Y P,/usr/obj/usr/src/lib32/usr/lib32 -o gcrt1.o -r crt1_s.o gcrt1_c.o
> ld: Relocatable linking with relocations from format elf64-x86-64-freebsd (crt1_s.o) to format elf32-i386-freebsd (gcrt1.o) is not supported
> ...
> Today, I tried again, after CLANG/LLVM has been updated to version 3.0. Same error.
> This is the addendum I made to the /etc/make.conf:
>
> ##
> ## CLANG
> ##
> .if defined(USE_CLANG)
> .if !defined(CC) || ${CC} == cc
> CC=clang
> .endif
> .if !defined(CXX) || ${CXX} == c++
> CXX=clang++
> .endif
> # Don't die on warnings
> NO_WERROR=
> WERROR=
> # Don't forget this when using Jails!
> NO_FSCHG=
> .endif

Ok, that looks good, I use a similar construction. However, in my case it works fine, so there must be something special on your system that breaks the build.

What happens here is that the 32-bit stage on amd64 fails, because it tries to link together 64-bit and 32-bit object files, which is not allowed. This can occur if Makefile.inc1 cannot set CC to the correct value, but there might also be something else going on.

To debug this further, can you please post:

- Your full /etc/make.conf
- Your full /etc/src.conf
- Any modifications you made to your source tree
- The specific procedure you use for buildworld
- A URL to a full build log (don't post it to the list, because it will be rather large)
Re: problems with em(4) since update to driver 7.2.2
2011/5/4 Jack Vogel jfvo...@gmail.com:
> It has nothing to do with load, it has to do with the prerequisites to init your interfaces. The amount you need is fixed, it doesn't vary with load. Every RX descriptor needs one, so it's simple math: number-of-interfaces X number-of-queues X size of the ring. If you have other network interfaces beside Intel, they also consume mbufs, remember.

Only one network interface.

# kldunload if_em.ko (the old one)
# sysctl kern.ipc.nmbclusters=512000 (I also tried with lower and more meaningful values)
# kldload ./if_em.ko (the new one)
# dmesg
em0: detached
pci0: <network, ethernet> at device 25.0 (no driver attached)
em0: <Intel(R) PRO/1000 Network Connection 7.2.3> port 0x2100-0x211f mem 0xf000-0xf001,0xf0025000-0xf0025fff irq 19 at device 25.0 on pci0
em0: Using an MSI interrupt
em0: Ethernet address: d4:85:64:b2:aa:f5
em0: Could not setup receive structures
em0: Could not setup receive structures

What can we do to help you debug this?

--
Olivier Smedts
e-mail: oliv...@gid0.org
www: http://www.gid0.org
ASCII ribbon campaign - against HTML email & vCards - against proprietary attachments

"There are only 10 kinds of people in the world: those who understand binary, and those who do not."
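The rule of thumb quoted above (interfaces times queues times ring size) can be sketched as a small calculation. This is only an illustration: `rx_clusters_needed` and `nmbclusters_sufficient` are hypothetical helper names, not part of the em(4) driver, and any ring size passed in is an assumption rather than a value reported in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not driver code: estimate how many mbuf clusters
 * are needed just to populate the receive rings, following the rule
 *   interfaces x queues-per-interface x descriptors-per-ring
 */
size_t
rx_clusters_needed(size_t n_interfaces, size_t n_queues, size_t ring_size)
{
	return (n_interfaces * n_queues * ring_size);
}

/* Returns 1 if the configured cluster limit covers the RX rings, else 0. */
int
nmbclusters_sufficient(size_t nmbclusters, size_t needed)
{
	return (nmbclusters >= needed);
}
```

By this arithmetic, a single interface with one queue needs far fewer clusters than the 512000 configured above, even with a ring of a few thousand descriptors, which suggests the "Could not setup receive structures" failure in this report is not explained by the cluster limit alone.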
Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails
2011/5/4 O. Hartmann ohart...@mail.zedat.fu-berlin.de:
> But when I tried to compile essential ports (essential to me), like x11-wm/windowmaker, multimedia/ffmpeg, for instance, I run into serious compiler/assembler errors with LLVM/CLANG. I guess the ports tree isn't mature for clang.
>
> So am I right in this thinking: leaving /etc/make.conf untouched in terms of not putting there the CLANG build construct and putting this instead into /etc/src.conf will only affect the OS' source tree to be built by clang and all ports are built by the antique system's gcc 4.2.1?

A lot of ports can't be built with clang. You can add something like this to your /etc/make.conf (modifying paths accordingly):

.if ${.CURDIR:M/usr/src*} || ${.CURDIR:M*/usr/ports/emulators/virtualbox-ose-kmod*}
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
NO_WERROR=
WERROR=
.endif

That's what I use. Note the first if statement.

Cheers

--
Olivier Smedts
e-mail: oliv...@gid0.org
www: http://www.gid0.org
ASCII ribbon campaign - against HTML email & vCards - against proprietary attachments

"There are only 10 kinds of people in the world: those who understand binary, and those who do not."
Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails
On 2011-05-04 09:17, O. Hartmann wrote:
> ...
> But when I tried to compile essential ports (essential to me), like x11-wm/windowmaker, multimedia/ffmpeg, for instance, I run into serious compiler/assembler errors with LLVM/CLANG. I guess the ports tree isn't mature for clang.

Several patches for this are available at:

http://rainbow-runner.nl/clang/patches/

but getting these into the ports tree itself is proving to be rather slow, for some reason. I see an ffmpeg patch in there, but no windowmaker one. I will have a look at what the problem is.

Note that if you run into problems with clang's integrated assembler, you can always add -no-integrated-as to CFLAGS, then it will use GNU as instead. It will just be a bit slower.

On the other hand, if there is a way to actually correct the assembly, or if it is really a bug in the integrated assembler, we would rather fix it properly. :)

> So am I right in this thinking: leaving /etc/make.conf untouched in terms of not putting there the CLANG build construct and putting this instead into /etc/src.conf will only affect the OS' source tree to be built by clang and all ports are built by the antique system's gcc 4.2.1?

No, you really *must* put any CC= definitions in make.conf; if you put them in src.conf, they will not be picked up early enough in some cases.

If you only want to build /usr/src with clang, and ports with gcc, it is probably best to surround the CC=clang definitions with:

.if !empty(.CURDIR:M/usr/src*)
# ...clang definitions here...
.endif
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On 4 May 2011 10:42, Garrett Cooper yaneg...@gmail.com wrote:
> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
>>> From: Garrett Cooper yaneg...@gmail.com
>>> To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com
>>> Cc: FreeBSD Current freebsd-current@freebsd.org
>>>
>>> Hi Jeff and Dr. McKusick,
>>> Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time):
>>> ...
>>> Let me know what other commands you would like for me to run in kgdb.
>>> Thanks,
>>> -Garrett
>>
>> You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.
>
> I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part.
>
>> Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>
>> Index: ffs_softdep.c
>> ===================================================================
>> --- ffs_softdep.c	(revision 221385)
>> +++ ffs_softdep.c	(working copy)
>> @@ -11380,7 +11380,8 @@
>>  			continue;
>>  		}
>>  		MNT_IUNLOCK(mp);
>> -		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>> +		if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>> +		    curthread)) {
>>  			MNT_ILOCK(mp);
>>  			continue;
>>  		}

FYI, I was playing with head (w/o the above patch) to reproduce the panic and got this LOR when the filesystem was eventually filled. I'm not sure the patch would fix the panic but I think it should at least fix the LOR.

kernel: pid 66153 (dd), uid 0 inumber 4 on /mnt: filesystem full
lock order reversal:
 1st 0xfe001d7d3310 ufs (ufs) @ /usr/src/sys/kern/vfs_vnops.c:614
 2nd 0xff807ba8a800 bufwait (bufwait) @ /usr/src/sys/kern/vfs_bio.c:2658
 3rd 0xfe001ade7588 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2126
KDB: stack backtrace:
db_trace_self_wrapper() at 0x802d9eba = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0x80475d17 = kdb_backtrace+0x37
_witness_debugger() at 0x8048b4fe = _witness_debugger+0x2e
witness_checkorder() at 0x8048c7a7 = witness_checkorder+0x807
__lockmgr_args() at 0x80427553 = __lockmgr_args+0xd63
ffs_lock() at 0x806578fc = ffs_lock+0x9c
VOP_LOCK1_APV() at 0x806f285f = VOP_LOCK1_APV+0xbf
_vn_lock() at 0x804e87c7 = _vn_lock+0x57
vget() at 0x804dbb5b = vget+0x7b
softdep_request_cleanup() at 0x80649f31 = softdep_request_cleanup+0x311
ffs_alloc() at 0x80630b64 = ffs_alloc+0x134
ffs_balloc_ufs2() at 0x8063426c = ffs_balloc_ufs2+0x11ac
ffs_write() at 0x8065889f = ffs_write+0x22f
VOP_WRITE_APV() at 0x806f33dd = VOP_WRITE_APV+0x14d
vn_write() at 0x804e9a42 = vn_write+0x2a2
dofilewrite() at 0x8048df25 = dofilewrite+0x85
kern_writev() at 0x8048f740 = kern_writev+0x60
write() at 0x8048f845 = write+0x55
syscallenter() at 0x80483cbb = syscallenter+0x1cb
syscall() at 0x806abaf0 = syscall+0x60
Xfast_syscall() at 0x8069670d = Xfast_syscall+0xdd
--- syscall (4, FreeBSD ELF64, write), rip = 0x8009438fc, rsp = 0x7fffda68, rbp = 0xa0 ---

--
wbr,
pluknet
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote:
>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
>>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
>>>> From: Garrett Cooper yaneg...@gmail.com
>>>> To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com
>>>> Cc: FreeBSD Current freebsd-current@freebsd.org
>>>>
>>>> Hi Jeff and Dr. McKusick,
>>>> Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time):
>>>> ...
>>>> Let me know what other commands you would like for me to run in kgdb.
>>>> Thanks,
>>>> -Garrett
>>>
>>> You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.
>>
>> I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part.
>>
>>> Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>>
>>> Index: ffs_softdep.c
>>> ===================================================================
>>> --- ffs_softdep.c	(revision 221385)
>>> +++ ffs_softdep.c	(working copy)
>>> @@ -11380,7 +11380,8 @@
>>>  			continue;
>>>  		}
>>>  		MNT_IUNLOCK(mp);
>>> -		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>>> +		if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>>> +		    curthread)) {
>>>  			MNT_ILOCK(mp);
>>>  			continue;
>>>  		}
>>>
>>> If you are running an 8.X system, hopefully you will be able to apply it.
>>
>> I've applied it, rebuilt and installed the kernel, and trying to repro the case again. Will let you know how things go!
>
> Happened again with the change. It's really easy to repro:
>
> 1. Get a filesystem with UFS+SU
> 2. Execute something that does a large number of small writes to a partition.
> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>
> The kernel will panic with the issue I discussed above.
> Thanks!

Jeff's change is required to avoid LORs, but it is not sufficient to prevent recursion. We must skip the vnode supplied as a parameter to softdep_request_cleanup(). Theoretically, other vnodes might also be locked by curthread, thus I think the change below is needed. Try this.

diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
index a6d4441..25fa5d6 100644
--- a/sys/ufs/ffs/ffs_softdep.c
+++ b/sys/ufs/ffs/ffs_softdep.c
@@ -11380,7 +11380,9 @@ retry:
 			continue;
 		}
 		MNT_IUNLOCK(mp);
-		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+		if (VOP_ISLOCKED(lvp) ||
+		    vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
+		    curthread)) {
 			MNT_ILOCK(mp);
 			continue;
 		}
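The idea behind the patch above (skip anything that is already locked, and never wait for a lock while scanning) can be illustrated with a toy user-space model. This is a sketch under stated assumptions, not kernel code: `toy_lock`, `toy_trylock`, `toy_unlock` and `toy_cleanup_scan` are all hypothetical names, and the model ignores interlocks, shared locks and real concurrency.

```c
#include <stddef.h>

/*
 * Toy model only: none of these names exist in the FreeBSD kernel.
 * A toy_lock is exclusive and non-recursive; owner == 0 means unlocked.
 */
struct toy_lock {
	int owner;	/* thread id of the holder, 0 if free */
};

/* Non-blocking acquire, loosely analogous to passing LK_NOWAIT. */
int
toy_trylock(struct toy_lock *lk, int tid)
{
	if (lk->owner != 0)
		return (-1);	/* busy: held by tid itself or by another thread */
	lk->owner = tid;
	return (0);
}

void
toy_unlock(struct toy_lock *lk)
{
	lk->owner = 0;
}

/*
 * Walk an array of locks on behalf of thread `tid`, processing every
 * object that can be locked without waiting and skipping the rest.
 * Returns the number of objects that were processed.
 */
size_t
toy_cleanup_scan(struct toy_lock *locks, size_t n, int tid)
{
	size_t i, cleaned = 0;

	for (i = 0; i < n; i++) {
		/* Skip anything already locked, including by the caller. */
		if (locks[i].owner != 0 || toy_trylock(&locks[i], tid) != 0)
			continue;
		cleaned++;		/* real work would happen here */
		toy_unlock(&locks[i]);
	}
	return (cleaned);
}
```

In this model a thread that already holds one of the locks never tries to take it again, so the scan cannot recurse on its own lock, and it never sleeps on a lock held by someone else.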
Re: firefox4+html5
Yes, I understand this, but the same issue occurs on the 8.x branch... Anyway, I turned off all debugging options and rebuilt the kernel. The issue has not disappeared.

On Tue, 3 May 2011 15:59:23 +0000 Holger Kipp holger.k...@alogis.com wrote:
> Dear Vitaly,
>
> I'm usually not using FreeBSD for accessing youtube, but as you're using FreeBSD 9.0-current, please note that this presumably has Witness enabled (because FreeBSD 9.0-current is still development branch), which will reduce performance and hence might give the problems you described.
>
> from http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-options.html
>
> options WITNESS: this option enables run-time lock order tracking and verification, and is an invaluable tool for deadlock diagnosis. WITNESS maintains a graph of acquired lock orders by lock type, and checks the graph at each acquire for cycles (implicit or explicit). If a cycle is detected, a warning and stack trace are generated to the console, indicating that a potential deadlock might have occurred. WITNESS is required in order to use the show locks, show witness and show alllocks DDB commands. This debug option has significant performance overhead, which may be somewhat mitigated through the use of options WITNESS_SKIPSPIN. Detailed documentation may be found in witness(4).
>
> => http://www.freebsd.org/cgi/man.cgi?query=witness&sektion=4
>
> Best regards,
> Holger
>
> From: owner-freebsd-curr...@freebsd.org [owner-freebsd-curr...@freebsd.org] on behalf of Vitaly Liaschuk [lari...@gmail.com]
> Sent: 03 May 2011 16:49
> To: FreeBSD current mailing list
> Subject: firefox4+html5
>
> Hi, list!
> I do not know which part of the forum to write in, so I decided to write in General.
> I'm trying to use html5 on youtube.com. I get the video stream, but audio stutters on most video files. I tried the chrome browser and it works fine. I also tried booting from a USB flash drive with ubuntu and firefox 4 installed, and this works. So I believe the trouble is in my FreeBSD.
>
> uname -a
> FreeBSD laptop 9.0-CURRENT FreeBSD 9.0-CURRENT #0 r221296M: Sun May 1 20:13:15 EEST 2011 larin@laptop:/usr/obj/usr/src/sys/GENERIC amd64
>
> My box is: Laptop asus UL30A, 3 GB ram, Intel CPU U2300 1.2GHz.
> Thanks in advance.
>
> --
> Holger Kipp
> Diplom-Mathematiker
> Senior Consultant
> Tel. : +49 30 436 58 114
> Fax. : +49 30 436 58 214
> Mobil: +49 178 36 58 114
> Email: holger.k...@alogis.com
> alogis AG
> Alt-Moabit 90b
> D-10559 Berlin
> web : http://www.alogis.com
> --
> alogis AG
> Sitz/Registergericht: Berlin/AG Charlottenburg, HRB 71484
> Vorstand: Arne Friedrichs, Joern Samuelson
> Aufsichtsratsvorsitzender: Reinhard Mielke
Re: firefox4+html5
On Tue, May 03, 2011 at 08:13:41PM +0200, Guido Falsi wrote:
> On 05/03/11 19:19, Francois Gerodez wrote:
>> Hi all,
>> I'm currently running FreeBSD 8.1 (latest update) and I'm experiencing a similar issue. HTML5 videos are very laggy (both image and sound) with Firefox 4. I ended up installing the flash player to watch youtube streaming. I didn't spot any particular warning/error messages so I don't know where to start...
>
> I have these symptoms too, but if I pause the video, send it back to the start with the slider and finally start playing, it usually goes smoothly.
> This is quite strange, I know.
> Perhaps someone else should check if this is the same for everyone or just something that is happening to me.

Forgot to mention: I'm using 8-stable, not -current. I did not check which list I was writing to. Sorry.

--
Guido Falsi m...@madpilot.net
Re: newnfs NFS client testing
On 2011-Apr-25 20:33:14 -0400, Rick Macklem rmack...@uoguelph.ca wrote:
> I believe that the new/experimental NFS client in head is now compatible with the old/regular NFS client.

Possibly even too compatible...

Both the old and new NFS clients assume a 1:1 mapping between NFS error codes (NFSERR_* macros defined in nfs/nfsproto.h) and the E* macros in sys/errno.h:

In the old client, the NFS status is copied from the RPC response by nfsclient/nfs_krpc.c:nfs_request(), returned and passed back up the call chain.

In the new client, the status is copied from the RPC response into struct nfsrv_descript.nd_repstat by fs/nfs/nfs_commonkrpc.c:newnfs_request() and then moved into an error return in fs/nfsclient/nfs_clrpcops.c:nfsrpc_*().

This is not currently a problem but it would seem useful to include notes in nfs/nfsproto.h and sys/errno.h warning of this assumption in case of future changes.

Note that both NFS servers do include code for error code mapping.

--
Peter Jeremy
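One way to make the 1:1 assumption described above explicit, beyond a comment, is a compile-time check. The sketch below is an illustration only and is not taken from either NFS client: the `TOY_NFSERR_*` names and `toy_nfs_status_to_errno` are hypothetical, the numeric values follow the nfsstat3 codes in RFC 1813, and only a small subset of codes is covered.

```c
#include <errno.h>

/*
 * Illustrative subset only. The TOY_NFSERR_* names are hypothetical
 * stand-ins for the NFSERR_* macros; the numeric values follow the
 * nfsstat3 codes in RFC 1813.
 */
enum toy_nfsstat {
	TOY_NFSERR_PERM  = 1,
	TOY_NFSERR_NOENT = 2,
	TOY_NFSERR_IO    = 5,
	TOY_NFSERR_NXIO  = 6,
	TOY_NFSERR_ACCES = 13,
	TOY_NFSERR_EXIST = 17
};

/* Fail the build if the 1:1 assumption ever stops holding. */
_Static_assert(TOY_NFSERR_PERM == EPERM, "NFS PERM must equal EPERM");
_Static_assert(TOY_NFSERR_NOENT == ENOENT, "NFS NOENT must equal ENOENT");
_Static_assert(TOY_NFSERR_IO == EIO, "NFS IO must equal EIO");
_Static_assert(TOY_NFSERR_NXIO == ENXIO, "NFS NXIO must equal ENXIO");
_Static_assert(TOY_NFSERR_ACCES == EACCES, "NFS ACCES must equal EACCES");
_Static_assert(TOY_NFSERR_EXIST == EEXIST, "NFS EXIST must equal EEXIST");

/* With the asserts in place, passing the status straight through is safe. */
int
toy_nfs_status_to_errno(enum toy_nfsstat status)
{
	return ((int)status);
}
```

A check of this kind would turn a silent mismatch into a build failure if either set of constants were renumbered, which is the scenario the note in the message is meant to guard against.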
Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails
On 05/04/11 09:00, Dimitry Andric wrote:
> On 2011-05-03 20:53, O. Hartmann wrote:
>> ...
>> ld -m elf_i386_fbsd -Y P,/usr/obj/usr/src/lib32/usr/lib32 -o gcrt1.o -r crt1_s.o gcrt1_c.o
>> ld: Relocatable linking with relocations from format elf64-x86-64-freebsd (crt1_s.o) to format elf32-i386-freebsd (gcrt1.o) is not supported
>> ...
>> Today, I tried again, after CLANG/LLVM has been updated to version 3.0. Same error.
>> This is the addendum I made to the /etc/make.conf:
>>
>> ##
>> ## CLANG
>> ##
>> .if defined(USE_CLANG)
>> .if !defined(CC) || ${CC} == cc
>> CC=clang
>> .endif
>> .if !defined(CXX) || ${CXX} == c++
>> CXX=clang++
>> .endif
>> # Don't die on warnings
>> NO_WERROR=
>> WERROR=
>> # Don't forget this when using Jails!
>> NO_FSCHG=
>> .endif
>
> Ok, that looks good, I use a similar construction. However, in my case it works fine, so there must be something special on your system that breaks the build.
>
> What happens here is that the 32-bit stage on amd64 fails, because it tries to link together 64-bit and 32-bit object files, which is not allowed. This can occur if Makefile.inc1 cannot set CC to the correct value, but there might also be something else going on.
>
> To debug this further, can you please post:
> - Your full /etc/make.conf
> - Your full /etc/src.conf
> - Any modifications you made to your source tree
> - The specific procedure you use for buildworld
> - A URL to a full build log (don't post it to the list, because it will be rather large)

When commenting out the wrapping

.if defined(USE_CLANG)
.endif

construct, as suggested by Olivier, it works.

I guess I found my mistake: in an earlier attempt at building FreeBSD with clang, I placed the wiki suggestions into /etc/src.conf and had forgotten about it. I will delete it and try again ...

On my lab's box (same OS, same revision, nearly the same hardware), building world/kernel worked fine even with the 'switch', but there was no /etc/src.conf.

Thanks for the hints. I'll report again.

Regards,
Oliver
Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails
On 05/04/11 09:00, Dimitry Andric wrote:
> On 2011-05-03 20:53, O. Hartmann wrote:
>> ...
>> ld -m elf_i386_fbsd -Y P,/usr/obj/usr/src/lib32/usr/lib32 -o gcrt1.o -r crt1_s.o gcrt1_c.o
>> ld: Relocatable linking with relocations from format elf64-x86-64-freebsd (crt1_s.o) to format elf32-i386-freebsd (gcrt1.o) is not supported
>> ...
>> Today, I tried again, after CLANG/LLVM has been updated to version 3.0. Same error.
>> This is the addendum I made to the /etc/make.conf:
>>
>> ##
>> ## CLANG
>> ##
>> .if defined(USE_CLANG)
>> .if !defined(CC) || ${CC} == cc
>> CC=clang
>> .endif
>> .if !defined(CXX) || ${CXX} == c++
>> CXX=clang++
>> .endif
>> # Don't die on warnings
>> NO_WERROR=
>> WERROR=
>> # Don't forget this when using Jails!
>> NO_FSCHG=
>> .endif
>
> Ok, that looks good, I use a similar construction. However, in my case it works fine, so there must be something special on your system that breaks the build.
>
> What happens here is that the 32-bit stage on amd64 fails, because it tries to link together 64-bit and 32-bit object files, which is not allowed. This can occur if Makefile.inc1 cannot set CC to the correct value, but there might also be something else going on.
>
> To debug this further, can you please post:
> - Your full /etc/make.conf
> - Your full /etc/src.conf
> - Any modifications you made to your source tree
> - The specific procedure you use for buildworld
> - A URL to a full build log (don't post it to the list, because it will be rather large)

Sorry for the noise.

But when I tried to compile essential ports (essential to me), like x11-wm/windowmaker, multimedia/ffmpeg, for instance, I run into serious compiler/assembler errors with LLVM/CLANG. I guess the ports tree isn't mature for clang.

So am I right in this thinking: leaving /etc/make.conf untouched in terms of not putting there the CLANG build construct and putting this instead into /etc/src.conf will only affect the OS' source tree to be built by clang and all ports are built by the antique system's gcc 4.2.1?

Oliver
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
> Date: Tue, 3 May 2011 22:40:26 -0700
> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
> From: Garrett Cooper yaneg...@gmail.com
> To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com
> Cc: FreeBSD Current freebsd-current@freebsd.org
>
> Hi Jeff and Dr. McKusick,
> Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time):
> ...
> Let me know what other commands you would like for me to run in kgdb.
> Thanks,
> -Garrett

You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.

Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:

Index: ffs_softdep.c
===================================================================
--- ffs_softdep.c	(revision 221385)
+++ ffs_softdep.c	(working copy)
@@ -11380,7 +11380,8 @@
 			continue;
 		}
 		MNT_IUNLOCK(mp);
-		if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+		if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
+		    curthread)) {
 			MNT_ILOCK(mp);
 			continue;
 		}

If you are running an 8.X system, hopefully you will be able to apply it.

Kirk McKusick
Re: problems with em(4) since update to driver 7.2.2
On Tue, May 03, 2011 at 11:50:53PM +0200, Michael Schmiedgen wrote:
> On 03.05.2011 23:24, Jack Vogel wrote:
>> If you get the setup receive structures fail, then increase the nmbclusters.
>>
>> If you use standard MTU then what you need are mbufs, and standard size clusters (2K).
>> Only when you use jumbo frames will you need larger.
>>
>> You must configure enough, it's that simple.
>
> I doubled the nmbclusters as well. But nothing happened. I have no load on this machine and nothing special configured.
>
> Thanks,
> Michael

Just a me too. I've been following -CURRENT (r221415) but I keep /sys/dev/e1000 at r218909 to keep em0 working without problems on -CURRENT.

I also tried 2x and 4x 25600 for max mbuf clusters via kern.ipc.nmbclusters. This didn't help.

-alastair
FreeBSD 9.0/CUR/amd64, built with LLVM/CLANG fails starting LibreOffice 3.3
After building FreeBSD 9.0/amd64 (FreeBSD 9.0-CURRENT #182 r221413: Tue May 3 23:30:11 CEST 2011), the opensource office software LibreOffice (libreoffice-3.3.2) won't start and crashes with

pid 53801 (soffice.bin), uid 8152: exited on signal 8 (core dumped)

(shown in console). What's wrong?

Oliver
Re: Clang error make buildworld
At 11:38 PM 5/3/2011, Dimitry Andric wrote:
> On 2011-05-04 03:07, Manfred Antar wrote:
>> I get this error when trying to buildworld on current i386.
>> It's been this way for a while. Any ideas?
>>
>> ===> boot/i386/boot0 (all)
>> clang -O2 -pipe -DVOLUME_SERIAL -DPXE -DFLAGS=0x8f -DTICKS=0xb6 -DCOMSPEED="7 << 5 + 3" -ffreestanding -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -msoft-float -std=gnu99 -c /usr/src/sys/boot/i386/boot0/boot0.S
>> clang: warning: argument unused during compilation: '-mpreferred-stack-boundary=2'
>> /tmp/cc-4SXZt8.s:42:11: error: .code16 not supported yet
>
> For some reason, on your system, it does not compile boot0.S with -no-integrated-as. It works fine here though, so it must be something local to your system. Can you please post:
> - Your full make.conf and src.conf
> - The first 30 lines of sys/boot/i386/boot0/Makefile

Ok:

src.conf:

WITHOUT_DYNAMICROOT=yes
WITH_IDEA=yes
.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
#Don't die on warnings
NO_WERROR=
WERROR=

make conf:

# $FreeBSD: src/share/examples/etc/make.conf,v 1.276 2006/03/21 09:49:05 ru Exp $
#
# NOTE: Please would any committer updating this file also update the
# make.conf(5) manual page, if necessary, which is located in
# src/share/man/man5/make.conf.5.
#
# /etc/make.conf, if present, will be read by make (see
# /usr/share/mk/sys.mk). It allows you to override macro definitions
# to make without changing your source tree, or anything the source
# tree installs.
#
# This file must be in valid Makefile syntax.
#
# There are additional things you can put into /etc/make.conf.
# You have to find those in the Makefiles and documentation of
# the source tree.
#
# Note, that you should not set MAKEOBJDIRPREFIX or MAKEOBJDIR
# from make.conf (or as command line variables to make).
# Both variables are environment variables for make and must be used as:
#
# env MAKEOBJDIRPREFIX=/big/directory make
#
#
# The CPUTYPE variable controls which processor should be targeted for
# generated code. This controls processor-specific optimizations in
# certain code (currently only OpenSSL) as well as modifying the value
# of CFLAGS to contain the appropriate optimization directive to gcc.
# The automatic setting of CFLAGS may be overridden using the
# NO_CPU_CFLAGS variable below.
# Currently the following CPU types are recognized:
#   Intel x86 architecture:
#     (AMD CPUs)   opteron athlon64 athlon-mp athlon-xp athlon-4
#                  athlon-tbird athlon k8 k6-3 k6-2 k6 k5
#     (Intel CPUs) nocona pentium4[m] prescott pentium3[m] pentium-m
#                  pentium2 pentiumpro pentium-mmx pentium i486 i386
#   Alpha/AXP architecture: ev67 ev6 pca56 ev56 ev5 ev45 ev4
#   AMD64 architecture: opteron, athlon64, nocona
#   Intel ia64 architecture: itanium2, itanium
#
# (?= allows to buildworld for a different CPUTYPE.)
#
#CPUTYPE?=pentium3
#NO_CPU_CFLAGS= # Don't add -march=cpu to CFLAGS automatically
#NO_CPU_COPTFLAGS= # Don't add -march=cpu to COPTFLAGS automatically
#
# CFLAGS controls the compiler settings used when compiling C code.
# Note that optimization settings other than -O and -O2 are not recommended
# or supported for compiling the world or the kernel - please revert any
# nonstandard optimization settings to -O or -O2 before submitting bug
# reports without patches to the developers.
#
#CFLAGS= -O -pipe -Wl,--export-dynamic #FOR APACHE#
#CFLAGS= -O -pipe -no-integrated-as
#
# CXXFLAGS controls the compiler settings used when compiling C++ code.
# Note that CXXFLAGS is initially set to the value of CFLAGS. If you wish
# to add to CXXFLAGS value, += must be used rather than =. Using =
# alone will remove the often needed contents of CFLAGS from CXXFLAGS.
#
#CXXFLAGS+= -fconserve-space
#
# MAKE_SHELL controls the shell used internally by make(1) to process the
# command scripts in makefiles.
Three shells are supported, sh, ksh, and # csh. Using sh is most common, and advised. Using ksh *may* work, but is # not guaranteed to. Using csh is absurd. The default is to use sh. # #MAKE_SHELL?=sh # # BDECFLAGS are a set of gcc warning settings that Bruce Evans has suggested # for use in developing FreeBSD and testing changes. They can be used by # putting CFLAGS+=${BDECFLAGS} in /etc/make.conf. -Wconversion is not # included here due to compiler bugs, e.g., mkdir()'s mode_t argument. # #BDECFLAGS= -W -Wall -ansi -pedantic -Wbad-function-cast -Wcast-align \ # -Wcast-qual -Wchar-subscripts -Winline \ # -Wmissing-prototypes -Wnested-externs -Wpointer-arith \ # -Wredundant-decls -Wshadow -Wstrict-prototypes -Wwrite-strings # # To compile just the kernel with special optimizations, you should use # this instead of CFLAGS (which is not applicable to kernel builds
Re: Clang error make buildworld
On 2011-05-04 15:44, Manfred Antar wrote:
> ...
> src.conf:
> WITHOUT_DYNAMICROOT=yes
> WITH_IDEA=yes
> .if !defined(CC) || ${CC} == cc
> CC=clang
> .endif
> .if !defined(CXX) || ${CXX} == c++
> CXX=clang++
> .endif
> #Don't die on warnings
> NO_WERROR=
> WERROR=

Aha. Please move the clang-related stuff to make.conf instead, e.g. this fragment:

.if !defined(CC) || ${CC} == cc
CC=clang
.endif
.if !defined(CXX) || ${CXX} == c++
CXX=clang++
.endif
#Don't die on warnings
NO_WERROR=
WERROR=

It will not work properly if you put it in src.conf. (You can really only use src.conf for WITH_FOO and WITHOUT_BAR type settings.)
Re: cardbus memory allocation problem
On Tuesday, May 03, 2011 7:49:29 pm Michael Butler wrote:

I have WIP patches to fix this but they aren't ready yet.

pcib4: I/O decode 0x4000-0x4fff
pcib4: memory decode 0xf090-0xf09f
*** this memory window is what I expected all children to allocate from
pcib4: no prefetched decode
pcib4: Subtractively decoded bridge.

It's a subtractive bridge, so the resources do not have to be allocated from the window. That said, I'm committing the last of my patches to HEAD today to rework how PCI-PCI bridges handle I/O windows to support growing windows, etc. and the new PCI-PCI bridge driver will attempt to grow the memory window to allocate a new range before falling back to depending on the subtractive decode.

You might be pleased to hear that, without any special arrangements in loader.conf, the new PCI-PCI code does The Right Thing with memory allocation :-)

Parent bridge: I fixed the subordinate bus using setpci -s 07:06.2 4c.b=02

I believe it should work even if you don't disable subtractive decoding.

-- John Baldwin
Interrupt storm with MSI in combination with em1
Hi All,

I've just updated a machine to -current (r221321) and since then I'm seeing an interrupt storm on irq 16. The storm goes away when I disable MSI/MSIX with the following lines in loader.conf :

hw.pci.enable_msix=0
hw.pci.enable_msi=0

According to dmesg the following devices share IRQ 16 :

pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

During a storm vmstat -i shows a rate of about 220,000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output.

As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this?

The full output of dmesg can be found here : http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
And the full output of pciconf -lv is here : http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt

Regards,
-- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380
Re: I am very confused and would appreciate some help on device renameing or on renumbering on current fstab.
On 4 May 2011 04:13, Jason Hellenthal jh...@dataix.net wrote:
> Edwin,
> /dev/acd0 /cdrom cd9660 ro,noauto 0 0
> /dev/acd1 /cdrom1 cd9660 ro,noauto 0 0
> As a side note: these are also now useless and can be sent to /dev/null for extra padding ;) They shouldn't cause any harm being there, but just for reference.
> -- Regards, (jhell) Jason Hellenthal

Just a sanity check here, people: if the machine was built with FreeBSD 6.x, I would guess the machine is a few years old. If so, I doubt the hardware would support AHCI, and therefore it wouldn't have the ada-type devices; it would have the old ad-style ATA ones, and therefore no fstab twiddling should be necessary. Forgive me if I'm missing something here.
Re: I am very confused and would appreciate some help on device renameing or on renumbering on current fstab.
On Wed, May 4, 2011 at 8:16 AM, krad kra...@gmail.com wrote:
> On 4 May 2011 04:13, Jason Hellenthal jh...@dataix.net wrote:
> > Edwin,
> > /dev/acd0 /cdrom cd9660 ro,noauto 0 0
> > /dev/acd1 /cdrom1 cd9660 ro,noauto 0 0
> > As a side note: these are also now useless and can be sent to /dev/null for extra padding ;) They shouldn't cause any harm being there, but just for reference.
> Just a sanity check here, people: if the machine was built with FreeBSD 6.x, I would guess the machine is a few years old. If so, I doubt the hardware would support AHCI, and therefore it wouldn't have the ada-type devices; it would have the old ad-style ATA ones, and therefore no fstab twiddling should be necessary. Forgive me if I'm missing something here.

If you enable options ATA_CAM in the kernel, which uses the old ata(4) driver via some cam(4) shims, then you also get the adaX device nodes. There are currently four ways to access PATA/SATA disks:
- old-style ata(4) using adX device nodes
- old-style ata(4) using ataahci(4) for AHCI-like access to PATA/SATA disks, I believe using adX
- old-style ata(4) via ATA_CAM using adaX device nodes
- new-style ahci(4)/siis(4)/another(4) using adaX device nodes

I forget the name of the other AHCI-style driver. The first two options use atacontrol to manage the disks. The last two options use camcontrol to manage the disks.

I believe the plan in 9.0 is to have everything accessed via ATA_CAM/ahci(4) so all PATA/SATA drives show up the same, as adaX, with everything being managed via camcontrol, finally unifying all PATA/SATA/SCSI/SAS disk access via cam(4).

-- Freddie Cash fjwc...@gmail.com
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
2011/5/4 Kostik Belousov kostik...@gmail.com: On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote: On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote: On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote: Date: Tue, 3 May 2011 22:40:26 -0700 Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full From: Garrett Cooper yaneg...@gmail.com To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com Cc: FreeBSD Current freebsd-current@freebsd.org Hi Jeff and Dr. McKusick, Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time): ... Let me know what other commands you would like for me to run in kgdb. Thanks, -Garrett You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that. I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part. Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it: Index: ffs_softdep.c === --- ffs_softdep.c (revision 221385) +++ ffs_softdep.c (working copy) @@ -11380,7 +11380,8 @@ continue; } MNT_IUNLOCK(mp); - if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) { + if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, + curthread)) { MNT_ILOCK(mp); continue; } If you are running an 8.X system, hopefully you will be able to apply it. I've applied it, rebuilt and installed the kernel, and trying to repro the case again. Will let you know how things go! Happened again with the change. It's really easy to repro: 1. Get a filesystem with UFS+SU 2. Execute something that does a large number of small writes to a partition. 3. 
'dd if=/dev/zero of=FOO bs=10m' on the same partition

The kernel will panic with the issue I discussed above. Thanks!

Jeff's change is required to avoid LORs, but it is not sufficient to prevent recursion. We must skip the vnode supplied as a parameter to softdep_request_cleanup(). Theoretically, other vnodes might also be locked by curthread, thus I think the change below is needed. Try this.

diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
index a6d4441..25fa5d6 100644
--- a/sys/ufs/ffs/ffs_softdep.c
+++ b/sys/ufs/ffs/ffs_softdep.c
@@ -11380,7 +11380,9 @@ retry:
                 continue;
         }
         MNT_IUNLOCK(mp);
-        if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
+        if (VOP_ISLOCKED(lvp) ||
+            vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
+            curthread)) {
                 MNT_ILOCK(mp);
                 continue;
         }

Ok. I'll let the make universe I have going run to completion, and once I get back home later on, I'll take a look at repro'ing this again with the above patch applied.

Thanks!
-Garrett
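The idea behind the patch above (take each lock without blocking, and skip anything that is already held) can be sketched in user space. This is only an analogy under stated assumptions: struct node and cleanup_unlocked() are hypothetical names, and default (non-recursive) pthread mutexes stand in for vnode locks. It is not the kernel code.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a vnode: a lock plus a "needs cleanup" flag. */
struct node {
    pthread_mutex_t lock;   /* assumed to be a default, non-recursive mutex */
    int dirty;
};

/*
 * Clean every node whose lock can be taken without blocking. Nodes that
 * are already locked, possibly by the calling thread itself, are skipped
 * instead of waited on; that is the role LK_NOWAIT plays in the patch.
 * Returns the number of nodes cleaned.
 */
static size_t
cleanup_unlocked(struct node *nodes, size_t n)
{
    size_t cleaned = 0;

    for (size_t i = 0; i < n; i++) {
        /* pthread_mutex_trylock() returns 0 on success, EBUSY if held. */
        if (pthread_mutex_trylock(&nodes[i].lock) != 0)
            continue;
        nodes[i].dirty = 0;
        cleaned++;
        pthread_mutex_unlock(&nodes[i].lock);
    }
    return cleaned;
}
```

A caller that already holds one node's lock can run cleanup_unlocked() over the whole set without deadlocking on its own lock; the held node is simply left for a later pass.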
Re: problems with em(4) since update to driver 7.2.2
Hi,

On Wed, May 4, 2011 at 3:58 AM, Olivier Smedts oliv...@gid0.org wrote:
> em0: Using an MSI interrupt
> em0: Ethernet address: d4:85:64:b2:aa:f5
> em0: Could not setup receive structures
> em0: Could not setup receive structures
> What can we do to help you debug this ?

At some point in time, in late February, I had the same issue on a 6-interface machine. I tracked this down to the fact that the main loop in em_setup_receive_ring() was not being entered. This resulted in junk being returned, as `error' is not explicitly initialized. At the time, the following patch worked for me. Without it the driver was unable to initialize with an RX/TX ring size of 512. With it, a ring size of 1024 initialized fine.

diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
index fb6ed67..f02059a 100644
--- a/sys/dev/e1000/if_em.c
+++ b/sys/dev/e1000/if_em.c
@@ -3901,7 +3901,7 @@ em_setup_receive_ring(struct rx_ring *rxr)
         struct adapter *adapter = rxr->adapter;
         struct em_buffer *rxbuf;
         bus_dma_segment_t seg[1];
-        int i, j, nsegs, error;
+        int i, j, nsegs, error = 0;

I did not dig much more at the time, but I was definitely seeing odd behavior. Anyhow, I am no longer able to reproduce this with 7.2.3, so I cannot dig into more detail.

Btw, I wish you all luck; it took me nearly two full months to convince Jack (and other FreeBSD devs) that there was a bug in the mbuf refresh code.

- Arnaud
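The failure mode described above, a loop that is never entered leaving the return value uninitialized, can be shown in a few lines. This is a minimal sketch, not the driver code: alloc_buffer() and setup_ring() are hypothetical names, and the fail_at parameter exists only to make the error path easy to exercise.

```c
#include <errno.h>

/* Hypothetical stand-in for allocating one receive buffer; fails on request. */
static int
alloc_buffer(int slot, int fail_at)
{
    return (slot == fail_at) ? ENOBUFS : 0;
}

/*
 * Set up `count` buffers and return 0 or the first error seen.
 * Initializing `error` matters when count == 0: the loop body never runs,
 * so without "= 0" the function would return an indeterminate value and
 * the caller could see a spurious failure.
 */
static int
setup_ring(int count, int fail_at)
{
    int error = 0;

    for (int i = 0; i < count; i++) {
        error = alloc_buffer(i, fail_at);
        if (error != 0)
            break;
    }
    return error;
}
```

With the initializer in place, setup_ring(0, -1) reports success, while a real allocation failure is still passed back unchanged.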
Re: problems with em(4) since update to driver 7.2.2
Hi, On Tue, May 3, 2011 at 7:25 PM, Olivier Smedts oliv...@gid0.org wrote: 2011/5/4 Arnaud Lacombe lacom...@gmail.com: A more rude version might be Why the frak my network adapter stopped working with the default setting ? :) ...on a -STABLE branch Maybe you should not have picked the rude version, Jack has a relatively low cut-off frequency :-) - Arnaud
Re: problems with em(4) since update to driver 7.2.2
No, I do not, Arnaud. But I refuse to rise to rude and uncivil behavior. It's your behavior, again and again, which causes you to not get responses, not my willingness to help and respond to those that behave like respectful customers and adults.

Jack

On Wed, May 4, 2011 at 10:20 AM, Arnaud Lacombe lacom...@gmail.com wrote: Hi, On Tue, May 3, 2011 at 7:25 PM, Olivier Smedts oliv...@gid0.org wrote: 2011/5/4 Arnaud Lacombe lacom...@gmail.com: A more rude version might be Why the frak my network adapter stopped working with the default setting ? :) ...on a -STABLE branch Maybe you should not have picked the rude version, Jack has a relatively low cut-off frequency :-) - Arnaud
Re: problems with em(4) since update to driver 7.2.2
Hi,

On Wed, May 4, 2011 at 1:24 PM, Jack Vogel jfvo...@gmail.com wrote: No, I do not, Arnaud. But I refuse to rise to rude and uncivil behavior. It's your behavior, again and again, which causes you to not get responses, not my willingness to help and respond to those that behave like respectful customers and adults.

Obviously, I am no longer the only one finding that em(4) has unacceptable issues; this thread is the proof.

- Arnaud

Jack On Wed, May 4, 2011 at 10:20 AM, Arnaud Lacombe lacom...@gmail.com wrote: Hi, On Tue, May 3, 2011 at 7:25 PM, Olivier Smedts oliv...@gid0.org wrote: 2011/5/4 Arnaud Lacombe lacom...@gmail.com: A more rude version might be Why the frak my network adapter stopped working with the default setting ? :) ...on a -STABLE branch Maybe you should not have picked the rude version, Jack has a relatively low cut-off frequency :-) - Arnaud
Re: Interrupt storm with MSI in combination with em1
Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm. This is the second report in a matter of a week perhaps about a problematic motherboard, I would like to know who makes them. Thanks, Jack On Wed, May 4, 2011 at 8:34 AM, Daan Vreeken d...@vehosting.nl wrote: Hi All, I've just updated a machine to -current (r221321) and since then I'm seeing an interrupt storm on irq 16. The storm goes away when I disable MSI/MSIX with the following lines in loader.conf : hw.pci.enable_msix=0 hw.pci.enable_msi=0 According to dmesg the following devices share IRQ 16 : pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0 em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1 vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0 ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0 em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4 pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0 During a storm vmstat -i shows a rate of about 220.000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output. As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this? 
The full output of dmesg can be found here : http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt And the full output of pciconf -lv is here : http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt Regards, -- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Interrupt storm with MSI in combination with em1
Hi Jack, Wednesday 04 May 2011 19:46:05 Jack Vogel wrote: Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm. The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with : hw.pci.enable_msix=0 hw.pci.enable_msi=0 .. in loader.conf. With those lines in loader.conf, MSI and MSIX is disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16. Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :) What can I do to help track this problem down? According to dmesg the following devices share IRQ 16 : pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0 em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1 vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0 ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0 em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4 pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0 During a storm vmstat -i shows a rate of about 220.000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output. 
As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this? The full output of dmesg can be found here : http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt And the full output of pciconf -lv is here : http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt Regards, -- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: Interrupt storm with MSI in combination with em1
Will you please set it back to a default and then boot and capture the message for me? Thank you, Jack On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote: Hi Jack, Wednesday 04 May 2011 19:46:05 Jack Vogel wrote: Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm. The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with : hw.pci.enable_msix=0 hw.pci.enable_msi=0 .. in loader.conf. With those lines in loader.conf, MSI and MSIX is disabled, both cards work like they should and there is no interrupt storm. With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16. Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :) What can I do to help track this problem down? According to dmesg the following devices share IRQ 16 : pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0 em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1 vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0 ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0 em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4 pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0 During a storm vmstat -i shows a rate of about 220.000 interrupts/sec. 
MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output. As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this? The full output of dmesg can be found here : http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt And the full output of pciconf -lv is here : http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt Regards, -- Daan Vreeken VEHosting http://VEHosting.nl tel: +31-(0)40-7113050 / +31-(0)6-46210825 KvK nr: 17174380 ___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: problems with em(4) since update to driver 7.2.2
2011/5/4 Arnaud Lacombe lacom...@gmail.com:
> Obviously, I am no longer the only one finding that em(4) has unacceptable issue, this thread is the proof.

Right, and Jack seems to be willing to help; he asked for something (I'll reply tomorrow when I'm in front of the hardware) and is trying to find the same hardware. Now please, let's not derail the thread; everybody back to work :)

--
Olivier Smedts                                                 _
                                        ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org        - against HTML email & vCards  X
www: http://www.gid0.org    - against proprietary attachments / \

There are only 10 kinds of people in the world: those who understand binary, and those who don't.
Re: newnfs NFS client testing
> On 2011-Apr-25 20:33:14 -0400, Rick Macklem rmack...@uoguelph.ca wrote:
> > I believe that the new/experimental NFS client in head is now compatible with the old/regular NFS client.
>
> Possibly even too compatible...
>
> Both the old and new NFS clients assume a 1:1 mapping between NFS error codes (NFSERR_* macros defined in nfs/nfsproto.h) and the E* macros in sys/errno.h:
>
> In the old client, the NFS status is copied from the RPC response by nfsclient/nfs_krpc.c:nfs_request(), returned and passed back up the call chain.
>
> In the new client, the status is copied from the RPC response into struct nfsrv_descript.nd_repstat by fs/nfs/nfs_commonkrpc.c:newnfs_request() and then moved into an error return in fs/nfsclient/nfs_clrpcops.c:nfsrpc_*().
>
> This is not currently a problem but it would seem useful to include notes in nfs/nfsproto.h and sys/errno.h warning of this assumption in case of future changes. Note that both NFS servers do include code for error code mapping.

I guess that a comment might be in order. I know that the NFS ones will never change, since they're wired into the RFCs. I doubt anyone has an urge to renumber errno.h (the ones up to about 70), but a comment w.r.t. that in nfsproto.h could be useful.

Thanks for the good suggestion, rick
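Beyond a comment, the 1:1 assumption discussed above could also be checked at compile time. The following is a hedged sketch rather than anything proposed in the thread: the SKETCH_NFSERR_* names and nfs_status_to_errno() are hypothetical, the numeric values are the NFSv3 status codes from RFC 1813, and only a handful of low-numbered codes are shown.

```c
#include <assert.h>
#include <errno.h>

/*
 * A few NFSv3 status values as assigned in RFC 1813. The names here are
 * hypothetical; the real macros live in nfs/nfsproto.h.
 */
enum {
    SKETCH_NFSERR_PERM  = 1,
    SKETCH_NFSERR_NOENT = 2,
    SKETCH_NFSERR_IO    = 5,
    SKETCH_NFSERR_NXIO  = 6,
    SKETCH_NFSERR_ACCES = 13,
    SKETCH_NFSERR_EXIST = 17
};

/* The build fails if errno.h is ever renumbered away from the wire values. */
static_assert(SKETCH_NFSERR_PERM == EPERM, "NFS PERM must equal EPERM");
static_assert(SKETCH_NFSERR_NOENT == ENOENT, "NFS NOENT must equal ENOENT");
static_assert(SKETCH_NFSERR_IO == EIO, "NFS IO must equal EIO");
static_assert(SKETCH_NFSERR_NXIO == ENXIO, "NFS NXIO must equal ENXIO");
static_assert(SKETCH_NFSERR_ACCES == EACCES, "NFS ACCES must equal EACCES");
static_assert(SKETCH_NFSERR_EXIST == EEXIST, "NFS EXIST must equal EEXIST");

/* With the guards in place, the client-side "mapping" is the identity. */
static int
nfs_status_to_errno(int status)
{
    return status;
}
```

The design choice is that a broken assumption shows up as a build error next to the definitions, instead of as a wrong errno reaching an application at run time.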
Re: problems with em(4) since update to driver 7.2.2
I have had my validation engineer busy all day; we have tried both a 9 kernel and 8.2, using the code from HEAD, and we cannot reproduce this problem.

The data your netstat -m shows suggests to me that setup of the receive ring is somehow running more than once, maybe?

You asked at one point how this could go into STABLE. Well, because not only here at Intel, but at lots of external customers, this code has been used and tested thoroughly. I am not calling your problem into question, but until I understand what it is I cannot fix it :)

My guess right now is that the culprit is the setup code. The reason is that when I ported it to the igb driver I found that it did not work on our newer hardware, and so I went back to the older version of setup for igb. Now, even though I have not seen hardware fail with em, maybe there is some.

To help me, please send a complete pciconf -lv, and if it's a name-brand system tell me that, including all hardware in it.

If you like, Olivier, I can make a version of em for you that also reverts the setup code the way I did for igb, to see if that fixes it for you?

Thanks for your patience,
Jack
Re: problems with em(4) since update to driver 7.2.2
Hi,

On Wed, May 4, 2011 at 5:38 PM, Jack Vogel jfvo...@gmail.com wrote:
> I have had my validation engineer busy all day; we have tried both a 9 kernel and 8.2, using the code from HEAD, and we cannot reproduce this problem.
>
> The data your netstat -m shows suggests to me that setup of the receive ring is somehow running more than once, maybe?

That would be consistent with what I reported back in February. I'll try to see if I can have a look at that on our platform tonight.

- Arnaud

> You asked at one point how this could go into STABLE. Well, because not only here at Intel, but at lots of external customers, this code has been used and tested thoroughly. I am not calling your problem into question, but until I understand what it is I cannot fix it :)
>
> My guess right now is that the culprit is the setup code. The reason is that when I ported it to the igb driver I found that it did not work on our newer hardware, and so I went back to the older version of setup for igb. Now, even though I have not seen hardware fail with em, maybe there is some.
>
> To help me, please send a complete pciconf -lv, and if it's a name-brand system tell me that, including all hardware in it.
>
> If you like, Olivier, I can make a version of em for you that also reverts the setup code the way I did for igb, to see if that fixes it for you?
>
> Thanks for your patience,
> Jack
Re: Interrupt storm with MSI in combination with em1
Hi,

On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:

Will you please set it back to a default and then boot and capture the message for me?

No problem. Here's the output with MSI/MSIX enabled :
http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt

I've also added the output of vmstat -i a couple of minutes after a reboot with MSI enabled :
http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt

Note that in the above vmstat -i dump the interrupt storm hasn't started yet. For some reason the storm doesn't always start directly at boot. I haven't been able (yet) to pinpoint what's triggering it to start.

On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote:

Hi Jack,

Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:

Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm.

The motherboard is an Asus P7H55-M. Sorry, I should have mentioned that the dmesg output is from booting with :

hw.pci.enable_msix=0
hw.pci.enable_msi=0

.. in loader.conf. With those lines in loader.conf, MSI and MSIX are disabled, both cards work like they should and there is no interrupt storm.

With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.

Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm. (But please correct me if I'm wrong :)

What can I do to help track this problem down?

According to dmesg the following devices share IRQ 16 :

pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

During a storm vmstat -i shows a rate of about 220,000 interrupts/sec. MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output.

As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps? Any tips on how to debug and/or fix this?

The full output of dmesg can be found here :
http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt

And the full output of pciconf -lv is here :
http://vehosting.nl/pub_diffs/pciconf_plantje2_2011_05_04.txt

Regards,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
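The question about masking legacy interrupts can be made concrete with a small sketch. In conventional PCI (2.3 and later), bit 10 of the command register is the Interrupt Disable bit, which stops a function from asserting its INTx line. The helpers below only manipulate a plain 16-bit value and do not touch hardware; the names `CMD_INTX_DISABLE`, `cmd_mask_intx` and `cmd_intx_masked` are made up for illustration, and nothing in this thread establishes whether em(4) or the PCI bus code actually sets or clears this bit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 10 of the PCI command register: Interrupt Disable (INTx masking). */
#define CMD_INTX_DISABLE 0x0400u

/* Return the command register value with legacy INTx delivery masked. */
uint16_t cmd_mask_intx(uint16_t cmd)
{
	return (uint16_t)(cmd | CMD_INTX_DISABLE);
}

/* Report whether legacy INTx delivery is masked in this register value. */
bool cmd_intx_masked(uint16_t cmd)
{
	return (cmd & CMD_INTX_DISABLE) != 0;
}
```

If the bit turned out to be clear on em1 while MSI is in use, that would at least be compatible with the theory above; it would not prove it, since other devices share IRQ 16.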
Re: Interrupt storm with MSI in combination with em1
This all looks completely kosher, what IRQ is the storm on??

Jack
Re: Interrupt storm with MSI in combination with em1
Right, it was you Wiktor :) Oh, so yours is sort of a special case.

Thanks,

Jack

On Wed, May 4, 2011 at 3:27 PM, Wiktor Niesiobedzki b...@vink.pl wrote:

2011/5/4 Jack Vogel jfvo...@gmail.com:

This is the second report in a matter of a week perhaps about a problematic motherboard, I would like to know who makes them.

Just for the record, the motherboard with which I had problems (I guess my problem is here referred) is VIA EPIA SN1. It's nothing new, and probably rarely used with additional PCIe cards, as this is embedded-like creature.

Cheers,

Wiktor Niesiobedzki
Re: Interrupt storm with MSI in combination with em1
2011/5/4 Jack Vogel jfvo...@gmail.com:

This is the second report in a matter of a week perhaps about a problematic motherboard, I would like to know who makes them.

Just for the record, the motherboard with which I had problems (I guess my problem is here referred) is VIA EPIA SN1. It's nothing new, and probably rarely used with additional PCIe cards, as this is embedded-like creature.

Cheers,

Wiktor Niesiobedzki
Re: Interrupt storm with MSI in combination with em1
On Thursday 05 May 2011 00:15:43 you wrote:

This all looks completely kosher, what IRQ is the storm on??

IRQ 16. Further down this email there is a list of devices that share the IRQ according to 'dmesg'.

According to dmesg the following devices share IRQ 16 :

pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0

Thanks,
--
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
OK, but the reason you see the multiple cases of irq 16 is that's the bridge, once you are using MSIX, as vmstat shows, its using other vectors.

Can you capture the messages file with the actual storm happening? I noticed some complaints about checksums in the dmesg, have you checked on BIOS upgrades or something like that on your motherboard?

Regards,

Jack
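When capturing the storm, it may help to compute the rate from two counter samples rather than reading it off a single listing, since the rate column of vmstat -i is, as far as I know, an average over the whole uptime. The sketch below is plain arithmetic on two counter readings; the function names and the threshold are hypothetical and are not taken from vmstat or the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Interrupts per second between two counter samples taken `seconds` apart.
 * Returns 0.0 if the interval is not positive or the counter went backwards.
 */
double intr_rate(uint64_t before, uint64_t after, double seconds)
{
	if (seconds <= 0.0 || after < before)
		return 0.0;
	return (double)(after - before) / seconds;
}

/* Flag a rate as a probable storm; the threshold is chosen by the caller. */
bool looks_like_storm(double rate, double threshold)
{
	return rate > threshold;
}
```

For instance, a counter that grows by 2,200,000 over ten seconds gives the roughly 220,000 interrupts/sec reported earlier in the thread.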
Re: problems with em(4) since update to driver 7.2.2
Hi,

On Wed, May 4, 2011 at 5:38 PM, Jack Vogel jfvo...@gmail.com wrote:

I have had my validation engineer busy all day, we have tried both a 9 kernel as well as 8.2, using the code from HEAD, and we cannot reproduce this problem.

Actually, it can be trivially reproduced by tainting `error'. As it is uninitialized in HEAD, its value can be _anything_, so let's mark it as explicitly invalid.

diff -u ./if_em.c /data/src/freebsd/em-7.2.2/src/if_em.c
--- ./if_em.c 2011-02-18 01:18:23.0 -0500
+++ /data/src/freebsd/em-7.2.2/src/if_em.c 2011-05-05 01:12:01.0 -0400
@@ -3912,7 +3912,7 @@
  struct adapter *adapter = rxr->adapter;
  struct em_buffer *rxbuf;
  bus_dma_segment_t seg[1];
- int i, j, nsegs, error;
+ int i, j, nsegs, error = -1;

The error pointed out in this thread pops up in the next boot.

- Arnaud

The data your netstat -m shows suggests to me that what's happening is somehow setup of the receive ring is running more than once maybe??

You asked at one point how this could go into STABLE, well, because not only here at Intel, but at lots of external customers this code has been used and tested thoroughly. I am not calling into question your problem, but until I understand what it is I cannot fix it :)

The thing I am guessing right now is the culprit is the setup code, the reason is that when I ported to the igb driver I found that it did not work on our newer hardware, and so I went back to the older version of setup for igb. Now, even though I have not seen hardware fail with em, maybe there is some.

To help me give me a complete pciconf -lv, and if its a namebrand system tell me that, including all hardware in it.

If you like Olivier I can make a version of em for you that also reverts the setup code the way I did for igb, see if that fixes it for you?

Thanks for your patience,

Jack
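The tainting trick in the patch can be illustrated outside the driver. What follows is a minimal sketch and not the em(4) code: `slot_setup` and `ring_setup` are hypothetical names, and the `initial` parameter stands in for whatever value the local `error` starts with (stack garbage in the unpatched case, -1 with the taint, 0 in a fix). The point is that any path through the loop that never assigns `error` returns the starting value unchanged.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-slot setup step: 0 on success, ENOBUFS on failure. */
int slot_setup(int slot_ok)
{
	return slot_ok ? 0 : ENOBUFS;
}

/*
 * Walk a ring of n slots and set up the ones that need it.
 * `initial` models the starting value of the local error variable.
 */
int ring_setup(int initial, const int *needs_setup, const int *slot_ok, size_t n)
{
	int error = initial;

	for (size_t i = 0; i < n; i++) {
		if (!needs_setup[i])
			continue;	/* this path never assigns error */
		error = slot_setup(slot_ok[i]);
		if (error != 0)
			break;
	}
	return error;
}
```

With no slot needing setup, `ring_setup(-1, ...)` returns -1 and the caller sees a failure on every run, which is what makes the tainted build reproduce the problem reliably; starting from 0 the same call reports success.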