Re: problems with em(4) since update to driver 7.2.2
Hi,

On Thu, May 5, 2011 at 1:20 AM, Arnaud Lacombe lacom...@gmail.com wrote:
> Hi,
>
> On Wed, May 4, 2011 at 5:38 PM, Jack Vogel jfvo...@gmail.com wrote:
>> I have had my validation engineer busy all day, we have tried both a 9 kernel as well as 8.2, using the code from HEAD, and we cannot reproduce this problem.
>>
> Actually, it can be trivially reproduced by tainting `error'. As it is uninitialized in HEAD, its value can be _anything_, so let's mark it as explicitly invalid.
>
> diff -u ./if_em.c /data/src/freebsd/em-7.2.2/src/if_em.c
> --- ./if_em.c 2011-02-18 01:18:23.0 -0500
> +++ /data/src/freebsd/em-7.2.2/src/if_em.c 2011-05-05 01:12:01.0 -0400
> @@ -3912,7 +3912,7 @@
>  	struct adapter *adapter = rxr->adapter;
>  	struct em_buffer *rxbuf;
>  	bus_dma_segment_t seg[1];
> -	int i, j, nsegs, error;
> +	int i, j, nsegs, error = -1;
>
> The error pointed out in this thread pops up in the next boot.

I put a call to kdb_enter() at the beginning of the function and, with the help of a textdump, got the backtrace [0] for every call to em_setup_receive_ring(). All are exactly the same:

kdb_enter_why(0,c09f6511,f391aaa8,c09be1e2,c09f6511,...) at kdb_enter_why+0x3b
kdb_enter(c09f6511,0,3810,,5dc,...) at kdb_enter+0x19
em_setup_receive_ring(c3c8d600,c3c8d7a4,c3c96004,31fa,c3c8d600,...) at em_setup_receive_ring+0x22
em_setup_receive_structures(c3c96000,f15f2000,38,8100,3,...) at em_setup_receive_structures+0x26
em_init_locked(c3c96000,0,c09f5de5,414,1,...) at em_init_locked+0x2f2
em_ioctl(c3c7d000,80206934,c3ce9d00,c07b7a0b,c3f2a230,...) at em_ioctl+0x1c3
ifhwioctl(c3f2a230,f391ac34,c07b7a0b,c3f3e3d0,c08df1c0,...) at ifhwioctl+0x4b8
ifioctl(c3f3e3d0,80206934,c3ce9d00,c3f2a230,c3f2a230,...) at ifioctl+0x82
kern_ioctl(c3f2a230,3,80206934,c3ce9d00,c3ce9d00,...) at kern_ioctl+0xa8
ioctl(c3f2a230,f391acf8,c,c,f391ad2c,...) at ioctl+0xc5
syscall(f391ad38) at syscall+0x17d
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x4816ee23, esp = 0xbfbfe67c, ebp = 0xbfbfe698 ---

This fully explains why the main loop in em_setup_receive_ring() is never entered: `j == rxr->next_to_check' always holds (provided that the mbufs have been refreshed if some packets were transferred), so the function returns whatever value is on the stack.

As of now, besides changing the call site of em_setup_receive_ring() to ensure it is never re-entered, I'd guess that the patch I sent earlier today is the only way to ensure that no junk is returned. I'd guess that the driver _is_ able to transmit, if the code was not explicitly calling em_stop() upon em_setup_receive_structures() failure.

 - Arnaud

[0]: I wish that would have been as easy as in Linux, where a WARN() call does all the job automatically, but still, I should not hope for that much unless I am the one implementing it ... yes, free whining, it's 2 a.m. ...

>  - Arnaud
>
>> The data your netstat -m shows suggests to me that what's happening is somehow setup of the receive ring is running more than once maybe??
>>
>> You asked at one point how this could go into STABLE, well, because not only here at Intel, but at lots of external customers this code has been used and tested thoroughly. I am not calling into question your problem, but until I understand what it is I cannot fix it :)
>>
>> The thing I am guessing right now is the culprit is the setup code, the reason is that when I ported to the igb driver I found that it did not work on our newer hardware, and so I went back to the older version of setup for igb. Now, even though I have not seen hardware fail with em, maybe there is some.
>>
>> To help me, give me a complete pciconf -lv, and if it's a name-brand system tell me that, including all hardware in it.
>>
>> If you like, Olivier, I can make a version of em for you that also reverts the setup code the way I did for igb, to see if that fixes it for you?
>>
>> Thanks for your patience,
>>
>> Jack

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
bsdlabel showing value zero on fsize, bsize and bps/cpg for all partitions
Hello,

on FreeBSD versions 6 and 7 I was relying on bsdlabel to get the block size:

# bsdlabel /dev/mirror/gm4s1
# /dev/mirror/gm4s1:
8 partitions:
#        size     offset    fstype   [fsize bsize bps/cpg]
  a:    4194304          0    4.2BSD     2048 16384 28528
  b:    8388608    4194304      swap
  c:  293041602          0    unused        0     0         # raw part, don't edit
  d:    2097152   12582912    4.2BSD     2048 16384 28528
  e:   52428800   14680064    4.2BSD     2048 16384 28528
  f:  225932738   67108864    4.2BSD     2048 16384 28528

but on 8.1 and 8.2 I get zero values:

# bsdlabel /dev/mirror/gm6s1
# /dev/mirror/gm6s1:
8 partitions:
#        size     offset    fstype   [fsize bsize bps/cpg]
  a:    4194304          0    4.2BSD        0     0     0
  b:   17186816    4194304      swap
  c: 1953525105          0    unused        0     0         # raw part, don't edit
  d:    2097152   21381120    4.2BSD        0     0     0
  e:   83886080   23478272    4.2BSD        0     0     0
  f: 1846160753  107364352    4.2BSD        0     0     0

Has anyone seen this? Is it some step I missed on install? Is there another command to get the block size?

Thanks for your help,
regards,
Sergi
Re: problems with em(4) since update to driver 7.2.2
OK, but what this does not explain is why I do not see this if it's so easily reproduced. What causes the failure case, any idea?

As I said, given the code was not feasible for igb anyway, I would not be unhappy about returning to the old way of doing things.

Jack

On Wed, May 4, 2011 at 11:03 PM, Arnaud Lacombe lacom...@gmail.com wrote:
> [...]
> This fully explain why the main loop in em_setup_receive_ring() is never entered, as we always verify `j == rxr->next_to_check' (provided that mbuf have been refreshed if some packet were transfered) and return the value on the stack.
> [...]
Re: bsdlabel showing value zero on fsize, bsize and bps/cpg for all partitions
On 05.05.2011 10:35, Sergi Seira wrote:
> on freebsd version 6 and 7 I was relying on bsdlabel to get block size :
> Has anyone seen this? Is it some step I missed on install?
> Is there another command to get block size?

I think dumpfs(8) is a better tool for doing that.

-- 
WBR, Andrey V. Elsukov
My problems with stability on -current
This is long, sorry. I wish I could condense things down to just the answer, or even just the question, but here goes.

I've used HEAD on my main workstation(s) for many years. It's common for there to be ups and downs, and that's fine. Lately, however, the problems have been debilitating.

First a timeline. Since sometime before January 2008 I've been using a Dell Latitude D620 laptop as my primary system. It has a Core 2 Duo running at 2.33 GHz and 2 GB of RAM. I quad-boot it with Windows XP, FreeBSD -current (amd64), another FreeBSD (usually 8.N-RELEASE i386) and Ubuntu. On the first and last I don't do a lot of compiling, obviously, but even under heavy load on 8.2-RELEASE I'm not seeing problems, so the problems I _am_ seeing are not hardware related.

I keep my system very close to stock. My kernel config is GENERIC minus devices I don't have, plus the following:

options EXT2FS
options IEEE80211_DEBUG # enable debug msgs
options VESA
device atapicam
device sound
device snd_hda
device snp

I was building with clang for a while, but when the problems started I went back to gcc. I still have INVARIANTS on, but I disabled WITNESS because with all the known and unfixed LORs it's kind of pointless. There is nothing interesting in make.conf or src.conf either; the latter is just a list of stuff not to build, KERNCONF, and MODULES_OVERRIDE.

Around December 2009 I started having problems under load with -current. Often I reported them; sometimes problems were found, sometimes not. In the course of trying to debug those problems I disabled throttling, which helped. Switching to SCHED_4BSD also helped quite a bit with interactivity under load, although it was still worse than on 8.x.

In October of 2010 I was lucky enough to receive a donation of a Dell Optiplex desktop that I started using as my primary workstation. Around that same time there was some work being done in the scheduler(s) and various related systems, and my desktop (which had a slightly faster Core 2 Duo and 4 GB of RAM) was running great. I assumed that the problems were solved.

Then 2 months ago I packed up the desktop system and pulled out the laptop again. I updated to the latest -current on the laptop, and all heck broke loose. I couldn't do anything on my laptop that created even a mediocre load without it crashing. Trying to do something like a buildworld (even without -j) would cause the system to absolutely crawl. I'd get tons of the dreaded calcru messages about time going backwards, and the system clock would lose literally minutes of wall clock time. At one point, when I could keep it up long enough to build the world without crashing, it had lost 40 minutes of wall clock time by the time it finished. I think that specific problem was introduced sometime between March 15 and r220282.

In trying to find that problem, I uncovered another, deeper problem with the one-shot timers from r212541. To make my binary search for the problem described above easier, I was using a -current snapshot CD from August 2010 that I had lying around. With it I could easily build world with -j2, run X, and do normal desktop stuff (firefox, thunderbird, pidgin, etc.) all at the same time. When I got closer to the more recent -current, it would crash as soon as I put a load on it. I eventually bisected down to that exact commit. I've been running on r212540 for over a week now without any problems, including lots of port builds with FORCE_MAKE_JOBS, etc.

Alexander suggested some knobs to twist for the timers, and I'll be glad to do that once he gets back to me with more concrete suggestions now that he knows more about my specific problems.

Doug

-- 
Nothin' ever doesn't change, but nothin' changes much.
 -- OK Go

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price. :) http://SupersetSolutions.com/
Re: Interrupt storm with MSI in combination with em1
Hi Jack,

On Thursday 05 May 2011 02:25:39 Jack Vogel wrote:
> OK, but the reason you see the multiple cases of irq 16 is that's the bridge, once you are using MSIX, as vmstat shows, its using other vectors.
>
> Can you capture the messages file with the actual storm happening?

I'll do that as soon as I witness another storm. Right now the system has been up over half a day (with MSI/MSIX enabled) and everything seems to be working as it should.

> I noticed some complaints about checksums in the dmesg, have you checked on BIOS upgrades or something like that on your motherboard?

Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version.

I'll keep you informed as soon as I get another storm going.

> On Wed, May 4, 2011 at 4:27 PM, Daan Vreeken d...@vehosting.nl wrote:
>> On Thursday 05 May 2011 00:15:43 you wrote:
>>> This all looks completely kosher, what IRQ is the storm on??
>>
>> IRQ 16. Further down this email there is a list of devices that share the IRQ according to 'dmesg'.
>>
>>> On Wed, May 4, 2011 at 3:04 PM, Daan Vreeken d...@vehosting.nl wrote:
>>>> Hi,
>>>>
>>>> On Wednesday 04 May 2011 20:47:32 Jack Vogel wrote:
>>>>> Will you please set it back to a default and then boot and capture the message for me?
>>>>
>>>> No problem. Here's the output with MSI/MSIX enabled :
>>>> http://vehosting.nl/pub_diffs/dmesg_plantje2_with_msix_2011_05_04.txt
>>>>
>>>> I've also added the output of vmstat -i a couple of minutes after a reboot with MSI enabled :
>>>> http://vehosting.nl/pub_diffs/vmstat_i_2011_05_04.txt
>>>>
>>>> Note that in the above vmstat -i dump the interrupt storm hasn't started yet. For some reason the storm doesn't always start directly at boot. I haven't been able (yet) to pinpoint what's triggering it to start.
>>>>
>>>>> On Wed, May 4, 2011 at 11:19 AM, Daan Vreeken d...@vehosting.nl wrote:
>>>>>> Hi Jack,
>>>>>>
>>>>>> Wednesday 04 May 2011 19:46:05 Jack Vogel wrote:
>>>>>>> Who makes your motherboard? The problem you are having is that MSIX AND MSI are both failing as em0 comes up, so it falls back to Legacy interrupt mode, and must be having some issue with sharing the line, causing the storm.
>>>>>>
>>>>>> The motherboard is an Asus P7H55-M.
>>>>>> Sorry, I should have mentioned that the dmesg output is from booting with :
>>>>>> hw.pci.enable_msix=0
>>>>>> hw.pci.enable_msi=0
>>>>>> .. in loader.conf.
>>>>>>
>>>>>> With those lines in loader.conf, MSI and MSIX is disabled, both cards work like they should and there is no interrupt storm.
>>>>>> With MSI/MSIX enabled, both cards work like they should and I see the counters of the MSI interrupts increase (in small amounts, like they should), but at boot-time an interrupt storm starts on 'legacy' IRQ 16.
>>>>>> Because the only difference between disabling/enabling MSI/MSIX seems to be in the way em0/em1 are used, and because 'em1' shares IRQ 16 according to the dmesg, I'm suspecting 'em1' is causing the storm.
>>>>>> (But please correct me if I'm wrong :)
>>>>>>
>>>>>> What can I do to help track this problem down?
>>>>>>
>>>>>> According to dmesg the following devices share IRQ 16 :
>>>>>> pcib1: ACPI PCI-PCI bridge irq 16 at device 1.0 on pci0
>>>>>> em0: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xcc00-0xcc1f mem 0xf7de-0xf7df,0xf7d0-0xf7d7,0xf7ddc000-0xf7dd irq 16 at device 0.0 on pci1
>>>>>> vgapci0: VGA-compatible display port 0xbc00-0xbc07 mem 0xf780-0xf7bf,0xe000-0xefff irq 16 at device 2.0 on pci0
>>>>>> ehci0: Intel PCH USB 2.0 controller USB-B mem 0xf7cfa000-0xf7cfa3ff irq 16 at device 26.0 on pci0
>>>>>> em1: Intel(R) PRO/1000 Network Connection 7.2.3 port 0xec00-0xec1f mem 0xf7fe-0xf7ff,0xf7f0-0xf7f7,0xf7fdc000-0xf7fd irq 16 at device 0.0 on pci4
>>>>>> pcib4: ACPI PCI-PCI bridge irq 16 at device 28.5 on pci0
>>>>>>
>>>>>> During a storm vmstat -i shows a rate of about 220.000 interrupts/sec.
>>>>>> MSI interrupt delivery to both 'em0' and 'em1' seems to work correctly during a storm, as I see their counters increase normally in the vmstat -i output.
>>>>>> As only 'em0' and 'em1' seem to be using MSI interrupts, my guess is that the e1000 driver is causing this problem. Could it be that the driver forgets to clear/mask legacy interrupts when attaching the MSI interrupts perhaps?
>>>>>>
>>>>>> Any tips on how to debug and/or fix this?
>>>>>>
>>>>>> The full output of dmesg can be found here :
>>>>>> http://vehosting.nl/pub_diffs/dmesg_plantje2_2011_05_04.txt
>>>>>>
>>>>>> And the
Re: problems with em(4) since update to driver 7.2.2
Hello,

2011/5/4 Arnaud Lacombe lacom...@gmail.com:
> Hi,
>
> On Wed, May 4, 2011 at 3:58 AM, Olivier Smedts oliv...@gid0.org wrote:
>> em0: Using an MSI interrupt
>> em0: Ethernet address: d4:85:64:b2:aa:f5
>> em0: Could not setup receive structures
>> em0: Could not setup receive structures
>>
>> What can we do to help you debug this ?
>>
> At some point in time, in late February, I had the same issue on a 6-interface machine. I tracked this down to the fact that the main loop in em_setup_receive_ring() was not being entered. This resulted in junk being returned, as `error' is not explicitly initialized. At the time, the following patch worked for me. Without it the driver was unable to initialize with an RX/TX ring size of 512. With it, a ring size of 1024 initialized fine.
>
> diff --git a/sys/dev/e1000/if_em.c b/sys/dev/e1000/if_em.c
> index fb6ed67..f02059a 100644
> --- a/sys/dev/e1000/if_em.c
> +++ b/sys/dev/e1000/if_em.c
> @@ -3901,7 +3901,7 @@ em_setup_receive_ring(struct rx_ring *rxr)
>  	struct adapter *adapter = rxr->adapter;
>  	struct em_buffer *rxbuf;
>  	bus_dma_segment_t seg[1];
> -	int i, j, nsegs, error;
> +	int i, j, nsegs, error = 0;

This patch did the trick for me. I'll post what Jack asked for in the following mail.

> I did not dig much more at the time, but I was definitely seeing an odd behavior. Anyhow, I am no longer able to reproduce this with 7.2.3, so I cannot dig into more details.
>
> Btw, I wish you all luck, it took me nearly two full months to convince Jack (and other FreeBSD devs) that there was a bug in the mbuf refresh code.
>
>  - Arnaud

-- 
Olivier Smedts
e-mail: oliv...@gid0.org
www: http://www.gid0.org
ASCII ribbon campaign: against HTML email, vCards and proprietary attachments
There are only 10 kinds of people in the world: those who understand binary, and those who don't.
Re: problems with em(4) since update to driver 7.2.2
Hello,

(sorry for dual posting)

2011/5/4 Jack Vogel jfvo...@gmail.com:
> I have had my validation engineer busy all day, we have tried both a 9 kernel as well as 8.2, using the code from HEAD, and we cannot reproduce this problem.
>
> The data your netstat -m shows suggests to me that what's happening is somehow setup of the receive ring is running more than once maybe??
>
> You asked at one point how this could go into STABLE, well, because not only here at Intel, but at lots of external customers this code has been used and tested thoroughly. I am not calling into question your problem, but until I understand what it is I cannot fix it :)
>
> The thing I am guessing right now is the culprit is the setup code, the reason is that when I ported to the igb driver I found that it did not work on our newer hardware, and so I went back to the older version of setup for igb. Now, even though I have not seen hardware fail with em, maybe there is some.
>
> To help me give me a complete pciconf -lv, and if its a namebrand system tell me that, including all hardware in it.

The computer is an HP Compaq 8100 Elite Convertible Minitower PC.

Here is what I have with the new driver and Arnaud Lacombe's patch.

%uname -a
FreeBSD zozo.afpicl.lan 9.0-CURRENT FreeBSD 9.0-CURRENT #0 r219752:221420: Wed May 4 11:16:37 CEST 2011 r...@zozo.afpicl.lan:/usr/obj/usr/src/sys/CORE amd64

%pciconf -lv
hostb0@pci0:0:0:0: class=0x06 card=0x304b103c chip=0xd1318086 rev=0x11 hdr=0x00
    vendor = 'Intel Corporation'
    class = bridge
    subclass = HOST-PCI
pcib1@pci0:0:3:0: class=0x060400 card=0x304b103c chip=0xd1388086 rev=0x11 hdr=0x01
    vendor = 'Intel Corporation'
    class = bridge
    subclass = PCI-PCI
none0@pci0:0:8:0: class=0x088000 card=0x004b003c chip=0xd1558086 rev=0x11 hdr=0x00
    vendor = 'Intel Corporation'
    class = base peripheral
none1@pci0:0:8:1: class=0x088000 card=0x004b003c chip=0xd1568086 rev=0x11 hdr=0x00
    vendor = 'Intel Corporation'
    class = base peripheral
none2@pci0:0:8:2: class=0x088000 card=0x004b003c chip=0xd1578086 rev=0x11 hdr=0x00
    vendor = 'Intel Corporation'
    class = base peripheral
none3@pci0:0:8:3: class=0x088000 card=0x004b003c chip=0xd1588086 rev=0x11 hdr=0x00
    vendor = 'Intel Corporation'
    class = base peripheral
none4@pci0:0:16:0: class=0x088000 card=0x004b003c chip=0xd1508086 rev=0x11 hdr=0x00
    vendor = 'Intel Corporation'
    class = base peripheral
none5@pci0:0:16:1: class=0x088000 card=0x004b003c chip=0xd1518086 rev=0x11 hdr=0x00
    vendor = 'Intel Corporation'
    class = base peripheral
none6@pci0:0:22:0: class=0x078000 card=0x304b103c chip=0x3b648086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    class = simple comms
none7@pci0:0:22:3: class=0x070002 card=0x304b103c chip=0x3b678086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    class = simple comms
    subclass = UART
em0@pci0:0:25:0: class=0x02 card=0x304b103c chip=0x10ef8086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    class = network
    subclass = ethernet
ehci0@pci0:0:26:0: class=0x0c0320 card=0x304b103c chip=0x3b3c8086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    class = serial bus
    subclass = USB
hdac1@pci0:0:27:0: class=0x040300 card=0x304b103c chip=0x3b568086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    class = multimedia
    subclass = HDA
pcib2@pci0:0:28:0: class=0x060400 card=0x304b103c chip=0x3b428086 rev=0x05 hdr=0x01
    vendor = 'Intel Corporation'
    class = bridge
    subclass = PCI-PCI
pcib3@pci0:0:28:4: class=0x060400 card=0x304b103c chip=0x3b4a8086 rev=0x05 hdr=0x01
    vendor = 'Intel Corporation'
    class = bridge
    subclass = PCI-PCI
pcib4@pci0:0:28:6: class=0x060400 card=0x304b103c chip=0x3b4e8086 rev=0x05 hdr=0x01
    vendor = 'Intel Corporation'
    class = bridge
    subclass = PCI-PCI
ehci1@pci0:0:29:0: class=0x0c0320 card=0x304b103c chip=0x3b348086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    class = serial bus
    subclass = USB
pcib5@pci0:0:30:0: class=0x060401 card=0x304b103c chip=0x244e8086 rev=0xa5 hdr=0x01
    vendor = 'Intel Corporation'
    device = '82801 Family (ICH2/3/4/5/6/7/8/9,63xxESB) Hub Interface to PCI Bridge'
    class = bridge
    subclass = PCI-PCI
isab0@pci0:0:31:0: class=0x060100 card=0x304b103c chip=0x3b0a8086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    class = bridge
    subclass = PCI-ISA
ahci0@pci0:0:31:2: class=0x010601 card=0x304b103c chip=0x3b228086 rev=0x05 hdr=0x00
    vendor = 'Intel Corporation'
    device = 'IBEX AHCI Controller(6Port) (Intel Q57 Express)'
    class = mass storage
    subclass = SATA
vgapci0@pci0:1:0:0: class=0x03 card=0x10021002 chip=0x94981002 rev=0x00 hdr=0x00
Re: Switch from legacy ata(4) to CAM-based ATA
2011/4/20 Alexander Motin m...@freebsd.org:
> Hi.
>
> With the 9.0 release approaching quickly, I believe now is the best time to manage the migration from the legacy ata(4) ATA stack to the new CAM-based one. The new ATA code has been present in the tree for more than a year now, is used by many people and has proved its superior functionality and reliability. The only major issue with it now is the migration process. Sooner or later we have to go through it, but due to major UI and API changes we can't do it after the 9.0 release. So I propose to do it next Sunday (April 24) to have as much time for troubleshooting as possible.
>
> I have prepared the following patch to do it:
> http://people.freebsd.org/~mav/ata_switch.patch
>
> I haven't added geom_raid to the kernel configurations because we have no other GEOM classes there. But tell me if you think I should.
>
> If somebody has any problems with the new ATA stack, please repeat your tests with the latest HEAD code and contact me if the problem is still there. The next three weeks before BSDCan I am going to dedicate to fixing possibly remaining issues.

XENHVM uses its own naming scheme and can name disks as daN or adN, depending on the virtual block device id. The atapci0/ata0/ata1 devices are still present there (such as in Bruce Cran's dmesg), but no disks are attached to them: instead, all of them hang from device/vbd/N. [In non-XENHVM mode they are attached to ataN channels, as usual.]

/*
 * Translate Linux major/minor to an appropriate name and unit
 * number. For HVM guests, this allows us to use the same drive names
 * with blkfront as the emulated drives, easing transition slightly.
 */

xenbusb_front0: Xen Frontend Devices on xenstore0
xenbusb_back0: Xen Backend Devices on xenstore0
xctrl0: Xen Control Device on xenstore0
xbd0: 17000MB Virtual Block Device at device/vbd/768 on xenbusb_front0
xbd0: attaching as ad0
GEOM: ad0s1: geometry does not match label (16h,63s != 255h,63s).
xbd1: 3812MB Virtual Block Device at device/vbd/2048 on xenbusb_front0
xbd1: attaching as da0
xbd2: 114439MB Virtual Block Device at device/vbd/2064 on xenbusb_front0
xbd2: attaching as da1

Probably /sys/dev/xen/blkfront/blkfront.c needs updating with s/ad/ada/g or such. I believe Xen generates sequential numbers starting from zero (or rather, numbers that can be converted to sequential numbers), similar to what ATA_CAM does.

-- 
wbr,
pluknet
Re: Clang error make buildworld
On 05/04/11 16:20, Dimitry Andric wrote:
> On 2011-05-04 15:44, Manfred Antar wrote:
> ...
>> src.conf:
>>
>> WITHOUT_DYNAMICROOT=yes
>> WITH_IDEA=yes
>> .if !defined(CC) || ${CC} == cc
>> CC=clang
>> .endif
>> .if !defined(CXX) || ${CXX} == c++
>> CXX=clang++
>> .endif
>> #Don't die on warnings
>> NO_WERROR=
>> WERROR=
>
> Aha. Please move the clang-related stuff to make.conf instead, e.g. this fragment:
>
> .if !defined(CC) || ${CC} == cc
> CC=clang
> .endif
> .if !defined(CXX) || ${CXX} == c++
> CXX=clang++
> .endif
> #Don't die on warnings
> NO_WERROR=
> WERROR=

On a notebook (DELL Latitude E6510) I tried compiling world with CLANG. So far, so good. It worked. But after rebooting I got a strange misbehaviour of the xdm login window (black/white instead of coloured), and this was only a superficial symptom. The whole system seems to be corrupted. Hitting the tab key acts like typing exit in the console. The gcc 4.2.1 system compiler isn't capable of producing binaries, see the message below. At this very moment the box isn't usable anymore; I can't even compile a world with cc (see the error below, which was generated by trying to compile a kernel, and I'm really confused why cc is used instead of clang).

Well, the boxes I reported errors from prior to this are desktop systems with nVidia (Fermi based) graphics boards using a driver BLOB 270.XX.XX, which is also used by the notebook. The desktop boxes use C2D based Intel chips, the notebook uses a Core-i5 based chip. All systems got compiled with the option

CPUTYPE?=native

I guess the first compilation with CLANG destroyed the base system's compiler; at this moment I'm incapable of switching back. Floating like a dead man in the water.

Any suggestions?

Regards and thanks in advance,
Oliver

---
awk -f /usr/src/sys/tools/usbdevs2h.awk /usr/src/sys/dev/usb/usbdevs -h
awk -f /usr/src/sys/tools/usbdevs2h.awk /usr/src/sys/dev/usb/usbdevs -d
rpcgen -hM /usr/src/sys/kgssapi/gssd.x | grep -v pthread.h > gssd.h
cc1: internal compiler error: Bus error: 10
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
rpcgen -c /usr/src/sys/kgssapi/gssd.x -o gssd_xdr.c
cc1: internal compiler error: Bus error: 10
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://gcc.gnu.org/bugs.html> for instructions.
*** Error code 1

Stop in /usr/obj/usr/src/sys/MUNIN.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
Re: Clang error make buildworld
2011/5/5 O. Hartmann ohart...@zedat.fu-berlin.de: On 05/04/11 16:20, Dimitry Andric wrote: On 2011-05-04 15:44, Manfred Antar wrote: ... src.conf: WITHOUT_DYNAMICROOT=yes WITH_IDEA=yes .if !defined(CC) || ${CC} == cc CC=clang .endif .if !defined(CXX) || ${CXX} == c++ CXX=clang++ .endif #Don't die on warnings NO_WERROR= WERROR= Aha. Please move the clang-related stuff to make.conf instead, e.g. this fragment: .if !defined(CC) || ${CC} == cc CC=clang .endif .if !defined(CXX) || ${CXX} == c++ CXX=clang++ .endif #Don't die on warnings NO_WERROR= WERROR= On a notebook (DELL Latitude E6510) I tried compiling world with CLANG. So far, so good. It worked. But after rebooting I got a strange misbehaviour of the xdm login window (black/white instead of coloured), but this was only some superficial symptome. The whole system seems to be corrupted. Hitting tab key results like hitting exit in the console. The gcc 4.2.1 system compiler isn't capable of producing binaries, see message below. At this very moment, the box isn't usable anymore, I can't even compile a world with cc (see error below, that was generated by trying to compile a kernel and I'm really confused why cc is used instead of clang). Well, the boxes I reported errors from prior to this are desktop systems with nVidia (Fermi based) graphics boards using a driver BLOB 270.XX.XX which is also used by the notebook. The desktop boxes uses C2D based intel chips, the notebook uses a Core-i5 based chip. All systems got compiled with option CPUTYPE?=native Can you try without CPUTYPE native, or with another value ? native is not a supported value in /usr/share/mk/bsd.cpu.mk With gcc I used : CPUTYPE?=core2 CFLAGS=-O2 -pipe -march=native NO_CPU_CFLAGS=yes COPTFLAGS=-O2 -pipe -march=native NO_CPU_COPTFLAGS=yes So that /usr/share/mk/bsd.cpu.mk could set the right variables and I could set my own -march value in CFLAGS for gcc. 
But now for HEAD (which has a newer gcc and clang) I use:

CPUTYPE?=core2
CFLAGS=-O2 -pipe -march=core2
NO_CPU_CFLAGS=yes
COPTFLAGS=-O2 -pipe -march=core2
NO_CPU_COPTFLAGS=yes

because with clang, -march=native often breaks buildworld, while -march=core2 is OK.

First, check whether your buildworld is still broken with a different (or empty!) make.conf.

> I guess the first compilation with CLANG destroyed the base system compiler; at this moment I'm incapable of switching back. Floating like a dead man in the water. Any suggestions?
>
> Regards and thanks in advance,
> Oliver
>
> ---
> awk -f /usr/src/sys/tools/usbdevs2h.awk /usr/src/sys/dev/usb/usbdevs -h
> awk -f /usr/src/sys/tools/usbdevs2h.awk /usr/src/sys/dev/usb/usbdevs -d
> rpcgen -hM /usr/src/sys/kgssapi/gssd.x | grep -v pthread.h > gssd.h
> cc1: internal compiler error: Bus error: 10
> Please submit a full bug report, with preprocessed source if appropriate.
> See <URL:http://gcc.gnu.org/bugs.html> for instructions.
> rpcgen -c /usr/src/sys/kgssapi/gssd.x -o gssd_xdr.c
> cc1: internal compiler error: Bus error: 10
> Please submit a full bug report, with preprocessed source if appropriate.
> See <URL:http://gcc.gnu.org/bugs.html> for instructions.
> *** Error code 1
> Stop in /usr/obj/usr/src/sys/MUNIN.
> *** Error code 1
> Stop in /usr/src.
> *** Error code 1
> Stop in /usr/src.

-- 
Olivier Smedts
e-mail: oliv...@gid0.org
www: http://www.gid0.org
ASCII ribbon campaign: against HTML email & vCards, against proprietary attachments
There are only 10 kinds of people in the world: those who understand binary, and those who don't.
atkbdc broken on current ?
Hi,

I have an issue with an old HP DL380G3 server that I manage through the ILO virtual console. It seems that 9-CURRENT fails to detect atkbdc. When I boot 8.2-RELEASE it works well.

8.2 dmesg shows:
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0

9.0:
atkbdc0: <Keyboard controller (i8042)> failed to probe at port 0x60 on isa0

Is this a known issue? Should I enable some additional output, like KBDIO_DEBUG?

Thanks,
Damjan
Re: Clang error make buildworld
> Because with clang, -march=native often breaks buildworld, while -march=core2 is ok.

Can you be more specific about this claim? On what CPU are you seeing this breakage?

Anyway, can you compile and run this on that machine: http://lev.vlakno.cz/~rdivacky/Host.cpp

It's the LLVM CPU autodetection code; it will print the name of your CPU. I wonder what the difference from core2 is.

Thank you.

roman
Re: Clang error make buildworld
2011/5/5 Roman Divacky rdiva...@freebsd.org:
>> Because with clang, -march=native often breaks buildworld, while -march=core2 is ok.
>
> Can you be more specific about this claim? On what CPU are you seeing this breakage?

On a Core2 Quad Q9450 and a Core i7 860. I use core2 on both because that is the closest value supported by both bsd.cpu.mk and the gcc in HEAD.

I reverted from -march=native to -march=core2 for two reasons. The first is that gcc didn't use the right -mtune when given -march=native (I think it was internally using -mtune=generic). I'll try to be more specific if I can find the tests I was using at that time. The second is that with -march=native my buildworld often failed with clang, and since I switched to -march=core2 I have had no issues. I'll try a buildworld with -march=native and report back.

> Anyway, can you compile and run on that machine this: http://lev.vlakno.cz/~rdivacky/Host.cpp

Compiled with gcc and clang, both output (on one of the two computers I use most):

roman = corei7

> It's the LLVM CPU autodetection code, it will print the name of your CPU. I wonder what the difference from core2 is.
>
> Thank you.
> roman

Cheers

-- 
Olivier Smedts
Re: problems with em(4) since update to driver 7.2.2
Hi,

On Thu, May 5, 2011 at 2:59 AM, Jack Vogel jfvo...@gmail.com wrote:
> OK, but what this does not explain is why I do not see this if it's so easily reproduced. What causes the failure case, any idea?

It is completely random, as it depends on the content of the stack. I spent 3 or 4 hours trying to reproduce it using different approaches on different platforms, with different versions of the code, and failed. Once `error' was explicitly colored, it popped up. That's the beauty of errors related to uninitialized variables.

 - Arnaud

> As I said, given the code was not feasible for igb anyway I would not be unhappy about returning to the old way of doing things.

I am not sure what you mean by the old way of doing things, but I'd guess that the ring only needs to be set up on a few occasions, like initialization and MTU transitions. I'm not sure either how other drivers manage their rings.

> Jack
>
> On Wed, May 4, 2011 at 11:03 PM, Arnaud Lacombe lacom...@gmail.com wrote:
>> Hi,
>>
>> On Thu, May 5, 2011 at 1:20 AM, Arnaud Lacombe lacom...@gmail.com wrote:
>>> Hi,
>>>
>>> On Wed, May 4, 2011 at 5:38 PM, Jack Vogel jfvo...@gmail.com wrote:
>>>> I have had my validation engineer busy all day, we have tried both a 9 kernel as well as 8.2, using the code from HEAD, and we cannot reproduce this problem.
>>>
>>> Actually, it can be trivially reproduced by tainting `error'. As it is uninitialized in HEAD, its value can be _anything_, so let's mark it as explicitly invalid.
>>>
>>> diff -u ./if_em.c /data/src/freebsd/em-7.2.2/src/if_em.c
>>> --- ./if_em.c 2011-02-18 01:18:23.0 -0500
>>> +++ /data/src/freebsd/em-7.2.2/src/if_em.c 2011-05-05 01:12:01.0 -0400
>>> @@ -3912,7 +3912,7 @@
>>>   struct adapter *adapter = rxr->adapter;
>>>   struct em_buffer *rxbuf;
>>>   bus_dma_segment_t seg[1];
>>> - int i, j, nsegs, error;
>>> + int i, j, nsegs, error = -1;
>>>
>>> The error pointed out in this thread pops up in the next boot.
Re: Clang error make buildworld
On 05/05/11 15:46, Olivier Smedts wrote: 2011/5/5 O. Hartmannohart...@zedat.fu-berlin.de: On 05/04/11 16:20, Dimitry Andric wrote: On 2011-05-04 15:44, Manfred Antar wrote: ... src.conf: WITHOUT_DYNAMICROOT=yes WITH_IDEA=yes .if !defined(CC) || ${CC} == cc CC=clang .endif .if !defined(CXX) || ${CXX} == c++ CXX=clang++ .endif #Don't die on warnings NO_WERROR= WERROR= Aha. Please move the clang-related stuff to make.conf instead, e.g. this fragment: .if !defined(CC) || ${CC} == cc CC=clang .endif .if !defined(CXX) || ${CXX} == c++ CXX=clang++ .endif #Don't die on warnings NO_WERROR= WERROR= On a notebook (DELL Latitude E6510) I tried compiling world with CLANG. So far, so good. It worked. But after rebooting I got a strange misbehaviour of the xdm login window (black/white instead of coloured), but this was only some superficial symptome. The whole system seems to be corrupted. Hitting tab key results like hitting exit in the console. The gcc 4.2.1 system compiler isn't capable of producing binaries, see message below. At this very moment, the box isn't usable anymore, I can't even compile a world with cc (see error below, that was generated by trying to compile a kernel and I'm really confused why cc is used instead of clang). Well, the boxes I reported errors from prior to this are desktop systems with nVidia (Fermi based) graphics boards using a driver BLOB 270.XX.XX which is also used by the notebook. The desktop boxes uses C2D based intel chips, the notebook uses a Core-i5 based chip. All systems got compiled with option CPUTYPE?=native Can you try without CPUTYPE native, or with another value ? native is not a supported value in /usr/share/mk/bsd.cpu.mk With gcc I used : CPUTYPE?=core2 CFLAGS=-O2 -pipe -march=native NO_CPU_CFLAGS=yes COPTFLAGS=-O2 -pipe -march=native NO_CPU_COPTFLAGS=yes So that /usr/share/mk/bsd.cpu.mk could set the right variables and I could set my own -march value in CFLAGS for gcc. 
> But now for HEAD (which has a newer gcc and clang) I use:
>
> CPUTYPE?=core2
> CFLAGS=-O2 -pipe -march=core2
> NO_CPU_CFLAGS=yes
> COPTFLAGS=-O2 -pipe -march=core2
> NO_CPU_COPTFLAGS=yes
>
> Because with clang, -march=native often breaks buildworld, while -march=core2 is ok.
>
> First, try to see if your buildworld is still broken with a different (or empty!) make.conf.

Well, I would like to do as suggested, but I cannot even build a system/kernel anymore. Using clang, the build process dies when it comes to rpcgen, as shown below; it uses cc (fixed), and cc doesn't work properly anymore.

>> I guess the first compilation with CLANG destroyed the base system compiler; at this moment I'm incapable of switching back. Floating like a dead man in the water. Any suggestions?
>>
>> Regards and thanks in advance,
>> Oliver
>>
>> ---
>> awk -f /usr/src/sys/tools/usbdevs2h.awk /usr/src/sys/dev/usb/usbdevs -h
>> awk -f /usr/src/sys/tools/usbdevs2h.awk /usr/src/sys/dev/usb/usbdevs -d
>> rpcgen -hM /usr/src/sys/kgssapi/gssd.x | grep -v pthread.h > gssd.h
>> cc1: internal compiler error: Bus error: 10
>> Please submit a full bug report, with preprocessed source if appropriate.
>> See <URL:http://gcc.gnu.org/bugs.html> for instructions.
>> rpcgen -c /usr/src/sys/kgssapi/gssd.x -o gssd_xdr.c
>> cc1: internal compiler error: Bus error: 10
>> Please submit a full bug report, with preprocessed source if appropriate.
>> See <URL:http://gcc.gnu.org/bugs.html> for instructions.
>> *** Error code 1
>> Stop in /usr/obj/usr/src/sys/MUNIN.
>> *** Error code 1
>> Stop in /usr/src.
>> *** Error code 1
>> Stop in /usr/src.
Re: problems with em(4) since update to driver 7.2.2
Hi,

On Wed, May 4, 2011 at 3:00 AM, Alastair Hogge a...@fastmail.fm wrote:
> [.]
> I also tried 2x, 4x 25600 for max mbuf clusters via kern.ipc.nmbclusters. This didn't help.

For the record, I did the math yesterday and checked the code. By default, a machine with 6 82574L-backed em(4) interfaces, of which only 3 are used (i.e. brought up), initializes and works just fine with as few as 3076 mbuf clusters (1024*3 + 2). It has been transferring about 28k pps or 20Mbps of traffic (ICMP ping flood) for the last 10h. Here is the `netstat -m' output:

# netstat -m
2879/916/3795 mbufs in use (current/cache/total)
2877/199/3076/3076 mbuf clusters in use (current/cache/total/max)
2877/199 mbuf+clusters out of packet secondary zone in use (current/cache)
0/2/2/1537 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/768 9k jumbo clusters in use (current/cache/total/max)
0/0/0/384 16k jumbo clusters in use (current/cache/total/max)
6473K/635K/7108K bytes allocated to network (current/cache/total)
0/540580029/268859859 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/5/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

And yes, allocation denials have skyrocketed, but besides that the driver is stable. In this case, the uninitialized-variable issue did not occur when the system booted.

The complete machine should be able to initialize properly with 6146 clusters.

 - Arnaud
Re: Clang error make buildworld
2011/5/5 Roman Divacky rdiva...@freebsd.org:
>> Because with clang, -march=native often breaks buildworld, while -march=core2 is ok.
>
> Can you be more specific about this claim? On what CPU are you seeing this breakage?

Ok, with latest HEAD...

%echo | gcc -march=native -E -v -x c -### -
Using built-in specs.
Target: amd64-undermydesk-freebsd
Configured with: FreeBSD/amd64 system compiler
Thread model: posix
gcc version 4.2.2 20070831 prerelease [FreeBSD]
 /usr/libexec/cc1 -E -quiet -v -D_LONGLONG - -march=core2 -mtune=generic

With -march=native, gcc adds -mtune=generic, while the man page says that -march=xxx sets -mtune=xxx.

%echo | gcc -march=core2 -E -v -x c -### -
Using built-in specs.
Target: amd64-undermydesk-freebsd
Configured with: FreeBSD/amd64 system compiler
Thread model: posix
gcc version 4.2.2 20070831 prerelease [FreeBSD]
 /usr/libexec/cc1 -E -quiet -v -D_LONGLONG - -march=core2

With -march=core2, gcc doesn't add -mtune=generic, so it should use -mtune=core2 as suggested by its man page. That's why I use -march=core2 for gcc.

Now for clang...
With -march=core2, my buildworld compiles just fine on my Core2 Quad, whereas with -march=native (without -jX) it fails on:

===> libexec/atrun (all)
clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/libexec/atrun/atrun.c
clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/libexec/atrun/gloadavg.c
clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil
clang: warning: argument unused during compilation: '-std=gnu99'
/usr/obj/usr/src/tmp/usr/lib/crt1.o: In function `_start':
/usr/src/lib/csu/amd64/crt1.c:(.text+0x5d): undefined reference to `atexit'
/usr/src/lib/csu/amd64/crt1.c:(.text+0x64): undefined reference to `_init_tls'
/usr/src/lib/csu/amd64/crt1.c:(.text+0x6e): undefined reference to `atexit'
/usr/src/lib/csu/amd64/crt1.c:(.text+0x88): undefined reference to `exit'
atrun.o: In function `perr':
/usr/src/libexec/atrun/atrun.c:(.text+0x65): undefined reference to `strlen'
/usr/src/libexec/atrun/atrun.c:(.text+0xac): undefined reference to `vwarn'
/usr/src/libexec/atrun/atrun.c:(.text+0xb6): undefined reference to `exit'
/usr/src/libexec/atrun/atrun.c:(.text+0xd5): undefined reference to `snprintf'
/usr/src/libexec/atrun/atrun.c:(.text+0xe6): undefined reference to `vsyslog'
/usr/src/libexec/atrun/atrun.c:(.text+0xf0): undefined reference to `exit'
atrun.o: In function `perrx':
/usr/src/libexec/atrun/atrun.c:(.text+0x19f): undefined reference to `vwarnx'
/usr/src/libexec/atrun/atrun.c:(.text+0x1a9): undefined reference to `exit'
/usr/src/libexec/atrun/atrun.c:(.text+0x1be): undefined reference to `vsyslog'
/usr/src/libexec/atrun/atrun.c:(.text+0x1c8): undefined reference to `exit'
atrun.o: In function `main':
/usr/src/libexec/atrun/atrun.c:(.text+0x224): undefined reference to `geteuid'
/usr/src/libexec/atrun/atrun.c:(.text+0x239): undefined reference to `getegid'
/usr/src/libexec/atrun/atrun.c:(.text+0x24a): undefined reference to `setegid'
/usr/src/libexec/atrun/atrun.c:(.text+0x255): undefined reference to `seteuid'
/usr/src/libexec/atrun/atrun.c:(.text+0x269): undefined reference to `openlog'
/usr/src/libexec/atrun/atrun.c:(.text+0x26f): undefined reference to `opterr'
/usr/src/libexec/atrun/atrun.c:(.text+0x292): undefined reference to `getopt'
/usr/src/libexec/atrun/atrun.c:(.text+0x2ac): undefined reference to `optarg'
/usr/src/libexec/atrun/atrun.c:(.text+0x2bb): undefined reference to `sscanf'
/usr/src/libexec/atrun/atrun.c:(.text+0x2e7): undefined reference to `__stderrp'
/usr/src/libexec/atrun/atrun.c:(.text+0x2fb): undefined reference to `fwrite'
/usr/src/libexec/atrun/atrun.c:(.text+0x305): undefined reference to `exit'
Re: Clang error make buildworld
> clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/libexec/atrun/gloadavg.c
> clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil
> clang: warning: argument unused during compilation: '-std=gnu99'
> /usr/obj/usr/src/tmp/usr/lib/crt1.o: In function `_start':
> /usr/src/lib/csu/amd64/crt1.c:(.text+0x5d): undefined reference to `atexit'

Can you invoke this very same command (i.e. the link step) with -### and show me the output? Does it work when you try to link the same .o files without specifying -march=native?
Re: Clang error make buildworld
2011/5/5 Roman Divacky rdiva...@freebsd.org:
>> clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -c /usr/src/libexec/atrun/gloadavg.c
>> clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil
>> clang: warning: argument unused during compilation: '-std=gnu99'
>> /usr/obj/usr/src/tmp/usr/lib/crt1.o: In function `_start':
>> /usr/src/lib/csu/amd64/crt1.c:(.text+0x5d): undefined reference to `atexit'
>
> Can you invoke this very same command (ie. linking) with -### and show me? Does it work when you try to link the same .o files without specifying -march=native ?

I'm going to try. In the meantime, I did other tests on this machine, which clang detects as -march=corei7.

Compiling this with the system's clang (which was compiled with -march=core2) and -march=core2 is OK.
Compiling this with the system's clang (which was compiled with -march=core2) and -march=native is OK.
Compiling this with the bootstrap clang (which was compiled with -march=native) and -march=native FAILS.

The problem seems to be inside the clang that was itself compiled with -march=native. Next, I'm going to try with a bootstrap clang compiled with -march=corei7.

-- 
Olivier Smedts
Re: problems with em(4) since update to driver 7.2.2
On Thu, May 5, 2011 at 7:21 AM, Arnaud Lacombe lacom...@gmail.com wrote:
> Hi,
>
> On Thu, May 5, 2011 at 2:59 AM, Jack Vogel jfvo...@gmail.com wrote:
>> OK, but what this does not explain is why I do not see this if its so easily reproduced, what causes the failure case, any idea?
>
> It is completely random as it depends on the content of the stack. I spent 3 or 4 hours trying to reproduce it using different approach on different platform, with different version of the code and failed. And once `error' was explicitly colored, it popped up. That's the beauty of error related with uninitialized variable.
>
>  - Arnaud
>
>> As I said, given the code was not feasible for igb anyway I would not be unhappy about returning to the old way of doing things.
>
> I am not sure what you mean by old way of doing thing, but I'd guess that the ring only need to be setup on a few occasion, like initialization and MTU transition. I'm not sure either how other driver manage their ring.

The old way is how the code works in igb now: on each entry to this setup routine it completely wipes the descriptor memory, then releases all mbufs, and initializes from scratch. It is only because of the lazy reinit, in which only the range from next_to_refresh to next_to_check is reset, that this problem can happen.

For igb, the reason this will not work is that it requires you to set E1000_RDH(i) to next_to_check, and the hardware in fact prohibits that write: it is ALWAYS 0 after a reset. The reason is that the hardware wishes to manage the head index itself rather than leave it to software.

Anyway, I see the problematic code path: it only occurs when you skip the while loop altogether. I'm surprised the compiler did not complain about this; it's usually so anal.

Jack
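
The code path discussed above (the lazy reinit loop being skipped entirely, so that `error' is returned without ever being assigned) can be illustrated with a small userspace sketch. This is not the driver code: `ring_setup_sketch', `fill_slot', `RING_SIZE' and the `fail_at' parameter are hypothetical names invented for the illustration, and initializing `error' to 0 is only one possible fix, assumed here; the `error = -1' patch earlier in the thread was meant to make the bug visible, not to fix it.

```c
#include <assert.h>
#include <errno.h>

#define RING_SIZE 8	/* hypothetical ring size, for illustration only */

/* Hypothetical stand-in for refreshing one receive slot. */
static int
fill_slot(int slot, int fail_at)
{
	/* Pretend the allocation fails for exactly one slot. */
	return (slot == fail_at) ? ENOBUFS : 0;
}

/*
 * Sketch of a lazy ring setup: only the slots from next_to_refresh up to
 * (but not including) next_to_check are re-initialized.
 */
int
ring_setup_sketch(int next_to_refresh, int next_to_check, int fail_at)
{
	/*
	 * Without this initializer, the function would return an
	 * indeterminate value whenever the loop below is never entered,
	 * i.e. when next_to_refresh == next_to_check.
	 */
	int error = 0;
	int j;

	assert(next_to_refresh >= 0 && next_to_refresh < RING_SIZE);
	assert(next_to_check >= 0 && next_to_check < RING_SIZE);

	j = next_to_refresh;
	while (j != next_to_check) {
		error = fill_slot(j, fail_at);
		if (error != 0)
			return (error);
		j = (j + 1) % RING_SIZE;	/* wrap around the ring */
	}
	return (error);
}
```

With the initializer in place, calling `ring_setup_sketch(3, 3, -1)' (an empty range) returns 0 instead of whatever happened to be on the stack, which is the case that made the original failure look random.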
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On May 4, 2011, at 2:07 AM, Kostik Belousov wrote:

> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote:
>>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
>>>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
>>>>> From: Garrett Cooper yaneg...@gmail.com
>>>>> To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com
>>>>> Cc: FreeBSD Current freebsd-current@freebsd.org
>>>>>
>>>>> Hi Jeff and Dr. McKusick,
>>>>> Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time):
>>>>>
>>>>> ...
>>>>>
>>>>> Let me know what other commands you would like for me to run in kgdb.
>>>>> Thanks,
>>>>> -Garrett
>>>>
>>>> You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.
>>>
>>> I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part.
>>>
>>>> Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>>>
>>>> Index: ffs_softdep.c
>>>> ===
>>>> --- ffs_softdep.c (revision 221385)
>>>> +++ ffs_softdep.c (working copy)
>>>> @@ -11380,7 +11380,8 @@
>>>>  continue;
>>>>  }
>>>>  MNT_IUNLOCK(mp);
>>>> - if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>>>> + if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>>>> + curthread)) {
>>>>  MNT_ILOCK(mp);
>>>>  continue;
>>>>  }
>>>>
>>>> If you are running an 8.X system, hopefully you will be able to apply it.
>>>
>>> I've applied it, rebuilt and installed the kernel, and trying to repro the case again. Will let you know how things go!
>>
>> Happened again with the change. It's really easy to repro:
>>
>> 1. Get a filesystem with UFS+SU
>> 2. Execute something that does a large number of small writes to a partition.
>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>>
>> The kernel will panic with the issue I discussed above.
>> Thanks!
>
> Jeff's change is required to avoid LORs, but it is not sufficient to prevent recursion. We must skip the vnode supplied as a parameter to softdep_request_cleanup(). Theoretically, other vnodes might also be locked by curthread, thus I think the change below is needed. Try this.
>
> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
> index a6d4441..25fa5d6 100644
> --- a/sys/ufs/ffs/ffs_softdep.c
> +++ b/sys/ufs/ffs/ffs_softdep.c
> @@ -11380,7 +11380,9 @@ retry:
>  continue;
>  }
>  MNT_IUNLOCK(mp);
> - if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
> + if (VOP_ISLOCKED(lvp) ||
> + vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
> + curthread)) {
>  MNT_ILOCK(mp);
>  continue;
>  }

Ran into the same panic after I applied the patch above, using the repro steps I described before. One thing I noticed is that the issue isn't as easy to reproduce unless you run the dd in parallel with the make operation.

Thanks,
-Garrett
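
The intent of the patches quoted above (never sleep on, or recurse into, a lock that is already held while scanning; just skip that item) can be sketched in plain C. This is a single-threaded toy model, not kernel code: `toy_lock', `toy_trylock', `toy_unlock' and `scan_skip_locked' are hypothetical names, and nothing here reproduces the semantics of vget(), VOP_ISLOCKED() or lockmgr. Note also that the thread reports the panic still occurring with the patch applied, so the sketch shows only the idea behind the change, not a confirmed fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a non-recursive exclusive lock. */
struct toy_lock {
	int held;	/* 1 if currently locked */
	int owner;	/* id of the holder, meaningful only when held */
};

/*
 * Non-blocking acquire: returns 0 on success, or EBUSY if the lock is held
 * by anyone, including the caller (the lock is not recursive).
 */
static int
toy_trylock(struct toy_lock *lk, int self)
{
	if (lk->held)
		return (EBUSY);
	lk->held = 1;
	lk->owner = self;
	return (0);
}

static void
toy_unlock(struct toy_lock *lk)
{
	lk->held = 0;
}

/*
 * Visit every lock in the array, skipping any that is already held.
 * Returns the number of items that were actually processed.
 */
size_t
scan_skip_locked(struct toy_lock *locks, size_t n, int self)
{
	size_t done = 0;

	assert(locks != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (toy_trylock(&locks[i], self) != 0)
			continue;	/* busy: skip instead of sleeping or recursing */
		done++;			/* per-item cleanup work would go here */
		toy_unlock(&locks[i]);
	}
	return (done);
}
```

In this model, an item the caller already holds is simply left alone and stays locked after the scan, which is the behaviour the skip-if-locked check is aiming for.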
Re: Clang error make buildworld
2011/5/5 Roman Divacky rdiva...@freebsd.org:
> Can you invoke this very same command (ie. linking) with -### and show me? Does it work when you try to link the same .o files without specifying -march=native ?

My system was previously compiled with clang and -march=core2. It's a corei7.

With -march=native in make.conf, after the failed buildworld I cd into /usr/obj/usr/src/libexec/atrun/ and run:

# clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil
clang: warning: argument unused during compilation: '-std=gnu99'

OK

# /usr/obj/usr/src/tmp/usr/bin/clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil

FAIL (clang: error: linker command failed with exit code 1 (use -v to see invocation))

# /usr/obj/usr/src/tmp/usr/bin/clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil -###
FreeBSD clang version 3.0 (trunk 130700) 20110502
Target: x86_64-undermydesk-freebsd9.0
Thread model: posix
clang: warning: argument unused during compilation: '-std=gnu99'
 /usr/obj/usr/src/tmp/usr/bin/ld --eh-frame-hdr -dynamic-linker /libexec/ld-elf.so.1 -o atrun /usr/obj/usr/src/tmp/usr/lib/crt1.o /usr/obj/usr/src/tmp/usr/lib/crti.o /usr/obj/usr/src/tmp/usr/lib/crtbegin.o -L/usr/obj/usr/src/tmp/usr/lib atrun.o gloadavg.o -lpam -lutil -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/obj/usr/src/tmp/usr/lib/crtend.o /usr/obj/usr/src/tmp/usr/lib/crtn.o

Using the bootstrap clang (compiled with -march=native) and trying to compile atrun, this time with -march=core2:

# /usr/obj/usr/src/tmp/usr/bin/clang -O2 -pipe -march=core2 -fomit-frame-pointer -DATJOB_DIR=\"/var/at/jobs/\" -DLFILE=\"/var/at/jobs/.lockfile\" -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\"/var/at/spool\" -DVERSION=\"2.9\" -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\"/var/at/\" -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil

FAIL (same error)

When trying to compile the Host.cpp you provided (which compiled fine with my system's clang and gcc), still with the bootstrap clang:

# /usr/obj/usr/src/tmp/usr/bin/clang -v Host.cpp
FreeBSD clang version 3.0 (trunk 130700) 20110502
Target: x86_64-undermydesk-freebsd9.0
Thread model: posix
 /usr/obj/usr/src/tmp/usr/bin/clang -cc1 -triple x86_64-undermydesk-freebsd9.0 -emit-obj -mrelax-all -disable-free -main-file-name Host.cpp -mrelocation-model static -mdisable-fp-elim -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -momit-leaf-frame-pointer -v -resource-dir /usr/obj/usr/src/tmp/usr/bin/../lib/clang/3.0 -fdeprecated-macro -ferror-limit 19 -fmessage-length 236 -fcxx-exceptions -fexceptions -fgnu-runtime -fdiagnostics-show-option -fcolor-diagnostics -o /tmp/cc-6ijoGC.o -x c++ Host.cpp
clang -cc1 version 3.0 based upon llvm 3.0svn hosted on x86_64-undermydesk-freebsd9.0
ignoring nonexistent directory /usr/obj/usr/src/tmp/usr/include/c++/4.2/backward/backward
ignoring nonexistent directory /usr/obj/usr/src/tmp/usr/bin/../lib/clang/3.0/include
ignoring duplicate directory /usr/obj/usr/src/tmp/usr/include/c++/4.2
ignoring duplicate directory /usr/obj/usr/src/tmp/usr/include/c++/4.2/backward
ignoring duplicate directory /usr/obj/usr/src/tmp/usr/include/c++/4.2/backward
#include "..." search starts here:
#include <...> search starts here:
 /usr/obj/usr/src/tmp/usr/include/c++/4.2
 /usr/obj/usr/src/tmp/usr/include/c++/4.2/backward
 /usr/obj/usr/src/tmp/usr/include/clang/3.0
 /usr/obj/usr/src/tmp/usr/include
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On Thu, May 05, 2011 at 10:23:47AM -0700, Garrett Cooper wrote:
> On May 4, 2011, at 2:07 AM, Kostik Belousov wrote:
>> On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote:
>>> On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote:
>>>> On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote:
>>>>>> Date: Tue, 3 May 2011 22:40:26 -0700
>>>>>> Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
>>>>>> From: Garrett Cooper yaneg...@gmail.com
>>>>>> To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com
>>>>>> Cc: FreeBSD Current freebsd-current@freebsd.org
>>>>>>
>>>>>> Hi Jeff and Dr. McKusick,
>>>>>> Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time):
>>>>>> ...
>>>>>> Let me know what other commands you would like for me to run in kgdb.
>>>>>> Thanks,
>>>>>> -Garrett
>>>>>
>>>>> You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that.
>>>>
>>>> I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part.
>>>>
>>>>> Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it:
>>>>>
>>>>> Index: ffs_softdep.c
>>>>> ===
>>>>> --- ffs_softdep.c (revision 221385)
>>>>> +++ ffs_softdep.c (working copy)
>>>>> @@ -11380,7 +11380,8 @@
>>>>>  continue;
>>>>>  }
>>>>>  MNT_IUNLOCK(mp);
>>>>> - if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>>>>> + if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK,
>>>>> +     curthread)) {
>>>>>  MNT_ILOCK(mp);
>>>>>  continue;
>>>>>  }
>>>>>
>>>>> If you are running an 8.X system, hopefully you will be able to apply it.
>>>>
>>>> I've applied it, rebuilt and installed the kernel, and trying to repro the case again. Will let you know how things go!
>>>
>>> Happened again with the change. It's really easy to repro:
>>>
>>> 1. Get a filesystem with UFS+SU
>>> 2. Execute something that does a large number of small writes to a partition.
>>> 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition
>>>
>>> The kernel will panic with the issue I discussed above.
>>> Thanks!
>>
>> Jeff's change is required to avoid LORs, but it is not sufficient to prevent recursion. We must skip the vnode supplied as a parameter to softdep_request_cleanup(). Theoretically, other vnodes might be also locked by curthread, thus I think the change below is needed. Try this.
>>
>> diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c
>> index a6d4441..25fa5d6 100644
>> --- a/sys/ufs/ffs/ffs_softdep.c
>> +++ b/sys/ufs/ffs/ffs_softdep.c
>> @@ -11380,7 +11380,9 @@ retry:
>>  continue;
>>  }
>>  MNT_IUNLOCK(mp);
>> - if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) {
>> + if (VOP_ISLOCKED(lvp) ||
>> +     vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT,
>> +     curthread)) {
>>  MNT_ILOCK(mp);
>>  continue;
>>  }
>
> Ran into the same panic after I applied the patch above with the repro steps I described before. One thing that I noticed is that the issue isn't as easy to reproduce unless you add the dd in parallel with the make operation.

Well, I misread your original report. Also, there is another issue that is easily reproducible in a similar situation. The latest patch is below.

diff --git a/sys/sys/mount.h b/sys/sys/mount.h
index 231e3d6..f064053 100644
--- a/sys/sys/mount.h
+++ b/sys/sys/mount.h
@@ -366,6 +366,8 @@ void __mnt_vnode_markerfree(struct vnode **mvp, struct mount *mp);
 #define MNT_LAZY 3 /* push data not written by filesystem syncer */
 #define MNT_SUSPEND 4 /* Suspend file system after sync */
 
+#define MNT_WAIT_ADV 0x1000 /* MNT_WAIT prevent deadlock */
+
 /*
  * Generic file handle
  */
diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c
index e60514d..87837cc 100644
--- a/sys/ufs/ffs/ffs_alloc.c
+++ b/sys/ufs/ffs/ffs_alloc.c
@@ -420,13 +420,13 @@ nospace:
  */
  if (reclaimed == 0) {
  reclaimed = 1;
- softdep_request_cleanup(fs, vp, cred, FLUSH_BLOCKS_WAIT);
- UFS_UNLOCK(ump);
  if (bp) {
+ UFS_UNLOCK(ump);
  brelse(bp);
  bp = NULL;
+ UFS_LOCK(ump);
  }
-
Re: Clang error make buildworld
> # /usr/obj/usr/src/tmp/usr/bin/clang -O2 -pipe -march=native -fomit-frame-pointer -DATJOB_DIR=\/var/at/jobs/\ -DLFILE=\/var/at/jobs/.lockfile\ -DLOADAVG_MX=1.5 -DATSPOOL_DIR=\/var/at/spool\ -DVERSION=\2.9\ -DDAEMON_UID=1 -DDAEMON_GID=1 -DDEFAULT_BATCH_QUEUE=\'E\' -DDEFAULT_AT_QUEUE=\'c\' -DPERM_PATH=\/var/at/\ -I/usr/src/libexec/atrun/../../usr.bin/at -I/usr/src/libexec/atrun -DLOGIN_CAP -DPAM -std=gnu99 -fstack-protector -Wsystem-headers -Wall -Wno-format-y2k -Wno-uninitialized -Wno-pointer-sign -o atrun atrun.o gloadavg.o -lpam -lutil
> FAIL (clang: error: linker command failed with exit code 1 (use -v to see invocation))

Can you run this in gdb and show me backtrace? Also, what version is your binutils?
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: atkbdc broken on current ?
On Thursday, May 05, 2011 9:21:04 am Damjan Marion wrote:
> Hi,
>
> I have an issue with an old HP DL380G3 server when I use the ILO virtual console to manage it. It seems that 9-CURRENT fails to detect atkbdc. When I boot 8.2-RELEASE it works well.
>
> 8.2 dmesg shows:
> atkbdc0: Keyboard controller (i8042) port 0x60,0x64 irq 1 on acpi0
>
> 9.0:
> atkbdc0: Keyboard controller (i8042) failed to probe at port 0x60 on isa0
>
> Is this a known issue? Should I enable some additional outputs, like KBDIO_DEBUG?

I suspect this is a resource issue stemming from changes I made to the acpi(4) bus driver quite a while ago to make it use rman_reserve_resource(). Can you capture a full verbose dmesg from 9 along with devinfo -rv and devinfo -ur output from 9?

-- 
John Baldwin
Re: problems with em(4) since update to driver 7.2.2
2011/5/5 Jack Vogel jfvo...@gmail.com:
> Anyway, I see the problematic code path, it's only when you skip the while loop altogether. I'm surprised the compiler did not complain about this, it's usually so anal.

Could it be related to the compiler (clang) or some optimization flags?

-- 
Olivier Smedts                                                 _
                                        ASCII ribbon campaign ( )
e-mail: oliv...@gid0.org        - against HTML email & vCards  X
www: http://www.gid0.org    - against proprietary attachments / \

  "There are only 10 kinds of people in the world:
  those who understand binary,
  and those who don't."
Re: problems with em(4) since update to driver 7.2.2
Not sure, I wondered if those seeing this had some special sequence of actions they took for granted that is different than what we do in house... In any case, the init really is ultimately a correctness thing, so let's just call it good :)

Jack

On Thu, May 5, 2011 at 11:16 AM, Olivier Smedts oliv...@gid0.org wrote:
> 2011/5/5 Jack Vogel jfvo...@gmail.com:
>> Anyway, I see the problematic code path, its only when you skip the while loop altogether. I'm surprised the compiler did not complain about this, its usually so anal.
>
> Could it be related to the compiler (clang) or some optimization flags ?
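The code path discussed in this thread can be illustrated with a small userland sketch. This is not the em(4) source: `setup_ring`, `refill_slot` and the ring indices are hypothetical stand-ins. It only shows why a status variable that is assigned solely inside a loop needs a defined initial value when the loop can run zero times.

```c
/* Hypothetical per-slot refill; returns 0 on success, nonzero on failure. */
int
refill_slot(int slot)
{
	return (slot < 0) ? -1 : 0;
}

/*
 * Walk the ring from 'start' up to (but not including) 'stop'.
 * 'error' is only assigned inside the loop, so without the explicit
 * initialization below the function would return an indeterminate
 * value whenever start == stop and the body never runs.
 */
int
setup_ring(int start, int stop, int ring_size)
{
	int error = 0;	/* defined result for the zero-iteration case */
	int j = start;

	while (j != stop) {
		error = refill_slot(j);
		if (error != 0)
			break;
		j = (j + 1) % ring_size;
	}
	return (error);
}
```

With the initializer in place, `setup_ring(5, 5, 8)` skips the loop and returns 0. Whether a compiler warns about the uninitialized variant depends on its version, the warning flags and the optimization level, so the absence of a warning does not prove the path is safe.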
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On Thu, May 5, 2011 at 10:36 AM, Kostik Belousov kostik...@gmail.com wrote: On Thu, May 05, 2011 at 10:23:47AM -0700, Garrett Cooper wrote: On May 4, 2011, at 2:07 AM, Kostik Belousov wrote: On Tue, May 03, 2011 at 11:58:49PM -0700, Garrett Cooper wrote: On Tue, May 3, 2011 at 11:42 PM, Garrett Cooper yaneg...@gmail.com wrote: On Tue, May 3, 2011 at 10:59 PM, Kirk McKusick mckus...@mckusick.com wrote: Date: Tue, 3 May 2011 22:40:26 -0700 Subject: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full From: Garrett Cooper yaneg...@gmail.com To: Jeff Roberson j...@freebsd.org, Marshall Kirk McKusick mckus...@mckusick.com Cc: FreeBSD Current freebsd-current@freebsd.org Hi Jeff and Dr. McKusick, Ran into this panic when /usr ran out of space doing a make universe on amd64/r221219 (it took ~15 minutes for the panic to occur after the filesystem ran out of space -- wasn't quite sure what it was doing at the time): ... Let me know what other commands you would like for me to run in kgdb. Thanks, -Garrett You did not indicate whether you are running an 8.X system or a 9-current system. It would be helpful to know that. I've actually been running CURRENT for a few years now, but you're right -- I didn't mention that part. Jeff thinks that there may be a potential race in the locking code for softdep_request_cleanup. If so, this patch for 9-current should fix it: Index: ffs_softdep.c === --- ffs_softdep.c (revision 221385) +++ ffs_softdep.c (working copy) @@ -11380,7 +11380,8 @@ continue; } MNT_IUNLOCK(mp); - if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) { + if (vget(lvp, LK_EXCLUSIVE | LK_NOWAIT | LK_INTERLOCK, + curthread)) { MNT_ILOCK(mp); continue; } If you are running an 8.X system, hopefully you will be able to apply it. I've applied it, rebuilt and installed the kernel, and trying to repro the case again. Will let you know how things go! Happened again with the change. It's really easy to repro: 1. 
Get a filesystem with UFS+SU 2. Execute something that does a large number of small writes to a partition. 3. 'dd if=/dev/zero of=FOO bs=10m' on the same partition The kernel will panic with the issue I discussed above. Thanks! Jeff' change is required to avoid LORs, but it is not sufficient to prevent recursion. We must skip the vnode supplied as a parameter to softdep_request_cleanup(). Theoretically, other vnodes might be also locked by curthread, thus I think the change below is needed. Try this. diff --git a/sys/ufs/ffs/ffs_softdep.c b/sys/ufs/ffs/ffs_softdep.c index a6d4441..25fa5d6 100644 --- a/sys/ufs/ffs/ffs_softdep.c +++ b/sys/ufs/ffs/ffs_softdep.c @@ -11380,7 +11380,9 @@ retry: continue; } MNT_IUNLOCK(mp); - if (vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK, curthread)) { + if (VOP_ISLOCKED(lvp) || + vget(lvp, LK_EXCLUSIVE | LK_INTERLOCK | LK_NOWAIT, + curthread)) { MNT_ILOCK(mp); continue; } Ran into the same panic after I applied the patch above with the repro steps I described before. One thing that I noticed is that the issue isn't as easy to reproduce unless you add the dd in parallel with the make operation. Well, I misread your original report. Also, there is another issue that is easily reproducable in similar situation. The latest patch is below. 
diff --git a/sys/sys/mount.h b/sys/sys/mount.h index 231e3d6..f064053 100644 --- a/sys/sys/mount.h +++ b/sys/sys/mount.h @@ -366,6 +366,8 @@ void __mnt_vnode_markerfree(struct vnode **mvp, struct mount *mp); #define MNT_LAZY 3 /* push data not written by filesystem syncer */ #define MNT_SUSPEND 4 /* Suspend file system after sync */ +#define MNT_WAIT_ADV 0x1000 /* MNT_WAIT prevent deadlock */ + /* * Generic file handle */ diff --git a/sys/ufs/ffs/ffs_alloc.c b/sys/ufs/ffs/ffs_alloc.c index e60514d..87837cc 100644 --- a/sys/ufs/ffs/ffs_alloc.c +++ b/sys/ufs/ffs/ffs_alloc.c @@ -420,13 +420,13 @@ nospace: */ if (reclaimed == 0) { reclaimed = 1; - softdep_request_cleanup(fs, vp, cred, FLUSH_BLOCKS_WAIT); - UFS_UNLOCK(ump); if (bp) { + UFS_UNLOCK(ump);
Re: responsiveness during IO tasks
Alexander Motin wrote:
Julian Elischer wrote:
Doug Barton wrote:

No problem, let's just hunt things down. I'll wait for that larger post. In the meantime, if it is related to eventtimers, it would be good to collect more detailed information. You could try to make the timer run during idle (kern.eventtimer.idletick). You could try to switch the timer from one-shot to periodic mode (kern.eventtimer.periodic). You could also try to switch to another timer (kern.eventtimer.timer).

kern.eventtimer.periodic needs to be disabled to run 9.x on xen (as of a few months ago)

Yes, but it needs to be enabled (it is disabled by default). I remember this and am going to experiment with it in the near future.

The problem with Xen HVM freezing in one-shot mode is worked around by r221508. Also, looking at the Xen 4.1 sources, it seems the problem has already been fixed on their side as well.

-- 
Alexander Motin
Re: Building FreeBSD 9.0-CUR/amd64 with CLANG fails
On Wed, May 04, 2011 at 09:17:23AM +0200, O. Hartmann wrote:
> I guess the ports-tree isn't mature for clang.

That's correct.

mcl
Re: Interrupt storm with MSI in combination with em1
Hi Peter,

On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
> On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
>> Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going.
>
> Depending on the quality of your BIOS (competence of the vendor), you might find that kenv(8) reports the BIOS version without needing a reboot. (Look at smbios.bios.* in the output).

Great! I didn't know that :)

# kenv
...
smbios.bios.reldate=07/15/2010
...
smbios.bios.version=0303
...
smbios.planar.maker=ASUSTeK Computer INC.
smbios.planar.product=P7H55-M LX

Version 0402 is the latest and greatest, so it's time to upgrade. According to Asus it "Improves system stability", so let's see if this 'cures' IRQ 16.

Thanks,
-- 
Daan Vreeken
VEHosting
http://VEHosting.nl
tel: +31-(0)40-7113050 / +31-(0)6-46210825
KvK nr: 17174380
Re: Interrupt storm with MSI in combination with em1
Cool, thanks for the update! Good luck.

Jack

On Thu, May 5, 2011 at 1:17 PM, Daan Vreeken d...@vehosting.nl wrote:
> Hi Peter,
>
> On Thursday 05 May 2011 21:28:02 Peter Jeremy wrote:
>> On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
>>> Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going.
>>
>> Depending on the quality of your BIOS (competence of the vendor), you might find that kenv(8) reports the BIOS version without needing a reboot. (Look at smbios.bios.* in the output).
>
> Great! I didn't know that :)
>
> # kenv
> ...
> smbios.bios.reldate=07/15/2010
> ...
> smbios.bios.version=0303
> ...
> smbios.planar.maker=ASUSTeK Computer INC.
> smbios.planar.product=P7H55-M LX
>
> Version 0402 is the latest and greatest, so it's time to upgrade. According to Asus it "Improves system stability", so let's see if this 'cures' IRQ 16.
>
> Thanks,
Processes in swapped out states in recent CURRENT?
I was watching top output on my dev box and I noticed that there are more swapped out processes present on the system shortly after boot (which doesn't make sense given that I'm not low on resources on the box). Also, when I run os.waitpid() in python, the OS claims that the child doesn't exist, so I'm wondering if there's an issue with the processes reported via ps, top, etc. I'm noting this because it's a behavior change over my 'stable'-ish workstation, running CURRENT/r220089/amd64, which is spec'ed out the same as the dev box, minus some multimedia hardware.

Thanks,
-Garrett

# uname -a
FreeBSD fallout.local 9.0-CURRENT FreeBSD 9.0-CURRENT #0 r221219M: Thu May 5 12:09:37 PDT 2011 root@fallout.local:/usr/obj/usr/src/sys/FALLOUT amd6

# fstat -p 1832
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     sshd        1832 root /             2 drwxr-xr-x    1024  r
root     sshd        1832   wd /             2 drwxr-xr-x    1024  r
root     sshd        1832 text /usr     730118 -r-xr-xr-x  240944  r
root     sshd        1832    0 /dev          6 crw-rw-rw-    null  r
root     sshd        1832    1 /dev          6 crw-rw-rw-    null rw
root     sshd        1832    2 /dev          6 crw-rw-rw-    null rw
root     sshd        1832    3* internet stream tcp fe01e56cf000
root     sshd        1832    4* pseudo-terminal master pts/1 rw
root     sshd        1832    5* local stream fe0008f79960 - fe0008f79a50

# fstat -p 149
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     adjkerntz    149 root /             2 drwxr-xr-x    1024  r
root     adjkerntz    149   wd /             2 drwxr-xr-x    1024  r
root     adjkerntz    149 text /        329805 -r-xr-xr-x    8792  r
root     adjkerntz    149    0 /dev          6 crw-rw-rw-    null rw
root     adjkerntz    149    1 /dev          6 crw-rw-rw-    null rw
root     adjkerntz    149    2 /dev          6 crw-rw-rw-    null rw

# fstat -p 1479
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
root     syslogd     1479 root /             2 drwxr-xr-x    1024  r
root     syslogd     1479   wd /             2 drwxr-xr-x    1024  r
root     syslogd     1479 text /usr     739002 -r-xr-xr-x   39008  r
root     syslogd     1479    0 /dev          6 crw-rw-rw-    null rw
root     syslogd     1479    1 /dev          6 crw-rw-rw-    null rw
root     syslogd     1479    2 /dev          6 crw-rw-rw-    null rw
root     syslogd     1479    3 /var     353301 -rw---           4  w
root     syslogd     1479    4* local dgram fe0008cd31e0
root     syslogd     1479    5* local dgram fe0008cd30f0
root     syslogd     1479    6* internet6 dgram udp fe0008ced540
root     syslogd     1479    7* internet dgram udp fe0008ced3f0
root     syslogd     1479    8 /dev         29 crw---        klog  r
root     syslogd     1479   10 /var    1389613 -rw-r--r--   25389  w
root     syslogd     1479   11 /var    1389579 -rw---          62  w
root     syslogd     1479   12 /var    1389572 -rw---       10164  w
root     syslogd     1479   13 /var    1389601 -rw-r-        2814  w
root     syslogd     1479   14 /var    1389575 -rw-r--r--      62  w
root     syslogd     1479   15 /var    1389580 -rw---          62  w
root     syslogd     1479   16 /var    1389577 -rw---       57212  w
root     syslogd     1479   17 /var    1389606 -rw---       38046  w
root     syslogd     1479   18 /var    1389578 -rw-r-          62  w

# fstat -p 1829
USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
gcooper  sh          1829 root /             2 drwxr-xr-x    1024  r
gcooper  sh          1829   wd /usr    1884160 drwxr-xr-x    1024  r
gcooper  sh          1829 text /        212057 -r-xr-xr-x  131784  r
gcooper  sh          1829    0 /dev        127 crw--w       pts/0 rw
gcooper  sh          1829    1 /dev        127 crw--w       pts/0 rw
gcooper  sh          1829    2 /dev        127 crw--w       pts/0 rw
gcooper  sh          1829   10 /dev        127 crw--w       pts/0 rw

# python -c 'import os; os.waitpid(1825, 0)'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
OSError: [Errno 10] No child processes

# ps auxww | grep 1825
root  1825 0.0 0.0 47952    0 ?? IWs -      0:00.00 sshd: gcooper [priv] (sshd)
root 88213 0.0 0.0 16340 1356  3 S+  1:25PM 0:00.00 grep 1825

# top -b
last pid: 96740; load averages: 1.07, 0.98, 0.92 up 0+01:15:32 13:27:04
50 processes: 2 running, 48 sleeping
Mem: 56M Active, 23M Inact, 795M Wired, 1848K Cache, 1237M Buf, 11G Free
Swap: 24G Total, 832K Used, 24G Free

  PID USERNAME THR PRI NICE   SIZE    RES STATE  C TIME  WCPU COMMAND
 1828 gcooper    1  20    0 47952K  3372K select 6 0:02 0.00% sshd
26295 root       1  20    0  9972K   888K kqread 2 0:01 0.00% tail
95888 root       1  52    0 14472K  8092K wait   1 0:00 0.00% make
 1729 root       1  20    0 20368K  3000K
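A note on the python result above: errno 10 is ECHILD, and waitpid(2) only reports on children of the calling process. Pid 1825 is an sshd, not a child of the python interpreter, so ECHILD is the expected answer and does not by itself show that ps or top are misreporting. The sketch below demonstrates the same behavior in C; `wait_errno` is a hypothetical helper name used only for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Try to collect the status of 'pid' without blocking.
 * Returns 0 if waitpid() accepted the pid as one of our children,
 * otherwise the errno value that waitpid() set.
 */
int
wait_errno(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, WNOHANG) == -1)
		return (errno);
	return (0);
}
```

Calling `wait_errno(getpid())` returns ECHILD, because a process is never its own child. The same holds for any pid that exists on the system but was not forked by the caller.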
Re: My problems with stability on -current
Doug Barton wrote:
> Alexander suggested some knobs to twist for the timers, and I'll be glad to do that once he gets back to me with more concrete suggestions now that he knows more about my specific problems.

OK, I am all here. While this post is indeed larger than the previous one, it is not much more informative. Sorry. :(

I see several possibly unrelated problems there:
- crashes are always crashes. They should be debugged.
- calcru going backwards could have the same roots as the lost wall clock time. If there are problems with timer interrupts, timecounters could wrap unnoticed, which would cause random time jumps.
- interactivity problems. I can't prove they are unrelated, but I have no real ideas now.

I would start with the most obvious problems. I need to know more about the crashes. As usual: how to trigger them, stack backtraces, etc.

As for the time problems, I would try to collect more data:
- show `sysctl kern.eventtimer`, `sysctl kern.timecounter` and verbose dmesg outputs;
- which eventtimer is used now, and does it help to switch to another one with the kern.eventtimer.timer sysctl?
- does the timer run in periodic or one-shot mode, and does it help to switch to the other one?
- if full CPU load makes time stop, try to track what is going on with timer interrupts using `vmstat -i` and `systat -vm 1`. Under full CPU load in one-shot mode you should have a stable timer interrupt rate of about hz+stathz.
- if timer interrupts are not working well, you can build a kernel with
    options KTR
    options ALQ
    options KTR_ALQ
    options KTR_COMPILE=(KTR_SPARE2)
    options KTR_ENTRIES=131072
    options KTR_MASK=(KTR_SPARE2)
  to track event timer operation, and use ktrdump to save the trace when the problem exists (preferably when it begins).

And let's experiment with a fresh CURRENT.

-- 
Alexander Motin
Re: atkbdc broken on current ?
On May 5, 2011, at 7:43 PM, John Baldwin wrote:
> On Thursday, May 05, 2011 9:21:04 am Damjan Marion wrote:
>> Hi,
>>
>> I have an issue with an old HP DL380G3 server when I use the ILO virtual console to manage it. It seems that 9-CURRENT fails to detect atkbdc. When I boot 8.2-RELEASE it works well.
>>
>> 8.2 dmesg shows:
>> atkbdc0: Keyboard controller (i8042) port 0x60,0x64 irq 1 on acpi0
>>
>> 9.0:
>> atkbdc0: Keyboard controller (i8042) failed to probe at port 0x60 on isa0
>>
>> Is this a known issue? Should I enable some additional outputs, like KBDIO_DEBUG?
>
> I suspect this is a resource issue stemming from changes I made to the acpi(4) bus driver quite a while ago to make it use rman_reserve_resource(). Can you capture a full verbose dmesg from 9 along with devinfo -rv and devinfo -ur output from 9?

Here it is:

http://web.me.com/dmarion/atkbdc.txt

Thanks,
Damjan
Re: Nasty non-recursive lockmgr panic on softdep only enabled UFS partition when filesystem full
On Thu, May 05, 2011 at 12:49:48PM -0700, Garrett Cooper wrote:
> Things look ok with that patch and the one that Jeff provided for the LOR, taking into account your style change with the flag list. Thanks!

I do not understand your response. Jeff's patch was included in the cumulative change I sent you, with a slight modification. What 'style change with the flag list' are you referring to?
Re: Interrupt storm with MSI in combination with em1
On 2011-May-05 13:22:59 +0200, Daan Vreeken d...@vehosting.nl wrote:
> Not yet. I'll reboot the machine later today when I have physical access to it to check the BIOS version. I'll keep you informed as soon as I get another storm going.

Depending on the quality of your BIOS (competence of the vendor), you might find that kenv(8) reports the BIOS version without needing a reboot. (Look at smbios.bios.* in the output).

-- 
Peter Jeremy
Using Dtrace for Performance Evaluation
I was looking at using dtrace to help characterize performance for the new bxe(4) driver but I'm having problems with the very simple task of capturing time spent in a function. The D script I'm using looks like the following:

    #pragma D option quiet

    fbt:if_bxe::entry
    {
            self->in = timestamp;
    }

    fbt:if_bxe::return
    {
            @callouts[((struct callout *)arg0)->c_func] = sum(timestamp - self->in);
    }

    tick-10sec
    {
            printa("%40a %10@d\n", @callouts);
            clear(@callouts);
            printf("\n");
    }

    BEGIN
    {
            printf("%40s | %s\n", "function", "nanoseconds per second");
    }

After building dtrace into the kernel and loading the dtraceall kernel module, when I load my bxe kernel module and run dtrace -l to list all supported probes I notice that many functions have an entry probe but no exit probe. This effectively prevents me from calculating timestamps on fbt:if_bxe::return probes. Why am I seeing this behavior?

Dave
RE: Using Dtrace for Performance Evaluation
>> After building dtrace into the kernel and loading the dtraceall kernel module, when I load my bxe kernel module and run dtrace -l to list all supported probes I notice that many functions have an entry probe but no exit probe. This effectively prevents me from calculating timestamps on fbt:if_bxe::return probes. Why am I seeing this behavior?
>
> Tail call optimization could do that to you:
> http://en.wikipedia.org/wiki/Tail_call

How do I disable tail call optimization when building my driver?

Dave
Re: Using Dtrace for Performance Evaluation
On Thu, May 5, 2011 at 4:33 PM, David Christensen davi...@broadcom.com wrote:
>>> After building dtrace into the kernel and loading the dtraceall kernel module, when I load my bxe kernel module and run dtrace -l to list all supported probes I notice that many functions have an entry probe but no exit probe. This effectively prevents me from calculating timestamps on fbt:if_bxe::return probes. Why am I seeing this behavior?
>>
>> Tail call optimization could do that to you:
>> http://en.wikipedia.org/wiki/Tail_call
>
> How to disable tail call optimization when building my driver?

Google is your friend: Either compile with -O0/-O1, or use -fno-optimize-sibling-calls.

http://stackoverflow.com/questions/3679435/how-do-i-disable-tailcall-optimizations-in-gcc

--Artem
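For readers following the thread, here is a minimal sketch of the kind of function that is affected. The names `helper` and `wrapper` are made up for illustration and have nothing to do with the bxe(4) source. As far as I understand, the fbt provider attaches return probes to a function's return instructions, so a function that ends in a jump to another function may end up with an entry probe only.

```c
/* 'noinline' is a GCC/Clang extension; it keeps helper() a real call target. */
__attribute__((noinline)) int
helper(int x)
{
	return (x * 2 + 1);
}

/*
 * The call to helper() is in tail position: nothing is left to do in
 * wrapper() once helper() returns. With sibling-call optimization
 * enabled, the compiler may emit a jump to helper() instead of a
 * call followed by a return, leaving wrapper() without a return
 * instruction of its own.
 */
int
wrapper(int x)
{
	return (helper(x + 1));
}
```

The result of `wrapper()` is the same either way; only the generated code differs. Building with -fno-optimize-sibling-calls (or at -O0/-O1, as suggested above) should keep the call and return pair in `wrapper()`.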
Re: Using Dtrace for Performance Evaluation
On Thu, May 5, 2011 at 1:08 PM, David Christensen davi...@broadcom.com wrote:
> I was looking at using dtrace to help characterize performance for the new bxe(4) driver but I'm having problems with the very simple task of capturing time spent in a function. The D script I'm using looks like the following:
>
>     #pragma D option quiet
>
>     fbt:if_bxe::entry
>     {
>             self->in = timestamp;
>     }
>
>     fbt:if_bxe::return
>     {
>             @callouts[((struct callout *)arg0)->c_func] = sum(timestamp - self->in);
>     }
>
>     tick-10sec
>     {
>             printa("%40a %10@d\n", @callouts);
>             clear(@callouts);
>             printf("\n");
>     }
>
>     BEGIN
>     {
>             printf("%40s | %s\n", "function", "nanoseconds per second");
>     }
>
> After building dtrace into the kernel and loading the dtraceall kernel module, when I load my bxe kernel module and run dtrace -l to list all supported probes I notice that many functions have an entry probe but no exit probe. This effectively prevents me from calculating timestamps on fbt:if_bxe::return probes. Why am I seeing this behavior?

Tail call optimization could do that to you:

http://en.wikipedia.org/wiki/Tail_call

--Artem

> Dave