Re: PCIe SATA HBA for ZFS on -STABLE
On 1 June 2011 17:37, Jeremy Chadwick free...@jdc.parodius.com wrote:
On Wed, Jun 01, 2011 at 02:34:55PM +0800, TJ Varghese wrote:
On Tue, May 31, 2011 at 10:31 PM, Freddie Cash fjwc...@gmail.com wrote:

[snip]

SuperMicro AOC-USAS2-L8i works exceptionally well. These are 8-port HBAs using the LSI1068 chipset, supported by the mpt(4) driver. They support 3 Gbps SATA/SAS, using multi-lane cables (2 connectors on the card, each connector supports 4 SATA ports), hot-plug, hot-swap.

The USAS2 (6Gbps) is supported by the mps driver (on -CURRENT, not sure if it's in 8-STABLE yet). Perhaps you're referring to the earlier USAS which does 3Gbps and is supported by the mpt driver.

Folks considering use of mps(4), which was committed to RELENG_8 roughly around 2011/02/18 (thus is not in 8.2-RELEASE), should read the below threads just in case. Always good to be educated. Of course, the mailing lists are usually filled with complaints rather than success stories, so the tone of my mail here will therefore sound negative; I don't mean it that way, I just ask that people be aware.

* 2011/04/29 -- mps driver instability under stable/8
  http://lists.freebsd.org/pipermail/freebsd-stable/2011-April/thread.html#62507
  http://lists.freebsd.org/pipermail/freebsd-stable/2011-May/thread.html#62518
* 2011/04/27 -- MPS driver: force bus rescan after remove SAS cable
  http://lists.freebsd.org/pipermail/freebsd-stable/2011-April/thread.html#62438
  http://lists.freebsd.org/pipermail/freebsd-stable/2011-April/thread.html#62443
* 2011/03/10 -- LSI SAS2008 performance with mps(4) driver
  http://lists.freebsd.org/pipermail/freebsd-stable/2011-March/thread.html#61862

Those threads assure me that the SuperMicro AOC-USAS2-L8i with version 9 firmware and the mps(4) driver work very well as long as I'm running FreeBSD 9-CURRENT or 8-STABLE (not 8.2-RELEASE). As I'm running -STABLE I'm quite happy to give it a go.
To the OP (Matt Thyer): Sadly I don't have a recommendation for you. Since you effectively want a 6-port SATA300 controller that's reliable, you're almost certainly going to be paying Big Bucks(tm) given the number of ports and your requirement that it be PCIe-based. You state quite boldly not wanting to break the bank, but what you're asking for almost certainly WILL break the bank.

Jeremy, I think you need to have another look at current prices. I have now bought a SuperMicro AOC-USAS2-L8i on eBay from bakamuzko with the cables I need for only $US 210.99 (I do know about the UIO bracket).

For example, an affordable controller might be one driven by Silicon Image's SiI3124 chip -- four (4) SATA300 ports, but it's only hooked to PCI or PCI-X, not PCIe, which means you're susceptible to a much more severe bus bottleneck than with PCIe:

I definitely would not consider PCI for part of a ZFS array with 4 drives on that one controller.

http://www.siliconimage.com/products/family.aspx?id=3

I tend to avoid consumer-grade Marvell and JMicron SATA chipsets like the plague, however. That's based on my experiences with them under Windows, where I would expect (truly) the drivers to be rock solid given the marketing demographic of the chips in question.

I've had the same bad experiences.

Good luck, and please let us know what controller you *do* end up going with and your experience with it! Positives are as important as negatives.

I'll let you know how it works out.

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
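To put rough numbers on the bus bottleneck discussed above, here is a minimal sketch in Python. The figures are theoretical peak rates from the bus specifications, not measurements from this thread. The four-busy-drives scenario and the x4 PCIe 1.x card are illustrative assumptions.

```python
# Theoretical peak bandwidth figures in MB/s (decimal). These are spec-sheet
# numbers, not measurements from this thread.

def parallel_bus_mb_s(width_bits, clock_mhz):
    """Peak rate of a parallel bus: bytes per transfer times million transfers per second."""
    return width_bits / 8 * clock_mhz

PCI_32_33 = parallel_bus_mb_s(32, 33.33)      # conventional PCI, shared by every device on the bus
PCIX_64_133 = parallel_bus_mb_s(64, 133.33)   # PCI-X at 133 MHz, also a shared bus
PCIE1_PER_LANE = 250.0                        # PCIe 1.x, per lane, per direction
SATA300_LINK = 300.0                          # one SATA 3 Gbps link after 8b/10b encoding

def per_drive_share(bus_mb_s, drives):
    """Bandwidth each drive gets if all drives stream at once and share the bus evenly."""
    return bus_mb_s / drives

DRIVES = 4  # assumption: all four ports of a SiI3124-style controller are busy

for name, bus in [("PCI 32-bit/33 MHz", PCI_32_33),
                  ("PCI-X 64-bit/133 MHz", PCIX_64_133),
                  ("PCIe 1.x x4 (hypothetical card)", 4 * PCIE1_PER_LANE)]:
    share = per_drive_share(bus, DRIVES)
    print(f"{name}: {bus:.0f} MB/s total, {share:.0f} MB/s per drive "
          f"(one SATA300 link alone can carry {SATA300_LINK:.0f} MB/s)")
```

On plain 32-bit PCI the whole bus carries less than a single SATA300 link, which is the concern raised about putting four ZFS drives behind one PCI controller.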
Re: 8-STABLE won't boot with ZFSv28
Hi all,

as yesterday was a bank holiday in Germany I wasn't in the office to try the patch linked in the email. Is the consensus that I should try the patch located here:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
and report the result? Or do you need some additional discussion on this topic? I really don't know much about ata-intel chipset programming interface things, that's why I'm asking :-)

Best regards,
Holger

on 02.06.2011 10:37, Alexander Motin wrote:
Jeremy Chadwick wrote:
On Thu, Jun 02, 2011 at 09:53:58AM +0300, Alexander Motin wrote:
Holger Kipp wrote:

got the same messages over and over again - panic took some time:

unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
ata0: reinit done ..
ata0: reiniting channel ..
ata0: DISCONNECT requested
short delay here
ata0: p0: SATA connect time=0ms status=0113
ata0: p1: SATA connect timeout status=
ata0: reset tp1 mask=03 ostat0=00 ostat1=00
ata0: stat0=0x00 err=0x01 lsb=0x14 msb=0xeb
ata0: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb
ata0: reset tp2 stat0=00 stat1=00 devices=0x3
unknown: WARNING - ATAPI_IDENTIFY requeued due to channel reset LBA=0
ata0: reinit done ..
ata0: reiniting channel ..
ata0: DISCONNECT requested

I see two problems here:

1. devices=0x3 means that two ATAPI devices were detected instead of one. I can also reproduce it with other Intel chipsets. It looks like a hardware bug to me. It can be worked around by moving the ATAPI device to an even-numbered (2 or 4) SATA port, or by connecting any other device there.

2. DISCONNECT requested means that the controller reported a PHY status change for some device on the channel, triggering an infinite retry. Unfortunately I have no ICH9 board, and I can't reproduce it with ICH10 or above.
This patch should work around the first problem in software:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
Please try it, and let's see if with some luck it does something about the second problem.

With regards to item #1: I don't see anything in the ICH9 errata that indicates a silicon bug if the only device attached to the controller is an ATAPI device and connected to SATA port 0 (presumably), or an odd-numbered port? If this problem exists on other ICHxx and/or ESBxx chips, I sure would hope it'd be documented. I haven't tried confirming it myself, but if need be I can set up a test box with a SATA-based DVD drive hooked up to it + provide remote serial console/etc. if it'd be of any help. I don't think it would be (sounds like you have lots of hardware :-) ), but I'm willing to help in any way I can.

Intel probably doesn't see an issue there, as the same behavior can be found even on the latest chipsets. But according to my understanding of the ATA specs and my analysis of real PATA device behavior, this behavior is not correct. When an ATAPI device is connected to the first of two SATA ports routed to the same legacy-/PATA-emulated ATA channel (as the master device), the soft-reset sequence returns a false-positive slave ATAPI device presence. The problem doesn't show up with ATA disk devices, or if some other device is really attached to the slave port. The problem looks like it was always there, but before ATA_CAM it usually went unnoticed, due to very small IDENTIFY command timeouts in ata(4). If somebody can give a better explanation or propose a better workaround, that would be welcome, as I don't much like this solution.

With regards to item #2: could this be at all related to OOB (bit 15) somehow being set in PCS (SATA register offset 0x92)? I'm doubting it but I thought I'd ask.
My thought process, which is probably wrong (consider it an educational discussion :-) ): The ICH9 specification states that the default value for this register is 0x, and b15=0 means SATA controller will not retry after an OOB failure, while b15=1 causes the controller to indefinitely retry after OOB failure. I imagine system BIOSes and other things can change this default value, but we don't seem to print it anywhere in ata_intel_chipinit() during a verbose boot. Looking at chipsets/ata-intel.c, it looks like we only touch PCS in ata_intel_chipinit() and ata_intel_reset(). In the former, we avoid touching bits 4 through 15, and in the latter we mask out only what we want to adjust (e.g. the SATA port per ch variable).

As far as I can see, ata_intel.c should not change that bit if it was set for some reason. Theoretically, OOB (out-of-band signaling) is a function of the same state machine that sets the PHY status change flag. But frankly speaking, I have no idea what the result of setting this bit would be. In this legacy/PATA emulation mode too many things are undocumented to be sure of anything.
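The bit-level reasoning about PCS above can be made concrete with a small model. This is only a sketch in Python, not the driver code: the register offset and the meaning of bit 15 are taken from the message, while the choice of bits 0-3 as the adjustable port bits and the example register value are assumptions for illustration.

```python
PCS_OFFSET = 0x92          # PCI config-space offset of PCS, as given in the message
PCS_OOB_RETRY = 1 << 15    # bit 15: controller keeps retrying after an OOB failure
PORT_BITS = 0x000F         # assumption: the port bits the driver adjusts are bits 0-3

def oob_retry_enabled(pcs):
    """True if bit 15 (indefinite OOB retry) is set in a 16-bit PCS value."""
    return bool(pcs & PCS_OOB_RETRY)

def set_ports(pcs, port_mask, enable):
    """Read-modify-write that changes only bits 0-3 and keeps bits 4-15 as found."""
    port_mask &= PORT_BITS
    new = (pcs | port_mask) if enable else (pcs & ~port_mask)
    return new & 0xFFFF

pcs = 0x8003  # hypothetical value left by a BIOS: OOB retry set, ports 0 and 1 enabled
after = set_ports(pcs, 0x1, enable=False)
print(f"PCS {pcs:#06x} -> {after:#06x}, OOB retry still set: {oob_retry_enabled(after)}")
```

If the driver really does restrict itself to masked updates like this, a bit 15 set by firmware would survive a channel reset, which matches the observation that the driver should not change it.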
Re: 8-STABLE won't boot with ZFSv28
Hi.

Holger Kipp wrote:
as yesterday was a bank holiday in Germany I wasn't in the office to try the patch linked in the email. Is the consensus that I should try the patch located here:
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/ata/chipsets/ata-intel.c.diff?r1=1.25;r2=1.26
and report the result? Or do you need some additional discussion on this topic? I really don't know much about ata-intel chipset programming interface things, that's why I'm asking :-)

Yes, I want you to try it and report the result.

--
Alexander Motin
Re: modem support MT9234ZPX-PCIE-NV
Dear John and FreeBSD friends,

On Tue, May 31, 2011 at 11:01:23AM -0400, John Baldwin wrote:
On Monday, May 30, 2011 5:25:14 am Willy Offermans wrote:
Hello John and FreeBSD friends,
On Fri, May 27, 2011 at 10:43:34AM -0400, John Baldwin wrote:
On Friday, May 27, 2011 10:38:02 am Willy Offermans wrote:
Dear John and FreeBSD friends,
On Fri, May 27, 2011 at 08:05:56AM -0400, John Baldwin wrote:
On Thursday, May 26, 2011 4:58:37 pm Mike Tancsa wrote:
On 5/26/2011 4:12 PM, John Baldwin wrote:

Hmm, can you get 'pciconf -lb' output?

Hmm, wow, I wonder how uart(4) works at all. It tries to reuse its softc structure in uart_bus_attach() that was set up in uart_bus_probe(). Since it doesn't return 0 from its probe routine, that is forbidden. I guess it accidentally works because of the hack where we call DEVICE_PROBE() again to make sure the device description is correct.

I think this is a similar card. Had it laying about for a while and popped it in. cu -l to it, attaches, but I am not able to interact with it.

none3@pci0:5:0:0: class=0x070002 card=0x20282205 chip=0x015213a8 rev=0x02 hdr=0x00
    vendor = 'Exar Corp.'
    device = 'XR17C/D152 Dual PCI UART'
    class = simple comms
    subclass = UART
    bar [10] = type Memory, range 32, base 0xe895, size 1024, enabled

NetBSD supposedly has support for this card

Oh, hmm, looks like the clock has an unusual multiplier. Does it work if you use 'cu -l -s 1200' to talk at 9600 for example? (In general use speed / 8 as the speed to '-s'.) Also, is your card a modem or a dual-port card?

--
John Baldwin

It is a modem. As suggested:

kosmos# cu -l /dev/cuau0 -s 1200
Stale lock on cuau0 PID=3642... overriding.
Connected
atF
OK
atdt0045***
NO DIALTONE

Ok, try this updated patch.
After this you should be able to use the correct speed:

Index: uart_bus_pci.c
===
--- uart_bus_pci.c (revision 85)
+++ uart_bus_pci.c (working copy)
@@ -110,6 +110,8 @@ static struct pci_id pci_ns8250_ids[] = {
 { 0x1415, 0x950b, 0x, 0, "Oxford Semiconductor OXCB950 Cardbus 16950 UART", 0x10, 16384000 },
 { 0x151f, 0x, 0x, 0, "TOPIC Semiconductor TP560 56k modem", 0x10 },
+{ 0x13a8, 0x0152, 0x2205, 0x2026, "MultiTech MultiModem ZPX", 0x10,
+ 8 * DEFAULT_RCLK },
 { 0x9710, 0x9820, 0x1000, 1, "NetMos NM9820 Serial Port", 0x10 },
 { 0x9710, 0x9835, 0x1000, 1, "NetMos NM9835 Serial Port", 0x10 },
 { 0x9710, 0x9865, 0xa000, 0x1000, "NetMos NM9865 Serial Port", 0x10 },

--
John Baldwin

The structure you have provided in your magic line would also need some explanation. The data concerns the description of the chip and the card, I guess, and can be obtained with `pciconf -lv`:

uart0@pci0:6:0:0: class=0x070002 card=0x20262205 chip=0x015213a8 rev=0x02 hdr=0x00
    vendor = 'Exar Corp.'
    device = 'XR17C/D152 Dual PCI UART'
    class = simple comms
    subclass = UART

A more detailed explanation would not hurt. The values 0x10 and 8 * DEFAULT_RCLK are still a complete mystery to me.

0x10 is the resource id for the first PCI BAR (rids for PCI device resources use the offset in PCI config space of the associated BAR). It would perhaps be more obvious if uart(4) and puc(4) used PCIR_BAR(0) rather than 0x10. Bumping the clock by a multiple of 8 was based on looking at the change in NetBSD that Mike Tancsa pointed to and that you verified by noting that 'cu -s 1200' connected at 9600 (9600 / 1200 == 8).

One question though, would you be able to test the patch for puc(4) that I sent to Mike Tancsa to see if your modem works with puc(4)? The puc(4) patch is more general and if it works fine for your modem I'd rather just commit that.

--
John Baldwin

I have applied the suggested patch. The outcome was a new /usr/src/sys/dev/puc/pucdata.c file, which I have enclosed.
Upon compiling the new kernel, I encountered the following error: cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -std=c99 -g -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -mcmodel=kernel -mno-red-zone -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow -msoft-float
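The "speed / 8" rule and the 8 * DEFAULT_RCLK entry discussed in this message follow from how a 16550-style UART derives its baud rate from a reference clock. The following is a minimal sketch in Python, not driver code. It assumes the usual 16x oversampling and takes DEFAULT_RCLK to be 1843200 Hz, the conventional 16550 clock; neither value is stated in the thread, so check the uart(4) source for the authoritative definitions.

```python
DEFAULT_RCLK = 1843200  # Hz; assumed here: the conventional 1.8432 MHz 16550 clock

def divisor_for(rclk_hz, baud):
    """Divisor the driver programs, given the clock it believes the chip has."""
    return rclk_hz // (16 * baud)

def line_rate(actual_rclk_hz, divisor):
    """Baud rate that actually appears on the wire for a programmed divisor."""
    return actual_rclk_hz // (16 * divisor)

ACTUAL_RCLK = 8 * DEFAULT_RCLK  # what the patch declares for this modem

# Before the patch: the driver assumes DEFAULT_RCLK and the user asks for 1200.
d = divisor_for(DEFAULT_RCLK, 1200)
print("unpatched, -s 1200:", line_rate(ACTUAL_RCLK, d), "baud on the wire")

# After the patch: the driver knows the real clock and the user asks for 9600.
d = divisor_for(ACTUAL_RCLK, 9600)
print("patched, -s 9600:", line_rate(ACTUAL_RCLK, d), "baud on the wire")
```

Under these assumptions both cases end up at the same rate on the wire, which is consistent with 'cu -s 1200' connecting at 9600 before the patch.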
Re: modem support MT9234ZPX-PCIE-NV
On Friday, June 03, 2011 8:34:54 am Willy Offermans wrote:

[snip]

I have applied the suggested patch. The outcome was a new /usr/src/sys/dev/puc/pucdata.c file, which I have enclosed.

Hmm, there was a newer puc patch.
Please try this one instead:

Index: pucdata.c
===
--- pucdata.c (revision 222565)
+++ pucdata.c (working copy)
@@ -48,8 +48,8 @@ __FBSDID("$FreeBSD$");
 #include <dev/puc/puc_bfe.h>
 static puc_config_f puc_config_amc;
-static puc_config_f puc_config_cronyx;
 static puc_config_f puc_config_diva;
+static puc_config_f puc_config_exar;
 static puc_config_f puc_config_icbook;
 static puc_config_f puc_config_quatech;
 static
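For readers puzzling over where the numbers in the uart_bus_pci.c table entry come from, the sketch below (Python, for illustration only) splits the chip= and card= values from the pciconf output in this thread into their 16-bit halves and computes the BAR offset that serves as the resource id. The helper names split_id and pcir_bar are made up for this example; the layout (BARs start at config offset 0x10 and are 4 bytes apart) is standard PCI.

```python
def split_id(value):
    """Split a 32-bit pciconf ID into (low 16 bits, high 16 bits)."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF

def pcir_bar(n):
    """Config-space offset of BAR n: BARs start at 0x10 and are 4 bytes apart."""
    return 0x10 + 4 * n

chip = 0x015213a8   # chip= from the pciconf output quoted in this thread
card = 0x20262205   # card= reported for Willy's modem

vendor, device = split_id(chip)
subvendor, subdevice = split_id(card)
print(f"vendor={vendor:#06x} device={device:#06x} "
      f"subvendor={subvendor:#06x} subdevice={subdevice:#06x} rid={pcir_bar(0):#04x}")
```

The four values line up with the first four fields of the proposed table entry, and pcir_bar(0) gives the 0x10 that follows the description string. Note that Mike Tancsa's card reports card=0x20282205, so its subdevice half is 0x2028 rather than 0x2026.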
[poll / rfc] kdb_stop_cpus
I wonder if anybody uses kdb_stop_cpus with a non-default value. If yes, I am very interested to learn about your use case for it.

I think that the default kdb behavior is the correct one, so it doesn't make sense to have a knob to turn on incorrect behavior. But I may be missing something obvious.

The comment in the code doesn't really satisfy me:

/*
 * Flag indicating whether or not to IPI the other CPUs to stop them on
 * entering the debugger. Sometimes, this will result in a deadlock as
 * stop_cpus() waits for the other cpus to stop, so we allow it to be
 * disabled. In order to maximize the chances of success, use a hard
 * stop for that.
 */

The hard stop should be sufficiently mighty. Yes, I am aware of supposedly extremely rare situations where a deadlock could happen even when using hard stop. But I'd rather fix that than have this switch.

Oh, the commit message (from 2004) explains it:

Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we attempt to IPI other cpus when entering the debugger in order to stop them while in the debugger. The default remains to issue the stop; however, that can result in a hang if another cpu has interrupts disabled and is spinning, since the IPI won't be received and the KDB will wait indefinitely. We probably need to add a timeout, but this is a useful stopgap in the mean time.

But that was before we started using hard stop in this context (in 2009).

--
Andriy Gapon
Re: HAST instability
Decided to apply the patch proposed in -current by Mikolaj Golub:
http://people.freebsd.org/~trociny/uipc_socket.c.patch

This apparently fixed my issue as well. Running without checksums for a full bonnie++ run (~100GB write/rewrite) produced no disconnects, no stalls and generated up to 280MB/sec (4 drives in a striped zpool). Interestingly, the HAST devices' write latency as observed by gstat was under 30ms. I believe this fix should be committed.

Here is the accumulated netstat -s output from both hosts, for comparison with previous runs. Retransmits etc. are far fewer.
http://news.digsys.bg/~admin/hast/test3jun-fix/b1a-netstat-s
http://news.digsys.bg/~admin/hast/test3jun-fix/b1b-netstat-s
http://news.digsys.bg/~admin/hast/test3jun-fix/b1b-systat-if-fix

Before applying the patch I verified there are no network problems. Created a 1TB file from /dev/random on the first host. Copied it over to the second host with ftp. Transfer speed was low, at 80MB/sec -- ftp would utilize one CPU core 100% at the receiving node. Then calculated md5 checksums on both sides; they matched.

Daniel
Re: [poll / rfc] kdb_stop_cpus
On 06/03/11 10:13, Andriy Gapon wrote:

[snip]

Oh, the commit message (from 2004) explains it: Add a new sysctl, debug.kdb.stop_cpus, which controls whether or not we attempt to IPI other cpus when entering the debugger in order to stop them while in the debugger. The default remains to issue the stop; however, that can result in a hang if another cpu has interrupts disabled and is spinning, since the IPI won't be received and the KDB will wait indefinitely. We probably need to add a timeout, but this is a useful stopgap in the mean time. But that was before we started using hard stop in this context (in 2009).

Some non-x86 platforms (e.g. PPC) don't support real NMIs, and so this still applies.

-Nathan
Re: [poll / rfc] kdb_stop_cpus
on 03/06/2011 18:28 Nathan Whitehorn said the following:
On 06/03/11 10:13, Andriy Gapon wrote:

[snip]

But that was before we started using hard stop in this context (in 2009).

Some non-x86 platforms (e.g. PPC) don't support real NMIs, and so this still applies.

Well, even if it does, there are two things that can be done about that (and, IMO, both are better than the manually controlled knob):
- quick and dirty: just let stop_cpus[_hard] time out; this way good CPUs are stopped and the bad ones are no worse than with kdb_stop_cpus=0.
- have a special reserved high priority interrupt, change 'disabling of interrupts' to 'disabling of all interrupts except the special one' by employing various kinds of interrupt priority registers (like it was done for splX stuff); use the special interrupt like an IPI+NMI.

What do you think?

P.S. I think that the first quick and dirty thing should be done anyway, regardless of any other changes and plans.

--
Andriy Gapon
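The "quick and dirty" option above, letting the stop request time out, can be modelled in a few lines. This is a user-space sketch in Python, not kernel code. The function name stop_cpus_with_timeout, the polling callback and the poll limit are all hypothetical and only illustrate the control flow.

```python
def stop_cpus_with_timeout(targets, has_stopped, max_polls):
    """Poll until every target CPU has acknowledged the stop or the polls run out.

    Returns (stopped, unresponsive). A real implementation would send an IPI
    first and spin on a per-CPU "stopped" mask; here has_stopped() stands in
    for reading that mask.
    """
    pending = set(targets)
    for _ in range(max_polls):
        pending = {cpu for cpu in pending if not has_stopped(cpu)}
        if not pending:
            break
    return set(targets) - pending, pending

# Hypothetical scenario: CPU 2 spins with interrupts disabled and never
# acknowledges the stop request; CPUs 1 and 3 stop immediately.
def has_stopped(cpu):
    return cpu != 2

stopped, unresponsive = stop_cpus_with_timeout({1, 2, 3}, has_stopped, max_polls=1000)
print("stopped:", sorted(stopped), "gave up on:", sorted(unresponsive))
```

With a bound on the wait, the debugger is entered with the responsive CPUs stopped, and the unresponsive one is left as it would be with kdb_stop_cpus=0, which is the trade-off described in the first bullet.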
Re: HAST instability
Well, apparently my HAST joy was short-lived. On a second run, I got stuck with

Jun 3 19:08:16 b1a hastd[1900]: [data2] (primary) Unable to receive reply header: Operation timed out.

on the primary. No messages on the secondary.

On primary:

# netstat -an | grep 8457
tcp4       0      0 10.2.101.11.42659  10.2.101.12.8457   FIN_WAIT_2
tcp4       0      0 10.2.101.11.62058  10.2.101.12.8457   CLOSE_WAIT
tcp4       0      0 10.2.101.11.34646  10.2.101.12.8457   FIN_WAIT_2
tcp4       0      0 10.2.101.11.11419  10.2.101.12.8457   CLOSE_WAIT
tcp4       0      0 10.2.101.11.37773  10.2.101.12.8457   FIN_WAIT_2
tcp4       0      0 10.2.101.11.21911  10.2.101.12.8457   FIN_WAIT_2
tcp4       0      0 10.2.101.11.40169  10.2.101.12.8457   CLOSE_WAIT
tcp4       0  97749 10.2.101.11.44360  10.2.101.12.8457   CLOSE_WAIT
tcp4       0      0 10.2.101.11.8457   *.*                LISTEN

On secondary:

# netstat -an | grep 8457
tcp4       0      0 10.2.101.12.8457   10.2.101.11.42659  CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457   10.2.101.11.62058  FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457   10.2.101.11.34646  CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457   10.2.101.11.11419  FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457   10.2.101.11.37773  CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457   10.2.101.11.21911  CLOSE_WAIT
tcp4       0      0 10.2.101.12.8457   10.2.101.11.40169  FIN_WAIT_2
tcp4   66415      0 10.2.101.12.8457   10.2.101.11.44360  FIN_WAIT_2
tcp4       0      0 10.2.101.12.8457   *.*                LISTEN

On primary:

# hastctl status
data0:
  role: primary
  provname: data0
  localpath: /dev/gpt/data0
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 0 (0B)
data1:
  role: primary
  provname: data1
  localpath: /dev/gpt/data1
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 0 (0B)
data2:
  role: primary
  provname: data2
  localpath: /dev/gpt/data2
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 6291456 (6.0MB)
data3:
  role: primary
  provname: data3
  localpath: /dev/gpt/data3
  extentsize: 2097152 (2.0MB)
  keepdirty: 64
  remoteaddr: 10.2.101.12
  sourceaddr: 10.2.101.11
  replication: fullsync
  status: complete
  dirty: 0 (0B)

It has been sitting in this state for over 10 minutes. Unfortunately, there is no KDB in the kernel. Any ideas what else to look for?

Daniel
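When comparing netstat output from two hosts like this, it can help to filter for the connections that still have data queued. The sketch below is a small Python helper written for this purpose, not part of any FreeBSD tool. It assumes the usual `netstat -an` column order (protocol, Recv-Q, Send-Q, local address, foreign address, state) and uses a few lines from the primary's output above as sample input.

```python
SAMPLE = """\
tcp4 0 0 10.2.101.11.42659 10.2.101.12.8457 FIN_WAIT_2
tcp4 0 0 10.2.101.11.62058 10.2.101.12.8457 CLOSE_WAIT
tcp4 0 97749 10.2.101.11.44360 10.2.101.12.8457 CLOSE_WAIT
tcp4 0 0 10.2.101.11.8457 *.* LISTEN
"""

def parse_netstat(text):
    """Turn 'netstat -an' TCP lines into dicts; lines that do not fit are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        proto, recvq, sendq, local, remote, state = fields
        rows.append({"proto": proto, "recvq": int(recvq), "sendq": int(sendq),
                     "local": local, "remote": remote, "state": state})
    return rows

def with_queued_data(rows):
    """Connections (not listeners) that still have bytes in either queue."""
    return [r for r in rows
            if r["state"] != "LISTEN" and (r["recvq"] > 0 or r["sendq"] > 0)]

for r in with_queued_data(parse_netstat(SAMPLE)):
    print(r["local"], "->", r["remote"], r["state"],
          f"recvq={r['recvq']} sendq={r['sendq']}")
```

Applied to both hosts' full output, this singles out the connection on local port 44360, the only one with unsent data on the primary and unread data on the secondary.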
Re: [poll / rfc] kdb_stop_cpus
On 3 Jun 2011, at 16:13, Andriy Gapon wrote:

I wonder if anybody uses kdb_stop_cpus with a non-default value. If yes, I am very interested to learn about your use case for it.

The issue that prompted the sysctl was non-NMI IPIs being used to enter the debugger or reboot following a core hanging with interrupts disabled. With the switch to NMI IPIs in some of those circumstances, life is better -- at least, on hardware that supports non-maskable IPIs. I seem to recall sparc64 doesn't, however? Not sure about MIPS, etc. Attilio has since significantly improved our shutdown behaviour -- initially, the switch to NMI IPIs broke other things (because certain IPIs then improperly preempted threads holding spinlocks), but that pretty much all seems worked out now.

Robert

[snip]
Realtek 8111e on 8.2-RELEASE
Hello,

Has anyone tried the Realtek 8111E on 8.2-RELEASE? The Realtek 8111E should be supported, but I got the following messages:

re0: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet port 0xe800-0xe8ff mem 0xfdfff000-0xfdff, 0xfdff8000-0xfdffbfff irq 17 at device 0.0 on pci2
re0: Using 1 MSI messages
re0: Chip rev. 0x2c80
re0: MAC rev. 0x
re0: Unknown H/W revision: 0%2c80
device_attach: re0 attach returned 6

Any suggestions? Thanks a lot!
Re: modem support MT9234ZPX-PCIE-NV
Hello John and FreeBSD friends,

On Fri, Jun 03, 2011 at 09:48:26AM -0400, John Baldwin wrote:
On Friday, June 03, 2011 8:34:54 am Willy Offermans wrote:

[snip]

I have applied the suggested patch. The outcome was a new /usr/src/sys/dev/puc/pucdata.c file, which I have enclosed.

Hmm, there was a newer puc patch.
Please try this one instead: Index: pucdata.c === --- pucdata.c (revision 222565) +++ pucdata.c (working copy) @@ -48,8 +48,8 @@ __FBSDID($FreeBSD$); #include dev/puc/puc_bfe.h static
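
For anyone else puzzled by the 0x10 and 8 * DEFAULT_RCLK values discussed in this message, here is a minimal C sketch of the arithmetic behind John's explanation. It is not code from the FreeBSD tree. The names SKETCH_PCIR_BAR, SKETCH_DEFAULT_RCLK, sketch_divisor and sketch_actual_baud are made up for illustration, and the 1843200 Hz base clock and the 16x oversampling divisor formula are assumptions (the usual values for ns8250-class UARTs). Only the 0x10 BAR offset and the factor of 8 come from the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: local stand-in for the kernel's PCIR_BAR() macro.
 * In PCI config space, BAR n sits at offset 0x10 + 4 * n, so the
 * first BAR is at 0x10, which is the rid used in the table entry.
 */
#define SKETCH_PCIR_BAR(n)	(0x10 + (n) * 4)

/* Assumption: conventional 1.8432 MHz reference clock of a PC UART. */
#define SKETCH_DEFAULT_RCLK	1843200u

/*
 * Divisor a driver would program for the requested baud rate, given
 * the reference clock it believes the chip has (16x oversampling).
 */
static uint32_t
sketch_divisor(uint32_t assumed_rclk, uint32_t baud)
{
	assert(baud != 0);
	return (assumed_rclk / (16u * baud));
}

/*
 * Line rate the hardware really produces for that divisor when its
 * true reference clock is real_rclk.
 */
static uint32_t
sketch_actual_baud(uint32_t real_rclk, uint32_t divisor)
{
	assert(divisor != 0);
	return (real_rclk / (16u * divisor));
}
```

Under these assumptions, a driver that believes the clock is SKETCH_DEFAULT_RCLK programs divisor 96 for 1200 baud, and hardware that is really clocked 8 times faster then runs the line at 9600. That matches the 'cu -s 1200' observation, and it is why declaring the clock as 8 * DEFAULT_RCLK lets the requested speed and the real speed agree.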
Re: modem support MT9234ZPX-PCIE-NV
On Fri, Jun 03, 2011 at 09:00:09PM +0200, Willy Offermans wrote:
> Hello John and FreeBSD friends,
> ...
> The latter patch seems to work. From the boot.msg:
>
> snip
> puc0: Exar XR17C/D152 mem 0xfbfffc00-0xfbff irq 16 at device 0.0 on pci6
> puc0: failed to enable port mapping!
> puc0: [FILTER]
> uart0: 16750 or compatible on puc0
> uart0: [FILTER]
> uart1: 16750 or compatible on puc0
> uart1: [FILTER]
> /snip
>
> As I already pointed out, I do not have a line connected to the modem
> yet. This connection will hopefully be established tomorrow. After some
> rigorous testing I will post a mail with the on stream results. On the
> other hand, if someone knows some off stream testing procedures, then
> I'm happy to hear about that.
> ...

Many if not most modems supporting a Hayes-style command set include several loopback points (digital and analog) which you can turn on via a specific command. Those commands are not standardized, so I can't tell you the commands for yours, but if you look through a user manual or command reference you should be able to find them.

Turning on loopback should allow you to do some basic verification tests, e.g. pipe a file of random binary values into it while concurrently reading it, and verify that you get the same contents back. Personally, I'd try to get the digital loopback working first, then, if that's OK, try the analog loopback point.

  -- Clifton

-- 
Clifton Royston -- clift...@iandicomputing.com / clift...@volcano.org
President - I and I Computing * http://www.iandicomputing.com/
Custom programming, network design, systems and network consulting services
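
The loopback check Clifton describes could be sketched roughly like this in C. This is only an illustration under stated assumptions: sketch_loopback_check and SKETCH_CHUNK are hypothetical names, and the function takes two already-open descriptors so that it can be tried on an ordinary pipe first. Opening the serial device, switching it to raw mode and sending the modem-specific loopback command are deliberately left out, since they depend on the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Arbitrary chunk size, kept small so a pipe never fills up. */
#define SKETCH_CHUNK	64

/*
 * Write data to wfd in small chunks and read each chunk back from rfd.
 * Returns 0 if every byte comes back unchanged, -1 on any I/O error,
 * short stream or mismatch.
 */
static int
sketch_loopback_check(int wfd, int rfd, const unsigned char *data, size_t len)
{
	unsigned char echo[SKETCH_CHUNK];
	size_t off = 0;

	assert(data != NULL);
	while (off < len) {
		size_t want = len - off;
		size_t done;
		ssize_t n;

		if (want > SKETCH_CHUNK)
			want = SKETCH_CHUNK;

		/* Send one chunk, coping with partial writes. */
		for (done = 0; done < want; done += (size_t)n) {
			n = write(wfd, data + off + done, want - done);
			if (n <= 0)
				return (-1);
		}

		/* Read the same number of bytes back. */
		for (done = 0; done < want; done += (size_t)n) {
			n = read(rfd, echo + done, want - done);
			if (n <= 0)
				return (-1);
		}

		if (memcmp(echo, data + off, want) != 0)
			return (-1);
		off += want;
	}
	return (0);
}
```

For a dry run, both ends of a pipe(2) can stand in for the modem. On the real hardware, wfd and rfd would presumably be the same descriptor for the serial device, with the modem's loopback mode switched on beforehand.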
Re: Realtek 8111e on 8.2-RELEASE
On Sat, Jun 04, 2011 at 01:56:15AM +0800, Ken Chen wrote:
> Hello,
>
> Anyone tried Realtek 8111e on 8.2-RELEASE? Realtek8111e should be
> supported, but I got following message:
>
> re0: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet port 0xe800-0xe8ff mem 0xfdfff000-0xfdff, 0xfdff8000-0xfdffbfff irq 17 at device 0.0 on pci2
> re0: Using 1 MSI messages
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x
> re0: Unknown H/W revision: 0%2c80
> device_attach: re0 attach returned 6
>
> Any suggestion? Thanks a lot!

Updating to the latest stable/8 will solve the issue. Alternatively:

1. Download the following two files:
   http://people.freebsd.org/~yongari/re/8.2R/if_re.c
   http://people.freebsd.org/~yongari/re/8.2R/if_rlreg.h
2. Copy the downloaded if_re.c to the /usr/src/sys/dev/re directory.
3. Copy the downloaded if_rlreg.h to the /usr/src/sys/pci directory.
4. Rebuild the kernel and reboot.
Re: Fileserver panic - FreeBSD 8.1-stable and zfs
Update:

On Thu, 02 Jun 2011 16:09:21 -0700 Jeremy Chadwick free...@jdc.parodius.com wrote:
> Anyway, your system has 4GB of RAM installed in it, so on 8.1-STABLE
> I'd recommend you try these settings:
>
> vm.kmem_size=3584M
> vm.kmem_size_max=3584M
> vfs.zfs.arc_max=2048M
>
> Now, these are all chosen by me off the top of my head with absolutely
> ZERO knowledge of what the memory usage on this system is like
> **without** ZFS in the picture. I'm making a lot of assumptions, and
> I'm assuming worst-case scenarios. For example, if this machine also
> runs mysqld and it's tuned to take up a lot of memory, I would advocate
> dropping vfs.zfs.arc_max to 1536M or 1024M. Please don't drop it too
> much; ZFS performs best when it has lots of ARC.
>
> Since you mentioned going to 8.2-STABLE, all you need to tune on that
> version is one single tunable:
>
> vfs.zfs.arc_max

The machine now runs 8.2-STABLE:

root@kg-f2# uname -a
FreeBSD kg-f2.kg4.no 8.2-STABLE FreeBSD 8.2-STABLE #5: Fri Jun 3 17:20:39 CEST 2011 r...@kg-f2.kg4.no:/usr/obj/usr/src/sys/GENERIC amd64

And I have added the following to /boot/loader.conf:

vfs.zfs.arc_max=2048M

Hopefully, this will keep the machine rock solid (until something else happens, at least).

Oh, and thanks for your advice; it was really helpful.

-- 
Torfinn