Re: HAST instability
Here goes the second run, without checksums.

systat -if

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average

      Interface           Traffic               Peak                Total
            lo0  in      0.000 KB/s         71.666 KB/s          361.825 KB
                 out     0.000 KB/s         71.666 KB/s          361.825 KB

            ix1  in      0.021 KB/s        816.608 MB/s          625.751 GB
                 out     0.016 KB/s          7.384 MB/s           23.032 GB

           igb0  in      0.025 KB/s          1.507 KB/s           11.547 MB
                 out     0.069 KB/s          1.765 KB/s           17.140 MB

This time it managed to achieve 800MB/s, wow! Anyway, I have no idea when this happened, as during my observation it didn't manage to push much data, due to frequent disconnects. The typical "good" rate was lower than with checksums, just over 100MB/s.

from primary
messages: http://news.digsys.bg/~admin/hast/test31may-2/b1a-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may-2/b1a-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may-2/b1a-netstat-s

from secondary
messages: http://news.digsys.bg/~admin/hast/test31may-2/b1b-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may-2/b1b-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may-2/b1b-netstat-s

Daniel
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: [ZFSv28]gtpzfsboot fails to boot ZFSv28 0511
On Tue, May 31, 2011 at 08:15:35PM -0500, Zhihao Yuan wrote:
> I met this problem, which is serious. I need some help to recover
> the system; after that I'll show the photos of the error screen.
>
> I used the ZFSv28 patch maintained by mm@ before, and I have a
> backed-up working kernel. I need a LiveCD/memstick to boot the system
> and recover it. But after I burned the 9.0-current image to memstick,
> I found that it keeps giving me kernel panic when booting! How can I
> find a LiveFS with ZFSv28 support? Thanks.

The closest thing I can think of is this:

http://mfsbsd.vx.sk/

Except:

1) The ISOs there don't claim to be "LiveFS"; I don't know if they are.
2) There's no memory stick image available, only ISOs.
3) They're 8.2-RELEASE with ZFSv28 patches, not 9.0-CURRENT. I don't
   know the implications of this.

Best to ask mm@.

-- 
| Jeremy Chadwick                                   j...@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.               PGP 4BD6C0CB |
[ZFSv28]gtpzfsboot fails to boot ZFSv28 0511
Hi,

I met this problem, which is serious. I need some help to recover the system; after that I'll show the photos of the error screen.

I used the ZFSv28 patch maintained by mm@ before, and I have a backed-up working kernel. I need a LiveCD/memstick to boot the system and recover it. But after I burned the 9.0-current image to memstick, I found that it keeps giving me kernel panic when booting! How can I find a LiveFS with ZFSv28 support? Thanks.

-- 
Zhihao Yuan, nickname lichray
The best way to predict the future is to invent it.
___
4BSD -- http://4bsd.biz/
Re: ICH9 panic/instability on recent kernel
On Sun, 29 May 2011, Alexander Motin wrote:

>> If I csup the most recent kernel sources, I get the same problem. However,
>> if, after csuping the latest kernel sources, I then fetch the version of
>> sys/dev/ata/ata-all.c as of April 27, everything works fine. Here's the
>> output of pciconf -l:
>
> The only change in 8-STABLE ata-all.c since April 27 was the SVN rev 221155.
> But I don't see how it can cause problems. I would really like to see full
> _verbose_ dmesg output to better understand what is going on there. If it
> even panics, I need to see how exactly.

I agree that it makes little sense on the surface. I did follow Jeremy's advice and enable AHCI in the BIOS. Even without loading the AHCI module at boot, that still solved the problem. (I also did load AHCI and it worked fine, with the DVD being recognized as a scsi/atapi-style cd0.)

I'll turn off AHCI again and try to get a serial port hooked up so that I can do a boot -v and generate the panic (usually a little bit of disk activity will cause a panic). I'll send that in a separate email to you directly.

thanks,
michael
Re: PCIe SATA HBA for ZFS on -STABLE
On Tue, May 31, 2011 at 7:31 AM, Freddie Cash wrote:
> On Tue, May 31, 2011 at 5:48 AM, Matt Thyer wrote:
>
>> What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone
>> using ZFS ?
>>
>> Not wanting to break the bank.
>> Not interested in SATA III 6GB at this time... though it could be useful if
>> I add an SSD for... (is it ZIL ?).
>> Can this be added at any time ?
>>
>> The main issue is I need at least 10 ports total for all existing drives...
>> ZIL would require 11 so ideally we are talking a 6 port HBA.
>>
>
> SuperMicro AOC-USAS2-L8i works exceptionally well. These are 8-port HBAs
> using the LSI1068 chipset, supported by the mpt(4) driver. Support 3 Gbps
> SATA/SAS, using multi-lane cables (2 connectors on the card, each connector
> supports 4 SATA ports), hot-plug, hot-swap.
>
> These are UIO cards, so the bracket that comes with it doesn't work with
> normal cases (the bracket is on the wrong side of the card; they're made for
> SuperMicro's UIO-based motherboards). However, these are normal PCIe cards
> and work in any PCIe slot. You either have to remove the bracket, or you
> can purchase separate brackets online.
>
> These cards are recommended on the zfs-discuss mailing list. They are only
> ~$120 CDN at places like cdw.ca and newegg.ca.

+1 for LSI1068(e) controller + mpt driver. It's cheap and it works.

Those LSI controllers are often hiding behind other brands. SuperMicro mentioned above is one. Intel would be another -- search for Intel SASUC8I. Tyan also sells one as TYAN P3208SR. LSI-branded controllers tend to be a bit more expensive than rebranded ones, though functionality is the same and you can often cross-flash firmware.

Keep in mind that HBAs based on LSI1068(e) can't handle hard drives larger than 2TB and will truncate larger drive capacity to 2TB.

As for the SSD, you may want to hook it up to the on-board SATA ports. In my not very scientific benchmark, Intel's X25-M SSD connected to an on-board SATA port on ICH10 was able to deliver ~20% more reads/sec than the same SSD connected to an LSI1068-based controller.

--Artem
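
A rough illustration of the 2TB truncation mentioned above. This is only a sketch of the arithmetic, under the assumption (not stated in the thread) that the limit comes from 32-bit sector addressing combined with 512-byte sectors; the function names are made up for the example.

```python
# Assumption (not from the thread): the controller addresses sectors with a
# 32-bit LBA and the drive exposes 512-byte logical sectors.
LBA_BITS = 32
SECTOR_BYTES = 512


def max_addressable_bytes(lba_bits: int = LBA_BITS,
                          sector_bytes: int = SECTOR_BYTES) -> int:
    """Largest capacity reachable with the given LBA width and sector size."""
    return (2 ** lba_bits) * sector_bytes


def visible_capacity(drive_bytes: int) -> int:
    """Capacity that remains visible once the drive is truncated to the limit."""
    return min(drive_bytes, max_addressable_bytes())


if __name__ == "__main__":
    limit = max_addressable_bytes()
    print(f"limit: {limit} bytes ({limit / 2**40:.0f} TiB)")
    # A hypothetical 3 TB (decimal) drive behind such a controller:
    print(f"3 TB drive would appear as {visible_capacity(3 * 10**12)} bytes")
```

Under that assumption the ceiling is 2 TiB, i.e. about 2.2 TB in the decimal units drive vendors use, so a 2TB drive still fits while a 3TB drive would not.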
Re: HAST instability
On 31.05.11 17:08, Mikolaj Golub wrote:
> As I wrote privately, it would be nice to see both netstat and hast logs
> (from both nodes) for the same rather long period, when several cases
> occurred. It would be good to place them somewhere on web so other guys
> could access them too, as I will be offline for 7-10 days and will not be
> able to help you until I am back.

The test has finished after running for almost three hours, and here is the collected data:

(for the duration of test, on the secondary node)

systat -if

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average

      Interface           Traffic               Peak                Total
            lo0  in      0.000 KB/s          0.000 KB/s            1.126 KB
                 out     0.000 KB/s          0.000 KB/s            1.126 KB

            ix1  in      0.003 KB/s        230.590 MB/s          614.688 GB
                 out     0.054 KB/s          7.425 MB/s           19.910 GB

           igb0  in      0.025 KB/s          3.636 KB/s          566.897 KB
                 out     0.072 KB/s          4.296 KB/s            1.091 MB

The primary node is b1a, the secondary node is b1b.

kernel (built just after csup update):
FreeBSD b1a 8.2-STABLE FreeBSD 8.2-STABLE #1: Mon May 30 14:17:50 EEST 2011 root@b1a:/usr/obj/usr/src/sys/GENERIC amd64

from primary
messages: http://news.digsys.bg/~admin/hast/test31may/b1a-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may/b1a-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may/b1a-netstat-s

from secondary
messages: http://news.digsys.bg/~admin/hast/test31may/b1b-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may/b1b-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may/b1b-netstat-s

> DK> One additional note: while playing with this setup, I tried to
> DK> simulate local disk going away in the hope HAST will switch to using
> DK> the remote disk. Instead of asking someone at the site to pull out the
> DK> drive, I just issued on the primary
>
> DK> hastctl role init data0
>
> DK> which resulted in kernel panic. Unfortunately, there was no sufficient
> DK> dump space for 48GB. I will re-run this again with more drives for the
> DK> crash dump. Anything you want me to look for in particular? (kernels
> DK> have no KDB compiled in yet)
>
> Well, removing physical disk (device /dev/gpt/data0 consumed by hastd
> disappears) and switching a resource to init role (device /dev/hast/data0
> consumed by FS disappears) are two different things. Sure you should not
> normally change the resource role (destroy hast device) before unmounting
> (exporting) FS.

Then how do I proceed with a failed drive? Or a flaky drive that is still visible to the OS, that I want to remove from HAST and replace with a different one? How do I ask HAST to switch I/O to the secondary? Is there another way to get a drive out of HAST?

In any case, even if this is not an allowed operation, it should not panic.

I am now going to reboot and run the same tests without checksums.

Daniel
Re: modem support MT9234ZPX-PCIE-NV
On Monday, May 30, 2011 5:25:14 am Willy Offermans wrote:
> Hello John and FreeBSD friends,
>
> On Fri, May 27, 2011 at 10:43:34AM -0400, John Baldwin wrote:
> > On Friday, May 27, 2011 10:38:02 am Willy Offermans wrote:
> > > Dear John and FreeBSD friends,
> > >
> > > On Fri, May 27, 2011 at 08:05:56AM -0400, John Baldwin wrote:
> > > > On Thursday, May 26, 2011 4:58:37 pm Mike Tancsa wrote:
> > > > > On 5/26/2011 4:12 PM, John Baldwin wrote:
> > > > > >
> > > > > > Hmm, can you get 'pciconf -lb' output?
> > > > > >
> > > > > > Hmm, wow, I wonder how uart(4) works at all. It tries to reuse it's softc
> > > > > > structure in uart_bus_attach() that was setup in uart_bus_probe(). Since it
> > > > > > doesn't return 0 from its probe routine, that is forbidden. I guess it
> > > > > > accidentally works because of the hack where we call DEVICE_PROBE() again
> > > > > > to make sure the device description is correct.
> > > > >
> > > > >
> > > > > I think this is a similar card. Had it laying about for a while and
> > > > > popped it in. cu -l to it, attaches, but I am not able to interact with it.
> > > > >
> > > > > none3@pci0:5:0:0: class=0x070002 card=0x20282205 chip=0x015213a8 rev=0x02 hdr=0x00
> > > > >     vendor     = 'Exar Corp.'
> > > > >     device     = 'XR17C/D152 Dual PCI UART'
> > > > >     class      = simple comms
> > > > >     subclass   = UART
> > > > >     bar   [10] = type Memory, range 32, base 0xe895, size 1024, enabled
> > > > >
> > > > >
> > > > > NetBSD supposedly has support for this card
> > > >
> > > > Oh, hmm, looks like the clock has an unusual multiplier. Does it work if you
> > > > use 'cu -l -s 1200' to talk at 9600 for example? (In general use speed / 8
> > > > as the speed to '-s'.)
> > > >
> > > > Also, is your card a modem or a dual-port card?
> > > >
> > > > --
> > > > John Baldwin
> > >
> > > It is a modem.
> > >
> > > As suggested:
> > >
> > > kosmos# cu -l /dev/cuau0 -s 1200
> > > Stale lock on cuau0 PID=3642... overriding.
> > > Connected
> > > at&F
> > > OK
> > > atdt0045***
> > > NO DIALTONE
> >
> > Ok, try this updated patch. After this you should be able to use the correct
> > speed:
> >
> > Index: uart_bus_pci.c
> > ===
> > --- uart_bus_pci.c (revision 85)
> > +++ uart_bus_pci.c (working copy)
> > @@ -110,6 +110,8 @@ static struct pci_id pci_ns8250_ids[] = {
> >  { 0x1415, 0x950b, 0x, 0, "Oxford Semiconductor OXCB950 Cardbus 16950 UART",
> >      0x10, 16384000 },
> >  { 0x151f, 0x, 0x, 0, "TOPIC Semiconductor TP560 56k modem", 0x10 },
> > +{ 0x13a8, 0x0152, 0x2205, 0x2026, "MultiTech MultiModem ZPX", 0x10,
> > +    8 * DEFAULT_RCLK },
> >  { 0x9710, 0x9820, 0x1000, 1, "NetMos NM9820 Serial Port", 0x10 },
> >  { 0x9710, 0x9835, 0x1000, 1, "NetMos NM9835 Serial Port", 0x10 },
> >  { 0x9710, 0x9865, 0xa000, 0x1000, "NetMos NM9865 Serial Port", 0x10 },
> >
> > --
> > John Baldwin
>
> The structure you have provided in your magic line would also need
> some explanation. The data concerns the description of the chip and the
> card I guess and can be gained by `pciconf -lv`
>
> uart0@pci0:6:0:0: class=0x070002 card=0x20262205 chip=0x015213a8 rev=0x02 hdr=0x00
>     vendor     = 'Exar Corp.'
>     device     = 'XR17C/D152 Dual PCI UART'
>     class      = simple comms
>     subclass   = UART
>
>
> A more detailed explanation would not harm. The data 0x10 and
> 8 * DEFAULT_RCLK are still totally miraculous to me.

0x10 is the resource id for the first PCI BAR (rids for PCI device resources use the offset in PCI config space of the associated BAR). It would perhaps be more obvious if uart(4) and puc(4) used PCIR_BAR(0) rather than 0x10.

Bumping the clock by a multiple of 8 was based on looking at the change in NetBSD that Mike Tancsa pointed to and that you verified by noting that 'cu -s 1200' connected at 9600 (9600 / 1200 == 8).

One question though, would you be able to test the patch for puc(4) that I sent to Mike Tancsa to see if your modem works with puc(4)? The puc(4) patch is more general and if it works fine for your modem I'd rather just commit that.

-- 
John Baldwin
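
To make the two "magic" values above a bit more concrete, here is a small sketch of the arithmetic. It is not driver code. It assumes the usual 16550-style relation (line rate = reference clock / (16 * divisor)) and assumes DEFAULT_RCLK is the conventional 1.8432 MHz UART clock; the helper names are invented for the illustration.

```python
# Assumed value: the conventional 1.8432 MHz reference clock of 16550-style
# UARTs. The thread only says the real clock is 8 * DEFAULT_RCLK.
DEFAULT_RCLK = 1_843_200

# Offset of the first Base Address Register in PCI configuration space.
PCIR_BARS = 0x10


def pcir_bar(n: int) -> int:
    """Config-space offset of BAR n; each BAR register is 4 bytes wide."""
    return PCIR_BARS + n * 4


def divisor(assumed_rclk: int, requested_baud: int) -> int:
    """Divisor a driver would program if it believes the clock is assumed_rclk."""
    return assumed_rclk // (16 * requested_baud)


def line_rate(actual_rclk: int, div: int) -> int:
    """Baud rate that really appears on the wire for a programmed divisor."""
    return actual_rclk // (16 * div)


if __name__ == "__main__":
    actual = 8 * DEFAULT_RCLK
    # Unpatched: driver assumes DEFAULT_RCLK, user asks for 1200.
    d = divisor(DEFAULT_RCLK, 1200)
    print("unpatched, -s 1200 ->", line_rate(actual, d), "baud on the wire")
    # Patched: driver knows about the 8x clock, user asks for 9600.
    d = divisor(actual, 9600)
    print("patched,   -s 9600 ->", line_rate(actual, d), "baud on the wire")
    print("rid of first BAR:", hex(pcir_bar(0)))
```

With those assumptions both cases end up programming the same divisor, which is why 'cu -s 1200' on the unpatched driver talked to the modem at 9600.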
Re: PCIe SATA HBA for ZFS on -STABLE
On Tue, May 31, 2011 at 5:48 AM, Matt Thyer wrote:

> What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone
> using ZFS ?
>
> Not wanting to break the bank.
> Not interested in SATA III 6GB at this time... though it could be useful if
> I add an SSD for... (is it ZIL ?).
> Can this be added at any time ?
>
> The main issue is I need at least 10 ports total for all existing drives...
> ZIL would require 11 so ideally we are talking a 6 port HBA.
>

SuperMicro AOC-USAS2-L8i works exceptionally well. These are 8-port HBAs using the LSI1068 chipset, supported by the mpt(4) driver. Support 3 Gbps SATA/SAS, using multi-lane cables (2 connectors on the card, each connector supports 4 SATA ports), hot-plug, hot-swap.

These are UIO cards, so the bracket that comes with it doesn't work with normal cases (the bracket is on the wrong side of the card; they're made for SuperMicro's UIO-based motherboards). However, these are normal PCIe cards and work in any PCIe slot. You either have to remove the bracket, or you can purchase separate brackets online.

These cards are recommended on the zfs-discuss mailing list. They are only ~$120 CDN at places like cdw.ca and newegg.ca.

-- 
Freddie Cash
fjwc...@gmail.com
Re: HAST instability
On Tue, 31 May 2011 15:51:07 +0300 Daniel Kalchev wrote:

DK> On 30.05.11 21:42, Mikolaj Golub wrote:
>> DK> One strange thing is that there is never established TCP connection
>> DK> between both nodes:
>>
>> DK> tcp4       0      0 10.2.101.11.48939      10.2.101.12.8457       FIN_WAIT_2
>> DK> tcp4       0   1288 10.2.101.11.57008      10.2.101.12.8457       CLOSE_WAIT
>> DK> tcp4       0      0 10.2.101.11.46346      10.2.101.12.8457       FIN_WAIT_2
>> DK> tcp4       0  90648 10.2.101.11.13916      10.2.101.12.8457       CLOSE_WAIT
>> DK> tcp4       0      0 10.2.101.11.8457       *.*                    LISTEN
>>
>> It is normal. hastd uses the connections only in one direction so it calls
>> shutdown to close unused directions.

DK> So the TCP connections are all too short-lived that I can never see a
DK> single one in ESTABLISHED state? 10Gbit Ethernet is indeed fast, so
DK> this might well be possible...

No, the connections are persistent; only one (unused) direction of communication is closed. See shutdown(2) for further info.

>> I would like to look at full logs for some rather large period, with several
>> cases, from both primary and secondary (and be sure about synchronized
>> time).

DK> I have made sure clocks are synchronized and am currently running on
DK> freshly rebooted nodes (with two additional SATA drives at each node) --
DK> so far some interesting findings, like I get hash errors and
DK> disconnects much more frequent now. Will post when a bonnie++ run on
DK> the ZFS filesystem on top of the HAST resources finishes.

As I wrote privately, it would be nice to see both netstat and hast logs (from both nodes) for the same rather long period, when several cases occurred. It would be good to place them somewhere on web so other guys could access them too, as I will be offline for 7-10 days and will not be able to help you until I am back.

DK> One additional note: while playing with this setup, I tried to
DK> simulate local disk going away in the hope HAST will switch to using
DK> the remote disk. Instead of asking someone at the site to pull out the
DK> drive, I just issued on the primary

DK> hastctl role init data0

DK> which resulted in kernel panic. Unfortunately, there was no sufficient
DK> dump space for 48GB. I will re-run this again with more drives for the
DK> crash dump. Anything you want me to look for in particular? (kernels
DK> have no KDB compiled in yet)

Well, removing physical disk (device /dev/gpt/data0 consumed by hastd disappears) and switching a resource to init role (device /dev/hast/data0 consumed by FS disappears) are two different things. Sure you should not normally change the resource role (destroy hast device) before unmounting (exporting) FS.

-- 
Mikolaj Golub
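
For readers unfamiliar with shutdown(2), a minimal sketch of a half-closed connection follows. It is not hastd code: it uses a local socket pair rather than TCP, so it will not show up in netstat, and the function name is made up. On a real TCP connection the same call is what leaves the two ends in FIN_WAIT_2 and CLOSE_WAIT while data keeps flowing the other way.

```python
import socket


def half_close_demo():
    """Close one direction of a connected pair and keep using the other."""
    a, b = socket.socketpair()
    try:
        # "a" declares it will never send again; the a -> b direction is closed.
        a.shutdown(socket.SHUT_WR)
        # "b" sees end-of-stream on its receive side: recv() returns b"".
        eof = b.recv(16)
        # The b -> a direction is unaffected and stays usable indefinitely.
        b.sendall(b"still open")
        data = a.recv(16)
        return eof, data
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(half_close_demo())
```

The connection itself is long-lived; only the unused direction has been shut down.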
Re: PCIe SATA HBA for ZFS on -STABLE
Arecas work well. The ARC-1220 (8 ports) should do you; not the cheapest, but good support and performance.

Regards
Steve

- Original Message -
From: "Matt Thyer"
To:
Sent: Tuesday, May 31, 2011 1:48 PM
Subject: PCIe SATA HBA for ZFS on -STABLE

I'm not on the -STABLE list so please reply to me.

I'm using an Intel Core i3-530 on a Gigabyte H55M-D2H motherboard with 8 x 2TB drives & 2 x 1TB drives. The plan is to have the 1 TB drives in a zmirror and the 8 in a raidz2.

Now the Intel chipset has only 6 on board SATA II ports so ideally I'm looking for a non RAID SATA II HBA to give me 6 extra ports (4 min). Why 6 extra ? Well the case I'm using has 2 x eSATA ports so 6 would be ideal, 5 OK, and 4 the minimum I need to do the job.

So... What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone using ZFS ?

Not wanting to break the bank.
Not interested in SATA III 6GB at this time... though it could be useful if I add an SSD for... (is it ZIL ?).
Can this be added at any time ?

The main issue is I need at least 10 ports total for all existing drives... ZIL would require 11 so ideally we are talking a 6 port HBA.
PCIe SATA HBA for ZFS on -STABLE
I'm not on the -STABLE list so please reply to me.

I'm using an Intel Core i3-530 on a Gigabyte H55M-D2H motherboard with 8 x 2TB drives & 2 x 1TB drives. The plan is to have the 1 TB drives in a zmirror and the 8 in a raidz2.

Now the Intel chipset has only 6 on board SATA II ports so ideally I'm looking for a non RAID SATA II HBA to give me 6 extra ports (4 min). Why 6 extra ? Well the case I'm using has 2 x eSATA ports so 6 would be ideal, 5 OK, and 4 the minimum I need to do the job.

So... What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone using ZFS ?

Not wanting to break the bank.
Not interested in SATA III 6GB at this time... though it could be useful if I add an SSD for... (is it ZIL ?).
Can this be added at any time ?

The main issue is I need at least 10 ports total for all existing drives... ZIL would require 11 so ideally we are talking a 6 port HBA.
Re: HAST instability
On 30.05.11 21:42, Mikolaj Golub wrote:
> DK> One strange thing is that there is never established TCP connection
> DK> between both nodes:
>
> DK> tcp4       0      0 10.2.101.11.48939      10.2.101.12.8457       FIN_WAIT_2
> DK> tcp4       0   1288 10.2.101.11.57008      10.2.101.12.8457       CLOSE_WAIT
> DK> tcp4       0      0 10.2.101.11.46346      10.2.101.12.8457       FIN_WAIT_2
> DK> tcp4       0  90648 10.2.101.11.13916      10.2.101.12.8457       CLOSE_WAIT
> DK> tcp4       0      0 10.2.101.11.8457       *.*                    LISTEN
>
> It is normal. hastd uses the connections only in one direction so it calls
> shutdown to close unused directions.

So the TCP connections are all too short-lived that I can never see a single one in ESTABLISHED state? 10Gbit Ethernet is indeed fast, so this might well be possible...

> I suppose when checksum is enabled the bottleneck is cpu, the traffic rate
> is lower and the problem is not triggered.

I was thinking something like this. My later tests seem to suggest that this gets triggered when the network transfer rate is much higher than the disk transfer rate.

> "Hash mismatch" message suggests that actually you were using checksum
> then, weren't you?

Yes, this occurs only when checksums are enabled. Happens with both crc32 and sha256.

> I would like to look at full logs for some rather large period, with
> several cases, from both primary and secondary (and be sure about
> synchronized time).

I have made sure clocks are synchronized and am currently running on freshly rebooted nodes (with two additional SATA drives at each node) -- so far some interesting findings, like I get hash errors and disconnects much more frequently now. Will post when a bonnie++ run on the ZFS filesystem on top of the HAST resources finishes.

> Also, it might be worth checking that there is no network packet corruption
> (some strange things in netstat -di, netstat -s, may be copying large files
> via net and comparing checksums).

I will post these as well, however so far no indication of any network problems was seen, no interface errors etc. It might also be that the ix driver is not reporting such, of course.

One additional note: while playing with this setup, I tried to simulate local disk going away in the hope HAST will switch to using the remote disk. Instead of asking someone at the site to pull out the drive, I just issued on the primary

hastctl role init data0

which resulted in kernel panic. Unfortunately, there was no sufficient dump space for 48GB. I will re-run this again with more drives for the crash dump. Anything you want me to look for in particular? (kernels have no KDB compiled in yet)

Daniel
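
The "copy large files over the network and compare checksums" test suggested in this thread could be scripted roughly as below. This is only a sketch: the function names are invented here, and how the file gets copied between the nodes is left out. Each node would hash its own copy and the two digests would then be compared.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True if both files hash to the same SHA-256 digest."""
    return file_sha256(path_a) == file_sha256(path_b)
```

A mismatch after a plain copy between the two nodes would point at corruption somewhere below HAST (NIC, driver, cabling, memory) rather than in hastd itself.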
Re: zfs-root and "safe" atomic updates
on 27/05/2011 17:16 Arnaud Houdelette said the following:
> On Fri, 27 May 2011 14:41:54 +0300, Andriy Gapon wrote:
>> I am not aware of any plans to implement nextboot for zfs as it would
>> require at least some write support for zpool and there is none (for
>> boot code) at the moment.
>>
>
> Couldn't the loader use a bit flag in the loader sector ?

First, strictly speaking, the loader is an executable on a filesystem; there is no "loader sector". If we consider the earlier boot stages, various incarnations of boot2 like gptzfsboot or the non-MBR part of zfsboot, then it gets interesting for multi-disk configurations. FreeBSD has its view of disks, but BIOS (which is used for disk access during boot) has its own different view of disks. So it's hard (or impossible) to do an auto-magic thing here.

One option could be to force the user to use their superior knowledge of the system to explicitly specify which disk and which boot block should be used for nextboot-ish purposes. That, of course, would be prone to footshooting because of human nature. For example, one could specify a wrong disk, boot, see that nothing changed, realize the mistake, specify the correct disk, never clean out the nextboot-ish data on the wrong disk, change boot order months later and get badly hurt. But it could also be argued that that approach would be better than nothing, which is the case for ZFS at the moment.

> Nextboot (or something equivalent) missing is the sole thing keeping me from
> removing ufs boot partition for remote servers.
>
>>> What do you think ? How do you address the problem ?
>>
>> I have some patches that allow to boot a different loader or a kernel from a
>> different (non-bootfs) ZFS dataset:
>> http://lists.freebsd.org/pipermail/freebsd-fs/2010-July/008976.html
>> But that still requires access to zfs boot and/or loader command interface.
>
> Interesting though. Thanks.
> Does the mentioned patch still work with latest 8-stable loader ?

I've rebased the patch to the latest head:
http://people.freebsd.org/~avg/zfsboot.diff

> And do you still have to change vfs.root.mountfrom once currdev set ?

That should already be included in the patch.

-- 
Andriy Gapon
Re: ZFS I/O errors
On Tue, May 31, 2011 at 11:25:56AM +0200, Olaf Seibert wrote:
> On Mon 30 May 2011 at 12:19:10 -0500, Dan Nelson wrote:
> > The ZFS compression code will panic if it can't allocate the buffer needed
> > to store the compressed data, so that's unlikely to be your problem. The
> > only time I have seen an "illegal byte sequence" error was when trying to
> > copy raw disk images containing ZFS pools to different disks, and the
> > destination disk was a different size than the original. I wasn't even able
> > to import the pool in that case, though.
>
> Yet somehow some incorrect data got written, it seems. That never
> happened before, fortunately, even though we had crashes before that
> seemed to be related to ZFS running out of memory.
>
> > The zfs IO code overloads the EILSEQ error code and uses it as a "checksum
> > error" code. Returning that error for the same block on all disks is
> > definitely weird. Could you have run a partitioning tool, or some other
> > program that would have done direct writes to all of your component disks?
>
> I hope I would remember doing that if I did!
>
> > Your scrub is also a bit worrying - 24k checksum errors definitely shouldn't
> > occur during normal usage.
>
> It turns out that the errors are easy to provoke: they happen every time
> I do an ls of the affected directories. There were processes running
> that were likely to be trying to write to the same directories (the file
> system is exported over NFS), so in that case it is easy to imagine that
> the numbers rack up quickly.
>
> I moved those directories to the side, for the moment, but I haven't
> been able to delete them yet. The data is a bit bigger than we're able
> to backup so "just restoring a backup" isn't an easy thing to do.
> Possibly I could make a new filesystem in the same pool, if that would
> do the trick; it isn't more than 50% full but the affected one is the
> biggest filesystem in it.
>
> The end result of the scrub is as follows:
>
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed after 12h56m with 3 errors on Mon May 30 23:56:47 2011
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0 6.38K
>           raidz2    ONLINE       0     0 25.4K
>             da0     ONLINE       0     0     0
>             da1     ONLINE       0     0     0
>             da2     ONLINE       0     0     0
>             da3     ONLINE       0     0     0
>             da4     ONLINE       0     0     0
>             da5     ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>
>         tank/vol-fourquid-1:<0x0>
>         tank/vol-fourquid-1@saturday:<0x0>
>         /tank/vol-fourquid-1/.zfs/snapshot/saturday/backups/dumps/dump_usr_friday.dump
>         /tank/vol-fourquid-1/.zfs/snapshot/saturday/sverberne/CLEF-IP11/parts_abs+desc
>         /tank/vol-fourquid-1/.zfs/snapshot/sunday/sverberne/CLEF-IP11/parts_abs+desc
>         /tank/vol-fourquid-1/.zfs/snapshot/monday/sverberne/CLEF-IP11/parts_abs+desc

Mickael Maillot responded to this thread, pointing out that situations
like this could be caused by bad RAM. I admit that's a possibility; with
ZFS in use the most likely memory-utilising piece (meaning volume-wise)
of the system would be the ZFS ARC. I don't know if you'd necessarily
see things like sig11's on random daemons, etc. (it often depends on
where within the addressing range the bad DRAM chip would be
associated).

Can you rule out bad RAM by letting something like memtest86+ run for
12-24 hours? It's not a 100% infallible utility, but usually for simple
things, it will detect/report errors within the first 15-30 minutes.

Please keep in mind that even if you have ECC RAM, testing with
memtest86+ would be worthwhile. Single-bit errors are correctable by
ECC, while multi-bit aren't (but are detectable). "ChipKill" (see
Wikipedia please) might work around this problem, but I've never
personally used it (never seen it on any Intel systems I've used, only
AMD systems).

Finally, depending on what CPU model you have, northbridge problems
(older systems) or on-die MCH (newer CPUs, e.g. Core iX and recent Xeon)
problems could manifest themselves like this. However, in those
situations I'd imagine you'd be seeing a lot of other oddities on the
system and not limited to just ZFS. Newer systems which support MCA
(again see Wikipedia; Machine Check Architecture) would/should throw
MCEs which FreeBSD 8.x should absolutely notice/report (you'd see a lot
of nastigrams on the console).

I think that about does it for my ideas/blabbing on that topic.

-- 
| Jeremy Chadwick
Re: ZFS I/O errors
On Mon 30 May 2011 at 12:19:10 -0500, Dan Nelson wrote:
> The ZFS compression code will panic if it can't allocate the buffer needed
> to store the compressed data, so that's unlikely to be your problem. The
> only time I have seen an "illegal byte sequence" error was when trying to
> copy raw disk images containing ZFS pools to different disks, and the
> destination disk was a different size than the original. I wasn't even able
> to import the pool in that case, though.

Yet somehow some incorrect data got written, it seems. That never
happened before, fortunately, even though we had crashes before that
seemed to be related to ZFS running out of memory.

> The zfs IO code overloads the EILSEQ error code and uses it as a "checksum
> error" code. Returning that error for the same block on all disks is
> definitely weird. Could you have run a partitioning tool, or some other
> program that would have done direct writes to all of your component disks?

I hope I would remember doing that if I did!

> Your scrub is also a bit worrying - 24k checksum errors definitely shouldn't
> occur during normal usage.

It turns out that the errors are easy to provoke: they happen every time
I do an ls of the affected directories. There were processes running
that were likely to be trying to write to the same directories (the file
system is exported over NFS), so in that case it is easy to imagine that
the numbers rack up quickly.

I moved those directories to the side, for the moment, but I haven't
been able to delete them yet. The data is a bit bigger than we're able
to backup so "just restoring a backup" isn't an easy thing to do.
Possibly I could make a new filesystem in the same pool, if that would
do the trick; it isn't more than 50% full but the affected one is the
biggest filesystem in it.

The end result of the scrub is as follows:

  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 12h56m with 3 errors on Mon May 30 23:56:47 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0 6.38K
          raidz2    ONLINE       0     0 25.4K
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        tank/vol-fourquid-1:<0x0>
        tank/vol-fourquid-1@saturday:<0x0>
        /tank/vol-fourquid-1/.zfs/snapshot/saturday/backups/dumps/dump_usr_friday.dump
        /tank/vol-fourquid-1/.zfs/snapshot/saturday/sverberne/CLEF-IP11/parts_abs+desc
        /tank/vol-fourquid-1/.zfs/snapshot/sunday/sverberne/CLEF-IP11/parts_abs+desc
        /tank/vol-fourquid-1/.zfs/snapshot/monday/sverberne/CLEF-IP11/parts_abs+desc

-Olaf.
-- 
Pipe rene = new PipePicture(); assert(Not rene.GetType().Equals(Pipe));
Re: ZFS I/O errors
Hi,

2011/5/30 Olaf Seibert
> "My" FreeBSD system somehow rebooted itself last friday in the early
> hours in the morning, and since then /var/log/messages is full with
> messages like these:
>
> May 30 10:38:28 fourquid root: ZFS: zpool I/O failure, zpool=tank error=86
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da3 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da4 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da5 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da0 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da1 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da2 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da3 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da4 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da5 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da0 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da1 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da2 offset=278593631232 size=7680

Looks like memory errors to me; check your RAM with memtest.

Mickael
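For anyone wanting to check whether their own "checksum mismatch" messages hit the same block on every disk, as they appear to here, a small script along these lines can group the log lines by offset. This is only a sketch: the regular expression is modelled on the messages quoted in this thread (other FreeBSD or ZFS versions may format them differently), and the names MISMATCH_RE, group_mismatches and SAMPLE are made up for the example. The error=86 in the first message is EILSEQ on FreeBSD, the code Dan Nelson mentioned ZFS uses for checksum errors.

```python
import re
from collections import defaultdict

# Pattern modelled on the "checksum mismatch" lines quoted in this thread.
MISMATCH_RE = re.compile(
    r"ZFS: checksum mismatch, zpool=(?P<pool>\S+)\s+"
    r"path=(?P<path>\S+)\s+offset=(?P<offset>\d+)\s+size=(?P<size>\d+)"
)


def group_mismatches(lines):
    """Return {(pool, offset, size): set of device paths} for matching lines."""
    groups = defaultdict(set)
    for line in lines:
        m = MISMATCH_RE.search(line)
        if m:
            key = (m.group("pool"), int(m.group("offset")), int(m.group("size")))
            groups[key].add(m.group("path"))
    return dict(groups)


# Sample input copied from the messages quoted above (one pass of six disks).
SAMPLE = [
    "May 30 10:38:28 fourquid root: ZFS: zpool I/O failure, zpool=tank error=86",
    "May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da3 offset=278593630720 size=7680",
    "May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da4 offset=278593630720 size=7680",
    "May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da5 offset=278593630720 size=7680",
    "May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da0 offset=278593631232 size=7680",
    "May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da1 offset=278593631232 size=7680",
    "May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da2 offset=278593631232 size=7680",
]

if __name__ == "__main__":
    for (pool, offset, size), devs in sorted(group_mismatches(SAMPLE).items()):
        print(pool, offset, size, ",".join(sorted(devs)))
```

To run it against a real log, one could replace SAMPLE with open("/var/log/messages"). With the sample above it reports two offsets, 512 bytes apart, each seen on three of the six disks, so every member of the raidz2 vdev is failing its checksum at essentially the same location.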