Re: HAST instability

2011-05-31 Thread Daniel Kalchev

Here goes the second run, without checksums.

systat -if

                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average

      Interface           Traffic               Peak                Total
            lo0  in      0.000 KB/s         71.666 KB/s          361.825 KB
                 out     0.000 KB/s         71.666 KB/s          361.825 KB

            ix1  in      0.021 KB/s        816.608 MB/s          625.751 GB
                 out     0.016 KB/s          7.384 MB/s           23.032 GB

           igb0  in      0.025 KB/s          1.507 KB/s           11.547 MB
                 out     0.069 KB/s          1.765 KB/s           17.140 MB

This time it managed to reach 800 MB/s, wow! Anyway, I have no idea when
this happened: during my observation it didn't manage to push much data
because of the frequent disconnects. The typical "good" rate was lower
than with checksums, just over 100 MB/s.


from primary
messages: http://news.digsys.bg/~admin/hast/test31may-2/b1a-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may-2/b1a-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may-2/b1a-netstat-s

from secondary
messages: http://news.digsys.bg/~admin/hast/test31may-2/b1b-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may-2/b1b-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may-2/b1b-netstat-s

Daniel
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: [ZFSv28]gtpzfsboot fails to boot ZFSv28 0511

2011-05-31 Thread Jeremy Chadwick
On Tue, May 31, 2011 at 08:15:35PM -0500, Zhihao Yuan wrote:
> I ran into this problem, which is serious. I need some help to recover
> the system; after that I'll post photos of the error screen.
> 
> I used the ZFSv28 patch maintained by mm@ before, and I have a
> backed-up working kernel. I need a LiveCD/memstick to boot the system
> and recover it. But after I wrote the 9.0-CURRENT image to a memstick,
> I found that it keeps giving me a kernel panic when booting! Where can I
> find a LiveFS with ZFSv28 support? Thanks.

The closest thing I can think of is this:

http://mfsbsd.vx.sk/

Except:

1) The ISOs there don't claim to be "LiveFS"; I don't know if they are.
2) There's no memory stick image available, only ISOs.
3) They're 8.2-RELEASE with ZFSv28 patches, not 9.0-CURRENT.  I don't
   know the implications of this.

Best to ask mm@.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



[ZFSv28]gtpzfsboot fails to boot ZFSv28 0511

2011-05-31 Thread Zhihao Yuan
Hi,

I ran into this problem, which is serious. I need some help to recover
the system; after that I'll post photos of the error screen.

I used the ZFSv28 patch maintained by mm@ before, and I have a
backed-up working kernel. I need a LiveCD/memstick to boot the system
and recover it. But after I wrote the 9.0-CURRENT image to a memstick,
I found that it keeps giving me a kernel panic when booting! Where can I
find a LiveFS with ZFSv28 support? Thanks.

-- 
Zhihao Yuan, nickname lichray
The best way to predict the future is to invent it.
___
4BSD -- http://4bsd.biz/


Re: ICH9 panic/instability on recent kernel

2011-05-31 Thread Michael Sinatra



On Sun, 29 May 2011, Alexander Motin wrote:


>> If I csup the most recent kernel sources, I get the same problem.
>> However, if, after csuping the latest kernel sources, I then fetch
>> the version of sys/dev/ata/ata-all.c as of April 27, everything
>> works fine.  Here's the output of pciconf -l:
>
> The only change in 8-STABLE ata-all.c since April 27 was SVN rev 221155.
> But I don't see how it can cause problems. I would really like to see full
> _verbose_ dmesg output to better understand what is going on there. If it
> even panics, I need to see how exactly.


I agree that it makes little sense on the surface.  I did follow Jeremy's
advice and enable AHCI in the BIOS.  Even without loading the AHCI module
at boot, that still solved the problem.  (I also did load AHCI and it
worked fine, with the DVD being recognized as a scsi/atapi-style cd0.)


I'll turn off AHCI again and try to get a serial port hooked up so that I
can do a boot -v and generate the panic (usually a little bit of disk
activity will cause a panic).  I'll send that in a separate email to you
directly.


thanks,
michael




Re: PCIe SATA HBA for ZFS on -STABLE

2011-05-31 Thread Artem Belevich
On Tue, May 31, 2011 at 7:31 AM, Freddie Cash  wrote:
> On Tue, May 31, 2011 at 5:48 AM, Matt Thyer  wrote:
>
>> What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone
>> using ZFS ?
>>
>> Not wanting to break the bank.
>> Not interested in SATA III 6GB at this time... though it could be useful if
>> I add an SSD for... (is it ZIL ?).
>> Can this be added at any time ?
>>
>> The main issue is I need at least 10 ports total for all existing drives...
>> ZIL would require 11 so ideally we are talking a 6 port HBA.
>>
>
> SuperMicro AOC-USAS2-L8i works exceptionally well.  These are 8-port HBAs
> using the LSI1068 chipset, supported by the mpt(4) driver.  Support 3 Gbps
> SATA/SAS, using multi-lane cables (2 connectors on the card, each connector
> supports 4 SATA ports), hot-plug, hot-swap.
>
> These are UIO cards, so the bracket that comes with it doesn't work with
> normal cases (the bracket is on the wrong side of the card; they're made for
> SuperMicro's UIO-based motherboards).  However, these are normal PCIe cards
> and work in any PCIe slot.  You either have to remove the bracket, or you
> can purchase separate brackets online.
>
> These cards are recommended on the zfs-discuss mailing list.  They are only
> ~$120 CDN at places like cdw.ca and newegg.ca.

+1 for LSI1068(e) controller + mpt driver. It's cheap and it works.
Those LSI controllers are often hiding behind other brands. SuperMicro
mentioned above is one. Intel would be another -- search for Intel
SASUC8I. Tyan also sells one as TYAN P3208SR. LSI-branded controllers
tend to be a bit more expensive than rebranded ones, though
functionality is the same and you can often cross-flash firmware.

Keep in mind that HBAs based on LSI1068(e) can't handle hard drives
larger than 2TB and will truncate larger drive capacity to 2TB.
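
A rough sketch of where a 2TB ceiling of this kind usually comes from. It assumes (this is not stated above) that the controller addresses sectors with a 32-bit LBA and that sectors are 512 bytes; the function names are made up for illustration.

```python
# Assumption, not confirmed in this thread: 32-bit sector addressing
# and 512-byte sectors.
LBA_BITS = 32
SECTOR_BYTES = 512


def max_addressable_bytes(lba_bits=LBA_BITS, sector_bytes=SECTOR_BYTES):
    """Largest capacity reachable with the given LBA width."""
    return (2 ** lba_bits) * sector_bytes


def visible_capacity(drive_bytes):
    """Capacity the OS would see if larger drives are truncated."""
    return min(drive_bytes, max_addressable_bytes())


if __name__ == "__main__":
    limit = max_addressable_bytes()
    print(limit, "bytes =", limit / 2 ** 40, "TiB")
    # A drive sold as "3TB" (3 * 10**12 bytes) would be cut down to the limit.
    print(visible_capacity(3 * 10 ** 12))
    # A drive sold as "2TB" (2 * 10**12 bytes) fits under it.
    print(visible_capacity(2 * 10 ** 12))
```

Under these assumptions, drives sold as 2TB still fit, since 2 * 10**12 bytes is slightly below the 2 TiB limit.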

As for the SSD, you may want to hook it up to an on-board SATA port.
In my not very scientific benchmark, Intel's X25-M SSD connected to an
on-board SATA port on ICH10 was able to deliver ~20% more reads/sec
than the same SSD connected to an LSI1068-based controller.

--Artem


Re: HAST instability

2011-05-31 Thread Daniel Kalchev

On 31.05.11 17:08, Mikolaj Golub wrote:

> As I wrote privately, it would be nice to see both netstat and hast logs (from
> both nodes) for the same rather long period, when several cases occurred. It
> would be good to place them somewhere on the web so other guys could access them
> too, as I will be offline for 7-10 days and will not be able to help you until
> I am back.


The test finished after running for almost three hours, so here is the
collected data:


(for the duration of test, on the secondary node)
systat -if
                    /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
     Load Average

      Interface           Traffic               Peak                Total
            lo0  in      0.000 KB/s          0.000 KB/s            1.126 KB
                 out     0.000 KB/s          0.000 KB/s            1.126 KB

            ix1  in      0.003 KB/s        230.590 MB/s          614.688 GB
                 out     0.054 KB/s          7.425 MB/s           19.910 GB

           igb0  in      0.025 KB/s          3.636 KB/s          566.897 KB
                 out     0.072 KB/s          4.296 KB/s            1.091 MB


The primary node is b1a, the secondary node is b1b.
kernel (built just after csup update):

FreeBSD b1a 8.2-STABLE FreeBSD 8.2-STABLE #1: Mon May 30 14:17:50 EEST 
2011 root@b1a:/usr/obj/usr/src/sys/GENERIC  amd64


from primary
messages: http://news.digsys.bg/~admin/hast/test31may/b1a-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may/b1a-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may/b1a-netstat-s

from secondary
messages: http://news.digsys.bg/~admin/hast/test31may/b1b-messages
netstat -in: http://news.digsys.bg/~admin/hast/test31may/b1b-netstat-in
netstat -s: http://news.digsys.bg/~admin/hast/test31may/b1b-netstat-s


  DK>  One additional note: while playing with this setup, I tried to
  DK>  simulate local disk going away in the hope HAST will switch to using
  DK>  the remote disk. Instead of asking someone at the site to pull out the
  DK>  drive, I just issued on the primary

  DK>  hastctl role init data0

  DK>  which resulted in kernel panic. Unfortunately, there was no sufficient
  DK>  dump space for 48GB. I will re-run this again with more drives for the
  DK>  crash dump. Anything you want me to look for in particular? (kernels
  DK>  have no KDB compiled in yet)

> Well, removing a physical disk (the device /dev/gpt/data0 consumed by hastd
> disappears) and switching a resource to the init role (the device /dev/hast/data0
> consumed by the FS disappears) are two different things. Sure, you should not
> normally change the resource role (destroy the hast device) before unmounting
> (exporting) the FS.

Then how do I proceed with a failed drive? Or a flaky drive that is
still visible to the OS, that I want to remove from HAST and replace
with a different one? How do I ask HAST to switch I/O to the secondary?
Is there another way to get a drive out of HAST? In any case, even if this
is not an allowed operation, it should not panic.


I am now going to reboot and run the same tests without checksums.

Daniel



Re: modem support MT9234ZPX-PCIE-NV

2011-05-31 Thread John Baldwin
On Monday, May 30, 2011 5:25:14 am Willy Offermans wrote:
> Hello John and FreeBSD friends,
> 
> On Fri, May 27, 2011 at 10:43:34AM -0400, John Baldwin wrote:
> > On Friday, May 27, 2011 10:38:02 am Willy Offermans wrote:
> > > Dear John and FreeBSD friends,
> > > 
> > > On Fri, May 27, 2011 at 08:05:56AM -0400, John Baldwin wrote:
> > > > On Thursday, May 26, 2011 4:58:37 pm Mike Tancsa wrote:
> > > > > On 5/26/2011 4:12 PM, John Baldwin wrote:
> > > > > > 
> > > > > > Hmm, can you get 'pciconf -lb' output?
> > > > > > 
> > > > > > Hmm, wow, I wonder how uart(4) works at all.  It tries to reuse its softc
> > > > > > structure in uart_bus_attach() that was set up in uart_bus_probe().  Since it
> > > > > > doesn't return 0 from its probe routine, that is forbidden.   I guess it
> > > > > > accidentally works because of the hack where we call DEVICE_PROBE() again
> > > > > > to make sure the device description is correct.
> > > > > 
> > > > > 
> > > > > I think this is a similar card.  Had it laying about for a while and
> > > > > popped it in.  cu -l to it, attaches, but I am not able to interact with it.
> > > > > 
> > > > > none3@pci0:5:0:0:   class=0x070002 card=0x20282205 chip=0x015213a8
> > > > > rev=0x02 hdr=0x00
> > > > > vendor = 'Exar Corp.'
> > > > > device = 'XR17C/D152 Dual PCI UART'
> > > > > class  = simple comms
> > > > > subclass   = UART
> > > > > bar   [10] = type Memory, range 32, base 0xe895, size 1024, enabled
> > > > > 
> > > > > 
> > > > > NetBSD supposedly has support for this card
> > > > 
> > > > Oh, hmm, looks like the clock has an unusual multiplier.  Does it work if you
> > > > use 'cu -l -s 1200' to talk at 9600 for example?  (In general use speed / 8
> > > > as the speed to '-s'.)
> > > > 
> > > > Also, is your card a modem or a dual-port card?
> > > > 
> > > > -- 
> > > > John Baldwin
> > > 
> > > It is a modem.
> > > 
> > > As suggested:
> > > 
> > > kosmos# cu -l /dev/cuau0 -s 1200
> > > Stale lock on cuau0 PID=3642... overriding.
> > > Connected
> > > at&F
> > > OK
> > > atdt0045***
> > > NO DIALTONE
> > 
> > Ok, try this updated patch.  After this you should be able to use the correct
> > speed:
> > 
> > Index: uart_bus_pci.c
> > ===
> > --- uart_bus_pci.c  (revision 85)
> > +++ uart_bus_pci.c  (working copy)
> > @@ -110,6 +110,8 @@ static struct pci_id pci_ns8250_ids[] = {
> >  { 0x1415, 0x950b, 0x, 0, "Oxford Semiconductor OXCB950 Cardbus 16950 UART",
> >     0x10, 16384000 },
> >  { 0x151f, 0x, 0x, 0, "TOPIC Semiconductor TP560 56k modem", 0x10 },
> > +{ 0x13a8, 0x0152, 0x2205, 0x2026, "MultiTech MultiModem ZPX", 0x10,
> > +   8 * DEFAULT_RCLK },
> >  { 0x9710, 0x9820, 0x1000, 1, "NetMos NM9820 Serial Port", 0x10 },
> >  { 0x9710, 0x9835, 0x1000, 1, "NetMos NM9835 Serial Port", 0x10 },
> >  { 0x9710, 0x9865, 0xa000, 0x1000, "NetMos NM9865 Serial Port", 0x10 },
> > 
> > -- 
> > John Baldwin
> 
> The structure you have provided in your magic line would also need
> some explanation. The data concerns the description of the chip and the
> card I guess and can be gained by `pciconf -lv` 
> 
> uart0@pci0:6:0:0: class=0x070002 card=0x20262205 chip=0x015213a8 rev=0x02 hdr=0x00
> vendor = 'Exar Corp.'
> device = 'XR17C/D152 Dual PCI UART'
> class  = simple comms
> subclass   = UART
> 
> 
> A more detailed explanation would not harm. The data 0x10 and 
> 8 * DEFAULT_RCLK are still totally miraculous to me.

0x10 is the resource id for the first PCI BAR (rids for PCI device resources
use the offset in PCI config space of the associated BAR).  It would perhaps
be more obvious if uart(4) and puc(4) used PCIR_BAR(0) rather than 0x10.
Bumping the clock by a multiple of 8 was based on looking at the change in
NetBSD that Mike Tancsa pointed to and that you verified by noting that
'cu -s 1200' connected at 9600 (9600 / 1200 == 8).
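
The arithmetic behind both numbers can be sketched as follows. This is a simplified model, not the driver's code: it assumes the usual 16550 relation divisor = clock / (16 * baud) with no rounding or error checking, takes DEFAULT_RCLK to be the standard 1.8432 MHz UART reference clock, and uses made-up function names.

```python
DEFAULT_RCLK = 1843200   # assumed: standard 16550 reference clock, in Hz
CLOCK_MULTIPLIER = 8     # from the patch above: 8 * DEFAULT_RCLK


def pcir_bar(n):
    """Config-space offset of BAR n; BAR 0 lives at 0x10."""
    return 0x10 + 4 * n


def divisor(rclk, baud):
    """Divisor a driver would program if it believes the clock is rclk."""
    return rclk // (16 * baud)


def actual_baud(assumed_rclk, real_rclk, requested_baud):
    """Line speed produced when the driver's clock guess differs from reality."""
    return real_rclk // (16 * divisor(assumed_rclk, requested_baud))


if __name__ == "__main__":
    real = CLOCK_MULTIPLIER * DEFAULT_RCLK
    # Unpatched: driver assumes DEFAULT_RCLK, so asking for 1200 yields 9600.
    print(actual_baud(DEFAULT_RCLK, real, 1200))
    # Patched: driver knows the real clock, so asking for 9600 yields 9600.
    print(actual_baud(real, real, 9600))
    print(hex(pcir_bar(0)))
```

With the table entry in place the driver's idea of the clock matches the hardware, so the requested and actual speeds agree.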

One question though, would you be able to test the patch for puc(4) that I
sent to Mike Tancsa to see if your modem works with puc(4)?  The puc(4)
patch is more general and if it works fine for your modem I'd rather just
commit that.

-- 
John Baldwin


Re: PCIe SATA HBA for ZFS on -STABLE

2011-05-31 Thread Freddie Cash
On Tue, May 31, 2011 at 5:48 AM, Matt Thyer  wrote:

> What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone
> using ZFS ?
>
> Not wanting to break the bank.
> Not interested in SATA III 6GB at this time... though it could be useful if
> I add an SSD for... (is it ZIL ?).
> Can this be added at any time ?
>
> The main issue is I need at least 10 ports total for all existing drives...
> ZIL would require 11 so ideally we are talking a 6 port HBA.
>

SuperMicro AOC-USAS2-L8i works exceptionally well.  These are 8-port HBAs
using the LSI1068 chipset, supported by the mpt(4) driver.  Support 3 Gbps
SATA/SAS, using multi-lane cables (2 connectors on the card, each connector
supports 4 SATA ports), hot-plug, hot-swap.

These are UIO cards, so the bracket that comes with it doesn't work with
normal cases (the bracket is on the wrong side of the card; they're made for
SuperMicro's UIO-based motherboards).  However, these are normal PCIe cards
and work in any PCIe slot.  You either have to remove the bracket, or you
can purchase separate brackets online.

These cards are recommended on the zfs-discuss mailing list.  They are only
~$120 CDN at places like cdw.ca and newegg.ca.

-- 
Freddie Cash
fjwc...@gmail.com


Re: HAST instability

2011-05-31 Thread Mikolaj Golub

On Tue, 31 May 2011 15:51:07 +0300 Daniel Kalchev wrote:

 DK> On 30.05.11 21:42, Mikolaj Golub wrote:
 >>   DK>  One strange thing is that there is never established TCP connection
 >>   DK>  between both nodes:
 >>
 >>   DK>  tcp4       0      0 10.2.101.11.48939  10.2.101.12.8457   FIN_WAIT_2
 >>   DK>  tcp4       0   1288 10.2.101.11.57008  10.2.101.12.8457   CLOSE_WAIT
 >>   DK>  tcp4       0      0 10.2.101.11.46346  10.2.101.12.8457   FIN_WAIT_2
 >>   DK>  tcp4       0  90648 10.2.101.11.13916  10.2.101.12.8457   CLOSE_WAIT
 >>   DK>  tcp4       0      0 10.2.101.11.8457   *.*                LISTEN
 >>
 >> It is normal. hastd uses the connections only in one direction so it calls
 >> shutdown to close unused directions.
 DK> So the TCP connections are all too short-lived that I can never see a
 DK> single one in ESTABLISHED state? 10Gbit Ethernet is indeed fast, so
 DK> this might well be possible...

No, the connections are persistent; only one (unused) direction of
communication is closed. See shutdown(2) for further info.
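
A minimal sketch of such a half-close, for illustration only. It is not hastd code: it uses a local socketpair as a stand-in for the TCP connection between the nodes, and the function name is made up.

```python
import socket


def half_close_demo():
    """Shut down one direction of a connection and keep using the other."""
    a, b = socket.socketpair()
    try:
        # a promises never to send again; only this direction is closed.
        a.shutdown(socket.SHUT_WR)
        # b sees end-of-file on the closed direction ...
        eof = b.recv(16)
        # ... but the opposite direction still carries data.
        b.sendall(b"still open")
        data = a.recv(16)
        return eof, data
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(half_close_demo())
```

On a real TCP connection this kind of one-sided shutdown is what leaves a long-lived socket in a state such as FIN_WAIT_2 or CLOSE_WAIT rather than ESTABLISHED.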

 >> I would like to look at full logs for some rather large period, with several
 >> cases, from both primary and secondary (and be sure about synchronized 
 >> time).
 DK> I have made sure clocks are synchronized and am currently running on
 DK> freshly rebooted nodes (with two additional SATA drives at each node) --
 DK> so far some interesting findings, like I get hash errors and
 DK> disconnects much more frequently now. Will post when a bonnie++ run on
 DK> the ZFS filesystem on top of the HAST resources finishes.

As I wrote privately, it would be nice to see both netstat and hast logs (from
both nodes) for the same rather long period, when several cases occurred. It
would be good to place them somewhere on the web so other guys could access them
too, as I will be offline for 7-10 days and will not be able to help you until
I am back.

 DK> One additional note: while playing with this setup, I tried to
 DK> simulate local disk going away in the hope HAST will switch to using
 DK> the remote disk. Instead of asking someone at the site to pull out the
 DK> drive, I just issued on the primary

 DK> hastctl role init data0

 DK> which resulted in kernel panic. Unfortunately, there was no sufficient
 DK> dump space for 48GB. I will re-run this again with more drives for the
 DK> crash dump. Anything you want me to look for in particular? (kernels
 DK> have no KDB compiled in yet)

Well, removing a physical disk (the device /dev/gpt/data0 consumed by hastd
disappears) and switching a resource to the init role (the device /dev/hast/data0
consumed by the FS disappears) are two different things. Sure, you should not
normally change the resource role (destroy the hast device) before unmounting
(exporting) the FS.

-- 
Mikolaj Golub


Re: PCIe SATA HBA for ZFS on -STABLE

2011-05-31 Thread Steven Hartland

Areca cards work well. The ARC-1220 (8 ports) should do you; not the cheapest, but
good support and performance.

   Regards
   Steve

- Original Message - 
From: "Matt Thyer" 

To: 
Sent: Tuesday, May 31, 2011 1:48 PM
Subject: PCIe SATA HBA for ZFS on -STABLE



I'm not on the -STABLE list so please reply to me.

I'm using an Intel Core i3-530 on a Gigabyte H55M-D2H motherboard with 8 x
2TB drives & 2 x 1TB drives.
The plan is to have the 1 TB drives in a zmirror and the 8 in a raidz2.

Now the Intel chipset has only 6 on board SATA II ports so ideally I'm
looking for a non RAID SATA II HBA to give me 6 extra ports (4 min).
Why 6 extra ?
Well the case I'm using has 2 x eSATA ports so 6 would be ideal, 5 OK, and 4
the minimum I need to do the job.

So...

What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone
using ZFS ?

Not wanting to break the bank.
Not interested in SATA III 6GB at this time... though it could be useful if
I add an SSD for... (is it ZIL ?).
Can this be added at any time ?

The main issue is I need at least 10 ports total for all existing drives...
ZIL would require 11 so ideally we are talking a 6 port HBA.







PCIe SATA HBA for ZFS on -STABLE

2011-05-31 Thread Matt Thyer
I'm not on the -STABLE list so please reply to me.

I'm using an Intel Core i3-530 on a Gigabyte H55M-D2H motherboard with 8 x
2TB drives & 2 x 1TB drives.
The plan is to have the 1 TB drives in a zmirror and the 8 in a raidz2.

Now the Intel chipset has only 6 on board SATA II ports so ideally I'm
looking for a non RAID SATA II HBA to give me 6 extra ports (4 min).
Why 6 extra ?
Well the case I'm using has 2 x eSATA ports so 6 would be ideal, 5 OK, and 4
the minimum I need to do the job.

So...

What do people recommend for 8-STABLE as a PCIe SATA II HBA for someone
using ZFS ?

Not wanting to break the bank.
Not interested in SATA III 6GB at this time... though it could be useful if
I add an SSD for... (is it ZIL ?).
Can this be added at any time ?

The main issue is I need at least 10 ports total for all existing drives...
ZIL would require 11 so ideally we are talking a 6 port HBA.


Re: HAST instability

2011-05-31 Thread Daniel Kalchev



On 30.05.11 21:42, Mikolaj Golub wrote:

  DK>  One strange thing is that there is never established TCP connection
  DK>  between both nodes:

  DK>  tcp4       0      0 10.2.101.11.48939  10.2.101.12.8457   FIN_WAIT_2
  DK>  tcp4       0   1288 10.2.101.11.57008  10.2.101.12.8457   CLOSE_WAIT
  DK>  tcp4       0      0 10.2.101.11.46346  10.2.101.12.8457   FIN_WAIT_2
  DK>  tcp4       0  90648 10.2.101.11.13916  10.2.101.12.8457   CLOSE_WAIT
  DK>  tcp4       0      0 10.2.101.11.8457   *.*                LISTEN

> It is normal. hastd uses the connections only in one direction so it calls
> shutdown to close unused directions.

So the TCP connections are all too short-lived that I can never see a
single one in ESTABLISHED state? 10Gbit Ethernet is indeed fast, so this
might well be possible...

> I suppose when checksum is enabled the bottleneck is cpu, the traffic rate is
> lower and the problem is not triggered.

I was thinking something like this. My later tests seem to suggest that
this gets triggered when the network transfer rate is much higher than
the disk transfer rate.



> "Hash mismatch" message suggests that actually you were using checksum then,
> weren't you?

Yes, this occurs only when checksums are enabled. Happens with both
crc32 and sha256.

> I would like to look at full logs for some rather large period, with several
> cases, from both primary and secondary (and be sure about synchronized time).

I have made sure clocks are synchronized and am currently running on
freshly rebooted nodes (with two additional SATA drives at each node) --
so far some interesting findings, like I get hash errors and
disconnects much more frequently now. Will post when a bonnie++ run on
the ZFS filesystem on top of the HAST resources finishes.

> Also, it might be worth checking that there is no network packet corruption (some
> strange things in netstat -di, netstat -s, maybe copying large files via net
> and comparing checksums).

I will post these as well; however, so far no indication of any network
problems was seen, no interface errors etc. It might also be that the ix
driver is not reporting such, of course.
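
The "copy a large file and compare checksums" check could look roughly like this. It is only a sketch: the helper names are made up, and the demo corrupts a local copy by flipping one bit instead of actually sending a file between the two nodes.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same SHA-256 value."""
    return sha256_of(path_a) == sha256_of(path_b)


def demo():
    """Compare a source file with an intact copy and a one-bit-corrupted copy."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "source.bin")
        good = os.path.join(d, "copy_ok.bin")
        bad = os.path.join(d, "copy_bad.bin")
        payload = os.urandom(4096)
        corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
        for path, data in ((src, payload), (good, payload), (bad, corrupted)):
            with open(path, "wb") as f:
                f.write(data)
        return same_content(src, good), same_content(src, bad)


if __name__ == "__main__":
    print(demo())
```

In practice one would run the hash on each node against the sent and received file and compare the two hex digests.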


One additional note: while playing with this setup, I tried to simulate 
local disk going away in the hope HAST will switch to using the remote 
disk. Instead of asking someone at the site to pull out the drive, I 
just issued on the primary


hastctl role init data0

which resulted in kernel panic. Unfortunately, there was no sufficient 
dump space for 48GB. I will re-run this again with more drives for the 
crash dump. Anything you want me to look for in particular? (kernels 
have no KDB compiled in yet)


Daniel


Re: zfs-root and "safe" atomic updates

2011-05-31 Thread Andriy Gapon
on 27/05/2011 17:16 Arnaud Houdelette said the following:
> On Fri, 27 May 2011 14:41:54 +0300, Andriy Gapon wrote:
>> I am not aware of any plans to implement nextboot for zfs as it would
>> require at least some write support for zpool and there is none (for
>> boot code) at the moment.
>>
> 
> Couldn't the loader use a bit flag in the loader sector ?

First, strictly speaking, the loader is an executable on a filesystem; there is
no "loader sector".  If we consider the earlier boot stages, various incarnations
of boot2 like gptzfsboot or the non-MBR part of zfsboot, then it gets interesting
for multi-disk configurations.  FreeBSD has its view of disks, but the BIOS (which
is used for disk access during boot) has its own different view of disks.  So it's
hard (or impossible) to do an auto-magic thing here.  One option could be to force
the user to apply their superior knowledge of the system and explicitly specify
which disk and which boot block should be used for nextboot-ish purposes.
That, of course, would be prone to footshooting because of human nature.  For
example, one could specify a wrong disk, boot, see that nothing changed, realize
the mistake, specify the correct disk, never clean out the nextboot-ish data on
the wrong disk, change the boot order months later and get badly hurt.  But it
could also be argued that that approach would be better than nothing, which is
the case for ZFS at the moment.

> Nextboot (or something equivalent) missing is the sole thing keeping me from
> removing ufs boot partition for remote servers.
> 
>>> What do you think ? How do you address the problem ?
>>
>> I have some patches that allow booting a different loader or a kernel from a
>> different (non-bootfs) ZFS dataset:
>> http://lists.freebsd.org/pipermail/freebsd-fs/2010-July/008976.html
>> But that still requires access to zfs boot and/or loader command interface.
> 
> Interesting though. Thanks.
> Does the mentioned patch still work with the latest 8-stable loader?

I've rebased the patch to the latest head:
http://people.freebsd.org/~avg/zfsboot.diff

> And do you still have to change vfs.root.mountfrom once currdev is set?

That should already be included in the patch.

-- 
Andriy Gapon


Re: ZFS I/O errors

2011-05-31 Thread Jeremy Chadwick
On Tue, May 31, 2011 at 11:25:56AM +0200, Olaf Seibert wrote:
> On Mon 30 May 2011 at 12:19:10 -0500, Dan Nelson wrote:
> > The ZFS compression code will panic if it can't allocate the buffer needed
> > to store the compressed data, so that's unlikely to be your problem.  The
> > only time I have seen an "illegal byte sequence" error was when trying to
> > copy raw disk images containing ZFS pools to different disks, and the
> > destination disk was a different size than the original.  I wasn't even able
> > to import the pool in that case, though.  
> 
> Yet somehow some incorrect data got written, it seems. That never
> happened before, fortunately, even though we had crashes before that
> seemed to be related to ZFS running out of memory.
> 
> > The zfs IO code overloads the EILSEQ error code and uses it as a "checksum
> > error" code.  Returning that error for the same block on all disks is
> > definitely weird.  Could you have run a partitioning tool, or some other
> > program that would have done direct writes to all of your component disks?
> 
> I hope I would remember doing that if I did!
> 
> > Your scrub is also a bit worrying - 24k checksum errors definitely shouldn't
> > occur during normal usage.
> 
> It turns out that the errors are easy to provoke: they happen every time
> I do an ls of the affected directories. There were processes running
> that were likely to be trying to write to the same directories (the file
> system is exported over NFS), so in that case it is easy to imagine that
> the numbers rack up quickly.
> 
> I moved those directories to the side, for the moment, but I haven't
> been able to delete them yet. The data is a bit bigger than we're able
> to back up, so "just restoring a backup" isn't an easy thing to do.
> Possibly I could make a new filesystem in the same pool, if that would
> do the trick; it isn't more than 50% full but the affected one is the
> biggest filesystem in it.
> 
> The end result of the scrub is as follows:
> 
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an error resulting in data
>         corruption.  Applications may be affected.
> action: Restore the file in question if possible.  Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: scrub completed after 12h56m with 3 errors on Mon May 30 23:56:47 2011
> config:
> 
>         NAME        STATE     READ WRITE CKSUM
>         tank        ONLINE       0     0 6.38K
>           raidz2    ONLINE       0     0 25.4K
>             da0     ONLINE       0     0     0
>             da1     ONLINE       0     0     0
>             da2     ONLINE       0     0     0
>             da3     ONLINE       0     0     0
>             da4     ONLINE       0     0     0
>             da5     ONLINE       0     0     0
> 
> errors: Permanent errors have been detected in the following files:
> 
>         tank/vol-fourquid-1:<0x0>
>         tank/vol-fourquid-1@saturday:<0x0>
>         /tank/vol-fourquid-1/.zfs/snapshot/saturday/backups/dumps/dump_usr_friday.dump
>         /tank/vol-fourquid-1/.zfs/snapshot/saturday/sverberne/CLEF-IP11/parts_abs+desc
>         /tank/vol-fourquid-1/.zfs/snapshot/sunday/sverberne/CLEF-IP11/parts_abs+desc
>         /tank/vol-fourquid-1/.zfs/snapshot/monday/sverberne/CLEF-IP11/parts_abs+desc

Mickael Maillot responded to this thread, pointing out that situations like
this could be caused by bad RAM.  I admit that's a possibility; with ZFS
in use, the largest consumer of memory (volume-wise) on the system would
most likely be the ZFS ARC.  I don't know if you'd necessarily see
things like sig11's on random daemons, etc. (it often depends on where
within the addressing range the bad DRAM chip is mapped).

Can you rule out bad RAM by letting something like memtest86+ run for
12-24 hours?  It's not a 100% infallible utility, but usually for simple
things, it will detect/report errors within the first 15-30 minutes.

Please keep in mind that even if you have ECC RAM, testing with
memtest86+ would be worthwhile.  Single-bit errors are correctable by
ECC, while multi-bit aren't (but are detectable).  "ChipKill" (see
Wikipedia please) might work around this problem, but I've never
personally used it (never seen it on any Intel systems I've used, only
AMD systems).
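
To illustrate the "single-bit correctable, multi-bit only detectable" point,
here is a minimal Python sketch of a SECDED code: an extended Hamming (8,4)
code protecting 4 data bits.  This is an assumption-laden toy, not what any
particular memory controller implements; real ECC memory typically protects
wider words, and the function names are made up for the example.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (bit list)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                      # index 0: overall parity; 1..7: Hamming
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # makes the total parity even
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    odd_parity = sum(code) % 2          # 1 means an odd number of flipped bits
    if syndrome == 0 and odd_parity == 0:
        status = "ok"
    elif odd_parity == 1:
        code[syndrome] ^= 1             # single error; syndrome 0 = parity bit
        status = "corrected"
    else:
        return "uncorrectable", None    # even number of flips: detect only
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)
one_flip = list(word)
one_flip[5] ^= 1
two_flips = list(one_flip)
two_flips[2] ^= 1
print(decode(word), decode(one_flip), decode(two_flips))
```

One flipped bit is repaired transparently; two flipped bits are noticed but
the data cannot be recovered, which is the case where you would expect the
hardware to report an error rather than hand back silently wrong data.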

Finally, depending on what CPU model you have, northbridge problems
(older systems) or on-die MCH (newer CPUs, e.g. Core iX and recent Xeon)
problems could manifest themselves like this.  However, in those
situations I'd imagine you'd be seeing a lot of other oddities on the
system and not limited to just ZFS.

Newer systems which support MCA (again see Wikipedia; Machine Check
Architecture) would/should throw MCEs which FreeBSD 8.x should absolutely
notice/report (you'd see a lot of nastigrams on the console).

I think that about does it for my ideas/blabbing on that topic.

-- 
| Jeremy Chadwick 

Re: ZFS I/O errors

2011-05-31 Thread Olaf Seibert
On Mon 30 May 2011 at 12:19:10 -0500, Dan Nelson wrote:
> The ZFS compression code will panic if it can't allocate the buffer needed
> to store the compressed data, so that's unlikely to be your problem.  The
> only time I have seen an "illegal byte sequence" error was when trying to
> copy raw disk images containing ZFS pools to different disks, and the
> destination disk was a different size than the original.  I wasn't even able
> to import the pool in that case, though.  

Yet somehow some incorrect data got written, it seems. That never
happened before, fortunately, even though we had crashes before that
seemed to be related to ZFS running out of memory.

> The zfs IO code overloads the EILSEQ error code and uses it as a "checksum
> error" code.  Returning that error for the same block on all disks is
> definitely weird.  Could you have run a partitioning tool, or some other
> program that would have done direct writes to all of your component disks?

I hope I would remember doing that if I did!

> Your scrub is also a bit worrying - 24k checksum errors definitely shouldn't
> occur during normal usage.

It turns out that the errors are easy to provoke: they happen every time
I do an ls of the affected directories. There were processes running
that were likely to be trying to write to the same directories (the file
system is exported over NFS), so in that case it is easy to imagine that
the numbers rack up quickly.

I moved those directories to the side, for the moment, but I haven't
been able to delete them yet. The data is a bit bigger than we're able
to back up, so "just restoring a backup" isn't an easy thing to do.
Possibly I could make a new filesystem in the same pool, if that would
do the trick; it isn't more than 50% full but the affected one is the
biggest filesystem in it.

The end result of the scrub is as follows:

  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: scrub completed after 12h56m with 3 errors on Mon May 30 23:56:47 2011
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0 6.38K
          raidz2    ONLINE       0     0 25.4K
            da0     ONLINE       0     0     0
            da1     ONLINE       0     0     0
            da2     ONLINE       0     0     0
            da3     ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        tank/vol-fourquid-1:<0x0>
        tank/vol-fourquid-1@saturday:<0x0>
        /tank/vol-fourquid-1/.zfs/snapshot/saturday/backups/dumps/dump_usr_friday.dump
        /tank/vol-fourquid-1/.zfs/snapshot/saturday/sverberne/CLEF-IP11/parts_abs+desc
        /tank/vol-fourquid-1/.zfs/snapshot/sunday/sverberne/CLEF-IP11/parts_abs+desc
        /tank/vol-fourquid-1/.zfs/snapshot/monday/sverberne/CLEF-IP11/parts_abs+desc


-Olaf.
-- 
Pipe rene = new PipePicture(); assert(Not rene.GetType().Equals(Pipe));


Re: ZFS I/O errors

2011-05-31 Thread Mickaël Maillot
Hi,

2011/5/30 Olaf Seibert 

> "My" FreeBSD system somehow rebooted itself last friday in the early
> hours in the morning, and since then /var/log/messages is full with
> messages like these:
>
> May 30 10:38:28 fourquid root: ZFS: zpool I/O failure, zpool=tank error=86
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da3 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da4 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da5 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da0 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da1 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da2 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da3 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da4 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da5 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da0 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da1 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank path=/dev/da2 offset=278593631232 size=7680
>

Looks like memory errors to me; check your RAM with memtest.
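
For anyone wanting to see how the mismatches in a log like the one quoted
above are distributed, a small Python sketch along these lines could tally
them per device and offset.  The regular expression is an assumption based
only on the line format shown in this thread, and summarize() is a made-up
helper name; it accepts the lines whether or not a mail client has wrapped
or ">"-quoted them.

```python
import re
from collections import Counter

# Assumed format, taken from the "checksum mismatch" lines quoted in this
# thread; \s+ lets a match continue across a wrapped line.
LINE_RE = re.compile(
    r"ZFS: checksum mismatch, zpool=(?P<pool>\S+)\s+"
    r"path=(?P<path>\S+)\s+offset=(?P<offset>\d+)\s+size=(?P<size>\d+)"
)


def summarize(log_text):
    """Count checksum mismatches per (device path, offset) pair."""
    # Strip mail quoting so wrapped continuation lines join up cleanly.
    cleaned = re.sub(r"^>\s?", "", log_text, flags=re.MULTILINE)
    counts = Counter()
    for match in LINE_RE.finditer(cleaned):
        counts[(match["path"], int(match["offset"]))] += 1
    return counts


sample = """\
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank
> path=/dev/da3 offset=278593630720 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank
> path=/dev/da0 offset=278593631232 size=7680
> May 30 10:38:38 fourquid root: ZFS: checksum mismatch, zpool=tank
> path=/dev/da3 offset=278593630720 size=7680
"""

for (path, offset), hits in sorted(summarize(sample).items()):
    print(path, offset, hits)
```

Run over the full log, a tally like this makes it easy to see whether the
errors keep hitting the same offsets on every disk or are spread around.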

Mickael