Re: [zfs-discuss] OpenStorage GUI

2008-11-12 Thread Andy Lubel
AFAIK, the drives are pretty much the same; it's the chipset that
changed, which also meant a change of CPU and memory.
 
-Andy



From: Tim [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 12, 2008 7:24 PM
To: Andy Lubel
Cc: Chris Greer; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] OpenStorage GUI


On Wed, Nov 12, 2008 at 2:38 PM, Andy Lubel <[EMAIL PROTECTED]> wrote:




I too would like to see how this happens; I checked with some Sun
people and they didn't know of a way to "upgrade" a 4500 other than
trading it in.  I'm assuming the motherboard/CPU/memory get swapped
out, and from the chassis layout it looks fairly involved.  We don't
want to "upgrade" something that we just bought so we can take
advantage of this software, which appears to finally complete the Sun
NAS picture with ZFS!

-Andy




Couldn't you just swap out the hard drives?

--Tim 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenStorage GUI

2008-11-12 Thread Andy Lubel
The word Module makes it sound really easy :)  Has anyone ever swapped
this module out, and if so - was it painful?

Since our 4500's went straight from the pallet to the offsite datacenter,
I never really got a chance to look closely at one.  I found a picture of
one, and it looks like you could take out the whole guts in one tray (from
the bottom rear?).

-Andy

-Original Message-
From: Chris Greer [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 12, 2008 3:57 PM
To: Andy Lubel; zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] OpenStorage GUI

I was hoping for a swap out of the system board module.  

Chris G.


- Original Message -
From: Andy Lubel <[EMAIL PROTECTED]>
To: Chris Greer; zfs-discuss@opensolaris.org

Sent: Wed Nov 12 14:38:03 2008
Subject: RE: [zfs-discuss] OpenStorage GUI

 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chris Greer
Sent: Wednesday, November 12, 2008 3:20 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] OpenStorage GUI

Do you have any info on this upgrade path?
I can't seem to find anything about this...

I would also like to throw in my $0.02 worth that I would like to see
the software offered to existing sun X4540 (or upgraded X4500)
customers.

Chris G.
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 

I too would like to see how this happens; I checked with some Sun people
and they didn't know of a way to "upgrade" a 4500 other than trading it
in.  I'm assuming the motherboard/CPU/memory get swapped out, and from
the chassis layout it looks fairly involved.  We don't want to "upgrade"
something that we just bought so we can take advantage of this software,
which appears to finally complete the Sun NAS picture with ZFS!

-Andy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenStorage GUI

2008-11-12 Thread Andy Lubel
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chris Greer
Sent: Wednesday, November 12, 2008 3:20 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] OpenStorage GUI

Do you have any info on this upgrade path?
I can't seem to find anything about this...

I would also like to throw in my $0.02 worth that I would like to see
the software offered to existing sun X4540 (or upgraded X4500)
customers.

Chris G.
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


-- 

I too would like to see how this happens; I checked with some Sun people
and they didn't know of a way to "upgrade" a 4500 other than trading it
in.  I'm assuming the motherboard/CPU/memory get swapped out, and from
the chassis layout it looks fairly involved.  We don't want to "upgrade"
something that we just bought so we can take advantage of this software,
which appears to finally complete the Sun NAS picture with ZFS!

-Andy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] OpenStorage GUI

2008-11-11 Thread Andy Lubel
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Bryan Cantrill
Sent: Tuesday, November 11, 2008 12:39 PM
To: Adam Leventhal
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] OpenStorage GUI

On Tue, Nov 11, 2008 at 09:31:26AM -0800, Adam Leventhal wrote:
> > Is this software available for people who already have thumpers?
> 
> We're considering offering an upgrade path for people with existing 
> thumpers. Given the feedback we've been hearing, it seems very likely 
> that we will. No word yet on pricing or availability.

Just to throw some ice-cold water on this:

  1.  It's highly unlikely that we will ever support the x4500 -- only
      the x4540 is a real possibility.

  2.  If we do make something available, your data and any custom
      software won't survive the journey:  you will be forced to
      fresh-install your x4540 with our stack.

  3.  If we do make something available, it will become an appliance:
      you will permanently lose the ability to run your own apps on
      the x4540.

  4.  If we do make something available, it won't be free.

If you are willing/prepared(/eager?) to abide by these constraints,
please let us ([EMAIL PROTECTED]) know -- that will help us build the
business case for doing this...

- Bryan


--
Bryan Cantrill, Sun Microsystems Fishworks.
http://blogs.sun.com/bmc
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


--



1. Can someone let me know who to talk to about upgrading our 4500's
(offline is fine)?  We bought them about 3 weeks before the new 4540
was announced and were pretty stung that our rep didn't tell us what was
on the horizon.  I'm an Apple user, so I'm pretty used to that anyway.  I
just didn't know there was a path for upgrading to what a 4540 is.

2. That's why we bought 2 and keep each half full :)

3. I like that; even better would be a way to install it without
dedicating spindles to the OS.

4. I am OK with value-added software being sold by Sun.  We don't mind
paying if it actually makes our job less complex each workday!

I'm going to give this VMware image a whirl and see what it's missing :)

-Andy

PS - sorry for the incorrect format for responding, I didn't have a real
email client available today.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun Storage 7000

2008-11-10 Thread Andy Lubel
LOL, I guess Sun forgot that they had xVM!  I wonder if you could use a
converter (VMware Converter) to make it work on VirtualBox, etc.?

I would also like to see this available as an upgrade to our 4500's.
The webconsole ZFS module just stinks, because it covers only a tiny
fraction of what a web-driven GUI needs to do.

Anyone know if something like that is in the works?  It looks like a
nice appliance for file shares in a corp network.

-Andy

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tom Buskey
Sent: Monday, November 10, 2008 3:40 PM
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Sun Storage 7000

What, no VirtualBox image?

This VMware image won't run on VMware Workstation 5.5 either :-(
--
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Ideal Setup: RAID-5, Areca, etc!

2008-07-26 Thread Andy Lubel
We have been using some 1068/1078-based cards (RAID: AOC-USAS-H4IR,
JBOD: LSISAS3801E) with b87-b90 and in s10u5 without issue for some
time.  Both the downloaded LSI driver and the bundled one have worked
fine for us over around 6 months of moderate usage.  The LSI JBOD card is
similar to the Sun SAS HBA card ;)

-Andy

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of James C.
McPherson
Sent: Saturday, July 26, 2008 8:18 AM
To: Miles Nordin
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Ideal Setup: RAID-5, Areca, etc!

Miles Nordin wrote:
>> "bh" == Brandon High <[EMAIL PROTECTED]> writes:
> 
> bh> a system built around the Marvell or LSI chipsets
> 
> according to The Blogosphere, source of all reliable information, 
> there's some issue with LSI, too.  The driver is not available in 
> stable Solaris nor OpenSolaris, or there are two drivers, or 
> something.  the guy is so upset, I can't figure out wtf he's trying to
> say:
> 
>   http://www.osnews.com/thread?317113

The driver for LSI's MegaRAID SAS card is "mega_sas" which was
integrated into snv_88. It's planned for backporting to a Solaris 10
update.

And I can't figure out what his beef is either.


James C. McPherson
--
Senior Kernel Software Engineer, Solaris Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SSD reliability, wear levelling, warranty period

2008-06-11 Thread Andy Lubel

On Jun 11, 2008, at 11:35 AM, Bob Friesenhahn wrote:

> On Wed, 11 Jun 2008, Al Hopper wrote:
>> disk drives.  But - based on personal observation - there is a lot of
>> hype surrounding SSD reliability.  Obviously the *promise* of this
>> technology is higher performance and *reliability* with lower power
>> requirements due to no (mechanical) moving parts.  But... if you look
>> broadly at the current SSD product offerings, you see: a) lower than
>> expected performance - particularly in regard to write IOPS (I/O Ops
>> per Second) and b) warranty periods that are typically 1 year - with
>> the (currently rare) exception of products that are offered with a 5
>> year warranty.
>
> Other than the fact that SSDs eventually wear out from use, SSDs are
> no different from any other electronic device in that the number of
> individual parts, and the individual reliability of those parts,
> results in an overall reliability factor for the subsystem comprised
> of those parts.  SSDs are jam-packed with parts.  In fact, if you were
> to look inside an SSD and then look at how typical computers are
> implemented these days, you will see that one SSD has a whole lot more
> complex parts than the rest of the computer.
>
> SSDs will naturally become more reliable as their parts count is
> reduced due to higher integration and product maturity.  Large SSD
> storage capacity requires more parts so large storage devices have
> less relability than smaller devices comprised of similar parts.
>
> SSDs are good for laptop reliability since hard drives tend to fail
> with high shock levels and laptops are often severely abused.

Yeah, I was going to add that they don't spin at 7k+ RPM and
have no 'moving' parts.  I do agree that there is a lot of circuitry
involved, and eventually they will reduce that just like they did with
mainboards.  Remember how packed those used to be?

Either way, I'm really interested in the vendor and technology Sun
will choose for providing these SSDs in systems or as an add-on
card/drive.

-Andy

>
>
> Bob
> ==
> Bob Friesenhahn
> [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs/nfs issue editing existing files

2008-06-09 Thread Andy Lubel

On Jun 9, 2008, at 12:28 PM, Andy Lubel wrote:

>
> On Jun 6, 2008, at 11:22 AM, Andy Lubel wrote:
>
>> That was it!
>>
>> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
>> nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
>> hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
>> nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
>> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
>> nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
>> hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
>> nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
>> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
>> nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
>> hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
>> nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
>> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
>>
>> It is too bad our silly hardware only allows us to go to 11.23.
>> That's OK though, in a couple months we will be dumping this server
>> with new x4600's.
>>
>> Thanks for the help,
>>
>> -Andy
>>
>>
>> On Jun 5, 2008, at 6:19 PM, Robert Thurlow wrote:
>>
>>> Andy Lubel wrote:
>>>
>>>> I've got a real doozie..   We recently implemented a b89 as zfs/
>>>> nfs/ cifs server.  The NFS client is HP-UX (11.23).
>>>> What's happening is when our dba edits a file on the nfs mount
>>>> with  vi, it will not save.
>>>> I removed vi from the mix by doing 'touch /nfs/file1' then 'echo
>>>> abc   > /nfs/file1' and it just sat there while the nfs servers cpu
>>>> went up  to 50% (one full core).
>>>
>>> Hi Andy,
>>>
>>> This sounds familiar: you may be hitting something I diagnosed
>>> last year.  Run snoop and see if it loops like this:
>>>
>>> 10920   0.00013 141.240.193.235 -> 141.240.193.27 NFS C GETATTR3
>>> FH=6614
>>> 10921   0.7 141.240.193.27 -> 141.240.193.235 NFS R GETATTR3 OK
>>> 10922   0.00017 141.240.193.235 -> 141.240.193.27 NFS C SETATTR3
>>> FH=6614
>>> 10923   0.7 141.240.193.27 -> 141.240.193.235 NFS R SETATTR3
>>> Update synch mismatch
>>> 10924   0.00017 141.240.193.235 -> 141.240.193.27 NFS C GETATTR3
>>> FH=6614
>>> 10925   0.00023 141.240.193.27 -> 141.240.193.235 NFS R GETATTR3 OK
>>> 10926   0.00026 141.240.193.235 -> 141.240.193.27 NFS C SETATTR3
>>> FH=6614
>>> 10927   0.9 141.240.193.27 -> 141.240.193.235 NFS R SETATTR3
>>> Update synch mismatch
>>>
>>> If you see this, you've hit what we filed as Sun bugid 6538387,
>>> "HP-UX automount NFS client hangs for ZFS filesystems".  It's an
>>> HP-UX bug, fixed in HP-UX 11.31.  The synopsis is that HP-UX gets
>>> bitten by the nanosecond resolution on ZFS.  Part of the CREATE
>>> handshake is for the server to send the create time as a 'guard'
>>> against almost-simultaneous creates - the client has to send it
>>> back in the SETATTR to complete the file creation.  HP-UX has only
>>> microsecond resolution in their VFS, and so the 'guard' value is
>>> not sent accurately and the server rejects it, lather rinse and
>>> repeat.  The spec, RFC 1813, talks about this in section 3.3.2.
>>> You can use NFSv2 in the short term until you get that update.
>>>
>>> If you see something different, by all means send us a snoop.
>
> Update:
>
> We tried NFSv2 and the speed was terrible, but the getattr/setattr
> issue was gone.  So what I'm looking at doing now is to create a raw
> volume, format it with UFS, mount it locally, then share it over
> NFS.  Luckily we will only have to do it this way for a few months; I
> don't like the extra layer, and the block device isn't as fast as we
> hoped (I get about 400MB/s on the ZFS filesystem and 180MB/s using the
> UFS-formatted local disk).  I sure hope I'm not breaking any
> rules by implementing this workaround that will come back to haunt me
> later.
>
> -Andy

Tried this today, and although things appear to function correctly, the
performance seems to be steadily degrading.  Am I getting burnt by
double-caching?  If so, what is the best way to work around my sad
situation?  I tried directio for the UFS volume and it made things even
worse.

The only other thing I know to do is destroy one of my ZFS pools and go
back to SVM until we can get some newer NFS clients writing to this
nearline.  It pains me deeply!!

TIA,

-Andy

>
>
>>>
>>>
>>> Rob T
>>
>> ___
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs/nfs issue editing existing files

2008-06-09 Thread Andy Lubel

On Jun 6, 2008, at 11:22 AM, Andy Lubel wrote:

> That was it!
>
> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
> nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
> hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
> nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
> nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
> hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
> nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
> nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
> hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
> nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
> hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
>
> It is too bad our silly hardware only allows us to go to 11.23.
> That's OK though, in a couple months we will be dumping this server
> with new x4600's.
>
> Thanks for the help,
>
> -Andy
>
>
> On Jun 5, 2008, at 6:19 PM, Robert Thurlow wrote:
>
>> Andy Lubel wrote:
>>
>>> I've got a real doozie..   We recently implemented a b89 as zfs/
>>> nfs/ cifs server.  The NFS client is HP-UX (11.23).
>>> What's happening is when our dba edits a file on the nfs mount
>>> with  vi, it will not save.
>>> I removed vi from the mix by doing 'touch /nfs/file1' then 'echo
>>> abc   > /nfs/file1' and it just sat there while the nfs servers cpu
>>> went up  to 50% (one full core).
>>
>> Hi Andy,
>>
>> This sounds familiar: you may be hitting something I diagnosed
>> last year.  Run snoop and see if it loops like this:
>>
>> 10920   0.00013 141.240.193.235 -> 141.240.193.27 NFS C GETATTR3
>> FH=6614
>> 10921   0.7 141.240.193.27 -> 141.240.193.235 NFS R GETATTR3 OK
>> 10922   0.00017 141.240.193.235 -> 141.240.193.27 NFS C SETATTR3
>> FH=6614
>> 10923   0.7 141.240.193.27 -> 141.240.193.235 NFS R SETATTR3
>> Update synch mismatch
>> 10924   0.00017 141.240.193.235 -> 141.240.193.27 NFS C GETATTR3
>> FH=6614
>> 10925   0.00023 141.240.193.27 -> 141.240.193.235 NFS R GETATTR3 OK
>> 10926   0.00026 141.240.193.235 -> 141.240.193.27 NFS C SETATTR3
>> FH=6614
>> 10927   0.9 141.240.193.27 -> 141.240.193.235 NFS R SETATTR3
>> Update synch mismatch
>>
>> If you see this, you've hit what we filed as Sun bugid 6538387,
>> "HP-UX automount NFS client hangs for ZFS filesystems".  It's an
>> HP-UX bug, fixed in HP-UX 11.31.  The synopsis is that HP-UX gets
>> bitten by the nanosecond resolution on ZFS.  Part of the CREATE
>> handshake is for the server to send the create time as a 'guard'
>> against almost-simultaneous creates - the client has to send it
>> back in the SETATTR to complete the file creation.  HP-UX has only
>> microsecond resolution in their VFS, and so the 'guard' value is
>> not sent accurately and the server rejects it, lather rinse and
>> repeat.  The spec, RFC 1813, talks about this in section 3.3.2.
>> You can use NFSv2 in the short term until you get that update.
>>
>> If you see something different, by all means send us a snoop.

Update:

We tried NFSv2 and the speed was terrible, but the getattr/setattr
issue was gone.  So what I'm looking at doing now is to create a raw
volume, format it with UFS, mount it locally, then share it over
NFS.  Luckily we will only have to do it this way for a few months; I
don't like the extra layer, and the block device isn't as fast as we
hoped (I get about 400MB/s on the ZFS filesystem and 180MB/s using the
UFS-formatted local disk).  I sure hope I'm not breaking any
rules by implementing this workaround that will come back to haunt me
later.
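
For the archives, here's roughly what I mean (a minimal sketch; the pool,
volume and mount names are made up, not our real ones):

# zfs create -V 500g pool0/nfsvol                    # carve a zvol out of the pool
# newfs /dev/zvol/rdsk/pool0/nfsvol                  # put UFS on the raw zvol device
# mkdir -p /export/nfsvol
# mount /dev/zvol/dsk/pool0/nfsvol /export/nfsvol    # mount the UFS filesystem locally
# share -F nfs -o rw /export/nfsvol                  # share it to the HP-UX client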

-Andy

>>
>>
>> Rob T
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs/nfs issue editing existing files

2008-06-06 Thread Andy Lubel
That was it!

hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3
nearline.host -> hpux-is-old.com NFS R GETATTR3 OK
hpux-is-old.com -> nearline.host NFS C SETATTR3 FH=F6B3
nearline.host -> hpux-is-old.com NFS R SETATTR3 Update synch mismatch
hpux-is-old.com -> nearline.host NFS C GETATTR3 FH=F6B3

It is too bad our silly hardware only allows us to go to 11.23.
That's OK though; in a couple of months we will be replacing this server
with new x4600's.

Thanks for the help,

-Andy


On Jun 5, 2008, at 6:19 PM, Robert Thurlow wrote:

> Andy Lubel wrote:
>
>> I've got a real doozie..   We recently implemented a b89 as zfs/ 
>> nfs/ cifs server.  The NFS client is HP-UX (11.23).
>> What's happening is when our dba edits a file on the nfs mount  
>> with  vi, it will not save.
>> I removed vi from the mix by doing 'touch /nfs/file1' then 'echo  
>> abc   > /nfs/file1' and it just sat there while the nfs servers cpu  
>> went up  to 50% (one full core).
>
> Hi Andy,
>
> This sounds familiar: you may be hitting something I diagnosed
> last year.  Run snoop and see if it loops like this:
>
> 10920   0.00013 141.240.193.235 -> 141.240.193.27 NFS C GETATTR3  
> FH=6614
> 10921   0.7 141.240.193.27 -> 141.240.193.235 NFS R GETATTR3 OK
> 10922   0.00017 141.240.193.235 -> 141.240.193.27 NFS C SETATTR3  
> FH=6614
> 10923   0.7 141.240.193.27 -> 141.240.193.235 NFS R SETATTR3  
> Update synch mismatch
> 10924   0.00017 141.240.193.235 -> 141.240.193.27 NFS C GETATTR3  
> FH=6614
> 10925   0.00023 141.240.193.27 -> 141.240.193.235 NFS R GETATTR3 OK
> 10926   0.00026 141.240.193.235 -> 141.240.193.27 NFS C SETATTR3  
> FH=6614
> 10927   0.9 141.240.193.27 -> 141.240.193.235 NFS R SETATTR3  
> Update synch mismatch
>
> If you see this, you've hit what we filed as Sun bugid 6538387,
> "HP-UX automount NFS client hangs for ZFS filesystems".  It's an
> HP-UX bug, fixed in HP-UX 11.31.  The synopsis is that HP-UX gets
> bitten by the nanosecond resolution on ZFS.  Part of the CREATE
> handshake is for the server to send the create time as a 'guard'
> against almost-simultaneous creates - the client has to send it
> back in the SETATTR to complete the file creation.  HP-UX has only
> microsecond resolution in their VFS, and so the 'guard' value is
> not sent accurately and the server rejects it, lather rinse and
> repeat.  The spec, RFC 1813, talks about this in section 3.3.2.
> You can use NFSv2 in the short term until you get that update.
>
> If you see something different, by all means send us a snoop.
>
> Rob T

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs/nfs issue editing existing files

2008-06-05 Thread Andy Lubel
Hello,

I've got a real doozie.  We recently implemented a b89 box as a
zfs/nfs/cifs server.  The NFS client is HP-UX (11.23).

What's happening is that when our DBA edits a file on the nfs mount with
vi, it will not save.

I removed vi from the mix by doing 'touch /nfs/file1' then 'echo abc
 > /nfs/file1', and it just sat there while the nfs server's cpu went up
to 50% (one full core).

This nfsstat is most troubling (I zeroed it and only tried to echo
data into a file, so these are the numbers for about 2 minutes before I
Ctrl-C'ed the echo command).

Version 3: (11242416 calls)
null            getattr         setattr         lookup          access          readlink
0 0%            5600958 49%     5600895 49%     19 0%           9 0%            0 0%
read            write           create          mkdir           symlink         mknod
0 0%            40494 0%        5 0%            0 0%            0 0%            0 0%
remove          rmdir           rename          link            readdir         readdirplus
3 0%            0 0%            0 0%            0 0%            0 0%            7 0%
fsstat          fsinfo          pathconf        commit
12 0%           0 0%            0 0%            14 0%


That's a lot of getattr and setattr!  Does anyone have any advice on
where I should start to figure out what is going on?  truss, dtrace,
snoop... so many choices!

Thanks,

-Andy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] SMC Webconsole 3.1 and ZFS Administration 1.0 - stacktraces in snv_b89

2008-05-29 Thread Andy Lubel


On May 29, 2008, at 9:52 AM, Jim Klimov wrote:

I've installed SXDE (snv_89) and found that the web console only  
listens on https://localhost:6789/ now, and the module for ZFS admin  
doesn't work.


It works out of the box without any special mojo.  To get
the webconsole to listen on something other than localhost, did you do
this?


# svccfg -s svc:/system/webconsole setprop options/tcp_listen = true
# svcadm disable svc:/system/webconsole
# svcadm enable svc:/system/webconsole

-Andy





When I open the link, the left frame lists a stacktrace (below) and  
the right frame is plain empty. Any suggestions?


I tried substituting different SUNWzfsgr and SUNWzfsgu packages from  
older Solarises (x86/sparc, snv_77/84/89, sol10u3/u4), and directly  
substituting the zfs.jar file, but these actions resulted in either  
the same error or crash-and-restart of SMC Webserver.


I didn't yet try installing older SUNWmco* packages (a 10u4
system with SMC 3.0.2 works OK); I'm not sure it's a good idea ;)


The system has JDK 1.6.0_06 by default, maybe that's the culprit?  I
tried setting it to JDK 1.5.0_15 and the zfs web-module refused to start
and register itself...



===
Application Error
com.iplanet.jato.NavigationException: Exception encountered during  
forward
Root cause = [java.lang.IllegalArgumentException: No enum const class
com.sun.zfs.common.model.AclInheritProperty$AclInherit.restricted]

Notes for application developers:

   * To prevent users from seeing this error message, override the  
onUncaughtException() method in the module servlet and take action  
specific to the application
   * To see a stack trace from this error, see the source for this  
page


Generated Thu May 29 17:39:50 MSD 2008
===

In fact, the traces in the logs are quite long (several screenfulls)  
and nearly the same; this one starts as:

===
com.iplanet.jato.NavigationException: Exception encountered during forward
Root cause = [java.lang.IllegalArgumentException: No enum const class
com.sun.zfs.common.model.AclInheritProperty$AclInherit.restricted]
   at com.iplanet.jato.view.ViewBeanBase.forward(ViewBeanBase.java:380)
   at com.iplanet.jato.view.ViewBeanBase.forwardTo(ViewBeanBase.java:261)
   at com.iplanet.jato.ApplicationServletBase.dispatchRequest(ApplicationServletBase.java:981)
   at com.iplanet.jato.ApplicationServletBase.processRequest(ApplicationServletBase.java:615)
   at com.iplanet.jato.ApplicationServletBase.doGet(ApplicationServletBase.java:459)
   at javax.servlet.http.HttpServlet.service(HttpServlet.java:690)
...
===


This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Disabling ZFS ACL

2008-05-28 Thread Andy Lubel
Did you try mounting with nfs version 3?

mount -o vers=3

On May 28, 2008, at 10:38 AM, kevin kramer wrote:

> that is my thread and I'm still having issues even after applying  
> that patch. It just came up again this week.
>
> [locahost] uname -a
> Linux dv-121-25.centtech.com 2.6.18-53.1.14.el5 #1 SMP Wed Mar 5  
> 11:37:38 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
> [localhost] cat /etc/issue
> CentOS release 5 (Final)
> Kernel \r on an \m
>
> [localhost: /n/scr20] touch test
> [localhost: /n/scr20] mv test /n/scr01/test/ ** this is a UFS mount  
> on FreeBSD
>
> mv: preserving permissions for `/n/scr01/test/test': Operation not  
> supported
> mv: preserving ACL for `/n/scr01/test/test': Operation not supported
> mv: preserving permissions for `/n/scr01/test/test': Operation not  
> supported
>
> If I move it to the local /tmp, I get no errors.
>
>
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS in S10U6 vs openSolaris 05/08

2008-05-27 Thread Andy Lubel

On May 27, 2008, at 1:44 PM, Rob Logan wrote:

>
>> There is something more to consider with SSDs uses as a cache device.
> why use SATA as the interface? perhaps
> http://www.tgdaily.com/content/view/34065/135/
> would be better? (no experience)

We are pretty happy with RAMSAN SSD's (ours is RAM based, not flash).

-Andy

>
>
> "cards will start at 80 GB and will scale to 320 and 640 GB next year.
> By the end of 2008, Fusion io also hopes to roll out a 1.2 TB  
> card.
> 160 parallel pipelines that can read data at 800 megabytes per second
> and write at 600 MB/sec 4K blocks and then streaming eight
> simultaneous 1 GB reads and writes.  In that test, the ioDrive
> clocked in at 100,000 operations per second...  beat $30 dollars a  
> GB,"
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Per-user home filesystems and OS-X Leopard anomaly

2008-05-21 Thread Andy Lubel

On May 21, 2008, at 11:15 AM, Bob Friesenhahn wrote:

> I encountered an issue that people using OS-X systems as NFS clients
> need to be aware of.  While not strictly a ZFS issue, it may be
> encounted most often by ZFS users since ZFS makes it easy to support
> and export per-user filesystems.  The problem I encountered was when
> using ZFS to create exported per-user filesystems and the OS-X
> automounter to perform the necessary mount magic.
>
> OS-X creates hidden ".DS_Store" directories in every directory which
> is accessed (http://en.wikipedia.org/wiki/.DS_Store).
>
> OS-X decided that it wanted to create the path "/home/.DS_Store" and
> it would not take `no' for an answer.  First it would try to create
> "/home/.DS_Store" and then it would try an alternate name.  Since the
> automounter was used, there would be an automount request for
> "/home/.DS_Store", which does not exist on the server so the mount
> request would fail.  Since OS-X does not take 'no' for an answer,
> there would be subsequent thousands of back to back mount requests.
> The end result was that 'mountd' was one of the top three resource
> consumers on my system, there would be bursts of high network traffic
> (1500 packets/second), and the affected OS-X system would operate
> more strangely than normal.
>
> The simple solution was to simply create a "/home/.DS_Store" directory
> on the server so that the mount request would succeed.

Did you try this?
http://support.apple.com/kb/HT1629
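
If I remember right, the relevant bit from that article is a client-side
setting (run on each OS-X box; I'm going from memory, so double-check it
against the article):

defaults write com.apple.desktopservices DSDontWriteNetworkStores true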

-Andy

>
>
> Bob
> ==
> Bob Friesenhahn
> [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
>
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?

2008-05-19 Thread Andy Lubel
The limitation existed in every Sun-branded Engenio array we tested -
2510, 2530, 2540, 6130, 6540.  This limitation is on volumes: you will not
be able to present a LUN larger than that magical 1.998TB.  I think it is a
combination of CAM and the firmware.  Can't do it with sscs either...

Warm and fuzzy: Sun engineers told me they would have a new release of CAM
(and firmware bundle) in late June which would "resolve" this limitation.

Or just do a ZFS (or even SVM) setup like Bob and I did.  It's actually pretty
nice because the traffic will split across both controllers, giving you
theoretically more throughput as long as MPxIO is functioning properly.  The
only (minor) downside is that parity is transmitted from the host to the disks
rather than living entirely on the controller.
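
To make that concrete, here's a rough sketch of the host-side layout (device
names are placeholders; the real ones are the long MPxIO/WWN names):

# stmsboot -e        # enable MPxIO so I/O spreads across both controller paths (reboot required)
# zpool create pool0 raidz c4t0d0 c4t1d0 c4t2d0 c4t3d0     # four sub-2TB LUNs owned by controller A
# zpool add pool0 raidz c4t4d0 c4t5d0 c4t6d0 c4t7d0        # four more owned by controller B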
 
-Andy
 


From: [EMAIL PROTECTED] on behalf of Torrey McMahon
Sent: Mon 5/19/2008 1:59 PM
To: Bob Friesenhahn
Cc: zfs-discuss@opensolaris.org; Kenny
Subject: Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?



Bob Friesenhahn wrote:
> On Mon, 19 May 2008, Kenny wrote:
>
>  
>> Bob M.- Thanks for the heads up on the 2 (1.998) TB LUN limit.
>> This has me a little concerned esp. since I have 1 TB drives being
>> delivered! Also thanks for the scsi cache flushing heads up, yet
>> another item to lookup!  
>>
>
> I am not sure if this LUN size limit really exists, or if it exists,
> in which cases it actually applies.  On my drive array, I created a
> 3.6TB RAID-0 pool with all 12 drives included during the testing
> process.  Unfortunately, I don't recall if I created a LUN using all
> the space.
>
> I don't recall ever seeing mention of a 2TB limit in the CAM user
> interface or in the documentation.

The Solaris LUN limit is gone if you're using Solaris 10 and recent patches.
The array limit(s) are tied to the type of array you're using. (Which
type is this again?)
CAM shouldn't be enforcing any limits of its own but only reporting back
when the array complains.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?

2008-05-16 Thread Andy Lubel

On May 16, 2008, at 10:04 AM, Robert Milkowski wrote:

> Hello James,
>
>
>>> 2) Does anyone have experiance with the 2540?
>
> JCM> Kinda. I worked on adding MPxIO support to the mpt driver so
> JCM> we could support the SAS version of this unit - the ST2530.
>
> JCM> What sort of experience are you after? I'ver never used one
> JCM> of these boxes in production - only ever for benchmarking and
> JCM> bugfixing :-) I think Robert Milkowski might have one or two
> JCM> of them, however.
>
>
> Yeah, I do have several of them (both 2530 and 2540).

We did a try-and-buy of the 2510, 2530 and 2540.

>
>
> 2530 (SAS) - cables tend to pop-out sometimes when you are around
> servers... then MPxIO does not work properly if you just hot-unplug
> and hot-replug the sas cable... there is still 2TB LUN size limit
> IIRC... other than that generally it is a good value

Yeah, the SFF-8088 connectors are a bit rigid and clumsy, but the
performance was better than everything else we tested in the 2500 series.

>
>
> 2540 (FC) - 2TB LUN size limit IIRC, other than that it is a good
> value array
>

Echo.  We like the 2540 as well, and will be buying lots of them  
shortly.


>
>
> -- 
> Best regards,
> Robert Milkowskimailto:[EMAIL PROTECTED]
>   http://milek.blogspot.com
>

-Andy


> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Image with DD from ZFS partition

2008-05-14 Thread Andy Lubel

On May 14, 2008, at 10:39 AM, Chris Siebenmann wrote:

> | Think what you are looking for would be a combination of a snapshot
> | and zfs send/receive, that would give you an archive that you can  
> use
> | to recreate your zfs filesystems on your zpool at will at later  
> time.
>
> Talking of using zfs send/recieve for backups and archives: the
> Solaris 10U4 zfs manpage contains some blood-curdling warnings about
> there being no cross-version compatability promises for the output
> of 'zfs send'. Can this be ignored in practice, or is it a real issue?

It's real!  You can't send and receive between versions of ZFS.

>
>
> (Speaking as a sysadmin, I certainly hope that it is a misplaced
> warning. Even ignoring backups and archives, imagine the fun if you
> cannot use 'zfs send | zfs receive' to move a ZFS filesystem from an
> old but reliable server running a stable old Solaris to your new, just
> installed server running the latest version of Solaris.)

If you use an external storage array attached via FC, iSCSI, SAS, etc., you
can just do a 'zpool export', disconnect the storage from the old
server, attach it to the new server, then run 'zpool import' - and then
do a 'zpool upgrade'.  Unfortunately this doesn't help the thumpers so
much :(
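
Something like this, roughly (pool name made up):

old-server# zpool export tank
   (physically move or re-zone the FC/iSCSI/SAS storage to the new server)
new-server# zpool import tank
new-server# zpool upgrade tank      # brings the pool up to the new release's on-disk version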



>
>
>   - cks
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-Andy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS cli for REMOTE Administration

2008-05-11 Thread Andy Lubel

I can now see the snapshots in the CLI if I do a mkdir from the Windows command
line, as opposed to getting stuck with a "New Folder", which seems to confuse
the CLI (maybe snapshots can't have spaces in the name?).

I was wondering if there is an easy way for us to script ZFS home filesystem
creation upon connection to an AD-joined CIFS server?  Samba had some cool
stuff with preexec, and I just wonder if something like that is available for
the kernel-mode CIFS driver.
-Andy

-Original Message-
From: [EMAIL PROTECTED] on behalf of Andy Lubel
Sent: Sun 5/11/2008 2:24 AM
To: Mark Shellenbaum; Paul B. Henson
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS cli for REMOTE Administration
 
Paul B. Henson wrote:
>> On Thu, 8 May 2008, Mark Shellenbaum wrote:
>> 
>>> we already have the ability to allow users to create/destroy snapshots
>>> over NFS.  Look at the ZFS delegated administration model.  If all you
>>> want is snapshot creation/destruction then you will need to grant
>>> "snapshot,mount,destroy" permissions.
>>>
>>> then on the NFS client mount go into .zfs/snapshot and do mkdir
>>> <snapname>.  Providing the user has the appropriate permission, the
>>> snapshot will be created.
>>>
>>> rmdir can be used to remove the snapshot.
>> 
>> Now that is just uber-cool.
>> 
>> Can you do that through the in kernel CIFS server too?
>> 
>> 

>Yes, it works over CIFS too.

>   -Mark

Great stuff!

I confirmed that it does work, but it's strange that I don't see the snapshot in
'zfs list' on the ZFS box.  Is that a bug or a feature?  I'm using XP - another
thing is that if you right-click in the .zfs/snapshot directory and do New ->
Folder, you will be stuck with a snapshot called "New Folder".  I couldn't
rename it, and the only way to delete it was to log into the machine and do a
lil 'rm -Rf'.  The good news is that it is snapshotting :)

I have a simple backup script where I use robocopy, and at the end I want
to do a 'mkdir .zfs/snapshot/xxx', but I would eventually want to delete the
oldest snapshot, similar to the zsnap.pl script floating around.

Can't wait to try this on NFS; the whole reason we objected to snapshots in the
first place in our org was that our admins didn't want to be involved with
the users in the routine of working with snapshots.

-Andy

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss





___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS cli for REMOTE Administration

2008-05-10 Thread Andy Lubel
Paul B. Henson wrote:
>> On Thu, 8 May 2008, Mark Shellenbaum wrote:
>> 
>>> we already have the ability to allow users to create/destroy snapshots
>>> over NFS.  Look at the ZFS delegated administration model.  If all you
>>> want is snapshot creation/destruction then you will need to grant
>>> "snapshot,mount,destroy" permissions.
>>>
>>> then on the NFS client mount go into .zfs/snapshot and do mkdir
>>> <snapname>.  Providing the user has the appropriate permission, the
>>> snapshot will be created.
>>>
>>> rmdir can be used to remove the snapshot.
>> 
>> Now that is just uber-cool.
>> 
>> Can you do that through the in kernel CIFS server too?
>> 
>> 

>Yes, it works over CIFS too.

>   -Mark

Great stuff!

I confirmed that it does work, but it's strange that I don't see the snapshot in
'zfs list' on the ZFS box.  Is that a bug or a feature?  I'm using XP - another
thing is that if you right-click in the .zfs/snapshot directory and do New ->
Folder, you will be stuck with a snapshot called "New Folder".  I couldn't
rename it, and the only way to delete it was to log into the machine and do a
lil 'rm -Rf'.  The good news is that it is snapshotting :)

I have a simple backup script where I use robocopy, and at the end I want
to do a 'mkdir .zfs/snapshot/xxx', but I would eventually want to delete the
oldest snapshot, similar to the zsnap.pl script floating around.
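
On the NFS side, the rotation part would be something like this minimal
sketch (the paths and retention count are made up):

#!/bin/sh
SNAPDIR=/net/nearline/pool0/backup/.zfs/snapshot   # wherever the dataset is mounted
KEEP=7

# take today's snapshot by creating a directory under .zfs/snapshot
mkdir "$SNAPDIR/backup-`date +%Y%m%d`"

# prune: names sort by date, so keep the newest $KEEP and rmdir the rest
ls -1 "$SNAPDIR" | grep '^backup-' | sort -r | tail +`expr $KEEP + 1` |
while read snap; do
        rmdir "$SNAPDIR/$snap"
done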

Can't wait to try this on NFS; the whole reason we objected to snapshots in the
first place in our org was that our admins didn't want to be involved with
the users in the routine of working with snapshots.
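
For reference, here's a minimal sketch of the delegation grant Mark describes
above (the user and dataset names are made up):

# zfs allow dba snapshot,mount,destroy pool0/home/dba
# zfs allow pool0/home/dba        # shows what has been delegated

and then from the client, against the mounted filesystem:

$ mkdir /home/dba/.zfs/snapshot/before-upgrade    # creates the snapshot
$ rmdir /home/dba/.zfs/snapshot/before-upgrade    # destroys it again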

-Andy

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS partition info makes my system not boot

2008-03-11 Thread Andy Lubel

On Mar 11, 2008, at 4:58 PM, Bart Smaalders wrote:

> Frank Bottone wrote:
>> I'm using the latest build of opensolaris express available from
>> opensolaris.org.
>>
>> I had no problems with the install (its an AMD64 x2 3800+, 1gb
>> physical ram, 1 ide drive for the os and 4*250GB sata drives attached
>> to the motherboard - nforce based chipset).
>>
>> I create a zfs pool on the 4 sata drives as a raidZ and the pool  
>> works
>> fine. If I reboot with any of the 4 drives connected the system hangs
>> right after all the drives are detected on the post screen. I need to
>> put them in a different system and zero them with dd in order to be
>> able to reconnect them to my server and still have the system boot
>> properly.
>>
>> Any ideas on how I can get around this? It seems like the onboard
>> system itself is getting confused by the metadata ZFS is adding to  
>> the
>> drive. The system already has the latest available bios from the
>> manufacturer - I'm not using any hardware raid of any sort.
>>
>
> This is likely the BIOS getting confused by the EFI label on the
> disks.  Since there's no newer BIOS available there are two ways
> around this problem: 1) put a normal label on the disk and
> give zfs slice 2, or 2)  don't have the BIOS do auto-detect on those
> drives.  Many BIOSs let you select None for the disk type; this will
> allow the system to boot.  Solaris has no problem finding the
> drives even w/o the BIOSs help...
>

See if a BIOS update is available as well?

> - Bart
>
> -- 
> Bart SmaaldersSolaris Kernel Performance
> [EMAIL PROTECTED] http://blogs.sun.com/barts
> "You will contribute more with mercurial than with thunderbird."
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Preferred backup s/w

2008-02-26 Thread Andy Lubel

On Feb 26, 2008, at 10:23 AM, Rich Teer wrote:

> On Tue, 26 Feb 2008, Joerg Schilling wrote:
>
>> Hi Rich, I asked you a question that you did not yet answer:
>
> Hi Jörg,
>
>> Are you interested only in full backups and in the ability to  
>> restore single
>> files from that type of backups?
>>
>> Or are you interested in incremental backups that _also_ allow you  
>> to reduce the
>> daily backup size but still gives you the ability to restore single  
>> files?
>
> Both: I'd like to be able to restore single files from both a full and
> incremental backup of a ZFS file system.

A zfs-aware NDMP daemon would be really neat.

>
>
> -- 
> Rich Teer, SCSA, SCNA, SCSECA, OGB member
>
> CEO,
> My Online Home Inventory
>
> URLs: http://www.rite-group.com/rich
>  http://www.linkedin.com/in/richteer
>  http://www.myonlinehomeinventory.com
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-Andy
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware RAID vs. ZFS RAID

2008-02-11 Thread Andy Lubel
> With my (COTS) LSI 1068 and 1078 based controllers I get consistently

> better performance when I export all disks as jbod (MegaCli - 
> CfgEachDskRaid0).
>
>   
>> Is that really 'all disks as JBOD'? or is it 'each disk as a single 
>> drive RAID0'?

single disk raid0:
./MegaCli -CfgEachDskRaid0 Direct -a0


>>It may not sound different on the surface, but I asked in another
thread 
>>and others confirmed, that if your RAID card has a battery backed
cache 
>>giving ZFS many single drive RAID0's is much better than JBOD (using
the 
>>'nocacheflush' option may even improve it more.)

>>My understanding is that it's kind of like the best of both worlds.
You 
>>get the higher number of spindles and vdevs for ZFS to manage, ZFS
gets 
>>to do the redundancy, and the the HW RAID Cache gives virtually
instant 
>>acknowledgement of writes, so that ZFS can be on it's way.

>>So I think many RAID0's is not always the same as JBOD. That's not to 
>>say that even True JBOD doesn't still have an advantage over HW RAID.
I 
>>don't know that for sure.

I have tried mixing hardware and ZFS RAID, but from a performance and
redundancy standpoint it just doesn't make sense to add those layers of
complexity.  In this case I'm building nearline storage, so there isn't
even a battery attached, and I have disabled any caching on the
controller.  I have a Sun SAS HBA on the way, which is what I would
ultimately use for my JBOD attachment.


>>But I think there is a use for HW RAID in ZFS configs which wasn't 
>>always the theory I've heard.
> I have really learned not to do it this way with raidz and raidz2:
>
> #zpool create pool2 raidz c3t8d0 c3t9d0 c3t10d0 c3t11d0 c3t12d0  
> c3t13d0 c3t14d0 c3t15d0
>   
>>Why? I know creating raidz's with more than 9-12 devices, but that 
>>doesn't cross that threshold.
>>Is there a reason you'd split 8 disks up into 2 groups of 4? What 
>>experience led you to this?
>>(Just so I don't have to repeat it. ;) )

I don't know why, but with most setups I have tested (8 and 16 drive
configs), dividing raidz into 4 disks per vdev (and 5 for a raidz2) performs
better.  Take a look at my simple dd test (filebench results as soon as
I can figure out how to get it working properly with Solaris 10).

=

8 SATA 500gb disk system with LSI 1068 (megaRAID ELP) - no BBU


-
bash-3.00# zpool history
History for 'pool0-raidz':
2008-02-11.16:38:13 zpool create pool0-raidz raidz c2t0d0 c2t1d0 c2t2d0
c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0

bash-3.00# zfs list
NAME  USED  AVAIL  REFER  MOUNTPOINT
pool0-raidz   117K  3.10T  42.6K  /pool0-raidz


bash-3.00# time dd if=/dev/zero of=/pool0-raidz/w-test.lo0 bs=8192
count=131072;time sync
131072+0 records in
131072+0 records out

real0m1.768s
user0m0.080s
sys 0m1.688s

real0m3.495s
user0m0.001s
sys 0m0.013s

bash-3.00# time dd if=/pool0-raidz/w-test.lo0
of=/pool0-raidz/rw-test.lo0 bs=8192; time sync
131072+0 records in
131072+0 records out

real0m6.994s
user0m0.097s
sys 0m2.827s

real0m1.043s
user0m0.001s
sys 0m0.013s

bash-3.00# time dd if=/dev/zero of=/pool0-raidz/w-test.lo1 bs=8192
count=655360;time sync
655360+0 records in
655360+0 records out

real0m24.064s
user0m0.402s
sys 0m8.974s

real0m1.629s
user0m0.001s
sys 0m0.013s

bash-3.00# time dd if=/pool0-raidz/w-test.lo1
of=/pool0-raidz/rw-test.lo1 bs=8192; time sync
655360+0 records in
655360+0 records out

real0m40.542s
user0m0.476s
sys 0m16.077s

real0m0.617s
user0m0.001s
sys 0m0.013s
bash-3.00# time dd if=/pool0-raidz/w-test.lo0 of=/dev/null bs=8192; time
sync
131072+0 records in
131072+0 records out

real0m3.443s
user0m0.084s
sys 0m1.327s

real0m0.013s
user0m0.001s
sys 0m0.013s

bash-3.00# time dd if=/pool0-raidz/w-test.lo1 of=/dev/null bs=8192; time
sync
655360+0 records in
655360+0 records out

real0m15.972s
user0m0.413s
sys 0m6.589s

real0m0.013s
user0m0.001s
sys 0m0.012s
---

bash-3.00# zpool history
History for 'pool0-raidz':
2008-02-11.17:02:16 zpool create pool0-raidz raidz c2t0d0 c2t1d0 c2t2d0
c2t3d0
2008-02-11.17:02:51 zpool add pool0-raidz raidz c2t4d0 c2t5d0 c2t6d0
c2t7d0

bash-3.00# zfs list
NAME  USED  AVAIL  REFER  MOUNTPOINT
pool0-raidz   110K  2.67T  36.7K  /pool0-raidz

bash-3.00# time dd if=/dev/zero of=/pool0-raidz/w-test.lo0 bs=8192
count=131072;time sync
131072+0 records in
131072+0 records out

real0m1.835s
user0m0.079s
sys 0m1.687s

real0m2.521s
user0m0.001s
sys 0m0.013s

bash-3.00# time dd if=/pool0-raidz/w-test.lo0
of=/pool0-raidz/rw-test.lo0 bs=8192; time sync
131072+0 records in
131072+0 records out

real0m2.376s
user0m0.084s
sys 0m2.291s

real0m2.578s
user0m0.001s
sys 0m0.013s

bash-3.00# time dd if=/dev/zero of=/pool0-raidz/w-test.lo1 bs=8192
count=655360;time sync
655360+0 records in
655360+0 records out

real0m19.531s
user0m0.404s
sys

Re: [zfs-discuss] Hardware RAID vs. ZFS RAID

2008-02-07 Thread Andy Lubel
With my (COTS) LSI 1068 and 1078 based controllers I get consistently  
better performance when I export all disks as jbod (MegaCli - 
CfgEachDskRaid0).

I even went through all the hoops with 6120's, 6130's and
even some SGI storage, and the result was always the same: better
performance exporting single disks than even the "ZFS" profiles within
CAM.

---
'pool0':
#zpool create pool0 mirror c2t0d0 c2t1d0
#zpool add pool0 mirror c2t2d0 c2t3d0
#zpool add pool0 mirror c2t4d0 c2t5d0
#zpool add pool0 mirror c2t6d0 c2t7d0

'pool2':
#zpool create pool2 raidz c3t8d0 c3t9d0 c3t10d0 c3t11d0
#zpool add pool2 raidz c3t12d0 c3t13d0 c3t14d0 c3t15d0


I have really learned not to do it this way with raidz and raidz2:

#zpool create pool2 raidz c3t8d0 c3t9d0 c3t10d0 c3t11d0 c3t12d0  
c3t13d0 c3t14d0 c3t15d0


So when is thumper going to have an all SAS option? :)


-Andy


On Feb 7, 2008, at 2:28 PM, Joel Miller wrote:

> Much of the complexity in hardware RAID is in the fault detection,  
> isolation, and management.  The fun part is trying to architect a  
> fault-tolerant system when the suppliers of the components can not  
> come close to enumerating most of the possible failure modes.
>
> What happens when a drive's performance slows down because it is  
> having to go through internal retries more than others?
>
> What layer gets to declare a drive dead? What happens when you start  
> declaring the drives dead one by one because of they all seemed to  
> stop responding but the problem is not really the drives?
>
> Hardware RAID systems attempt to deal with problems that are not  
> always straight forward...Hopefully we will eventually get similar  
> functionality in Solaris...
>
> Understand that I am a proponent of ZFS, but everything has it's use.
>
> -Joel
>
>
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Home Motherboard

2007-11-23 Thread Andy Lubel
Areca, nice!
 
Any word on whether 3ware has come around yet?  I've been bugging them for 
months to do something to get a driver made for Solaris.
 
-Andy



From: [EMAIL PROTECTED] on behalf of James C. McPherson
Sent: Thu 11/22/2007 5:06 PM
To: mike
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Home Motherboard



mike wrote:
> I actually have a related motherboard, chassis, dual power-supplies
> and 12x400 gig drives already up on ebay too. If I recall Areca cards
> are supported in OpenSolaris...

At the moment you can download the Areca "arcmsr" driver
from areca.com.tw, but I'm in the process of integrating
it into OpenSolaris

http://bugs.opensolaris.org/view_bug.do?bug_id=6614012
6614012 add Areca SAS/SATA RAID adapter driver


James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp   http://www.jmcp.homeunix.com/blog
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Yager on ZFS

2007-11-15 Thread Andy Lubel
On 11/15/07 9:05 AM, "Robert Milkowski" <[EMAIL PROTECTED]> wrote:

> Hello can,
> 
> Thursday, November 15, 2007, 2:54:21 AM, you wrote:
> 
> cyg> The major difference between ZFS and WAFL in this regard is that
> cyg> ZFS batch-writes-back its data to disk without first aggregating
> cyg> it in NVRAM (a subsidiary difference is that ZFS maintains a
> cyg> small-update log which WAFL's use of NVRAM makes unnecessary).
> cyg> Decoupling the implementation from NVRAM makes ZFS usable on
> cyg> arbitrary rather than specialized platforms, and that without
> cyg> doubt  constitutes a significant advantage by increasing the
> cyg> available options (in both platform and price) for those
> cyg> installations that require the kind of protection (and ease of
> cyg> management) that both WAFL and ZFS offer and that don't require
> cyg> the level of performance that WAFL provides and ZFS often may not
> cyg> (the latter hasn't gotten much air time here, and while it can be
> cyg> discussed to some degree in the abstract a better approach would
> cyg> be to have some impartial benchmarks to look at, because the
> cyg> on-disk block layouts do differ significantly and sometimes
> cyg> subtly even if the underlying approaches don't).
> 
> Well, ZFS allows you to put its ZIL on a separate device which could
> be NVRAM.

Like RAMSAN SSD

http://www.superssd.com/products/ramsan-300/

It is the only FC-attached, battery-backed SSD that I know of, and we have
dreams of clusterfication.  Otherwise we would use one of those PCI Express
based NVRAM cards that are on the horizon.

My initial results for lots of small files were very pleasing.
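
For anyone curious, attaching a dedicated log device is just this (device
names are hypothetical, and the pool needs to be a version recent enough to
support separate logs):

# zpool create tank mirror c2t0d0 c2t1d0 log c5t0d0    # dedicated slog at pool creation
# zpool add tank log c5t0d0                            # or add one to an existing pool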

I dream of a JBOD with lots of disks + something like this built into 3U.
Too bad Sun's forthcoming JBODs probably won't have anything similar to
this...

-Andy

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Recommended many-port SATA controllers for budget ZFS

2007-11-02 Thread Andy Lubel
Marvell controllers work great with Solaris.

The Supermicro AOC-SAT2-MV8 is what I currently use.  I bought it on a
recommendation from this list, actually.  I think I paid $110 for mine.

-Andy


On 11/2/07 4:10 PM, "Peter Schuller" <[EMAIL PROTECTED]> wrote:

> Hello,
> 
> Short version: Can anyone recommend a *many port* (8 or more) SATA/SAS
> controller (RAID or otherwise) that will allow *SAFE* use of ZFS, including
> honoring cache flush commands in the sense of submitting them to the
> underlying device, that is also *low budget* (suitable for personal use; say
> in the <= $250 range)?
> 
> Long version:
> 
> I am having difficulty getting reliable information on SATA controllers for
> use with ZFS (that also work with FreeBSD in this case; though I magine the
> same problem applies with Solaris).
> 
> The problem is that most non-RAID controllers do not have enough ports. In
> fact the only one I have found that is decent is the Supermicro Marvell card;
> but that is PCI-X rather than PCI (or PCI Express). Works in one machine,
> doesn't in another (presumably because of PCI-X; only have PCI slots). And
> even if it does work, you are rather limited in bandwidth. Not that I really
> care about the latter for low budget use.
> 
> You might say that just get a RAID controller and configure it for JBOD. Well,
> I was assuming that would be a safe bet, but apparently you cannot trust them to
> behave correctly with respect to write caching and cache flushing.
> 
> I recently found out that the Dell supplied LSI MegaRaid derived Perc 5/i RAID
> controllers will not honor cache flush requests (according to Dell technical
> support, after quite some time trying to explain to them what I wanted to
> know). So assuming this information is correct (I never saw the actual
> response from the "behind the lines" tech support that my tech support
> contact in turn asked), it means that running without battery backup, you
> actually negate the safety offered by ZFS with respect to write caching,
> making the pool less reliable than it would be with a cheap non-RAID card.
> 
> Right now I have noticed that LSI has recently began offering some
> lower-budget stuff; specifically I am looking at the MegaRAID SAS
> 8208ELP/XLP, which are very reasonably priced.
> 
> The problem again is that, while they are cheap raid cards without cache, my
> understanding is that it is still primarily intended for RAID rather than
> plain SATA "pass-through". As a result I am worried about the same problem as
> with the Perc 5/i.
> 
> Of course, LSI being a large corporation, it is seemingly impossible to obtain
> contact information for them, or find any technical specifications that would
> contain information as specific as what it will do in response to cache flush
> requests, so I am at a loss. Unless you're buying 10 000 cards it's difficult
> to get answers.
>  
> Does anyone have suggestions on what to choose, that will actually work the
> way you want it for JBOD use with ZFS? Or avenues of investigation? Is there
> any chance of a lowly consumer getting any information out of LSI? Is there
> some other manufacturer that provide low-budget stuff that you can get some
> technical information about? Does anyone have some specific knowledge of a
> suitable product?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Force SATA1 on AOC-SAT2-MV8

2007-11-02 Thread Andy Lubel

Jumpering drives by removing the cover?  Do you mean opening the chassis
because they aren't removable from the outside?

Your cable is longer than 1 meter inside of a chassis??

I think SATA I is 2 meters and SATA II is 1 meter.

As far as a system setting for demoting these to SATA I, I don't know, but I
don't think it's possible.  Don't hold me to that, however; I only say it
because the way I demote them to SATA I is by removing a jumper, actually :)

HTH,

Andy

On 11/2/07 12:29 PM, "Eric Haycraft" <[EMAIL PROTECTED]> wrote:

> I have a supermicro AOC-SAT2-MV8 and am having some issues getting drives to
> work. From what I can tell, my cables are too long to use with SATA2. I got
> some drives to work by jumpering them down to sata1, but other drives I can't
> jumper without opening the case and voiding the drive warranty. Does anyone
> know if there is a system setting to drop it back to SATA1? I use zfs on a
> raid2 if it makes a difference. This is on release of OpenSolaris 74.
>  
>  
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mac OS X 10.5.0 Leopard ships with a readonly ZFS

2007-10-26 Thread Andy Lubel
Yeah, I'm pumped about this new release today..  such harmony in my
storage to be had.  Now if OS X only had a native iSCSI target/initiator!


-Andy Lubel




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Peter Woodman
Sent: Friday, October 26, 2007 8:14 AM
To: Kugutsumen
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Mac OS X 10.5.0 Leopard ships with a readonly
ZFS

it would seem that the reason that it's been pulled is that it's
installed by default in the release version (9A581) - just tested it
here, and willikers, it works!

On 10/26/07, Kugutsumen <[EMAIL PROTECTED]> wrote:
> # zfs list
> ZFS Readonly implemntation is loaded!
> To download the full ZFS read/write kext with all functionality 
> enabled, please go to http://developer.apple.com no datasets available
>
> Unfortunately, I can't find it on ADC yet and it seems that it was
removed by Apple:
>
> "Another turn in the Apple-ZFS saga. Apple has made available a
developer preview of ZFS for Mac OS X with read/write capability. The
preview is available to all ADC members. From the readme file: "ZFS is a
new filesystem from Sun Microsystems which has been ported by Apple to
Mac OS X. The initial (10.5.0) release of Leopard will restrict ZFS to
read-only, so no ZFS pools or filesystems can be modified or created.
This Developer Preview will enable full read/write capability, which
includes the creation/destruction of ZFS pools and filesystems." Update:
Will it ever end? The release has been pulled from ADC by Apple."
>
> I can't wait to reformat all my external 2.5 drives with zfs.
>
>
> This message posted from opensolaris.org 
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
>
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun 6120 array again

2007-10-01 Thread Andy Lubel
I gave up.

On the 6120 I just ended up not using ZFS.  And for our 6130, since we don't
have SANtricity or the sscs command to set it, I just decided to export each
disk and build the array with ZFS (and a RAMSAN ZIL), which made performance
acceptable for us.

I wish there were a firmware option that just made these things dumb JBODs!

-Andy


On 9/28/07 7:37 PM, "Marion Hakanson" <[EMAIL PROTECTED]> wrote:

> Greetings,
> 
> Last April, in this discussion...
> http://www.opensolaris.org/jive/thread.jspa?messageID=143517
> 
> ...we never found out how (or if) the Sun 6120 (T4) array can be configured
> to ignore cache flush (sync-cache) requests from hosts.  We're about to
> reconfigure a 6120 here for use with ZFS (S10U4), and the evil tuneable
> zfs_nocacheflush is not going to serve us well (there is a ZFS pool on
> slices of internal SAS drives, along with UFS boot/OS slices).
> 
> Any pointers would be appreciated.
> 
> Thanks and regards,
> 
> Marion
> 
> 
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS speed degraded in S10U4 ?

2007-09-25 Thread Andy Lubel


On 9/25/07 3:37 AM, "Sergiy Kolodka" <[EMAIL PROTECTED]>
wrote:

> Hi Guys,
> 
> I'm playing with Blade 6300 to check performance of compressed ZFS with Oracle
> database.
> After some really simple tests I noticed that default (well, not really
> default, some patches applied, but definitely noone bother to tweak disk
> subsystem or something else) installation of S10U3 is actually faster than
> S10U4, and a lot faster. Actually it's even faster on compressed ZFS with
> S10U3 than on uncompressed with S10U4.
> 
> My configuration - default Update 3 LiveUpgraded to Update 4 with ZFS
> filesystem on dedicated disk, and I'm working with same files which are on
> same physical cylinders, so it's not likely a problem with HDD itself.
> 
Did you do a 'zpool upgrade -a'?
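
For reference (exact output varies by build): 'zpool upgrade' with no arguments
reports each pool's on-disk version, and 'zpool upgrade -a' upgrades every pool
to the version the running build supports.

  # zpool upgrade
  # zpool upgrade -a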

> I'm doing as simple as just $time dd if=file.dbf of=/dev/null in few parallel
> tasks. On Update3 it's somewhere close to 11m32s and on Update 4 it's around
> 12m6s. And it's both reading from compressed or uncompressed ZFS, numbers a
> little bit higher with compressed, couple of seconds more, which impressive by
> itself, but difference is the same, and strangest part is that reading file
> from compressed ZFS on U3 is faster than reading uncompressed with U4.
> 
> I'm really surprised by this results, anyone else noticed that ?
>  
I'm running a 'motley group of disks' on an E450 acting as our JumpStart
server, and server build times are noticeably quicker since U4.

>  
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-Andy

-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-21 Thread Andy Lubel



On 9/20/07 7:31 PM, "Paul B. Henson" <[EMAIL PROTECTED]> wrote:

> On Thu, 20 Sep 2007, Tim Spriggs wrote:
> 
>> It's an IBM re-branded NetApp which we are using for NFS and
>> iSCSI.

Yeah, it's fun to see IBM compete with its OEM provider NetApp.
> 
> Ah, I see.
> 
> Is it comparable storage though? Does it use SATA drives similar to the
> x4500, or more expensive/higher performance FC drives? Is it one of the
> models that allows connecting dual clustered heads and failing over the
> storage between them?
> 
> I agree the x4500 is a sweet looking box, but when making price comparisons
> sometimes it's more than just the raw storage... I wish I could just drop
> in a couple of x4500's and not have to worry about the complexity of
> clustering ...
> 
> 
zfs send/receive.
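
Roughly, with made-up pool, dataset and host names:

  # zfs snapshot tank/home@nightly
  # zfs send tank/home@nightly | ssh host2 zfs receive backup/home
  # zfs send -i tank/home@lastnight tank/home@nightly | \
        ssh host2 zfs receive backup/home

(the -i form only ships the changes between the two snapshots)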


NetApp is great; we have about 6 varieties in production here.  But for what I
pay in maintenance and up-front cost on just 2 filers, I can buy an x4500 a
year, and have a 3-year warranty each time I buy.  It just depends on the
company you work for.

I haven't played too much with anything but NetApp and StorageTek.. but once
I got started on ZFS I just knew it was the future, and I think NetApp
realizes that too.  And if Apple does what I think it will, it will only get
better :)

Fast, cheap, easy - you only get 2.  ZFS may change that.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] enterprise scale redundant Solaris 10/ZFS server providing NFSv4/CIFS

2007-09-20 Thread Andy Lubel
> ...storage pools.
> 
> Again though, that would imply two different storage locations visible to
> the clients? I'd really rather avoid that. For example, with our current
> Samba implementation, a user can just connect to
> '\\files.csupomona.edu\' to access their home directory or
> '\\files.csupomona.edu\' to access a shared group directory.
> They don't need to worry on which physical server it resides or determine
> what server name to connect to.
> 
>> The SE is mistaken.  Sun^H^Holaris Cluster supports a wide variety of
>> JBOD and RAID array solutions.  For ZFS, I recommend a configuration
>> which allows ZFS to repair corrupted data.
> 
> That would also be my preference, but if I were forced to use hardware
> RAID, the additional loss of storage for ZFS redundancy would be painful.
> 
> Would anyone happen to have any good recommendations for an enterprise
> scale storage subsystem suitable for ZFS deployment? If I recall correctly,
> the SE we spoke with recommended the StorageTek 6140 in a hardware raid
> configuration, and evidently mistakenly claimed that Cluster would not work
> with JBOD.

I really have to disagree; we have 6120s and 6130s, and if I had the option
to actually plan out some storage I would have just bought a Thumper.  You
could probably buy 2 for the cost of that 6140.

> 
> Thanks...
> 

-Andy Lubel
-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zfs log device (zil) ever coming to Sol10?

2007-09-18 Thread Andy Lubel
On 9/18/07 2:26 PM, "Neil Perrin" <[EMAIL PROTECTED]> wrote:

> 
> 
> Andy Lubel wrote:
>> On 9/18/07 1:02 PM, "Bryan Cantrill" <[EMAIL PROTECTED]> wrote:
>> 
>>> Hey Andy,
>>> 
>>> On Tue, Sep 18, 2007 at 12:59:02PM -0400, Andy Lubel wrote:
>>>> I think we are very close to using zfs in our production environment..  Now
>>>> that I have snv_72 installed and my pools set up with NVRAM log devices
>>>> things are hauling butt.
>>> Interesting!  Are you using a MicroMemory device, or is this some other
>>> NVRAM concoction?
>>> 
>> 
>> RAMSAN :)
>> http://www.superssd.com/products/ramsan-400/
> 
> May I ask roughly what you paid for it.
> I think perhaps we ought to get one in-house and check it out as well.
> 
> Thanks: Neil.

~$80k for the 128GB model.  But we didn't pay anything for it; it was a
customer return that the vendor wouldn't take back.

Given that they have a fancy Sun logo (which we love) on their homepage, I'm
willing to bet that they would send you a demo unit.  Let me know if I can
help at all with that.

-Andy Lubel
-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zfs log device (zil) ever coming to Sol10?

2007-09-18 Thread Andy Lubel
On 9/18/07 1:02 PM, "Bryan Cantrill" <[EMAIL PROTECTED]> wrote:

> 
> Hey Andy,
> 
> On Tue, Sep 18, 2007 at 12:59:02PM -0400, Andy Lubel wrote:
>> I think we are very close to using zfs in our production environment..  Now
>> that I have snv_72 installed and my pools set up with NVRAM log devices
>> things are hauling butt.
> 
> Interesting!  Are you using a MicroMemory device, or is this some other
> NVRAM concoction?
> 

RAMSAN :)
http://www.superssd.com/products/ramsan-400/

>> I've been digging to find out whether this capability would be put into
>> Solaris 10, does anyone know?
> 
> I would say it's probably unlikely, but I'll let Neil and the ZFS team
> speak for that.  Do you mind if I ask what you're using ZFS for?
> 
> - Bryan

Today's answer:
We want to use it for nearline backups (via NFS); eventually we would like
to use zvols+iSCSI to serve up storage for Oracle databases.

My future answer:
What can't we use ZFS for?


If anyone wants to see my iozone results, just let me know.

-Andy Lubel

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Zfs log device (zil) ever coming to Sol10?

2007-09-18 Thread Andy Lubel
I think we are very close to using zfs in our production environment..  Now
that I have snv_72 installed and my pools set up with NVRAM log devices
things are hauling butt.

I've been digging to find out whether this capability would be put into
Solaris 10, does anyone know?

If not, then I guess we can probably be OK using SXCE (as Joyent did).

Thanks,

Andy Lubel

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] MS Exchange storage on ZFS?

2007-09-06 Thread Andy Lubel
On 9/6/07 2:51 PM, "Joe S" <[EMAIL PROTECTED]> wrote:

> Has anyone here attempted to store their MS Exchange data store on a
> ZFS pool? If so, could you please tell me about your setup? A friend
> is looking for a NAS solution, and may be interested in a ZFS box
> instead of a netapp or something like that.

I don't see why it wouldn't work using zvols and iSCSI.  We use iSCSI in our
rather large Exchange implementation - not backed by ZFS, but I don't see why
it couldn't be.

PS: no "NAS" solution will work for Exchange, will it?  You have to use
DAS/SAN or iSCSI, AFAIK.

-Andy
> 
> Thanks.
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zfs with storedge 6130

2007-09-06 Thread Andy Lubel
On 9/4/07 4:34 PM, "Richard Elling" <[EMAIL PROTECTED]> wrote:

> Hi Andy,
> my comments below...
> note that I didn't see zfs-discuss@opensolaris.org in the CC for the
> original...
> 
> Andy Lubel wrote:
>> Hi All,
>> 
>> I have been asked to implement a zfs based solution using storedge 6130 and
>> im chasing my own tail trying to decide how best to architect this.  The
>> storage space is going to be used for database dumps/backups (nearline
>> storage).  What is killing me is that I must mix hardware raid and zfs..
> 
> Why should that be killing you?  ZFS works fine with RAID arrays.

What kills me is the fact that I have a choice and it was hard to decide on
which one was going to be at the top of the totem pole.  From now on I only
want JBOD!

It works even better when I export each of the 14 disks in my array as its
own single-disk RAID-0 volume and then create the zpool :)

#zpool create -f vol0 c2t1d12 c2t1d11 c2t1d10 c2t1d9 c2t1d8 c2t1d7 c2t1d6
c2t1d5 c2t1d4 c2t1d3 c2t1d2 c2t1d1 c2t1d0 spare c2t1d13
> 
>> The storedge shelf has 14 FC 72gb disks attached to a solaris snv_68.
>> 
>> I was thinking that since I cant export all the disks un-raided out to the
>> solaris system that I would instead:
>> 
>> (on the 6130)
>> Create 3 raid5 volumes of 200gb each using the "Sun_ZFS" pool (128k segment
>> size, read ahead enabled 4 disk).
>> 
>> (On the snv_68)
>> Create a raid0 using zfs of the 3 volumes from the 6130, using the same 128k
>> stripe size.
> 
> OK
> 
>> It seemed to me that if I was going to go for redundancy with a mixture of
>> zfs and hardware raid that I would put the redundancy into the hardware raid
>> and use striping at the zfs level, is that methodology the best way to think
>> of it?
> 
> The way to think about this is that ZFS can only correct errors when it has
> redundancy.  By default, for dynamic stripes, only metadata is redundant.
> You can set the copies parameter to add redundancy on a per-file system basis,
> so you could set a different policy for data you really care about.
> 
Makes perfect sense.  Since this is a nearline backup solution, I think we
will be OK with a dynamic stripe.  Once I get approved for a Thumper I'm
definitely going to go raidz2.  Since we are a huge Sun partner.. it should
be easier than it's been :(
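
(For reference, the copies parameter mentioned above is per dataset and only
applies to data written after the change; the dataset name below is made up.)

  # zfs set copies=2 vol0/critical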

>> The only requirement ive gotten so far is that it can be written to and read
>> from at a minimum of 72mb/s locally and 1gb/35sec via nfs.  I suspect I
>> would need at least 600gb of storage.
> 
> I hope you have a test case for this.  It is difficult for us to predict
> that sort of thing because there are a large number of variables.  But in
> general, to get high bandwidth, you need large I/Os.  That implies the
> application
> is responsible for it's use of the system, since the application is the source
> of I/Os.
> 
It's all going to be accessed via NFS and eventually iSCSI, as soon as we
figure out how to back up iSCSI targets from the SAN itself.

>> Anyone have any recommendations?  The last time tried to create one 13 disk
>> raid5 with zfs filesystem the performance was terrible via nfs..  But when I
>> shared an nfs filesystem via a raidz or mirror things were much better.. So
>> im nervous about doing this with only one volume in the zfs pool.
> 
> 13 disk RAID-5 will suck.  Try to stick with fewer devices in the set.
> 
> See also
> http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html
> http://blogs.digitar.com/jjww/?itemid=44
> 

I can't find a SANtricity download that will work with a 6130, but that's
OK.. I just created 14 volumes per shelf :)  Hardware RAID is so yesterday.

> That data is somewhat dated, as we now have the ability to put the ZIL
> on a different log device (Nevada b70 or later). This will be more obvious
> if the workload creates a lot of small files, less of a performance problem
> for large files.
>   -- richard
> 

Got my hands on a 64GB RamSan SSD and I'm using that for the ZIL.. It's
crazy fast now.

-Andy
-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Solaris HDD crash, Restore?

2007-09-04 Thread Andy Lubel
"Zpool import" is your friend.. Yes you still have your pool!


On 9/4/07 5:50 PM, "christopher" <[EMAIL PROTECTED]> wrote:

> First time user of Solaris and ZFS 
> 
> I have Solaris 10 installed on the primary IDE drive of my motherboard.  I
> also have a 4 disc RAIDZ setup on my sata connections.  I setup up a
> successful 1.5TB ZFS server with all discs operational.
> 
> Well ... I was trying out something new and I borked my Solaris install HDD;
> the main problem is that I also had my RAIDZ zpool operational.  I don't think
> I can salvage my Solaris install (just not experienced enough), though if I
> reformat and install Solaris back on the IDE HDD, will I be able to save all
> the data on my RAIDZ after install?
> 
> I didnt take any snapshots of my existing ZFS.
> 
> Will I be able to save my data on the RAIDZ and remount the ZFS after
> reinstall, or am I screwed?
> 
> Please help ...
>  
>  
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Andy Lubel
Application Administrator / IT Department

GTSI Corp.
3901 Stonecroft Boulevard
Chantilly, VA 20151
Tel: 1.800.999.GTSI ext.2309
Dir: 703.502.2309
[EMAIL PROTECTED]


-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS Apple WWDC Keynote Absence

2007-06-12 Thread Andy Lubel
Yeah, this is pretty sad; we had such plans for actually using our Apple
(PPC) hardware in our datacenter for something other than AFP and web
serving.

It also shows how limited Apple's vision seems to be.  For two CEOs not to be
on the same page suggests that there is something else going on rather than
just "we chose not to put a future-ready file system into our next OS".  And
how it's being dismissed by Apple is quite upsetting.

I wonder when we will see Johnny-cat and Steve-o in the same room talking
about it.


On 6/12/07 8:23 AM, "Sunstar Dude" <[EMAIL PROTECTED]> wrote:

> Yea, What is the deal with this? I am so bummed :( What the heck was Sun's CEO
> talking about the other day? And why the heck did Apple not include at least
> non-default ZFS support in Leopard? If no ZFS in Leapard, then what is all the
> Apple-induced-hype about? A trapezoidal Dock table? A transparent menu bar?
> 
> Can anyone explain the absence of ZFS in Leopard??? I signed up for this forum
> just to post this.
>  
>  
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Andy Lubel



-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Slashdot Article: Does ZFS Obsolete Expensive NAS/SANs?

2007-06-06 Thread Andy Lubel
Anyone know when s10u4 will be released?


On 6/5/07 1:06 PM, "Jesus Cea" <[EMAIL PROTECTED]> wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> eric kustarz wrote:
>> Cindy has been doing a good job of putting the new features into the
>> admin guide:
>> http://www.opensolaris.org/os/community/zfs/docs/zfsadmin.pdf
>> 
>> Check out the "What's New in ZFS?" section.
> 
> I will update the wikipedia entry when Solaris10U4 be published :)
> 
> - --
> Jesus Cea Avion _/_/  _/_/_/_/_/_/
> [EMAIL PROTECTED] http://www.argo.es/~jcea/ _/_/_/_/  _/_/_/_/  _/_/
> jabber / xmpp:[EMAIL PROTECTED] _/_/_/_/  _/_/_/_/_/
>_/_/  _/_/_/_/  _/_/  _/_/
> "Things are not so easy"  _/_/  _/_/_/_/  _/_/_/_/  _/_/
> "My name is Dump, Core Dump"   _/_/_/_/_/_/  _/_/  _/_/
> "El amor es poner tu felicidad en la felicidad de otro" - Leibniz
> -BEGIN PGP SIGNATURE-
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
> 
> iQCVAwUBRmWYG5lgi5GaxT1NAQJtzgQAmT0FQ/1ciQYAqi2unOjPkBMe8fkkI08Y
> ux19N+ONvDHp742im5ZPaWrpa5Ns+42+SWziIOPaPYC27DaV2vqLz1gun53LLRPi
> /gRo2AFCgKGvmHBM2qsL9Ch8kepMSm4pUmWLjG81eIq+1R5wjo5Dv4Nld0YITS9u
> EdrfG6VU6pE=
> =pa0L
> -END PGP SIGNATURE-
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Andy Lubel
Application Administrator / IT Department
-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-25 Thread Andy Lubel
I'm using:

  set zfs:zil_disable = 1      (in /etc/system)

on my SE6130 with ZFS, accessed over NFS, and write performance almost
doubled.  Since you have BBC, why not just set that?
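
(Note: /etc/system changes take effect at reboot; on a live system the same
tunable can be flipped with mdb, and applies to filesystems mounted after the
change.  Use with care, since it defeats synchronous write guarantees for NFS
clients.  A rough sketch, not gospel:)

  # echo zil_disable/W0t1 | mdb -kw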

-Andy



On 5/24/07 4:16 PM, "Albert Chin"
<[EMAIL PROTECTED]> wrote:

> On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
>> I'm running SunOS Release 5.10 Version Generic_118855-36 64-bit
>> and in [b]/etc/system[/b] I put:
>> 
>> [b]set zfs:zfs_nocacheflush = 1[/b]
>> 
>> And after rebooting, I get the message:
>> 
>> [b]sorry, variable 'zfs_nocacheflush' is not defined in the 'zfs' module[/b]
>> 
>> So is this variable not available in the Solaris kernel?
> 
> I think zfs:zfs_nocacheflush is only available in Nevada.
> 
>> I'm getting really poor write performance with ZFS on a RAID5 volume
>> (5 disks) from a storagetek 6140 array. I've searched the web and
>> these forums and it seems that this zfs_nocacheflush option is the
>> solution, but I'm open to others as well.
> 
> What type of poor performance? Is it because of ZFS? You can test this
> by creating a RAID-5 volume on the 6140, creating a UFS file system on
> it, and then comparing performance with what you get against ZFS.
> 
> It would also be worthwhile doing something like the following to
> determine the max throughput the H/W RAID is giving you:
>   # time dd of= if=/dev/zero bs=1048576 count=1000
> For a 2Gbps 6140 with 300GB/10K drives, we get ~46MB/s on a
> single-drive RAID-0 array, ~83MB/s on a 4-disk RAID-0 array w/128k
> stripe, and ~69MB/s on a seven-disk RAID-5 array w/128k strip.
-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Motley group of discs? (doing it right, or right now)

2007-05-07 Thread Andy Lubel
I think it will be in the next.next (10.6) OS X; we just need to get Apple to
stop playing with their silly cell phone (that I can't help but want, damn
them!).

I have a similar situation at home, but what I do is use Solaris 10 on a
cheapish x86 box with six 400GB IDE/SATA disks; I then make them into iSCSI
targets and use that free GlobalSAN initiator ([EMAIL PROTECTED]).  I once was
like you, with 5 USB/FireWire drives hanging off everything, and eventually I
just got fed up with the mess of cables and wall warts.
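
Roughly what the Solaris side of that looks like (names and sizes are just
examples, and depending on the build you may need the iscsitadm route instead
of the shareiscsi property):

  # zfs create -V 200g tank/mac1
  # zfs set shareiscsi=on tank/mac1
      (or)
  # iscsitadm create target -b /dev/zvol/rdsk/tank/mac1 mac1
  # iscsitadm list target -v      shows the IQN to point GlobalSAN at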

Perhaps my method of putting together redundant, fast storage isn't as easy
for everyone else to achieve.  If you want more details about my setup, just
email me directly, I don't mind :)

-Andy



On 5/7/07 4:48 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:

> Lee,
> 
> Yes, the hot spare (disk4) should kick if another disk in the pool fails
> and yes, the data is moved to disk4.
> 
> You are correct:
> 
> 160 GB (the smallest disk) * 3 + raidz parity info
> 
> Here's the size of raidz pool comprised of 3 136-GB disks:
> 
> # zpool list
> NAMESIZEUSED   AVAILCAP  HEALTH ALTROOT
> pool408G 98K408G 0%  ONLINE -
> # zfs list
> NAME   USED  AVAIL  REFER  MOUNTPOINT
> pool  89.9K   267G  32.6K  /pool
> 
> The pool is 408GB in size but usable space in the pool is 267GB.
> 
> If you added the 600GB disk to the pool, then you'll still lose out
> on the extra capacity because of the smaller disks, which is why
> I suggested using it as a spare.
> 
> Regarding this:
> 
> If I didn't need a hot spare, but instead could live with running out
> and buying a new drive to add on as soon as one fails, what
> configuration would I use then?
> 
> I don't have any add'l ideas but I still recommend going with a spare.
> 
> Cindy
> 
> 
> 
> 
> 
> Lee Fyock wrote:
>> Cindy,
>> 
>> Thanks so much for the response -- this is the first one that I consider
>> an actual answer. :-)
>> 
>> I'm still unclear on exactly what I end up with. I apologize in advance
>> for my ignorance -- the ZFS admin guide assumes knowledge that I don't
>> yet have.
>> 
>> I assume that disk4 is a hot spare, so if one of the other disks die,
>> it'll kick into active use. Is data immediately replicated from the
>> other surviving disks to disk4?
>> 
>> What usable capacity do I end up with? 160 GB (the smallest disk) * 3?
>> Or less, because raidz has parity overhead? Or more, because that
>> overhead can be stored on the larger disks?
>> 
>> If I didn't need a hot spare, but instead could live with running out
>> and buying a new drive to add on as soon as one fails, what
>> configuration would I use then?
>> 
>> Thanks!
>> Lee
>> 
>> On May 7, 2007, at 2:44 PM, [EMAIL PROTECTED]
>>  wrote:
>> 
>>> Hi Lee,
>>> 
>>> 
>>> You can decide whether you want to use ZFS for a root file system now.
>>> 
>>> You can find this info here:
>>> 
>>> 
>>> http://opensolaris.org/os/community/zfs/boot/
>>> 
>>> 
>>> Consider this setup for your other disks, which are:
>>> 
>>> 
>>> 250, 200 and 160 GB drives, and an external USB 2.0 600 GB drive
>>> 
>>> 
>>> 250GB = disk1
>>> 
>>> 200GB = disk2
>>> 
>>> 160GB = disk3
>>> 
>>> 600GB = disk4 (spare)
>>> 
>>> 
>>> I include a spare in this setup because you want to be protected from
>>> a disk failure. Since the replacement disk must be equal to or larger than
>>> 
>>> the disk to replace, I think this is best (safest) solution.
>>> 
>>> 
>>> zpool create pool raidz disk1 disk2 disk3 spare disk4
>>> 
>>> 
>>> This setup provides less capacity but better safety, which is probably
>>> 
>>> important for older disks. Because of the spare disk requirement (must
>>> 
>>> be equal to or larger in size), I don't see a better arrangement. I
>>> 
>>> hope someone else can provide one.
>>> 
>>> 
>>> Your questions remind me that I need to provide add'l information about
>>> 
>>> the current ZFS spare feature...
>>> 
>>> 
>>> Thanks,
>>> 
>>> 
>>> Cindy
>>> 
>>> 
>> 
>> 
>> 
>> 
>> ___
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



-- 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync

2007-04-26 Thread Andy Lubel
Anyone who has an Xserve RAID should have one (or two) of these BBC modules.
Good mojo.

http://store.apple.com/1-800-MY-APPLE/WebObjects/AppleStore.woa/wa/RSLID
?mco=6C04E0D7&nplm=M8941G/B

Can you tell I <3 apple?


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Wee Yeh Tan
Sent: Thursday, April 26, 2007 9:40 PM
To: cedric briner
Cc: [EMAIL PROTECTED]
Subject: Re: [zfs-discuss] HowTo: UPS + ZFS & NFS + no fsync

Cedric,

On 4/26/07, cedric briner <[EMAIL PROTECTED]> wrote:
> >> okay let'say that it is not. :)
> >> Imagine that I setup a box:
> >>   - with Solaris
> >>   - with many HDs (directly attached).
> >>   - use ZFS as the FS
> >>   - export the Data with NFS
> >>   - on an UPS.
> >>
> >> Then after reading the :
> >> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_G
> >> uide#ZFS_and_Complex_Storage_Considerations
> >>
> >> I wonder if there is a way to tell the OS to ignore the fsync flush

> >> commands since they are likely to survive a power outage.
> >
> > Cedric,
> >
> > You do not want to ignore syncs from ZFS if your harddisk is 
> > directly attached to the server.  As the document mentioned, that is

> > really for Complex Storage with NVRAM where flush is not necessary.
>
> This post follows : `XServe Raid & Complex Storage Considerations'
> http://www.opensolaris.org/jive/thread.jspa?threadID=29276&tstart=0

Ah... I wasn't aware the other thread was started by you :).  If your
storage device features NVRAM, you should in fact configure it as
discussed in the stated thread.  However, if your storage device(s) are
directly attached disks (or anything without an NVRAM controller),
zfs_noflush=1 is potentially fatal (see link below).

> Where we have made the assumption (*1) if the XServe Raid is connected

> to an UPS that we can consider the RAM in the XServe Raid as it was
NVRAM.

I'm not sure about the interaction between XServe and the UPS but I'd
imagine that the UPS can probably power the XServe  for a few minutes
after a power outage.  That should be enough time for the XServe to
drain stuff in its RAM to disk.

> (*1)
>This assumption is even pointed by Roch  :
>http://blogs.sun.com/roch/#zfs_to_ufs_performance_comparison
>>> Intelligent Storage
>through: `the Shenanigans with ZFS flushing and intelligent
arrays...'
>http://blogs.digitar.com/jjww/?itemid=44
>>> Tell your array to ignore ZFS' flush commands
>
> So in this way, when we export it with NFS we get a boost in the BW.

Indeed.  This is especially true if you consider that expensive storage
are likely to be shared by more than 1 host.  A flush command likely
flushes the entire cache rather than just parts relevant to the
requesting host.

> Okay, then is there any difference that I do not catch between :
>   - the Shenanigans with ZFS flushing and intelligent arrays...
>   - and my situation
>
> I mean, I want to have a cheap and reliable nfs service. Why should I 
> buy expensive `Complex Storage with NVRAM' and not just buying a 
> machine with 8 IDE HD's ?

Your 8 IDE HD may not benefit from zfs_noflush=1 since their caches are
small anyway but the potential impact on reliability will be fairly
severe.
  http://www.opensolaris.org/jive/thread.jspa?messageID=91730

Nothing is stopping you though from getting decent performance from 8
IDE HDD.  You just should not treat them like they are NVRAM backed
array.


--
Just me,
Wire ...
Blog: 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] XServe Raid & Complex Storage Considerations

2007-04-25 Thread Andy Lubel
They do need to start on the "next" filesystem, and ZFS seems ideal for
Apple.  If they don't, Apple will be making a huge mistake, because whatever
filesystems exist now, ZFS has already pretty much trumped them on almost
every level except maturity.

I'm expecting ZFS and iSCSI (initiator and target) in Leopard.  After all,
OS X borrows from FreeBSD.. FreeBSD 7 has ZFS ;)

Andy

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Luke Scharf
Sent: Wednesday, April 25, 2007 3:00 PM
To: Toby Thain
Cc: [EMAIL PROTECTED]
Subject: Re: [zfs-discuss] XServe Raid & Complex Storage Considerations

Toby Thain wrote:
>
> On 25-Apr-07, at 12:17 PM, cedric briner wrote:
>
>> hello the list,
>>
>> After reading the _excellent_ ZFS Best Practices Guide, I've seen in 
>> the section: ZFS and Complex Storage Consideration that we should 
>> configure the storage system to ignore command which will flush the 
>> memory into the disk.
>>
>> So does some of you knows how to tell Xserve Raid to ignore ``fsync''

>> requests ?
>>
>> After the announce that zfs will be included in Tiger,
>
> Much as I would like to see it, I am not aware of any such 
> announcement from Apple, only rumours.

FWIW, I heard the rumor from their sales guys.

The wouldn't say whether it ZFS would be available in 10.5 or 10.6, and
they wouldn't say whether it ZFS-boot would be available when ZFS is
introduced -- but they did confirm that it's being worked on.

-Luke

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


RE: [zfs-discuss] Re: ZFS+NFS on storedge 6120 (sun t4)

2007-04-23 Thread Andy Lubel

What I'm saying is that ZFS doesn't play nice with NFS in all the scenarios I
could think of:

- A single second disk in a V210 (SUN72G), write cache on or off: ~1/3 the
performance of UFS when writing files using dd over an NFS mount of the same
disk.

- Two 6-spindle RAID-5 volumes on a StorEdge 6120 with BBC, zil_disable'd,
write cache off or on: ~53 seconds to write 1GB over an NFS-mounted ZFS
stripe, raidz or mirror.  In some testing dd would even seem to 'hang'.  When
any volslice is formatted UFS and exported to the same NFS client, it's ~17
seconds!

We are likely going to just try iSCSI instead, where this behavior doesn't
show up.  At some point, though, we would like to use ZFS-based NFS mounts for
things..  the current difference in performance just scares us!

-Andy


-Original Message-
From: [EMAIL PROTECTED] on behalf of Roch - PAE
Sent: Mon 4/23/2007 5:32 AM
To: Leon Koll
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Re: ZFS+NFS on storedge 6120 (sun t4)
 
Leon Koll writes:
 > Welcome to the club, Andy...
 > 
 > I tried several times to attract the attention of the community to the 
 > dramatic performance degradation (about 3 times) of NFZ/ZFS vs. ZFS/UFS 
 > combination - without any result :  href="http://www.opensolaris.org/jive/thread.jspa?messageID=98592";>[1] , 
 > http://www.opensolaris.org/jive/thread.jspa?threadID=24015";>[2].
 > 
 > Just look at two graphs in my  href="http://napobo3.blogspot.com/2006/08/spec-sfs-bencmark-of-zfsufsvxfs.html";>posting
 >  dated August, 2006 to see how bad the situation was and, unfortunately, 
 > this situation wasn't changed much recently: 
 > http://photos1.blogger.com/blogger/7591/428/1600/sfs.1.png
 > 
 > I don't think the storage array is a source of the problems you reported. 
 > It's somewhere else...
 > 

Why do you say this ?

My reading is that  almost all NFS/ZFS complaints are either
complaining  about NFS   performance   vs   direct   attach,
comparing UFS vs  ZFS on disk with  write cache  enabled, or
complaining  about ZFS running  on storage with NVRAM.  Your
complaint is the one exception, SFS being worse with ZFS
backend vs say UFS or VxFS.

My points being:

 So NFS cannot match direct attach for some loads.
 It's a fact that we can't get around.

 Enabling the write cache is not a valid way to
 run NFS over UFS.

 ZFS on NVRAM storage, we need to make sure the storage
 does not flush the cache in response to ZFS requests.


 Then SFS over ZFS is being investigated by others within
 Sun. I believe we have stuff in the pipe to make ZFS match
 or exceed  UFS on small server level loads. So I think your
 complaint is being heard. 
 
 I personally find it always incredibly hard to do performance
 engineering around SFS.
 So my perspective is that improving the SFS numbers
 will more likely come from finding ZFS/NFS performance
 deficiencies on simpler benchmarks.


-r

 > [i]-- leon[/i]
 >  
 >  
 > This message posted from opensolaris.org
 > ___
 > zfs-discuss mailing list
 > zfs-discuss@opensolaris.org
 > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss






RE: [zfs-discuss] Re: ZFS+NFS on storedge 6120 (sun t4)

2007-04-21 Thread Andy Lubel
So what you are saying is that if we were using NFSv4, things should be
dramatically better?

Do you think this applies to any NFSv4 client, or only Sun's?



-Original Message-
From: [EMAIL PROTECTED] on behalf of Erblichs
Sent: Sun 4/22/2007 4:50 AM
To: Leon Koll
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] Re: ZFS+NFS on storedge 6120 (sun t4)
 
Leon Koll,

As a knowldegeable outsider I can say something.

The benchmark (SFS) page specifies NFSv3/v2 support, so I question
 whether you ran NFSv4. I would expect a major change in
 performance just to version 4 NFS version and ZFS.

The benchmark seems to stress your configuration enough that
the latency to service NFS ops increases to the point of non
serviced NFS requests. However, you don't know what is the
byte count per IO op. Reads are bottlenecked against rtt of
the connection and writes are normally sub 1K with a later
commit. However, many ops are probably just file handle
verifications which again are limited to your connection
rtt (round trip time). So, my initial guess is that the number
of NFS threads are somewhat related to the number of non
state (v4 now has state) per file handle op. Thus, if a 64k
ZFS block is being modified by 1 byte, COW would require a
64k byte read, 1 byte modify, and then allocation of another
64k block. So, for every write op, you COULD be writing a
full ZFS block.

This COW philosphy works best with extending delayed writes, etc
where later reads would make the trade-off of increased
latency of the larger block on a read op versus being able
to minimize the number of seeks on the write and read. Basicly
increasing the block size from say 8k to 64K. Thus, your
read latency goes up just to get the data off the disk
and minimizing the number of seeks, and dropping the read
ahead logic for the needed 8k to 64k file offset.

I do NOT know that "THAT" 4000 IO OPS load would match your maximal
load and that your actual load would never increase past 2000 IO ops.
Secondly, jumping from 2000 to 4000 seems to be too big of a jump
for your environment. Going to 2500 or 3000 might be more
appropriate. Lastly wrt the benchmark, some remnants (NFS and/or ZFS
and/or benchmark) seem to remain that have a negative impact.

Lastly, my guess is that this NFS and the benchmark are stressing small
partial block writes and that is probably one of the worst case
scenarios for ZFS. So, my guess is the proper analogy is trying to
kill a gnat with a sledgehammer. Each write IO OP really needs to be
equal
to a full size ZFS block to get the full benefit of ZFS on a per byte
basis.

Mitchell Erblich
Sr Software Engineer
-





Leon Koll wrote:
> 
> Welcome to the club, Andy...
> 
> I tried several times to attract the attention of the community to the 
> dramatic performance degradation (about 3 times) of NFZ/ZFS vs. ZFS/UFS 
> combination - without any result :  href="http://www.opensolaris.org/jive/thread.jspa?messageID=98592";>[1] , 
> http://www.opensolaris.org/jive/thread.jspa?threadID=24015";>[2].
> 
> Just look at two graphs in my  href="http://napobo3.blogspot.com/2006/08/spec-sfs-bencmark-of-zfsufsvxfs.html";>posting
>  dated August, 2006 to see how bad the situation was and, unfortunately, 
> this situation wasn't changed much recently: 
> http://photos1.blogger.com/blogger/7591/428/1600/sfs.1.png
> 
> I don't think the storage array is a source of the problems you reported. 
> It's somewhere else...
> 
> [i]-- leon[/i]
> 
> 
> This message posted from opensolaris.org
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




RE: [zfs-discuss] ZFS+NFS on storedge 6120 (sun t4)

2007-04-20 Thread Andy Lubel
I'm not sure about the workload, but I did configure the volumes with the
block size in mind.. it didn't seem to do much.  It could be because I'm
basically layering ZFS RAID on top of HW RAID and I just don't know the
equation to pick a smarter blocksize.  It seems like if I have 2 arrays
striped together at 64KB, then 128K would be ideal for my ZFS datasets, but
again.. my logic isn't infinite when it comes to this fun stuff ;)

The 6120 has 2 volumes, each with a 64K stripe/block size.  I then raidz'ed
the 2 volumes and tried both 64K and 128K.  I do get a bit of a performance
gain on rewrite at 128K.
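
(For anyone following along: the recordsize knob is per dataset and only
affects files written after the change, e.g. for the dataset used in the
tests below:)

  # zfs set recordsize=64k se6120/rfs-v10
  # zfs get recordsize se6120/rfs-v10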

These are dd tests by the way:


*this one is locally, and works just great.

bash-3.00# date ; uname -a 
Thu Apr 19 21:11:22 EDT 2007 
SunOS yuryaku 5.10 Generic_125100-04 sun4u sparc SUNW,Sun-Fire-V210 
 ^---^

bash-3.00# df -k 
Filesystem            kbytes     used      avail  capacity  Mounted on
...
se6120             697761792       26  666303904        1%  /pool/se6120
se6120/rfs-v10      31457280  9710895   21746384       31%  /pool/se6120/rfs-v10

bash-3.00# time dd if=/dev/zero of=/pool/se6120/rfs-v10/rw-test-1.loo bs=8192 
count=131072 
131072+0 records in 
131072+0 records out 
real    0m13.783s    real    0m14.136s
user    0m0.331s
sys     0m9.947s


*this one is from a HP-UX 11i system mounted to the v210 listed above:

onyx:/rfs># date ; uname -a 
Thu Apr 19 21:15:02 EDT 2007 
HP-UX onyx B.11.11 U 9000/800 1196424606 unlimited-user license 
 ^^ 
onyx:/rfs># bdf 
Filesystem          kbytes     used    avail  %used  Mounted on
... 
yuryaku.sol:/pool/se6120/rfs-v10 
   31457280 9710896 21746384   31% /rfs/v10

onyx:/rfs># time dd if=/dev/zero of=/rfs/v10/rw-test-2.loo bs=8192 count=131072 
131072+0 records in 
131072+0 records out

real    1m2.25s    real    0m29.02s    real    0m50.49s
user    0m0.30s
sys     0m8.16s

*my 6120 tidbits of interest:

6120 Release 3.2.6 Mon Feb  5 02:26:22 MST 2007 (xxx.xxx.xxx.xxx) 
Copyright (C) 1997-2006 Sun Microsystems, Inc.  All Rights Reserved. 
daikakuji:/:<1>vol mode 
volume mounted cachemirror 
v1 yes writebehind  off 
v2 yes writebehind  off 

daikakuji:/:<5>vol list 
volumecapacity raid data   standby 
v1  340.851 GB5 u1d01-06 u1d07 
v2  340.851 GB5 u1d08-13 u1d14 
daikakuji:/:<6>sys list 
controller : 2.5 
blocksize  : 64k 
cache  : auto 
mirror : auto 
mp_support : none 
naca   : off 
rd_ahead   : off 
recon_rate : med 
sys memsize: 256 MBytes 
cache memsize  : 1024 MBytes 
fc_topology: auto 
fc_speed   : 2Gb 
disk_scrubber  : on 
ondg   : befit


Am I missing something?  As for the rewrite test, I will tinker some more and
paste the results soonish.

Thanks in advance,

Andy Lubel

-Original Message-
From: Bill Moore [mailto:[EMAIL PROTECTED]
Sent: Fri 4/20/2007 5:13 PM
To: Andy Lubel
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS+NFS on storedge 6120 (sun t4)
 
When you say rewrites, can you give more detail?  For example, are you
rewriting in 8K chunks, random sizes, etc?  The reason I ask is because
ZFS will, by default, use 128K blocks for large files.  If you then
rewrite a small chunk at a time, ZFS is forced to read 128K, modify the
small chunk you're changing, and then write 128K.  Obviously, this has
adverse effects on performance.  :)  If your typical workload has a
preferred block size that it uses, you might try setting the recordsize
property in ZFS to match - that should help.

If you're completely rewriting the file, then I can't imagine why it
would be slow.  The only thing I can think of is the forced sync that
NFS does on a file closed.  But if you set zil_disable in /etc/system
and reboot, you shouldn't see poor performance in that case.

Other folks have had good success with NFS/ZFS performance (while other
have not).  If it's possible, could you characterize your workload in a
bit more detail?


--Bill

On Fri, Apr 20, 2007 at 04:07:44PM -0400, Andy Lubel wrote:
> 
> We are having a really tough time accepting the performance with ZFS
> and NFS interaction.  I have tried so many different ways trying to
> make it work (even zfs set:zil_disable 1) and I'm still no where near
> the performance of using a standard NFS mounted UFS filesystem -
> insanely slow; especially on file rewrites.
> 
> We have been combing the message boards and it looks like there was a
> lot of talk about this interaction of zfs+nfs back in november and
> before but since i have not seen much.  It seems the only fix up to
> that date was to disable zil, is that still the case?  Did anyone ever
> get closure on this?
> 
> We are running solaris 10 (SPARC), latest patched 11/06 release [...]

RE: [zfs-discuss] ZFS+NFS on storedge 6120 (sun t4)

2007-04-20 Thread Andy Lubel

Yeah, I saw that post about the other arrays, but none for this EOL'd hunk of
metal.  I have some 6130s, but hopefully by the time they are implemented we
will have retired this NFS stuff and stepped into zvol iSCSI targets.

Thanks anyway.. back to the drawing board on how to resolve this!

-Andy

-Original Message-
From: [EMAIL PROTECTED] on behalf of Torrey McMahon
Sent: Fri 4/20/2007 6:00 PM
To: Marion Hakanson
Cc: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] ZFS+NFS on storedge 6120 (sun t4)
 
Marion Hakanson wrote:
> [EMAIL PROTECTED] said:
>   
>> We have been combing the message boards and it looks like there was a lot of
>> talk about this interaction of zfs+nfs back in november and before but since
>> i have not seen much.  It seems the only fix up to that date was to disable
>> zil, is that still the case?  Did anyone ever get closure on this? 
>> 
>
> There's a way to tell your 6120 to ignore ZFS cache flushes, until ZFS
> learns to do that itself.  See:
>   http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html
>
>   

The 6120 isn't the same as a 6130/61340/6540. The instructions 
referenced above won't work on a T3/T3+/6120/6320

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss






[zfs-discuss] ZFS+NFS on storedge 6120 (sun t4)

2007-04-20 Thread Andy Lubel

We are having a really tough time accepting the performance of the ZFS and NFS
interaction.  I have tried so many different ways of making it work (even
set zfs:zil_disable = 1 in /etc/system) and I'm still nowhere near the
performance of a standard NFS-mounted UFS filesystem - insanely slow,
especially on file rewrites.

We have been combing the message boards and it looks like there was a lot of
talk about this ZFS+NFS interaction back in November and before, but since
then I have not seen much.  It seems the only fix up to that date was to
disable the ZIL; is that still the case?  Did anyone ever get closure on this?

We are running Solaris 10 (SPARC), latest patched 11/06 release, connected
directly via FC to a 6120 with 2 RAID-5 volumes, serving NFS over a bge
interface (gigabit).  We tried raidz, mirror and stripe with no noticeable
difference in speed.  The clients connecting to this machine are HP-UX 11i and
OS X 10.4.9, and they both show the same performance characteristics.

Any insight would be appreciated - we really like ZFS compared to any
filesystem we have EVER worked on and don't want to revert if at all possible!


TIA,

Andy Lubel

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss