Re: [zfs-discuss] Large scale performance query

2011-07-31 Thread Evgueni Martynov

On 25/07/2011 2:34 AM, Phil Harrison wrote:

Hi All,

Hoping to gain some insight from people who have done large scale
systems before. I'm hoping to get some performance estimates,
suggestions and/or general discussion/feedback. I cannot discuss the
exact specifics of the purpose but will go into as much detail as I can.

Technical Specs:
216x 3TB 7k3000 HDDs
24x 9 drive RAIDZ3
4x JBOD Chassis (45 bay)
1x server (36 bay)
2x AMD 12 Core CPU
128GB ECC RAM
2x 480GB SSD Cache
10Gbit NIC

Workloads:

Mainly streaming compressed data. That is, pulling compressed data in a
sequential manner; however, there could be multiple streams happening at
once, making it somewhat random. We are hoping to have 5 clients pull
500Mbit sustained.
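
As a rough check on the target above, the stated workload adds up as
follows. This is a minimal sketch that only multiplies the figures from
the post; it assumes decimal megabits, reads "500Mbit" as 500 Mbit/s per
client, and ignores protocol overhead. The function names are
illustrative.

```python
# Aggregate read bandwidth implied by the stated workload.
CLIENTS = 5
PER_CLIENT_MBIT = 500   # sustained Mbit/s per client (figure from the post)
NIC_MBIT = 10_000       # 10Gbit NIC, in decimal megabits per second


def aggregate_mbit(clients, per_client_mbit):
    """Total sustained demand in Mbit/s."""
    return clients * per_client_mbit


def mbit_to_mbyte(mbit):
    """Convert megabits to megabytes (8 bits per byte)."""
    return mbit / 8


total_mbit = aggregate_mbit(CLIENTS, PER_CLIENT_MBIT)
print(total_mbit, "Mbit/s total")
print(mbit_to_mbyte(total_mbit), "MB/s total")
print(f"{total_mbit / NIC_MBIT:.0%} of NIC line rate")
```

Under those assumptions the five clients together need about a quarter
of the NIC's line rate.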

Considerations:

The main reason RAIDZ3 was chosen was so we can distribute the parity
across the JBOD enclosures. With this method, even if an entire JBOD
enclosure is taken offline, the data is still accessible.
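
One way to sanity-check the enclosure argument is to count drives per
chassis for each vdev. The sketch below is only an assumption about how
the 24 vdevs might be spread over the five chassis (the post does not
give the actual mapping); the chassis names and the build_layout and
check helpers are hypothetical. A RAIDZ3 vdev tolerates three missing
drives, so losing a chassis is survivable only if no vdev has more than
three of its drives in that chassis.

```python
from collections import Counter

RAIDZ3_PARITY = 3
DRIVES_PER_VDEV = 9
JBODS = ("jbod0", "jbod1", "jbod2", "jbod3")
# Bays per chassis from the spec list: four 45-bay JBODs, one 36-bay server.
BAYS = {"jbod0": 45, "jbod1": 45, "jbod2": 45, "jbod3": 45, "server": 36}


def build_layout():
    """Assumed spread: every vdev gets 2 drives per chassis, except for
    one 'short' chassis where it gets 1 (2+2+2+2+1 = 9 drives).
    12 vdevs are short on the server and 3 on each JBOD (24 vdevs)."""
    short = ["server"] * 12 + [j for j in JBODS for _ in range(3)]
    return [{name: (1 if name == s else 2) for name in BAYS} for s in short]


def check(layout):
    """Return (fits_in_bays, worst_drives_per_vdev_in_one_chassis)."""
    used = Counter()
    for vdev in layout:
        assert sum(vdev.values()) == DRIVES_PER_VDEV
        used.update(vdev)  # adds this vdev's per-chassis drive counts
    fits = all(used[name] <= BAYS[name] for name in BAYS)
    worst = max(max(vdev.values()) for vdev in layout)
    return fits, worst


layout = build_layout()
fits, worst = check(layout)
print("vdevs:", len(layout), "fits in bays:", fits)
print("max drives of one vdev in one chassis:", worst)
print("survives one chassis loss:", worst <= RAIDZ3_PARITY)
```

With these bay counts the spread works out to at most two drives of any
vdev per chassis, inside the RAIDZ3 limit. Read this way, though, the
216 pool disks occupy all 216 bays, so the two SSDs would need to be
housed somewhere else.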


What kind of 45 bay enclosures?
Have you tested this and taken an enclosure out?

Thanks
Evgueni


Questions:

How to manage the physical locations of such a vast number of drives? I
have read this
(http://blogs.oracle.com/eschrock/entry/external_storage_enclosures_in_solaris)
and am hoping someone can shed some light on whether the SES2 enclosure
identification has worked for them. (The enclosures are SES2.)

What kind of performance would you expect from this setup? I know we can
multiply the base IOPS by 24, but what about max sequential read/write?

Thanks,

Phil



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Re: [zfs-discuss] booting from ashift=12 pool..

2011-07-31 Thread Daniel Carosone
On Mon, Aug 01, 2011 at 11:22:36AM +1000, Daniel Carosone wrote:
> On Fri, Jul 29, 2011 at 05:58:49PM +0200, Hans Rosenfeld wrote:
> 
> > I'm working on a patch for grub that fixes the ashift=12 issue. 
> 
> Oh, great - and from the looks of the patch, for other values of 12 as
> well :)
> 
> > I'm probably not going to fix the div-by-zero reboot.
> 
> Fair enough, if it's an existing unrelated error we no longer
> expose. Perhaps it's even fixed/irrelevant for grub2, can this be
> checked easily? 
> 

FWIW, this seems to be a live issue with the zfs-on-linux folks too,
perhaps some coordination would be helpful?

See, e.g.:
http://groups.google.com/a/zfsonlinux.org/group/zfs-discuss/browse_thread/thread/0c80103a8d5c0bb0#

> > If you want to try it, the patch can be found at
> > http://cr.illumos.org/view/6qc99xkh/illumos-1303-webrev/illumos-1303-webrev.patch
> 
> Any chance of providing an alternate stage1/stage2 binary I can feed
> to installgrub?  When you're ready..

To be clear, the system I was working on the other day is now running
with a normal ashift=9 pool, on a mirror of WD 2TB EARX.  Not quite
what I was hoping for, but hopefully it will be OK; I won't have much
chance to mess with it again for a little while.  I will be building
something else useful for testing this, sometime in the next couple of
weeks.
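
For anyone following along: ashift is the base-2 logarithm of the
smallest block size the pool will allocate on a vdev, so ashift=9
corresponds to 512-byte sectors and ashift=12 to 4096-byte sectors. A
minimal sketch of that relationship (the helper name is made up for
illustration):

```python
def ashift_for(sector_bytes):
    """Return log2(sector_bytes); sector_bytes must be a power of two."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a power of two")
    return sector_bytes.bit_length() - 1


for size in (512, 4096):
    print(size, "bytes -> ashift", ashift_for(size))
```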

--
Dan.



Re: [zfs-discuss] booting from ashift=12 pool..

2011-07-31 Thread Daniel Carosone
On Fri, Jul 29, 2011 at 05:58:49PM +0200, Hans Rosenfeld wrote:

> I'm working on a patch for grub that fixes the ashift=12 issue. 

Oh, great - and from the looks of the patch, for other values of 12 as
well :)

> I'm probably not going to fix the div-by-zero reboot.

Fair enough, if it's an existing unrelated error we no longer
expose. Perhaps it's even fixed/irrelevant for grub2, can this be
checked easily? 

> If you want to try it, the patch can be found at
> http://cr.illumos.org/view/6qc99xkh/illumos-1303-webrev/illumos-1303-webrev.patch

Any chance of providing an alternate stage1/stage2 binary I can feed
to installgrub?  When you're ready..

--
Dan.




Re: [zfs-discuss] NexentaCore 3.1 - ZFS V. 28

2011-07-31 Thread Volker A. Brandt
> I'm 99% sure N36L takes 3 TByte SATA, as we have 5 such
> systems in production using the more expensive 3 TByte Hitachis.

That is very good to hear, thank you!
 
> You can't boot from them, of course, but that's what the internal
> USB and external eSATA ports are good for.

Of course, but that's a BIOS limitation and has nothing to do
with the SATA controller, which was my first concern.


Regards -- Volker
-- 

Volker A. Brandt   Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 46
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt


Re: [zfs-discuss] NexentaCore 3.1 - ZFS V. 28

2011-07-31 Thread Richard Elling
On Jul 31, 2011, at 8:20 AM, Eugen Leitl  wrote:

> On Sun, Jul 31, 2011 at 05:19:07AM -0700, Erik Trimble wrote:
> 
>> 
>> Yes. You can attach a ZIL or L2ARC device anytime after the pool is created.
> 
> Excellent.

:-)

> 
>> Also, I think you want an Intel 320, NOT the 311, for use as a ZIL.  The  
>> 320 includes capacitors, so if you lose power, your ZIL doesn't lose  
>> data.  The 311 DOESN'T include capacitors.

FYI, these drives have not yet passed qualification testing at Nexenta.

> This is basically just a test system for hybrid pools, will
> be on UPS in production, and mostly read-only.

Please test thoroughly, prior to production.

> 
> The nice advantage of Nexenta core + napp-it is that it includes
> apache + mysql + php, which saves the need for a dedicated machine
> or virtual guest.
> 
> The appliance will host some 600k+ small (a few MBytes each) files.
> Does zfs need any special tuning for this case?

This should work fine out-of-the-box.
  -- richard

> 


Re: [zfs-discuss] NexentaCore 3.1 - ZFS V. 28

2011-07-31 Thread Eugen Leitl
On Sun, Jul 31, 2011 at 05:19:07AM -0700, Erik Trimble wrote:

>
> Yes. You can attach a ZIL or L2ARC device anytime after the pool is created.

Excellent.

> Also, I think you want an Intel 320, NOT the 311, for use as a ZIL.  The  
> 320 includes capacitors, so if you lose power, your ZIL doesn't lose  
> data.  The 311 DOESN'T include capacitors.

This is basically just a test system for hybrid pools, will
be on UPS in production, and mostly read-only.

The nice advantage of Nexenta core + napp-it is that it includes
apache + mysql + php, which saves the need for a dedicated machine
or virtual guest.

The appliance will host some 600k+ small (a few MBytes each) files.
Does zfs need any special tuning for this case?

-- 
Eugen* Leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE


Re: [zfs-discuss] NexentaCore 3.1 - ZFS V. 28

2011-07-31 Thread Eugen Leitl
On Sun, Jul 31, 2011 at 03:45:23PM +0200, Volker A. Brandt wrote:

> I would be very interested in hearing about your success, especially
> whether the Hitachi HDS5C3030ALA630 SATA-III disks work in the N36L at all.
> 
> My guess would be that the on-board SATA-II controller will not
> support more than 2TB, but I have not found a definitive statement.
> HP certainly will not sell you disks bigger than 2TB for the N36L.

I'm 99% sure N36L takes 3 TByte SATA, as we have 5 such
systems in production using the more expensive 3 TByte Hitachis.

You can't boot from them, of course, but that's what the internal
USB and external eSATA ports are good for.

-- 
Eugen* Leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE


Re: [zfs-discuss] NexentaCore 3.1 - ZFS V. 28

2011-07-31 Thread Volker A. Brandt
Hello Eugen!


> I finally got around to installing NexentaCore 3.1 along with
> napp-it and AMP on a HP N36L with 8 GBytes RAM. I'm testing
> it with 4x 1 and 1.5 TByte consumer SATA drives (Seagate)
> with raidz2 and raidz3 and like what I see so far.
> 
> Given http://opensolaris.org/jive/thread.jspa?threadID=139315
> I've ordered an Intel 311 series for ZIL/L2ARC.
> 
> I hope to use the above with 4x 3 TByte Hitachi Deskstar 5K3000 HDS5C3030ALA630
> drives, given the data from Backblaze regarding their reliability.

I would be very interested in hearing about your success, especially
whether the Hitachi HDS5C3030ALA630 SATA-III disks work in the N36L at all.

My guess would be that the on-board SATA-II controller will not
support more than 2TB, but I have not found a definitive statement.
HP certainly will not sell you disks bigger than 2TB for the N36L.


Regards -- Volker
-- 

Volker A. Brandt   Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 46
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt


Re: [zfs-discuss] NexentaCore 3.1 - ZFS V. 28

2011-07-31 Thread Erik Trimble

On 7/31/2011 4:29 AM, Eugen Leitl wrote:

> On Sat, Jul 30, 2011 at 12:56:38PM +0200, Eugen Leitl wrote:
>> apt-get update
>> apt-clone upgrade
>>
>> Any first impressions?
>
> I finally got around to installing NexentaCore 3.1 along with
> napp-it and AMP on a HP N36L with 8 GBytes RAM. I'm testing
> it with 4x 1 and 1.5 TByte consumer SATA drives (Seagate)
> with raidz2 and raidz3 and like what I see so far.
>
> Given http://opensolaris.org/jive/thread.jspa?threadID=139315
> I've ordered an Intel 311 series for ZIL/L2ARC.
>
> I hope to use the above with 4x 3 TByte Hitachi Deskstar 5K3000 HDS5C3030ALA630
> drives, given the data from Backblaze regarding their reliability.
> For that layout (8 GByte RAM, 4x 3 TByte as raidz2),
> I should go with 4 GByte for slog and 16 GByte for L2ARC, right?
>
> Is it possible to attach slog/L2ARC to a pool after the fact?
> I'd rather not wear out the small SSD with ~5 TByte avoidable
> writes.



Yes. You can attach a ZIL or L2ARC device anytime after the pool is created.

Also, I think you want an Intel 320, NOT the 311, for use as a ZIL.  The 
320 includes capacitors, so if you lose power, your ZIL doesn't lose 
data.  The 311 DOESN'T include capacitors.
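
For reference, adding the devices later is done with `zpool add`, using
the `log` keyword for a separate ZIL device and `cache` for L2ARC. The
sketch below only builds and prints the two command lines rather than
running them; the pool name `tank` and the slice names are placeholders,
not values from this thread, and the helper function is made up for
illustration.

```python
import shlex


def zpool_add_cmd(pool, role, device):
    """Return the argv for adding a log (slog) or cache (L2ARC) device."""
    if role not in ("log", "cache"):
        raise ValueError("role must be 'log' or 'cache'")
    return ["zpool", "add", pool, role, device]


# Hypothetical pool and slice names; substitute the real ones.
for role, dev in (("log", "c2t0d0s0"), ("cache", "c2t0d0s1")):
    print(shlex.join(zpool_add_cmd("tank", role, dev)))
```

Printing first makes it easy to review the exact commands before running
them as root against a live pool.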



--
Erik Trimble
Java Platform Group Infrastructure
Mailstop:  usca22-317
Phone:  x67195
Santa Clara, CA
Timezone: US/Pacific (UTC-0800)



Re: [zfs-discuss] NexentaCore 3.1 - ZFS V. 28

2011-07-31 Thread Eugen Leitl
On Sat, Jul 30, 2011 at 12:56:38PM +0200, Eugen Leitl wrote:
> 
> apt-get update
> apt-clone upgrade
> 
> Any first impressions?

I finally got around to installing NexentaCore 3.1 along with
napp-it and AMP on a HP N36L with 8 GBytes RAM. I'm testing
it with 4x 1 and 1.5 TByte consumer SATA drives (Seagate)
with raidz2 and raidz3 and like what I see so far.

Given http://opensolaris.org/jive/thread.jspa?threadID=139315
I've ordered an Intel 311 series for ZIL/L2ARC.

I hope to use the above with 4x 3 TByte Hitachi Deskstar 5K3000 HDS5C3030ALA630
drives, given the data from Backblaze regarding their reliability.
For that layout (8 GByte RAM, 4x 3 TByte as raidz2),
I should go with 4 GByte for slog and 16 GByte for L2ARC, right?

Is it possible to attach slog/L2ARC to a pool after the fact?
I'd rather not wear out the small SSD with ~5 TByte avoidable
writes.

-- 
Eugen* Leitl http://leitl.org
__
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE