On Thu, Mar 06, 2014 at 02:33:26PM +0000, Alex Adriaanse wrote:

> We’re needing to store many TBs of data (somewhere between 10TB and 50TB) on 
> a SmartOS server (or cluster of SmartOS servers). Most of this data will be 
> fairly static, and the files that will take up this space will typically 
> average around 1MB in size. Performance isn’t going to be a huge concern, 
> except that we’ll also house a database server where performance is a bit 
> more important (although two 7200RPM drives in a mirrored zpool seem to be 
> holding up currently just fine). We’ll be using zones for most virtual 
> machines on this server.
> 
> We’re looking at getting a dedicated server with a 36 hard drive bay chassis 
> to give us lots of room for growth. I’m envisioning we have two options:
> 
>   1.  Create one giant zpool that will eventually span up to ~32 hard drives 
> and dump all the files in there. When using 4TB drives in RAID10, this would 
> give us 32TB of usable storage. This would be the easiest solution because we 
> could dump most of our data in a single filesystem. However, I’m concerned 
> about scalability and reliability, as the wrong two hard drives going out at 
> the same time would kill the entire zpool, and also dealing with a ~32TB 
> filesystem may be challenging (we use zfs send/receive to replicate our data 
> to an offsite mirror server, so at some point the diffs between snapshots 
> will become huge as we collect more and more data; the initial full zfs 
> send/receive would take forever, too).

The way we do this for our Manta storage nodes is to have a single
RAIDZ2 pool with 3 vdevs (the topology ends up being 3x(9+2) with 2
spares and a slog in a 36-disk chassis).  This works well, performs
well, and has excellent durability.  If you're curious about our HW
config, see github.com/joyent/manufacturing.
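
For anyone checking the arithmetic, here is a minimal sketch of how that
topology fills a 36-bay chassis.  The function name, its parameters, and
the 4 TB disk size (borrowed from the question above, not from the Manta
configuration) are illustrative assumptions, and the capacity figure
ignores ZFS metadata and allocation overhead.

```python
def raidz_layout(vdevs, data_disks, parity_disks, spares, slog_devices, disk_tb):
    """Count the bays used by a RAIDZ pool and estimate raw data capacity.

    Hypothetical helper for illustration only; it does not talk to ZFS.
    """
    disks_per_vdev = data_disks + parity_disks
    bays_used = vdevs * disks_per_vdev + spares + slog_devices
    # Only the data disks of each vdev contribute capacity; parity does not.
    approx_usable_tb = vdevs * data_disks * disk_tb
    return {
        "disks_per_vdev": disks_per_vdev,
        "bays_used": bays_used,
        "approx_usable_tb": approx_usable_tb,
    }


# 3 RAIDZ2 vdevs of 9 data + 2 parity disks, 2 spares, 1 slog device.
# disk_tb=4 is an assumption taken from the original question.
layout = raidz_layout(vdevs=3, data_disks=9, parity_disks=2,
                      spares=2, slog_devices=1, disk_tb=4)
print(layout)
```

Each RAIDZ2 vdev tolerates two simultaneous disk failures, which is the
source of the durability mentioned above.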

I don't understand your comments on snapshots and send/recv; the size of
the pool has nothing to do with these things.  Snapshots are per-dataset
(per-"filesystem", or practically speaking per-zone).  You can also use
incremental snapshots to reduce send stream size.  None of this is
influenced in any way by the pool size; it's strictly about the amount
of data that you need consistent snapshots of.  A 32 TB pool is much
smaller than many others in production, including at Joyent.  The people
building storage systems with ZFS are accustomed to pools with hundreds
of TB or even several PB.  The problems with huge pools all stem from
managing huge numbers of disks, not from any of the things you're
worried about.
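
To make the per-dataset point concrete, here is a rough sketch that
builds the commands for an incremental replication of one dataset.  The
dataset names, snapshot names, and remote host are all hypothetical, and
the code only constructs command strings; it does not run them.  It
assumes the receiving side already holds the earlier snapshot.

```python
import shlex


def incremental_replication_commands(dataset, prev_snap, new_snap,
                                     remote_host, remote_dataset):
    """Return (snapshot_cmd, send_recv_pipeline) as shell command strings.

    Hypothetical helper: it only formats the commands for review.
    """
    # Take a new snapshot of just this dataset, not of the whole pool.
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    # -i sends only the blocks that changed between the two snapshots.
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", remote_host, "zfs", "receive", remote_dataset]
    pipeline = " | ".join(shlex.join(cmd) for cmd in (send, recv))
    return shlex.join(snapshot), pipeline


# All names below are made up for illustration.
snapshot_cmd, pipeline_cmd = incremental_replication_commands(
    dataset="zones/data",
    prev_snap="monday",
    new_snap="tuesday",
    remote_host="backup.example.com",
    remote_dataset="backup/data",
)
print(snapshot_cmd)
print(pipeline_cmd)
```

The size of the resulting stream depends on how much data changed in
that dataset between the two snapshots, not on the size of the pool.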

>   2.  We create one zpool for each group of 4 hard drives in RAID10 (with 4TB 
> hard drives this would give us 8TB of usable space per zpool). This would 
> require some forethought in our application as we’d have to split our data 
> carefully to make sure it doesn’t exceed 8TB per filesystem, but this 
> shouldn’t be a big deal.

Using multiple pools is insane.  Please don't do this.  It defeats the
purpose of pooled storage, and support for any pool other than the zones
pool in SmartOS is somewhere between poor and nonexistent.

