Hello,

Lots of similar questions have come up in the past; Google is your friend.

On Mon, 5 Jun 2017 23:59:07 -0400 Daniel K wrote:

> I've built 'my-first-ceph-cluster' with two of the 4-node, 12-drive
> Supermicro servers and dual 10Gb interfaces (one cluster, one public).
> 
> I now have 9x 36-drive Supermicro StorageServers made available to me, each
> with dual 10Gb and a single Mellanox IB/40G NIC. No 1G interfaces except
> IPMI. 2x 6-core/6-thread 1.7GHz Xeon processors (12 cores total) for 36
> drives. Currently 32GB of RAM. 36x 1TB 7.2k drives.
>
I love using IB, but with just one port per host you're likely best off
ignoring it, unless you have a converged network/switches that can make
use of it (or you run it in Ethernet mode).
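
If you do go the Ethernet-mode route, on Mellanox ConnectX-3/4 cards it's
usually just a matter of flipping the port type with mlxconfig; a rough
sketch, where the MST device name below is only an example:

  # Requires the Mellanox MFT tools; device name will differ per host.
  mst start
  mlxconfig -d /dev/mst/mt4099_pciconf0 set LINK_TYPE_P1=2   # 2 = Ethernet
  # Takes effect after a reboot/power cycle.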
 
> Early usage will be CephFS, exported via NFS and mounted on ESXi 5.5 and
> 6.0 hosts (migrating from a VMware environment), later to transition to
> qemu/kvm/libvirt using native RBD mapping. I tested iSCSI using LIO and saw
> much worse performance with the first cluster, so it seems this may be the
> better way, but I'm open to other suggestions.
> 
I've never seen a definitive solution for providing HA iSCSI on top of
Ceph, though other people here have made significant efforts.

> Considerations:
> Best practice documents indicate 0.5 CPU cores per OSD, but I have 36 drives
> and 12 cores. Would it be better to create 18x 2-drive RAID0 sets on the
> hardware RAID card to present a smaller number of larger devices to Ceph? Or
> run multiple drives per OSD?
> 
You're definitely underpowered in the CPU department, and I personally
would make RAID1s or RAID10s so you never have to re-balance when a disk
behind an OSD dies.
But if space is an issue, RAID0s would do.
OTOH, without any SSDs in the game, your HDD-only cluster is going to be
less CPU-hungry than others.
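
To put numbers on the "underpowered" point, back of the envelope with the
0.5 core per OSD rule of thumb:

  36 OSDs (one per drive)    x 0.5 core = 18 cores needed vs. 12 available
  18 OSDs (2-drive RAID set) x 0.5 core =  9 cores needed -> fits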

> There is a single 256GB SSD which I feel would be a bottleneck if I used it
> as a journal for all 36 drives, so I believe Bluestore with a journal on
> each drive would be the best option.
> 
Bluestore doesn't have journals per se, and unless you're going to wait
for Luminous I wouldn't recommend using Bluestore in production.
Hell, I won't be using it any time soon, but running anything
pre-Luminous sounds like outright channeling Murphy to smite you.

That said, what SSD is it? 
Bluestore WAL needs are rather small.
OTOH, a single SSD isn't something I'd recommend either, SPOF and all.
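
For reference, if you did end up putting the WAL (and DB) for some OSDs on
that SSD, the relevant ceph.conf knobs look roughly like the sketch below;
the sizes are purely illustrative, not a recommendation:

  [osd]
  # Per-OSD WAL/DB partition sizes when colocating them on a shared SSD.
  bluestore_block_wal_size = 1073741824    # 1 GB; the WAL really is small
  bluestore_block_db_size  = 10737418240   # 10 GB; the DB wants rather more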

I'm guessing you have no budget to improve on that gift horse?

> Is 1.7GHz too slow for what I'm doing?
> 
If you're going to have a lot of small I/Os, it probably will be.

> I like the idea of keeping the public and cluster networks separate. 

I don't, at least not on a physical level when you pay for it by losing
redundancy.
Do you have two switches, and are they MC-LAG capable (aka stackable)?
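
For what it's worth, the public/cluster split itself is just a ceph.conf
setting, so you can keep it logical (VLANs) without giving up physical
redundancy; the subnets below are made up:

  [global]
  public network  = 10.0.10.0/24   # client and MON traffic
  cluster network = 10.0.20.0/24   # OSD replication/recovery traffic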

> Any
> suggestions on which interfaces to use for what? I could theoretically push
> 36Gb/s, figuring 125MB/s for each drive, but in reality will I ever see
> that?
Not by a long shot, even with Bluestore.
With the WAL and other bits on SSD and very kind write patterns, maybe
100MB/s per drive, but IIRC there are still performance issues with the
current Bluestore code as well.
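
The raw numbers do add up, it's the drives that won't sustain them under
anything less than perfectly sequential load:

  36 drives x 125 MB/s = 4500 MB/s ~= 36 Gbit/s  (sequential best case)
  36 drives x 100 MB/s = 3600 MB/s ~= 29 Gbit/s  (kind writes, WAL on SSD)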

> Perhaps bond the two 10Gb ports and use them as the public network, and the
> 40Gb as the cluster network? Or split the 40Gb into 4x10Gb and use 3x10Gb
> bonded for each?
>
If you can actually split it up, see above: MC-LAG.
That will give you 60Gb/s, half that if a switch fails, and if it makes
you feel better, separate the cluster and public networks with VLANs.

But that will cost you in not-so-cheap switch ports, of course.
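
As a rough sketch of the host side of that (interface names and VLAN IDs
are made up, and you'd make this persistent in your distro's network
config rather than with ad-hoc iproute2 commands):

  # LACP bond across the 10G ports, one leg per MC-LAG switch
  ip link add bond0 type bond mode 802.3ad
  ip link set eth0 down; ip link set eth0 master bond0
  ip link set eth1 down; ip link set eth1 master bond0
  ip link set bond0 up

  # public and cluster networks as VLANs on top of the bond
  ip link add link bond0 name bond0.10 type vlan id 10   # public
  ip link add link bond0 name bond0.20 type vlan id 20   # cluster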

Christian
> If there is a more appropriate venue for my request, please point me in
> that direction.
> 
> Thanks,
> Dan


-- 
Christian Balzer        Network/Systems Engineer                
ch...@gol.com           Rakuten Communications
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
