Bill Kenworthy <billk <at> iinet.net.au> writes:

> > The main thing keeping me away from CephFS is that it has no mechanism
> > for resolving silent corruption.  Btrfs underneath it would obviously
> > help, though not for failure modes that involve CephFS itself.  I'd
> > feel a lot better if CephFS had some way of determining which copy was
> > the right one other than "the master server always wins."


The "Giant" version 0.87 is a major release with many new fixes;
it may have the features you need. Currently the ongoing releases are
up to : v0.91. The readings look promissing, but I'll agree it
needs to be tested with non-critical data.

http://ceph.com/docs/master/release-notes/#v0-87-giant

http://ceph.com/docs/master/release-notes/#notable-changes


> Forget ceph on btrfs for the moment - the COW kills it stone dead after
> real use.  When running a small handful of VMs on a raid1 with ceph -
> sloooooooooooow :)

I'm staying away from VMs. It's Spark on top of Mesos I'm after. Maybe
Docker or another container solution down the road.

I read where some are using an SSD with RAID 1 and bcache to improve
performance and stability a bit. I do not want to add an SSD to the mix right
now, as the three development nodes all have 32 GB of RAM.
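For reference, the basic bcache setup they describe seems to amount to tying
a cache device to a backing device with make-bcache. A rough Python sketch of
that step only, with the device names as placeholders I made up:

    import subprocess

    SSD = "/dev/sdc"       # placeholder: SSD partition used as the cache device
    BACKING = "/dev/md0"   # placeholder: the RAID 1 array as the backing device

    # Creating the cache and backing device in one make-bcache call
    # attaches them automatically; the filesystem then goes on
    # /dev/bcache0 instead of the raw array.
    subprocess.run(["make-bcache", "-C", SSD, "-B", BACKING], check=True)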



> You can turn off COW and go single on btrfs to speed it up but bugs in
> ceph and btrfs lose data real fast!

Interesting idea, since I'll have RAID 1 underneath each node. I'll need to
dig into it a bit more.
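If I do try it, something like this is roughly what I have in mind for the
OSD data directory. Just a sketch; the path is an assumption on my part and
would need adjusting per node:

    import subprocess

    OSD_DIR = "/var/lib/ceph/osd"   # assumed location of the OSD data

    # chattr +C marks the directory NOCOW, so files created under it
    # afterwards skip copy-on-write; it has to be set before the OSD
    # writes any data, since existing files are not converted.
    subprocess.run(["chattr", "+C", OSD_DIR], check=True)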


> ceph itself (my last setup trashed itself 6 months ago and I've given
> up!) will only work under real use/heavy loads with lots of discrete
> systems, ideally 10G network, and small disks to spread the failure
> domain.  Using 3 hosts and 2x2g disks per host wasn't near big enough :(
>  Its design means that small scale trials just wont work.

Huh. My systems are FX-8350 (8-core) processors running at 4 GHz with 32 GB
of RAM. Water coolers will allow me to crank up the speed (when/if needed) to
5 or 6 GHz. Not Intel, but not low end either.


> Its not designed for small scale/low end hardware, no matter how
> attractive the idea is :(

Supposedly there are tools to measure/monitor Ceph better now. That is
one of the things I need to research: how to manage the small cluster
better and back off the throughput/load while monitoring performance
on a variety of different tasks. Definitely not production usage.
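To start, I may just poll the cluster status myself and log it. A minimal
sketch, assuming the ceph CLI is installed and the admin keyring is readable
by the user running it (the JSON field names can differ between releases):

    import json, subprocess, time

    while True:
        # Ask the cluster for its status in JSON form.
        out = subprocess.run(["ceph", "-s", "--format", "json"],
                             capture_output=True, text=True, check=True).stdout
        status = json.loads(out)
        health = status.get("health", {})
        # 0.8x-era releases report the summary as "overall_status".
        print(time.strftime("%F %T"), health.get("overall_status", health))
        time.sleep(60)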

I certainly appreciate your Ceph experiences. I filed a bug with the
version-bump request for Giant v0.87. Did you run the 9999 version?
What versions did you experiment with?

I hope to set up Ansible to facilitate rapid installation of a variety
of Gentoo systems used for cluster or Ceph testing. That way configurations
should be able to "reboot" after bad failures.  Did the failures you
experienced with Ceph require the Gentoo/btrfs-based systems to be completely
reinstalled from scratch, or just a purge of Ceph from the disks and a
reconfiguration of Ceph?

I'm hoping to "configure ceph" in such a way that failures do not corrupt
the gentoo-btrfs installation and only require repair to ceph; so your
comments on that strategy are most welcome.




> BillK


James






