Hello,

On Mon, 11 Apr 2016 22:45:00 +0200 Oliver Dzombic wrote:
> Hi,
>
> currently in use:
>
> oldest:
>
> SSDs: Intel S3510 80GB

Ouch.
As in, not a speed wonder at 110MB/s writes (or 2 HDDs' worth), but at
least suitable as a journal when it comes to sync writes.
But at 45TBW dangerously low in the endurance department, I'd check their
wear-out constantly (a smartctl one-liner for that is at the bottom of
this mail)!
See the recent thread:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg28083.html

> HDD: HGST 6TB H3IKNAS600012872SE NAS

HGST should be fine.

> latest:
>
> SSDs: Kingston 120 GB SV300

Don't know them, so no idea if they are suitable when it comes to sync
writes (the fio test at the bottom of this mail would tell you), but at
64TBW they are also in danger of expiring rather quickly.

> HDDs: HGST 3TB H3IKNAS30003272SE NAS
>
> in future will be in use:
>
> SSDs: Samsung SM863 240 GB

Those should be suitable in both the sync write and the endurance
department, alas I haven't tested them myself.

> HDDs: HGST 3TB H3IKNAS30003272SE NAS and/or
> Seagate ST2000NM0023 2 TB
>
> -----
>
> It's hard to say if, and how often, the newer nodes will fail with
> OSDs getting marked down/out, compared to the old ones.
>
> We did a lot to avoid that.
>
> Without having real numbers, my feeling is/was that the newer ones
> fail much less often. But what is responsible for that is unknown.
>
> In the end, the old nodes with 2x 2.3 GHz Intel Celeron (2 cores,
> no HT) and 3x 6 TB HDDs have much less CPU power per HDD than the
> 4x 3.3 GHz Intel E3-1225v5 CPU (4 cores) with 10x 3 TB HDDs.

Yes, I'd suspect CPU exhaustion mostly here, aside from the I/O overload.
On my massively underpowered test cluster I've been able to create
OSD/MON failures by exhausting CPU or RAM, on my production clusters
never.

> So it's just too different: CPU, HDDs, RAM, even the HDD controller.
>
> I will have to make sure that the new cluster has enough hardware,
> so that I don't need to consider possible problems there.
>
> ------
>
> atop: sda/sdb == SSD journal

Since there are 12 disks, I presume those are the Kingston ones.
Frankly I wouldn't expect 10+ms waits from SSDs, but then again they are
90%ish busy when doing only 500 IOPS and writing 1.5MB/s.
This indicates to me that they are NOT handling sync writes gracefully
and are not suitable as Ceph journals.

> ------
>
> That was my first experience too. At the very start, deep-scrubs and
> even normal scrubs were driving the %WA and busyness of the HDDs to
> a flat 100%.
>
> ------
>
> I rechecked it with munin.
>
> The journal SSDs go from ~40% up to 80-90% during deep-scrub.

I have no explanation for this, as deep-scrubbing introduces no writes.

> The HDDs go from ~20% up to a more or less flat 90-100% during
> deep-scrub.

That's to be expected, again the scrub sleep factor can reduce the impact
on client I/O immensely (see the config example at the bottom of this
mail).

Christian

> At the same time, the load average goes to 16-20 (4 cores),
> while the CPU will see up to 318% I/O wait time (out of a max of 400%).
>
> ------
>
> The OSDs receive a peer timeout. Which is understandable if the
> system sees 300% I/O wait time for just long enough.
>
> ------
>
> And yes, as it seems, clusters which are very busy, especially with
> low hardware resources, need much more than the standard config
> can/will deliver. As soon as the LTS is out I will have to start
> busting my head over the available config parameters.
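On the config front, the scrub knobs I keep referring to look roughly
like this; a minimal sketch for a pre-Jewel cluster, the exact values
are something you will have to tune against your own hardware:

  [osd]
  # Sleep this many seconds between scrub chunks, takes a lot of
  # pressure off the spinners at the cost of longer scrubs.
  osd scrub sleep = 0.1
  # Only effective if the OSD disks use the CFQ I/O scheduler.
  osd disk thread ioprio class = idle
  osd disk thread ioprio priority = 7

These can also be tried at runtime without restarting anything:

  ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'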
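As for the peer timeouts above, the relevant knob would be along these
lines, but be aware that raising it (the 35 is just an example value)
only masks the overload instead of fixing it:

  [osd]
  # Default is 20 seconds; a higher value keeps overloaded but
  # otherwise healthy OSDs from being marked down prematurely.
  osd heartbeat grace = 35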
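To keep an eye on the SSD wear-out I mentioned above, something like
this will do, assuming smartmontools is installed and the drives are
not hidden behind a RAID controller (attribute names vary by vendor,
the ones below are what the Intel DC SSDs report):

  # Media_Wearout_Indicator (attribute 233) counts down from 100,
  # Total_LBAs_Written (241) shows the actual written volume.
  smartctl -A /dev/sda | egrep 'Media_Wearout_Indicator|Total_LBAs_Written'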
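And to verify sync write (journal) suitability before buying in bulk,
the usual fio test; /dev/sdX is of course a placeholder and this WILL
destroy data on that device:

  fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
      --numjobs=1 --iodepth=1 --runtime=60 --time_based \
      --group_reporting --name=journal-test

A DC level SSD should sustain thousands of IOPS here, while consumer
drives tend to collapse to a few hundred or worse.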
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Rakuten Communications
http://www.gol.com/