On Mon, Dec 22, 2014 at 2:57 PM, Sean Sullivan <seapasu...@uchicago.edu>
wrote:

>  Thanks Craig!
>
> I think that this may very well be my issue with osds dropping out but I
> am still not certain as I had the cluster up for a small period while
> running rados bench for a few days without any status changes.
>

Mine were fine for a while too, through several benchmarks and a large
RadosGW import.  My problems were memory pressure plus an XFS bug, so it
took a while to manifest.  When it did, all of the ceph-osd processes on
that node would have periods of ~30 seconds with 100% CPU.  Some OSDs would
get kicked out.  Once that started, it was a downward spiral: recovery
increased the load, which kicked more OSDs out, which triggered more recovery...

Once I found the memory problem, I cronned a buffer flush, and that usually
kept things from getting too bad.
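For anyone wanting to try the same workaround, the cron job would be
something roughly like this (a hypothetical sketch -- the 15-minute interval
and the drop_caches value are my assumptions, not necessarily what was
actually run):

```
# Hypothetical /etc/cron.d/flush-caches entry (runs as root).
# sync writes out dirty buffers; echo 1 > drop_caches frees the page cache,
# which relieves memory pressure without touching dentries/inodes.
*/15 * * * * root sync && echo 1 > /proc/sys/vm/drop_caches
```

Dropping only the page cache (value 1) is the gentler option; 2 or 3 also
evict dentries and inodes, which can hurt OSD performance.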

I was able to see on the CPU graphs that CPU was increasing before the
problems started.  Once CPU got close to 100% usage on all cores, that's
when the OSDs started dropping out.  Hard to say if it was the CPU itself,
or if the CPU was just a symptom of the memory pressure plus XFS bug.




> The real big issue that I have is the radosgw one currently. After I
> figure out the root cause of the slow radosgw performance and correct that,
> it should hopefully buy me enough time to figure out the osd slow issue.
>
> It just doesn't make sense that I am getting 8mbps per client no matter 1
> or 60 clients while rbd and rados shoot well above 600MBs (above 1000 as
> well).
>

That is strange.  I was able to get >300 Mbps per client, on a 3 node
cluster with GigE.  I expected that each client would saturate the GigE on
their own, but 300 Mbps is more than enough for now.

I am using the Ceph apache and fastcgi module, but otherwise it's a pretty
standard apache setup.  My RadosGW processes are using a fair amount of
CPU, but as long as you have some idle CPU, that shouldn't be the
bottleneck.
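The general shape of that setup, for reference, is a vhost that hands
everything to the radosgw FastCGI socket.  Paths, hostname, and socket
location below are placeholders, not my exact config:

```
<VirtualHost *:80>
    ServerName gateway.example.com

    # Hand all requests to the radosgw process via its FastCGI socket
    # (socket path is an assumption; match it to rgw socket path in ceph.conf).
    FastCgiExternalServer /var/www/s3gw.fcgi -socket /var/run/ceph/radosgw.sock

    RewriteEngine On
    RewriteRule ^/(.*) /s3gw.fcgi?%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

    <Directory /var/www>
        Options +ExecCGI
        AllowOverride All
    </Directory>
</VirtualHost>
```

The rewrite rule's HTTP_AUTHORIZATION bit matters; without it, signed S3
requests fail authentication because Apache strips the header.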




>
> May I ask how you are monitoring your clusters logs? Are you just using
> rsyslog or do you have a logstash type system set up? Load wise I do not
> see a spike until I pull an osd out of the cluster or stop then start an
> osd without marking nodown.
>

I'm monitoring the cluster with Zabbix, and that gives me pretty much the
same info that I'd get in the logs.  I'm planning to start pushing the
logs to Logstash as soon as my Logstash setup can handle the extra load.
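As an aside, if you do go the Zabbix/Logstash route, the OSD counters are
easy to scrape out of `ceph osd stat`.  A tiny parser might look like this
(the sample line is made up for illustration, and real output varies a bit
by Ceph version):

```python
import re

def parse_osd_stat(line):
    """Pull (total, up, in) OSD counts out of a 'ceph osd stat' line."""
    m = re.search(r'(\d+)\s+osds:\s+(\d+)\s+up,\s+(\d+)\s+in', line)
    if m is None:
        raise ValueError("unrecognized osd stat line: %r" % line)
    return tuple(int(x) for x in m.groups())

# Example line in the Firefly-era format (fabricated for illustration).
sample = "osdmap e1234: 30 osds: 28 up, 29 in"
print(parse_osd_stat(sample))  # -> (30, 28, 29)
```

Alerting on up < total catches the flapping described above before clients
notice.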


>
> I do think that CPU is probably the cause of the osd slow issue though as
> it makes the most logical sense. Did you end up dropping ceph and moving to
> zfs or did you stick with it and try to mitigate it via file flusher/ other
> tweaks?
>
>
I'm still on Ceph.  I worked around the memory pressure by reformatting my
XFS filesystems to use regular sized inodes.  It was a rough couple of
months, but everything has been stable for the last two months.
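If anyone else needs to do the reformat: the fix amounts to recreating each
OSD filesystem without the oversized-inode option.  A rough sketch of the
commands, assuming the large 2048-byte inodes were the culprit (the device
path and mountpoint are placeholders -- drain and remove the OSD from the
cluster before touching the disk):

```
# Hypothetical example only.  /dev/sdX1 and ceph-NN are placeholders.
# Default 256-byte inodes instead of -i size=2048.
mkfs.xfs -f -i size=256 /dev/sdX1
mount -o noatime /dev/sdX1 /var/lib/ceph/osd/ceph-NN
```

Then let the OSD backfill before moving on to the next one.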

I do still want to use ZFS on my OSDs.  It's got all the features of BtrFS,
with the extra feature of being production ready.  It's just not production
ready in Ceph yet.  It's coming along nicely though, and I hope to reformat
one node to be all ZFS sometime next year.
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
