Re: [ceph-users] Uneven CPU usage on OSD nodes

f...@univ-lr.fr Sun, 22 Mar 2015 02:16:11 -0700

Hi Craig,

An uneven primaries distribution was indeed my first thought.

I should have been more explicit about the percentages in the histograms I gave, so let's look at them in more detail.

For the 27938 bench objects seen by osdmap, the hosts are distributed like this:

20904 host1
21210 host2
20835 host3
20709 host4
That's the number of times each host appears (as primary, secondary or tertiary).

The distribution is pretty even: the difference between the most and the least used host is only about 0.6% of the total entries.


If we now consider the primary host distribution, here is what we have:
7207 host1
6960 host2
6814 host3
6957 host4
That's the number of times each host appears as primary.

Once again, the distribution is correct, with less than 1.5% of the total entries between the most and the least used host as primary. I must add that a similar distribution is of course observed for the secondary and the tertiary copies.

I think we have enough samples to confirm that the CRUSH function distributes correctly. Since each host has a 25% chance of being primary, this shouldn't be the reason why we observe a higher CPU load. There must be something else...
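
The percentages above can be rechecked from the two tables. This is a minimal Python sketch, not part of the original measurements: it assumes the fourth line of each table refers to a fourth host (called host4 here), and the helper names `spread_percent` and `shares_percent` are made up for the illustration.

```python
# Per-host counts copied from the two tables above.
# Assumption: the fourth line of each table belongs to a fourth host, "host4".
all_replicas = {"host1": 20904, "host2": 21210, "host3": 20835, "host4": 20709}
primaries = {"host1": 7207, "host2": 6960, "host3": 6814, "host4": 6957}


def spread_percent(counts):
    """Gap between the most and least used host, as a % of all entries."""
    total = sum(counts.values())
    return 100.0 * (max(counts.values()) - min(counts.values())) / total


def shares_percent(counts):
    """Share of the entries held by each host, in %."""
    total = sum(counts.values())
    return {host: 100.0 * n / total for host, n in counts.items()}


for label, counts in (("all replicas", all_replicas), ("primaries", primaries)):
    print("%s: spread %.2f%%" % (label, spread_percent(counts)))
    for host, share in sorted(shares_percent(counts).items()):
        print("  %s: %.2f%%" % (host, share))
```

With these counts every host's share stays within about one percentage point of 25% in both tables, which is nowhere near a factor of two.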


I must add we run 0.87.1 Giant.

Going to a Firefly release is an option, as the phenomenon is not currently observed on comparable hardware platforms running 0.80.x.

About the memory on the hosts, 32GB is just a starting point for the tests. We'll add more later.


Frederic


Craig Lewis <cle...@centraldesktop.com> wrote on 20/03/15 23:19:

I would say you're a little light on RAM. With 4TB disks 70% full, I've seen some ceph-osd processes using 3.5GB of RAM during recovery. You'll be fine during normal operation, but you might run into issues at the worst possible time.

I have 8 OSDs per node, and 32G of RAM. I've had ceph-osd processes start swapping, and that's a great way to get them kicked out for being unresponsive.
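
As a rough back-of-the-envelope check of that worst case, here is a small Python sketch. It assumes, pessimistically, that every ceph-osd process on a node reaches the 3.5GB peak at the same time. The 15 OSDs per node figure comes from the original message quoted further down, and the function name `peak_osd_ram_gb` is made up for the illustration.

```python
def peak_osd_ram_gb(osds_per_node, peak_gb_per_osd=3.5):
    """Worst-case RAM if every ceph-osd hits its recovery peak at once."""
    return osds_per_node * peak_gb_per_osd


node_ram_gb = 32  # RAM per node in both setups discussed in this thread

for osds in (8, 15):
    need = peak_osd_ram_gb(osds)
    print("%2d OSDs: %.1f GB at peak, %.1f GB headroom out of %d GB"
          % (osds, need, node_ram_gb - need, node_ram_gb))
```

Under that assumption 8 OSDs already come close to 32GB, and 15 OSDs would need well over it, before counting the OS and page cache.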

I'm not a dev, but I can make some wild and uninformed guesses :-). The primary OSD uses more CPU than the replicas, and I suspect that you have more primaries on the hot nodes.

Since you're testing, try repeating the test on 3 OSD nodes instead of 4. If you don't want to run that test, you can generate a histogram from ceph pg dump data, and see if there are more primary OSDs (the first one in the acting array) on the hot nodes.
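
One possible way to build that histogram is sketched below in Python. It is only a sketch under stated assumptions: the input is taken to be the parsed JSON of `ceph pg dump --format json` and `ceph osd tree --format json`, and the field names used (`pg_stats`, `acting`, `nodes`, `type`, `children`, `name`) should be checked against the output of your release, since the JSON layout differs between versions. The function names and the tiny sample data are hypothetical.

```python
from collections import Counter


def osd_to_host(osd_tree):
    """Map each OSD id to its host name, from parsed `ceph osd tree` JSON."""
    mapping = {}
    for node in osd_tree["nodes"]:
        if node.get("type") == "host":
            for osd_id in node.get("children", []):
                mapping[osd_id] = node["name"]
    return mapping


def primaries_per_host(pg_dump, osd_tree):
    """Count, per host, the PGs whose first acting OSD lives on that host."""
    hosts = osd_to_host(osd_tree)
    # Assumption: some releases put "pg_stats" at the top level,
    # others nest it under "pg_map".
    pg_stats = pg_dump.get("pg_stats")
    if pg_stats is None:
        pg_stats = pg_dump.get("pg_map", {}).get("pg_stats", [])
    counts = Counter()
    for pg in pg_stats:
        acting = pg.get("acting", [])
        if acting:  # the primary is the first OSD in the acting array
            counts[hosts.get(acting[0], "unknown")] += 1
    return counts


# Tiny made-up sample standing in for the real command output.
sample_tree = {"nodes": [
    {"id": -2, "name": "host1", "type": "host", "children": [0, 1]},
    {"id": -3, "name": "host2", "type": "host", "children": [2, 3]},
    {"id": 0, "name": "osd.0", "type": "osd"},
]}
sample_dump = {"pg_stats": [
    {"pgid": "1.0", "acting": [0, 2]},
    {"pgid": "1.1", "acting": [3, 1]},
    {"pgid": "1.2", "acting": [1, 3]},
]}

histogram = primaries_per_host(sample_dump, sample_tree)
for host, count in sorted(histogram.items()):
    print("%s %s" % (host, "#" * count))
```

On a real cluster the two dictionaries would be loaded with `json.load` from the saved command output. Note that this counts placement groups, not objects, so it is a coarser view than a per-object count.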

On Wed, Mar 18, 2015 at 7:18 AM, f...@univ-lr.fr <f...@univ-lr.fr> wrote:


    Hi to the ceph-users list!

    We're setting up a new Ceph infrastructure:
    - 1 MDS admin node
    - 4 OSD storage nodes (60 OSDs)
      each of them running a monitor
    - 1 client

    Each 32GB RAM/16 cores OSD node supports 15 x 4TB SAS OSDs (XFS)
    and 1 SSD with 5GB journal partitions, all in JBOD attachment.
    Every node has 2x10Gb LACP attachment.
    The OSD nodes are freshly installed with puppet then from the
    admin node
    Default OSD weight in the OSD tree
    1 test pool with 4096 PGs

    During setup phase, we're trying to qualify the performance
    characteristics of our setup.
    Rados benchmarks are run from a client with these commands:
    rados -p pool -b 4194304 bench 60 write -t 32 --no-cleanup
    rados -p pool -b 4194304 bench 60 seq -t 32 --no-cleanup

    Each time we observed a recurring phenomenon: 2 of the 4 OSD nodes
    have twice the CPU load:
    http://www.4shared.com/photo/Ua0umPVbba/UnevenLoad.html
    (Look at the real-time %CPU and the cumulative CPU time
    per ceph-osd process.)

    And after a fresh, complete reinstall to be sure, this
    twice-as-high CPU load is still observed, but not on the same 2 nodes:
    http://www.4shared.com/photo/2AJfd1B_ba/UnevenLoad-v2.html

    Nothing obvious about the installation seems able to explain that.

    The CRUSH distribution function shows no more than 4.5%
    inequality between the 4 OSD nodes for the primary OSDs of the
    objects, and less than 3% between the hosts if we consider the
    whole acting sets for the objects used during the benchmark. These
    differences are far too small to account for a doubling of the
    CPU load, so the cause has to be elsewhere.

    I cannot be sure it has no impact on performance. Even if we have
    enough CPU core headroom, logic says it must have some effect on
    latency and also on performance.

    Would someone have any idea, or be able to reproduce the test on
    their own setup to see if this is common behavior?


    _______________________________________________
    ceph-users mailing list
    ceph-users@lists.ceph.com
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
