Another strange thing I'm seeing is that two of the nodes in the cluster
have some OSDs with almost no activity. If I watch top long enough I'll
eventually see CPU utilization on these OSDs, but for the most part they sit
at 0% CPU utilization. I'm not sure whether this is expected behavior. I have
another cluster running the same version of Ceph that has the same symptom,
but the OSDs in our Jewel cluster always show activity.
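
Rather than eyeballing top, one option is to sample the CPU time of each OSD
process over a fixed window and compare the numbers between nodes. The
following is only a rough sketch: it assumes a Linux host with /proc mounted
and that the OSD daemons run as processes named ceph-osd. The function names
(parse_cpu_ticks, cpu_percent, osd_pids, sample) are made up for illustration
and are not part of any Ceph tooling.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so cut at the LAST ')' before splitting on whitespace.
    fields = stat_line[stat_line.rindex(")") + 2:].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU used over the interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def osd_pids():
    """Map pid -> command line for every running process named ceph-osd."""
    pids = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                argv = f.read().split(b"\0")
        except OSError:
            continue  # process exited while we were scanning
        if os.path.basename(argv[0]) == b"ceph-osd":
            pids[int(entry)] = b" ".join(argv).decode(errors="replace").strip()
    return pids


def read_ticks(pid):
    with open(f"/proc/{pid}/stat") as f:
        return parse_cpu_ticks(f.read())


def sample(interval_s=60):
    """Print average CPU use per ceph-osd process over interval_s seconds."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    procs = osd_pids()
    before = {}
    for pid in procs:
        try:
            before[pid] = read_ticks(pid)
        except OSError:
            pass  # process went away before the first reading
    time.sleep(interval_s)
    for pid in sorted(before):
        try:
            pct = cpu_percent(before[pid], read_ticks(pid), interval_s, ticks_per_s)
        except OSError:
            continue  # process went away during the window
        print(f"{pct:6.1f}%  pid {pid}  {procs[pid]}")
```

Calling sample(300) on each node would give a five-minute average per OSD,
which should make it easier to tell a mostly idle OSD from one that is only
busy in short bursts.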


John Petrini
Platforms Engineer


On Mon, Dec 18, 2017 at 11:51 AM, John Petrini <jpetr...@coredial.com>
wrote:

> Hi David,
>
> Thanks for the info. The controller in the server (PERC H730) was just
> replaced and the battery is at full health. Prior to replacing the
> controller I was seeing very high iowait when running iostat, but I no
> longer see that behavior - just apply latency when running ceph osd perf.
> Since there's no iowait, I believe the latency is not being introduced
> by the hardware, though I'm not ruling it out completely. I'd like to
> know what I can do to get a better understanding of what the OSD
> processes are so busy doing, because they are working much harder on
> this server than the others.
>
> On Thu, Dec 14, 2017 at 11:33 AM, David Turner <drakonst...@gmail.com>
> wrote:
>
>> We show high disk latencies on a node when the controller's cache battery
>> dies.  This is assuming that you're using a controller with cache enabled
>> for your disks.  In any case, I would look at the hardware on the server.
>>
>> On Thu, Dec 14, 2017 at 10:15 AM John Petrini <jpetr...@coredial.com>
>> wrote:
>>
>>> Anyone have any ideas on this?
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>