Adding ceph-users.

On Mon, Sep 7, 2015 at 11:31 PM, Vickey Singh <vickey.singh22...@gmail.com> wrote:
> On Mon, Sep 7, 2015 at 10:04 PM, Udo Lembke <ulem...@polarzone.de> wrote:
>
>> Hi Vickey,
>
> Thanks for your time in replying to my problem.
>
>> I had the same rados bench output after changing the motherboard of the
>> monitor node with the lowest IP... Due to the new mainboard, I assume the
>> hardware clock was wrong during startup. Ceph health showed no errors, but
>> none of the VMs could do any I/O (very high load on the VMs, but no
>> traffic). I stopped that mon, but it didn't change anything; I had to
>> restart all the other mons to get I/O again. After that I started the
>> first mon as well (with the right time now) and everything worked fine
>> again...
>
> Thanks, I will try restarting all OSDs / mons and report back whether that
> solves my problem.
>
>> Another possibility:
>> Do you use journals on SSDs? Perhaps the SSDs stall on writes because of
>> garbage collection?
>
> No, I don't have journals on SSDs; they live on the same disks as the OSDs.
>
>> Udo
>>
>> On 07.09.2015 16:36, Vickey Singh wrote:
>>
>> Dear Experts
>>
>> Can someone please help me figure out why my cluster is not able to write
>> data? See the output below: cur MB/s is 0 and avg MB/s keeps decreasing.
>>
>> Ceph Hammer 0.94.2
>> CentOS 6 (3.10.69-1)
>>
>> Ceph status says ops are blocked. I have checked everything I know of:
>>
>> - System resources (CPU, network, disk, memory) -- all normal
>> - 10G network for public and cluster traffic -- no saturation
>> - All disks are physically healthy
>> - No messages in /var/log/messages or dmesg
>> - Tried restarting the OSDs that are blocking operations (roughly as in
>>   the sketch below) -- no luck
>> - Tried writing through RBD and rados bench; both show the same problem
>>
>> Please help me fix this problem.
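>>
>> For reference, this is more or less how I have been looking at the
>> blocked requests -- a sketch rather than an exact transcript; "osd.3" is
>> just a placeholder id taken from the "ceph health detail" output:
>>
>> # which requests are blocked, and which OSDs they are sitting on
>> ceph health detail
>>
>> # on the node hosting the implicated OSD: what the in-flight ops are
>> # currently waiting for (subops, rw locks, journal, ...)
>> ceph daemon osd.3 dump_ops_in_flight
>>
>> # recently completed slow ops, with per-stage timestamps
>> ceph daemon osd.3 dump_historic_ops
>>
>> # slow-request warnings in that OSD's log
>> grep 'slow request' /var/log/ceph/ceph-osd.3.log | tail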
>>
>> # rados bench -p rbd 60 write
>>  Maintaining 16 concurrent writes of 4194304 bytes for up to 60 seconds or 0 objects
>>  Object prefix: benchmark_data_stor1_1791844
>>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>     0       0         0         0         0         0         -         0
>>     1      16       125       109   435.873       436  0.022076 0.0697864
>>     2      16       139       123   245.948        56  0.246578 0.0674407
>>     3      16       139       123   163.969         0         - 0.0674407
>>     4      16       139       123   122.978         0         - 0.0674407
>>     5      16       139       123    98.383         0         - 0.0674407
>>     6      16       139       123   81.9865         0         - 0.0674407
>>     7      16       139       123   70.2747         0         - 0.0674407
>>     8      16       139       123   61.4903         0         - 0.0674407
>>     9      16       139       123   54.6582         0         - 0.0674407
>>    10      16       139       123   49.1924         0         - 0.0674407
>>    11      16       139       123   44.7201         0         - 0.0674407
>>    12      16       139       123   40.9934         0         - 0.0674407
>>    13      16       139       123   37.8401         0         - 0.0674407
>>    14      16       139       123   35.1373         0         - 0.0674407
>>    15      16       139       123   32.7949         0         - 0.0674407
>>    16      16       139       123   30.7451         0         - 0.0674407
>>    17      16       139       123   28.9364         0         - 0.0674407
>>    18      16       139       123   27.3289         0         - 0.0674407
>>    19      16       139       123   25.8905         0         - 0.0674407
>> 2015-09-07 15:54:52.694071 min lat: 0.022076 max lat: 0.46117 avg lat: 0.0674407
>>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>    20      16       139       123    24.596         0         - 0.0674407
>>    21      16       139       123   23.4247         0         - 0.0674407
>>    22      16       139       123     22.36         0         - 0.0674407
>>    23      16       139       123   21.3878         0         - 0.0674407
>>    24      16       139       123   20.4966         0         - 0.0674407
>>    25      16       139       123   19.6768         0         - 0.0674407
>>    26      16       139       123     18.92         0         - 0.0674407
>>    27      16       139       123   18.2192         0         - 0.0674407
>>    28      16       139       123   17.5686         0         - 0.0674407
>>    29      16       139       123   16.9628         0         - 0.0674407
>>    30      16       139       123   16.3973         0         - 0.0674407
>>    31      16       139       123   15.8684         0         - 0.0674407
>>    32      16       139       123   15.3725         0         - 0.0674407
>>    33      16       139       123   14.9067         0         - 0.0674407
>>    34      16       139       123   14.4683         0         - 0.0674407
>>    35      16       139       123   14.0549         0         - 0.0674407
>>    36      16       139       123   13.6645         0         - 0.0674407
>>    37      16       139       123   13.2952         0         - 0.0674407
>>    38      16       139       123   12.9453         0         - 0.0674407
>>    39      16       139       123   12.6134         0         - 0.0674407
>> 2015-09-07 15:55:12.697124 min lat: 0.022076 max lat: 0.46117 avg lat: 0.0674407
>>   sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
>>    40      16       139       123   12.2981         0         - 0.0674407
>>    41      16       139       123   11.9981         0         - 0.0674407
>>
>>     cluster 86edf8b8-b353-49f1-ab0a-a4827a9ea5e8
>>      health HEALTH_WARN
>>             1 requests are blocked > 32 sec
>>      monmap e3: 3 mons at {stor0111=10.100.1.111:6789/0,stor0113=10.100.1.113:6789/0,stor0115=10.100.1.115:6789/0}
>>             election epoch 32, quorum 0,1,2 stor0111,stor0113,stor0115
>>      osdmap e19536: 50 osds: 50 up, 50 in
>>       pgmap v928610: 2752 pgs, 9 pools, 30476 GB data, 4183 kobjects
>>             91513 GB used, 47642 GB / 135 TB avail
>>                 2752 active+clean
>>
>> Tried using RBD:
>>
>> # dd if=/dev/zero of=file1 bs=4K count=10000 oflag=direct
>> 10000+0 records in
>> 10000+0 records out
>> 40960000 bytes (41 MB) copied, 24.5529 s, 1.7 MB/s
>>
>> # dd if=/dev/zero of=file1 bs=1M count=100 oflag=direct
>> 100+0 records in
>> 100+0 records out
>> 104857600 bytes (105 MB) copied, 1.05602 s, 9.3 MB/s
>>
>> # dd if=/dev/zero of=file1 bs=1G count=1 oflag=direct
>> 1+0 records in
>> 1+0 records out
>> 1073741824 bytes (1.1 GB) copied, 293.551 s, 3.7 MB/s
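>
> Following Udo's clock theory, I will also rule out monitor clock skew
> before restarting everything -- roughly along these lines (a sketch; the
> mon names stor0111/stor0113/stor0115 come from my "ceph status" output,
> and the sysvinit invocation is the usual one for Hammer on CentOS 6):
>
> # clock skew between mons shows up in health detail as
> # "clock skew detected on mon.<name>"
> ceph health detail
>
> # on each mon host: is NTP actually syncing?
> ntpq -p
>
> # restart a mon on its own host (sysvinit)
> /etc/init.d/ceph restart mon.stor0111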
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com