The cluster is currently recovering from a failed disk:
    cluster 1604ec7a-6ceb-42fc-8c68-0a7896c4e120
     health HEALTH_WARN 520 pgs backfill; 378 pgs degraded; 1 pgs recovering; 2 pgs recovery_wait; 523 pgs stuck unclean; 84 requests are blocked > 32 sec; recovery 4324271/37841102 objects degraded (11.427%); 1/18338660 unfound (0.000%); noscrub,nodeep-scrub flag(s) set
     monmap e2: 2 mons at {ceph0c=10.193.0.6:6789/0,ceph1c=10.193.0.7:6789/0}, election epoch 86, quorum 0,1 ceph0c,ceph1c
     osdmap e13266: 15 osds: 15 up, 15 in
            flags noscrub,nodeep-scrub
      pgmap v5354951: 2592 pgs, 18 pools, 15071 GB data, 17908 kobjects
            27912 GB used, 27951 GB / 55864 GB avail
            4324271/37841102 objects degraded (11.427%); 1/18338660 unfound (0.000%)
                2069 active+clean
                 143 active+remapped+wait_backfill
                 208 active+degraded+wait_backfill
                   2 active+recovery_wait
                   1 active+recovering+degraded
                 169 active+degraded+remapped+wait_backfill
  client io 1024 B/s rd, 0 B/s wr, 1 op/s
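
As a sanity check on the numbers above, here is a minimal Python sketch that tallies the PG state lines and compares them with the counts in the health line. The summary text is copied from the output above; the helper names (parse_pg_summary, count_with_flag) are just illustrative, not part of any Ceph tooling.

```python
import re

# PG state summary copied from the `ceph -s` output above.
PG_SUMMARY = """\
2069 active+clean
 143 active+remapped+wait_backfill
 208 active+degraded+wait_backfill
   2 active+recovery_wait
   1 active+recovering+degraded
 169 active+degraded+remapped+wait_backfill
"""


def parse_pg_summary(text):
    """Return {state_string: count} for each '<count> <state>' line."""
    counts = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+([a-z_+]+)\s*$", line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts


def count_with_flag(counts, flag):
    """Sum the PGs whose state string contains the given flag."""
    return sum(n for state, n in counts.items() if flag in state.split("+"))


counts = parse_pg_summary(PG_SUMMARY)
total = sum(counts.values())
unclean = total - counts.get("active+clean", 0)

print("total pgs:    ", total)
print("not clean:    ", unclean)
print("degraded:     ", count_with_flag(counts, "degraded"))
print("wait_backfill:", count_with_flag(counts, "wait_backfill"))
```

The totals line up with the health line: 2592 PGs overall, 523 not clean, 378 degraded, and 520 waiting for backfill.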


I dropped radosgw-agent --num-processes down to 1, so I can see which bucket is having the problem.


*Craig Lewis*
Senior Systems Engineer
Office +1.714.602.1309
Email cle...@centraldesktop.com

*Central Desktop. Work together in ways you never thought possible.*
Connect with us Website <http://www.centraldesktop.com/> | Twitter <http://www.twitter.com/centraldesktop> | Facebook <http://www.facebook.com/CentralDesktop> | LinkedIn <http://www.linkedin.com/groups?gid=147417> | Blog <http://cdblog.centraldesktop.com/>

On 4/4/14 12:44, Jean-Charles Lopez wrote:
What is the status of your PGs on the slave zone side?

A down or stale PG could definitely cause this.

Maybe a quick ceph -s and ceph health detail would help locate the PG with a problem. That should then let you run the right ceph pg {pgid} query command to find out which OSD is causing it.
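
A rough Python sketch of that workflow, in case it helps: it pulls PG ids out of ceph health detail text and builds the matching ceph pg {pgid} query command for each. The sample lines are made up for illustration (they are not from the cluster in this thread), and the exact wording of ceph health detail output is an assumption that may differ between releases. The function names are hypothetical too.

```python
import re

# Illustrative lines only, NOT taken from the cluster in this thread.
# The per-PG line format is an assumption about `ceph health detail`.
SAMPLE_HEALTH_DETAIL = """\
HEALTH_WARN 2 pgs degraded; 2 pgs stuck unclean
pg 1.2a is stuck unclean for 600.5, current state active+degraded, last acting [3,7]
pg 1.3f is stuck unclean for 120.0, current state active+recovery_wait, last acting [3,9]
"""

PG_LINE = re.compile(
    r"^pg (\d+\.[0-9a-f]+) .*current state (\S+), last acting \[([\d,]*)\]"
)


def stuck_pgs(text):
    """Return one dict per 'pg <id> ...' line: pgid, state, acting OSD ids."""
    pgs = []
    for line in text.splitlines():
        m = PG_LINE.match(line)
        if m:
            acting = [int(x) for x in m.group(3).split(",") if x]
            pgs.append({"pgid": m.group(1), "state": m.group(2), "acting": acting})
    return pgs


def pg_query_command(pgid):
    """Build (but do not run) the `ceph pg {pgid} query` command."""
    return ["ceph", "pg", pgid, "query"]


pgs = stuck_pgs(SAMPLE_HEALTH_DETAIL)
for pg in pgs:
    print(pg["pgid"], pg["state"], pg["acting"], " ".join(pg_query_command(pg["pgid"])))
```

On a real cluster the text would come from running ceph health detail, and the printed commands could be passed to subprocess.run. An OSD that shows up in the acting set of every stuck PG would be the first one to look at.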

JC

On Friday, April 4, 2014, Craig Lewis <cle...@centraldesktop.com> wrote:

    I've been seeing this warning on ceph -w for a while:
    2014-04-04 11:26:29.438992 osd.3 [WRN] 84 slow requests, 1
    included below; oldest blocked for > 90124.336765 secs
    2014-04-04 11:26:29.438996 osd.3 [WRN] slow request 1920.199044
    seconds old, received at 2014-04-04 10:54:29.239906:
    osd_op(client.45483332.0:79 .dir.us-west-1.51941060.1 [call
    rgw.bucket_list] 11.7c96a483 e13266) v4 currently waiting for
    missing object
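
To make the warning easier to read, here is a small Python sketch that splits the slow request line into its fields. The line is the one quoted above with the email wrapping undone; the regular expression is only a guess at the layout based on this single sample, not a documented log format. Reading 11.7c96a483 as a pool id followed by an object hash is also an assumption.

```python
import re

# The second warning from the `ceph -w` output above, unwrapped.
LINE = (
    "2014-04-04 11:26:29.438996 osd.3 [WRN] slow request 1920.199044 "
    "seconds old, received at 2014-04-04 10:54:29.239906: "
    "osd_op(client.45483332.0:79 .dir.us-west-1.51941060.1 [call "
    "rgw.bucket_list] 11.7c96a483 e13266) v4 currently waiting for "
    "missing object"
)

# Pattern inferred from this one sample line (an assumption).
SLOW_REQUEST = re.compile(
    r"(?P<osd>osd\.\d+) \[WRN\] slow request (?P<age>[\d.]+) seconds old, "
    r"received at (?P<received>[\d-]+ [\d:.]+): "
    r"osd_op\((?P<client>\S+) (?P<object>\S+) \[(?P<op>[^\]]+)\] "
    r"(?P<pool>\d+)\.(?P<hash>[0-9a-f]+) e(?P<epoch>\d+)\) "
    r"\S+ currently (?P<status>.+)$"
)

m = SLOW_REQUEST.search(LINE)
fields = m.groupdict()
age_minutes = float(fields["age"]) / 60

print("reporting OSD:", fields["osd"])
print("object:       ", fields["object"])
print("operation:    ", fields["op"])
print("pool id:      ", fields["pool"])
print("status:       ", fields["status"])
print("age (minutes): %.1f" % age_minutes)
```

The object name starts with .dir., which suggests a bucket index object, and the request has been waiting for about 32 minutes. If the name of pool 11 is known (it is not given in this thread), ceph osd map {pool-name} {object-name} should show which PG and OSDs that object maps to.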

    It appears to be causing problems for radosgw-agent (this warning
    is in the slave zone).

    Requests blocked for more than a day are a bit of a problem.  Is
    there anything I can do about this?





--
Sent while moving
Pardon my French and any spelling &| grammar glitches


_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
