Re: [ceph-users] [Ceph-community] Noobie question about OSD fail

2016-07-27 Thread Samuel Just
osd min down reports = 2

Set that to 1?
-Sam
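
For context: with only two OSDs in the cluster, a failed OSD has at most one
surviving peer that can report it down, so a down-report threshold of 2 can
never be met. The monitors then fall back to 'mon osd report timeout' (900
seconds by default), which matches the ~15 minute delay reported below; peer
heartbeat reports would otherwise get the OSD marked down within tens of
seconds. Note also that the documented option names carry a 'mon' prefix
(mon osd report timeout, mon osd min down reporters), so the unprefixed
spellings in the [mon] section below may not match any real option, which
could explain why adjusting them had no effect.

A minimal sketch of the change, assuming a Jewel-era release (check the exact
option names against your version's documentation):

    # ceph.conf on the monitor nodes: accept a single down report
    # from a single reporting OSD
    [mon]
    mon osd min down reports = 1
    mon osd min down reporters = 1

    # apply at runtime without restarting the monitors
    ceph tell mon.* injectargs '--mon-osd-min-down-reports 1 --mon-osd-min-down-reporters 1'

    # verify the running value (run on node1; the mon name is taken
    # from the thread below)
    ceph daemon mon.node1 config show | grep min_down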

On Wed, Jul 27, 2016 at 10:24 AM, Patrick McGarry  wrote:
> Moving this to ceph-user.
>
>
> On Wed, Jul 27, 2016 at 8:36 AM, Kostya Velychkovsky
>  wrote:
>> Hello. I have a test Ceph cluster with 5 nodes: 3 MONs and 2 OSDs.
>>
>> This is my ceph.conf
>>
>> [global]
>> fsid = 714da611-2c40-4930-b5b9-d57e70d5cf7e
>> mon_initial_members = node1
>> mon_host = node1,node3,node4
>>
>> auth_cluster_required = cephx
>> auth_service_required = cephx
>> auth_client_required = cephx
>> osd_pool_default_size = 2
>> public_network = X.X.X.X/24
>>
>> [mon]
>> osd report timeout = 15
>> osd min down reports = 2
>>
>> [osd]
>> mon report interval max = 30
>> mon heartbeat interval = 15
>>
>>
>> So, when I run failure tests and hard-reset one OSD node, there is a long
>> delay (~15 minutes) before Ceph marks this OSD down,
>>
>> and ceph -s reports that the cluster is OK:
>> ---
>> cluster 714da611-2c40-4930-b5b9-d57e70d5cf7e
>>  health HEALTH_OK
>>  monmap e5: 3 mons at 
>> election epoch 272, quorum 0,1,2 node1,node3,node4
>>  osdmap e90: 2 osds: 2 up, 2 in
>> ---
>> Only after ~15 minutes do the mon nodes mark this OSD down and change the
>> state of the cluster:
>> ---
>>  osdmap e86: 2 osds: 1 up, 2 in; 64 remapped pgs
>> flags sortbitwise
>>   pgmap v3927: 64 pgs, 1 pools, 10961 MB data, 2752 objects
>> 22039 MB used, 168 GB / 189 GB avail
>> 2752/5504 objects degraded (50.000%)
>>   64 active+undersized+degraded
>> ---
>>
>> I tried to adjust 'osd report timeout' as well, but got the same result.
>>
>> Can you please help me tune my cluster to decrease this reaction time?
>>
>> --
>> Best Regards
>>
>> Kostiantyn Velychkovsky
>>
>> ___
>> Ceph-community mailing list
>> ceph-commun...@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
>>
>
>
>
> --
>
> Best Regards,
>
> Patrick McGarry
> Director Ceph Community || Red Hat
> http://ceph.com  ||  http://community.redhat.com
> @scuttlemonkey || @ceph
> ___
> Ceph-community mailing list
> ceph-commun...@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

