Re: [ceph-users] How to repair 2 invalid pgs

2015-08-18 Thread Pierre BLONDEAU
Le 14/08/2015 15:48, Pierre BLONDEAU a écrit :
> Hi,
> 
> Yesterday, I removed 5 of the 15 OSDs from my cluster ( machine migration ).
> 
> When I stopped the processes, I had not verified that all the pgs were
> in an active state. I removed the 5 OSDs from the cluster ( ceph osd out
> osd.9 ; ceph osd crush rm osd.9 ; ceph auth del osd.9 ; ceph osd rm
> osd.9 ), and I only checked afterwards... and found two inactive pgs.
> 
> I have not formatted the filesystems of the OSDs.
> 
> The health :
> pg 7.b is stuck inactive for 86083.236722, current state inactive, last
> acting [1,2]
> pg 7.136 is stuck inactive for 86098.214967, current state inactive,
> last acting [4,7]
> 
> The recovery state :
> "recovery_state": [
> { "name": "Started\/Primary\/Peering\/WaitActingChange",
>   "enter_time": "2015-08-13 15:19:49.559965",
>   "comment": "waiting for pg acting set to change"},
> { "name": "Started",
>   "enter_time": "2015-08-13 15:19:46.492625"}],
> 
> How can I solve this problem?
> 
> Can I re-add the OSDs from their filesystems?
> 
> My cluster is used for RBD images and a small CephFS share. I can read
> all the files in CephFS, and I checked every image to see whether it
> uses these pgs. I did not find anything, but I am not sure of my script.
> 
> How do you know whether a pg is "used"?
> 
> Regards

Hello,

The names of the pgs start with "7.", so they belong to the pool with id 7?

For me, that is cephfs_meta ( the CephFS metadata pool ). I get no
response when I run "rados -p cephfs_meta ls".

Since it is a small share, it is not serious: I can restore it easily.
So I added the new OSDs from the new machine.

And that solved the problem, but I don't understand why. Does anyone
have an idea?

Regards

PS : I am running Ceph 0.80.10 (Firefly) on Debian Wheezy.

-- 
--
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
--





[ceph-users] How to repair 2 invalid pgs

2015-08-14 Thread Pierre BLONDEAU
Hi,

Yesterday, I removed 5 of the 15 OSDs from my cluster ( machine migration ).

When I stopped the processes, I had not verified that all the pgs were
in an active state. I removed the 5 OSDs from the cluster ( ceph osd out
osd.9 ; ceph osd crush rm osd.9 ; ceph auth del osd.9 ; ceph osd rm
osd.9 ), and I only checked afterwards... and found two inactive pgs.
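
( In hindsight, I should have marked each OSD out and waited for the
cluster to settle before stopping and removing it. A minimal sketch of
that sequence, untested, with osd.9 as the example : )

  ceph osd out osd.9
  # wait until rebalancing is done and every pg is active+clean again
  while ! ceph health | grep -q HEALTH_OK ; do sleep 30 ; done
  /etc/init.d/ceph stop osd.9     # stop the daemon only once healthy
  ceph osd crush rm osd.9
  ceph auth del osd.9
  ceph osd rm osd.9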

I have not formatted the filesystems of the OSDs.

The health :
pg 7.b is stuck inactive for 86083.236722, current state inactive, last
acting [1,2]
pg 7.136 is stuck inactive for 86098.214967, current state inactive,
last acting [4,7]

The recovery state :
"recovery_state": [
{ "name": "Started\/Primary\/Peering\/WaitActingChange",
  "enter_time": "2015-08-13 15:19:49.559965",
  "comment": "waiting for pg acting set to change"},
{ "name": "Started",
  "enter_time": "2015-08-13 15:19:46.492625"}],

How can I solve this problem?

Can I re-add the OSDs from their filesystems?
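
( If the data directories are intact, maybe something like this rough,
untested sketch would re-register one ; the id, keyring path, weight
and host below are guesses : )

  ceph osd create                  # should hand back a free id, e.g. 9
  ceph auth add osd.9 osd 'allow *' mon 'allow rwx' \
      -i /var/lib/ceph/osd/ceph-9/keyring
  ceph osd crush add osd.9 1.0 host=oldhost
  /etc/init.d/ceph start osd.9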

My cluster is used for RBD images and a small CephFS share. I can read
all the files in CephFS, and I checked every image to see whether it
uses these pgs. I did not find anything, but I am not sure of my script.

How do you know whether a pg is "used"?
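
( The kind of check I attempted looks roughly like this sketch. Note
that a pg named 7.x can only belong to the pool with id 7, so objects
from other pools can never map to it ; the pool name below is only an
example : )

  # map every object of a pool to its pg, then look for the stuck ones
  rados -p rbd ls | while read obj ; do
      ceph osd map rbd "$obj"
  done | egrep '\(7\.(b|136)\)'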

Regards

-- 
--
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
--
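
For reference, the output of "ceph pg 7.136 query" for one of the stuck pgs :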
{ "state": "inactive",
  "snap_trimq": "[]",
  "epoch": 15291,
  "up": [
4,
7],
  "acting": [
4,
7],
  "info": { "pgid": "7.136",
  "last_update": "0'0",
  "last_complete": "0'0",
  "log_tail": "0'0",
  "last_user_version": 0,
  "last_backfill": "MAX",
  "purged_snaps": "[]",
  "history": { "epoch_created": 4046,
  "last_epoch_started": 14458,
  "last_epoch_clean": 14458,
  "last_epoch_split": 0,
  "same_up_since": 14475,
  "same_interval_since": 14475,
  "same_primary_since": 1,
  "last_scrub": "0'0",
  "last_scrub_stamp": "2015-08-13 07:07:17.963482",
  "last_deep_scrub": "0'0",
  "last_deep_scrub_stamp": "2015-08-08 06:18:33.726150",
  "last_clean_scrub_stamp": "2015-08-13 07:07:17.963482"},
  "stats": { "version": "0'0",
  "reported_seq": "10510",
  "reported_epoch": "15291",
  "state": "inactive",
  "last_fresh": "2015-08-14 13:52:48.121254",
  "last_change": "2015-08-13 15:19:43.824578",
  "last_active": "2015-08-13 15:19:31.362363",
  "last_clean": "2015-08-13 15:19:31.362363",
  "last_became_active": "0.00",
  "last_unstale": "2015-08-14 13:52:48.121254",
  "mapping_epoch": 14472,
  "log_start": "0'0",
  "ondisk_log_start": "0'0",
  "created": 4046,
  "last_epoch_clean": 14458,
  "parent": "0.0",
  "parent_split_bits": 0,
  "last_scrub": "0'0",
  "last_scrub_stamp": "2015-08-13 07:07:17.963482",
  "last_deep_scrub": "0'0",
  "last_deep_scrub_stamp": "2015-08-08 06:18:33.726150",
  "last_clean_scrub_stamp": "2015-08-13 07:07:17.963482",
  "log_size": 0,
  "ondisk_log_size": 0,
  "stats_invalid": "0",
  "stat_sum": { "num_bytes": 0,
  "num_objects": 0,
  "num_object_clones": 0,
  "num_object_copies": 0,
  "num_objects_missing_on_primary": 0,
  "num_objects_degraded": 0,
  "num_objects_unfound": 0,
  "num_objects_dirty": 0,
  "num_whiteouts": 0,
  "num_read": 0,
  "num_read_kb": 0,
  "num_write": 0,
  "num_write_kb": 0,
  "num_scrub_errors": 0,
  "num_shallow_scrub_errors": 0,
  "num_deep_scrub_errors": 0,
  "num_objects_recovered": 0,
  "num_bytes_recovered": 0,
  "num_keys_recovered": 0,
  "num_objects_omap": 0,
  "num_objects_hit_set_archive": 0},
  "stat_cat_sum": {},
  "up": [
4,
7],
  "acting": [
4,
7],
  "up_primary": 4,
  "acting_primary": 4},
  "empty": 1,
  "dne": 0,
  "incomplete": 0,
  "last_epoch_started": 14474,
  "hit_set_history": { "current_last_update": "0'0",
  "current_last_stamp": "0.00",
  "current_info": { "begin": "0.00",
  "end": "0.00",
  "version": "0'0"},
  "history": []}},
  "peer_info": [],
  "recovery_state": [
{ "name": "Started\/Primary\/Peering\/WaitActingChange",