I have a small update to this: 

After an even closer reading of an offending pg's query I noticed the following:

"peer": "4",
"pgid": "19.6e",
"last_update": "51072'48910307",
"last_complete": "51072'48910307",
"log_tail": "50495'48906592",

The log tail seems to have lagged behind last_update/last_complete. I 
suspect this is what's causing the cluster to reject these pgs. Does anyone 
know how I can go about cleaning this up?
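
For reference, here is roughly how I am pulling those fields out of the query 
output (a minimal sketch, assuming jq is installed and that the peer entries 
carry these keys directly, as in the snippet above):

    # Dump the log pointers for pg 19.6e, primary view and each peer.
    # Sketch only; the jq paths assume the JSON layout shown above.
    ceph pg 19.6e query | jq '.info | {last_update, last_complete, log_tail}'
    ceph pg 19.6e query | jq '.peer_info[] | {peer, last_update, last_complete, log_tail}'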


Aaron 

> On Dec 1, 2014, at 8:12 PM, Aaron Bassett <aa...@five3genomics.com> wrote:
> 
> Hi all, I have a problem with some incomplete pgs. Here's the backstory: I 
> had a pool that I had accidentally left with a size of 2. On one of the osd 
> nodes, the system hdd started to fail and I attempted to rescue it by 
> sacrificing one of that node's osds. That went ok and I was able to bring the 
> node back up minus the one osd. Now I have 11 incomplete pgs. I believe 
> these are mostly from the pool that only had size 2, but I can't tell for 
> sure. I found another thread on here that talked about using 
> ceph_objectstore_tool to add or remove pg data to get out of an incomplete 
> state. 
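> 
> As a sanity check on which pool they belong to, this is roughly what I have 
> been running (a sketch; the pool id is just the number before the "." in 
> each pgid):
> 
>     ceph health detail | grep incomplete   # lists the incomplete pgids
>     ceph pg dump_stuck inactive            # stuck/inactive pgs
>     ceph osd lspools                       # map pool ids back to pool names
>     ceph osd pool get <pool-name> size     # check which pools are still size 2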
> 
> Let's start with the one pg I've been playing with; this is a loose 
> description of where I've been. First I saw that it had the missing osd in 
> "down_osds_we_would_probe" when I queried it, and some reading around that 
> told me to recreate the missing osd, so I did that. It (obviously) didn't have 
> the missing data, but it took the pg from down+incomplete to just incomplete. 
> Then I tried force_create_pg and that didn't seem to make a difference. Some 
> more googling then brought me to ceph_objectstore_tool, and I started to take 
> a closer look at the results from pg query. I noticed that the list of 
> probing osds gets longer and longer until the end of the query has something 
> like:
> 
> "probing_osds": [
>       "0",
>       "3",
>       "4",
>       "16",
>       "23",
>       "26",
>       "35",
>       "41",
>       "44",
>       "51",
>       "56”],
> 
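> (For completeness, the commands in that sequence were roughly the following; 
> a sketch from memory, not an exact transcript:)
> 
>     # Query the pg to see down_osds_we_would_probe / probing_osds
>     ceph pg 0.63b query
>     # Try to force re-creation of the pg (didn't appear to change anything)
>     ceph pg force_create_pg 0.63b
> 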
> So I took a look at those osds and noticed that some of them have data in the 
> directory for the troublesome pg and others don't. So I picked the one with 
> the *most* data and used ceph_objectstore_tool to export the pg. The export 
> was over 6 GB, so a fair amount of data is still there. I then imported it 
> (after first removing the existing pg data) into all the others in that list; 
> I've sketched the commands below the info dump. Unfortunately, the pg is still 
> incomplete. I'm not sure what my next step should be here. Here's some other 
> stuff from the query on it:
> 
> "info": { "pgid": "0.63b",
>    "last_update": "50495'8246",
>    "last_complete": "50495'8246",
>    "log_tail": "20346'5245",
>    "last_user_version": 8246,
>    "last_backfill": "MAX",
>    "purged_snaps": "[]",
>    "history": { "epoch_created": 1,
>        "last_epoch_started": 51102,
>        "last_epoch_clean": 50495,
>        "last_epoch_split": 0,
>        "same_up_since": 68312,
>        "same_interval_since": 68312,
>        "same_primary_since": 68190,
>        "last_scrub": "28158'8240",
>        "last_scrub_stamp": "2014-11-18 17:08:49.368486",
>        "last_deep_scrub": "28158'8240",
>        "last_deep_scrub_stamp": "2014-11-18 17:08:49.368486",
>        "last_clean_scrub_stamp": "2014-11-18 17:08:49.368486"},
>    "stats": { "version": "50495'8246",
>        "reported_seq": "84279",
>        "reported_epoch": "69394",
>        "state": "down+incomplete",
>        "last_fresh": "2014-12-01 23:23:07.355308",
>        "last_change": "2014-12-01 21:28:52.771807",
>        "last_active": "2014-11-24 13:37:09.784417",
>        "last_clean": "2014-11-22 21:59:49.821836",
>        "last_became_active": "0.000000",
>        "last_unstale": "2014-12-01 23:23:07.355308",
>        "last_undegraded": "2014-12-01 23:23:07.355308",
>        "last_fullsized": "2014-12-01 23:23:07.355308",
>        "mapping_epoch": 68285,
>        "log_start": "20346'5245",
>        "ondisk_log_start": "20346'5245",
>        "created": 1,
>        "last_epoch_clean": 50495,
>        "parent": "0.0",
>        "parent_split_bits": 0,
>        "last_scrub": "28158'8240",
>        "last_scrub_stamp": "2014-11-18 17:08:49.368486",
>        "last_deep_scrub": "28158'8240",
>        "last_deep_scrub_stamp": "2014-11-18 17:08:49.368486",
>        "last_clean_scrub_stamp": "2014-11-18 17:08:49.368486",
>        "log_size": 3001,
>        "ondisk_log_size": 3001,
> 
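> The export/import itself looked roughly like this (a sketch; the osd ids and 
> paths are illustrative, and each osd was stopped before running the tool 
> against it):
> 
>     # On the osd that had the most data for the pg:
>     ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-4 \
>         --journal-path /var/lib/ceph/osd/ceph-4/journal \
>         --pgid 0.63b --op export --file /tmp/0.63b.export
> 
>     # On each of the other osds in the probing list:
>     ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-16 \
>         --journal-path /var/lib/ceph/osd/ceph-16/journal \
>         --pgid 0.63b --op remove
>     ceph_objectstore_tool --data-path /var/lib/ceph/osd/ceph-16 \
>         --journal-path /var/lib/ceph/osd/ceph-16/journal \
>         --op import --file /tmp/0.63b.export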
> 
> Also in the peering section, all the peers now have the same last_update, 
> which makes me think it should just pick up and take off. 
> 
> There is another thing I'm having problems with, and I'm not sure if it's 
> related or not. I set a crush map manually, since I have a mix of ssd and 
> platter osds, and it seems to work when I set it: the cluster starts 
> rebalancing, etc. But if I do a "restart ceph-all" on all my nodes, the crush 
> map seems to revert to the one I didn't set. I don't know if it's being 
> blocked from taking effect by these incomplete pgs or if I'm missing a step 
> to get it to "stick". It also makes me wonder whether, when I'm stopping and 
> starting these osds to use ceph_objectstore_tool on them, they may be getting 
> out of sync with the cluster.
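> 
> For reference, this is roughly how I am setting the map (a sketch of the 
> procedure, not an exact transcript):
> 
>     ceph osd getcrushmap -o crush.bin
>     crushtool -d crush.bin -o crush.txt
>     # ... edit crush.txt to add the ssd/platter hierarchy ...
>     crushtool -c crush.txt -o crush.new
>     ceph osd setcrushmap -i crush.new
> 
> One thing I am wondering about (purely a guess on my part) is whether the 
> osds are re-adding themselves to their default crush location on startup; I 
> believe there is an "osd crush update on start" option in ceph.conf that 
> controls that, so something like this might be needed to make a hand-edited 
> map stick:
> 
>     [osd]
>     osd crush update on start = false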
> 
> Any insights would be greatly appreciated,
> 
> Aaron 
> 
