Re: [ceph-users] Ceph nautilus upgrade problem

2019-04-02 Thread Jan-Willem Michels

On 2-4-2019 at 12:16, Stefan Kooman wrote:

Quoting Stadsnet (jwil...@stads.net):

On 26-3-2019 16:39, Ashley Merrick wrote:

Have you upgraded any OSD's?


No, I didn't go through with the OSDs.

Just checking here: are you sure all PGs have been scrubbed while
running Luminous? The release notes [1] mention this:

"If you are unsure whether or not your Luminous cluster has completed a
full scrub of all PGs, you can check your clusters state by running:

# ceph osd dump | grep ^flags

In order to be able to proceed to Nautilus, your OSD map must include
the recovery_deletes and purged_snapdirs flags."
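
(On a cluster where the full scrub has completed, that flags line should
include both of them, i.e. roughly something like:

flags sortbitwise,recovery_deletes,purged_snapdirs

possibly with additional flags appended.)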


Yes I did check that.

No, everything went fine, exactly as Ashley predicted:

"On a test cluster I saw the same and as I upgraded / restarted the 
OSD's the PG's started to show online till it was 100%."


So I upgraded the OSDs on the first server, and exactly that share of the
PGs became active.

With every further server the same percentage was added, and with the last
one I finally got to 100% active.

So it went without problems, but it looked a bit ugly; that's why I asked.

And the new Nautilus version is really a big plus in almost every way.

Sorry for not reporting back earlier on how it went; I was not sure whether
I should bother the mailing list.


Thanks for your time.




Gr. Stefan

[1]:
http://docs.ceph.com/docs/master/releases/nautilus/#upgrading-from-mimic-or-luminous

P.S. I expect most users upgrade to Mimic first and then go to Nautilus;
it might be a better-tested upgrade path ...




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] S3 objects deleted but storage doesn't free space

2017-12-14 Thread Jan-Willem Michels


Hi there all,
Perhaps someone can help.

We tried to free some storage, so we deleted a lot of S3 objects. The bucket
also holds valuable data, so we can't delete the whole bucket.
The deletes themselves went fine, but the used storage space does not shrink.
We are expecting several TB of data to be freed.


We then learned about RGW garbage collection, so we thought we'd wait. But
even days later there was no real change.
We started "radosgw-admin gc process", which never finished and never
displayed any error or other output.
We could not find anything like a --verbose or debug option for this command,
or a log file that shows what radosgw-admin is doing while it runs.
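
A sketch of what we plan to try next, assuming the generic Ceph debug/log
options are also honoured by radosgw-admin (the log file path is just our
choice):

radosgw-admin gc process --debug-rgw=20 --debug-ms=1 --log-to-stderr=true 2> /tmp/radosgw-admin-gc.log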


We tried to change the default GC settings, which we got from an old posting.
We have put them in [global] and also tried them in [client.rgw..]
(a sketch of our current ceph.conf follows the list below):
rgw_gc_max_objs = 7877 (but also rgw_gc_max_objs = 200 and rgw_gc_max_objs = 1000)

rgw_lc_max_objs = 7877
rgw_gc_obj_min_wait = 300
rgw_gc_processor_period = 600
rgw_gc_processor_max_time = 600
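
For reference, this is roughly what we have in ceph.conf right now; the
section name is just an example for one of our gateway instances and may not
match yours:

[client.rgw.gateway1]
    rgw_gc_max_objs = 1000
    rgw_gc_obj_min_wait = 300
    rgw_gc_processor_period = 600
    rgw_gc_processor_max_time = 600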

We restarted the ceph-radosgw daemons several times, and the machines too,
over a period of days, and tried radosgw-admin gc process a few more times.
We did not find any references in the radosgw logs like "gc::delete", but we
don't know exactly what to look for.
The system itself is healthy, no errors or warnings. But it is in use (we are
loading up data) -> will GC only run when the cluster is idle?


When we count the entries with "radosgw-admin gc list | grep oid | wc -l" we get:
11:00  18.086.665 objects
13:00  18.086.665 objects
15:00  18.086.665 objects
So there is no change in the number of entries, even after hours.

When we list "radosgw-admin gc list" we get files like
 radosgw-admin gc list | more
[
{
"tag": "b5687590-473f-4386-903f-d91a77b8d5cd.7354141.21122\u",
"time": "2017-12-06 11:04:56.0.459704s",
"objs": [
{
"pool": "default.rgw.buckets.data",
"oid": 
"b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_1",

"key": "",
"instance": ""
},
{
"pool": "default.rgw.buckets.data",
"oid": 
"b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_2",

"key": "",
"instance": ""
},
{
"pool": "default.rgw.buckets.data",
"oid": 
"b5687590-473f-4386-903f-d91a77b8d5cd.44121.4__shadow_.5OtA02n_GU8TkP08We_SLrT5GL1ihuS_3",

"key": "",
"instance": ""
},

A few questions:

Who purges the gc list? Is it done on the radosgw machines, or is it
distributed across the OSDs?
Where do I have to change the default "rgw_gc_max_objs = 1000"? We tried it
everywhere: we used "tell" to change it on the OSD and MON daemons, and also
set it on the RGW endpoints, which we restarted.
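
One thing we still want to check is what value the running radosgw actually
uses, presumably via its admin socket, roughly like this (the socket path and
<name> are guesses for our setup):

ceph daemon /var/run/ceph/ceph-client.rgw.<name>.asok config show | grep rgw_gc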


We have two radosgw endpoints. Is there a lock so that only one of them acts,
or will they both try to delete? Can we release or display such a lock?


How can I debug the radosgw-admin tool? Which log files should we look in,
and what would an example message look like?


If I know an oid like the ones above, can I manually delete such an oid?
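(We were thinking of something along the lines of
"rados -p default.rgw.buckets.data rm '<oid>'", with <oid> being one of the
shadow objects above, but we are not sure whether that is safe or whether it
would leave the GC bookkeeping behind.)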

Suppose we deleted the complete bucket with "radosgw-admin bucket rm
--bucket=mybucket --purge-objects --inconsistent-index"; would that also get
rid of the GC entries that are already there?


Thanks in advance for your time,

JW Michels







___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Solved] Oeps: lost cluster with: ceph osd require-osd-release luminous

2017-09-13 Thread Jan-Willem Michels

On 9/12/17 9:13 PM, Josh Durgin wrote:


Could you post your crushmap? PGs mapping to no OSDs is a symptom of 
something wrong there.



You can stop the osds from changing position at startup with 'osd 
crush update on start = false':



Yes, I had found that. Thanks. It seems to be by design, which we didn't
understand.
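
Concretely, we understand that to mean setting, in ceph.conf on the OSD hosts,
roughly the following (assuming we read the docs right):

[osd]
    osd crush update on start = false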

We will try device classes.
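
For reference, the Luminous device-class approach we intend to try looks
roughly like this; the rule name is ours and the pool name is a placeholder:

ceph osd crush rule create-replicated replicated-ssd default host ssd
ceph osd pool set <our-ssd-pool> crush_rule replicated-ssd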


http://docs.ceph.com/docs/master/rados/operations/crush-map/#crush-location




My "Big" problem, turned out to be a cosmetic problem.
Although the whole problem looks quite ugly, and every metric is 0 where 
ever you look.

And you can't really use any ceph management anymore.

But the whole system kept functioning. Since it was a remote test site I 
didn't notice that earlier.


So the whole problem was that the MGR daemons were up, but a firewall
prevented contact with them.
The moment "ceph osd require-osd-release luminous" was set, the old
compatibility path for the cluster metrics stopped working and the cluster
switched over to the now-mandatory MGR daemons.

And then you get these kinds of all-zero readings.

So even without visible management, and with the administrator thinking it
was dead, Ceph kept running.
One could say that Ceph still managed to provide a successful upgrade path
to 12.2. Well done.


Thanks for your time

The only minor problem left is a scrub error where pg repair does nothing.

And because of BlueStore there is no easy access to the underlying files.

rados list-inconsistent-pg default.rgw.buckets.data
["15.720"]

rados list-inconsistent-obj 15.720 --format=json-pretty
No scrub information available for pg 15.720
error 2: (2) No such file or directory
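
(Possibly we first need to trigger a fresh deep scrub of that PG so the
inconsistency information is regenerated, e.g. "ceph pg deep-scrub 15.720",
and then rerun list-inconsistent-obj; we have not tried that yet.)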

Other people seem to have this problem as well:
http://tracker.ceph.com/issues/15781
I've read that perhaps a better pg repair will be built; we will wait for that.






[ceph-users] Oeps: lost cluster with: ceph osd require-osd-release luminous

2017-09-12 Thread Jan-Willem Michels

We have a Kraken cluster, at the time newly built, with BlueStore enabled.
It is 8 systems with 10 x 10 TB disks each, and each computer also has one
2 TB NVMe disk.

3 monitors, etc.
About 700 TB in total, 300 TB used. Mainly S3 object store.

Of course there is more to the story: we have one strange thing in our
cluster.
We tried to create two pools of storage, default and SSD, and created a new
CRUSH rule.

This worked without problems for months.
But when we restarted a computer / NVMe OSD, it would "forget" that the NVMe
should be attached to the SSD pool (for that particular computer).

Since we don't restart systems, we didn't notice that.
The NVMe would reappear under the default pool, and when we re-applied the
same CRUSH rule it would go back to the SSD pool.

All the while, data on the NVMe disks kept working.

Clearly something is not ideal there, and Luminous has a different approach
to separating SSD from HDD.

So we thought we'd first go to Luminous 12.2.0 and later see how to fix this.

We did the upgrade to Luminous and that went well. It requires a reboot /
restart of the OSDs, so all NVMe devices were back at default.

Reapplying the CRUSH rule brought them back to the SSD pool.
Also, while doing the upgrade, we removed from ceph.conf the line
"enable experimental unrecoverable data corrupting features = bluestore",
since in Luminous that is no longer a problem.


Everything was working fine.
In "ceph -s" we had this health warning:

    all OSDs are running luminous or later but
    require_osd_release < luminous


So I thought I would set the minimum OSD release to Luminous with:

ceph osd require-osd-release luminous

To us that seemed to be nothing more than a minimum software version required
to connect to the cluster.

The system answered back:

recovery_deletes is set

and that was it; the same second, "ceph -s" went to all zeros.

 ceph -s
  cluster:
id: 5bafad08-31b2-4716-be77-07ad2e2647eb
health: HEALTH_WARN
noout flag(s) set
Reduced data availability: 3248 pgs inactive
Degraded data redundancy: 3248 pgs unclean

  services:
mon: 3 daemons, quorum Ceph-Mon1,Ceph-Mon2,Ceph-Mon3
mgr: Ceph-Mon2(active), standbys: Ceph-Mon3, Ceph-Mon1
osd: 88 osds: 88 up, 88 in; 297 remapped pgs
 flags noout

  data:
pools:   26 pools, 3248 pgs
objects: 0 objects, 0 bytes
usage:   0 kB used, 0 kB / 0 kB avail
pgs: 100.000% pgs unknown
 3248 unknown

And before that it was something like this. The errors you see (apart from
the scrub error) would be from the upgrade / restarting, and I would expect
them to go away very quickly.


ceph -s
  cluster:
id: 5bafad08-31b2-4716-be77-07ad2e2647eb
health: HEALTH_ERR
385 pgs backfill_wait
5 pgs backfilling
135 pgs degraded
1 pgs inconsistent
1 pgs peering
4 pgs recovering
131 pgs recovery_wait
98 pgs stuck degraded
525 pgs stuck unclean
recovery 119/612465488 objects degraded (0.000%)
recovery 24/612465488 objects misplaced (0.000%)
1 scrub errors
noout flag(s) set
all OSDs are running luminous or later but 
require_osd_release < luminous


  services:
mon: 3 daemons, quorum Ceph-Mon1,Ceph-Mon2,Ceph-Mon3
mgr: Ceph-Mon2(active), standbys: Ceph-Mon1, Ceph-Mon3
osd: 88 osds: 88 up, 88 in; 387 remapped pgs
 flags noout

  data:
pools:   26 pools, 3248 pgs
objects: 87862k objects, 288 TB
usage:   442 TB used, 300 TB / 742 TB avail
pgs: 0.031% pgs not active
 119/612465488 objects degraded (0.000%)
 24/612465488 objects misplaced (0.000%)
 2720 active+clean
 385  active+remapped+backfill_wait
 131  active+recovery_wait+degraded
 5    active+remapped+backfilling
 4    active+recovering+degraded
 1    active+clean+inconsistent
 1    peering
 1    active+clean+scrubbing+deep

  io:
client:   34264 B/s rd, 2091 kB/s wr, 38 op/s rd, 48 op/s wr
recovery: 4235 kB/s, 6 objects/s

current ceph health detail

HEALTH_WARN noout flag(s) set; Reduced data availability: 3248 pgs 
inactive; Degraded data redundancy: 3248 pgs unclean

OSDMAP_FLAGS noout flag(s) set
PG_AVAILABILITY Reduced data availability: 3248 pgs inactive
    pg 15.7cd is stuck inactive for 24780.157341, current state unknown, last acting []
    pg 15.7ce is stuck inactive for 24780.157341, current state unknown, last acting []
    pg 15.7cf is stuck inactive for 24780.157341, current state unknown, last acting []

..
    pg 15.7ff is stuck inactive for 24728.059692, current state unknown, last acting []

PG_DEGRADED Degraded data redundancy: 3248 pgs unclean
    pg 15.7cd is stuck unclean for 24728.059692, current state unknown, last acting []
    pg 15.7ce is stuck unclean for 24728.059692, current state unknown, last acting []