[ceph-users] SOLVED: How to Limit S3 Access to One Subuser

2024-09-03 Thread Ansgar Jazdzewski
Hi folks,

I found countless questions but no real solution on how to have
multiple subusers and buckets in one account while limiting access to
a bucket to just one specific subuser.

Here’s how I managed to make it work:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAllUsersButOne",
      "Effect": "Deny",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::test-a",
        "arn:aws:s3:::test-a/*"
      ],
      "NotPrincipal": {
        "AWS": "arn:aws:iam:::user/:"
      }
    }
  ]
}
```
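
To apply it, here is a minimal sketch using the AWS CLI (the endpoint URL is a
placeholder, and the policy above is assumed to be saved as policy.json):

```
aws --endpoint-url https://s3.example.com s3api put-bucket-policy \
  --bucket test-a --policy file://policy.json
```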

I hope this might be useful for others as well.

Best regards,
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: erasure-code-lrc Questions regarding repair

2024-03-07 Thread Ansgar Jazdzewski
Hi,

I somehow missed your message; thanks for your effort to raise this issue.

Ansgar

Am Di., 16. Jan. 2024 um 10:05 Uhr schrieb Eugen Block :
>
> Hi,
>
> I don't really have an answer, I just wanted to mention that I created
> a tracker issue [1] because I believe there's a bug in the LRC plugin.
> But there hasn't been any response yet.
>
> [1] https://tracker.ceph.com/issues/61861
>
> Zitat von Ansgar Jazdzewski :
>
> > hi folks,
> >
> > I am currently testing erasure-code-lrc (1) in a multi-room, multi-rack setup.
> > The idea is to be able to repair disk failures within the rack
> > itself to lower bandwidth usage.
> >
> > ```bash
> > ceph osd erasure-code-profile set lrc_hdd \
> > plugin=lrc \
> > crush-root=default \
> > crush-locality=rack \
> > crush-failure-domain=host \
> > crush-device-class=hdd \
> > mapping=__D__D__D__D \
> > layers='
> > [
> > [ "_cD_cD_cD_cD", "" ],
> > [ "cDD_", "" ],
> > [ "___cDD__", "" ],
> > [ "__cDD___", "" ],
> > [ "_cDD", "" ],
> > ]' \
> > crush-steps='[
> > [ "choose", "room", 4 ],
> > [ "choose", "rack", 1 ],
> > [ "chooseleaf", "host", 7 ],
> > ]'
> > ```
> >
> > The rule picks 4 out of 5 rooms and keeps the PG in one rack, as expected!
> >
> > However, it looks like the PG will not move to another room if the PG
> > is undersized or the entire room or rack is down!
> >
> > Questions:
> > * Am I missing something to allow LRC PGs to move across racks/rooms
> > for repair?
> > * Is it even possible to build such a 'multi-stage' crushmap?
> >
> > Thanks for your help,
> > Ansgar
> >
> > 1) https://docs.ceph.com/en/quincy/rados/operations/erasure-code-jerasure/
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Sharing our "Containerized Ceph and Radosgw Playground"

2024-02-22 Thread Ansgar Jazdzewski
Hi Folks,

We are excited to announce plans for building a larger Ceph-S3 setup.
To ensure its success, extensive testing is needed in advance.

Some of these tests don't need a full-blown Ceph cluster on hardware
but still require meeting specific logical requirements, such as a
multi-site S3 setup. To address this, we're pleased to introduce our
ceph-s3-box test environment, which you can access on GitHub:

https://github.com/hetznercloud/ceph-s3-box

In the spirit of collaboration and knowledge sharing, we've made this
testing environment publicly available today. We hope that it proves
as beneficial to you as it has been for us.

If you have any questions or suggestions, please don't hesitate to reach out.

Cheers,
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Reef 18.2.1 unable to join multi-site when rgw_dns_name is configured

2024-02-21 Thread Ansgar Jazdzewski
For the record, I tried both ways to configure it:

```
radosgw-admin zonegroup get --rgw-zonegroup="dev" | \
jq '.hostnames |= ["dev.s3.localhost"]' | \
radosgw-admin zonegroup set --rgw-zonegroup="dev" -i -
```

```
ceph config set global rgw_dns_name dev.s3.localhost
```

Am Mi., 21. Feb. 2024 um 17:34 Uhr schrieb Ansgar Jazdzewski
:
>
> Hi folks,
>
> I am just trying to set up a new Ceph S3 multisite setup, and it looks to me
> like DNS-style S3 is broken in multisite: when rgw_dns_name is
> configured, the `radosgw-admin period update --commit` from the newly
> added member will not succeed!
>
> It looks like whenever hostnames are configured, it breaks on the newly
> added cluster:
> https://docs.ceph.com/en/reef/radosgw/multisite/#setting-a-zonegroup
>
> Thanks for any advice!
> Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Reef 18.2.1 unable to join multi-site when rgw_dns_name is configured

2024-02-21 Thread Ansgar Jazdzewski
Hi folks,

I am just trying to set up a new Ceph S3 multisite setup, and it looks to me
like DNS-style S3 is broken in multisite: when rgw_dns_name is
configured, the `radosgw-admin period update --commit` from the newly
added member will not succeed!

It looks like whenever hostnames are configured, it breaks on the newly
added cluster:
https://docs.ceph.com/en/reef/radosgw/multisite/#setting-a-zonegroup
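
For context, a rough sketch of the step that fails on the new zone (the endpoint
URL and credentials below are placeholders):

```
radosgw-admin realm pull --url=https://primary.s3.example.com \
    --access-key=<ACCESS_KEY> --secret=<SECRET_KEY>
radosgw-admin period update --commit
```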

Thanks for any advice!
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] erasure-code-lrc Questions regarding repair

2024-01-15 Thread Ansgar Jazdzewski
hi folks,

I am currently testing erasure-code-lrc (1) in a multi-room, multi-rack setup.
The idea is to be able to repair disk failures within the rack
itself to lower bandwidth usage.

```bash
ceph osd erasure-code-profile set lrc_hdd \
plugin=lrc \
crush-root=default \
crush-locality=rack \
crush-failure-domain=host \
crush-device-class=hdd \
mapping=__D__D__D__D \
layers='
[
[ "_cD_cD_cD_cD", "" ],
[ "cDD_", "" ],
[ "___cDD__", "" ],
[ "__cDD___", "" ],
[ "_cDD", "" ],
]' \
crush-steps='[
[ "choose", "room", 4 ],
[ "choose", "rack", 1 ],
[ "chooseleaf", "host", 7 ],
]'
```
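
For reference, a minimal sketch (pool name and PG count are assumptions) of how I
attach the profile to a pool and inspect the generated CRUSH rule:

```bash
# create an erasure-coded pool that uses the LRC profile defined above
ceph osd pool create lrc_test 256 256 erasure lrc_hdd
# dump the CRUSH rule that was created for the pool to verify the placement steps
ceph osd crush rule dump lrc_test
```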

The rule picks 4 out of 5 rooms and keeps the PG in one rack, as expected!

However, it looks like the PG will not move to another room if the PG
is undersized or the entire room or rack is down!

Questions:
* Am I missing something to allow LRC PGs to move across racks/rooms for repair?
* Is it even possible to build such a 'multi-stage' crushmap?

Thanks for your help,
Ansgar

1) https://docs.ceph.com/en/quincy/rados/operations/erasure-code-jerasure/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] persistent write-back cache and qemu

2022-06-30 Thread Ansgar Jazdzewski
Hi folks,

I did a little testing with the persistent write-back cache (*1); we
run Ceph Quincy 17.2.1 and QEMU 6.2.0.

rbd.fio works with the cache, but as soon as we start a VM via QEMU we get something like:

error: internal error: process exited while connecting to monitor:
Failed to open module: /usr/lib/x86_64-linux-gnu/qemu/block-rbd.so:
undefined symbol: aio_bh_schedule_oneshot_full
2022-06-30T13:08:39.640532Z qemu-system-x86_64: -blockdev

So my assumption is that we need to do a bit more to have it running
with QEMU. If you have some more information on how to get it running,
please let me know!
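
For reference, a rough sketch of the client-side settings we used to enable the
cache (the cache path and size are just our test values):

ceph config set client rbd_plugins pwl_cache
ceph config set client rbd_persistent_cache_mode ssd
ceph config set client rbd_persistent_cache_path /mnt/pwl-cache
ceph config set client rbd_persistent_cache_size 1G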

Thanks a lot,
Ansgar

(*1) 
https://docs.ceph.com/en/pacific/rbd/rbd-persistent-write-back-cache/#rbd-persistent-write-back-cache
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Inconsistent PGs after upgrade to Pacific

2022-06-23 Thread Ansgar Jazdzewski
Hi,

I would say yes, but it would be nice if other people could confirm it too.

Also, can you create a test cluster and do the same tasks:
* create it with octopus
* create snapshot
* reduce rank to 1
* upgrade to pacific

and then try to fix the PG, assuming that you will have the same
issues in your test cluster.

cheers,
Ansgar

Am Do., 23. Juni 2022 um 22:12 Uhr schrieb Pascal Ehlert :
>
> Hi,
>
> I have now tried to "ceph osd pool rmsnap $POOL beforefixes" and it says the 
> snapshot could not be found although I have definitely run "ceph osd pool 
> mksnap $POOL beforefixes" about three weeks ago.
> When running rados list-inconsistent-obj $PG on one of the affected PGs, all 
> of the objects returned have "snap" set to 1:
>
> root@srv01:~# for i in $(rados list-inconsistent-pg $POOL | jq -er .[]); do 
> rados list-inconsistent-obj $i | jq -er .inconsistents[].object; done
> [..]
> {
>   "name": "200020744f4.",
>   "nspace": "",
>   "locator": "",
>   "snap": 1,
>   "version": 5704208
> }
> {
>   "name": "200021aeb16.",
>   "nspace": "",
>   "locator": "",
>   "snap": 1,
>   "version": 6189078
> }
> [..]
>
> Running listsnaps on any of them then looks like this:
>
> root@srv01:~# rados listsnaps 200020744f4. -p $POOL
> 200020744f4.:
> cloneid  snaps  size  overlap
> 1        1      0     []
> head     -      0
>
>
> Is it safe to assume that these objects belong to a somewhat broken snapshot
> and can be removed safely without causing further damage?
>
>
> Thanks,
>
> Pascal
>
>
>
> Ansgar Jazdzewski wrote on 23.06.22 20:36:
>
> Hi,
>
> We could identify the RBD images that were affected and did an export
> beforehand, but in the case of CephFS metadata I have no plan that will work.
>
> Can you try to delete the snapshot?
> Also, if the filesystem can be shut down, try to do a backup of the metadata pool.
>
> hope you will have some luck, let me know if I can help,
> Ansgar
>
> Pascal Ehlert  schrieb am Do., 23. Juni 2022, 16:45:
>>
>> Hi Ansgar,
>>
>> Thank you very much for the response.
>> Running your first command to obtain inconsistent objects, I retrieve a
>> total of 23114 only some of which are snaps.
>>
>> You mentioning snapshots did remind me of the fact however that I
>> created a snapshot on the Ceph metadata pool via "ceph osd pool $POOL
>> mksnap" before I reduced the number of ranks.
>> Maybe that has caused the inconsistencies and would explain why the
>> actual file system appears unaffected?
>>
>> Is there any way to validate that theory? I am a bit hesitant to just
>> run "rmsnap". Could that cause inconsistent data to be written back to
>> the actual objects?
>>
>>
>> Best regards,
>>
>> Pascal
>>
>>
>>
>> Ansgar Jazdzewski wrote on 23.06.22 16:11:
>> > Hi Pascal,
>> >
>> > We just had a similar situation on our RBD and had found some bad data
>> > in RADOS here is How we did it:
>> >
>> > for i in $(rados list-inconsistent-pg $POOL | jq -er .[]); do rados
>> > list-inconsistent-obj $i | jq -er .inconsistents[].object.name| awk
>> > -F'.' '{print $2}'; done
>> >
>> > we than found inconsistent snaps on the Object:
>> >
>> > rados list-inconsistent-snapset $PG --format=json-pretty | jq
>> > .inconsistents[].name
>> >
>> > List the data on the OSD's (ceph pg map $PG)
>> >
>> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ --op
>> > list ${OBJ} --pgid ${PG}
>> >
>> > and finally remove the object, like:
>> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/ --op
>> > list rbd_data.762a94d768c04d.0036b7ac --pgid 2.704
>> >
>> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/
>> > '["2.704",{"oid":"rbd_data.801e1d1d9c719d.00044943","key":"","snapid":125458,"hash":4136961796,"max":0,"pool":2,"namespace":"","max":0}]'
>> > remove
>> >
>> > We had to do it for all OSDs, one after the other; after this a 'pg repair'
>> > worked.
>> >
>> > i hope it will help
>> > Ansgar
>> >
>> > Am Do., 23. Juni 2022 um 15:02 Uhr schrie

[ceph-users] Re: Inconsistent PGs after upgrade to Pacific

2022-06-23 Thread Ansgar Jazdzewski
Hi,

We could identify the RBD images that were affected and did an export
beforehand, but in the case of CephFS metadata I have no plan that will work.

Can you try to delete the snapshot?
Also, if the filesystem can be shut down, try to do a backup of the
metadata pool.

hope you will have some luck, let me know if I can help,
Ansgar

Pascal Ehlert  schrieb am Do., 23. Juni 2022, 16:45:

> Hi Ansgar,
>
> Thank you very much for the response.
> Running your first command to obtain inconsistent objects, I retrieve a
> total of 23114 only some of which are snaps.
>
> You mentioning snapshots did remind me of the fact however that I
> created a snapshot on the Ceph metadata pool via "ceph osd pool $POOL
> mksnap" before I reduced the number of ranks.
> Maybe that has caused the inconsistencies and would explain why the
> actual file system appears unaffected?
>
> Is there any way to validate that theory? I am a bit hesitant to just
> run "rmsnap". Could that cause inconsistent data to be written back to
> the actual objects?
>
>
> Best regards,
>
> Pascal
>
>
>
> Ansgar Jazdzewski wrote on 23.06.22 16:11:
> > Hi Pascal,
> >
> > We just had a similar situation on our RBD and had found some bad data
> > in RADOS here is How we did it:
> >
> > for i in $(rados list-inconsistent-pg $POOL | jq -er .[]); do rados
> > list-inconsistent-obj $i | jq -er .inconsistents[].object.name| awk
> > -F'.' '{print $2}'; done
> >
> > we than found inconsistent snaps on the Object:
> >
> > rados list-inconsistent-snapset $PG --format=json-pretty | jq
> > .inconsistents[].name
> >
> > List the data on the OSD's (ceph pg map $PG)
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ --op
> > list ${OBJ} --pgid ${PG}
> >
> > and finally remove the object, like:
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/ --op
> > list rbd_data.762a94d768c04d.0036b7ac --pgid 2.704
> >
> > ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/
> > '["2.704",{"oid":"rbd_data.801e1d1d9c719d.00044943","key":"","snapid":125458,"hash":4136961796,"max":0,"pool":2,"namespace":"","max":0}]'
> > remove
> >
> > We had to do it for all OSDs, one after the other; after this a 'pg repair'
> > worked.
> >
> > i hope it will help
> > Ansgar
> >
> > Am Do., 23. Juni 2022 um 15:02 Uhr schrieb Dan van der Ster
> > :
> >> Hi Pascal,
> >>
> >> It's not clear to me how the upgrade procedure you described would
> >> lead to inconsistent PGs.
> >>
> >> Even if you didn't record every step, do you have the ceph.log, the
> >> mds logs, perhaps some osd logs from this time?
> >> And which versions did you upgrade from / to ?
> >>
> >> Cheers, Dan
> >>
> >> On Wed, Jun 22, 2022 at 7:41 PM Pascal Ehlert 
> wrote:
> >>> Hi all,
> >>>
> >>> I am currently battling inconsistent PGs after a far-reaching mistake
> >>> during the upgrade from Octopus to Pacific.
> >>> While otherwise following the guide, I restarted the Ceph MDS daemons
> >>> (and this started the Pacific daemons) without previously reducing the
> >>> ranks to 1 (from 2).
> >>>
> >>> This resulted in daemons not coming up and reporting inconsistencies.
> >>> After later reducing the ranks and bringing the MDS back up (I did not
> >>> record every step as this was an emergency situation), we started
> seeing
> >>> health errors on every scrub.
> >>>
> >>> Now after three weeks, while our CephFS is still working fine and we
> >>> haven't noticed any data damage, we realized that every single PG of
> the
> >>> cephfs metadata pool is affected.
> >>> Below you can find some information on the actual status and a detailed
> >>> inspection of one of the affected pgs. I am happy to provide any other
> >>> information that could be useful of course.
> >>>
> >>> A repair of the affected PGs does not resolve the issue.
> >>> Does anyone else here have an idea what we could try apart from copying
> >>> all the data to a new CephFS pool?
> >>>
> >>>
> >>>
> >>> Thank you!
> >>>
> >>> Pascal
> >>>
> >>>
> >>>

[ceph-users] Re: Inconsistent PGs after upgrade to Pacific

2022-06-23 Thread Ansgar Jazdzewski
Hi Pascal,

We just had a similar situation on our RBD and found some bad data
in RADOS; here is how we did it:

for i in $(rados list-inconsistent-pg $POOL | jq -er .[]); do rados
list-inconsistent-obj $i | jq -er .inconsistents[].object.name| awk
-F'.' '{print $2}'; done

We then found inconsistent snaps on the object:

rados list-inconsistent-snapset $PG --format=json-pretty | jq
.inconsistents[].name

List the data on the OSDs (ceph pg map $PG):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-${OSD}/ --op
list ${OBJ} --pgid ${PG}

and finally remove the object, like:

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/ --op
list rbd_data.762a94d768c04d.0036b7ac --pgid 2.704

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-459/
'["2.704",{"oid":"rbd_data.801e1d1d9c719d.00044943","key":"","snapid":125458,"hash":4136961796,"max":0,"pool":2,"namespace":"","max":0}]'
remove

We had to do it for all OSDs, one after the other; after this a 'pg repair' worked.
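
One note on the procedure: ceph-objectstore-tool needs the OSD offline, so a rough
sketch of how we handled each OSD (the OSD id is just the example from above):

ceph osd set noout
systemctl stop ceph-osd@459
# ... run the ceph-objectstore-tool list/remove commands from above ...
systemctl start ceph-osd@459
ceph osd unset noout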

I hope it will help.
Ansgar

Am Do., 23. Juni 2022 um 15:02 Uhr schrieb Dan van der Ster
:
>
> Hi Pascal,
>
> It's not clear to me how the upgrade procedure you described would
> lead to inconsistent PGs.
>
> Even if you didn't record every step, do you have the ceph.log, the
> mds logs, perhaps some osd logs from this time?
> And which versions did you upgrade from / to ?
>
> Cheers, Dan
>
> On Wed, Jun 22, 2022 at 7:41 PM Pascal Ehlert  wrote:
> >
> > Hi all,
> >
> > I am currently battling inconsistent PGs after a far-reaching mistake
> > during the upgrade from Octopus to Pacific.
> > While otherwise following the guide, I restarted the Ceph MDS daemons
> > (and this started the Pacific daemons) without previously reducing the
> > ranks to 1 (from 2).
> >
> > This resulted in daemons not coming up and reporting inconsistencies.
> > After later reducing the ranks and bringing the MDS back up (I did not
> > record every step as this was an emergency situation), we started seeing
> > health errors on every scrub.
> >
> > Now after three weeks, while our CephFS is still working fine and we
> > haven't noticed any data damage, we realized that every single PG of the
> > cephfs metadata pool is affected.
> > Below you can find some information on the actual status and a detailed
> > inspection of one of the affected pgs. I am happy to provide any other
> > information that could be useful of course.
> >
> > A repair of the affected PGs does not resolve the issue.
> > Does anyone else here have an idea what we could try apart from copying
> > all the data to a new CephFS pool?
> >
> >
> >
> > Thank you!
> >
> > Pascal
> >
> >
> >
> >
> > root@srv02:~# ceph status
> >cluster:
> >  id: f0d6d4d0-8c17-471a-9f95-ebc80f1fee78
> >  health: HEALTH_ERR
> >  insufficient standby MDS daemons available
> >  69262 scrub errors
> >  Too many repaired reads on 2 OSDs
> >  Possible data damage: 64 pgs inconsistent
> >
> >services:
> >  mon: 3 daemons, quorum srv02,srv03,srv01 (age 3w)
> >  mgr: srv03(active, since 3w), standbys: srv01, srv02
> >  mds: 2/2 daemons up, 1 hot standby
> >  osd: 44 osds: 44 up (since 3w), 44 in (since 10M)
> >
> >data:
> >  volumes: 2/2 healthy
> >  pools:   13 pools, 1217 pgs
> >  objects: 75.72M objects, 26 TiB
> >  usage:   80 TiB used, 42 TiB / 122 TiB avail
> >  pgs: 1153 active+clean
> >   55   active+clean+inconsistent
> >   9    active+clean+inconsistent+failed_repair
> >
> >io:
> >  client:   2.0 MiB/s rd, 21 MiB/s wr, 240 op/s rd, 1.75k op/s wr
> >
> >
> > {
> >"epoch": 4962617,
> >"inconsistents": [
> >  {
> >"object": {
> >  "name": "100cc8e.",
> >  "nspace": "",
> >  "locator": "",
> >  "snap": 1,
> >  "version": 4253817
> >},
> >"errors": [],
> >"union_shard_errors": [
> >  "omap_digest_mismatch_info"
> >],
> >"selected_object_info": {
> >  "oid": {
> >"oid": "100cc8e.",
> >"key": "",
> >"snapid": 1,
> >"hash": 1369745244,
> >"max": 0,
> >"pool": 7,
> >"namespace": ""
> >  },
> >  "version": "4962847'6209730",
> >  "prior_version": "3916665'4306116",
> >  "last_reqid": "osd.27.0:757107407",
> >  "user_version": 4253817,
> >  "size": 0,
> >  "mtime": "2022-02-26T12:56:55.612420+0100",
> >  "local_mtime": "2022-02-26T12:56:55.614429+0100",
> >  "lost": 0,
> >  "flags": [
> >"dirty",
> >"omap",
> >"data_digest",
> >"omap_digest"
> >  ],
> >  "truncate_seq": 0,
> >  "truncate_size": 0,
> >  "data_digest": "0x",
> >  "omap_digest": "0xe5211a9e",
> >  "ex

[ceph-users] Re: RGW with keystone and dns-style buckets

2022-01-11 Thread Ansgar Jazdzewski
hi folks,

I got it to work using HAProxy: just put some rules into the frontend
to rewrite the URL from DNS-style (with '_') to path-style (with ':').

 acl dnsstyle_buckets hdr_end(host) -i .object.domain
 capture request header User-Agent len 256
 capture request header Host len 128
 http-request set-var(req.bucketname) hdr(host),regsub(.object.domain,),regsub(_,:) if dnsstyle_buckets
 http-request set-var(req.bucketname) hdr(host),map(/etc/haproxy/map/buckets.map)
 http-request set-header X-Debug-Bucket %[var(req.bucketname)] if { var(req.bucketname) -m found }
 http-request set-uri /%[var(req.bucketname)]%[path,regsub(/$,/index.html)] if { var(req.bucketname) -m found }
 http-request set-header Host object.domain if { var(req.bucketname) -m found }
 use_backend stats if stats
 use_backend ceph-mgr if ceph-mgr
 default_backend ceph-rgw
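
For completeness, a hypothetical example of what /etc/haproxy/map/buckets.map could
contain (hostnames and tenants below are made up; the real entries depend on your
buckets):

 mybucket.object.domain tenant1:mybucket
 archive.object.domain tenant2:archive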

I hope it will help others too.
Ansgar

Am Mo., 10. Jan. 2022 um 14:52 Uhr schrieb Ansgar Jazdzewski
:
>
> Hi folks,
>
> I am trying to get DNS-style buckets running and stumbled across an issue with
> tenants.
>
> I can access the bucket like https://s3.domain/: but I
> did not find a way to do it with DNS-style, something like
> https://_.s3.domain !
>
> Am I missing something in the documentation?
>
> Thanks for your help!
> Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW with keystone and dns-style buckets

2022-01-10 Thread Ansgar Jazdzewski
Hi folks,

I am trying to get DNS-style buckets running and stumbled across an issue with tenants.

I can access the bucket like https://s3.domain/: but I
did not find a way to do it with DNS-style, something like
https://_.s3.domain !

Am I missing something in the documentation?

Thanks for your help!
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: upgraded to cluster to 16.2.6 PACIFIC

2021-11-09 Thread Ansgar Jazdzewski
> IIRC you get a HEALTH_WARN message that there are OSDs with old metadata
> format. You can suppress that warning, but I guess operators feel like
> they want to deal with the situation and get it fixed rather than ignore it.

Yes, and if suppressing the warning gets forgotten, you run into other
issues down the road.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: upgraded to cluster to 16.2.6 PACIFIC

2021-11-09 Thread Ansgar Jazdzewski
Am Di., 9. Nov. 2021 um 11:08 Uhr schrieb Dan van der Ster 
:
>
> Hi Ansgar,
>
> To clarify the messaging or docs, could you say where you learned that
> you should enable the bluestore_fsck_quick_fix_on_mount setting? Is
> that documented somewhere, or did you have it enabled from previously?
> The default is false so the corruption only occurs when users actively
> choose to fsck.

I have upgraded another cluster in the past with no issues as of
today, so I just followed my own instructions for this cluster

> As to recovery, Igor wrote the low level details here:
> https://www.spinics.net/lists/ceph-users/msg69338.html
> How did you resolve the omap issues in your rgw.index pool? What type
> of issues remain in meta and log?

For the index pool we ran this script:
https://paste.openstack.org/show/810861/
It adds an omap key and triggers a repair, but it does not work for the meta pool.
My next best option is to stop the radosgw and create a new pool with
the same data, like:

pool=default.rgw.meta
ceph osd pool create $pool.new 64 64
ceph osd pool application enable $pool.new rgw

# copy data
rados -p $pool export /tmp/$pool.img
rados -p $pool.new import /tmp/$pool.img

#swap pools
ceph osd pool rename $pool $pool.old
ceph osd pool rename $pool.new $pool

rm -f /tmp/$pool.img
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] upgraded to cluster to 16.2.6 PACIFIC

2021-11-08 Thread Ansgar Jazdzewski
Hi fellow ceph users,

I did an upgrade from 14.2.23 to 16.2.6 not knowing that the current
minor version had this nasty bug! [1] [2]

We were able to resolve some of the omap issues in the rgw.index pool
but still have 17 PGs to fix in the rgw.meta and rgw.log pools!

I have a couple of questions:
- Has someone written a script to fix those PGs? We were only able to
fix the index with our approach [3].
- Why is the 16.2.6 version still in the public mirror (should it not be moved)?
- Do you have any other workarounds to resolve this?

thanks for your help!
Ansgar

1) https://docs.ceph.com/en/latest/releases/pacific/
2) https://tracker.ceph.com/issues/53062
3) https://paste.openstack.org/show/810861
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSDs crash after deleting unfound object in Nautilus 14.2.22

2021-09-09 Thread Ansgar Jazdzewski
Hi Folks,

We had to delete some unfound objects in our cache tier to get our cluster
working again, but after an hour we see OSDs crash.

We found that it is caused by the fact that we deleted the
"hit_set_8.3fc_archive_2021-09-09 08:25:58.520768Z_2021-09-09
08:26:18.907234Z" object.

Crash-Log can be found here https://paste.openstack.org/show/809211/

Our plan is now to change the OSD code to not update the stats, in
order to get the OSDs back online and remove the cache layer:

diff --git a/src/osd/PrimaryLogPG.cc b/src/osd/PrimaryLogPG.cc
index 3b3e3e59292..a06fec9c269 100644
--- a/src/osd/PrimaryLogPG.cc
+++ b/src/osd/PrimaryLogPG.cc
@@ -13932,11 +13932,13 @@ void PrimaryLogPG::hit_set_trim(OpContextUPtr &ctx, unsigned max)
     updated_hit_set_hist.history.pop_front();
 
     ObjectContextRef obc = get_object_context(oid, false);
-    ceph_assert(obc);
+    //ceph_assert(obc);
+    if (obc) {
     --ctx->delta_stats.num_objects;
     --ctx->delta_stats.num_objects_hit_set_archive;
     ctx->delta_stats.num_bytes -= obc->obs.oi.size;
     ctx->delta_stats.num_bytes_hit_set_archive -= obc->obs.oi.size;
+    }
   }
 }

Has anyone done this before, or do you have another workaround to get
the OSDs back online?

Thanks in Advance
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manually add monitor to a running cluster

2021-08-19 Thread Ansgar Jazdzewski
Hi,

So yes, I was assuming that the new mon is already a member of the cluster, i.e.
packages are installed and ceph.conf is in place.
You also need to add the IP of the new mon to the ceph.conf when you
are done and redistribute it to all members of the cluster.

Ansgar

Am Do., 19. Aug. 2021 um 15:30 Uhr schrieb Francesco Piraneo G.
:
>
>
> >> Question: You don't have to add the new monitor (i.e. mon2) on the
> >> monitor map of the running monitor to allow to join the cluster? Because
> >> these are the instruction I followed but the mon2 doesn't start because
> >> it was not mapped on the cluster.
> > This will be done automatically when the mon is joining the cluster
>
> Another question: What about the ceph.conf file? It has to be sent to
> the new monitor / modified or not? Consider that you cannot run any
> command on new monitor without a ceph.conf file on it without getting an
> error...
>
> F.
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manually add monitor to a running cluster

2021-08-19 Thread Ansgar Jazdzewski
Hi,

Am Do., 19. Aug. 2021 um 14:57 Uhr schrieb Francesco Piraneo G.
:
>
>
> >mkdir /var/lib/ceph/mon/ceph-$(hostname -s)
>
> This has to be done on new host, right?

Yes

> >ceph auth get mon. -o /tmp/mon-keyfile
> >ceph mon getmap -o /tmp/mon-monmap

> This has to be done on the running mon host, right? Then we have to send
> monmap and keyfiles on the new host?

If you have copied the ceph admin key to the new system, you can run
the commands on the target / new mon system.

> >ceph-mon -i $(hostname -s) --mkfs --monmap /tmp/mon-monmap --keyring
> > /tmp/mon-keyfile
> >chown -R ceph: /var/lib/ceph/mon/ceph-$(hostname -s)
> >systemctl start ceph-mon@$(hostname -s)
>
> Always on the new mon host I suppose.

Yes ;-)

> Question: You don't have to add the new monitor (i.e. mon2) on the
> monitor map of the running monitor to allow to join the cluster? Because
> these are the instruction I followed but the mon2 doesn't start because
> it was not mapped on the cluster.

This will be done automatically when the mon is joining the cluster

> F.
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manually add monitor to a running cluster

2021-08-19 Thread Ansgar Jazdzewski
Hi Francesco,

in short you need to do this:

  mkdir /var/lib/ceph/mon/ceph-$(hostname -s)
  ceph auth get mon. -o /tmp/mon-keyfile
  ceph mon getmap -o /tmp/mon-monmap
  ceph-mon -i $(hostname -s) --mkfs --monmap /tmp/mon-monmap --keyring /tmp/mon-keyfile
  chown -R ceph: /var/lib/ceph/mon/ceph-$(hostname -s)
  systemctl start ceph-mon@$(hostname -s)

i hope it helps,
Ansgar

Am Do., 19. Aug. 2021 um 14:42 Uhr schrieb Francesco Piraneo G.
:
>
> Good afternoon,
>
> I'm doing my best to follow all the instructions here to add extra
> monitors to a running cluster:
>
> https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/
>
> Unfortunately they seems quite confused and incomplete; can someone
> point me to have a more detailed and complete instructions?
>
> I'm running ceph pacific under CentOS 8. mon1 and osd1...3 running; just
> need to manually install mon2...3.
>
> Thank you very much.
>
> Francesco
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 1/3 mons down! mon do not rejoin

2021-07-26 Thread Ansgar Jazdzewski
Yes, the empty DB told me that at this point I had no other choice
than to recreate the entire mon service.

* remove broken mon
  ceph mon remove $(hostname -s)

* mon preparation done
  rm -rf /var/lib/ceph/mon/ceph-$(hostname -s)
  mkdir /var/lib/ceph/mon/ceph-$(hostname -s)
  ceph auth get mon. -o /tmp/mon-keyfile
  ceph mon getmap -o /tmp/mon-monmap
  ceph-mon -i $(hostname -s) --mkfs --monmap /tmp/mon-monmap --keyring /tmp/mon-keyfile
  chown -R ceph: /var/lib/ceph/mon/ceph-$(hostname -s)

I will wait for a low-traffic time on the cluster to enable the recreated mon.
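
For completeness, the only step left for that window is (a sketch, matching the
commands above):

  systemctl start ceph-mon@$(hostname -s)
  ceph quorum_status --format json-pretty   # verify the mon joined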

thanks for all the help so far
Ansgar

Am Mo., 26. Juli 2021 um 15:39 Uhr schrieb Dan van der Ster
:
>
> Your log ends with
>
> > 2021-07-25 06:46:52.078 7fe065f24700  1 mon.osd01@0(leader).osd e749666 
> > do_prune osdmap full prune enabled
>
> So mon.osd01 was still the leader at that time.
> When did it leave the cluster?
>
> > I also found that the rocksdb on osd01 is only 1MB in size and 345MB on the 
> > other mons!
>
> It sounds like mon.osd01's db has been re-initialized as empty, e.g.
> maybe the directory was lost somehow between reboots?
>
> -- dan
>
>
> On Mon, Jul 26, 2021 at 1:55 PM Ansgar Jazdzewski
>  wrote:
> >
> > Hi Dan, Hi Folks,
> >
> > this is how things started, I also found that the rocksdb on osd01 is
> > only 1MB in size and 345MB on the other mons!
> >
> > 2021-07-25 06:46:30.029 7fe061f1c700  0 log_channel(cluster) log [DBG]
> > : monmap e1: 3 mons at
> > {osd01=[v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0],osd02=[v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0],osd03=[v2:10.152.28.173:3300/0,v1:10.152.28.173:6789/0]}
> > 2021-07-25 06:46:30.029 7fe061f1c700  0 log_channel(cluster) log [DBG]
> > : fsmap cephfs:1 {0=osd01=up:active} 2 up:standby
> > 2021-07-25 06:46:30.029 7fe061f1c700  0 log_channel(cluster) log [DBG]
> > : osdmap e749665: 436 total, 436 up, 436 in
> > 2021-07-25 06:46:30.029 7fe061f1c700  0 log_channel(cluster) log [DBG]
> > : mgrmap e89: osd03(active, since 13h), standbys: osd01, osd02
> > 2021-07-25 06:46:30.029 7fe061f1c700  0 log_channel(cluster) log [INF]
> > : overall HEALTH_OK
> > 2021-07-25 06:46:30.805 7fe065f24700  1 mon.osd01@0(leader).osd
> > e749665 do_prune osdmap full prune enabled
> > 2021-07-25 06:46:30.957 7fe06371f700  0 mon.osd01@0(leader) e1
> > handle_command mon_command({"prefix": "status"} v 0) v1
> > 2021-07-25 06:46:30.957 7fe06371f700  0 log_channel(audit) log [DBG] :
> > from='client.? 10.152.28.171:0/3290370429' entity='client.admin'
> > cmd=[{"prefix": "status"}]: dispatch
> > 2021-07-25 06:46:51.922 7fe065f24700  1 mon.osd01@0(leader).mds e85
> > tick: resetting beacon timeouts due to mon delay (slow election?) of
> > 20.3627s seconds
> > 2021-07-25 06:46:51.922 7fe065f24700 -1 mon.osd01@0(leader) e1
> > get_health_metrics reporting 13 slow ops, oldest is pool_op(delete
> > unmanaged snap pool 3 tid 27666 name  v749664)
> > 2021-07-25 06:46:51.930 7fe06371f700  0 log_channel(cluster) log [INF]
> > : mon.osd01 calling monitor election
> > 2021-07-25 06:46:51.930 7fe06371f700  1
> > mon.osd01@0(electing).elector(173) init, last seen epoch 173,
> > mid-election, bumping
> > 2021-07-25 06:46:51.946 7fe06371f700  1 mon.osd01@0(electing) e1
> > collect_metadata :  no unique device id for : fallback method has no
> > model nor serial'
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.962 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.966 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request failed to assign global_id
> > 2021-07-25 06:46:51.966 7fe067727700  1 mon.osd01@0(electing) e1
> > handle_auth_request faile

[ceph-users] Re: 1/3 mons down! mon do not rejoin

2021-07-26 Thread Ansgar Jazdzewski
ound to be a network issue:
> https://tracker.ceph.com/issues/48033
>
> -- Dan
>
> On Sun, Jul 25, 2021 at 6:36 PM Ansgar Jazdzewski
>  wrote:
> >
> > Am So., 25. Juli 2021 um 18:02 Uhr schrieb Dan van der Ster
> > :
> > >
> > > What do you have for the new global_id settings? Maybe set it to allow 
> > > insecure global_id auth and see if that allows the mon to join?
> >
> >  auth_allow_insecure_global_id_reclaim is allowed as we still have
> > some VM's not restarted
> >
> > # ceph config get mon.*
> > WHO MASK LEVELOPTION VALUE RO
> > mon  advanced auth_allow_insecure_global_id_reclaim  true
> > mon  advanced mon_warn_on_insecure_global_id_reclaim false
> > mon  advanced mon_warn_on_insecure_global_id_reclaim_allowed false
> >
> > > > I can try to move the /var/lib/ceph/mon/ dir and recreate it!?
> > >
> > > I'm not sure it will help. Running the mon with --debug_ms=1 might give 
> > > clues why it's stuck probing.
> >
> > 2021-07-25 16:28:41.418 7fcc613d8700 10 mon.osd01@0(probing) e1
> > probing other monitors
> > 2021-07-25 16:28:41.418 7fcc613d8700  1 --
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] send_to--> mon
> > [v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0] -- mon_probe(probe
> > a6baa789-6be2-4ce0-ab2d-7c78b899d4bd name osd01 mon_release 14) v7 --
> > ?+0 0x55c6b35ae780
> > 2021-07-25 16:28:41.418 7fcc613d8700  1 --
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] -->
> > [v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0] -- mon_probe(probe
> > a6baa789-6be2-4ce0-ab2d-7c78b899d4bd name osd01 mon_release 14) v7 --
> > 0x55c6b35ae780 con 0x55c6b2611180
> > 2021-07-25 16:28:41.418 7fcc613d8700  1 --
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] send_to--> mon
> > [v2:10.152.28.173:3300/0,v1:10.152.28.173:6789/0] -- mon_probe(probe
> > a6baa789-6be2-4ce0-ab2d-7c78b899d4bd name osd01 mon_release 14) v7 --
> > ?+0 0x55c6b35aea00
> > 2021-07-25 16:28:41.418 7fcc613d8700  1 --
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] -->
> > [v2:10.152.28.173:3300/0,v1:10.152.28.173:6789/0] -- mon_probe(probe
> > a6baa789-6be2-4ce0-ab2d-7c78b899d4bd name osd01 mon_release 14) v7 --
> > 0x55c6b35aea00 con 0x55c6b2611600
> > 2021-07-25 16:28:41.814 7fcc5dbd1700  1 --2-
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] >>
> > [v2:10.152.28.173:3300/0,v1:10.152.28.173:6789/0] conn(0x55c6b2611600
> > 0x55c6b3323c00 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=1
> > rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
> > 2021-07-25 16:28:41.814 7fcc62bdb700  1 --2-
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] >>
> > [v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0] conn(0x55c6b2611180
> > 0x55c6b3323500 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=1
> > rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
> > 2021-07-25 16:28:41.814 7fcc62bdb700 10 mon.osd01@0(probing) e1
> > ms_get_authorizer for mon
> > 2021-07-25 16:28:41.814 7fcc5dbd1700 10 mon.osd01@0(probing) e1
> > ms_get_authorizer for mon
> > 2021-07-25 16:28:41.814 7fcc62bdb700  1 --
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] >>
> > [v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0] conn(0x55c6b2611180
> > msgr2=0x55c6b3323500 secure :-1 s=STATE_CONNECTION_ESTABLISHED
> > l=0).read_bulk peer close file descriptor 27
> > 2021-07-25 16:28:41.814 7fcc62bdb700  1 --
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] >>
> > [v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0] conn(0x55c6b2611180
> > msgr2=0x55c6b3323500 secure :-1 s=STATE_CONNECTION_ESTABLISHED
> > l=0).read_until read failed
> > 2021-07-25 16:28:41.814 7fcc62bdb700  1 --2-
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] >>
> > [v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0] conn(0x55c6b2611180
> > 0x55c6b3323500 secure :-1 s=SESSION_CONNECTING pgs=0 cs=0 l=0 rev1=1
> > rx=0x55c6b34bbad0 tx=0x55c6b3528130).handle_read_frame_preamble_main
> > read frame preamble failed r=-1 ((1) Operation not permitted)
> > 2021-07-25 16:28:41.814 7fcc5dbd1700  1 --
> > [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] >>
> > [v2:10.152.28.173:3300/0,v1:10.152.28.173:6789/0] conn(0x55c6b2611600
> > msgr2=0x55c6b3323c00 secure :-1 s=STATE_CONNECTION_ESTABLISHED
> > l=0).read_bulk peer close file descriptor 28
> > 2021-07-25 16:28:41.814 7fcc5dbd1700  1 --
> > [v2:10.152.28.171:3

[ceph-users] Re: 1/3 mons down! mon do not rejoin

2021-07-25 Thread Ansgar Jazdzewski
n.osd01@0(probing) e1
unregister_cluster_logger - not registered
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1
cancel_probe_timeout (none scheduled)
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1 monmap
e1: 3 mons at 
{osd01=[v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0],osd02=[v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0],osd03=[v2:10.152.28.173:3300/0,v1:10.152.28.173:6789/0]}
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1 _reset
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing).auth v0
_set_mon_num_rank num 0 rank 0
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1
cancel_probe_timeout (none scheduled)
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1 timecheck_finish
2021-07-25 16:28:43.418 7fcc613d8700 15 mon.osd01@0(probing) e1 health_tick_stop
2021-07-25 16:28:43.418 7fcc613d8700 15 mon.osd01@0(probing) e1
health_interval_stop
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1
scrub_event_cancel
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1 scrub_reset
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1
cancel_probe_timeout (none scheduled)
2021-07-25 16:28:43.418 7fcc613d8700 10 mon.osd01@0(probing) e1
reset_probe_timeout 0x55c6b3553260 after 2 seconds


still looks like a connection issue but I can connect! using telnet

root@osd01:~# telnet 10.152.28.172 6789
Trying 10.152.28.172...
Connected to 10.152.28.172.
Escape character is '^]'.
ceph v027



> .. Dan
>
>
>
>
>
> On Sun, 25 Jul 2021, 17:53 Ansgar Jazdzewski,  
> wrote:
>>
>> Am So., 25. Juli 2021 um 17:17 Uhr schrieb Dan van der Ster
>> :
>> >
>> > > raise the min version to nautilus
>> >
>> > Are you referring to the min osd version or the min client version?
>>
>> yes sorry was not written clearly
>>
>> > I don't think the latter will help.
>> >
>> > Are you sure that mon.osd01 can reach those other mons on ports 6789 and 
>> > 3300?
>>
>> yes I just tested it one more time ping MTU and telnet to all mon ports
>>
>> > Do you have any notable custom ceph configurations on this cluster?
>>
>> No, I did not think anything fancy
>>
>> [global]
>> cluster network = 10.152.40.0/22
>> fsid = a6baa789-6be2-4ce0-ab2d-7c78b899d4bd
>> mon host = 10.152.28.171,10.152.28.172,10.152.28.173
>> mon initial members = osd01,osd02,osd03
>> osd pool default crush rule = -1
>> public network = 10.152.28.0/22
>>
>>
>> I just tried to start the mon with force-sync but as the mon did not
>> join it will not pull any data
>> ceph-mon -f --cluster ceph --id osd01 --setuser ceph --setgroup ceph
>> --debug_mon 10 --yes-i-really-mean-it --force-sync -d
>>
>> I can try to move the /var/lib/ceph/mon/ dir and recreate it!?
>>
>>
>> thanks for all the help so far!
>> Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 1/3 mons down! mon do not rejoin

2021-07-25 Thread Ansgar Jazdzewski
Am So., 25. Juli 2021 um 17:17 Uhr schrieb Dan van der Ster
:
>
> > raise the min version to nautilus
>
> Are you referring to the min osd version or the min client version?

Yes, sorry, that was not written clearly.

> I don't think the latter will help.
>
> Are you sure that mon.osd01 can reach those other mons on ports 6789 and 3300?

Yes, I just tested it one more time: ping, MTU, and telnet to all mon ports.

> Do you have any notable custom ceph configurations on this cluster?

No, nothing fancy I think:

[global]
cluster network = 10.152.40.0/22
fsid = a6baa789-6be2-4ce0-ab2d-7c78b899d4bd
mon host = 10.152.28.171,10.152.28.172,10.152.28.173
mon initial members = osd01,osd02,osd03
osd pool default crush rule = -1
public network = 10.152.28.0/22


I just tried to start the mon with --force-sync, but as the mon does not
join it will not pull any data:
ceph-mon -f --cluster ceph --id osd01 --setuser ceph --setgroup ceph
--debug_mon 10 --yes-i-really-mean-it --force-sync -d

I can try to move the /var/lib/ceph/mon/ dir and recreate it!?


thanks for all the help so far!
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 1/3 mons down! mon do not rejoin

2021-07-25 Thread Ansgar Jazdzewski
yload 41
mon.osd01@0(probing) e1 handle_auth_request haven't formed initial quorum, EBUSY
mon.osd01@0(probing) e1 ms_handle_reset 0x560954d36000 -
mon.osd01@0(probing) e1 probe_timeout 0x560954c9c420
mon.osd01@0(probing) e1 bootstrap
mon.osd01@0(probing) e1 sync_reset_requester
mon.osd01@0(probing) e1 unregister_cluster_logger - not registered
mon.osd01@0(probing) e1 cancel_probe_timeout (none scheduled)
mon.osd01@0(probing) e1 monmap e1: 3 mons at
{osd01=[v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0],osd02=[v2:10.152.28.172:3300/0,v1:10.152.28.172:6789/0],osd03=[v2:10.152.28.173:3300/0,v1:10.152.28.173:6789/0]}
mon.osd01@0(probing) e1 _reset

Am So., 25. Juli 2021 um 13:24 Uhr schrieb Dan van der Ster
:
>
> With four mons total then only one can be down... mon.osd01 is down already 
> you're at the limit.
>
> It's possible that whichever reason is preventing this mon from joining will 
> also prevent the new mon from joining.
>
> I think you should:
>
> 1. Investigate why mon.osd01 isn't coming back into the quorum... The logs on 
> that mon or the others can help.
> 2. If you decide to give up on mon.osd01, then first you should rm it from 
> the cluster before you add a mon from another host.
>
> .. Dan
>
>
> On Sun, 25 Jul 2021, 12:43 Ansgar Jazdzewski,  
> wrote:
>>
>> hi folks
>>
>> I have a cluster running ceph 14.2.22 on ubuntu 18.04 and some hours
>> ago one of the mons stopped working and the on-call team rebooted the
>> node; now the mon is not joining the ceph cluster.
>>
>> TCP ports of mons are open and reachable!
>>
>> ceph health detail
>> HEALTH_WARN 1/3 mons down, quorum osd02,osd03
>> MON_DOWN 1/3 mons down, quorum osd02,osd03
>> mon.osd01 (rank 0) addr
>> [v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] is down (out of
>> quorum)
>>
>> I like to add a new 3rd mon to the cluster on osd04 but I'm a bit
>> scared as it can result in 50% of the mons are not in reach!?
>>
>> Question: should I remove the mon on osd01 first and recreate the
>> demon before starting a new mon on osd04?
>>
>>
>> Thanks for your input!
>> Ansgar
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] 1/3 mons down! mon do not rejoin

2021-07-25 Thread Ansgar Jazdzewski
hi folks

I have a cluster running ceph 14.2.22 on ubuntu 18.04 and some hours
ago one of the mons stopped working and the on-call team rebooted the
node; now the mon is not joining the ceph cluster.

TCP ports of mons are open and reachable!

ceph health detail
HEALTH_WARN 1/3 mons down, quorum osd02,osd03
MON_DOWN 1/3 mons down, quorum osd02,osd03
mon.osd01 (rank 0) addr
[v2:10.152.28.171:3300/0,v1:10.152.28.171:6789/0] is down (out of
quorum)

I would like to add a new third mon to the cluster on osd04, but I'm a bit
scared as it could result in 50% of the mons not being reachable!?

Question: should I remove the mon on osd01 first and recreate the
daemon before starting a new mon on osd04?


Thanks for your input!
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: suggestion for Ceph client network config

2021-06-11 Thread Ansgar Jazdzewski
Hi,

I would do an extra network / VLAN, mostly for security reasons. Also,
take a look at CTDB for Samba failover.

Have a nice Weekend,
Ansgar

Am Fr., 11. Juni 2021 um 08:21 Uhr schrieb Götz Reinicke
:
>
> Hi all
>
> We get a new samba smb fileserver who mounts our cephfs for exporting some 
> shares. What might be a good or better network setup for that server?
>
> Should I configure two interfaces - one for the smb share export towards our 
> workstations and desktops and one towards the ceph cluster?
>
> Or would it be „ok“ for all traffic to be on one interface?
>
> The server has 40G ports.
>
> Thanks for your suggestions and feedback . Regards . Götz
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS design

2021-06-11 Thread Ansgar Jazdzewski
Hi,

First of all, check the workload you would like to put on the filesystem; if
you plan to migrate an old one, do some proper performance testing of
the old storage.

The IO500 can give some ideas (https://www.vi4io.org/io500/start), but it
depends on the use case of the filesystem.

cheers,
Ansgar

Am Fr., 11. Juni 2021 um 10:54 Uhr schrieb Szabo, Istvan (Agoda)
:
>
> Hi,
>
> Can you suggest me what is a good cephfs design? I've never used it, only rgw 
> and rbd we have, but want to give a try. Howvere in the mail list I saw a 
> huge amount of issues with cephfs so would like to go with some let's say 
> bulletproof best practices.
>
> Like separate the mds from mon and mgr?
> Need a lot of memory?
> Should be on ssd or nvme?
> How many cpu/disk ...
>
> Very appreciate it.
>
> Istvan Szabo
> Senior Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> ---
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can we deprecate FileStore in Quincy?

2021-06-03 Thread Ansgar Jazdzewski
Hi folks,

I'm fine with dropping FileStore in the R release!
Only one thing to add: please add a warning to all versions we can
upgrade to the R release from, so not only Quincy but also Pacific!

Thanks,
Ansgar

Neha Ojha  schrieb am Di., 1. Juni 2021, 21:24:

> Hello everyone,
>
> Given that BlueStore has been the default and more widely used
> objectstore since quite some time, we would like to understand whether
> we can consider deprecating FileStore in our next release, Quincy and
> remove it in the R release. There is also a proposal [0] to add a
> health warning to report FileStore OSDs.
>
> We discussed this topic in the Ceph Month session today [1] and there
> were no objections from anybody on the call. I wanted to reach out to
> the list to check if there are any concerns about this or any users
> who will be impacted by this decision.
>
> Thanks,
> Neha
>
> [0] https://github.com/ceph/ceph/pull/39440
> [1] https://pad.ceph.com/p/ceph-month-june-2021
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RadosGW unable to start resharding

2021-03-10 Thread Ansgar Jazdzewski
hi,

No luck after running `radosgw-admin bucket check --fix`.
`radosgw-admin reshard stale-instances list` and `radosgw-admin reshard
stale-instances rm` work now, but I cannot start the resharding:

radosgw-admin bucket reshard --tenant=... --bucket=... --uid=...
--num-shards=512
ERROR: the bucket is currently undergoing resharding and cannot be
added to the reshard list at this time

Any idea how to force-remove the entry from the reshard log?
2021-03-10 09:30:20.215 7f8aa239d940 -1 ERROR: failed to remove entry
from reshard log, oid=reshard.05 tenant=... bucket=..
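
A rough sketch of what I am checking next (the pool name is an assumption, and the
oid in the error above is truncated, so I list the objects first):

rados -p default.rgw.log ls | grep reshard
rados -p default.rgw.log listomapkeys <reshard-oid>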

Thanks,
Ansgar

Am Mi., 10. März 2021 um 12:44 Uhr schrieb Ansgar Jazdzewski
:
>
> Hi,
>
> Both commands did not come back with any output after 30min
>
> I found that people have had run:
>  radosgw-admin reshard cancel --tenant="..." --bucket="..."
> --uid="..." --debug-rgw=20 --debug-ms=1
>
> and I got this error in the output:
> 2021-03-10 09:30:20.215 7f8aa239d940 -1 ERROR: failed to remove entry
> from reshard log, oid=reshard.05 tenant=... bucket=...
>
> This led me to run:
> radosgw-admin bucket check --fix --tenant=... --bucket=... --uid=...
> --debug-rgw=20 --debug-ms=1
>
> and it is not finished yet so I'll give some update as soon as it is done
>
> Thanks,
> Ansgar
>
> Am Mi., 10. März 2021 um 10:55 Uhr schrieb Konstantin Shalygin 
> :
> >
> > Try to look at:
> > radosgw-admin reshard stale-instances list
> >
> > Then:
> > radosgw-admin reshard stale-instances rm
> >
> >
> >
> > k
> >
> > On 10 Mar 2021, at 12:11, Ansgar Jazdzewski  
> > wrote:
> >
> > We are running ceph 14.2.16 and I like to reshard a bucket because I
> > have a large object warning!
> >
> > so I did:
> > radosgw-admin bucket reshard --tenant="..." --bucket="..." --uid="..."
> > --num-shards=512
> > but I got receive an error:
> >
> > ERROR: the bucket is currently undergoing resharding and cannot be
> > added to the reshard list at this time
> >
> > `radosgw-admin reshard list` is empty so I assume I have to delete
> > some leftovers from the old resharding!? did someone has had this
> > before?
> >
> >
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RadosGW unable to start resharding

2021-03-10 Thread Ansgar Jazdzewski
Hi,

Both commands did not come back with any output after 30 minutes.

I found that people have run:
 radosgw-admin reshard cancel --tenant="..." --bucket="..."
--uid="..." --debug-rgw=20 --debug-ms=1

and I got this error in the output:
2021-03-10 09:30:20.215 7f8aa239d940 -1 ERROR: failed to remove entry
from reshard log, oid=reshard.05 tenant=... bucket=...

This led me to run:
radosgw-admin bucket check --fix --tenant=... --bucket=... --uid=...
--debug-rgw=20 --debug-ms=1

It has not finished yet, so I'll give an update as soon as it is done.

Thanks,
Ansgar

Am Mi., 10. März 2021 um 10:55 Uhr schrieb Konstantin Shalygin :
>
> Try to look at:
> radosgw-admin reshard stale-instances list
>
> Then:
> radosgw-admin reshard stale-instances rm
>
>
>
> k
>
> On 10 Mar 2021, at 12:11, Ansgar Jazdzewski  
> wrote:
>
> We are running ceph 14.2.16 and I like to reshard a bucket because I
> have a large object warning!
>
> so I did:
> radosgw-admin bucket reshard --tenant="..." --bucket="..." --uid="..."
> --num-shards=512
> but I got receive an error:
>
> ERROR: the bucket is currently undergoing resharding and cannot be
> added to the reshard list at this time
>
> `radosgw-admin reshard list` is empty so I assume I have to delete
> some leftovers from the old resharding!? did someone has had this
> before?
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RadosGW unable to start resharding

2021-03-10 Thread Ansgar Jazdzewski
Hi Folks,

We are running Ceph 14.2.16 and I would like to reshard a bucket because I
have a large object warning!

so I did:
radosgw-admin bucket reshard --tenant="..." --bucket="..." --uid="..."
--num-shards=512
but I received an error:

ERROR: the bucket is currently undergoing resharding and cannot be
added to the reshard list at this time

`radosgw-admin reshard list` is empty, so I assume I have to delete
some leftovers from the old resharding!? Has someone had this
before?

thanks for your input,
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Question about expansion existing Ceph cluster - adding OSDs

2020-10-21 Thread Ansgar Jazdzewski
Hi,

You can make use of upmap, so you do not need to rebalance the entire
CRUSH map every time you change the weight.


https://docs.ceph.com/en/latest/rados/operations/upmap/
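
A minimal sketch of enabling the balancer in upmap mode (assuming all clients are
Luminous or newer):

ceph osd set-require-min-compat-client luminous
ceph balancer mode upmap
ceph balancer on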


Hope it helps,
Ansgar


Kristof Coucke  schrieb am Mi., 21. Okt. 2020,
13:29:

> Hi,
>
> I have a cluster with 182 OSDs, this has been expanded towards 282 OSDs.
> Some disks were near full.
> The new disks have been added with initial weight = 0.
> The original plan was to increase this slowly towards their full weight
> using the gentle reweight script. However, this is going way too slow and
> I'm also having issues now with "backfill_toofull".
> Can I just add all the OSDs with their full weight, or will I get a lot of
> issues when I'm doing that?
> I know that a lot of PGs will have to be replaced, but increasing the
> weight slowly will take a year at the current speed. I'm already playing
> with the max backfill to increase the speed, but every time I increase the
> weight it will take a lot of time again...
> I can face the fact that there will be a performance decrease.
>
> Looking forward to your comments!
>
> Regards,
>
> Kristof
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Radosgw Multisite Sync

2020-08-14 Thread Ansgar Jazdzewski
Hi,

It looks like only the buckets belonging to my tenant user are not in sync:
<...>
radosgw-admin --tenant tmp --uid test --display-name "Test User"
--access_key 1VIH8RUV7OD5I3IWFX5H --secret
0BvSbieeHhKi7gLHyN8zsVPHIzEFRwEXZwgj0u22 user create
<..>

Do I have to create a new group/flow/pipe for each tenant?
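
If a per-tenant rule turns out to be necessary, my first guess would be
something like the sketch below. This is untested, and I am assuming that
tenant buckets can be addressed as 'tenant/bucket' in the pipe arguments,
analogous to other radosgw-admin commands:

```
# untested sketch: restrict a pipe to the buckets of the "tmp" tenant
radosgw-admin sync group pipe create \
--group-id=group1 \
--pipe-id=pipe-tmp \
--source-zones='*' \
--source-bucket='tmp/*' \
--dest-zones='*' \
--dest-bucket='tmp/*'
radosgw-admin period update \
--commit
```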

Thanks,
Ansgar

Am Fr., 14. Aug. 2020 um 16:59 Uhr schrieb Ansgar Jazdzewski
:
>
> Hi,
>
> > As far as I understand, we are talking about Ceph 15.2.x Octopus, right?
>
> Yes, I'm on Ceph 15.2.4.
>
> > What is the number of zones/realms/zonegroups?
>
> At the moment I'm running just a small test on my local machine: one
> zonegroup (global) with the zones node01 and node02, and just one realm.
>
> > Is Ceph healthy? (ceph -s and ceph health detail )
>
> Ceph is fine on both clusters.
>
> > What does radosgw-admin sync status say?
>
> root@node01:/home/vagrant# radosgw-admin sync status
>  realm 17331a9d-8424-40f6-b35b-5cd21faf1561 (global)
>  zonegroup ffb97955-e89a-42fc-b8f0-926ad18d56bc (global)
>   zone acff3488-0ae4-4733-8f8c-a90baf7d09e9 (global-node01)
>  metadata sync no sync (zone is master)
>  data sync source: b88a3bbf-dde6-4758-846b-49838d398e6e
> (global-node02)
>    syncing
>    full sync: 0/128 shards
>    incremental sync: 128/128 shards
>    data is caught up with source
>
> root@node02:/home/vagrant# radosgw-admin sync status
>  realm 17331a9d-8424-40f6-b35b-5cd21faf1561 (global)
>  zonegroup ffb97955-e89a-42fc-b8f0-926ad18d56bc (global)
>   zone b88a3bbf-dde6-4758-846b-49838d398e6e (global-node02)
>  metadata sync syncing
>    full sync: 0/64 shards
>    incremental sync: 64/64 shards
>    metadata is caught up with master
>  data sync source: acff3488-0ae4-4733-8f8c-a90baf7d09e9
> (global-node01)
>    syncing
>    full sync: 0/128 shards
>    incremental sync: 128/128 shards
>    data is caught up with source
>
> > Do you see your zone.user (or whatever you name it) in both zones with the 
> > same credentials?
>
> The user is the same:
> root@node01:/home/vagrant# radosgw-admin user info --uid=synchronization-user
> ...
> "user_id": "synchronization-user",
> "display_name": "Synchronization User",
> "user": "synchronization-user",
> "access_key": "B4BVEJJZ4R7PB5EJKIW4",
> "secret_key": "wNyAAioDQenNSvo6eXEJH118047D0a4CabTYXAIE"
> ...
>
> > Did it work without sync group/flow/pipe settings?
>
> Yes, without it the metadata was in sync.
>
> > Is there any useful information in radosgw logfile?
> >
> > You can change the log level in your ceph.conf file with the line 
> > (https://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#:~:text=Ceph%20Subsystems,and%2020%20is%20verbose%201%20.)
> >
> > [global]
> > <...>
> > debug rgw = 20
> > <...>
> >
> > and restart your radosgw daemon.
>
> I'll try.
>
> From my understanding it should be possible to write into the same
> bucket on both clusters at the same time, and the two sides will sync
> with each other?
> Also, if I upload data (on the master, two files of around 100 KB), it
> takes a lot of time (10 minutes) until both sides are back in sync:
>
>   data sync source: acff3488-0ae4-4733-8f8c-a90baf7d09e9 (global-node01)
>    syncing
>    full sync: 0/128 shards
>    incremental sync: 128/128 shards
>    2 shards are recovering
>    recovering shards: [71,72]
>
> From my understanding that should be a lot faster?
>
> Thanks,
> Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Radosgw Multisite Sync

2020-08-14 Thread Ansgar Jazdzewski
Hi,

> As far as I understand, we are talking about Ceph 15.2.x Octopus, right?

Yes, I'm on Ceph 15.2.4.

> What is the number of zones/realms/zonegroups?

At the moment I'm running just a small test on my local machine: one
zonegroup (global) with the zones node01 and node02, and just one realm.

> Is Ceph healthy? (ceph -s and ceph health detail )

Ceph is fine on both clusters.

> What does radosgw-admin sync status say?

root@node01:/home/vagrant# radosgw-admin sync status
 realm 17331a9d-8424-40f6-b35b-5cd21faf1561 (global)
 zonegroup ffb97955-e89a-42fc-b8f0-926ad18d56bc (global)
  zone acff3488-0ae4-4733-8f8c-a90baf7d09e9 (global-node01)
 metadata sync no sync (zone is master)
 data sync source: b88a3bbf-dde6-4758-846b-49838d398e6e
(global-node02)
   syncing
   full sync: 0/128 shards
   incremental sync: 128/128 shards
   data is caught up with source

root@node02:/home/vagrant# radosgw-admin sync status
 realm 17331a9d-8424-40f6-b35b-5cd21faf1561 (global)
 zonegroup ffb97955-e89a-42fc-b8f0-926ad18d56bc (global)
  zone b88a3bbf-dde6-4758-846b-49838d398e6e (global-node02)
 metadata sync syncing
   full sync: 0/64 shards
   incremental sync: 64/64 shards
   metadata is caught up with master
 data sync source: acff3488-0ae4-4733-8f8c-a90baf7d09e9
(global-node01)
   syncing
   full sync: 0/128 shards
   incremental sync: 128/128 shards
   data is caught up with source

> Do you see your zone.user (or whatever you name it) in both zones with the 
> same credentials?

The user is the same:
root@node01:/home/vagrant# radosgw-admin user info --uid=synchronization-user
...
"user_id": "synchronization-user",
"display_name": "Synchronization User",
"user": "synchronization-user",
"access_key": "B4BVEJJZ4R7PB5EJKIW4",
"secret_key": "wNyAAioDQenNSvo6eXEJH118047D0a4CabTYXAIE"
...

> Did it work without sync group/flow/pipe settings?

Yes, without it the metadata was in sync.

> Is there any useful information in radosgw logfile?
>
> You can change the log level in your ceph.conf file with the line 
> (https://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#:~:text=Ceph%20Subsystems,and%2020%20is%20verbose%201%20.)
>
> [global]
> <...>
> debug rgw = 20
> <...>
>
> and restart your radosgw daemon.

I'll try.

From my understanding it should be possible to write into the same
bucket on both clusters at the same time, and the two sides will sync
with each other?
Also, if I upload data (on the master, two files of around 100 KB), it
takes a lot of time (10 minutes) until both sides are back in sync:

  data sync source: acff3488-0ae4-4733-8f8c-a90baf7d09e9 (global-node01)
   syncing
   full sync: 0/128 shards
   incremental sync: 128/128 shards
   2 shards are recovering
   recovering shards: [71,72]

From my understanding that should be a lot faster?

Thanks,
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Radosgw Multisite Sync

2020-08-14 Thread Ansgar Jazdzewski
Hi Folks,

I'm trying to move from our own custom bucket synchronization to the
built-in RadosGW one.

The multisite setup is working (https://docs.ceph.com/docs/master/radosgw/multisite/);
all buckets and users are visible in both clusters.

Next I tried to set up the multisite sync policy
(https://docs.ceph.com/docs/master/radosgw/multisite-sync-policy/).
For the first test I wanted a fully symmetric setup, so I
configured it as follows:

radosgw-admin sync group create \
--group-id=group1 \
--status=allowed
radosgw-admin sync group flow create \
--group-id=group1 \
--flow-id=flow-mirror \
--flow-type=symmetrical \
--zones=*
radosgw-admin sync group pipe create \
--group-id=group1 \
--pipe-id=pipe1 \
--source-zones='*' \
--source-bucket='*' \
--dest-zones='*' \
--dest-bucket='*'
radosgw-admin sync group modify \
--group-id=group1 \
--status=enabled
radosgw-admin period update \
--commit

But the objects are not being copied around. Do I need to start the
copy process somehow? Can I find some debug information somewhere?
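
For reference, the commands I have found so far for inspecting the sync state
(besides raising the rgw debug level) are roughly these; the bucket name is a
placeholder:

```
# overall replication state of the local zone
radosgw-admin sync status

# per-bucket view, useful to see whether a single bucket lags behind
radosgw-admin bucket sync status --bucket=<bucket>

# errors recorded by the sync threads
radosgw-admin sync error list
```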

Thanks,
Ansgar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io