[ceph-users] ceph-iscsi issue after upgrading from nautilus to octopus

2021-04-15 Thread icy chan
Hi,

I have several clusters running Nautilus that are pending an upgrade to
Octopus.

I am now testing the upgrade steps for a Ceph cluster from Nautilus to
Octopus using cephadm adopt in a lab, following the link below:
- https://docs.ceph.com/en/octopus/cephadm/adoption/

Lab environment:
3 all-in-one nodes.
OS: CentOS 7.9.2009 with podman 1.6.4.

After the adoption, ceph health keeps warning about tcmu-runner daemons not
managed by cephadm.
# ceph health detail
HEALTH_WARN 12 stray daemon(s) not managed by cephadm; 1 pool(s) have no
replicas configured
[WRN] CEPHADM_STRAY_DAEMON: 12 stray daemon(s) not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_01 on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_02 on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_03 on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio1:iSCSI/iscsi_image_test on host
ceph-aio1 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_01 on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_02 on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_03 on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio2:iSCSI/iscsi_image_test on host
ceph-aio2 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_01 on host
ceph-aio3 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_02 on host
ceph-aio3 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_03 on host
ceph-aio3 not managed by cephadm
stray daemon tcmu-runner.ceph-aio3:iSCSI/iscsi_image_test on host
ceph-aio3 not managed by cephadm

And tcmu-runner is still running the old version.
# ceph versions
{
"mon": {
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 3
},
"mgr": {
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 1
},
"osd": {
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 9
},
"mds": {},
"tcmu-runner": {
"ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9)
nautilus (stable)": 12
},
"overall": {
"ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9)
nautilus (stable)": 12,
"ceph version 15.2.10 (27917a557cca91e4da407489bbaa64ad4352cc02)
octopus (stable)": 13
}
}

I didn't find any ceph-iscsi-related upgrade steps in the reference link
above.
Can anyone here point me in the right direction for upgrading the ceph-iscsi
components?
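
For reference, one approach I was considering (just a rough sketch that I
have not tested yet; the pool name, credentials and IPs are placeholders) is
to redeploy the gateways as a cephadm-managed iscsi service once the cluster
is on Octopus, and then retire the old ceph-iscsi / tcmu-runner setup:

# cat iscsi-spec.yaml
service_type: iscsi
service_id: iscsi-gw
placement:
  hosts:
    - ceph-aio1
    - ceph-aio2
    - ceph-aio3
spec:
  pool: <iscsi-pool>
  api_user: <api-user>
  api_password: <api-password>
  trusted_ip_list: "<ip-aio1>,<ip-aio2>,<ip-aio3>"
# ceph orch apply -i iscsi-spec.yaml

If the stray-daemon warning is only cosmetic while the migration is being
planned, it can apparently also be muted with "ceph config set mgr
mgr/cephadm/warn_on_stray_daemons false", though that of course upgrades
nothing.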

Thanks.

Regs,
Icy
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Swift Stat Timeout

2021-04-15 Thread Dylan Griff
Just some more info on this: it started happening after they added several
thousand objects to their buckets. While the client side times out, the
operation seems to proceed in Ceph for a very long time, happily working away
gathering the stat info for their objects. It doesn't appear to be failing,
just taking an extremely long time. This doesn't seem right to me, but can
someone confirm whether they can run an account-level stat with swift for a
user with several thousand buckets/objects?

Any info would be helpful!
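
(In case it helps anyone reproduce this: a rough way to compare the
client-side call against a server-side aggregate, assuming radosgw-admin
access and that the Swift account maps to RGW uid <uid>; the bracketed
values are placeholders:)

time swift -A https://<rgw-host>:443/auth/1.0 -U <user> -K <key> stat
radosgw-admin user stats --uid=<uid> --sync-stats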

Cheers,
Dylan

On Tue, 2021-04-13 at 21:50 +, Dylan Griff wrote:
> Hey folks!
> 
> We have a user with ~1900 buckets in our RGW service and running this stat 
> command results in a timeout for them:
> 
> swift -A https://:443/auth/1.0 -U  -K  stat
> 
> Running the same command, but specifying one of their buckets, returns
> promptly. Running the command for a different user with minimal buckets
> returns promptly as well. Turning up debug logging to 20 for rgw resulted
> in a great deal of logs showing:
> 
> 20 reading from default.rgw.meta:root:.bucket.meta.
> 20 get_system_obj_state: rctx=0x559b32a6b570 
> obj=default.rgw.meta:root:.bucket.meta. state=0x559b32c37e40 
> s->prefetch_data=0
> 10 cache get: name=default.rgw.meta+root+.bucket.meta. : hit 
> (requested=0x16, cached=0x17)
> 20 get_system_obj_state: s->obj_tag was set empty
> 10 cache get: name=default.rgw.meta+root+.bucket.meta. : hit 
> (requested=0x11, cached=0x17)
> 
> Which looks to me like it is iterating over the state of all their buckets.
> My question: is ~1900 an unreasonable number of buckets, such that we should
> expect this full account 'stat' command to time out? Or should I be
> expecting it to return promptly still? Thanks!
> 
> Cheers,
> Dylan
> --
> Dylan Griff
> Senior System Administrator
> CLE D063
> RCS - Systems - University of Victoria
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Fresh install of Ceph using Ansible

2021-04-15 Thread Philip Brown
erm
use ceph-ansible? :)


Go to GitHub and find the correct branch associated with the particular
release of Ceph you want to use, then follow the ceph-ansible docs on setup.
For example, to use Ceph Octopus you are best off with the "STABLE-5" branch.

After that, a hint:
Just try to get 3 "mon" nodes working initially.

You will also want to use an Ansible host inventory in YAML format for
maximum flexibility.

example:

all:
  children:
    mons:
      hosts:
        cephmon01:
        cephmon02:
        cephmon03:
    mgrs:
      children:
        mons:
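
Once the inventory and the group_vars files are filled in per the docs,
running it is roughly (a sketch, assuming a stable-5 checkout of
ceph-ansible):

cd ceph-ansible
cp site.yml.sample site.yml
ansible-playbook -i inventory.yml site.yml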


- Original Message -
From: "Jared Jacob" 
To: "ceph-users" 
Sent: Thursday, April 15, 2021 9:57:02 AM
Subject: [ceph-users] Fresh install of Ceph using Ansible

I am looking to rebuild my ceph cluster using Ansible. What is the best way
to start this process?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm upgrade to Pacific problem

2021-04-15 Thread Eneko Lacunza

Hi Dave,

I see now what the problem was. Thanks a lot for the link.

Cheers

On 15/4/21 at 14:55, Dave Hall wrote:

Eneko,

For clarification, this is the link I used to fix my particular 
docker issue:


https://github.com/Debian/docker.io/blob/master/debian/README.Debian



The specific issue for me was as follows:

My cluster is running:

  * Debian 10 with DefaultRelease=buster-backports
  * Ceph packages from the Debian repo
  * Non-container installation (except for Grafana/Prometheus/Node
Exporter)
  * Docker.io repo added by ceph-ansible

The Grafana and Prometheus containers stopped working after running 
apt-get dist-upgrade (via ceph-ansible) to pick up the latest Debian 
Ceph packages.  The systemd log messages were similar to those 
reported by Radoslav.  A Google search led me to the link above.  The 
suggested addition to the kernel command line fixed the issue.


-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu 

On Thu, Apr 15, 2021 at 4:07 AM Eneko Lacunza wrote:


Hi Dave,

El 14/4/21 a las 19:15, Dave Hall escribió:
> Radoslav,
>
> I ran into the same.  For Debian 10 - recent updates - you have
to add
> 'cgroup_enable=memory swapaccount=1' to the kernel command line
> (/etc/default/grub).  The reference I found said that Debian
decided to
> disable this by default and make us turn it on if we want to run
containers.
I find this quite strange. We have several updated Debian 10 servers
running Docker containers (not related to Ceph) without needing this
tuning. Docker is from Debian repos but we had a couple with
docker.io 
version without issues.

Cheers

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es 
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO

https://www.linkedin.com/company/37269706/

___
ceph-users mailing list -- ceph-users@ceph.io

To unsubscribe send an email to ceph-users-le...@ceph.io




Eneko Lacunza
Director Técnico | Zuzendari teknikoa
Binovo IT Human Project

943 569 206
elacu...@binovo.es
binovo.es
Astigarragako Bidea, 2 - 2 izda. Oficina 10-11, 20180 Oiartzun

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Fresh install of Ceph using Ansible

2021-04-15 Thread Jared Jacob
I am looking to rebuild my ceph cluster using Ansible. What is the best way
to start this process?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm upgrade to Pacific problem

2021-04-15 Thread Dave Hall
Eneko,

For clarification, this is the link I used to fix my particular
docker issue:

https://github.com/Debian/docker.io/blob/master/debian/README.Debian


The specific issue for me was as follows:

My cluster is running:

   - Debian 10 with DefaultRelease=buster-backports
   - Ceph packages from the Debian repo
   - Non-container installation (except for Grafana/Prometheus/Node
   Exporter)
   - Docker.io repo added by ceph-ansible

The Grafana and Prometheus containers stopped working after running apt-get
dist-upgrade (via ceph-ansible) to pick up the latest Debian Ceph packages.
The systemd log messages were similar to those reported by Radoslav.  A
Google search led me to the link above.  The suggested addition to the
kernel command line fixed the issue.
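
(For completeness, what this amounted to on our Debian 10 nodes, roughly:
append the two options to the existing GRUB_CMDLINE_LINUX in
/etc/default/grub, then regenerate the GRUB config and reboot.)

GRUB_CMDLINE_LINUX="... cgroup_enable=memory swapaccount=1"

update-grub
reboot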

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu

On Thu, Apr 15, 2021 at 4:07 AM Eneko Lacunza  wrote:

> Hi Dave,
>
> El 14/4/21 a las 19:15, Dave Hall escribió:
> > Radoslav,
> >
> > I ran into the same.  For Debian 10 - recent updates - you have to add
> > 'cgroup_enable=memory swapaccount=1' to the kernel command line
> > (/etc/default/grub).  The reference I found said that Debian decided to
> > disable this by default and make us turn it on if we want to run
> containers.
> I find this quite strange. We have several updated Debian 10 servers
> running Docker containers (not related to Ceph) without needing this
> tuning. Docker is from Debian repos but we had a couple with docker.io
> version without issues.
>
> Cheers
>
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
>
> Tel. +34 943 569 206 | https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
>
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
Cheers,

[root@s3db1 ~]#  ceph daemon osd.23 perf dump | grep numpg
"numpg": 187,
"numpg_primary": 64,
"numpg_replica": 121,
"numpg_stray": 2,
"numpg_removing": 0,


On Thu, 15 Apr 2021 at 18:18, 胡 玮文 wrote:

> Hi Boris,
>
> Could you check something like
>
> ceph daemon osd.23 perf dump | grep numpg
>
> to see if there are some stray or removing PG?
>
> Weiwen Hu
>
> > On 15 Apr 2021, at 22:53, Boris Behrens wrote:
> >
> > Ah you are right.
> > [root@s3db1 ~]# ceph daemon osd.23 config get
> bluestore_min_alloc_size_hdd
> > {
> >"bluestore_min_alloc_size_hdd": "65536"
> > }
> > But I also checked how many objects our s3 hold and the numbers just do
> not
> > add up.
> > There are only 26509200 objects, which would result in around 1TB "waste"
> > if every object would be empty.
> >
> > I think the problem began when I updated the PG count from 1024 to 2048.
> > Could there be an issue where the data is written twice?
> >
> >
> >> Am Do., 15. Apr. 2021 um 16:48 Uhr schrieb Amit Ghadge <
> amitg@gmail.com
> >>> :
> >>
> >> Verify those two parameter values, bluestore_min_alloc_size_hdd &
> >> bluestore_min_alloc_size_ssd. If you are using HDD disks, then
> >> bluestore_min_alloc_size_hdd is the one that applies.
> >>
> >>> On Thu, Apr 15, 2021 at 8:06 PM Boris Behrens  wrote:
> >>>
> >>> So, I need to live with it? A value of zero leads to use the default?
> >>> [root@s3db1 ~]# ceph daemon osd.23 config get bluestore_min_alloc_size
> >>> {
> >>>"bluestore_min_alloc_size": "0"
> >>> }
> >>>
> >>> I also checked the fragmentation on the bluestore OSDs and it is around
> >>> 0.80 - 0.89 on most OSDs. yikes.
> >>> [root@s3db1 ~]# ceph daemon osd.23 bluestore allocator score block
> >>> {
> >>>"fragmentation_rating": 0.85906054329923576
> >>> }
> >>>
> >>> The problem I currently have is, that I barely keep up with adding OSD
> >>> disks.
> >>>
> >>> Am Do., 15. Apr. 2021 um 16:18 Uhr schrieb Amit Ghadge <
> >>> amitg@gmail.com>:
> >>>
>  size_kb_actual are actually bucket object size but on OSD level the
>  bluestore_min_alloc_size default 64KB and SSD are 16KB
> 
> 
> 
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/osd-bluestore
> 
>  -AmitG
> 
>  On Thu, Apr 15, 2021 at 7:29 PM Boris Behrens  wrote:
> 
> > Hi,
> >
> > maybe it is just a problem in my understanding, but it looks like
> our s3
> > requires twice the space it should use.
> >
> > I ran "radosgw-admin bucket stats", and added all "size_kb_actual"
> > values
> > up and divided to TB (/1024/1024/1024).
> > The resulting space is 135,1636733 TB. When I triple it because of
> > replication I end up with around 405TB which is nearly half the
> space of
> > what ceph df tells me.
> >
> > Hope someone can help me.
> >
> > ceph df shows
> > RAW STORAGE:
> >CLASS SIZE AVAIL   USEDRAW USED %RAW
> > USED
> >hdd   1009 TiB 189 TiB 820 TiB  820 TiB
> > 81.26
> >TOTAL 1009 TiB 189 TiB 820 TiB  820 TiB
> > 81.26
> >
> > POOLS:
> >POOLID PGS  STORED
> > OBJECTS
> >USED%USED MAX AVAIL
> >rbd  0   64 0 B
> >   0
> >0 B 018 TiB
> >.rgw.root1   64  99 KiB
> > 119
> > 99 KiB 018 TiB
> >eu-central-1.rgw.control 2   64 0 B
> >   8
> >0 B 018 TiB
> >eu-central-1.rgw.data.root   3   64 1.0 MiB
> > 3.15k
> >1.0 MiB 018 TiB
> >eu-central-1.rgw.gc  4   64  71 MiB
> >  32
> > 71 MiB 018 TiB
> >eu-central-1.rgw.log 5   64 267 MiB
> > 564
> >267 MiB 018 TiB
> >eu-central-1.rgw.users.uid   6   64 2.8 MiB
> > 6.91k
> >2.8 MiB 018 TiB
> >eu-central-1.rgw.users.keys  7   64 263 KiB
> > 6.73k
> >263 KiB 018 TiB
> >eu-central-1.rgw.meta8   64 384 KiB
> >  1k
> >384 KiB 018 TiB
> >eu-central-1.rgw.users.email 9   6440 B
> >   1
> >   40 B 018 TiB
> >

[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread 胡 玮文
Hi Boris,

Could you check something like

ceph daemon osd.23 perf dump | grep numpg

to see if there are some stray or removing PGs?

Weiwen Hu

> On 15 Apr 2021, at 22:53, Boris Behrens wrote:
> 
> Ah you are right.
> [root@s3db1 ~]# ceph daemon osd.23 config get bluestore_min_alloc_size_hdd
> {
>"bluestore_min_alloc_size_hdd": "65536"
> }
> But I also checked how many objects our s3 hold and the numbers just do not
> add up.
> There are only 26509200 objects, which would result in around 1TB "waste"
> if every object would be empty.
> 
> I think the problem began when I updated the PG count from 1024 to 2048.
> Could there be an issue where the data is written twice?
> 
> 
>> Am Do., 15. Apr. 2021 um 16:48 Uhr schrieb Amit Ghadge >> :
>> 
>> Verify those two parameter values, bluestore_min_alloc_size_hdd &
>> bluestore_min_alloc_size_ssd. If you are using HDD disks, then
>> bluestore_min_alloc_size_hdd is the one that applies.
>> 
>>> On Thu, Apr 15, 2021 at 8:06 PM Boris Behrens  wrote:
>>> 
>>> So, I need to live with it? A value of zero leads to use the default?
>>> [root@s3db1 ~]# ceph daemon osd.23 config get bluestore_min_alloc_size
>>> {
>>>"bluestore_min_alloc_size": "0"
>>> }
>>> 
>>> I also checked the fragmentation on the bluestore OSDs and it is around
>>> 0.80 - 0.89 on most OSDs. yikes.
>>> [root@s3db1 ~]# ceph daemon osd.23 bluestore allocator score block
>>> {
>>>"fragmentation_rating": 0.85906054329923576
>>> }
>>> 
>>> The problem I currently have is, that I barely keep up with adding OSD
>>> disks.
>>> 
>>> Am Do., 15. Apr. 2021 um 16:18 Uhr schrieb Amit Ghadge <
>>> amitg@gmail.com>:
>>> 
 size_kb_actual are actually bucket object size but on OSD level the
 bluestore_min_alloc_size default 64KB and SSD are 16KB
 
 
 https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/osd-bluestore
 
 -AmitG
 
 On Thu, Apr 15, 2021 at 7:29 PM Boris Behrens  wrote:
 
> Hi,
> 
> maybe it is just a problem in my understanding, but it looks like our s3
> requires twice the space it should use.
> 
> I ran "radosgw-admin bucket stats", and added all "size_kb_actual"
> values
> up and divided to TB (/1024/1024/1024).
> The resulting space is 135,1636733 TB. When I triple it because of
> replication I end up with around 405TB which is nearly half the space of
> what ceph df tells me.
> 
> Hope someone can help me.
> 
> ceph df shows
> RAW STORAGE:
>CLASS SIZE AVAIL   USEDRAW USED %RAW
> USED
>hdd   1009 TiB 189 TiB 820 TiB  820 TiB
> 81.26
>TOTAL 1009 TiB 189 TiB 820 TiB  820 TiB
> 81.26
> 
> POOLS:
>POOLID PGS  STORED
> OBJECTS
>USED%USED MAX AVAIL
>rbd  0   64 0 B
>   0
>0 B 018 TiB
>.rgw.root1   64  99 KiB
> 119
> 99 KiB 018 TiB
>eu-central-1.rgw.control 2   64 0 B
>   8
>0 B 018 TiB
>eu-central-1.rgw.data.root   3   64 1.0 MiB
> 3.15k
>1.0 MiB 018 TiB
>eu-central-1.rgw.gc  4   64  71 MiB
>  32
> 71 MiB 018 TiB
>eu-central-1.rgw.log 5   64 267 MiB
> 564
>267 MiB 018 TiB
>eu-central-1.rgw.users.uid   6   64 2.8 MiB
> 6.91k
>2.8 MiB 018 TiB
>eu-central-1.rgw.users.keys  7   64 263 KiB
> 6.73k
>263 KiB 018 TiB
>eu-central-1.rgw.meta8   64 384 KiB
>  1k
>384 KiB 018 TiB
>eu-central-1.rgw.users.email 9   6440 B
>   1
>   40 B 018 TiB
>eu-central-1.rgw.buckets.index  10   64  10 GiB
> 67.61k
> 10 GiB  0.0218 TiB
>eu-central-1.rgw.buckets.data   11 2048 264 TiB
> 138.31M
>264 TiB 83.3718 TiB
>eu-central-1.rgw.buckets.non-ec 12   64 297 MiB
> 11.32k
>297 MiB 018 TiB
>eu-central-1.rgw.usage  13   64 536 MiB
>  32
>536 MiB 018 TiB
>eu-msg-1.rgw.control

[ceph-users] How to handle bluestore fragmentation

2021-04-15 Thread David Caro

Reading the thread "s3 requires twice the space it should use", Boris pointed
out that the fragmentation for the osds is around 0.8-0.9:


> On Thu, Apr 15, 2021 at 8:06 PM Boris Behrens  wrote:
>> I also checked the fragmentation on the bluestore OSDs and it is around
>> 0.80 - 0.89 on most OSDs. yikes.
>> [root@s3db1 ~]# ceph daemon osd.23 bluestore allocator score block
>> {
>> "fragmentation_rating": 0.85906054329923576
>> }


And that made me wonder: what is the currently recommended (and not
recommended) way to handle and reduce fragmentation on existing OSDs?

Reading around, I would think of tweaking min_alloc_size_{ssd,hdd} and
redeploying those OSDs, but I was unable to find much else. I wonder what
other people do.
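
To make that concrete, the "tweak and redeploy" route I had in mind looks
roughly like this (a sketch only; 4096 is just an example value, and as far
as I understand the option only takes effect for OSDs created after the
change, hence the redeploy):

# only affects OSDs created from now on
ceph config set osd bluestore_min_alloc_size_hdd 4096
# then drain, destroy and re-create each OSD so the new value is picked up,
# e.g. with cephadm:
ceph orch osd rm <osd-id> --replace
# ...or the ceph-volume equivalent on non-cephadm deployments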


ps. There was another thread that got no replies asking something similar (and
a bunch of other things):
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/3PITWZRNX7RFRQNG33VSNKYGOO2IFMZG/


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
Ah you are right.
[root@s3db1 ~]# ceph daemon osd.23 config get bluestore_min_alloc_size_hdd
{
"bluestore_min_alloc_size_hdd": "65536"
}
But I also checked how many objects our S3 holds, and the numbers just do
not add up.
There are only 26,509,200 objects, which would result in around 1 TB of
"waste" even if every object were empty.

I think the problem began when I updated the PG count from 1024 to 2048.
Could there be an issue where the data is written twice?


On Thu, 15 Apr 2021 at 16:48, Amit Ghadge wrote:

> Verify those two parameter values, bluestore_min_alloc_size_hdd &
> bluestore_min_alloc_size_ssd. If you are using HDD disks, then
> bluestore_min_alloc_size_hdd is the one that applies.
>
> On Thu, Apr 15, 2021 at 8:06 PM Boris Behrens  wrote:
>
>> So, I need to live with it? A value of zero leads to use the default?
>> [root@s3db1 ~]# ceph daemon osd.23 config get bluestore_min_alloc_size
>> {
>> "bluestore_min_alloc_size": "0"
>> }
>>
>> I also checked the fragmentation on the bluestore OSDs and it is around
>> 0.80 - 0.89 on most OSDs. yikes.
>> [root@s3db1 ~]# ceph daemon osd.23 bluestore allocator score block
>> {
>> "fragmentation_rating": 0.85906054329923576
>> }
>>
>> The problem I currently have is, that I barely keep up with adding OSD
>> disks.
>>
>> Am Do., 15. Apr. 2021 um 16:18 Uhr schrieb Amit Ghadge <
>> amitg@gmail.com>:
>>
>>> size_kb_actual is the actual bucket object size, but at the OSD level
>>> bluestore_min_alloc_size defaults to 64 KB for HDDs and 16 KB for SSDs
>>>
>>>
>>> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/osd-bluestore
>>>
>>> -AmitG
>>>
>>> On Thu, Apr 15, 2021 at 7:29 PM Boris Behrens  wrote:
>>>
 Hi,

 maybe it is just a problem in my understanding, but it looks like our s3
 requires twice the space it should use.

 I ran "radosgw-admin bucket stats", and added all "size_kb_actual"
 values
 up and divided to TB (/1024/1024/1024).
 The resulting space is 135,1636733 TB. When I triple it because of
 replication I end up with around 405TB which is nearly half the space of
 what ceph df tells me.

 Hope someone can help me.

 ceph df shows
 RAW STORAGE:
 CLASS SIZE AVAIL   USEDRAW USED %RAW
 USED
 hdd   1009 TiB 189 TiB 820 TiB  820 TiB
  81.26
 TOTAL 1009 TiB 189 TiB 820 TiB  820 TiB
  81.26

 POOLS:
 POOLID PGS  STORED
 OBJECTS
 USED%USED MAX AVAIL
 rbd  0   64 0 B
0
 0 B 018 TiB
 .rgw.root1   64  99 KiB
  119
  99 KiB 018 TiB
 eu-central-1.rgw.control 2   64 0 B
8
 0 B 018 TiB
 eu-central-1.rgw.data.root   3   64 1.0 MiB
  3.15k
 1.0 MiB 018 TiB
 eu-central-1.rgw.gc  4   64  71 MiB
   32
  71 MiB 018 TiB
 eu-central-1.rgw.log 5   64 267 MiB
  564
 267 MiB 018 TiB
 eu-central-1.rgw.users.uid   6   64 2.8 MiB
  6.91k
 2.8 MiB 018 TiB
 eu-central-1.rgw.users.keys  7   64 263 KiB
  6.73k
 263 KiB 018 TiB
 eu-central-1.rgw.meta8   64 384 KiB
   1k
 384 KiB 018 TiB
 eu-central-1.rgw.users.email 9   6440 B
1
40 B 018 TiB
 eu-central-1.rgw.buckets.index  10   64  10 GiB
 67.61k
  10 GiB  0.0218 TiB
 eu-central-1.rgw.buckets.data   11 2048 264 TiB
  138.31M
 264 TiB 83.3718 TiB
 eu-central-1.rgw.buckets.non-ec 12   64 297 MiB
 11.32k
 297 MiB 018 TiB
 eu-central-1.rgw.usage  13   64 536 MiB
   32
 536 MiB 018 TiB
 eu-msg-1.rgw.control56   64 0 B
8
 0 B 018 TiB
 eu-msg-1.rgw.data.root  57   64  72 KiB
  227
  72 KiB 018 TiB
 eu-msg-1.rgw.gc 58   64 300 KiB
   32
 300 KiB 018 TiB
 eu-msg-1.rgw.log59   64 835 KiB
  242
 835 KiB 018 TiB
 eu-msg-1.rgw.users.uid  60   64  56 KiB
  104
  56 KiB 018 TiB
 eu-msg-1.rgw.usage  61   64  37 MiB
   25
 

[ceph-users] Re: DocuBetter Meeting This Week -- 1630 UTC

2021-04-15 Thread Mike Perez
Here's the recording for the meeting:

https://www.youtube.com/watch?v=5bemZ8opdhs

On Wed, Apr 14, 2021 at 1:34 AM John Zachary Dover  wrote:
>
> This week's meeting will focus on the ongoing rewrite of the cephadm
> documentation and the upcoming Google Season of Docs project.
>
> Meeting: https://bluejeans.com/908675367
> Etherpad: https://pad.ceph.com/p/Ceph_Documentation
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Mike Perez
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
So, I need to live with it? Does a value of zero mean the default is used?
[root@s3db1 ~]# ceph daemon osd.23 config get bluestore_min_alloc_size
{
"bluestore_min_alloc_size": "0"
}

I also checked the fragmentation on the bluestore OSDs and it is around
0.80 - 0.89 on most OSDs. yikes.
[root@s3db1 ~]# ceph daemon osd.23 bluestore allocator score block
{
"fragmentation_rating": 0.85906054329923576
}

The problem I currently have is that I can barely keep up with adding OSD
disks.

On Thu, 15 Apr 2021 at 16:18, Amit Ghadge wrote:

> size_kb_actual is the actual bucket object size, but at the OSD level
> bluestore_min_alloc_size defaults to 64 KB for HDDs and 16 KB for SSDs
>
>
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/osd-bluestore
>
> -AmitG
>
> On Thu, Apr 15, 2021 at 7:29 PM Boris Behrens  wrote:
>
>> Hi,
>>
>> maybe it is just a problem in my understanding, but it looks like our s3
>> requires twice the space it should use.
>>
>> I ran "radosgw-admin bucket stats", and added all "size_kb_actual" values
>> up and divided to TB (/1024/1024/1024).
>> The resulting space is 135,1636733 TB. When I triple it because of
>> replication I end up with around 405TB which is nearly half the space of
>> what ceph df tells me.
>>
>> Hope someone can help me.
>>
>> ceph df shows
>> RAW STORAGE:
>> CLASS SIZE AVAIL   USEDRAW USED %RAW USED
>> hdd   1009 TiB 189 TiB 820 TiB  820 TiB 81.26
>> TOTAL 1009 TiB 189 TiB 820 TiB  820 TiB 81.26
>>
>> POOLS:
>> POOLID PGS  STORED
>> OBJECTS
>> USED%USED MAX AVAIL
>> rbd  0   64 0 B
>>  0
>> 0 B 018 TiB
>> .rgw.root1   64  99 KiB
>>  119
>>  99 KiB 018 TiB
>> eu-central-1.rgw.control 2   64 0 B
>>  8
>> 0 B 018 TiB
>> eu-central-1.rgw.data.root   3   64 1.0 MiB
>>  3.15k
>> 1.0 MiB 018 TiB
>> eu-central-1.rgw.gc  4   64  71 MiB
>> 32
>>  71 MiB 018 TiB
>> eu-central-1.rgw.log 5   64 267 MiB
>>  564
>> 267 MiB 018 TiB
>> eu-central-1.rgw.users.uid   6   64 2.8 MiB
>>  6.91k
>> 2.8 MiB 018 TiB
>> eu-central-1.rgw.users.keys  7   64 263 KiB
>>  6.73k
>> 263 KiB 018 TiB
>> eu-central-1.rgw.meta8   64 384 KiB
>> 1k
>> 384 KiB 018 TiB
>> eu-central-1.rgw.users.email 9   6440 B
>>  1
>>40 B 018 TiB
>> eu-central-1.rgw.buckets.index  10   64  10 GiB
>> 67.61k
>>  10 GiB  0.0218 TiB
>> eu-central-1.rgw.buckets.data   11 2048 264 TiB
>>  138.31M
>> 264 TiB 83.3718 TiB
>> eu-central-1.rgw.buckets.non-ec 12   64 297 MiB
>> 11.32k
>> 297 MiB 018 TiB
>> eu-central-1.rgw.usage  13   64 536 MiB
>> 32
>> 536 MiB 018 TiB
>> eu-msg-1.rgw.control56   64 0 B
>>  8
>> 0 B 018 TiB
>> eu-msg-1.rgw.data.root  57   64  72 KiB
>>  227
>>  72 KiB 018 TiB
>> eu-msg-1.rgw.gc 58   64 300 KiB
>> 32
>> 300 KiB 018 TiB
>> eu-msg-1.rgw.log59   64 835 KiB
>>  242
>> 835 KiB 018 TiB
>> eu-msg-1.rgw.users.uid  60   64  56 KiB
>>  104
>>  56 KiB 018 TiB
>> eu-msg-1.rgw.usage  61   64  37 MiB
>> 25
>>  37 MiB 018 TiB
>> eu-msg-1.rgw.users.keys 62   64 3.8 KiB
>> 97
>> 3.8 KiB 018 TiB
>> eu-msg-1.rgw.meta   63   64 607 KiB
>>  1.60k
>> 607 KiB 018 TiB
>> eu-msg-1.rgw.buckets.index  64   64  71 MiB
>>  119
>>  71 MiB 018 TiB
>> eu-msg-1.rgw.users.email65   64 0 B
>>  0
>> 0 B 018 TiB
>> eu-msg-1.rgw.buckets.data   66   64 2.9 TiB
>>  1.16M
>> 2.9 TiB  5.3018 TiB
>> eu-msg-1.rgw.buckets.non-ec 67   64 2.2 MiB
>>  354
>> 2.2 MiB 018 TiB
>> default.rgw.control 69   32 0 B
>>  8
>> 0 B 018 TiB
>> default.rgw.data.root   70   32 0 B
>>  0
>> 0 B 018 TiB
>> default.rgw.gc  71   32 0 B
>>  0
>> 0 B 0

[ceph-users] s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
Hi,

Maybe it is just a problem in my understanding, but it looks like our S3
requires twice the space it should use.

I ran "radosgw-admin bucket stats", added up all the "size_kb_actual" values,
and divided down to TB (/1024/1024/1024).
The resulting space is 135,1636733 TB. When I triple it because of
replication I end up with around 405 TB, which is nearly half of what
ceph df tells me.
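
As a quick sanity check of my own on those numbers, using the pool stats
below: even if every one of the 138.31M RADOS objects in
eu-central-1.rgw.buckets.data wasted a full 64 KiB allocation unit, that
would only be about

138.31M * 64 KiB ~= 8.2 TiB per copy, i.e. ~25 TiB raw with 3x replication,

which is nowhere near the missing ~400 TB.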

Hope someone can help me.

ceph df shows
RAW STORAGE:
    CLASS     SIZE       AVAIL     USED      RAW USED   %RAW USED
    hdd       1009 TiB   189 TiB   820 TiB    820 TiB       81.26
    TOTAL     1009 TiB   189 TiB   820 TiB    820 TiB       81.26

POOLS:
    POOL                              ID    PGS   STORED    OBJECTS   USED      %USED   MAX AVAIL
    rbd                                0     64   0 B             0   0 B           0      18 TiB
    .rgw.root                          1     64   99 KiB        119   99 KiB        0      18 TiB
    eu-central-1.rgw.control           2     64   0 B             8   0 B           0      18 TiB
    eu-central-1.rgw.data.root         3     64   1.0 MiB     3.15k   1.0 MiB       0      18 TiB
    eu-central-1.rgw.gc                4     64   71 MiB         32   71 MiB        0      18 TiB
    eu-central-1.rgw.log               5     64   267 MiB       564   267 MiB       0      18 TiB
    eu-central-1.rgw.users.uid         6     64   2.8 MiB     6.91k   2.8 MiB       0      18 TiB
    eu-central-1.rgw.users.keys        7     64   263 KiB     6.73k   263 KiB       0      18 TiB
    eu-central-1.rgw.meta              8     64   384 KiB        1k   384 KiB       0      18 TiB
    eu-central-1.rgw.users.email       9     64   40 B            1   40 B          0      18 TiB
    eu-central-1.rgw.buckets.index    10     64   10 GiB     67.61k   10 GiB     0.02      18 TiB
    eu-central-1.rgw.buckets.data     11   2048   264 TiB   138.31M   264 TiB   83.37      18 TiB
    eu-central-1.rgw.buckets.non-ec   12     64   297 MiB    11.32k   297 MiB       0      18 TiB
    eu-central-1.rgw.usage            13     64   536 MiB        32   536 MiB       0      18 TiB
    eu-msg-1.rgw.control              56     64   0 B             8   0 B           0      18 TiB
    eu-msg-1.rgw.data.root            57     64   72 KiB        227   72 KiB        0      18 TiB
    eu-msg-1.rgw.gc                   58     64   300 KiB        32   300 KiB       0      18 TiB
    eu-msg-1.rgw.log                  59     64   835 KiB       242   835 KiB       0      18 TiB
    eu-msg-1.rgw.users.uid            60     64   56 KiB        104   56 KiB        0      18 TiB
    eu-msg-1.rgw.usage                61     64   37 MiB         25   37 MiB        0      18 TiB
    eu-msg-1.rgw.users.keys           62     64   3.8 KiB        97   3.8 KiB       0      18 TiB
    eu-msg-1.rgw.meta                 63     64   607 KiB     1.60k   607 KiB       0      18 TiB
    eu-msg-1.rgw.buckets.index        64     64   71 MiB        119   71 MiB        0      18 TiB
    eu-msg-1.rgw.users.email          65     64   0 B             0   0 B           0      18 TiB
    eu-msg-1.rgw.buckets.data         66     64   2.9 TiB     1.16M   2.9 TiB    5.30      18 TiB
    eu-msg-1.rgw.buckets.non-ec       67     64   2.2 MiB       354   2.2 MiB       0      18 TiB
    default.rgw.control               69     32   0 B             8   0 B           0      18 TiB
    default.rgw.data.root             70     32   0 B             0   0 B           0      18 TiB
    default.rgw.gc                    71     32   0 B             0   0 B           0      18 TiB
    default.rgw.log                   72     32   0 B             0   0 B           0      18 TiB
    default.rgw.users.uid             73     32   0 B             0   0 B           0      18 TiB
    fra-1.rgw.control                 74     32   0 B             8   0 B           0      18 TiB
    fra-1.rgw.meta                    75     32   0 B             0   0 B           0      18 TiB
    fra-1.rgw.log                     76     32   50 B           28   50 B          0      18 TiB


-- 
This time, as an exception, the self-help group "UTF-8 problems" will meet
in the large hall.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [External Email] Cephadm upgrade to Pacific problem

2021-04-15 Thread Eneko Lacunza

Hi Dave,

On 14/4/21 at 19:15, Dave Hall wrote:

Radoslav,

I ran into the same.  For Debian 10 - recent updates - you have to add
'cgroup_enable=memory swapaccount=1' to the kernel command line
(/etc/default/grub).  The reference I found said that Debian decided to
disable this by default and make us turn it on if we want to run containers.
I find this quite strange. We have several updated Debian 10 servers
running Docker containers (not related to Ceph) without needing this
tuning. Docker is from the Debian repos, but we also had a couple running
the docker.io version without issues.


Cheers

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: has anyone enabled bdev_enable_discard?

2021-04-15 Thread Wido den Hollander




On 13/04/2021 11:07, Dan van der Ster wrote:

On Tue, Apr 13, 2021 at 9:00 AM Wido den Hollander  wrote:




On 4/12/21 5:46 PM, Dan van der Ster wrote:

Hi all,

bdev_enable_discard has been in ceph for several major releases now
but it is still off by default.
Did anyone try it recently -- is it safe to use? And do you have perf
numbers before and after enabling?



I have done so on SATA SSDs in a few cases and: it worked

Did I notice a real difference? Not really.



Thanks, I've enabled it on a test box and am draining data to check
that it doesn't crash anything.
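
(Presumably something along these lines - per OSD, and with a restart so
BlueStore re-opens the device; osd.123 is just an example id:)

ceph config set osd.123 bdev_enable_discard true
systemctl restart ceph-osd@123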


It's highly debated whether this still makes a difference with modern flash
devices. I don't think there is a real conclusion on whether you still need
to trim/discard blocks.


Do you happen to have any more info on these debates? As you know we
have seen major performance issues on hypervisors that are not running
a periodic fstrim; we use similar or identical SATA ssds for HV local
storage and our block.db's. If it doesn't hurt anything, why wouldn't
we enable it by default?



These debates are more about whether it really makes sense with modern SSDs,
as the performance gain seems limited.

With older (SATA) SSDs it might, but with modern NVMe DC-grade ones people
are doubting whether it is still needed.


SATA 3.0 also had the issue that TRIM was a blocking command, whereas with
SATA 3.1 it became async and thus non-blocking.


With NVMe it is a different story again.

I don't have links or papers for you; it's mainly stories I heard at
conferences and such.


Wido


Cheers, Dan


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: _delete_some new onodes has appeared since PG removal started

2021-04-15 Thread Dan van der Ster
Thanks Igor and Neha for the quick responses.

I posted an osd log with debug_osd 10 and debug_bluestore 20:
ceph-post-file: 09094430-abdb-4248-812c-47b7babae06c
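
(For anyone who wants to capture the same thing, it is roughly as follows -
osd.792 being just the example from the log excerpts below, and assuming
logs live under the default /var/log/ceph:)

ceph tell osd.792 injectargs '--debug_osd 10 --debug_bluestore 20'
# ...wait for / reproduce the _delete_some message, then revert...
ceph tell osd.792 injectargs '--debug_osd 1/5 --debug_bluestore 1/5'
ceph-post-file /var/log/ceph/ceph-osd.792.log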

Hope that helps,

Dan

On Thu, Apr 15, 2021 at 1:27 AM Neha Ojha  wrote:
>
> We saw this warning once in testing
> (https://tracker.ceph.com/issues/49900#note-1), but there, the problem
> was different, which also led to a crash. That issue has been fixed
> but if you can provide osd logs with verbose logging, we might be able
> to investigate further.
>
> Neha
>
> On Wed, Apr 14, 2021 at 4:14 PM Igor Fedotov  wrote:
> >
> > Hi Dan,
> >
> > Seen that once before and haven't thoroughly investigated yet but I
> > think the new PG removal stuff just revealed this "issue". In fact it
> > had been in the code before the patch.
> >
> > The warning means that new object(s) (given the object names these are
> > apparently system objects; I don't remember exactly what they are) have been
> > written to a PG after it was staged for removal.
> >
> > New PG removal properly handles that case - it was just a paranoid
> > check for an unexpected situation which has actually triggered. Hence,
> > IMO, no need to worry at this point, but developers might want to validate
> > why this is happening.
> >
> >
> > Thanks,
> >
> > Igor
> >
> > On 4/14/2021 10:26 PM, Dan van der Ster wrote:
> > > Hi Igor,
> > >
> > > After updating to 14.2.19 and then moving some PGs around we have a
> > > few warnings related to the new efficient PG removal code, e.g. [1].
> > > Is that something to worry about?
> > >
> > > Best Regards,
> > >
> > > Dan
> > >
> > > [1]
> > >
> > > /var/log/ceph/ceph-osd.792.log:2021-04-14 20:34:34.353 7fb2439d4700  0
> > > osd.792 pg_epoch: 40906 pg[10.14b2s0( v 40734'290069
> > > (33782'287000,40734'290069] lb MIN (bitwise) local-lis/les=33990/33991
> > > n=36272 ec=4951/4937 lis/c 33990/33716 les/c/f 33991/33747/0
> > > 40813/40813/37166) [933,626,260,804,503,491]p933(0) r=-1 lpr=40813
> > > DELETING pi=[33716,40813)/4 crt=40734'290069 unknown NOTIFY mbc={}]
> > > _delete_some additional unexpected onode list (new onodes has appeared
> > > since PG removal started[0#10:4d28head#]
> > >
> > > /var/log/ceph/ceph-osd.851.log:2021-04-14 18:40:13.312 7fd87bded700  0
> > > osd.851 pg_epoch: 40671 pg[10.133fs5( v 40662'288967
> > > (33782'285900,40662'288967] lb MIN (bitwise) local-lis/les=33786/33787
> > > n=13 ec=4947/4937 lis/c 40498/33714 les/c/f 40499/33747/0
> > > 40670/40670/33432) [859,199,913,329,439,79]p859(0) r=-1 lpr=40670
> > > DELETING pi=[33714,40670)/4 crt=40662'288967 unknown NOTIFY mbc={}]
> > > _delete_some additional unexpected onode list (new onodes has appeared
> > > since PG removal started[5#10:fcc8head#]
> > >
> > > /var/log/ceph/ceph-osd.851.log:2021-04-14 20:58:14.393 7fd87adeb700  0
> > > osd.851 pg_epoch: 40906 pg[10.2e8s3( v 40610'288991
> > > (33782'285900,40610'288991] lb MIN (bitwise) local-lis/les=33786/33787
> > > n=161220 ec=4937/4937 lis/c 39826/33716 les/c/f 39827/33747/0
> > > 40617/40617/39225) [717,933,727,792,607,129]p717(0) r=-1 lpr=40617
> > > DELETING pi=[33716,40617)/3 crt=40610'288991 unknown NOTIFY mbc={}]
> > > _delete_some additional unexpected onode list (new onodes has appeared
> > > since PG removal started[3#10:1740head#]
> > >
> > > /var/log/ceph/ceph-osd.883.log:2021-04-14 18:55:16.822 7f78c485d700  0
> > > osd.883 pg_epoch: 40857 pg[7.d4( v 40804'9911289
> > > (35835'9908201,40804'9911289] lb MIN (bitwise)
> > > local-lis/les=40782/40783 n=195 ec=2063/1989 lis/c 40782/40782 les/c/f
> > > 40783/40844/0 40781/40845/40845) [877,870,894] r=-1 lpr=40845 DELETING
> > > pi=[40782,40845)/1 crt=40804'9911289 lcod 40804'9911288 unknown NOTIFY
> > > mbc={}] _delete_some additional unexpected onode list (new onodes has
> > > appeared since PG removal started[#7:2b00head#]
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io