[ceph-users] Re: CEPH orch made osd without WAL

2023-07-10 Thread Eugen Block
Yes, because you did *not* specify a dedicated WAL device. This is  
also reflected in the OSD metadata:


$ ceph osd metadata 6 | grep dedicated
"bluefs_dedicated_db": "1",
"bluefs_dedicated_wal": "0"

Only if you had specified a dedicated WAL device would you see it in
the lvm list output, so this is all as expected.
You can check out the perf dump of an OSD to see that it actually  
writes to the WAL:


# ceph daemon osd.6 perf dump bluefs | grep wal
"wal_total_bytes": 0,
"wal_used_bytes": 0,
"files_written_wal": 1588,
"bytes_written_wal": 1090677563392,
"max_bytes_wal": 0,


Quoting Jan Marek :


Hello,

but when I try to list the device config with ceph-volume, I can see
a DB device, but no WAL device:

ceph-volume lvm list

== osd.8 ===

  [db]          /dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9

      block device              /dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970
      block uuid                j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
      cephx lockbox secret
      cluster fsid              2c565e24-7850-47dc-a751-a6357cbbaf2a
      cluster name              ceph
      crush device class
      db device                 /dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9
      db uuid                   d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
      encrypted                 0
      osd fsid                  26b1d4b7-2425-4a2f-912b-111cf66a5970
      osd id                    8
      osdspec affinity          osd_spec_default
      type                      db
      vdo                       0
      devices                   /dev/nvme0n1

  [block]       /dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970

      block device              /dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970
      block uuid                j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
      cephx lockbox secret
      cluster fsid              2c565e24-7850-47dc-a751-a6357cbbaf2a
      cluster name              ceph
      crush device class
      db device                 /dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9
      db uuid                   d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
      encrypted                 0
      osd fsid                  26b1d4b7-2425-4a2f-912b-111cf66a5970
      osd id                    8
      osdspec affinity          osd_spec_default
      type                      block
      vdo                       0
      devices                   /dev/sdi

(part of listing...)

Sincerely
Jan Marek


On Mon, Jul 10, 2023 at 08:10:58 CEST, Eugen Block wrote:

Hi,

if you don't specify a different device for WAL it will be automatically
colocated on the same device as the DB. So you're good with this
configuration.

Regards,
Eugen


Quoting Jan Marek :

> Hello,
>
> I've tried to add an OSD node with 12 rotational disks and 1 NVMe
> to the CEPH cluster. My YAML was this:
>
> service_type: osd
> service_id: osd_spec_default
> service_name: osd.osd_spec_default
> placement:
>   host_pattern: osd8
> spec:
>   block_db_size: 64G
>   data_devices:
> rotational: 1
>   db_devices:
> paths:
> - /dev/nvme0n1
>   filter_logic: AND
>   objectstore: bluestore
>
> Now I have 12 OSDs with the DB on the NVMe device, but without a WAL.
> How can I add a WAL to these OSDs?
>
> The NVMe device still has 128 GB of free space.
>
> Thanks a lot.
>
> Sincerely
> Jan Marek
> --
> Ing. Jan Marek
> University of South Bohemia
> Academic Computer Centre
> Phone: +420389032080
> http://www.gnu.org/philosophy/no-word-attachments.cs.html


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Ing. Jan Marek
University of South Bohemia
Academic Computer Centre
Phone: +420389032080
http://www.gnu.org/philosophy/no-word-attachments.cs.html



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CEPH orch made osd without WAL

2023-07-10 Thread Joachim Kraftmayer - ceph ambassador
you can also verify directly with ceph bench whether the WAL is on the
flash device:


https://www.clyso.com/blog/verify-ceph-osd-db-and-wal-setup/
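
The idea is roughly the following (a sketch only, using osd.6 as a
stand-in and assuming access to that OSD's admin socket):

# note the WAL write counter before the test
ceph daemon osd.6 perf dump bluefs | grep bytes_written_wal
# push ~1 GiB of 4 KiB writes through the OSD
ceph tell osd.6 bench 1073741824 4096
# the counter should have grown noticeably if the WAL is in use
ceph daemon osd.6 perf dump bluefs | grep bytes_written_wal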

Joachim


___
ceph ambassador DACH
ceph consultant since 2012

Clyso GmbH - Premier Ceph Foundation Member

https://www.clyso.com/

On 10.07.23 at 09:12, Eugen Block wrote:
Yes, because you did *not* specify a dedicated WAL device. This is 
also reflected in the OSD metadata:


$ ceph osd metadata 6 | grep dedicated
    "bluefs_dedicated_db": "1",
    "bluefs_dedicated_wal": "0"

Only if you had specified a dedicated WAL device you would see it in 
the lvm list output, so this is all as expected.
You can check out the perf dump of an OSD to see that it actually 
writes to the WAL:


# ceph daemon osd.6 perf dump bluefs | grep wal
    "wal_total_bytes": 0,
    "wal_used_bytes": 0,
    "files_written_wal": 1588,
    "bytes_written_wal": 1090677563392,
    "max_bytes_wal": 0,


Quoting Jan Marek :


Hello,

but when I try to list devices config with ceph-volume, I can see
a DB devices, but no WAL devices:

ceph-volume lvm list

== osd.8 ===

  [db] 
/dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9


  block device 
/dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970

  block uuid j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
  cephx lockbox secret
  cluster fsid 2c565e24-7850-47dc-a751-a6357cbbaf2a
  cluster name  ceph
  crush device class
  db device 
/dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9

  db uuid d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
  encrypted 0
  osd fsid 26b1d4b7-2425-4a2f-912b-111cf66a5970
  osd id    8
  osdspec affinity  osd_spec_default
  type  db
  vdo   0
  devices   /dev/nvme0n1

  [block] 
/dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970


  block device 
/dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970

  block uuid j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
  cephx lockbox secret
  cluster fsid 2c565e24-7850-47dc-a751-a6357cbbaf2a
  cluster name  ceph
  crush device class
  db device 
/dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9

  db uuid d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
  encrypted 0
  osd fsid 26b1d4b7-2425-4a2f-912b-111cf66a5970
  osd id    8
  osdspec affinity  osd_spec_default
  type  block
  vdo   0
  devices   /dev/sdi

(part of listing...)

Sincerely
Jan Marek


On Mon, Jul 10, 2023 at 08:10:58 CEST, Eugen Block wrote:

Hi,

if you don't specify a different device for WAL it will be 
automatically

colocated on the same device as the DB. So you're good with this
configuration.

Regards,
Eugen


Quoting Jan Marek :

> Hello,
>
> I've tried to add to CEPH cluster OSD node with a 12 rotational
> disks and 1 NVMe. My YAML was this:
>
> service_type: osd
> service_id: osd_spec_default
> service_name: osd.osd_spec_default
> placement:
>   host_pattern: osd8
> spec:
>   block_db_size: 64G
>   data_devices:
> rotational: 1
>   db_devices:
> paths:
> - /dev/nvme0n1
>   filter_logic: AND
>   objectstore: bluestore
>
> Now I have 12 OSD with DB on NVMe device, but without WAL. How I
> can add WAL to this OSD?
>
> NVMe device still have 128GB free place.
>
> Thanks a lot.
>
> Sincerely
> Jan Marek
> --
> Ing. Jan Marek
> University of South Bohemia
> Academic Computer Centre
> Phone: +420389032080
> http://www.gnu.org/philosophy/no-word-attachments.cs.html


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Ing. Jan Marek
University of South Bohemia
Academic Computer Centre
Phone: +420389032080
http://www.gnu.org/philosophy/no-word-attachments.cs.html



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Are replicas 4 or 6 safe during network partition? Will there be split-brain?

2023-07-10 Thread Robert Sander

Hi,

On 07.07.23 16:52, jcic...@cloudflare.com wrote:


There are two sites, A and B. There are 5 mons, 2 in A, 3 in B. Looking at just 
one PG and 4 replicas, we have 2 replicas in site A and 2 replicas in site B. 
Site A holds the primary OSD for this PG. When a network split happens, I/O 
would still be working in site A since there are still 2 OSDs, even without mon 
quorum.


The site without MON quorum will stop working completely.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CEPH orch made osd without WAL

2023-07-10 Thread Jan Marek
Hello Eugen,

I've tried to specify a dedicated WAL device, but I have only
/dev/nvme0n1, so I cannot write a correct YAML file...

On Mon, Jul 10, 2023 at 09:12:29 CEST, Eugen Block wrote:
> Yes, because you did *not* specify a dedicated WAL device. This is also
> reflected in the OSD metadata:
> 
> $ ceph osd metadata 6 | grep dedicated
> "bluefs_dedicated_db": "1",
> "bluefs_dedicated_wal": "0"

Yes, it is exactly as you wrote.

> 
> Only if you had specified a dedicated WAL device you would see it in the lvm
> list output, so this is all as expected.
> You can check out the perf dump of an OSD to see that it actually writes to
> the WAL:
> 
> # ceph daemon osd.6 perf dump bluefs | grep wal
> "wal_total_bytes": 0,
> "wal_used_bytes": 0,
> "files_written_wal": 1588,
> "bytes_written_wal": 1090677563392,
> "max_bytes_wal": 0,

Here I've hit a problem:

# ceph daemon osd.8 perf dump bluefs
Can't get admin socket path: unable to get conf option admin_socket for osd: 
b"error parsing 'osd': expected string of the form TYPE.ID, valid types are: 
auth, mon, osd, mds, mgr, client\n"

I'm on the host on which this OSD 8 is running.

My CEPH version is the latest (I hope) Quincy: 17.2.6.

Thanks a lot for help.

Sincerely
Jan Marek

> 
> 
> Quoting Jan Marek :
> 
> > Hello,
> > 
> > but when I try to list devices config with ceph-volume, I can see
> > a DB devices, but no WAL devices:
> > 
> > ceph-volume lvm list
> > 
> > == osd.8 ===
> > 
> >   [db]  
> > /dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9
> > 
> >   block device  
> > /dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970
> >   block uuid                j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
> >   cephx lockbox secret
> >   cluster fsid  2c565e24-7850-47dc-a751-a6357cbbaf2a
> >   cluster name  ceph
> >   crush device class
> >   db device 
> > /dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9
> >   db uuid   d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
> >   encrypted 0
> >   osd fsid  26b1d4b7-2425-4a2f-912b-111cf66a5970
> >   osd id                    8
> >   osdspec affinity  osd_spec_default
> >   type  db
> >   vdo   0
> >   devices   /dev/nvme0n1
> > 
> >   [block]   
> > /dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970
> > 
> >   block device  
> > /dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970
> >   block uuid                j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
> >   cephx lockbox secret
> >   cluster fsid  2c565e24-7850-47dc-a751-a6357cbbaf2a
> >   cluster name  ceph
> >   crush device class
> >   db device 
> > /dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9
> >   db uuid   d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
> >   encrypted 0
> >   osd fsid  26b1d4b7-2425-4a2f-912b-111cf66a5970
> >   osd id                    8
> >   osdspec affinity  osd_spec_default
> >   type  block
> >   vdo   0
> >   devices   /dev/sdi
> > 
> > (part of listing...)
> > 
> > Sincerely
> > Jan Marek
> > 
> > 
> > On Mon, Jul 10, 2023 at 08:10:58 CEST, Eugen Block wrote:
> > > Hi,
> > > 
> > > if you don't specify a different device for WAL it will be automatically
> > > colocated on the same device as the DB. So you're good with this
> > > configuration.
> > > 
> > > Regards,
> > > Eugen
> > > 
> > > 
> > > Quoting Jan Marek :
> > > 
> > > > Hello,
> > > >
> > > > I've tried to add to CEPH cluster OSD node with a 12 rotational
> > > > disks and 1 NVMe. My YAML was this:
> > > >
> > > > service_type: osd
> > > > service_id: osd_spec_default
> > > > service_name: osd.osd_spec_default
> > > > placement:
> > > >   host_pattern: osd8
> > > > spec:
> > > >   block_db_size: 64G
> > > >   data_devices:
> > > > rotational: 1
> > > >   db_devices:
> > > > paths:
> > > > - /dev/nvme0n1
> > > >   filter_logic: AND
> > > >   objectstore: bluestore
> > > >
> > > > Now I have 12 OSD with DB on NVMe device, but without WAL. How I
> > > > can add WAL to this OSD?
> > > >
> > > > NVMe device still have 128GB free place.
> > > >
> > > > Thanks a lot.
> > > >
> > > > Sincerely
> > > > Jan Marek
> > > > --
> > > > Ing. Jan Marek
> > > > University of South Bohemia
> > > > Academic Computer Centre
> > > > Phone: +420389032080
> > > > http://www.gnu.org/philosophy/no-word-attachments.cs.html
> > > 
> > > 
> > > __

[ceph-users] Re: CEPH orch made osd without WAL

2023-07-10 Thread Eugen Block
It's fine, you don't need to worry about the WAL device; it is
automatically created on the NVMe if the DB is there. Having a
dedicated WAL device would only make sense if, for example, your data
devices are on HDD, your RocksDB is on "regular" SSDs and you also have
NVMe devices. But since you already use the NVMe for the DB, you don't
need to specify a WAL device.
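
For completeness, a spec with a dedicated WAL device would look roughly
like this (a sketch only; /dev/nvme1n1 is a hypothetical second NVMe
that does not exist in your setup):

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: osd8
spec:
  block_db_size: 64G
  data_devices:
    rotational: 1
  db_devices:
    paths:
    - /dev/nvme0n1
  wal_devices:
    paths:
    - /dev/nvme1n1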



Here is some problem:

# ceph daemon osd.8 perf dump bluefs
Can't get admin socket path: unable to get conf option admin_socket  
for osd: b"error parsing 'osd': expected string of the form TYPE.ID,  
valid types are: auth, mon, osd, mds, mgr, client\n"


I'm on the host, on which is this OSD 8.


I should have mentioned that you need to enter the container first:

cephadm enter --name osd.8

and then

ceph daemon osd.8 perf dump bluefs
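
It should also work as a single command from the host, something like
(untested sketch, assuming a cephadm deployment):

cephadm enter --name osd.8 -- ceph daemon osd.8 perf dump bluefs | grep wal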

Quoting Jan Marek :


Hello Eugen,

I've tried to specify dedicated WAL device, but I have only
/dev/nvme0n1 , so I cannot write a correct YAML file...

On Mon, Jul 10, 2023 at 09:12:29 CEST, Eugen Block wrote:

Yes, because you did *not* specify a dedicated WAL device. This is also
reflected in the OSD metadata:

$ ceph osd metadata 6 | grep dedicated
"bluefs_dedicated_db": "1",
"bluefs_dedicated_wal": "0"


Yes, it is exactly, as you wrote.



Only if you had specified a dedicated WAL device you would see it in the lvm
list output, so this is all as expected.
You can check out the perf dump of an OSD to see that it actually writes to
the WAL:

# ceph daemon osd.6 perf dump bluefs | grep wal
"wal_total_bytes": 0,
"wal_used_bytes": 0,
"files_written_wal": 1588,
"bytes_written_wal": 1090677563392,
"max_bytes_wal": 0,


Here is some problem:

# ceph daemon osd.8 perf dump bluefs
Can't get admin socket path: unable to get conf option admin_socket  
for osd: b"error parsing 'osd': expected string of the form TYPE.ID,  
valid types are: auth, mon, osd, mds, mgr, client\n"


I'm on the host, on which is this OSD 8.

My CEPH version is latest (I hope) quincy: 17.2.6.

Thanks a lot for help.

Sincerely
Jan Marek




Quoting Jan Marek :

> Hello,
>
> but when I try to list devices config with ceph-volume, I can see
> a DB devices, but no WAL devices:
>
> ceph-volume lvm list
>
> == osd.8 ===
>
>   [db]   
/dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9

>
>   block device   
/dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970

>   block uuid                j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
>   cephx lockbox secret
>   cluster fsid  2c565e24-7850-47dc-a751-a6357cbbaf2a
>   cluster name  ceph
>   crush device class
>   db device  
/dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9

>   db uuid   d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
>   encrypted 0
>   osd fsid  26b1d4b7-2425-4a2f-912b-111cf66a5970
>   osd id                    8
>   osdspec affinity  osd_spec_default
>   type  db
>   vdo   0
>   devices   /dev/nvme0n1
>
>   [block]
/dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970

>
>   block device   
/dev/ceph-eaf5f0d7-ad50-4009-9ee6-04b8204b5b1a/osd-block-26b1d4b7-2425-4a2f-912b-111cf66a5970

>   block uuid                j4s9lv-wS9n-xg2W-I4Y0-fUSu-Vuvl-9gOB2P
>   cephx lockbox secret
>   cluster fsid  2c565e24-7850-47dc-a751-a6357cbbaf2a
>   cluster name  ceph
>   crush device class
>   db device  
/dev/ceph-5aa92e38-077b-48e2-bda6-5b7db7b7701c/osd-db-bfd11468-d109-4f85-9723-75976f51bfb9

>   db uuid   d9MZ2r-ImXX-Xod0-TNDS-tqi5-oG5Y-wrXFtW
>   encrypted 0
>   osd fsid  26b1d4b7-2425-4a2f-912b-111cf66a5970
>   osd id                    8
>   osdspec affinity  osd_spec_default
>   type  block
>   vdo   0
>   devices   /dev/sdi
>
> (part of listing...)
>
> Sincerely
> Jan Marek
>
>
> On Mon, Jul 10, 2023 at 08:10:58 CEST, Eugen Block wrote:
> > Hi,
> >
> > if you don't specify a different device for WAL it will be automatically
> > colocated on the same device as the DB. So you're good with this
> > configuration.
> >
> > Regards,
> > Eugen
> >
> >
> > Quoting Jan Marek :
> >
> > > Hello,
> > >
> > > I've tried to add to CEPH cluster OSD node with a 12 rotational
> > > disks and 1 NVMe. My YAML was this:
> > >
> > > service_type: osd
> > > service_id: osd_spec_default
> > > service_name: osd.osd_spec_default
> > > placement:
> > >   host_pattern: osd8
> > > spec:
> > >   block_db_size: 64G
> > >   data_devices:
> > > rotati

[ceph-users] Re: CEPH orch made osd without WAL

2023-07-10 Thread Jan Marek
Hello Eugen,

On Mon, Jul 10, 2023 at 10:02:58 CEST, Eugen Block wrote:
> It's fine, you don't need to worry about the WAL device, it is automatically
> created on the nvme if the DB is there. Having a dedicated WAL device would
> only make sense if for example your data devices are on HDD, your rocksDB on
> "regular" SSDs and you also have nvme devices. But since you already use
> nvme for DB you don't need to specify a WAL device.

OK :-)

> 
> > Here is some problem:
> > 
> > # ceph daemon osd.8 perf dump bluefs
> > Can't get admin socket path: unable to get conf option admin_socket for
> > osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid
> > types are: auth, mon, osd, mds, mgr, client\n"
> > 
> > I'm on the host, on which is this OSD 8.
> 
> I should have mentioned that you need to enter into the container first
> 
> cephadm enter --name osd.8
> 
> and then
> 
> ceph daemon osd.8 perf dump bluefs

Yes, that was the problem:

 ceph daemon osd.8 perf dump bluefs | grep wal
"wal_total_bytes": 0,
"wal_used_bytes": 0,
"files_written_wal": 535,
"bytes_written_wal": 121443819520,
"max_bytes_wal": 0,
"alloc_unit_wal": 0,
"read_random_disk_bytes_wal": 0,
"read_disk_bytes_wal": 0,

So I can now see that it uses the WAL.

Once again, thanks a lot.

Sincerely
Jan Marek
-- 
Ing. Jan Marek
University of South Bohemia
Academic Computer Centre
Phone: +420389032080
http://www.gnu.org/philosophy/no-word-attachments.cs.html


signature.asc
Description: PGP signature
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MON sync time depends on outage duration

2023-07-10 Thread Eugen Block

Hi,
I got a customer response with payload size 4096; that made things
even worse. The mon startup time was now around 40 minutes. My doubts
wrt decreasing the payload size seem confirmed. Then I read Dan's
response again, which also mentions that the default payload size could
be too small. So I asked them to double the default (2M instead of 1M)
and am now waiting for a new result. I'm still wondering why this only
happens when the mon is down for more than 5 minutes. Does anyone have
an explanation for that time factor?
Another thing they're going to do is to remove lots of snapshot
tombstones (rbd mirroring snapshots in the trash namespace); maybe
that will reduce the number of osd_snap keys in the mon db, which
should in turn reduce the startup time. We'll see...
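
For reference, these are the knobs being tuned here (a sketch; 2097152
is simply the doubled default mentioned above):

ceph config set mon mon_sync_max_payload_size 2097152   # bytes, default 1048576
ceph config set mon mon_sync_max_payload_keys 2000      # default; raise if keys dominate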


Quoting Eugen Block :


Thanks, Dan!


Yes that sounds familiar from the luminous and mimic days.
The workaround for zillions of snapshot keys at that time was to use:
  ceph config set mon mon_sync_max_payload_size 4096


I actually did search for mon_sync_max_payload_keys, not bytes so I  
missed your thread, it seems. Thanks for pointing that out. So the  
defaults seem to be these in Octopus:


"mon_sync_max_payload_keys": "2000",
"mon_sync_max_payload_size": "1048576",


So it could be in your case that the sync payload is just too small to
efficiently move 42 million osd_snap keys? Using debug_paxos and debug_mon
you should be able to understand what is taking so long, and tune
mon_sync_max_payload_size and mon_sync_max_payload_keys accordingly.


I'm confused, if the payload size is too small, why would decreasing  
it help? Or am I misunderstanding something? But it probably won't  
hurt to try it with 4096 and see if anything changes. If not we can  
still turn on debug logs and take a closer look.


And in addition to Dan's suggestion, HDD is not a good choice for
RocksDB, which is most likely the reason for this thread; I think
that by the 3rd time the database just goes into compaction
maintenance


Believe me, I know... but there's not much they can currently do  
about it, quite a long story... But I have been telling them that  
for months now. Anyway, I will make some suggestions and report back  
if it worked in this case as well.


Thanks!
Eugen

Quoting Dan van der Ster :


Hi Eugen!

Yes that sounds familiar from the luminous and mimic days.

Check this old thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F3W2HXMYNF52E7LPIQEJFUTAD3I7QE25/
(that thread is truncated but I can tell you that it worked for Frank).
Also the even older referenced thread:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/M5ZKF7PTEO2OGDDY5L74EV4QS5SDCZTH/

The workaround for zillions of snapshot keys at that time was to use:
  ceph config set mon mon_sync_max_payload_size 4096

That said, that sync issue was supposed to be fixed by way of adding the
new option mon_sync_max_payload_keys, which has been around since nautilus.

So it could be in your case that the sync payload is just too small to
efficiently move 42 million osd_snap keys? Using debug_paxos and debug_mon
you should be able to understand what is taking so long, and tune
mon_sync_max_payload_size and mon_sync_max_payload_keys accordingly.

Good luck!

Dan

__
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com



On Thu, Jul 6, 2023 at 1:47 PM Eugen Block  wrote:


Hi *,

I'm investigating an interesting issue on two customer clusters (used
for mirroring) I've not solved yet, but today we finally made some
progress. Maybe someone has an idea where to look next, I'd appreciate
any hints or comments.
These are two (latest) Octopus clusters, main usage currently is RBD
mirroring with snapshot mode (around 500 RBD images are synced every
30 minutes). They noticed very long startup times of MON daemons after
reboot, times between 10 and 30 minutes (reboot time already
subtracted). These delays are present on both sites. Today we got a
maintenance window and started to check in more detail by just
restarting the MON service (joins quorum within seconds), then
stopping the MON service and wait a few minutes (still joins quorum
within seconds). And then we stopped the service and waited for more
than 5 minutes, simulating a reboot, and then we were able to
reproduce it. The sync then takes around 15 minutes, we verified with
other MONs as well. The MON store is around 2 GB of size (on HDD), I
understand that the sync itself can take some time, but what is the
threshold here? I tried to find a hint in the MON config, searching
for timeouts with 300 seconds, there were only a few matches
(mon_session_timeout is one of them), but I'm not sure if they can
explain this behavior.
Investigating the MON store (ceph-monstore-tool dump-keys) I noticed
that there were more than 42 Million osd_snap keys, which is quite a
lot and would explain the size of the MON store. But I

[ceph-users] mon log file grows huge

2023-07-10 Thread Ben
Hi,

In our cluster, the monitors' logs grow to a couple of GBs within days.
There are quite a lot of debug messages from rocksdb, osd, mgr and mds.
These should not be necessary in a well-run cluster. How can I turn this
logging off?

Thanks,
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph quota qustion

2023-07-10 Thread sejun21 . kim
Hi, 

yes, this is the incomplete multipart problem.

Then, how does an admin delete the incomplete multipart objects?
I mean:
1. Can an admin find incomplete jobs and incomplete multipart objects?
2. If the first is possible, can an admin delete all of the jobs or
objects at once?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph quota qustion

2023-07-10 Thread Casey Bodley
On Mon, Jul 10, 2023 at 10:40 AM  wrote:
>
> Hi,
>
> yes, this is incomplete multiparts problem.
>
> Then, how do admin delete the incomplete multipart object?
> I mean
> 1. can admin find incomplete job and incomplete multipart object?
> 2. If first question is possible, then can admin delete all the job or object 
> at once?

you don't need to be an admin to do these things, they're part of the
S3 API. this cleanup can be automated using a bucket lifecycle policy
[1]. for manual cleanup with aws cli, for example, you can use 'aws
s3api list-multipart-uploads' [2] to discover all of the incomplete
uploads in a given bucket, and 'aws s3api abort-multipart-upload' [3]
to abort each of them. in addition to specifying the
AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY environment variables, you'll
also need to point --endpoint-url at a running radosgw

[1] 
https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpu-abort-incomplete-mpu-lifecycle-config.html
[2] 
https://docs.aws.amazon.com/cli/latest/reference/s3api/list-multipart-uploads.html
[3] 
https://docs.aws.amazon.com/cli/latest/reference/s3api/abort-multipart-upload.html
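
for example, a rough cleanup loop along those lines (bucket name,
endpoint and credentials below are placeholders, and the loop is not
robust against empty results or keys containing whitespace):

export AWS_ACCESS_KEY_ID=<access key>
export AWS_SECRET_ACCESS_KEY=<secret key>
bucket=mybucket
endpoint=http://rgw.example.com:8080
# list all incomplete uploads as "<key> <upload id>" pairs and abort each one
aws --endpoint-url "$endpoint" s3api list-multipart-uploads --bucket "$bucket" \
    --query 'Uploads[].[Key,UploadId]' --output text |
while read -r key upload_id; do
    aws --endpoint-url "$endpoint" s3api abort-multipart-upload \
        --bucket "$bucket" --key "$key" --upload-id "$upload_id"
done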

> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon log file grows huge

2023-07-10 Thread Wesley Dillingham
At what level do you have logging set for your mons? That is a high
volume of logs for the mons to generate.

You can ask all the mons to print their debug logging level with:

"ceph tell mon.* config get debug_mon"

The default is 1/5

What is the overall status of your cluster? Is it healthy?

"ceph status"

Consider implementing more aggressive log rotation.

This link may prove useful:
https://docs.ceph.com/en/latest/rados/troubleshooting/log-and-debug/
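
If specific subsystems turn out to be the noisy ones, their levels can
be lowered at runtime, e.g. (illustrative values only):

ceph config set mon debug_mon 1/5        # back to the default if it was raised
ceph config set mon debug_rocksdb 1/5    # quieter than the 4/5 default
ceph config set mon debug_paxos 0/5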



Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Mon, Jul 10, 2023 at 9:44 AM Ben  wrote:

> Hi,
>
> In our cluster monitors' log grows to couple GBs in days. There are quite
> many debug message from rocksdb, osd, mgr and mds. These should not be
> necessary with a well-run cluster. How could I close these logging?
>
> Thanks,
> Ben
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Reef release candidate - v18.1.2

2023-07-10 Thread Stefan Kooman

On 6/30/23 18:36, Yuri Weinstein wrote:


This RC has gone thru partial testing due to issues we are
experiencing in the sepia lab.
Please try it out and report any issues you encounter. Happy testing!


I tested the RC (v18.1.2) this afternoon. I tried out the new "read 
balancer". I hit asserts after applying the "ceph osd pg-upmap-primary 
$pd-id" commands on all affected OSDs. I posted (or tried at least) a 
ceph crash report: 
"2023-07-10T14:40:33.087472Z_c279b11c-7a69-488f-bd19-ed11cc4b0553"


"assert_condition": "pg_upmap_primaries.empty()",
"assert_file": 
"/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4795-g2f6e4f7d/rpm/el8/BUILD/ceph-18.0.0-4795-g2f6e4f7d/src/osd/OSDMap.cc",
"assert_func": "void OSDMap::encode(ceph::buffer::v15_2_0::list&, 
uint64_t) const",

"assert_line": 3251,
"assert_msg": 
"/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4795-g2f6e4f7d/rpm/el8/BUILD/ceph-18.0.0-4795-g2f6e4f7d/src/osd/OSDMap.cc: 
In function 'void OSDMap::encode(ceph::buffer::v15_2_0::list&, uint64_t) 
const' thread 7f9c4ca18700 time 
2023-07-10T14:40:33.074045+\n/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4795-g2f6e4f7d/rpm/el8/BUILD/ceph-18.0.0-4795-g2f6e4f7d/src/osd/OSDMap.cc: 
3251: FAILED ceph_assert(pg_upmap_primaries.empty())\n",


Shall I create a tracker issue for this?

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Reef release candidate - v18.1.2

2023-07-10 Thread Laura Flores
Hi Stefan,

Yes, please create a tracker. I will take a look at the issue,

Thanks,
Laura Flores

On Mon, Jul 10, 2023 at 10:50 AM Stefan Kooman  wrote:

> On 6/30/23 18:36, Yuri Weinstein wrote:
>
> > This RC has gone thru partial testing due to issues we are
> > experiencing in the sepia lab.
> > Please try it out and report any issues you encounter. Happy testing!
>
> I tested the RC (v18.1.2) this afternoon. I tried out the new "read
> balancer". I hit asserts after applying the "ceph osd pg-upmap-primary
> $pd-id" commands on all affected OSDs. I posted (or tried at least) a
> ceph crash report:
> "2023-07-10T14:40:33.087472Z_c279b11c-7a69-488f-bd19-ed11cc4b0553"
>
>  "assert_condition": "pg_upmap_primaries.empty()",
>  "assert_file":
>
> "/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4795-g2f6e4f7d/rpm/el8/BUILD/ceph-18.0.0-4795-g2f6e4f7d/src/osd/OSDMap.cc",
>  "assert_func": "void OSDMap::encode(ceph::buffer::v15_2_0::list&,
> uint64_t) const",
>  "assert_line": 3251,
>  "assert_msg":
> "/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4795-g2f6e4f7d/rpm/el8/BUILD/ceph-18.0.0-4795-g2f6e4f7d/src/osd/OSDMap.cc:
>
> In function 'void OSDMap::encode(ceph::buffer::v15_2_0::list&, uint64_t)
> const' thread 7f9c4ca18700 time
> 2023-07-10T14:40:33.074045+\n/home/jenkins-build/build/workspace/ceph-dev-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4795-g2f6e4f7d/rpm/el8/BUILD/ceph-18.0.0-4795-g2f6e4f7d/src/osd/OSDMap.cc:
>
> 3251: FAILED ceph_assert(pg_upmap_primaries.empty())\n",
>
> Shall I create a tracker issue for this?
>
> Gr. Stefan
>
>

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-10 Thread Matthew Booth
On Thu, 6 Jul 2023 at 12:54, Mark Nelson  wrote:
>
>
> On 7/6/23 06:02, Matthew Booth wrote:
> > On Wed, 5 Jul 2023 at 15:18, Mark Nelson  wrote:
> >> I'm sort of amazed that it gave you symbols without the debuginfo
> >> packages installed.  I'll need to figure out a way to prevent that.
> >> Having said that, your new traces look more accurate to me.  The thing
> >> that sticks out to me is the (slight?) amount of contention on the PWL
> >> m_lock in dispatch_deferred_writes, update_root_scheduled_ops,
> >> append_ops, append_sync_point(), etc.
> >>
> >> I don't know if the contention around the m_lock is enough to cause an
> >> increase in 99% tail latency from 1.4ms to 5.2ms, but it's the first
> >> thing that jumps out at me.  There appears to be a large number of
> >> threads (each tp_pwl thread, the io_context_pool threads, the qemu
> >> thread, and the bstore_aio thread) that all appear to have potential to
> >> contend on that lock.  You could try dropping the number of tp_pwl
> >> threads from 4 to 1 and see if that changes anything.
> > Will do. Any idea how to do that? I don't see an obvious rbd config option.
> >
> > Thanks for looking into this,
> > Matt
>
> you thanked me too soon...it appears to be hard-coded in, so you'll have
> to do a custom build. :D
>
> https://github.com/ceph/ceph/blob/main/src/librbd/cache/pwl/AbstractWriteLog.cc#L55-L56

Just to update: I have managed to test this today and it made no difference :(

In general, though, unless it's something egregious, are we really
looking for something CPU-bound? Writes are 2 orders of magnitude
slower than the underlying local disk. This has to be caused by
something wildly inefficient.

I have had a thought: the guest filesystem has 512 byte blocks, but
the pwl filesystem has 4k blocks (on a 4k disk). Given that the test
is of small writes, is there any chance that we're multiplying the
number of physical writes in some pathological manner?
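
One way to check the sizes in play, for what it's worth (device names
are placeholders for the PWL cache device and the guest filesystem):

# logical and physical sector sizes of the cache device
blockdev --getss --getpbsz /dev/nvme0n1
# block size of the guest filesystem (ext4 shown; xfs_info shows the equivalent for XFS)
tune2fs -l /dev/vda1 | grep 'Block size'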

Matt
-- 
Matthew Booth
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Planning cluster

2023-07-10 Thread Dan van der Ster
Hi Jan,

On Sun, Jul 9, 2023 at 11:17 PM Jan Marek  wrote:

> Hello,
>
> I have a cluster, which have this configuration:
>
> osd pool default size = 3
> osd pool default min size = 1
>

Don't use min_size = 1 during regular stable operations. Instead, use
min_size = 2 to ensure data safety, and then you can set the pool to
min_size = 1 manually in the case of an emergency. (E.g. in case the 2
copies fail and will not be recoverable).
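
Concretely, something like this (pool name is a placeholder):

ceph osd pool set mypool size 3
ceph osd pool set mypool min_size 2
# only in a real emergency, to keep IO going with a single surviving copy:
ceph osd pool set mypool min_size 1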


> I have 5 monitor nodes and 7 OSD nodes.
>

3 monitors is probably enough. Put 2 in the same DC with 2 replicas, and
the other in the DC with 1 replica.


> I have changed a crush map to divide ceph cluster to two
> datacenters - in the first one will be a part of cluster with 2
> copies of data and in the second one will be part of cluster
> with one copy - only emergency.
>
> I still have this cluster in one
>
> This cluster have a 1 PiB of raw data capacity, thus it is very
> expensive add a further 300TB capacity to have 2+2 data redundancy.
>
> Will it works?
>
> If I turn off the 1/3 location, will it be operational?


Yes, the PGs should be active and accept IO. But the cluster will be
degraded; it cannot stay in this state permanently. (You will need to
recover the 3rd replica or change the crush map).



> I
> believe, it is a better choose, it will. And what if "die" 2/3
> location?


With min_size = 2, the PGs will be inactive, but the data will be safe. If
this happens, then set min_size = 1 to activate the PGs.
The MONs will not have quorum though -- you need a plan for that. And also
plan where you put your MDSs.

-- dan

__
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com




> On this cluster is pool with cephfs - this is a main
> part of CEPH.
>
> Many thanks for your notices.
>
> Sincerely
> Jan Marek
> --
> Ing. Jan Marek
> University of South Bohemia
> Academic Computer Centre
> Phone: +420389032080
> http://www.gnu.org/philosophy/no-word-attachments.cs.html
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MON sync time depends on outage duration

2023-07-10 Thread Dan van der Ster
Oh yes, sounds like purging the rbd trash will be the real fix here!
Good luck!

__
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com




On Mon, Jul 10, 2023 at 6:10 AM Eugen Block  wrote:

> Hi,
> I got a customer response with payload size 4096, that made things
> even worse. The mon startup time was now around 40 minutes. My doubts
> wrt decreasing the payload size seem confirmed. Then I read Dan's
> response again which also mentions that the default payload size could
> be too small. So I asked them to double the default (2M instead of 1M)
> and am now waiting for a new result. I'm still wondering why this only
> happens when the mon is down for more than 5 minutes. Does anyone have
> an explanation for that time factor?
> Another thing they're going to do is to remove lots of snapshot
> tombstones (rbd mirroring snapshots in the trash namespace), maybe
> that will reduce the osd_snap keys in the mon db, which then would
> increase the startup time. We'll see...
>
> Quoting Eugen Block :
>
> > Thanks, Dan!
> >
> >> Yes that sounds familiar from the luminous and mimic days.
> >> The workaround for zillions of snapshot keys at that time was to use:
> >>   ceph config set mon mon_sync_max_payload_size 4096
> >
> > I actually did search for mon_sync_max_payload_keys, not bytes so I
> > missed your thread, it seems. Thanks for pointing that out. So the
> > defaults seem to be these in Octopus:
> >
> > "mon_sync_max_payload_keys": "2000",
> > "mon_sync_max_payload_size": "1048576",
> >
> >> So it could be in your case that the sync payload is just too small to
> >> efficiently move 42 million osd_snap keys? Using debug_paxos and
> debug_mon
> >> you should be able to understand what is taking so long, and tune
> >> mon_sync_max_payload_size and mon_sync_max_payload_keys accordingly.
> >
> > I'm confused, if the payload size is too small, why would decreasing
> > it help? Or am I misunderstanding something? But it probably won't
> > hurt to try it with 4096 and see if anything changes. If not we can
> > still turn on debug logs and take a closer look.
> >
> >> And additional to Dan suggestion, the HDD is not a good choices for
> >> RocksDB, which is most likely the reason for this thread, I think
> >> that from the 3rd time the database just goes into compaction
> >> maintenance
> >
> > Believe me, I know... but there's not much they can currently do
> > about it, quite a long story... But I have been telling them that
> > for months now. Anyway, I will make some suggestions and report back
> > if it worked in this case as well.
> >
> > Thanks!
> > Eugen
> >
> > Quoting Dan van der Ster :
> >
> >> Hi Eugen!
> >>
> >> Yes that sounds familiar from the luminous and mimic days.
> >>
> >> Check this old thread:
> >>
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F3W2HXMYNF52E7LPIQEJFUTAD3I7QE25/
> >> (that thread is truncated but I can tell you that it worked for Frank).
> >> Also the even older referenced thread:
> >>
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/M5ZKF7PTEO2OGDDY5L74EV4QS5SDCZTH/
> >>
> >> The workaround for zillions of snapshot keys at that time was to use:
> >>   ceph config set mon mon_sync_max_payload_size 4096
> >>
> >> That said, that sync issue was supposed to be fixed by way of adding the
> >> new option mon_sync_max_payload_keys, which has been around since
> nautilus.
> >>
> >> So it could be in your case that the sync payload is just too small to
> >> efficiently move 42 million osd_snap keys? Using debug_paxos and
> debug_mon
> >> you should be able to understand what is taking so long, and tune
> >> mon_sync_max_payload_size and mon_sync_max_payload_keys accordingly.
> >>
> >> Good luck!
> >>
> >> Dan
> >>
> >> __
> >> Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com
> >>
> >>
> >>
> >> On Thu, Jul 6, 2023 at 1:47 PM Eugen Block  wrote:
> >>
> >>> Hi *,
> >>>
> >>> I'm investigating an interesting issue on two customer clusters (used
> >>> for mirroring) I've not solved yet, but today we finally made some
> >>> progress. Maybe someone has an idea where to look next, I'd appreciate
> >>> any hints or comments.
> >>> These are two (latest) Octopus clusters, main usage currently is RBD
> >>> mirroring with snapshot mode (around 500 RBD images are synced every
> >>> 30 minutes). They noticed very long startup times of MON daemons after
> >>> reboot, times between 10 and 30 minutes (reboot time already
> >>> subtracted). These delays are present on both sites. Today we got a
> >>> maintenance window and started to check in more detail by just
> >>> restarting the MON service (joins quorum within seconds), then
> >>> stopping the MON service and wait a few minutes (still joins quorum
> >>> within seconds). And then we stopped the service and waited for more
> >>> than 5 minutes, simulating a reboot, and 

[ceph-users] radosgw + keystone breaks when projects have - in their names

2023-07-10 Thread Andrew Bogott
I'm in the process of adding the radosgw service to our OpenStack cloud 
and hoping to re-use keystone for discovery and auth. Things seem to 
work fine with many keystone tenants, but as soon as we try to do 
something in a project with a '-' in its name everything fails.


Here's an example, using the openstack swift cli:

root@cloudcontrol2001-dev:~# OS_PROJECT_ID="testlabs" openstack 
container create 'makethiscontainer'

+---+---++
| account   | container | 
x-trans-id |

+---+---++
| AUTH_testlabs | makethiscontainer | 
tx008c311dbda86c695-0064ac5fad-6927acd-default |

+---+---++
root@cloudcontrol2001-dev:~# OS_PROJECT_ID="service" openstack container 
create 'makethiscontainer'

+--+---++
| account  | container | 
x-trans-id |

+--+---++
| AUTH_service | makethiscontainer | 
tx0b341a22866f65e44-0064ac5fb7-6927acd-default |

+--+---++
root@cloudcontrol2001-dev:~# OS_PROJECT_ID="admin-monitoring" openstack 
container create 'makethiscontainer'
Bad Request (HTTP 400) (Request-ID: 
tx0f7326bb541b4d2a9-0064ac5fc2-6927acd-default)



Before I dive into the source code, is this a known issue and/or 
something I can configure? Dash-named projects work fine in keystone and
seem to also work fine with standalone rados; I assume the issue is 
somewhere in the communication between the two. I suspected the implicit 
user creation code, but that seems to be working properly:


# radosgw-admin user list
[
    "cloudvirt-canary$cloudvirt-canary",
    "testlabs$testlabs",
    "paws-dev$paws-dev",
    "andrewtestproject$andrewtestproject",
    "admin-monitoring$admin-monitoring",
    "taavi-test-project$taavi-test-project",
    "admin$admin",
    "taavitestproject$taavitestproject",
    "bastioninfra-codfw1dev$bastioninfra-codfw1dev",
]

Here is the radosgw section of my ceph.conf:

[client.radosgw]

    host = 10.192.20.9
    keyring = /etc/ceph/ceph.client.radosgw.keyring
    rgw frontends = "civetweb port=18080"
    rgw_keystone_verify_ssl = false
    rgw_keystone_api_version = 3
    rgw_keystone_url = https://openstack.codfw1dev.wikimediacloud.org:25000
    rgw_keystone_accepted_roles = 'reader, admin, member'
    rgw_keystone_implicit_tenants = true
    rgw_keystone_admin_domain = default
    rgw_keystone_admin_project = service
    rgw_keystone_admin_user = swift
    rgw_keystone_admin_password = (redacted)
    rgw_s3_auth_use_keystone = true
    rgw_swift_account_in_url = true

    rgw_user_default_quota_max_objects = 4096
    rgw_user_default_quota_max_size = 8589934592


And here's a debug log of a failed transaction:

    https://phabricator.wikimedia.org/P49539

Thanks in advance!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mon log file grows huge

2023-07-10 Thread Ben
I just rechecked: debug_mon is at its default of 1/5. The mgr/cephadm
log_to_cluster level has been changed from debug to critical. I wonder how
to set the levels of the others; I haven't got a clue how to do that.
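
For what it's worth, the per-subsystem levels for the other daemons can
be set the same way as for the mons, e.g. (example values, not
recommendations):

ceph config set osd debug_osd 1/5
ceph config set mds debug_mds 1/5
ceph config set mgr debug_mgr 1/5
ceph config set mon debug_rocksdb 1/5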

Thanks,
Ben

On Mon, Jul 10, 2023 at 23:21, Wesley Dillingham wrote:

> At what level do you have logging set to for your mons? That is a high
> volume of logs for the mon to generate.
>
> You can ask all the mons to print their debug logging level with:
>
> "ceph tell mon.* config get debug_mon"
>
> The default is 1/5
>
> What is the overall status of your cluster? Is it healthy?
>
> "ceph status"
>
> Consider implementing more aggressive log rotation
>
> This link may provide useful
> https://docs.ceph.com/en/latest/rados/troubleshooting/log-and-debug/
>
>
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn 
>
>
> On Mon, Jul 10, 2023 at 9:44 AM Ben  wrote:
>
>> Hi,
>>
>> In our cluster monitors' log grows to couple GBs in days. There are quite
>> many debug message from rocksdb, osd, mgr and mds. These should not be
>> necessary with a well-run cluster. How could I close these logging?
>>
>> Thanks,
>> Ben
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io