[ceph-users] Re: Debian/bullseye build for reef

2023-09-04 Thread Matthew Vernon

Hi,

On 21/08/2023 17:16, Josh Durgin wrote:
We weren't targeting bullseye; once we discovered the compiler version 
problem, the focus shifted to bookworm. If anyone would like to help 
maintain Debian builds, or look into these issues, it would be 
welcome:


https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1030129 


I think this bug is similar (identical?) to Debian bug #1039472, which is now 
fixed in bookworm by a backport, so it might be worth trying again with 
a fully-updated bookworm system?
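
In case anyone wants to give it a go before I do, this is the rough sequence 
I'd try on an up-to-date bookworm box. The two helper scripts ship in the Ceph 
source tree, but I haven't re-run this recently myself, so treat it as a 
sketch rather than a recipe:

apt update && apt full-upgrade -y
apt install -y git build-essential
git clone https://github.com/ceph/ceph.git && cd ceph
git checkout v18.2.0                      # or whichever reef tag is current
git submodule update --init --recursive
./install-deps.sh                         # installs the Debian build dependencies
./make-debs.sh                            # builds the .deb packages locally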


[this is going to be relevant to my interests at some point, but I can't 
yet offer much time]


Regards,

Matthew
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Permissions of the .snap directory do not inherit ACLs in 17.2.6

2023-09-04 Thread MARTEL Arnaud
Hi Eugen,



We have a lot of shared directories in CephFS, and each directory has a specific 
ACL to grant access to several groups (for read and/or read/write access).

Here are the complete steps to reproduce the problem in 17.2.6 with only one group, 
GIPSI, in the ACL:

# mkdir /mnt/ceph/test
# chown root:nogroup /mnt/ceph/test
# chmod 770 /mnt/ceph/test
# setfacl --set="u::rwx,g::rwx,o::-,d:m::rwx,m::rwx,d:g:GIPSI:rwx,g:GIPSI:rwx" /mnt/ceph/test/

# getfacl /mnt/ceph/test
# file: mnt/ceph/test
# owner: root
# group: nogroup
user::rwx
group::rwx
group:GIPSI:rwx
mask::rwx
other::---
default:user::rwx
default:group::rwx
default:group:GIPSI:rwx
default:mask::rwx
default:other::---

# touch /mnt/ceph/test/foo
# getfacl /mnt/ceph/test/foo
# file: mnt/ceph/test/foo
# owner: root
# group: root
user::rw-
group::rwx   #effective:rw-
group:GIPSI:rwx  #effective:rw-
mask::rw-
other::---

# mkdir /mnt/ceph/test/.snap/snaptest
# getfacl /mnt/ceph/test/.snap
# file: mnt/ceph/test/.snap
# owner: root
# group: nogroup
user::rwx
group::rwx
other::---

As a result, no member of the GIPSI group is able to access the snapshots…

And no users complained about access to the snapshots before our 
upgrade, so I suppose that the ACL of the .snap directory was OK in Pacific (> 
16.2.9)
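
For what it's worth, a quick way to confirm the effect from a group member's 
point of view (the username gipsi-user below is just a placeholder for any 
account that is a member of GIPSI):

sudo -u gipsi-user ls /mnt/ceph/test                   # works: the directory ACL grants g:GIPSI:rwx
sudo -u gipsi-user ls /mnt/ceph/test/.snap/snaptest    # denied on 17.2.6, since .snap only carries the plain mode bits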



Arnaud



On 04/09/2023 12:59, Eugen Block <ebl...@nde.ag> wrote:

I'm wondering if I did something wrong or if I'm missing something. I
tried to reproduce the described steps from the bug you mentioned, and
from Nautilus to Reef (I have a couple of test clusters) the getfacl
output for the .snap directory is always the same:

$ getfacl /mnt/cephfs/test/.snap/
getfacl: Removing leading '/' from absolute path names
# file: mnt/cephfs/test/.snap/
# owner: root
# group: root
user::rwx
group::rwx
other::---

So in my tests it never actually shows the "users" group ACL. But you
wrote that it worked with Pacific for you, so I'm confused...

Quoting MARTEL Arnaud <arnaud.mar...@cea.fr>:

> Hi,
>
> I'm facing the same situation as described in bug #57084
> (https://tracker.ceph.com/issues/57084) since I upgraded from
> 16.2.13 to 17.2.6
>
> for example:
>
> root@faiserver:~# getfacl /mnt/ceph/default/
> # file: mnt/ceph/default/
> # owner: 99
> # group: nogroup
> # flags: -s-
> user::rwx
> user:s-sac-acquisition:rwx
> group::rwx
> group:acquisition:r-x
> group:SAC_R:r-x
> mask::rwx
> other::---
> default:user::rwx
> default:user:s-sac-acquisition:rwx
> default:group::rwx
> default:group:acquisition:r-x
> default:group:SAC_R:r-x
> default:mask::rwx
> default:other::---
>
> root@faiserver:~# getfacl /mnt/ceph/default/.snap
> # file: mnt/ceph/default/.snap
> # owner: 99
> # group: nogroup
> # flags: -s-
> user::rwx
> group::rwx
> other::r-x
>
> Before creating a new bug report, could you tell me if someone has
> the same problem with 17.2.6 ??
>
> Kind regards,
> Arnaud
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rgw replication sync issue

2023-09-04 Thread Eugen Block
Did you try to rewrite the objects to see if at least those two errors  
resolve? Do you have any logs from the RGWs from when the sync stopped  
working? You write that the bandwidth usage just dropped but is still > 0;  
does that mean some buckets are still syncing? Can you see a pattern, i.e.  
is the failing sync limited to specific buckets, or are all buckets affected?  
Do you see anything at the system level of the remote cluster (syslog,  
dmesg, networking issues)?
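
In case it helps, these are the commands I would start with on the secondary  
zone (the bucket name below is just a placeholder):

radosgw-admin sync status                                  # overall metadata/data sync state for the zone
radosgw-admin sync error list                              # recent errors recorded by the sync threads
radosgw-admin bucket sync status --bucket=<bucket-name>    # per-bucket sync progress
radosgw-admin bucket rewrite --bucket=<bucket-name>        # rewrites the bucket's objects, per the question above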



Quoting ankit raikwar:


Hello Eugen Block,
There are no inactive PGs in the clusters.  
Even now, after we put around 4 TiB of data into the primary cluster, the  
data is not syncing to the secondary cluster. It is still the same.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Permissions of the .snap directory do not inherit ACLs in 17.2.6

2023-09-04 Thread Eugen Block
I'm wondering if I did something wrong or if I'm missing something. I  
tried to reproduce the described steps from the bug you mentioned, and  
from Nautilus to Reef (I have a couple of test clusters) the getfacl  
output for the .snap directory is always the same:


$ getfacl  /mnt/cephfs/test/.snap/
getfacl: Removing leading '/' from absolute path names
# file: mnt/cephfs/test/.snap/
# owner: root
# group: root
user::rwx
group::rwx
other::---

So in my tests it never actually shows the "users" group ACL. But you  
wrote that it worked with Pacific for you, so I'm confused...
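
For reference, this is the condensed sequence I'm testing with (paths and the  
GIPSI group name are taken from Arnaud's mail):

mkdir /mnt/cephfs/test
setfacl --set="u::rwx,g::rwx,o::-,d:m::rwx,m::rwx,d:g:GIPSI:rwx,g:GIPSI:rwx" /mnt/cephfs/test
mkdir /mnt/cephfs/test/.snap/snaptest     # create a snapshot of the directory
getfacl /mnt/cephfs/test                  # the parent shows the group:GIPSI entries
getfacl /mnt/cephfs/test/.snap            # the point of contention: whether the extended ACL entries appear here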


Quoting MARTEL Arnaud:


Hi,

I'm facing the same situation as described in bug #57084  
(https://tracker.ceph.com/issues/57084) since I upgraded from  
16.2.13 to 17.2.6


for example:

root@faiserver:~# getfacl /mnt/ceph/default/
# file: mnt/ceph/default/
# owner: 99
# group: nogroup
# flags: -s-
user::rwx
user:s-sac-acquisition:rwx
group::rwx
group:acquisition:r-x
group:SAC_R:r-x
mask::rwx
other::---
default:user::rwx
default:user:s-sac-acquisition:rwx
default:group::rwx
default:group:acquisition:r-x
default:group:SAC_R:r-x
default:mask::rwx
default:other::---

root@faiserver:~# getfacl /mnt/ceph/default/.snap
# file: mnt/ceph/default/.snap
# owner: 99
# group: nogroup
# flags: -s-
user::rwx
group::rwx
other::r-x


Before creating a new bug report, could you tell me if someone has  
the same problem with 17.2.6 ??


Kind regards,
Arnaud
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW Lua - writable response header/field

2023-09-04 Thread Ondřej Kukla
Hello,

We have an RGW setup with a bunch of Nginx instances in front of the RGWs acting 
as a load balancer (LB). I'm currently working on some metrics and log analysis 
from the LB logs.

At the moment I'm looking at ways to recognise the type of S3 request on the LB. 
I know that matching the request format shouldn't be extremely hard, but I was 
looking into extracting the information from RGW, as that's the component that 
is actually aware of it.

I have worked with the Lua part of RGW before, so I know that the Request.RGWOp 
field is a great fit.

I would like to add this as some kind of response header, but unfortunately 
that's not possible at the moment, if I'm not wrong.

Has anyone looked into this (wink wink Yuval :))? Or do you have a 
recommendation on how to do it?
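
For context, the kind of stopgap I had in mind (an untested sketch: it logs the 
op name from a postrequest Lua script so it at least ends up in the RGW debug 
log rather than the LB log; the field and command names should be double-checked 
against the Lua scripting docs):

cat > rgwop-log.lua <<'EOF'
-- log the RGW operation name for every completed request
RGWDebugLog("op=" .. Request.RGWOp)
EOF
radosgw-admin script put --infile=rgwop-log.lua --context=postrequest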

Thanks a lot.

Regards,

Ondrej
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Is it possible (or meaningful) to revive old OSDs?

2023-09-04 Thread ceph-mail
Hello,

I have a ten node cluster with about 150 OSDs. One node went down a while back, 
several months ago. The OSDs on that node have been marked down and out ever since.

I am now in a position to return the node to the cluster, with all of its OS and 
OSD disks. When I boot up the now-working node, the OSDs do not start.

Essentially, they seem to complain with "fail[ing] to load OSD map for [various 
epoch]s, got 0 bytes".

I'm guessing the OSDs' on-disk maps are so old that they can't get back into the 
cluster?

My questions are whether it's possible or worth it to try to squeeze these OSDs 
back in, or whether I should just replace them. And if I should just replace them, 
what's the best way? Manually remove [1] and recreate? Replace [2]? Purge in the 
dashboard? (A rough command sketch follows the links below.)

[1] 
https://docs.ceph.com/en/quincy/rados/operations/add-or-rm-osds/#removing-osds-manual
[2] 
https://docs.ceph.com/en/quincy/rados/operations/add-or-rm-osds/#replacing-an-osd
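
If replacing turns out to be the way to go, here is a rough sketch of the manual 
(non-cephadm) flow; OSD IDs and device paths are placeholders, and the docs linked 
above should be the reference before doing anything destructive:

ceph osd purge <osd-id> --yes-i-really-mean-it    # removes the OSD from the CRUSH map, its auth key and the OSD map
ceph-volume lvm zap /dev/sdX --destroy            # wipes the old OSD's LVM metadata and labels on the disk
ceph-volume lvm create --data /dev/sdX            # creates and activates a fresh OSD on the same device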

Many thanks!

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is it safe to add different OS but same ceph version to the existing cluster?

2023-09-04 Thread Szabo, Istvan (Agoda)
Hi,

I've added Ubuntu 20.04 nodes to my bare-metal deployed Ceph Octopus 15.2.17 
cluster, next to the CentOS 8 nodes, and I see something interesting regarding 
the disk usage: it is higher on Ubuntu than on CentOS, while the CPU usage is 
lower (in this picture you can see 4 nodes, each column is 1 node, and the last 
column is the Ubuntu node): 
https://i.ibb.co/Tk5Srk6/image-2023-09-04-09-55-52-311.png

Could this be because of the missing HPC tuned profile on Ubuntu 20.04?
There isn't an HPC tuned profile on Ubuntu 20.04, so I've used the 
latency-performance profile, which is the base of the HPC one.

This is the latency-performance profile:

[main]
summary=Optimize for deterministic performance at the cost of increased power 
consumption
[cpu]
force_latency=1
governor=performance
energy_perf_bias=performance
min_perf_pct=100
[sysctl]
kernel.sched_min_granularity_ns=1000
vm.dirty_ratio=10
vm.dirty_background_ratio=3
vm.swappiness=10
kernel.sched_migration_cost_ns=500

The hpc-compute tuned profile has these additional values on CentOS and on Ubuntu 22.04:
[main]
summary=Optimize for HPC compute workloads
description=Configures virtual memory, CPU governors, and network settings for 
HPC compute workloads.
include=latency-performance
[vm]
transparent_hugepages=always
[disk]
readahead=>4096
[sysctl]
vm.hugepages_treat_as_movable=0
vm.min_free_kbytes=135168
vm.zone_reclaim_mode=1
kernel.numa_balancing=0
net.core.busy_read=50
net.core.busy_poll=50
net.ipv4.tcp_fastopen=3

If someone is very good with these kernel parameter values, do you see something 
that might be related to the high disk utilization?
Thank you



From: Milind Changire 
Sent: Monday, August 7, 2023 11:38 PM
To: Szabo, Istvan (Agoda) 
Cc: Ceph Users 
Subject: Re: [ceph-users] Is it safe to add different OS but same ceph version 
to the existing cluster?


On Mon, Aug 7, 2023 at 8:23 AM Szabo, Istvan (Agoda)
 wrote:
>
> Hi,
>
> I have an octopus cluster on the latest octopus version with mgr/mon/rgw/osds 
> on centos 8.
> Is it safe to add an ubuntu osd host with the same octopus version?
>
> Thank you

Well, the Ceph source bits surely remain the same. The binary bits
could be different due to better compiler support on the newer OS
version, so assuming the new Ceph is deployed on the same hardware
platform, things should be stable.
Also, assuming that the relevant OS tunables and Ceph features and config
options have been configured to match the older deployment, the new
Ceph deployment should work fine and as expected.
Saying all this, I'd still recommend testing out the move one node at
a time rather than executing a bulk move.
Making a list of the types of devices and checking driver support on the
new OS would also be a prudent thing to do.



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io