[ceph-users] Re: logging with container

2022-03-21 Thread Tony Liu
Hi Adam,

When I do "ceph tell mon.ceph-1 config set log_to_file true",
I see the log file is created. That confirms that those options in command line
can only be override by runtime config change.
Could you check mon and mgr logging on your setup?

Can we remove those command-line options and let logging be controlled
by the cluster configuration or the configuration file?

Another issue is that the log keeps going to
/var/lib/docker/containers//-json.log,
which keeps growing and is not under logrotate management. How can I stop
logging to the container's stdout/stderr? Setting "log_to_stderr" doesn't help.
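
One container-side workaround (an aside, independent of Ceph, assuming the default
json-file log driver) is to cap Docker's own per-container log files so they rotate:

```
# /etc/docker/daemon.json - limit the json-file log driver so container logs rotate
# (sizes are examples; only containers created after the change pick these up)
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF
sudo systemctl restart docker
```

Since the options only apply to newly created containers, the Ceph daemons would need
to be redeployed (or their containers recreated) for this to take effect.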


Thanks!
Tony

From: Tony Liu 
Sent: March 21, 2022 09:41 PM
To: Adam King
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] Re: logging with container

Hi Adam,

# ceph config get mgr log_to_file
true
# ceph config get mgr log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd log_to_file
true
# ceph config get osd log_file
/var/log/ceph/$cluster-$name.log
# ls /var/log/ceph/fa771070-a975-11ec-86c7-e4434be9cb2e/
ceph-osd.10.log  ceph-osd.13.log  ceph-osd.16.log  ceph-osd.19.log  
ceph-osd.1.log  ceph-osd.22.log  ceph-osd.4.log  ceph-osd.7.log  ceph-volume.log
# ceph version
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)

"log_to_file" and "log_file" are set the same for mgr and osd, but why there is 
osd log only,
but no mgr log?


Thanks!
Tony

From: Adam King 
Sent: March 21, 2022 08:26 AM
To: Tony Liu
Cc: ceph-users@ceph.io; d...@ceph.io
Subject: Re: [ceph-users] logging with container

Hi Tony,

Afaik those container flags just set the defaults and the config options 
override them. Setting the necessary flags 
(https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files) seemed 
to work for me.

[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config set global log_to_file true
[ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph version
ceph version 16.2.7-601-g179a7bca (179a7bca8a84771b0dde09e26f7a2146a985df90) 
pacific (stable)
[ceph: root@vm-00 /]# exit
exit
[root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/
ceph.audit.log  ceph.cephadm.log  ceph.log  ceph-mgr.vm-00.ukcctb.log  
ceph-mon.vm-00.log  ceph-osd.0.log  ceph-osd.10.log  ceph-osd.2.log  
ceph-osd.4.log  ceph-osd.6.log  ceph-osd.8.log  ceph-volume.log



On Mon, Mar 21, 2022 at 1:06 AM Tony Liu <tonyliu0...@hotmail.com> wrote:
Hi,

After reading through the docs, it's still not very clear to me how logging works
with containers.
This is with the Pacific v16.2 container.

In the OSD container, I see this:
```
/usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph 
--default-log-to-file=false --default-log-to-stderr=true 
--default-log-stderr-prefix=debug
```
When I check the Ceph configuration:
```
# ceph config get osd.16 log_file
/var/log/ceph/$cluster-$name.log
# ceph config get osd.16 log_to_file
true
# ceph config show osd.16 log_to_file
false
```
Q1: what's the intention of those log settings on the command line? They have high
priority and override the configuration from the file and the mon. Is there an option
to avoid that when deploying the container?
Q2: since log_to_file is set to false by the command line, why is there still
logging in log_file?

The same applies to mgr and mon.

What I want is to have everything in the log file and to minimize the stdout and
stderr from the container.
Because the log file is managed by logrotate, it is unlikely to blow up disk space,
but the container's stdout and stderr are stored in a single file that is not managed
by logrotate and may grow huge.
Also, it's easier to check a log file with vi than with "podman logs". And the log
files are also collected and stored by ELK for central management.

Any comments on how I can achieve this?
A runtime override may not be the best option, because it's not persistent.


Thanks!
Tony

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: bind monitoring service to specific network and port

2022-03-21 Thread Tony Liu
It's probably related to podman again. After switching back to Docker,
this works fine.

Thanks!
Tony

From: Tony Liu 
Sent: March 20, 2022 06:31 PM
To: ceph-users@ceph.io; d...@ceph.io
Subject: [ceph-users] bind monitoring service to specific network and port

Hi,

https://docs.ceph.com/en/pacific/cephadm/services/monitoring/#networks-and-ports
When I try that with the Pacific v16.2 image, the port works but the network doesn't.
No matter which network is specified in the yaml file, orch apply always binds the
service to *.
Is this a known issue or am I missing something?
Could anyone point me to the code for this?
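
For reference, a sketch of the kind of spec the linked doc describes (service type,
network, and port values are placeholders, not taken from this cluster):

```
# monitoring.yaml - bind the prometheus service to one network and a custom port
cat <<'EOF' > monitoring.yaml
service_type: prometheus
placement:
  count: 1
networks:
- 192.168.142.0/24
spec:
  port: 9095
EOF
ceph orch apply -i monitoring.yaml
```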


Thanks!
Tony
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: orch apply failed to use insecure private registry

2022-03-21 Thread Tony Liu
It's a podman issue:
https://github.com/containers/podman/issues/11933
Switched back to Docker.

Thanks!
Tony

From: Eugen Block 
Sent: March 21, 2022 06:11 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: orch apply failed to use insecure private registry

Hi,

> Setting mgr/cephadm/registry_insecure to false doesn't help.

if you want to use an insecure registry you would need to set this
option to true, not false.

> I am using podman and /etc/containers/registries.conf is set with
> that insecure private registry.

Can you paste the whole content? It's been two years or so since I
tested a setup with an insecure registry, I believe the
registries.conf also requires a line with "insecure = true". I'm not
sure if this will be enough, though. Did you successfully log in to the
registry from all nodes?

ceph cephadm registry-login my_url my_username my_password
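
For reference, a minimal sketch of the registries.conf entry for an insecure registry
(v2 syntax; the registry address is a placeholder and the file has to be present on
every host):

```
# run as root on each host
cat <<'EOF' >> /etc/containers/registries.conf

[[registry]]
location = "myregistry.example.com:5000"
insecure = true
EOF
```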

Zitat von Tony Liu :

> Hi,
>
> I am using the Pacific v16.2 container image. I put the images on an insecure
> private registry.
> I am using podman and /etc/containers/registries.conf is set with
> that insecure private registry.
> "cephadm bootstrap" works fine to pull the image and set up the first node.
> When I run "ceph orch apply -i service.yaml" to deploy services on all
> nodes, "ceph log last cephadm"
> shows a failure to ping the private registry with SSL.
> Setting mgr/cephadm/registry_insecure to false doesn't help.
> I have to manually pull all images on all nodes, then "orch apply"
> continues and all services are deployed.
> Is this a known issue or are there some settings I am missing?
> Could anyone point me to the cephadm code that pulls the container image?
>
>
> Thanks!
> Tony
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph RADOSGW with Keycloak ODIC

2022-03-21 Thread Seth Cagampang
Good news! I figured out what was going on. I found where the debug logging
location was set in my ceph.conf:
[client.radosgw.gateway_name]
debug ms = 1
debug rgw = 20
host = gateway_name
keyring = /etc/ceph/ceph.client.radosgw.keyring
log file = /var/log/radosgw/client.radosgw.gateway_name.log
#<-- log file is here
rgw dns name = key-cloak-sc140.osnexus.net
rgw frontends = beast endpoint=10.0.26.140:7480
rgw print continue = false
rgw s3 auth use sts = true
rgw socket path =
/var/run/ceph/ceph-client.radosgw.gateway_name.asok
rgw sts key = abcdefghijklmnop
rgw zone = default

I tailed that log to see what kind of exceptions were being thrown and
found that the token verification failed because it expired. I must have
been taking too long between testing iterations after generating new access
tokens. I was able to create my-bucket on my ceph object storage after
regenerating the access token, plugging it in, and then immediately running
the sample script. Looks like my issues were self-inflicted. For those who
find this thread, either extend the expiration time for the access tokens,
plug in and run the sample script in the allotted expiration time for the
access token, or modify the sample script to load in a new access token
each time.
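
A sketch of fetching a fresh token right before each run (realm and client names are
the example values from this thread, the client secret is a placeholder, the endpoint
assumes the legacy /auth path used above, and jq is assumed to be installed):

```
# request a new access token from Keycloak via the client_credentials grant
TOKEN=$(curl -s \
  -d 'grant_type=client_credentials' \
  -d 'client_id=myclient' \
  -d 'client_secret=REPLACE_ME' \
  http://localhost:8080/auth/realms/demo/protocol/openid-connect/token | jq -r .access_token)
echo "$TOKEN"   # feed this into the sample script instead of a stale token
```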

On Mon, Mar 21, 2022 at 1:45 PM Seth Cagampang 
wrote:

> I think that the app_id condition was a typo. After I run the python
> script to create the role I get the following role:
> {
> "Roles": [
> {
> "Path": "/",
> "RoleName": "S3Access",
> "RoleId": "2097f1fc-8a56-454c-8f00-23ded3c3c3b4",
> "Arn": "arn:aws:iam:::role/S3Access",
> "CreateDate": "2022-03-21T18:35:34.282000+00:00",
> "AssumeRolePolicyDocument": {
> "Version": "2012-10-17",
> "Statement": [
> {
> "Effect": "Allow",
> "Principal": {
> "Federated": [
>
> "arn:aws:iam:::oidc-provider/localhost:8080/auth/realms/demo"
> ]
> },
> "Action": [
> "sts:AssumeRoleWithWebIdentity"
> ],
> "Condition": {
> "StringEquals": {
> "localhost:8080/auth/realms/demo:app_id":
> "account"
> }
> }
> }
> ]
> },
> "MaxSessionDuration": 3600
> }
> ]
> }
>
> I am able to verify the 'aud' and 'client_id' attribute using the keycloak
> introspection URL:
>
> {
>   "exp": 1647889827,
>   "iat": 1647889527,
>   "jti": "5b754ce5-6601-416f-aeb5-2163bd3f8315",
>   "iss": "http://localhost:8080/auth/realms/demo;,
>   "aud": "account",
>   "sub": "19bd5627-2952-4aca-bc67-a2724b7d61b5",
>   "typ": "Bearer",
>   "azp": "myclient",
>   "preferred_username": "service-account-myclient",
>   "email_verified": false,
>   "acr": "1",
>   "allowed-origins": [
> "https://10.0.26.140:7480;
>   ],
>   "realm_access": {
> "roles": [
>   "offline_access",
>   "default-roles-demo",
>   "uma_authorization"
> ]
>   },
>   "resource_access": {
> "myclient": {
>   "roles": [
> "uma_protection"
>   ]
> },
> "account": {
>   "roles": [
> "manage-account",
> "manage-account-links",
> "view-profile"
>   ]
> }
>   },
>   "scope": "openid email profile",
>   "clientId": "myclient",
>   "clientHost": "10.0.26.140",
>   "clientAddress": "10.0.26.140",
>   "client_id": "myclient",
>   "username": "service-account-myclient",
>   "active": true
> }
>
> Via Simone's suggestion, it looks like my 'rgw sts key = abcdefghijklmnop'
> and 'rgw s3 auth use sts = true' are not being applied. I added the debug
> options and sts options to the /etc/ceph/ceph.conf file and verified that
> all nodes in the cluster have the settings applied. Then, I restarted the
> 'radosgw' service using 'systemctl restart radosgw.service'. Finally, I
> check the rgw config using 'radosgw-admin --show-config':
>
> root@terminal:~# radosgw-admin --show-config | grep -i sts
> mds_forward_all_requests_to_auth = false
> mds_max_completed_requests = 10
> rbd_readahead_trigger_requests = 10
> rgw_enable_apis = s3, s3website, swift, swift_auth, admin, sts, iam,
> notifications
> rgw_max_concurrent_requests = 1024
> rgw_s3_auth_order = sts, external, local
> rgw_s3_auth_use_sts = false
> rgw_sts_client_id =
> rgw_sts_client_secret =
> rgw_sts_entry = sts
> rgw_sts_key = sts
> rgw_sts_max_session_duration = 43200
> rgw_sts_min_session_duration = 900
> rgw_sts_token_introspection_url =
>
> root@terminal:~# radosgw-admin --show-config | grep -i debug_ms
> debug_ms = 0/0
>
> root@terminal:~# 

[ceph-users] Re: Ceph RADOSGW with Keycloak ODIC

2022-03-21 Thread Seth Cagampang
I think that the app_id condition was a typo. After I run the python script
to create the role I get the following role:
{
"Roles": [
{
"Path": "/",
"RoleName": "S3Access",
"RoleId": "2097f1fc-8a56-454c-8f00-23ded3c3c3b4",
"Arn": "arn:aws:iam:::role/S3Access",
"CreateDate": "2022-03-21T18:35:34.282000+00:00",
"AssumeRolePolicyDocument": {
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Federated": [

"arn:aws:iam:::oidc-provider/localhost:8080/auth/realms/demo"
]
},
"Action": [
"sts:AssumeRoleWithWebIdentity"
],
"Condition": {
"StringEquals": {
"localhost:8080/auth/realms/demo:app_id":
"account"
}
}
}
]
},
"MaxSessionDuration": 3600
}
]
}

I am able to verify the 'aud' and 'client_id' attribute using the keycloak
introspection URL:

{
  "exp": 1647889827,
  "iat": 1647889527,
  "jti": "5b754ce5-6601-416f-aeb5-2163bd3f8315",
  "iss": "http://localhost:8080/auth/realms/demo;,
  "aud": "account",
  "sub": "19bd5627-2952-4aca-bc67-a2724b7d61b5",
  "typ": "Bearer",
  "azp": "myclient",
  "preferred_username": "service-account-myclient",
  "email_verified": false,
  "acr": "1",
  "allowed-origins": [
"https://10.0.26.140:7480;
  ],
  "realm_access": {
"roles": [
  "offline_access",
  "default-roles-demo",
  "uma_authorization"
]
  },
  "resource_access": {
"myclient": {
  "roles": [
"uma_protection"
  ]
},
"account": {
  "roles": [
"manage-account",
"manage-account-links",
"view-profile"
  ]
}
  },
  "scope": "openid email profile",
  "clientId": "myclient",
  "clientHost": "10.0.26.140",
  "clientAddress": "10.0.26.140",
  "client_id": "myclient",
  "username": "service-account-myclient",
  "active": true
}

Via Simone's suggestion, it looks like my 'rgw sts key = abcdefghijklmnop'
and 'rgw s3 auth use sts = true' are not being applied. I added the debug
options and sts options to the /etc/ceph/ceph.conf file and verified that
all nodes in the cluster have the settings applied. Then, I restarted the
'radosgw' service using 'systemctl restart radosgw.service'. Finally, I
check the rgw config using 'radosgw-admin --show-config':

root@terminal:~# radosgw-admin --show-config | grep -i sts
mds_forward_all_requests_to_auth = false
mds_max_completed_requests = 10
rbd_readahead_trigger_requests = 10
rgw_enable_apis = s3, s3website, swift, swift_auth, admin, sts, iam,
notifications
rgw_max_concurrent_requests = 1024
rgw_s3_auth_order = sts, external, local
rgw_s3_auth_use_sts = false
rgw_sts_client_id =
rgw_sts_client_secret =
rgw_sts_entry = sts
rgw_sts_key = sts
rgw_sts_max_session_duration = 43200
rgw_sts_min_session_duration = 900
rgw_sts_token_introspection_url =

root@terminal:~# radosgw-admin --show-config | grep -i debug_ms
debug_ms = 0/0

root@terminal:~# radosgw-admin --show-config | grep -i debug_rgw
debug_rgw = 1/5

As you can see it looks like the settings in the config file did not get
applied from the perspective of the radosgw-admin CLI tool. Am I doing
something wrong to apply these settings? It seems I won't be able to get
the debug logs until I can apply some of these settings. After running the
example boto3 script, I am not seeing any sort of rgw logs in
'/var/log/ceph/' :

root@terminal# ls -la /var/log/ceph/
total 19304
drwxrws--T  2 ceph ceph  4096 Mar 21 12:58 .
drwxrwxr-x 19 root syslog    4096 Mar 21 13:02 ..
-rw---  1 ceph ceph   4071721 Mar 21 13:34 ceph.audit.log
-rw---  1 ceph ceph    776987 Mar 21 13:34 ceph.log
-rw-r--r--  1 ceph ceph   1823776 Mar 21 13:28 ceph-mgr.key-cloak-sc140.log
-rw-r--r--  1 ceph ceph   3484236 Mar 21 13:34 ceph-mon.key-cloak-sc140.log
-rw-r--r--  1 ceph ceph   2359942 Mar 21 13:28 ceph-osd.0.log
-rw-r--r--  1 ceph ceph   2306777 Mar 21 13:28 ceph-osd.1.log
-rw-r--r--  1 ceph ceph   2312102 Mar 21 13:28 ceph-osd.2.log
-rw-r--r--  1 ceph ceph   2365239 Mar 21 13:28 ceph-osd.3.log
-rw-rw-rw-  1 root ceph    184626 Mar 21 12:58 ceph-volume.log
-rw-r--r--  1 root ceph 11306 Mar 21 12:58 ceph-volume-systemd.log

I was trying to tail the log files while running the POC script, but I did
not notice any clear error messages related to the
AssumeRoleWithWebIdentity call. Does this mean that my radosgw is not set
up properly? I used this guide <
https://access.redhat.com/solutions/2085183#:~:text=The%20logs%20will%20be%20inside,on%20the%20Rados%20Gateway%20node.>
to try to set up the debug logging:
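
An aside that may explain the output above: `radosgw-admin --show-config` prints the
configuration as seen by the radosgw-admin process itself, not by the running radosgw
daemon, so options placed under a [client.radosgw.gateway_name] section will not show
up there. A sketch of querying the running daemon instead, using the admin socket path
from the ceph.conf above:

```
# ask the running radosgw daemon for its effective settings
ceph daemon /var/run/ceph/ceph-client.radosgw.gateway_name.asok config get rgw_s3_auth_use_sts
ceph daemon /var/run/ceph/ceph-client.radosgw.gateway_name.asok config get rgw_sts_key
# or, if the daemon reports to the cluster under that name:
ceph config show client.radosgw.gateway_name rgw_s3_auth_use_sts
```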


[ceph-users] Re: logging with container

2022-03-21 Thread Adam King
Hi Tony,

Afaik those container flags just set the defaults and the config options
override them. Setting the necessary flags (
https://docs.ceph.com/en/latest/cephadm/operations/#logging-to-files)
seemed to work for me.

[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
false
[ceph: root@vm-00 /]# ceph config set global log_to_file true
[ceph: root@vm-00 /]# ceph config set global mon_cluster_log_to_file true
[ceph: root@vm-00 /]# ceph config get osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph config show osd.0 log_to_file
true
[ceph: root@vm-00 /]# ceph version
ceph version 16.2.7-601-g179a7bca
(179a7bca8a84771b0dde09e26f7a2146a985df90) pacific (stable)
[ceph: root@vm-00 /]# exit
exit
[root@vm-00 ~]# ls /var/log/ceph/413f7ec8-a91e-11ec-9b02-52540092b5a3/
ceph.audit.log  ceph.cephadm.log  ceph.log  ceph-mgr.vm-00.ukcctb.log
 ceph-mon.vm-00.log  ceph-osd.0.log  ceph-osd.10.log  ceph-osd.2.log
 ceph-osd.4.log  ceph-osd.6.log  ceph-osd.8.log  ceph-volume.log



On Mon, Mar 21, 2022 at 1:06 AM Tony Liu  wrote:

> Hi,
>
> After reading through doc, it's still not very clear to me how logging
> works with container.
> This is with Pacific v16.2 container.
>
> In OSD container, I see this.
> ```
> /usr/bin/ceph-osd -n osd.16 -f --setuser ceph --setgroup ceph
> --default-log-to-file=false --default-log-to-stderr=true
> --default-log-stderr-prefix=debug
> ```
> When check ceph configuration.
> ```
> # ceph config get osd.16 log_file
> /var/log/ceph/$cluster-$name.log
> # ceph config get osd.16 log_to_file
> true
> # ceph config show osd.16 log_to_file
> false
> ```
> Q1, what's the intention of those log settings in command line? It's high
> priority and overrides
> configuration in file and mon. Is there any option not doing that when
> deploy the container?
> Q2, since log_to_file is set to false by command line, why there is still
> loggings in log_file?
>
> The same for mgr and mon.
>
> What I want is to have everything in log file and minimize the stdout and
> stderr from container.
> Because log file is managed by logrotate, it unlikely blow up disk space.
> But stdout and stderr
> from container is stored in a single file, not managed by logrotate. It
> may grow up to huge file.
> Also, it's easier to check log file by vi than "podman logs". And log file
> is also collected and
> stored by ELK for central management.
>
> Any comments how I can achieve what I want?
> Runtime override may not be the best option, cause it's not persistent.
>
>
> Thanks!
> Tony
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RadosGW S3 range on a 0 byte object gives 416 Range Not Satisfiable

2022-03-21 Thread Ulrich Klein
RFC 7233

4.4.  416 Range Not Satisfiable

   The 416 (Range Not Satisfiable) status code indicates that none of
   the ranges in the request's Range header field (Section 3.1) overlap
   the current extent of the selected resource or that the set of ranges
   requested has been rejected due to invalid ranges or an excessive
   request of small or overlapping ranges.

   For byte ranges, failing to overlap the current extent means that the
   first-byte-pos of all of the byte-range-spec values were greater than
   the current length of the selected representation.  When this status
   code is generated in response to a byte-range request, the sender
   SHOULD generate a Content-Range header field specifying the current
   length of the selected representation (Section 4.2).

   For example:

 HTTP/1.1 416 Range Not Satisfiable
 Date: Fri, 20 Jan 2012 15:41:54 GMT
 Content-Range: bytes */47022

  Note: Because servers are free to ignore Range, many
  implementations will simply respond with the entire selected
  representation in a 200 (OK) response.  That is partly because
  most clients are prepared to receive a 200 (OK) to complete the
  task (albeit less efficiently) and partly because clients might
  not stop making an invalid partial request until they have
  received a complete representation.  Thus, clients cannot depend
  on receiving a 416 (Range Not Satisfiable) response even when it
  is most appropriate.
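
For anyone who wants to reproduce the two behaviours, a quick sketch with curl
(endpoint, bucket, and object names are placeholders; a public-read object or a
presigned URL is assumed so no S3 signing is involved):

```
# ranged GET against an empty object - RGW returns 416 here, Nginx returns 200
curl -s -o /dev/null -w '%{http_code}\n' \
  -H 'Range: bytes=0-100' https://rgw.example.com/bucket/empty-object
# the same request against a non-empty object returns 206 Partial Content
curl -s -o /dev/null -w '%{http_code}\n' \
  -H 'Range: bytes=0-100' https://rgw.example.com/bucket/some-object
```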


> On 21. 03 2022, at 15:11, Ulrich Klein  wrote:
> 
> With a bit of HTTP background I’d say:
> bytes=0-100 means: first byte up to and including byte #100. The first byte is byte #0.
> On an empty object there is no first byte, i.e. not satisfiable ==> 416
> 
> It should be the same as requesting
> bytes=1-100 on a single-byte object.
> 
> 200 OK should only be correct if the server or a proxy in between doesn't
> support range requests.
> 
> Ciao, Uli
> 
>> On 21. 03 2022, at 14:47, Kai Stian Olstad  wrote:
>> 
>> Hi
>> 
>> Ceph v16.2.6.
>> 
>> Using GET with Range: bytes=0-100, it fails with 416 if the object is
>> 0 bytes.
>> I tried reading the http specification[1] on the subject but did not get any 
>> wiser unfortunately.
>> 
>> I did a test with curl and range against a 0 byte file on Nginx and it 
>> returned 200 OK.
>> 
>> Does anyone know if it's correct to return 416 on a 0-byte object with a range, or
>> should this be considered a bug in Ceph?
>> 
>> 
>> [1] https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.1
>> 
>> -- 
>> Kai Stian Olstad
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephfs default data pool (inode backtrace) no longer a thing?

2022-03-21 Thread Gregory Farnum
The backtraces are written out asynchronously by the MDS to those
objects, so there can be a delay between file creation and when they
appear. In fact I think backtraces only get written when the inode in
question is falling out of the MDS journal, so if you have a
relatively small number of files which are consistently getting
updated, they can go arbitrarily long without the backtrace object
getting written out to RADOS objects.

Side note: I think you should be able to specify your EC pool as the
default data pool for a filesystem, which would prevent you from
needing a separate replicated pool storing those backtraces. Unless we
still have a programmed limit from back when EC pools couldn't handle
omap?
-Greg
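
For context, a sketch of the layout the createfs doc recommends: a replicated default
data pool, with an EC pool added afterwards for the bulk file data (pool, filesystem,
and mount-point names are placeholders):

```
# replicated metadata + replicated default data pool, EC pool added for file data
ceph osd pool create cephfs_metadata
ceph osd pool create cephfs_default_data
ceph fs new myfs cephfs_metadata cephfs_default_data
ceph osd pool create cephfs_ec_data erasure
ceph osd pool set cephfs_ec_data allow_ec_overwrites true
ceph fs add_data_pool myfs cephfs_ec_data
# then direct directories at the EC pool via a file layout
setfattr -n ceph.dir.layout.pool -v cephfs_ec_data /mnt/myfs
```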

On Thu, Mar 17, 2022 at 1:15 PM Vladimir Brik
 wrote:
>
> Never mind. I don't know what changed, but I am seeing
> 0-size objects in the default pool now.
>
> Vlad
>
> On 3/16/22 11:02, Vladimir Brik wrote:
> >  > Are you sure there are no objects?
> > Yes. In the 16.2.7 cluster "ceph df" reports no objects in
> > the default data pool. I am wondering if I need to something
> > special ensure that recovery data is stored in a fast pool
> > and not together with data in the EC pool.
> >
> > In my other cluster that was deployed when Ceph was on
> > version 14 (but upgraded to 15 since), there are a lot of
> > 0-size objects in the default data pool.
> >
> > Vlad
> >
> >
> > On 3/16/22 04:16, Frank Schilder wrote:
> >> Are you sure there are no objects? Here is what it looks
> >> on our FS:
> >>
> >>  NAME ID USED%USED
> >> MAX AVAIL OBJECTS
> >>  con-fs2-meta112 474 MiB
> >> 0.04   1.0 TiB  35687606
> >>  con-fs2-meta213 0 B
> >> 0   1.0 TiB 300163323
> >>
> >> Meta1 is the meta-data pool and meta2 the default data
> >> pool. It shows 0 bytes, but contains 10x the objects that
> >> sit in the meta data pool. These objects contain only meta
> >> data. That's why no actual usage is reported (at least on
> >> mimic).
> >>
> >> The data in this default data pool is a serious challenge
> >> for recovery. I put it on fast SSDs, but the large number
> >> of objects requires aggressive recovery options. With the
> >> default settings recovery of this pool takes longer than
> >> the rebuild of data in the EC data pools on HDD. I also
> >> allocated lots of PGs to it to reduce the object count per
> >> PG. Having this data on fast drives with tuned settings
> >> helps a lot with overall recovery and snaptrim.
> >>
> >> Best regards,
> >> =
> >> Frank Schilder
> >> AIT Risø Campus
> >> Bygning 109, rum S14
> >>
> >> 
> >> From: Vladimir Brik 
> >> Sent: 15 March 2022 20:53:25
> >> To: ceph-users
> >> Subject: [ceph-users] Cephfs default data pool (inode
> >> backtrace) no longer a thing?
> >>
> >> Hello
> >>
> >> https://docs.ceph.com/en/latest/cephfs/createfs/ mentions a
> >> "default data pool" that is used for "inode backtrace
> >> information, which is used for hard link management and
> >> disaster recovery", and "all CephFS inodes have at least one
> >> object in the default data pool".
> >>
> >> I noticed that when I create a volume using "ceph fs volume
> >> create" and then add the EC data pool where my files
> >> actually are, the default pool remains empty (no objects).
> >>
> >> Does this mean that the recommendation from the link above
> >> "If erasure-coded pools are planned for file system data, it
> >> is best to configure the default as a replicated pool" is no
> >> longer applicable, or do I need to configure something to
> >> avoid a performance hit when using EC data pools?
> >>
> >>
> >> Thanks
> >>
> >> Vlad
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RadosGW S3 range on a 0 byte object gives 416 Range Not Satisfiable

2022-03-21 Thread Kai Stian Olstad

Hi

Ceph v16.2.6.

Using GET with Range: bytes=0-100, it fails with 416 if the object is
0 bytes.
I tried reading the HTTP specification[1] on the subject but unfortunately
did not get any wiser.


I did a test with curl and a range against a 0-byte file on Nginx and it
returned 200 OK.


Does anyone know if it's correct to return 416 on a 0-byte object with a range,
or should this be considered a bug in Ceph?



[1] https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35.1

--
Kai Stian Olstad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph OSD's take 10+ minutes to start on reboot

2022-03-21 Thread Igor Fedotov

Hi Chris,

Such meta growth is completely unexpected to me.

And IIUC you've got custom rocksdb settings, right? What's the rationale
for that? I would strongly discourage altering them without a deep
understanding of the consequences...


My current working hypothesis is that DB compaction is not performed 
properly during regular operation and is postponed till OSD restart. 
Let's try to confirm that.


Could you please share the output for the following command:

ceph tell osd.1 bluefs stats

Additionally you might want to share the rocksdb stats; these are to be collected
on an offline OSD:


ceph-kvstore-tool bluestore_kv /var/lib/ceph/osd/ceph-1 stats


Then please set debug-rocksdb & debug-bluestore to 10 and bring up osd.1
again, which apparently will take some time. What's in the OSD log then?


Once restarted - please collect a fresh report from 'bluefs stats' 
command and share the results.
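
A sketch of those steps with the commands spelled out (the daemon id and unit name
assume osd.1 on a systemd-managed, non-containerized deployment):

```
# raise rocksdb/bluestore logging for osd.1 only, then restart it and watch its log
ceph config set osd.1 debug_rocksdb 10
ceph config set osd.1 debug_bluestore 10
systemctl restart ceph-osd@1          # or the equivalent container unit
ceph tell osd.1 bluefs stats          # fresh report once the OSD is back up
```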



And finally I would suggest leaving the other OSDs (as well as the rocksdb
settings) intact for a while, to be able to troubleshoot the issue to the
end.



Thanks,

Igor




On 3/18/2022 5:38 PM, Chris Page wrote:
This certainly seems to be the case as running a manual compaction and 
restarting works.


And `ceph tell osd.0 compact` reduces metadata consumption from ~160GB
(for 380GB worth of data) to just 750MB. Below is a
snippet of my osd stats -


[screenshot of per-OSD metadata usage omitted]

Is this expected behaviour or is my metadata growing abnormally?
OSDs 1, 4 & 11 haven't been restarted in a couple of weeks.


Here are my rocksdb settings -

bluestore_rocksdb_options = 
compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,write_buffer_size=64M,compaction_readahead_size=2M


I hope you can help with this one - I'm at a bit of a loss!

Thanks,
Chris.


On Fri, 18 Mar 2022 at 14:25, Chris Page  wrote:

Hi,

Following up from this, is it just normal for them to take a
while? I notice that once I have restarted an OSD, the 'meta'
value drops right down to empty and slowly builds back up. The
restarted OSDs start with just 1GB or so of metadata and increase
over time to 160/170GB of metadata.

So perhaps the delay is just the rebuilding of this metadata pool?

Thanks,
Chris.


--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSD encryption

2022-03-21 Thread Stephen Smith6
A couple of questions around OSD encryption:


  1.  Was support for an external key management solution ever implemented
(specifically for the OSD encryption key)?
  2.  Is there an easy wrapper key-rotation utility or process for the existing
solution?
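
Regarding question 2, a sketch of where the dm-crypt keys live today for a
ceph-volume-encrypted OSD (an aside, not an answer on external KMS support; the exact
config-key path may vary by release, and <osd-fsid> is a placeholder):

```
# per-OSD LUKS passphrases are kept in the mon config-key store
ceph config-key ls | grep dm-crypt
ceph config-key get dm-crypt/osd/<osd-fsid>/luks
```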

Thanks,
Eric
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io