[ceph-users] add an existing rbd image to iscsi target

2022-12-07 Thread farhad kh
I have a cluster (v17.2.4) deployed with cephadm.
---
[root@ceph-01 ~]# ceph -s
  cluster:
id: c61f6c8a-42a1-11ed-a5f1-000c29089b59
health: HEALTH_OK

  services:
mon:3 daemons, quorum ceph-01.fns.com,ceph-03,ceph-02 (age 109m)
mgr:ceph-01.fns.com.vdoxhd(active, since 109m), standbys:
ceph-03.dpdpgp
mds:2/2 daemons up, 1 standby
osd:3 osds: 3 up (since 109m), 3 in (since 3w)
rbd-mirror: 2 daemons active (2 hosts)

  data:
volumes: 2/2 healthy
pools:   6 pools, 161 pgs
objects: 189 objects, 80 MiB
usage:   131 MiB used, 150 GiB / 150 GiB avail
pgs: 161 active+clean

  io:
client:   1.7 KiB/s rd, 1 op/s rd, 0 op/s wr

---
[root@ceph-01 ~]# ceph orch ls
NAME             PORTS        RUNNING  REFRESHED  AGE   PLACEMENT
alertmanager     ?:9093,9094      1/1  9m ago     9w    count:1
crash                             3/3  9m ago     9w    *
grafana          ?:3000           1/1  9m ago     9w    count:1
iscsi.kube-disk                   2/2  9m ago     114m  ceph-01.fns.com;ceph-02.fns.com;count:2
mds.rook                          3/3  9m ago     6w    ceph-01.fns.com;ceph-02.fns.com;ceph-03.fns.com;count:3
mgr                               2/2  9m ago     9w    count:2
mon                               3/5  9m ago     9w    count:5
node-exporter    ?:9100           3/3  9m ago     9w    *
osd                                 3  9m ago     -
prometheus       ?:9095           1/1  9m ago     9w    count:1
rbd-mirror                        2/2  9m ago     8d    count:2
---

I created RBD images before, and now I want to expose them through an iSCSI
target. I created the iSCSI gateway and an iSCSI target from the dashboard,
but in the dashboard I cannot add any image to the iSCSI target (it says
"There are no images available").

I also tried getting a shell inside the iSCSI gateway container with
(docker exec -it
ceph-c61f6c8a-42a1-11ed-a5f1-000c29089b59-iscsi-kube-disk-ceph-01-bkbhlp-tcmu
bash)
and adding an image with gwcli, but it does not show me any images either.
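
For reference, this is roughly the sequence I expect should work, based on the
ceph-iscsi docs (untested; the image name is just one of my existing ones, and
I am guessing at how gwcli handles an image that already exists and at the
size= value):

# on a cluster node: ceph-iscsi needs the exclusive-lock feature on the image
rbd info k8s-rbd/csi-vol-4953bb63-708f-11ed-b0e9-328684e96ded
rbd feature enable k8s-rbd/csi-vol-4953bb63-708f-11ed-b0e9-328684e96ded exclusive-lock

# inside gwcli: register the image under /disks, then map it to a client
/> cd /disks
/disks> create pool=k8s-rbd image=csi-vol-4953bb63-708f-11ed-b0e9-328684e96ded size=10G
/> cd /iscsi-targets/iqn.2001-07.com.ceph:1670310941655/hosts/<client-iqn>
/iscsi-target...> disk add k8s-rbd/csi-vol-4953bb63-708f-11ed-b0e9-328684e96ded

Could it be that these csi-* images are being skipped because they carry
features (for example journaling from rbd-mirror) that ceph-iscsi does not
support?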


---
[root@ceph-01 ~]# rbd ls --pool k8s-rbd
csi-vol-4953bb63-708f-11ed-b0e9-328684e96ded
csi-vol-93d55bd9-6931-11ed-8a99-02e8c67c703f
csi-vol-93ea9aad-6931-11ed-8a99-02e8c67c703f
csi-vol-aeeb71af-6eab-11ed-9ea2-7aaa7c682140
csi-vol-c32fbb4c-6eb3-11ed-9ea2-7aaa7c682140
csi-vol-d925df21-6eb3-11ed-9ea2-7aaa7c682140
csi-vol-e9865e15-6931-11ed-8a99-02e8c67c703f
csi-vol-ec2e8490-6931-11ed-8a99-02e8c67c703f
csi-vol-f1725e80-6931-11ed-8a99-02e8c67c703f
csi-vol-f6de7ea8-6931-11ed-8a99-02e8c67c703f


[root@ceph-01 /]# gwcli
Warning: Could not load preferences file /root/.gwcli/prefs.bin.
/>
*  /  @gateways  @host-groups  @hosts  cluster/  disks/  iscsi-targets/
bookmarks  cd  exit  export  get  goto  help  info  ls  pwd  set
/> ls
o- / ............................................................................ [...]
  o- cluster .............................................................. [Clusters: 1]
  | o- ceph ................................................................. [HEALTH_OK]
  |   o- pools ................................................................. [Pools: 6]
  |   | o- .mgr .................... [(x3), Commit: 0.00Y/49749484K (0%), Used: 1356K]
  |   | o- cephfs_data ........... [(x3), Commit: 0.00Y/49749484K (0%), Used: 138053b]
  |   | o- cephfs_metadata ....... [(x3), Commit: 0.00Y/49749484K (0%), Used: 174984b]
  |   | o- k8s-cephfs-data ......... [(x3), Commit: 0.00Y/49749484K (0%), Used: 0.00Y]
  |   | o- k8s-cephfs-metadata ... [(x3), Commit: 0.00Y/49749484K (0%), Used: 237318b]
  |   | o- k8s-rbd ............. [(x3), Commit: 0.00Y/49749484K (0%), Used: 30295498b]
  |   o- topology ..................................................... [OSDs: 3,MONs: 3]
  o- disks .......................................................... [0.00Y, Disks: 0]
  o- iscsi-targets .................................. [DiscoveryAuth: None, Targets: 1]
    o- iqn.2001-07.com.ceph:1670310941655 .................... [Auth: None, Gateways: 2]
      o- disks ............................................................. [Disks: 0]
      o- gateways .............................................................. [Up: 2/

[ceph-users] recovery for node disaster

2023-02-12 Thread farhad kh
I have a cluster of three nodes, with three replicas per pool spread across
the cluster nodes.
-
HOST ADDR LABELS  STATUS
apcepfpspsp0101  192.168.114.157  _admin mon
apcepfpspsp0103  192.168.114.158  mon _admin
apcepfpspsp0105  192.168.114.159  mon _admin
3 hosts in cluster
-
# ceph osd crush rule dump
[
{
"rule_id": 0,
"rule_name": "replicated_rule",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
}
]
-
epoch 1033
fsid 9c35e594-2392-11ed-809a-005056ae050c
created 2022-08-24T09:53:36.481866+
modified 2023-02-12T18:57:34.447536+
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 51
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client luminous
require_osd_release quincy
stretch_mode_enabled false
pool 1 '.mgr' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 21 flags
hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application mgr
pool 2 'k8s-rbd' replicated size 3 min_size 2 crush_rule 0 object_hash
rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 541 lfor 0/0/44
flags hashpspool,selfmanaged_snaps max_bytes 75161927680 stripe_width 0
application rbd
pool 3 'k8s-cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 16 pgp_num 16 autoscale_mode on last_change 543
lfor 0/0/57 flags hashpspool max_bytes 5368709120 stripe_width 0
pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 4 'k8s-cephfs_data' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 32 pgp_num 32 autoscale_mode on last_change 542
lfor 0/0/57 flags hashpspool max_bytes 32212254720 stripe_width 0
application cephfs
---

Is it possible to recover the data if two nodes, with all of their physical
disks, are lost for any reason?
What is the maximum number of failures the cluster can tolerate, assuming the
default settings?
What changes should I make to increase fault tolerance?
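
For context, this is how I am checking the relevant pool settings, and the
knob I assume is involved (I have not changed anything yet):

ceph osd pool get k8s-rbd size       # number of replicas (3)
ceph osd pool get k8s-rbd min_size   # minimum replicas required to serve I/O (2)
# with only one of the three hosts left, PGs drop below min_size and I/O stops;
# lowering min_size to 1 would let a single surviving copy serve I/O, but as
# far as I understand that is risky and generally not recommended:
ceph osd pool set k8s-rbd min_size 1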
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Set the Quality of Service configuration.

2023-04-02 Thread farhad kh
How can I set an IO quota or a read/write rate limit for an erasure-coded
pool in Ceph?
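
For RBD images on top of the pool, I assume the librbd QoS settings are what I
need, something like the following (untested; <ec-data-pool> is a placeholder
for my pool name and the numbers are arbitrary):

rbd config pool set <ec-data-pool> rbd_qos_read_iops_limit 1000
rbd config pool set <ec-data-pool> rbd_qos_write_iops_limit 500
rbd config pool set <ec-data-pool> rbd_qos_bps_limit 104857600   # ~100 MiB/s

But I am not sure whether this is the right mechanism, or whether there is a
pool-level QoS that also covers RGW/CephFS data on the same pool.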
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] new install or change default registry to private registry

2023-05-17 Thread farhad kh
I tried to deploy the cluster from a private registry and used this command:
{cephadm bootstrap --mon-ip 10.10.128.68 --registry-url my.registry.xo
--registry-username myuser1 --registry-password mypassword1
--dashboard-password-noupdate --initial-dashboard-password P@ssw0rd }
I even changed the 'Default container images' section in the cephadm script
to my private registry:
# Default container images
-
DEFAULT_IMAGE = 'my.registry.xo/ceph/ceph:v17'
DEFAULT_IMAGE_IS_MASTER = False
DEFAULT_IMAGE_RELEASE = 'quincy'
DEFAULT_PROMETHEUS_IMAGE = 'my.registry.xo/ceph/prometheus:v2.33.4'
DEFAULT_LOKI_IMAGE = 'my.registry.xo/ceph/loki:2.4.0'
DEFAULT_PROMTAIL_IMAGE = 'my.registry.xo/ceph/promtail:2.4.0'
DEFAULT_NODE_EXPORTER_IMAGE = 'my.registry.xo/ceph/node-exporter:v1.3.1'
DEFAULT_ALERT_MANAGER_IMAGE = 'my.registry.xo/ceph/alertmanager:v0.23.0'
DEFAULT_GRAFANA_IMAGE = 'my.registry.xo/ceph/ceph-grafana:8.3.5'
DEFAULT_HAPROXY_IMAGE = 'my.registry.xo/ceph/haproxy:2.3'
DEFAULT_KEEPALIVED_IMAGE = 'my.registry.xo/ceph/keepalived:2.1.5'
DEFAULT_SNMP_GATEWAY_IMAGE = 'my.registry.xo/ceph/snmp-notifier:v1.2.1'
DEFAULT_REGISTRY = 'my.registry.xo'   # normalize unqualified digests to this
#
--
But when I deploy a new cluster, cephadm still tries to pull prometheus,
node-exporter and ceph-grafana from quay.io.
Why does this happen, and what should I do to change it? Even the default
registry is not changed:
#ceph config get mgr mgr/cephadm/default_registry
docker.io

Can someone guide me, what should I do?
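
For reference, the workaround I am considering (not verified yet) is to
override the registry and the monitoring images in the mgr configuration
after bootstrap:

ceph config set mgr mgr/cephadm/default_registry my.registry.xo
ceph config set mgr mgr/cephadm/container_image_prometheus my.registry.xo/ceph/prometheus:v2.33.4
ceph config set mgr mgr/cephadm/container_image_node_exporter my.registry.xo/ceph/node-exporter:v1.3.1
ceph config set mgr mgr/cephadm/container_image_grafana my.registry.xo/ceph/ceph-grafana:8.3.5
ceph config set mgr mgr/cephadm/container_image_alertmanager my.registry.xo/ceph/alertmanager:v0.23.0

But I would prefer a way to do this at bootstrap time.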
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] fail delete "daemon(s) not managed by cephadm"

2023-05-27 Thread farhad kh
Hi everyone,
I have a warning: `1 stray daemon(s) not managed by cephadm`

# ceph health detail
HEALTH_WARN 1 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 1 stray daemon(s) not managed by cephadm
stray daemon mon.apcepfpspsp0111 on host apcepfpspsp0111 not
managed by cephadm

and in detail :

# ceph crash info
2023-01-17T18:25:19.601062Z_4affd3b9-f486-4ae5-801b-ede231e2b624
{
"archived": "2023-01-18 19:31:58.200096",
"assert_condition": "abort",
"assert_file":
"/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.5/rpm/el8/BUILD/ceph-17.2.5/src/mon/MonitorDBStore.h",
"assert_func": "int
MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef)",
"assert_line": 355,
"assert_msg":
"/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.5/rpm/el8/BUILD/ceph-17.2.5/src/mon/MonitorDBStore.h:
In function 'int
MonitorDBStore::apply_transaction(MonitorDBStore::TransactionRef)'
thread 7fabeb737700 time
2023-01-17T18:25:19.568046+\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.5/rpm/el8/BUILD/ceph-17.2.5/src/mon/MonitorDBStore.h:
355: ceph_abort_msg(\"failed to write to db\")\n",
"assert_thread_name": "ms_dispatch",
"backtrace": [
"/lib64/libpthread.so.0(+0x12cf0) [0x7fabf5949cf0]",
"gsignal()",
"abort()",
"(ceph::__ceph_abort(char const*, int, char const*,
std::__cxx11::basic_string,
std::allocator > const&)+0x197) [0x7fabf799eb5f]",

"(MonitorDBStore::apply_transaction(std::shared_ptr)+0x88f)
[0x55fe9cc3a68f]",
"(Elector::persist_epoch(unsigned int)+0x184) [0x55fe9ccc64c4]",
"(ElectionLogic::bump_epoch(unsigned int)+0x5d) [0x55fe9d5d]",
"(ElectionLogic::propose_classic_prefix(int, unsigned
int)+0x3c7) [0x55fe9ccce8e7]",
"(ElectionLogic::propose_classic_handler(int, unsigned
int)+0x2b) [0x55fe9cccef9b]",
"(Elector::handle_propose(boost::intrusive_ptr)+0x6f5)
[0x55fe9ccc6175]",
"(Elector::dispatch(boost::intrusive_ptr)+0xcdb)
[0x55fe9ccc756b]",
"(Monitor::dispatch_op(boost::intrusive_ptr)+0x11c2)
[0x55fe9cc0aeb2]",
"(Monitor::_ms_dispatch(Message*)+0x406) [0x55fe9cc0b5e6]",
"(Dispatcher::ms_dispatch2(boost::intrusive_ptr
const&)+0x5d) [0x55fe9cc3bdad]",
"(Messenger::ms_deliver_dispatch(boost::intrusive_ptr
const&)+0x478) [0x7fabf7c18c88]",
"(DispatchQueue::entry()+0x50f) [0x7fabf7c160cf]",
"(DispatchQueue::DispatchThread::entry()+0x11) [0x7fabf7cdd8f1]",
"/lib64/libpthread.so.0(+0x81ca) [0x7fabf593f1ca]",
"clone()"
],
"ceph_version": "17.2.5",
"crash_id":
"2023-01-17T18:25:19.601062Z_4affd3b9-f486-4ae5-801b-ede231e2b624",
"entity_name": "mon.apcepfpspsp0101",
"os_id": "centos",
"os_name": "CentOS Stream",
"os_version": "8",
"os_version_id": "8",
"process_name": "ceph-mon",
"stack_sig":
"74f29b024a3ec0f1395579505c5728e8ad5fd5a77c4a92437d696fa6fd1a8e20",
"timestamp": "2023-01-17T18:25:19.601062Z",
"utsname_hostname": "apcepfpspsp0101",
"utsname_machine": "x86_64",
"utsname_release": "5.4.17-2136.310.7.1.el8uek.x86_64",
"utsname_sysname": "Linux",
"utsname_version": "#2 SMP Wed Aug 17 15:14:08 PDT 2022"
}

But I have 3 instances of the mon daemon:

NAME               PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager       ?:9093,9094      3/3  2m ago     9M   label:mon
crash                               6/6  9m ago     9M   *
grafana            ?:3000           3/3  2m ago     4M   count-per-host:1;label:mon
mds.rook                            3/3  2m ago     8M   apcepfpspsp0101;apcepfpspsp0103;apcepfpspsp0105;count:3
mgr                                 2/2  2m ago     9M   count:2
mon                                 3/3  2m ago     2h   label:mon
node-exporter      ?:9100           6/6  9m ago     9M   *
osd                                   3  2m ago     -
osd.cost_capacity                    15  9m ago     4M   *
prometheus         ?:9095           3/3  2m ago     9M   label:mon

When I try to remove it, I get:

# ceph orch daemon rm mon.apcepfpspsp0111 --force
Error EINVAL: Unable to find daemon(s) ['mon.apcepfpspsp0111']

I don't have such a daemon

mon.apcepfpspsp0101  apcepfpspsp0101   running (88m)
mon.apcepfpspsp0103  apcepfpspsp0103   running (87m)
mon.apcepfpspsp0105  apcepfpspsp0105   running (87m)

What should I do to remove this warning?
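
For completeness, these are the things I was planning to try next (please tell
me if they are wrong or unsafe):

# refresh the cephadm inventory, in case it is just stale
ceph mgr fail

# if a leftover mon still exists on the host itself, remove it there
# (run on host apcepfpspsp0111; <cluster-fsid> is my cluster's fsid)
cephadm ls | grep mon
cephadm rm-daemon --name mon.apcepfpspsp0111 --fsid <cluster-fsid> --force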
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] change user root to non-root after deploy cluster by cephadm

2023-06-07 Thread farhad kh
Hi guys,
I deployed the Ceph cluster with cephadm as the root user, but I need to
change the user to a non-root user.
These are the steps I did:
1- Created a non-root user on all hosts with passwordless sudo access:
`$USER_NAME ALL = (root) NOPASSWD:ALL`
2- Generated an SSH key pair and used ssh-copy-id to add it to all hosts:
`
ssh-keygen (accept the default file name and leave the passphrase empty)
ssh-copy-id USER_NAME@HOST_NAME
`
3- Ran `ceph cephadm set-user`, but I get an "Error EINVAL: ssh connection to
root@hostname failed" error.
How do I deal with this issue?
What should be done to change the user to non-root?
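
Is the missing piece perhaps that the cluster's own SSH key also has to be
copied to the new user on every host before switching? I.e. something like
this (untested):

ceph cephadm get-pub-key > ~/ceph.pub
ssh-copy-id -f -i ~/ceph.pub USER_NAME@HOST_NAME   # repeat for every host
ceph cephadm set-user USER_NAME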
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] stray daemons not managed by cephadm

2023-06-12 Thread farhad kh
I deployed a Ceph cluster with 8 nodes (v17.2.6), and after adding all of the
hosts, Ceph created 5 mon daemon instances.
I tried to decrease that to 3 instances with `ceph orch apply mon
--placement=label:mon,count:3`. It worked, but after that I got the error
"2 stray daemons not managed by cephadm".
And every time I tried to deploy and delete other instances, this number
increased; now I have 7 daemons that are not managed by cephadm.
How do I deal with this issue?


[root@opcsdfpsbpp0201 ~]# ceph -s
  cluster:
id: 79a2627c-0821-11ee-a494-00505695c58c
health: HEALTH_WARN
16 stray daemon(s) not managed by cephadm

  services:
mon: 3 daemons, quorum opcsdfpsbpp0201,opcsdfpsbpp0205,opcsdfpsbpp0203
(age 2m)
mgr: opcsdfpsbpp0201.vttwxa(active, since 27h), standbys:
opcsdfpsbpp0207.kzxepm
mds: 1/1 daemons up, 2 standby
osd: 74 osds: 74 up (since 26h), 74 in (since 26h)

  data:
volumes: 1/1 healthy
pools:   6 pools, 6 pgs
objects: 2.10k objects, 8.1 GiB
usage:   28 GiB used, 148 TiB / 148 TiB avail
pgs: 6 active+clean

  io:
client:   426 B/s rd, 0 op/s rd, 0 op/s wr

[root@opcsdfpsbpp0201 ~]# ceph health detail
HEALTH_WARN 16 stray daemon(s) not managed by cephadm
[WRN] CEPHADM_STRAY_DAEMON: 16 stray daemon(s) not managed by cephadm
stray daemon mon.opcsdfpsbpp0207 on host opcsdfpsbpp0203 not managed by cephadm
stray daemon mon.opcsdfpsbpp0209 on host opcsdfpsbpp0203 not managed by cephadm
stray daemon mon.opcsdfpsbpp0211 on host opcsdfpsbpp0203 not managed by cephadm
stray daemon mon.opcsdfpsbpp0213 on host opcsdfpsbpp0203 not managed by cephadm
stray daemon mon.opcsdfpsbpp0207 on host opcsdfpsbpp0205 not managed by cephadm
stray daemon mon.opcsdfpsbpp0209 on host opcsdfpsbpp0205 not managed by cephadm
stray daemon mon.opcsdfpsbpp0211 on host opcsdfpsbpp0205 not managed by cephadm
stray daemon mon.opcsdfpsbpp0213 on host opcsdfpsbpp0205 not managed by cephadm
stray daemon mon.opcsdfpsbpp0213 on host opcsdfpsbpp0207 not managed by cephadm
stray daemon mon.opcsdfpsbpp0207 on host opcsdfpsbpp0209 not managed by cephadm
stray daemon mon.opcsdfpsbpp0209 on host opcsdfpsbpp0209 not managed by cephadm
stray daemon mon.opcsdfpsbpp0209 on host opcsdfpsbpp0211 not managed by cephadm
stray daemon mon.opcsdfpsbpp0215 on host opcsdfpsbpp0211 not managed by cephadm
stray daemon mon.opcsdfpsbpp0211 on host opcsdfpsbpp0213 not managed by cephadm
stray daemon mon.opcsdfpsbpp0209 on host opcsdfpsbpp0215 not managed by cephadm
stray daemon mon.opcsdfpsbpp0213 on host opcsdfpsbpp0215 not managed by cephadm
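
In case it matters, what I was going to check next is whether these old mons
are still present in the monmap, and remove the leftovers from there (not
sure this is the right approach):

ceph mon dump                        # see which mons the monmap still lists
ceph mon remove opcsdfpsbpp0207      # for each mon that should no longer exist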
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephfs mount with kernel driver

2023-06-19 Thread farhad kh
I noticed that in my scenario, when I mount CephFS via the kernel module,
writes go directly to only one or three of the OSDs, and the client's write
speed is higher than the speed of replication and autoscaling. This causes
the write operation to stop as soon as those OSDs are filled, with a "no free
space available" error. What should be done to solve this problem? Is there a
way to increase the speed of scaling or of moving objects between OSDs? Or a
way to mount CephFS that does not have these problems?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] autocaling not work and active+remapped+backfilling

2023-06-19 Thread farhad kh
Hi,
I have a problem with Ceph 17.2.6 (CephFS with MDS daemons) and I am seeing
unusual behavior.
I created a data pool with the default CRUSH rule, but the data is stored on
only 3 specific OSDs while the other OSDs stay clean.
PG autoscaling is also active, but the PG count does not change as the pool
gets bigger.
I also tried increasing it manually, but the problem was not solved and I got
the error "pgs are not balanced across osds".
How do I solve this problem? Is this a bug? I did not have this problem in
previous versions.

I have since found the cause: there were several identical CRUSH rules in the
CRUSH map, all with

step chooseleaf firstn 0 type host

I think this confused the balancer and the autoscaler, and the output of
`ceph osd pool autoscale-status` was empty. After removing the extra CRUSH
rules, autoscaling started running, but moving data from the full OSDs to the
empty ones is slow. To prioritize the other OSDs I am reducing the weight of
the filled ones with `ceph osd reweight-by-utilization`, and I hope this
works.
Is there a solution that makes the autoscaling and backfilling of placement
groups faster?
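
The knobs I am currently looking at for speeding up recovery/backfill are
these; please correct me if this is the wrong direction on Quincy (17.x uses
the mClock scheduler, as far as I understand):

ceph config set osd osd_mclock_profile high_recovery_ops   # favour recovery over client I/O
# the classic knobs, which I believe mClock overrides unless explicitly allowed:
ceph config set osd osd_max_backfills 4
ceph config set osd osd_recovery_max_active 8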

---

[root@opcsdfpsbpp0201 ~]# ceph osd crush rule dump
[
{
"rule_id": 0,
"rule_name": "replicated_rule",
"type": 1,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 1,
"rule_name": "r3-host",
"type": 1,
"steps": [
{
"op": "take",
"item": -2,
"item_name": "default~hdd"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 2,
"rule_name": "r3",
"type": 1,
"steps": [
{
"op": "take",
"item": -2,
"item_name": "default~hdd"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
}
]



# ceph osd status | grep back
23  opcsdfpsbpp0211  1900G  147G  0  0  0  0  backfillfull,exists,up
48  opcsdfpsbpp0201  1900G  147G  0  0  0  0  backfillfull,exists,up
61  opcsdfpsbpp0205  1900G  147G  0  0  0  0  backfillfull,exists,up

--

Every 2.0s: ceph -s

 opcsdfpsbpp0201: Sun Jun 18 11:44:29 2023

  cluster:
id: 79a2627c-0821-11ee-a494-00505695c58c
health: HEALTH_WARN
3 backfillfull osd(s)
6 pool(s) backfillfull

  services:
mon: 3 daemons, quorum opcsdfpsbpp0201,opcsdfpsbpp0205,opcsdfpsbpp0203
(age 6d)
mgr: opcsdfpsbpp0201.vttwxa(active, since 5d), standbys:
opcsdfpsbpp0205.tpodbs, opcsdfpsbpp0203.jwjkcl
mds: 1/1 daemons up, 2 standby
osd: 74 osds: 74 up (since 7d), 74 in (since 7d); 107 remapped pgs

  data:
volumes: 1/1 healthy
pools:   6 pools, 359 pgs
objects: 599.64k objects, 2.2 TiB
usage:   8.1 TiB used, 140 TiB / 148 TiB avail
pgs: 923085/1798926 objects misplaced (51.313%)
 252 active+clean
 87  active+remapped+backfill_wait
 20  active+remapped+backfilling

  io:
client:   255 B/s rd, 0 op/s rd, 0 op/s wr
recovery: 33 MiB/s, 8 objects/s

  progress:
Global Recovery Event (5h)
  [===.] (remaining: 2h)
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] osd memory target not work

2023-06-20 Thread farhad kh
When I set osd_memory_target to limit the memory usage of an OSD, I expected
this value to be applied as a limit on the OSD container. But with the
`docker stats` command no such limit is visible. Is my understanding of this
process wrong?
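
My understanding is that osd_memory_target is a target for the OSD's internal
cache autotuning rather than a hard container/cgroup limit, which would
explain why docker stats does not show it, but please correct me. These are
the commands I am using to check the effective value:

ceph config get osd.12 osd_memory_target    # value stored in the config database
ceph config show osd.12 osd_memory_target   # value the running osd.12 actually uses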
---

[root@opcsdfpsbpp0201 ~]# ceph orch ps | grep osd.12
osd.12  opcsdfpsbpp0201   running (9d)  5m ago  9d  1205M  1953M  17.2.6  c9a1062f7289  bf27cfe16046
[root@opcsdfpsbpp0201 ~]# docker stats | grep osd
1253766d6a78   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-48   0.05%   2.237GiB / 7.732GiB   28.93%   0B / 0B   86.8GB / 562GB    63
2bc012e5c604   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-67   0.20%   727MiB / 7.732GiB      9.18%   0B / 0B   37.5GB / 1.29TB   63
dc0bf068050b   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-62   0.11%   360.5MiB / 7.732GiB    4.55%   0B / 0B   125MB / 1.85GB    63
c5f119a37652   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-55   0.12%   312.5MiB / 7.732GiB    3.95%   0B / 0B   86.6MB / 1.66GB   63
7f0b7b61807d   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-5    0.11%   299.4MiB / 7.732GiB    3.78%   0B / 0B   119MB / 1.6GB     63
dadffc77f7b6   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-40   0.11%   274MiB / 7.732GiB      3.46%   0B / 0B   110MB / 1.5GB     63
e439e58d907e   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-34   0.12%   355.9MiB / 7.732GiB    4.49%   0B / 0B   125MB / 1.78GB    63
5e500e2197d6   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-25   0.11%   273.3MiB / 7.732GiB    3.45%   0B / 0B   128MB / 1.55GB    63
a63709567669   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-19   0.11%   714.6MiB / 7.732GiB    9.03%   0B / 0B   89.8MB / 167GB    63
bf27cfe16046   ceph-79a2627c-0821-11ee-a494-00505695c58c-osd-12   0.16%   1.177GiB / 7.732GiB   15.23%   0B / 0B   40.8GB / 644GB    63
---
# ceph orch ps | grep osd | grep opcsdfpsbpp0201
osd.5     opcsdfpsbpp0201   running (9d)  6m ago  9d   298M  1953M  17.2.6  c9a1062f7289  7f0b7b61807d
osd.12    opcsdfpsbpp0201   running (9d)  6m ago  9d  1205M  1953M  17.2.6  c9a1062f7289  bf27cfe16046
osd.19    opcsdfpsbpp0201   running (9d)  6m ago  9d   704M  1953M  17.2.6  c9a1062f7289  a63709567669
osd.25    opcsdfpsbpp0201   running (9d)  6m ago  9d   273M  1953M  17.2.6  c9a1062f7289  5e500e2197d6
osd.34    opcsdfpsbpp0201   running (9d)  6m ago  9d   355M  1953M  17.2.6  c9a1062f7289  e439e58d907e
osd.40    opcsdfpsbpp0201   running (9d)  6m ago  9d   273M  1953M  17.2.6  c9a1062f7289  dadffc77f7b6
osd.48    opcsdfpsbpp0201   running (4h)  6m ago  9d  2290M  1953M  17.2.6  c9a1062f7289  1253766d6a78
osd.55    opcsdfpsbpp0201   running (9d)  6m ago  9d   312M  1953M  17.2.6  c9a1062f7289  c5f119a37652
osd.62    opcsdfpsbpp0201   running (9d)  6m ago  9d   359M  1953M  17.2.6  c9a1062f7289  dc0bf068050b
osd.67    opcsdfpsbpp0201   running (9d)  6m ago  9d   727M  1953M  17.2.6  c9a1062f7289  2bc012e5c604
--

 #ceph config get mgr  osd_memory_target
204800
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] copy file in nfs over cephfs error "error: error in file IO (code 11)"

2023-06-25 Thread farhad kh
Hi everybody,

We have a problem with the NFS Ganesha load balancer.
When we use rsync to copy files from another share to the Ceph NFS share
path, we get this error:
`rsync -rav /mnt/elasticsearch/newLogCluster/acr-202*
/archive/Elastic-v7-archive`

rsync: close failed on "/archive/Elastic-v7-archive/" :
Input/output error (5)
rsync error: error in file IO (code 11) at receiver.c(586) [Receiver=3.1.3]"

We use an ingress service for load balancing the NFS service, and no other
problems are observed in the cluster.
Below is information about the pool, the volume path and the quota.

10.20.32.161:/volumes/arch-1/arch   30T  5.0T   26T  17% /archive

# ceph osd pool get-quota arch-bigdata-data
   quotas for pool 'arch-bigdata-data':
   max objects: N/A
   max bytes  : 30 TiB  (current num bytes: 5488192308978 bytes)

---
# ceph fs subvolume info   arch-bigdata arch arch-1
{
"atime": "2023-06-11 13:32:22",
"bytes_pcent": "16.64",
"bytes_quota": 32985348833280,
"bytes_used": 5488566602388,
"created_at": "2023-06-11 13:32:22",
"ctime": "2023-06-25 10:45:35",
"data_pool": "arch-bigdata-data",
"features": [
"snapshot-clone",
"snapshot-autoprotect",
"snapshot-retention"
],
"gid": 0,
"mode": 16877,
"mon_addrs": [
"10.20.32.153:6789",
"10.20.32.155:6789",
"10.20.32.154:6789"
],
"mtime": "2023-06-25 10:38:48",
"path": "/volumes/arch-1/arch/f246a31b-7103-41b9-8005-63d00efe88e4",
"pool_namespace": "",
"state": "complete",
"type": "subvolume",
"uid": 0
}
Has anyone experienced this error? What do you suggest to solve it?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] dashboard for rgw NoSuchKey

2023-07-03 Thread farhad kh
I deployed the RGW service and the default pools were created automatically,
but I get an error in the dashboard:
``
Error connecting to Object Gateway: RGW REST API request failed with
default 404 status code","HostId":"736528-default-default"}')
``
There is a dashboard user, but I created the bucket manually:

# radosgw-admin user info --uid=dashboard
{
"user_id": "dashboard",
"display_name": "Ceph Dashboard",
"email": "",
"suspended": 0,
"max_buckets": 1000,
"subusers": [],
"keys": [
{
"user": "dashboard",
"access_key": "C8YG708VBA3M3AAJW2U2",
"secret_key": "NpkmIZ5JJVnu3EFa0ytv5vO64NGttK9ks7A3gEQP"
}
],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"system": "true",
"default_placement": "",
"default_storage_class": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"temp_url_keys": [],
"type": "rgw",
"mfa_ids": []
}

# radosgw-admin buckets list
[
"dashboard"
]

How can I solve the problem?
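
For reference, what I was planning to try next (not yet verified) is to let
the dashboard regenerate its RGW credentials:

ceph dashboard set-rgw-credentials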
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] dashboard ERROR exception

2023-10-30 Thread farhad kh
I use Ceph 17.2.6. I deployed two separate RGW realms, each with its own
zonegroup and zone, and enabled dashboard access for both object gateways; I
can create users, buckets, etc. But when I try to create a bucket in one of
the object gateways, I get the error below:


debug 2023-10-29T12:19:50.697+ 7fd203a26700  0 [dashboard ERROR
rest_client] RGW REST API failed PUT req status: 400
debug 2023-10-29T12:19:50.697+ 7fd203a26700  0 [dashboard ERROR
exception] Dashboard Exception
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/controllers/rgw.py", line 304, in
create
lock_enabled)
  File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 534, in
func_wrapper
**kwargs)
  File "/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 563, in
create_bucket
return request(data=data, headers=headers)
  File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 323, in __call__
data, raw_content, headers)
  File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 452, in
do_request
resp.content)
dashboard.rest_client.RequestException: RGW REST API failed request with
status code 400
(b'{"Code":"InvalidLocationConstraint","Message":"The specified
location-constr'
 b'aint is not
valid","BucketName":"farhad2","RequestId":"tx03fa9d80c50a79d'
 b'b6-00653e4de6-285b3-test","HostId":"285b3-test-test"}')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 47, in
dashboard_exception_handler
return handler(*args, **kwargs)
  File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in
__call__
return self.callable(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/_base_controller.py",
line 258, in inner
ret = func(*args, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/_rest_controller.py",
line 191, in wrapper
return func(*vpath, **params)
  File "/usr/share/ceph/mgr/dashboard/controllers/rgw.py", line 315, in
create
raise DashboardException(e, http_status_code=500, component='rgw')
dashboard.exceptions.DashboardException: RGW REST API failed request with
status code 400
(b'{"Code":"InvalidLocationConstraint","Message":"The specified
location-constr'
 b'aint is not
valid","BucketName":"farhad2","RequestId":"tx03fa9d80c50a79d'
 b'b6-00653e4de6-285b3-test","HostId":"285b3-test-test"}')
debug 2023-10-29T12:19:50.701+ 7fd203a26700  0 [dashboard INFO request]
[192.168.0.1:55833] [POST] [500] [0.031s] [admin] [252.0B] /api/rgw/bucket
debug 2023-10-29T12:19:50.713+ 7fd204a28700  0 [dashboard ERROR
rest_client] RGW REST API failed GET req status: 404
debug 2023-10-29T12:19:50.715+ 7fd204a28700  0 [dashboard ERROR
exception] Dashboard Exception
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/controllers/rgw.py", line 145, in
proxy
result = instance.proxy(method, path, params, None)
  File "/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 513, in
proxy
params, data)
  File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 534, in
func_wrapper
**kwargs)
  File "/usr/share/ceph/mgr/dashboard/services/rgw_client.py", line 507, in
_proxy_request
raw_content=True)
  File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 323, in __call__
data, raw_content, headers)
  File "/usr/share/ceph/mgr/dashboard/rest_client.py", line 452, in
do_request
resp.content)
dashboard.rest_client.RequestException: RGW REST API failed request with
status code 404
(b'{"Code":"NoSuchBucket","RequestId":"tx086cdcd9547b301e2-00653e4de6-285b3'
 b'-test","HostId":"285b3-test-test"}')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 47, in
dashboard_exception_handler
return handler(*args, **kwargs)
  File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in
__call__
return self.callable(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/_base_controller.py",
line 258, in inner
ret = func(*args, **kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/_rest_controller.py",
line 191, in wrapper
return func(*vpath, **params)
  File "/usr/share/ceph/mgr/dashboard/controllers/rgw.py", line 275, in get
result = self.proxy(daemon_name, 'GET', 'bucket', {'bucket': bucket})
  File "/usr/share/ceph/mgr/dashboard/controllers/rgw.py", line 151, in
proxy
raise DashboardException(e, http_status_code=http_status_code,
component='rgw')
dashboard.exceptions.DashboardException: RGW REST API failed request with
status code 404
(b'{"Code":"NoSuchBucket","RequestId":"tx086cdcd9547b301e2-00653e4de6-285b3'
 b'-test","HostId":"285b3-test-test"}')
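
What I was going to test next is whether creating the same bucket directly
over the S3 API works when the zonegroup is passed explicitly as the location
constraint, something like this (the endpoint and zonegroup name are
placeholders):

aws --endpoint-url http://<rgw-endpoint> s3api create-bucket --bucket farhad2 \
    --create-bucket-configuration LocationConstraint=<zonegroup-name>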
--
My cluster has two realms:
1) rgw-realm = test, rgw-zon

[ceph-users] cephadm file "/sbin/cephadm", line 10098 PK ^

2023-12-18 Thread farhad kh
Hello, I downloaded cephadm from the link below:
https://download.ceph.com/rpm-18.2.0/el8/noarch/
I changed the image addresses to the address of my private registry,
```
DEFAULT_IMAGE = 'opkbhfpspsp0101.fns/ceph/ceph:v18'
DEFAULT_IMAGE_IS_MAIN = False
DEFAULT_IMAGE_RELEASE = 'reef'
DEFAULT_PROMETHEUS_IMAGE = 'opkbhfpspsp0101.fns/ceph/prometheus:v2.43.0'
DEFAULT_LOKI_IMAGE = 'opkbhfpspsp0101.fns/ceph/loki:2.4.0'
DEFAULT_PROMTAIL_IMAGE = 'opkbhfpspsp0101.fns/ceph/promtail:2.4.0'
DEFAULT_NODE_EXPORTER_IMAGE = 'opkbhfpspsp0101.fns/ceph/node-exporter:v1.5.0'
DEFAULT_ALERT_MANAGER_IMAGE = 'opkbhfpspsp0101.fns/ceph/alertmanager:v0.25.0'
DEFAULT_GRAFANA_IMAGE = 'opkbhfpspsp0101.fns/ceph/ceph-grafana:9.4.7'
DEFAULT_HAPROXY_IMAGE = 'opkbhfpspsp0101.fns/ceph/haproxy:2.3'
DEFAULT_KEEPALIVED_IMAGE = 'opkbhfpspsp0101.fns/ceph/keepalived:2.2.4'
DEFAULT_SNMP_GATEWAY_IMAGE = 'opkbhfpspsp0101.fns/ceph/snmp-notifier:v1.2.1'
DEFAULT_ELASTICSEARCH_IMAGE = 'opkbhfpspsp0101.fns/ceph/elasticsearch:6.8.23'
DEFAULT_JAEGER_COLLECTOR_IMAGE = 'opkbhfpspsp0101.fns/ceph/jaeger-collector:1.29'
DEFAULT_JAEGER_AGENT_IMAGE = 'opkbhfpspsp0101.fns/ceph/jaeger-agent:1.29'
DEFAULT_JAEGER_QUERY_IMAGE = 'opkbhfpspsp0101.fns/ceph/jaeger-query:1.29'
DEFAULT_REGISTRY = 'opkbhfpspsp0101.fns'   # normalize unqualified digests to this
```
but I encounter this error:
`
  File "/sbin/cephadm", line 10098
PK
  ^
`
Also, there is strange content at the beginning of the cephadm file:
PK^C^D^T^@^@^@^@^@¥<9a>^CW<8e>^[º^×Ü^E^@×Ü^E^@^K^@^@^@__main__.py#!/usr/bin/python3
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm file "/sbin/cephadm", line 10098 PK ^

2023-12-18 Thread farhad kh
Hi, thank you for the guidance.

There is no way to change the global image before launching; I need to
download the images from the private registry during the initial setup.
I used the --image option, but it did not work:

# cephadm bootstrap --image  rgistry.test/ceph/ceph:v18  --mon-ip 192.168.0.160 
 --initial-dashboard-password P@ssw0rd --dashboard-password-noupdate 
--allow-fqdn-hostname --ssh-user cephadmin
usage: cephadm [-h] [--image IMAGE] [--docker] [--data-dir DATA_DIR]
   [--log-dir LOG_DIR] [--logrotate-dir LOGROTATE_DIR]
   [--sysctl-dir SYSCTL_DIR] [--unit-dir UNIT_DIR] [--verbose]
   [--timeout TIMEOUT] [--retry RETRY] [--env ENV]
   [--no-container-init] [--no-cgroups-split]
   
{version,pull,inspect-image,ls,list-networks,adopt,rm-daemon,rm-cluster,run,shell,enter,ceph-volume,zap-osds,unit,logs,bootstrap,deploy,_orch,check-host,prepare-host,add-repo,rm-repo,install,registry-login,gather-facts,host-maintenance,agent,disk-rescan}
   ...
cephadm: error: unrecognized arguments: --image rgistry.test/ceph/ceph:v18
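
Judging by the usage output, maybe the --image flag has to come before the
subcommand, i.e. (I have not re-tested this yet):

cephadm --image rgistry.test/ceph/ceph:v18 bootstrap --mon-ip 192.168.0.160 \
  --initial-dashboard-password P@ssw0rd --dashboard-password-noupdate \
  --allow-fqdn-hostname --ssh-user cephadmin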

I also used `cephadm registry-login`, and it reported that I was logged in,
but when I bootstrap the first node it still tries to download the image from
the quay registry:

cephadm bootstrap --mon-ip 192.168.0.160 --registry-json /root/mylogin.json
--initial-dashboard-password P@ssw0rd --dashboard-password-noupdate
--allow-fqdn-hostname --ssh-user cephadmin
Creating directory /etc/ceph for ceph.conf
Verifying ssh connectivity using standard pubkey authentication ...
Adding key to cephadmin@localhost authorized_keys...
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
docker (/usr/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 3c00e38c-9e2e-11ee-95cd-000c29e9f44e
Verifying IP 192.168.0.160 port 3300 ...
Verifying IP 192.168.0.160 port 6789 ...
Mon IP `192.168.0.160` is in CIDR network `192.168.0.0/24`
Mon IP `192.168.0.160` is in CIDR network `192.168.0.0/24`
Internal network (--cluster-network) has not been provided, OSD replication 
will default to the public_network
Pulling custom registry login info from /root/mylogin.json.
Logging into custom registry.
Pulling container image quay.io/ceph/ceph:v18...


Before, I edited the cephadm script directly, but now the file is packaged in
an encoded form and can't be edited, and I don't know how to fix this :(((
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] change ip node and public_network in cluster

2024-02-17 Thread farhad kh
I have implemented a Ceph cluster with cephadm which has three monitors and
three OSDs; each node has one interface on the 192.168.0.0/24 network.
I want to change the addresses of the machines to the 10.4.4.0/24 range.
Is there a solution for this change without data loss or downtime?
I changed the public_network in the mon config and changed the node IPs, but
it did not work.
How can I solve this problem?
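
To be concrete, this is the sequence I assumed would work, based on my
reading of the docs (so it may well be wrong or incomplete; the new address
below is just an example):

ceph config set mon public_network 10.4.4.0/24
ceph orch host set-addr ceph-01 10.4.4.130      # repeat for each host after re-addressing it
ceph orch daemon redeploy mon.ceph-01           # so the mon binds to the new network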
```
ceph orch host ls
HOST ADDR   LABELS  STATUS
ceph-01  192.168.0.130  _admin,rgw
ceph-02  192.168.0.131  _admin,rgw
ceph-03  192.168.0.132  _admin,rgw
3 hosts in cluster


[root@ceph-01 ~]# ceph config get mon public_network
192.168.0.0/24

[root@ceph-01 ~]# ceph orch ls
NAME                               PORTS        RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager                       ?:9093,9094      1/1  112s ago   9M   count:1
ceph-exporter                                       3/3  114s ago   8M   *
crash                                               3/3  114s ago   9M   *
grafana                            ?:3000           1/1  112s ago   8M   count:1
mgr                                                 2/2  113s ago   9M   count:2
mon                                                 3/3  114s ago   8M   count:3
node-exporter                      ?:9100           3/3  114s ago   9M   *
osd.dashboard-admin-1685787597651                     6  114s ago   8M   *
prometheus                         ?:9095           1/1  112s ago   3M   count:1
```

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph api rgw/role

2024-04-22 Thread farhad kh
Hi, I used the Ceph API to create an rgw/role, but there is no API for
deleting or editing an rgw/role.
How can I delete or edit them?
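
In the meantime I am falling back to radosgw-admin on the cluster side; I
assume these are the equivalent operations (please correct me if the syntax
is off):

radosgw-admin role list
radosgw-admin role rm --role-name=my-role   # "my-role" is just an example name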
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Problem in changing monitor address and public_network

2024-05-26 Thread farhad kh
Hello, following Ceph's own documentation and the article I sent the link to,
I tried to change the addresses of the Ceph machines and the public network.
But when I tried to set the hosts to the new address (ceph orch host set-addr
opcrgfpsksa0101 10.248.35.213), the command did not take effect, and the
managers kept trying to connect to the old addresses.
I run my cluster with a non-root user, and I am changing the address range
from 10.56.12.0/22 to 10.248.35.0/24.
I am using Ceph version 17.2.6 and cephadm.
I do not have a cluster_network configured here; all communication is on the
public_network.
I have also attached the specifications and logs that I think are useful.
Are there any ideas or solutions for fixing this problem, and what could be
causing it?


--
debug 2024-05-26T17:28:33.191+ 7ffb15579700  0 log_channel(cephadm) log
[INF] : Filtered out host opcrgfpsksa0101: does not belong to mon
public_network(s):  10.248.35.0/24, host network(s):
10.56.12.0/22,172.17.0.0/16
debug 2024-05-26T17:28:33.191+ 7ffb15579700  0 [cephadm INFO
cephadm.serve] Filtered out host opcrgfpsksa0103: does not belong to mon
public_network(s):  10.248.35.0/24, host network(s):
10.56.12.0/22,172.17.0.0/16
debug 2024-05-26T17:28:33.191+ 7ffb15579700  0 log_channel(cephadm) log
[INF] : Filtered out host opcrgfpsksa0103: does not belong to mon
public_network(s):  10.248.35.0/24, host network(s):
10.56.12.0/22,172.17.0.0/16
debug 2024-05-26T17:28:33.192+ 7ffb15579700  0 [cephadm INFO
cephadm.serve] Filtered out host opcmrfpsksa0101: does not belong to mon
public_network(s):  10.248.35.0/24, host network(s):
10.56.12.0/22,172.17.0.0/16
debug 2024-05-26T17:28:33.192+ 7ffb15579700  0 log_channel(cephadm) log
[INF] : Filtered out host opcmrfpsksa0101: does not belong to mon
public_network(s):  10.248.35.0/24, host network(s):
10.56.12.0/22,172.17.0.0/16
debug 2024-05-26T17:28:33.192+ 7ffb15579700  0 [progress WARNING root]
complete: ev da5c20ec-e9df-490f-804d-182d02f0324e does not exist
debug 2024-05-26T17:28:33.192+ 7ffb15579700  0 [progress WARNING root]
complete: ev 82392bf5-4940-4252-9aaf-aa4758c00ead does not exist
debug 2024-05-26T17:28:33.212+ 7ffb15579700  0 [progress WARNING root]
complete: ev 1d64774f-dc6d-46c8-8f96-21d147f4b053 does not exist
debug 2024-05-26T17:28:33.212+ 7ffb15579700  0 [progress WARNING root]
complete: ev 45322c73-8ecb-4471-b41c-0e279805dd0b does not exist
debug 2024-05-26T17:28:35.182+ 7ffb584e1700  0 log_channel(cluster) log
[DBG] : pgmap v232: 1185 pgs: 1185 unknown; 0 B data, 0 B used, 0 B / 0 B
avail
debug 2024-05-26T17:28:37.183+ 7ffb584e1700  0 log_channel(cluster) log
[DBG] : pgmap v233: 1185 pgs: 1185 unknown; 0 B data, 0 B used, 0 B / 0 B
avail
debug 2024-05-26T17:28:38.609+ 7ffb4f4df700  0 log_channel(audit) log
[DBG] : from='client.14874149 -' entity='client.admin' cmd=[{"prefix":
"orch host set-addr", "hostname": "opcrgfpsksa0101", "addr":
"10.248.35.213", "target": ["mon-mgr", ""]}]: dispatch
debug 2024-05-26T17:28:39.184+ 7ffb584e1700  0 log_channel(cluster) log
[DBG] : pgmap v234: 1185 pgs: 1185 unknown; 0 B data, 0 B used, 0 B / 0 B
avail


ceph orch host ls
HOST ADDR  LABELS  STATUS
opcmrfpsksa0101  10.56.12.216  _admin mon osd
opcpmfpsksa0101  10.56.12.204  rgw
opcpmfpsksa0103  10.56.12.205  rgw
opcpmfpsksa0105  10.56.12.206  rgw
opcrgfpsksa0101  10.56.12.213  _admin mon osd
opcrgfpsksa0103  10.56.12.214  _admin mon osd
opcsdfpsksa0101  10.56.12.207  osd
opcsdfpsksa0103  10.56.12.208  osd
opcsdfpsksa0105  10.56.12.209  osd
9 hosts in cluster

NAME                               PORTS                 RUNNING  REFRESHED  AGE  PLACEMENT
alertmanager                       ?:9093,9094               3/3  8m ago     11M  count:3
ceph-exporter                                                9/9  8m ago     7w   *
crash                                                        9/9  8m ago     11M  *
grafana                            ?:3000                    3/3  8m ago     11d  count:3;label:mon
ingress.rgw.k8s                    10.56.12.215:80,1967      4/4  8m ago     11d  count:2;label:rgw
mds.k8s-cephfs                                               3/3  8m ago     11M  count:3
mgr                                                          2/2  7m ago     11d  count:2;label:mon
mon                                                          3/3  8m ago     7M   count:3;label:mon
node-exporter                      ?:9100                    9/9  8m ago     7w   *
osd                                                            8  8m ago     -
osd.dashboard-admin-1695638488579                             37  8m ago     8M   *
prometheus                         ?:9095                    3/3  8m ago     7w   count:3;label:mon
rgw.k8s                            ?:8080                    3/3  8m ago     5M   count:3;label:rgw
___
ceph-users mailing 

[ceph-users] when calling the CreateTopic operation: Unknown

2024-07-12 Thread farhad kh
Hi, I want to use Ceph bucket notifications. I tried to create a topic with
the command below, but I get an error when using Kafka with a
username/password.
How can I solve this problem? Is there a problem with my syntax?


https://www.ibm.com/docs/en/storage-ceph/7?topic=management-creating-bucket-notifications
https://docs.ceph.com/en/latest/radosgw/notifications/


# aws sns create-topic --name storage  --profile=test-notifications
--endpoint-url=http://192.168.115.59 --attributes=file://topic.json --debug

#
topic.json:
{"push-endpoint": "kafka://usr-test:p@ssw#0...@kafka.test:9092","verify-ssl":
"False", "kafka-ack-level": "broker", "persistent":"true"}
#
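
One thing I am not sure about: do the special characters in the Kafka
password need to be percent-encoded in the push-endpoint URL? If so, I guess
topic.json should look more like this (the '...' is just me masking the rest
of the password):

{"push-endpoint": "kafka://usr-test:p%40ssw%230...@kafka.test:9092",
 "verify-ssl": "False", "kafka-ack-level": "broker", "persistent": "true"}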


error  is :
content-type:application/x-www-form-urlencoded; charset=utf-8
host:192.168.115.59
x-amz-date:20240712T120700Z

content-type;host;x-amz-date
6f328bc5b0736f23ae2cdf68ccffe1a45c705dd1636f61b999350ae18f8d5ad1
2024-07-12 12:07:00,439 - MainThread - botocore.auth - DEBUG - StringToSign:
AWS4-HMAC-SHA256
20240712T120700Z
20240712/podspace/sns/aws4_request
38b7d8721abdd98c214ea763d9dcc324fcbc5982990353140f6b734455e9c01e
2024-07-12 12:07:00,439 - MainThread - botocore.auth - DEBUG - Signature:
99a0f7d40bc011c09dfe7c6aed4bae7f739df52278e23bd4f16fdb1888c001b7
2024-07-12 12:07:00,440 - MainThread - botocore.endpoint - DEBUG - Sending
http request: http://192.168.115.59/, headers={'Content-Type':
b'application/x-www-form-urlencoded; charset=utf-8', 'User-Agent':
b'aws-cli/1.18.156 Python/3.6.8 Linux/5.4.17-2136.326.6.el8uek.x86_64
botocore/1.18.15', 'X-Amz-Date': b'20240712T120700Z', 'Authorization':
b'AWS4-HMAC-SHA256
Credential=2MRG0TXJFISQAIVE47F1/20240712/podspace/sns/aws4_request,
SignedHeaders=content-type;host;x-amz-date,
Signature=99a0f7d40bc011c09dfe7c6aed4bae7f739df52278e23bd4f16fdb1888c001b7',
'Content-Length': '371'}>
2024-07-12 12:07:00,441 - MainThread - urllib3.connectionpool - DEBUG -
Starting new HTTP connection (1): 192.168.115.59:80
2024-07-12 12:07:00,446 - MainThread - urllib3.connectionpool - DEBUG -
http://192.168.115.59:80 "POST / HTTP/1.1" 400 206
2024-07-12 12:07:00,447 - MainThread - botocore.parsers - DEBUG - Response
headers: {'content-length': '206', 'x-amz-request-id':
'tx0c7a58038e67798c0-0066911c64-19910-site1', 'accept-ranges': 'bytes',
'content-type': 'application/xml', 'date': 'Fri, 12 Jul 2024 12:07:00 GMT'}
2024-07-12 12:07:00,447 - MainThread - botocore.parsers - DEBUG - Response
body:
b'InvalidArgumenttx0c7a58038e67798c0-0066911c64-19910-site119910-site1-podspace'
2024-07-12 12:07:00,448 - MainThread - botocore.hooks - DEBUG - Event
needs-retry.sns.CreateTopic: calling handler

2024-07-12 12:07:00,448 - MainThread - botocore.retryhandler - DEBUG - No
retry needed.
2024-07-12 12:07:00,449 - MainThread - awscli.clidriver - DEBUG - Exception
caught in main()
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/awscli/clidriver.py", line 217, in
main
return command_table[parsed_args.command](remaining, parsed_args)
  File "/usr/lib/python3.6/site-packages/awscli/clidriver.py", line 358, in
__call__
return command_table[parsed_args.operation](remaining, parsed_globals)
  File "/usr/lib/python3.6/site-packages/awscli/clidriver.py", line 530, in
__call__
call_parameters, parsed_globals)
  File "/usr/lib/python3.6/site-packages/awscli/clidriver.py", line 650, in
invoke
client, operation_name, parameters, parsed_globals)
  File "/usr/lib/python3.6/site-packages/awscli/clidriver.py", line 662, in
_make_client_call
**parameters)
  File "/usr/lib/python3.6/site-packages/botocore/client.py", line 357, in
_api_call
return self._make_api_call(operation_name, kwargs)
  File "/usr/lib/python3.6/site-packages/botocore/client.py", line 676, in
_make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (Unknown) when calling
the CreateTopic operation: Unknown
2024-07-12 12:07:00,449 - MainThread - awscli.clidriver - DEBUG - Exiting
with rc 255

An error occurred (Unknown) when calling the CreateTopic operation: Unknown
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ingress for mgr service

2024-07-24 Thread farhad kh
It would be very good to be able to use an ingress service for the mgr
service, to provide high availability for it.
Will this feature be added in the next version, or does it still have to be
implemented manually?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] empty bucket

2022-05-14 Thread farhad kh
Hi,
I deleted all of the objects in a bucket, but the used capacity of the bucket
is not zero, and the ls command still shows many objects.
Why?
And how can I delete them all?

  s3 ls  s3://podspace-default-bucket-zone
/usr/lib/python3.6/site-packages/urllib3/connectionpool.py:847:
InsecureRequestWarning: Unverified HTTPS request is being made. Adding
certificate verification is strongly advised.See:
https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
   PRE 48203/

real0m0.524s
user0m0.421s
sys 0m0.073s



 rados -p   default.rgw.buckets.data ls  | less

2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__shadow_1/T626CPCVEKKAKG9D.2~F68zep-KOHUWfw2V6A9CU4BA0OMXGNm.3_1
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/CRTAD4MLLO535BBP.2~uNExUFxvHqi2h8SSncjrdsqZ3UpWkNM.16
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/FD58WHS37IZR9PB7.2~_z2i7JxWh4PI9YzfxXjLwCHfXnDPxct.38
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/PWSJOSS6KI5X4Q8Y.2~iI6zIG2Et2Pu8gXVQ5tiIM-IM4mCkFq.17
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/IIKXDVA7CGYEXVZI.2~9HL_J8oJAA8IP2BTS3BeHS4z3DwJoDz.22
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__shadow_1/TL3RQO9GO2HN9RHW.2~yVkG1qKqELCg9v7zLcR1FJqViXo2Td6.45_1
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/JJONSXRDAG4B5IFT.2~n-nsY0AjL2W16lscy8mQDJ7T_SXzH3h.78
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/TX4SVPUIR5NHF4S2.2~YQ4L3p6natLw_9H1YiCbFRjeNoP_mFS.2
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/TX4SVPUIR5NHF4S2.2~YQ4L3p6natLw_9H1YiCbFRjeNoP_mFS.54
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/6KLGQEDJF5K3K25G.2~7WTJoGwUTUsw16Qr0ut_s61N_0Y_g-j.7
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/VPNWVS4QXV83C6TR.2~hOcKk34zsKa7WMJVQbt17XnGp1_d40A.42
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__shadow_1/TX4SVPUIR5NHF4S2.2~YQ4L3p6natLw_9H1YiCbFRjeNoP_mFS.14_1
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/74XOWJITVI9PWWI1.2~ZVhpWguMYU8VKpIvyw-XMMEazdqxJpE.16
2ee2e53d-bad4-4857-8bea-36eb52a83f34.4951548.2__multipart_1/IIKXDVA7CGYEXVZI.2~9HL_J8oJAA8IP2BTS3BeHS4z3DwJoDz.1


[root@opcpmfpsksa0101 ~]# rados df
POOL_NAME  USED  OBJECTS  CLONES  COPIES
 MISSING_ON_PRIMARY  UNFOUND  DEGRADEDRD_OPS   RDWR_OPS
WR  USED COMPR  UNDER COMPR
.rgw.root48 KiB4   0  12
00 0   709  709 KiB 44 KiB 0 B
 0 B
default.rgw.buckets.data 25 GiB 3344   0   10032
00 0 68635  1.1 GiB385392  164 GiB 0 B
 0 B
default.rgw.buckets.index   696 KiB   22   0  66
00 0142196  189 MiB 88983   44 MiB 0 B
 0 B
default.rgw.buckets.non-ec  1.5 MiB   57   0 171
00 0197968  124 MiB 54827   46 MiB 0 B
 0 B
default.rgw.control 0 B8   0  24
00 0 0  0 B 0  0 B 0 B
 0 B
default.rgw.log 408 KiB  209   0 627
00 0  24160339   23 GiB  15970775   28 MiB 0 B
 0 B
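
In case it is relevant, I have not run garbage collection manually yet; from
the docs I assume these are the commands to inspect and trigger it:

radosgw-admin gc list --include-all     # list pending garbage-collection entries
radosgw-admin gc process --include-all  # process them now instead of waiting for the schedule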
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] client.admin crashed

2022-05-16 Thread farhad kh
I have an error in my Ceph cluster:

HEALTH_WARN 1 daemons have recently crashed
[WRN] RECENT_CRASH: 1 daemons have recently crashed
    client.admin crashed on host node1 at 2022-05-16T08:30:41.205667Z

What does this mean, and how can I fix it?
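
From the docs, I gather these are the commands to inspect the crash and clear
the warning once it is understood (I have not run them yet):

ceph crash ls                  # list recent crash reports
ceph crash info <crash-id>     # details for a specific crash
ceph crash archive-all         # acknowledge all crashes, clearing the health warning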
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] disaster in many of osd disk

2022-05-24 Thread farhad kh
I lost some disks in my Ceph cluster, and it then started repairing the
placement of objects and re-replicating the data.
This caused me to get some errors from the S3 API:

Gateway Time-out (Service: Amazon S3; Status Code: 504; Error Code: 504
Gateway Time-out; Request ID: null; S3 Extended Request ID: null; Proxy:
null)

I also have a warning that the OSD disks are about to fill up.
How can this be explained? How can I throttle or delay the recovery and
backfilling of objects?
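
The knobs I think are relevant for throttling recovery (please confirm
whether this is the right approach):

ceph config set osd osd_max_backfills 1          # limit concurrent backfills per OSD
ceph config set osd osd_recovery_max_active 1    # limit concurrent recovery ops per OSD
# or pause backfill/recovery entirely for a while:
ceph osd set nobackfill
ceph osd set norecover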
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] HDD disk for RGW and CACHE tier for giving beter performance

2022-05-24 Thread farhad kh
I want to store the data pools for RGW on HDD disk drives and use some SSD
drives as a cache tier on top of them.
Has anyone tested this scenario?
Is it practical and optimal?
How can I do this?
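
From the docs, I believe the basic setup would be something like the
following ("hot-cache" is a placeholder name for an SSD-backed pool, and I
have read that cache tiering is generally discouraged, so treat this as
untested):

ceph osd tier add default.rgw.buckets.data hot-cache
ceph osd tier cache-mode hot-cache writeback
ceph osd tier set-overlay default.rgw.buckets.data hot-cache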
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW error s3 api

2022-05-24 Thread farhad kh
Hi,
I have a lot of errors from the S3 API.
On the S3 client I get this:

2022-05-24 10:49:58.095 ERROR 156723 --- [exec-upload-21640003-285-2]
i.p.p.d.service.UploadDownloadService: Gateway Time-out (Service:
Amazon S3; Status Code: 504; Error Code: 504 Gateway Time-out; Request ID:
null; S3 Extended Request ID: null; Proxy: null)
2022-05-24 10:49:52.933 ERROR 156723 --- [exec-upload-21640003-282-2]
i.p.p.d.service.UploadDownloadService: Unable to execute HTTP request:
Broken pipe (Write failed)
2022-05-24 10:49:30.646 ERROR 156723 --- [exec-upload-21640003-277-2]
i.p.p.d.service.UploadDownloadService: Unable to execute HTTP request:
The target server failed to respond

-- and the logs from the RGW container show these errors:

debug 2022-05-24T10:50:26.888+ 7fcd35b31700  0 req 17644239529065517086
1.464013577s ERROR: RESTFUL_IO(s)->complete_header() returned
err=Connection reset by peer
debug 2022-05-24T10:50:42.487+ 7fccbb23c700  0 req 17818983848567298192
2.116019487s ERROR: RESTFUL_IO(s)->complete_header() returned err=Broken
pipe
debug 2022-05-24T10:49:58.091+ 7fcd5eb83700  0 req 9823709374853588294
2.907027006s ERROR: RESTFUL_IO(s)->complete_header() returned
err=Connection reset by peer
debug 2022-05-24T10:49:58.104+ 7fcd2fb25700  1 req 13985293647902708528
0.00437s op->ERRORHANDLER: err_no=-2 new_err_no=-2
debug 2022-05-24T09:54:45.012+ 7fa10050e700  0 ERROR:
client_io->complete_request() returned Connection reset by peer

Everything in my cluster is otherwise OK and healthy, and the system load is
much lower than normal.
Can anyone tell me why this is happening, or how I can find the cause?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephadm error mgr not available and ERROR: Failed to add host

2022-05-24 Thread farhad kh
Hi,
I want to use a private registry for running the Ceph storage cluster, so I
changed the default registry of my container runtime (docker) in
/etc/docker/daemon.json:
{
  "registry-mirrors": ["https://private-registery.fst"]
}

and changed all registry addresses in /usr/sbin/cephadm (quay.ceph.io and
docker.io) to my private registry:

cat /usr/sbin/cephadm | grep private-registery.fst

DEFAULT_IMAGE = 'private-registery.fst/ceph/ceph:v16.2.7'
DEFAULT_PROMETHEUS_IMAGE = 'private-registery.fst/ceph/prometheus:v2.18.1'
DEFAULT_NODE_EXPORTER_IMAGE =
'private-registery.fst/ceph/node-exporter:v0.18.1'
DEFAULT_ALERT_MANAGER_IMAGE =
'private-registery.fst/ceph/alertmanager:v0.20.0'
DEFAULT_GRAFANA_IMAGE = 'private-registery.fst/ceph/ceph-grafana:6.7.4'
DEFAULT_HAPROXY_IMAGE = 'private-registery.fst/ceph/haproxy:2.3'
DEFAULT_KEEPALIVED_IMAGE = 'private-registery.fst/ceph/keepalived'
DEFAULT_REGISTRY = 'private-registery.fst'   # normalize unqualified
digests to this
>>> normalize_image_digest('ceph/ceph:v16', 'private-registery.fst')
>>> normalize_image_digest('private-registery.fst/ceph/ceph:v16',
'private-registery.fst')
'private-registery.fst/ceph/ceph:v16'
>>> normalize_image_digest('private-registery.fst/ceph',
'private-registery.fst')
>>> normalize_image_digest('localhost/ceph', 'private-registery.fst')

When I try to deploy the first node of the cluster with cephadm, I get this error:

 cephadm bootstrap   --mon-ip 10.20.23.65 --allow-fqdn-hostname
--initial-dashboard-user admin   --initial-dashboard-password admin
--dashboard-password-noupdate
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
docker (/bin/docker) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: e52bee78-db8b-11ec-9099-00505695f8a8
Verifying IP 10.20.23.65 port 3300 ...
Verifying IP 10.20.23.65 port 6789 ...
Mon IP `10.20.23.65` is in CIDR network `10.20.23.0/24`
- internal network (--cluster-network) has not been provided, OSD
replication will default to the public_network
Pulling container image private-registery.fst/ceph/ceph:v16.2.7...
Ceph version: ceph version 16.2.7
(dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
Assimilating anything we can from ceph.conf...
Generating new minimal ceph.conf...
Restarting the monitor...
Setting mon public_network to 10.20.23.0/24
Wrote config to /etc/ceph/ceph.conf
Wrote keyring to /etc/ceph/ceph.client.admin.keyring
Creating mgr...
Verifying port 9283 ...
Waiting for mgr to start...
Waiting for mgr...
mgr not available, waiting (1/15)...
mgr not available, waiting (2/15)...
mgr not available, waiting (3/15)...
mgr not available, waiting (4/15)...
mgr is available
Enabling cephadm module...
Waiting for the mgr to restart...
Waiting for mgr epoch 5...
mgr epoch 5 is available
Setting orchestrator backend to cephadm...
Generating ssh key...
Wrote public SSH key to /etc/ceph/ceph.pub
Adding key to root@localhost authorized_keys...
Adding host opcpmfpsbpp0101...
Non-zero exit code 22 from /bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph --init -e
CONTAINER_IMAGE=private-registery.fst/ceph/ceph:v16.2.7 -e
NODE_NAME=opcpmfpsbpp0101 -e  CEPH_USE_RANDOM_NONCE=1 -v
/var/log/ceph/e52bee78-db8b-11ec-9099-00505695f8a8:/var/log/ceph:z -v
/tmp/ceph-tmpwt99ep2e:/etc/ceph/ceph.client.admin.keyring:z -v
/tmp/ceph-tmpweojwqdh:/etc/ceph/ceph.conf:z opkbhfpsbpp0101.fst/ceph/ceph:v16.2.7 orch host add opcpmfpsbpp0101 10.20.23.65
/usr/bin/ceph: stderr Error EINVAL: Failed to connect to opcpmfpsbpp0101
(10.20.23.65).
/usr/bin/ceph: stderr Please make sure that the host is reachable and
accepts connections using the cephadm SSH key
/usr/bin/ceph: stderr
/usr/bin/ceph: stderr To add the cephadm SSH key to the host:
/usr/bin/ceph: stderr > ceph cephadm get-pub-key > ~/ceph.pub
/usr/bin/ceph: stderr > ssh-copy-id -f -i ~/ceph.pub root@10.20.23.65
/usr/bin/ceph: stderr
/usr/bin/ceph: stderr To check that the host is reachable open a new shell
with the --no-hosts flag:
/usr/bin/ceph: stderr > cephadm shell --no-hosts
/usr/bin/ceph: stderr
/usr/bin/ceph: stderr Then run the following:
/usr/bin/ceph: stderr > ceph cephadm get-ssh-config > ssh_config
/usr/bin/ceph: stderr > ceph config-key get mgr/cephadm/ssh_identity_key >
~/cephadm_private_key
/usr/bin/ceph: stderr > chmod 0600 ~/cephadm_private_key
/usr/bin/ceph: stderr > ssh -F ssh_config -i ~/cephadm_private_key
root@10.20.23.65
ERROR: Failed to add host : Failed command: /bin/docker
run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint
/usr/bin/ceph --init -e
CONTAINER_IMAGE=private-registery.fst/ceph/ceph:v16.2.7 -e
NODE_NAME=

[ceph-users] HEALTH_ERR Module 'cephadm' has failed: dashboard iscsi-gateway-rm failed: iSCSI gateway 'opcpmfpsbpp0101' does not exist retval: -2 [ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed: d

2022-05-28 Thread farhad kh
Hi,
I get an error when deleting a service from the dashboard.
The Ceph version is 16.2.6.

HEALTH_ERR Module 'cephadm' has failed: dashboard iscsi-gateway-rm failed:
iSCSI gateway 'opcpmfpsbpp0101' does not exist retval: -2
[ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed: dashboard
iscsi-gateway-rm failed: iSCSI gateway 'opcpmfpsbpp0101' does not exist
retval: -2
Module 'cephadm' has failed: dashboard iscsi-gateway-rm failed: iSCSI
gateway 'opcpmfpsbpp0101' does not exist retval: -2
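
A hedged sketch of checks that might narrow this down, assuming the failure
is stale iSCSI gateway state held by the active mgr (the gateway name comes
from the error above; nothing else is assumed):

ceph dashboard iscsi-gateway-list   # which gateways the dashboard module still knows about
ceph orch ls iscsi                  # which iSCSI services cephadm actually manages
ceph mgr fail                       # restart the active mgr so the cephadm module reloads
ceph health detail                  # check whether MGR_MODULE_ERROR clears afterwards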
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph's mgr/prometheus module is not available

2022-05-29 Thread farhad kh
Hi,
I upgraded my cluster from 16.2.6 to 16.2.9, and now I see this error in
the dashboard but not on the command line.

The mgr/prometheus module at opcpmfpsbpp0103.fst.20.10.in-addr.arpa:9283 is
unreachable. This could mean that the module has been disabled or the mgr
itself is down. Without the mgr/prometheus module metrics and alerts will
no longer function. Open a shell to ceph and use 'ceph -s' to to determine
whether the mgr is active. If the mgr is not active, restart it, otherwise
you can check the mgr/prometheus module is loaded with 'ceph mgr module ls'
and if it's not listed as enabled, enable it with 'ceph mgr module enable
prometheus'
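
For reference, the checks that the warning itself suggests, written out as
commands (a sketch; the host name is shortened from the warning text and
port 9283 is the one it reports):

ceph -s                                    # confirm an mgr is active
ceph mgr module ls                         # confirm prometheus is listed as enabled
ceph mgr module enable prometheus          # enable it if it is not
curl http://opcpmfpsbpp0103:9283/metrics   # check the exporter answers on the advertised port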


#ceph orch ps

 opcpmfpsbpp0101: Sun May 29 09:53:16 2022

NAME   HOST PORTSSTATUS
REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID  CONTAINER ID
alertmanager.opcpmfpsbpp0101   opcpmfpsbpp0101  *:9093,9094  running
(27m) 2m ago   4d24.9M-   ba2b418f427c  d8c8664b8d84
alertmanager.opcpmfpsbpp0103   opcpmfpsbpp0103  *:9093,9094  running
(25m) 2m ago   3h27.4M-   ba2b418f427c  bb6035a27201
alertmanager.opcpmfpsbpp0105   opcpmfpsbpp0105  *:9093,9094  running
(25m) 2m ago   3h23.1M-   ba2b418f427c  1911a6c14209
crash.opcpmfpsbpp0101  opcpmfpsbpp0101   running
(27m) 2m ago   4d7560k-  16.2.9   3520ead5eb19  5f27bd36a62d
crash.opcpmfpsbpp0103  opcpmfpsbpp0103   running
(25m) 2m ago   3d8452k-  16.2.9   3520ead5eb19  2bc971ac4826
crash.opcpmfpsbpp0105  opcpmfpsbpp0105   running
(25m) 2m ago   3d8431k-  16.2.9   3520ead5eb19  3dcd3809beb3
grafana.opcpmfpsbpp0101opcpmfpsbpp0101  *:3000   running
(27m) 2m ago   4d51.3M-  8.3.5dad864ee21e9  ad6608c1426b
grafana.opcpmfpsbpp0103opcpmfpsbpp0103  *:3000   running
(25m) 2m ago   3h49.4M-  8.3.5dad864ee21e9  7b39e1ec7986
grafana.opcpmfpsbpp0105opcpmfpsbpp0105  *:3000   running
(25m) 2m ago   3h53.1M-  8.3.5dad864ee21e9  0c178fc5e202
iscsi.ca-1.opcpmfpsbpp0101.jpixcv  opcpmfpsbpp0101   running
(23m) 2m ago  23m82.2M-  3.5  3520ead5eb19  8724836ea2cd
iscsi.ca-1.opcpmfpsbpp0103.xgceen  opcpmfpsbpp0103   running
(23m) 2m ago  23m71.4M-  3.5  3520ead5eb19  3b046ad06877
iscsi.ca-1.opcpmfpsbpp0105.uyskvc  opcpmfpsbpp0105   running
(23m) 2m ago  23m67.8M-  3.5  3520ead5eb19  b7dbec1aabdf
mgr.opcpmfpsbpp0101.dbwmph opcpmfpsbpp0101  *:8443,9283  running
(27m) 2m ago   4d 442M-  16.2.9   3520ead5eb19  3dddac975409
mgr.opcpmfpsbpp0103.sihfoj opcpmfpsbpp0103  *:8443,9283  running
(25m) 2m ago   3d 380M-  16.2.9   3520ead5eb19  15d8e94f966e
mon.opcpmfpsbpp0101opcpmfpsbpp0101   running
(27m) 2m ago   4d 154M2048M  16.2.9   3520ead5eb19  90571a4ff6fc
mon.opcpmfpsbpp0103opcpmfpsbpp0103   running
(25m) 2m ago   3d 103M2048M  16.2.9   3520ead5eb19  4d4de3d69288
mon.opcpmfpsbpp0105opcpmfpsbpp0105   running
(25m) 2m ago   3d 103M2048M  16.2.9   3520ead5eb19  db14ad0ef6b6
node-exporter.opcpmfpsbpp0101  opcpmfpsbpp0101  *:9100   running
(27m) 2m ago   4d24.5M-   1dbe0e931976  541eddfabb2c
node-exporter.opcpmfpsbpp0103  opcpmfpsbpp0103  *:9100   running
(25m) 2m ago   3d10.4M-   1dbe0e931976  c63b991e5cf7
node-exporter.opcpmfpsbpp0105  opcpmfpsbpp0105  *:9100   running
(25m) 2m ago   3d8328k-   1dbe0e931976  75404a20b7ab
osd.0  opcpmfpsbpp0101   running
(27m) 2m ago   4h71.0M4096M  16.2.9   3520ead5eb19  c413da18938c
osd.1  opcpmfpsbpp0103   running
(25m) 2m ago   4h72.1M4096M  16.2.9   3520ead5eb19  b259d4262430
osd.2  opcpmfpsbpp0105   running
(25m) 2m ago   4h64.7M4096M  16.2.9   3520ead5eb19  c4ed30712d15
prometheus.opcpmfpsbpp0101 opcpmfpsbpp0101  *:9095   running
(27m) 2m ago   4d82.2M-   514e6a882f6e  34c846d9946b
prometheus.opcpmfpsbpp0103 opcpmfpsbpp0103  *:9095   running
(25m) 2m ago   3h77.4M-   514e6a882f6e  cec307f490c4
prometheus.opcpmfpsbpp0105 opcpmfpsbpp0105  *:9095   running
(25m) 2m ago   3h74.1M-   514e6a882f6e  d13f02d1bb72
# ceph -s
  cluster:
id: c41ccd12-dc01-11ec-9e25-00505695f8a8
health: HEALTH_OK

  services:
mon: 3 daemons, quorum opcpmfpsbpp0101,opcpmfpsbpp0103,opcpmfpsbpp0105
(age 25m)
mgr: opcpmfpsbpp0101.dbwmph(active, since 27m), standby

[ceph-users] Degraded data redundancy and too many PGs per OSD

2022-05-30 Thread farhad kh
Hi,
I have a problem in my cluster. I use a cache tier for the RGW data pool:
three hosts serve the cache and three hosts serve the data, with SSDs for
the cache and HDDs for the data. I set a 20 GiB quota on the cache pool.
When one of the cache-tier hosts went offline, the warning below appeared;
I reduced the quota to 10 GiB, but it did not resolve, and the dashboard
does not show the correct PG status (1 active+undersized).
What is happening in my cluster, and why does this not resolve?
Can anyone explain this situation?

##ceph -s
opcpmfpsksa0101: Mon May 30 12:05:12 2022

  cluster:
id: 54d2b1d6-207e-11ec-8c73-005056ac51bf
health: HEALTH_WARN
1 hosts fail cephadm check
1 pools have many more objects per pg than average
Degraded data redundancy: 1750/53232 objects degraded (3.287%),
1 pg degraded, 1 pg undersized
too many PGs per OSD (259 > max 250)

  services:
mon: 3 daemons, quorum opcpmfpsksa0101,opcpmfpsksa0103,opcpmfpsksa0105
(age 3d)
mgr: opcpmfpsksa0101.apmwdm(active, since 5h)
osd: 12 osds: 10 up (since 95m), 10 in (since 85m)
rgw: 2 daemons active (2 hosts, 1 zones)

  data:
pools:   9 pools, 865 pgs
objects: 17.74k objects, 41 GiB
usage:   128 GiB used, 212 GiB / 340 GiB avail
pgs: 1750/53232 objects degraded (3.287%)
 864 active+clean
 1   active+undersized+degraded

-
## ceph health detail
HEALTH_WARN 1 hosts fail cephadm check; 1 pools have many more objects per
pg than average; Degraded data redundancy: 1665/56910 objects degraded
(2.926%), 1 pg degraded, 1 pg undersized; too many PGs per OSD (259 > max
250)
[WRN] CEPHADM_HOST_CHECK_FAILED: 1 hosts fail cephadm check
host opcpcfpsksa0101 (10.56.12.210) failed check: Failed to connect to
opcpcfpsksa0101 (10.56.12.210).
Please make sure that the host is reachable and accepts connections using
the cephadm SSH key

To add the cephadm SSH key to the host:
> ceph cephadm get-pub-key > ~/ceph.pub
> ssh-copy-id -f -i ~/ceph.pub root@10.56.12.210

To check that the host is reachable open a new shell with the --no-hosts
flag:
> cephadm shell --no-hosts

Then run the following:
> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
> chmod 0600 ~/cephadm_private_key
> ssh -F ssh_config -i ~/cephadm_private_key root@10.56.12.210
[WRN] MANY_OBJECTS_PER_PG: 1 pools have many more objects per pg than
average
pool cache-pool objects per pg (1665) is more than 79.2857 times
cluster average (21)
[WRN] PG_DEGRADED: Degraded data redundancy: 1665/56910 objects degraded
(2.926%), 1 pg degraded, 1 pg undersized
pg 9.0 is stuck undersized for 88m, current state
active+undersized+degraded, last acting [10,11]
[WRN] TOO_MANY_PGS: too many PGs per OSD (259 > max 250)
--
ceph osd df tree
ID   CLASS  WEIGHT   REWEIGHT  SIZE RAW USE  DATA OMAP META
AVAIL%USE   VAR   PGS  STATUS  TYPE NAME
 -1 0.35156 -  340 GiB  128 GiB  121 GiB   12 MiB  6.9 GiB
 212 GiB  37.58  1.00-  root default
 -3 0.01959 -  0 B  0 B  0 B  0 B  0 B
 0 B  0 0-  host opcpcfpsksa0101
  0ssd  0.00980 0  0 B  0 B  0 B  0 B  0 B
 0 B  0 00down  osd.0
  9ssd  0.00980 0  0 B  0 B  0 B  0 B  0 B
 0 B  0 00down  osd.9
 -5 0.01959 -   20 GiB  5.1 GiB  4.0 GiB  588 KiB  1.1 GiB
  15 GiB  25.29  0.67-  host opcpcfpsksa0103
  7ssd  0.00980   0.85004   10 GiB  483 MiB   75 MiB  539 KiB  407 MiB
 9.5 GiB   4.72  0.133  up  osd.7
 10ssd  0.00980   0.55011   10 GiB  4.6 GiB  3.9 GiB   49 KiB  703 MiB
 5.4 GiB  45.85  1.225  up  osd.10
-16 0.01959 -   20 GiB  5.5 GiB  4.0 GiB  542 KiB  1.5 GiB
  15 GiB  27.28  0.73-  host opcpcfpsksa0105
  8ssd  0.00980   0.70007   10 GiB  851 MiB   75 MiB  121 KiB  775 MiB
 9.2 GiB   8.31  0.22   10  up  osd.8
 11ssd  0.00980   0.45013   10 GiB  4.6 GiB  3.9 GiB  421 KiB  742 MiB
 5.4 GiB  46.24  1.235  up  osd.11
-10 0.09760 -  100 GiB   39 GiB   38 GiB  207 KiB  963 MiB
  61 GiB  38.59  1.03-  host opcsdfpsksa0101
  1hdd  0.04880   1.0   50 GiB   19 GiB   19 GiB  207 KiB  639 MiB
  31 GiB  38.77  1.03  424  up  osd.1
 12hdd  0.04880   1.0   50 GiB   19 GiB   19 GiB  0 B  323 MiB
  31 GiB  38.40  1.02  430  up  osd.12
-13 0.09760 -  100 GiB   39 GiB   38 GiB  4.9 MiB  1.8 GiB
  61 GiB  39.41  1.05-  host opcsdfpsksa0103
  2hdd  0.04880   1.0   50 GiB   20 GiB   20 GiB  2.6 MiB  703 MiB
  30 GiB  40.42  1.08  428  up  osd
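
A hedged diagnostic sketch for the situation above (the pool name cache-pool
and pg 9.0 come from the health detail; the quota and threshold values are
only examples):

ceph pg 9.0 query                                         # why the PG cannot find a third OSD
ceph osd pool get cache-pool size                         # the replica count the pool wants
ceph osd pool get-quota cache-pool                        # the quota currently set on the cache pool
ceph osd pool set-quota cache-pool max_bytes 10737418240  # example: set a 10 GiB quota
ceph config set global mon_max_pg_per_osd 300             # example: raise the per-OSD PG warning threshold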

[ceph-users] multi write in block device

2022-05-30 Thread farhad kh
multi write in block device
I have two Windows servers and I present one LUN backed by a Ceph RBD image
to both of them. I need the second server to be able to read, write, and
update all files on the disk when the disk goes offline for the first
Windows server, but this does not work until the first server is shut down
or disconnected from the LUN.
What should I do?
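
A hedged sketch of how one might inspect the image state (the pool/image
name rbd-pool/win-lun is a placeholder); note that two simultaneous writers
on one LUN generally also need a cluster-aware filesystem or SCSI persistent
reservations on top of the shared block device:

rbd info rbd-pool/win-lun      # which image features (e.g. exclusive-lock) are enabled
rbd status rbd-pool/win-lun    # which clients currently have the image open
rbd lock ls rbd-pool/win-lun   # any advisory locks held on the image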
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph upgrade bug

2022-05-30 Thread farhad kh
I am upgrading the cluster to version 16.2.9, but `ceph orch ps` does not
show a version for some daemons:

[root@opcpmfpsbpp0101 c41ccd12-dc01-11ec-9e25-00505695f8a8]# ceph orch ps
NAME   HOST PORTSSTATUS
REFRESHED  AGE  MEM USE  MEM LIM  VERSIONIMAGE ID  CONTAINER ID
alertmanager.opcpmfpsbpp0101   opcpmfpsbpp0101  *:9093,9094  running
(20h)10m ago   5d29.6M- ba2b418f427c
 d8c8664b8d84
alertmanager.opcpmfpsbpp0103   opcpmfpsbpp0103  *:9093,9094  running
(31m)10m ago   2d23.6M- ba2b418f427c
 01a794c4a1c8
alertmanager.opcpmfpsbpp0105   opcpmfpsbpp0105  *:9093,9094  running
(20h)10m ago   2d25.3M- ba2b418f427c
 1911a6c14209
crash.opcpmfpsbpp0101  opcpmfpsbpp0101   running
(20h)10m ago   5d7575k-  16.2.9 3520ead5eb19
 5f27bd36a62d
crash.opcpmfpsbpp0103  opcpmfpsbpp0103   running
(31m)10m ago   5d8267k-  16.2.9 3520ead5eb19
 28e18ba5bc3e
crash.opcpmfpsbpp0105  opcpmfpsbpp0105   running
(20h)10m ago   5d9.89M-  16.2.9 3520ead5eb19
 3dcd3809beb3
grafana.opcpmfpsbpp0101opcpmfpsbpp0101  *:3000   running
(20h)10m ago   5d57.6M-  8.3.5  dad864ee21e9
 ad6608c1426b
grafana.opcpmfpsbpp0103opcpmfpsbpp0103  *:3000   running
(31m)10m ago   2d48.3M-  8.3.5  dad864ee21e9
 26b3c17e7be9
grafana.opcpmfpsbpp0105opcpmfpsbpp0105  *:3000   running
(20h)10m ago   2d58.3M-  8.3.5  dad864ee21e9
 0c178fc5e202
mgr.opcpmfpsbpp0101.dbwmph opcpmfpsbpp0101  *:8443,9283  running
(40m)10m ago   5d 384M-  16.2.9 3520ead5eb19
 e42b81be7e56
mgr.opcpmfpsbpp0103.sihfoj opcpmfpsbpp0103  *:8443,9283  running
(31m)10m ago   5d 433M-  16.2.9 3520ead5eb19
 02f4630acf96
mon.opcpmfpsbpp0101opcpmfpsbpp0101   running
(20h)10m ago   5d 934M2048M  16.2.9 3520ead5eb19
 90571a4ff6fc
mon.opcpmfpsbpp0103opcpmfpsbpp0103   running
(31m)10m ago   5d87.2M2048M  16.2.9 3520ead5eb19
 7815194b7d9d
mon.opcpmfpsbpp0105opcpmfpsbpp0105   running
(20h)10m ago   5d 925M2048M  16.2.9 3520ead5eb19
 db14ad0ef6b6
node-exporter.opcpmfpsbpp0101  opcpmfpsbpp0101  *:9100   running
(20h)10m ago   5d28.2M- 1dbe0e931976
 541eddfabb2c
node-exporter.opcpmfpsbpp0103  opcpmfpsbpp0103  *:9100   running
(31m)10m ago   5d8504k- 1dbe0e931976
 23d1460d6e88
node-exporter.opcpmfpsbpp0105  opcpmfpsbpp0105  *:9100   running
(20h)10m ago   5d9004k- 1dbe0e931976
 75404a20b7ab
osd.0  opcpmfpsbpp0101   running
(20h)10m ago   3d 129M4096M  16.2.9 3520ead5eb19
 c413da18938c
osd.1  opcpmfpsbpp0103   running
(31m)10m ago   3d70.5M4096M  16.2.9 3520ead5eb19
 8369f8c08b1c
osd.2  opcpmfpsbpp0105   running
(20h)10m ago   3d 123M4096M  16.2.9 3520ead5eb19
 c4ed30712d15
prometheus.opcpmfpsbpp0101 opcpmfpsbpp0101  *:9095   running
(20h)10m ago   5d92.6M- 514e6a882f6e
 34c846d9946b
prometheus.opcpmfpsbpp0103 opcpmfpsbpp0103  *:9095   running
(31m)10m ago   2d72.8M- 514e6a882f6e
 ce38c6a3e15b
prometheus.opcpmfpsbpp0105 opcpmfpsbpp0105  *:9095   running
(20h)10m ago   2d91.0M- 514e6a882f6e
 d13f02d1bb72
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Error CephMgrPrometheusModuleInactive

2022-06-01 Thread farhad kh
I have an error in the Ceph dashboard:
--
CephMgrPrometheusModuleInactive
description
The mgr/prometheus module at opcpmfpskup0101.p.fnst.10.in-addr.arpa:9283 is
unreachable. This could mean that the module has been disabled or the mgr
itself is down. Without the mgr/prometheus module metrics and alerts will
no longer function. Open a shell to ceph and use 'ceph -s' to to determine
whether the mgr is active. If the mgr is not active, restart it, otherwise
you can check the mgr/prometheus module is loaded with 'ceph mgr module ls'
and if it's not listed as enabled, enable it with 'ceph mgr module enable
prometheus'

and in the mgr container log I see this error:
-
debug 2022-06-01T07:47:13.929+ 7f21d6525700  0 log_channel(cluster) log
[DBG] : pgmap v386352: 1 pgs: 1 active+clean; 0 B data, 16 MiB used, 60 GiB
/ 60 GiB avail
debug 2022-06-01T07:47:14.039+ 7f21c7b08700  0 [progress INFO root]
Processing OSDMap change 29..29
debug 2022-06-01T07:47:15.128+ 7f21a7b36700  0 [dashboard INFO request]
[10.60.161.64:63651] [GET] [200] [0.011s] [admin] [933.0B] /api/summary
debug 2022-06-01T07:47:15.866+ 7f21bdfe2700  0 [prometheus INFO
cherrypy.access.139783044050056] 10.56.0.223 - - [01/Jun/2022:07:47:15]
"GET /metrics HTTP/1.1" 200 101826 "" "Prometheus/2.33.4"
10.56.0.223 - - [01/Jun/2022:07:47:15] "GET /metrics HTTP/1.1" 200 101826
"" "Prometheus/2.33.4"
debug 2022-06-01T07:47:15.928+ 7f21d6525700  0 log_channel(cluster) log
[DBG] : pgmap v386353: 1 pgs: 1 active+clean; 0 B data, 16 MiB used, 60 GiB
/ 60 GiB avail
debug 2022-06-01T07:47:16.126+ 7f21a6333700  0 [dashboard INFO request]
[10.60.161.64:63651] [GET] [200] [0.003s] [admin] [69.0B]
/api/feature_toggles
debug 2022-06-01T07:47:17.129+ 7f21cd313700  0 [progress WARNING root]
complete: ev f9e995f4-d172-465f-a91a-de6e35319717 does not exist
debug 2022-06-01T07:47:17.129+ 7f21cd313700  0 [progress WARNING root]
complete: ev 1bb8e9ee-7403-42ad-96e4-4324ae6d8c15 does not exist
debug 2022-06-01T07:47:17.130+ 7f21cd313700  0 [progress WARNING root]
complete: ev 6b9a0cd9-b185-4c08-ad99-e7fc2f976590 does not exist
debug 2022-06-01T07:47:17.130+ 7f21cd313700  0 [progress WARNING root]
complete: ev d9bffc48-d463-43bf-a25b-7853b2f334a0 does not exist
debug 2022-06-01T07:47:17.130+ 7f21cd313700  0 [progress WARNING root]
complete: ev c5bf893d-2eac-4bb6-994f-cbcf3822c30c does not exist
debug 2022-06-01T07:47:17.131+ 7f21cd313700  0 [progress WARNING root]
complete: ev 43511d64-6636-455e-8df5-bed1aa853f3e does not exist
debug 2022-06-01T07:47:17.131+ 7f21cd313700  0 [progress WARNING root]
complete: ev 857aabc5-e61b-4a76-90b2-62631bfeba00 does not exist


10.56.0.221 - - [01/Jun/2022:07:47:00] "GET /metrics HTTP/1.1" 200 101830
"" "Prometheus/2.33.4"
debug 2022-06-01T07:47:01.632+ 7f21a7b36700  0 [dashboard ERROR
exception] Internal Server Error
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/cherrypy/lib/static.py", line 58, in
serve_file
st = os.stat(path)
FileNotFoundError: [Errno 2] No such file or directory:
'/usr/share/ceph/mgr/dashboard/frontend/dist/en-US/prometheus_receiver'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/dashboard/services/exception.py", line 47, in
dashboard_exception_handler
return handler(*args, **kwargs)
  File "/lib/python3.6/site-packages/cherrypy/_cpdispatch.py", line 54, in
__call__
return self.callable(*self.args, **self.kwargs)
  File "/usr/share/ceph/mgr/dashboard/controllers/home.py", line 135, in
__call__
return serve_file(full_path)
  File "/lib/python3.6/site-packages/cherrypy/lib/static.py", line 65, in
serve_file
raise cherrypy.NotFound()

but my cluster shows that everything is OK:

#ceph -s
  cluster:
id: 868c3ad2-da76-11ec-b977-005056aa7589
health: HEALTH_OK

  services:
mon: 3 daemons, quorum opcpmfpskup0105,opcpmfpskup0101,opcpmfpskup0103
(age 38m)
mgr: opcpmfpskup0105.mureyk(active, since 8d), standbys:
opcpmfpskup0101.uvkngk
osd: 3 osds: 3 up (since 38m), 3 in (since 84m)

  data:
pools:   1 pools, 1 pgs
objects: 0 objects, 0 B
usage:   16 MiB used, 60 GiB / 60 GiB avail
pgs: 1 active+clean

Can anyone explain this?
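
A hedged sketch of things one might check here, given the alert and the
dashboard traceback above (restarting the dashboard module is only one
possible way to clear stale state, not a confirmed fix):

ceph mgr services                  # the URLs the active mgr advertises for dashboard and prometheus
ceph mgr module ls                 # confirm prometheus appears among the enabled modules
ceph mgr module disable dashboard
ceph mgr module enable dashboard   # reload the dashboard module on the active mgr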
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] unknown object

2022-06-06 Thread farhad kh
I deleted all objects in my bucket, but the used capacity is not zero.
When I list the objects in the pool with `rados -p default.rgw.buckets.data ls`,
it shows me a lot of objects:

2ee2e53d-bad4-4857-8bea-36eb52a83f34.5263789.1__shadow_1/16Q91ZUY34EAW9TH.2~zOHhukByW0DKgDIIihOEhtxtW85FO5m.74_1
2ee2e53d-bad4-4857-8bea-36eb52a83f34.5263789.1__shadow_1/PRZEDHF9NSTRGG9G.2~-3Kdywfa6qNjy0j8JaKF8XbwR2e7HPQ.17_1
2ee2e53d-bad4-4857-8bea-36eb52a83f34.5263789.1__shadow_1/YZN8L9MDGZRTAO3F.2~ygKOynlKPsHC23k53N3MtsybuIJgpZa.92_1
2ee2e53d-bad4-4857-8bea-36eb52a83f34.5263789.1__multipart_1/A9JR4TZHBU5EITOV.2~3i2aUR5RIVEHlnZuyAVLf_eSzlziTtq.99
2ee2e53d-bad4-4857-8bea-36eb52a83f34.5263789.1__multipart_1/YZN8L9MDGZRTAO3F.2~ygKOynlKPsHC23k53N3MtsybuIJgpZa.58
2ee2e53d-bad4-4857-8bea-36eb52a83f34.5263789.1__shadow_1/ODVRPVIRSCIQBKRD.2~4ipoaspJ-8RdWU8R6GC9DT4cOOdwBGl.80_1
2ee2e53d-bad4-4857-8bea-36eb52a83f34.5263789.1__shadow_1/1IG3IWQUTAKWW6MI.2~MuhYMb1HsKBU73ZOC7Xpb7ZBHQ_1qrK.41_1

How are these objects created, and how can I delete them even though they
no longer appear in the bucket listing?
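
For what it is worth, a hedged sketch: the __multipart_ and __shadow_
objects above are typically left over from multipart uploads and from RGW's
delayed garbage collection, so one would usually drive GC and abort
incomplete multipart uploads through RGW rather than deleting RADOS objects
directly (the bucket name below is a placeholder):

radosgw-admin gc list                          # objects currently queued for garbage collection
radosgw-admin gc process --include-all         # process the GC queue now instead of waiting
radosgw-admin bucket stats --bucket=my-bucket  # compare logical bucket usage with pool usage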
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Degraded data redundancy: 32 pgs undersized

2022-06-12 Thread farhad kh
I upgraded my cluster to 17.2, the upgrade process got stuck, and I now
have these errors:
[root@ceph2-node-01 ~]# ceph -s
  cluster:
id: 151b48f2-fa98-11eb-b7c4-000c29fa2c84
health: HEALTH_WARN
Reduced data availability: 32 pgs inactive
Degraded data redundancy: 32 pgs undersized

  services:
mon: 3 daemons, quorum ceph2-node-03,ceph2-node-02,ceph2-node-01 (age
4h)
mgr: ceph2-node-02.mjagnd(active, since 11h), standbys:
ceph2-node-01.hgrjgo
osd: 12 osds: 12 up (since 43m), 12 in (since 21h)

  data:
pools:   1 pools, 32 pgs
objects: 0 objects, 0 B
usage:   434 MiB used, 180 GiB / 180 GiB avail
pgs: 100.000% pgs not active
 32 undersized+peered

  progress:
Upgrade to quay.io/ceph/ceph:v17.2.0 (0s)
  []
Global Recovery Event (0s)
  []

root@ceph2-node-01 ~]# ceph health detail
HEALTH_WARN Reduced data availability: 32 pgs inactive; Degraded data
redundancy: 32 pgs undersized
[WRN] PG_AVAILABILITY: Reduced data availability: 32 pgs inactive
pg 58.0 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.1 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.2 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.3 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.4 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.5 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.6 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.7 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.8 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.9 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.a is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.b is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.c is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.d is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.e is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.f is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.10 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.11 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.12 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.13 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.14 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.15 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.16 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.17 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.18 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.19 is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.1a is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.1b is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.1c is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.1d is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.1e is stuck inactive for 17h, current state undersized+peered,
last acting [1]
pg 58.1f is stuck inactive for 17h, current state undersized+peered,
last acting [1]
[WRN] PG_DEGRADED: Degraded data redundancy: 32 pgs undersized
pg 58.0 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.1 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.2 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.3 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.4 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.5 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.6 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.7 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.8 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.9 is stuck undersized for 48m, current state undersized+peered,
last acting [1]
pg 58.a is stuck undersized for 48m, current state undersized+peered,
l
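
A hedged sketch of where one might look, given that every PG in pool 58
reports last acting [1] (the pool id and osd come from the output above):

ceph osd pool ls detail   # which crush rule and replica size pool 58 uses
ceph osd tree             # whether enough OSDs/hosts are up to satisfy that rule
ceph pg 58.0 query        # why peering cannot find additional OSDs
ceph osd crush rule dump  # inspect the rule the pool maps to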

[ceph-users] lifecycle config minimum time

2022-06-21 Thread farhad kh
I want to set a lifecycle (LC) rule for incomplete multipart uploads, but
I cannot find any documentation that allows minutes or hours for the time.
How can I set an LC time of less than a day?
 
<LifecycleConfiguration>
  <Rule>
    <ID>Abort incomplete multipart upload after 1 day</ID>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
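
For test environments only, a hedged sketch: the RGW lifecycle thread works
in whole days, but the rgw_lc_debug_interval option can scale an LC "day"
down to a number of seconds (the 600 below is just an example value, and
the rgw service name is a placeholder):

ceph config set client.rgw rgw_lc_debug_interval 600   # treat one LC day as 600 seconds (testing only)
ceph orch restart rgw.<service-name>                   # restart the RGW daemons so the option takes effect
radosgw-admin lc list                                  # check per-bucket lifecycle processing status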


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] use ceph rbd for windows cluster "scsi-3 persistent reservation"

2022-06-22 Thread farhad kh
I need a block storage disk that is shared between two Windows servers.
The servers are active/standby: only one server can write at a time, but
both servers can read the created files, and if the first server shuts
down, the second server can edit the files or create new ones.

I used block storage to give both machines a shared disk, but the files
written by the first machine are not seen by the second machine, and the
only workaround is to reload the iSCSI connection on the second machine.
To handle this I tried running MSCS to use Windows cluster storage, but
the validation test fails with the error "Test Disk 1 does not provide
Persistent Reservations support for the mechanisms used by failover
clusters. Some storage devices require specific firmware versions or
settings to function properly with failover clusters. Please contact your
storage administrator or storage vendor to check the configuration of the
storage to allow it to function properly with failover clusters".

How can I resolve this?
Is there a solution to this issue?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] remove osd in crush

2022-08-27 Thread farhad kh
I removed the OSDs from the CRUSH map, but they still appear in 'ceph osd tree':

[root@ceph2-node-01 ~]# ceph osd tree
ID   CLASS  WEIGHTTYPE NAME   STATUS  REWEIGHT
 PRI-AFF
 -1 20.03859  root default
-20 20.03859  datacenter dc-1
-21 20.03859  room server-room-1
-22 10.0  rack rack-1
 -3  1.0  host ceph2-node-01
-23 10.0  rack rack-2
 -5  1.0  host ceph2-node-02
-24 10.0  rack rack-3
 -7  1.0  host ceph2-node-03
  10  osd.1 down 0
 1.0
  90  osd.9 down   1.0
 1.0
 120  osd.12down   1.0
 1.0

but they are not in the CRUSH tree:

[root@ceph2-node-01 ~]# ceph osd crush tree
ID   CLASS  WEIGHTTYPE NAME
 -1 20.03859  root default
-20 20.03859  datacenter dc-1
-21 20.03859  room server-room-1
-22 10.0  rack rack-1
 -3  1.0  host ceph2-node-01
-23 10.0  rack rack-2
 -5  1.0  host ceph2-node-02
-24 10.0  rack rack-3
 -7  1.0  host ceph2-node-03

How can I resolve this?
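
A hedged sketch of how one might clear the stale entries, assuming the OSDs
are really gone and their data is not needed (the ids 1, 9 and 12 come from
the tree above; `ceph osd purge` removes an OSD from the CRUSH map, the OSD
map and the auth database in one step):

ceph osd purge 1 --yes-i-really-mean-it
ceph osd purge 9 --yes-i-really-mean-it
ceph osd purge 12 --yes-i-really-mean-it
ceph osd tree   # the down entries should no longer be listed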
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io