[ceph-users] HBA or RAID-0 + BBU
Good evening everyone! A question about the operation mode of the P420 RAID controller: which is better for Ceph, HBA (pass-through) mode, or RAID-0 with the battery-backed write cache (BBU) active? Thanks in advance! ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
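For context, both modes are switched with HP's `ssacli` tool. A hedged sketch only: the slot number is a placeholder, and the exact flag spellings (`hbamode`, `drivewritecache`) are from memory of the ssacli/hpssacli syntax, so verify them against your controller firmware's documentation before running anything.

```shell
# List controllers and their cache/battery status first (slot=0 is an assumption).
ssacli ctrl all show status
ssacli ctrl slot=0 show config detail

# Option 1: pass-through (HBA) mode, where the firmware supports it:
ssacli ctrl slot=0 modify hbamode=on

# Option 2: stay in RAID mode, keeping the BBU-protected controller cache,
# but make sure the volatile on-disk write cache stays off:
ssacli ctrl slot=0 modify drivewritecache=disable
```

In RAID-0 mode each physical disk is typically exposed as its own single-drive logical volume so Ceph still sees one OSD device per disk.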
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
Ah, yes indeed I had disabled log-to-stderr in cluster wide config. root@zephir:~# rbd -p rbd snap create ceph-dev@backup --id admin --debug-ms 1 --debug-rbd 20 --log-to-stderr=true >/home/rgysi/log.txt 2>&1 root@zephir:~# Here's the log.txt Am Di., 18. Apr. 2023 um 18:36 Uhr schrieb Ilya Dryomov : > On Tue, Apr 18, 2023 at 5:45 PM Reto Gysi wrote: > > > > Hi Ilya > > > > Sure. > > > > root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1 > --debug-rbd 20 >/home/rgysi/log.txt 2>&1 > > You probably have custom log settings in the cluster-wide config. Please > append "--log-to-stderr true" and try again. > > Thanks, > > Ilya > 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 Processor -- start 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 -- start start 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 --2- >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 0x55637d589520 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 --2- >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 0x55637d5967c0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 --2- >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55637d596d00 0x55637d5990e0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 -- --> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 -- 0x55637d478b70 con 0x55637d589150 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 -- --> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 -- 0x55637d4721b0 con 0x55637d589a60 2023-04-18T23:25:42.707+0200 7f4a8d70f4c0 1 -- --> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1 -- 0x55637d40e680 con 0x55637d596d00 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 --2- >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 0x55637d589520 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0 2023-04-18T23:25:42.707+0200 7f4a8ae3d700 1 --2- >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 0x55637d5967c0 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 --2- >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 0x55637d589520 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300/0 says I am v2:192.168.1.1:49004/0 (socket says 192.168.1.1:49004) 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 -- 192.168.1.1:0/3741665115 learned_addr learned my addr 192.168.1.1:0/3741665115 (peer_addr_for_me v2:192.168.1.1:0/0) 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 -- 192.168.1.1:0/3741665115 >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55637d596d00 msgr2=0x55637d5990e0 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 --2- 192.168.1.1:0/3741665115 >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55637d596d00 0x55637d5990e0 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 -- 192.168.1.1:0/3741665115 >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 msgr2=0x55637d5967c0 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 --2- 192.168.1.1:0/3741665115 >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 0x55637d5967c0 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 -- 192.168.1.1:0/3741665115 --> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_subscribe({config=0+,monmap=0+}) v3 -- 0x55637d4197f0 con 0x55637d589150 2023-04-18T23:25:42.707+0200 7f4a8ae3d700 1 --2- 192.168.1.1:0/3741665115 >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 0x55637d5967c0 unknown :-1 s=CLOSED pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_auth_done state changed! 2023-04-18T23:25:42.707+0200 7f4a8b63e700 1 --2- 192.168.1.1:0/3741665115 >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 0x55637d589520 secure :-1 s=READY pgs=939 cs=0 l=1 rev1=1 crypto rx=0x7f4a7c00a700 tx=0x7f4a7c005b10 comp rx=0 tx=0).ready entity=mon.0 client_cookie=b76bfda048033b3b server_cookie=0 in_seq=0 out_seq=0 2023-04-18T23:25:42.707+0200 7f4a8a63c700 1 -- 192.168.1.1:0/3741665115 <== mon.0 v2:192.168.1.1:3300/0 1 mon_map magic: 0 v1 467+0+0 (secure 0 0 0) 0x7f4a7c0089b0 con 0x55637d589150
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
Hi Eugen Yes, I used the default setting of rbd_default_pool='rbd'. I don't have anything set for default_data_pool. root@zephir:~# ceph config show-with-defaults mon.zephir | grep -E "default(_data)*_pool" osd_default_data_pool_replay_window 45 default rbd_default_data_pool default rbd_default_poolrbd default root@zephir:~# If I don't specify a data-pool during 'ceph create ' it will create the image with pool 'rbd' and without a separate data pool. pool 'rbd' is a replica 3 pool. adding '-p rbd' to the snap create command doesn't change/fix the error: root@zephir:~# rbd -p rbd snap create ceph-dev@backup --id admin --debug-ms 1 --debug-rbd 20 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 Processor -- start 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 -- start start 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 --2- >> [v2: 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x56151b58f2b0 0x56151b58f680 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 --2- >> [v2: 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50 0x56151b598320 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 --2- >> [v2: 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x56151b598860 0x56151b59ac40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 -- --> [v2: 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 -- 0x56151b47cb70 con 0x56151b58fc50 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 -- --> [v2: 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 -- 0x56151b4761b0 con 0x56151b58f2b0 2023-04-18T19:25:23.002+0200 7f1a036ff4c0 1 -- --> [v2: 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1 -- 0x56151b412680 con 0x56151b598860 2023-04-18T19:25:23.002+0200 7f19f8d43700 1 --2- >> [v2: 
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50 0x56151b598320 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supporte d=3 required=0 2023-04-18T19:25:23.002+0200 7f1a01544700 1 --2- >> [v2: 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x56151b58f2b0 0x56151b58f680 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload suppor ted=3 required=0 2023-04-18T19:25:23.002+0200 7f19f8d43700 1 --2- >> [v2: 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50 0x56151b598320 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300 /0 says I am v2:192.168.1.1:35346/0 (socket says 192.168.1.1:35346) 2023-04-18T19:25:23.002+0200 7f19f8d43700 1 -- 192.168.1.1:0/2631157109 learned_addr learned my addr 192.168.1.1:0/2631157109 (peer_addr_for_me v2: 192.168.1.1:0/0) 2023-04-18T19:25:23.002+0200 7f1a01544700 1 -- 192.168.1.1:0/2631157109 >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x56151b598860 msgr2=0x56151b59ac40 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down 2023-04-18T19:25:23.002+0200 7f1a01544700 1 --2- 192.168.1.1:0/2631157109 >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x56151b598860 0x56151b59ac40 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-18T19:25:23.002+0200 7f1a01544700 1 -- 192.168.1.1:0/2631157109 >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50 msgr2=0x56151b598320 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down 2023-04-18T19:25:23.002+0200 7f1a01544700 1 --2- 192.168.1.1:0/2631157109 >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50 0x56151b598320 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-18T19:25:23.002+0200 7f1a01544700 1 -- 192.168.1.1:0/2631157109 --> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_subscribe({config=0+,monmap=0+}) v3 -- 0x56151b41d7f0 con 0x56151b58f2b0 2023-04-18T19:25:23.002+0200 7f19f8d43700 1 --2- 192.168.1.1:0/2631157109 >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50 0x56151b598320 unknown :-1 s=CLOSED pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_auth_done state changed! 2023-04-18T19:25:23.002+0200 7f1a01544700 1 --2- 192.168.1.1:0/2631157109 >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x56151b58f2b0 0x56151b58f680 secure :-1 s=READY pgs=214 cs=0 l=1 rev1=1 crypto rx=0x7f19f400a5d0 tx=0x7f19f4005d40 comp rx=0 t x=0).ready entity=mon.1 client_cookie=a9059b943a3e6f58
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
You don't seem to specify a pool name in the snap create command. Does your rbd_default_pool match the desired pool? And does rbd_default_data_pool match what you expect (if those values are even set)? I've never used custom values for those configs, but if you don't specify a pool name, Ceph assumes the default pool name "rbd". At least that's how I understand it.

Quoting Reto Gysi:

Hi Ilya Sure. root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1 --debug-rbd 20 >/home/rgysi/log.txt 2>&1 root@zephir:~#

On Tue, 18 Apr 2023 at 16:19, Ilya Dryomov wrote: On Tue, Apr 18, 2023 at 3:21 PM Reto Gysi wrote: > > Hi, > > Yes both snap create commands were executed as user admin: > client.admin >caps: [mds] allow * >caps: [mgr] allow * >caps: [mon] allow * >caps: [osd] allow * > > deep scrubbing+repair of ecpool_hdd is still ongoing, but so far the > problem still exists Hi Reto, Deep scrubbing is unlikely to help with an "Operation not supported" error. I really doubt that the output that you attached in one of the previous emails is all that is logged. Even in the successful case, there is not a single RBD-related debug log. I would suggest repeating the test with an explicit redirection and attaching the file itself. Thanks, Ilya
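The defaults in question can be inspected and bypassed from the CLI. A hedged sketch, reusing the image/snapshot names from this thread; the `ceph config get` form assumes a reasonably recent (Octopus+) cluster:

```shell
# Show the effective defaults the rbd CLI will fall back to.
ceph config get client rbd_default_pool
ceph config get client rbd_default_data_pool

# Name the pool explicitly instead of relying on rbd_default_pool,
# either with -p or with the pool/image@snap spec:
rbd -p rbd snap create ceph-dev@backup
rbd snap create rbd/ceph-dev@backup
```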
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
On Tue, Apr 18, 2023 at 5:45 PM Reto Gysi wrote: > > Hi Ilya > > Sure. > > root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1 > --debug-rbd 20 >/home/rgysi/log.txt 2>&1 You probably have custom log settings in the cluster-wide config. Please append "--log-to-stderr true" and try again. Thanks, Ilya
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
Hi Ilya Sure. root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1 --debug-rbd 20 >/home/rgysi/log.txt 2>&1 root@zephir:~# Am Di., 18. Apr. 2023 um 16:19 Uhr schrieb Ilya Dryomov : > On Tue, Apr 18, 2023 at 3:21 PM Reto Gysi wrote: > > > > Hi, > > > > Yes both snap create commands were executed as user admin: > > client.admin > >caps: [mds] allow * > >caps: [mgr] allow * > >caps: [mon] allow * > >caps: [osd] allow * > > > > deep scrubbing+repair of ecpool_hdd is still ongoing, but so far the > > problem still exists > > Hi Reto, > > Deep scrubbing is unlikely to help with a "Operation not supported" > error. > > I really doubt that the output that you attached in one of the previous > emails is all that is logged. Even in the successful case, there is not > a single RBD-related debug log. I would suggest repeating the test > with an explicit redirection and attaching the file itself. > > Thanks, > > Ilya > 2023-04-18T17:37:03.454+0200 7f99393084c0 1 Processor -- start 2023-04-18T17:37:03.454+0200 7f99393084c0 1 -- start start 2023-04-18T17:37:03.454+0200 7f99393084c0 1 --2- >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55ec6059b210 0x55ec6059b5e0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T17:37:03.454+0200 7f99393084c0 1 --2- >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 0x55ec605a4280 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T17:37:03.454+0200 7f99393084c0 1 --2- >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55ec605a47c0 0x55ec605a6ba0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-18T17:37:03.454+0200 7f99393084c0 1 -- --> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 -- 0x55ec60488b70 con 0x55ec6059bbb0 2023-04-18T17:37:03.454+0200 7f99393084c0 1 -- --> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 
-- 0x55ec604821b0 con 0x55ec6059b210 2023-04-18T17:37:03.454+0200 7f99393084c0 1 -- --> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1 -- 0x55ec6041e680 con 0x55ec605a47c0 2023-04-18T17:37:03.454+0200 7f9936aaf700 1 --2- >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 0x55ec605a4280 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0 2023-04-18T17:37:03.454+0200 7f99372b0700 1 --2- >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55ec6059b210 0x55ec6059b5e0 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0 2023-04-18T17:37:03.454+0200 7f99372b0700 1 --2- >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55ec6059b210 0x55ec6059b5e0 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.10:3300/0 says I am v2:192.168.1.1:58756/0 (socket says 192.168.1.1:58756) 2023-04-18T17:37:03.454+0200 7f9936aaf700 1 --2- >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 0x55ec605a4280 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300/0 says I am v2:192.168.1.1:51640/0 (socket says 192.168.1.1:51640) 2023-04-18T17:37:03.454+0200 7f99372b0700 1 -- 192.168.1.1:0/229580484 learned_addr learned my addr 192.168.1.1:0/229580484 (peer_addr_for_me v2:192.168.1.1:0/0) 2023-04-18T17:37:03.454+0200 7f99372b0700 1 -- 192.168.1.1:0/229580484 >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55ec605a47c0 msgr2=0x55ec605a6ba0 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down 2023-04-18T17:37:03.454+0200 7f99372b0700 1 --2- 192.168.1.1:0/229580484 >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55ec605a47c0 0x55ec605a6ba0 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto 
rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-18T17:37:03.454+0200 7f99372b0700 1 -- 192.168.1.1:0/229580484 >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 msgr2=0x55ec605a4280 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down 2023-04-18T17:37:03.454+0200 7f99372b0700 1 --2- 192.168.1.1:0/229580484 >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 0x55ec605a4280 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-18T17:37:03.454+0200 7f99372b0700 1 -- 192.168.1.1:0/229580484 --> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_subscribe({config=0+,monmap=0+}) v3 -- 0x55ec60435d60 con 0x55ec6059b210 2023-04-18T17:37:03.454+0200 7f9936aaf700 1 --2- 192.168.1.1:0/229580484 >> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 0x55ec605a4280 unknown :-1
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
On Tue, Apr 18, 2023 at 3:21 PM Reto Gysi wrote: > > Hi, > > Yes both snap create commands were executed as user admin: > client.admin >caps: [mds] allow * >caps: [mgr] allow * >caps: [mon] allow * >caps: [osd] allow * > > deep scrubbing+repair of ecpool_hdd is still ongoing, but so far the > problem still exists Hi Reto, Deep scrubbing is unlikely to help with a "Operation not supported" error. I really doubt that the output that you attached in one of the previous emails is all that is logged. Even in the successful case, there is not a single RBD-related debug log. I would suggest repeating the test with an explicit redirection and attaching the file itself. Thanks, Ilya
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
Hi, Yes both snap create commands were executed as user admin: client.admin caps: [mds] allow * caps: [mgr] allow * caps: [mon] allow * caps: [osd] allow * deep scrubbing+repair of ecpool_hdd is still ongoing, but so far the problem still exists Am Di., 18. Apr. 2023 um 13:43 Uhr schrieb Eugen Block : > Hi, > > > In the meantime I did some further test. I've created a new erasure coded > > datapool 'ecpool_test' and if I create a new rbd image with this data > pool > > I can create snapshots, but I can't create snapshots on both new and > > existing images with existing data pool 'ecpool_hdd' > > just one thought, could this be a caps mismatch? Is it the same user > in those two pools who creates snaps (or tries to)? If those are > different users could you share the auth caps? > > Zitat von Reto Gysi : > > > That was all that it logged. > > In the meantime I did some further test. I've created a new erasure coded > > datapool 'ecpool_test' and if I create a new rbd image with this data > pool > > I can create snapshots, but I can't create snapshots on both new and > > existing images with existing data pool 'ecpool_hdd' > > > > #create new image on existing erasure code data-pool ecpool_hdd > > root@zephir:~# rbd create -p rbd --data-pool ecpool_hdd test_ecpool_hdd > > --size 10G > > > > #create new image on new erasure code data-pool ecpool_test > > root@zephir:~# rbd create -p rbd --data-pool ecpool_test > test_ecpool_test > > --size 10G > > > > # trying to create snap-shot of image with data pool ecpool_hdd -> fails > > root@zephir:~# rbd snap create test_ecpool_hdd@backup --debug-ms 1 > > --debug-rbd 20 > > 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 Processor -- start > > 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- start start > > 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 --2- >> [v2: > > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0 > > 0x562a3dd35490 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 > > comp rx=0 tx=0).connect > 
> 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 --2- >> [v2: > > 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0 > > 0x562a3dd42730 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 > > comp rx=0 tx=0).connect > > 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 --2- >> [v2: > > 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70 > > 0x562a3dd45050 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 > > comp rx=0 tx=0).connect > > 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- --> [v2: > > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 -- > > 0x562a3dc24b70 con 0x562a3dd350c0 > > 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- --> [v2: > > 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 -- > > 0x562a3dc1e1b0 con 0x562a3dd359d0 > > 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- --> [v2: > > 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 > v1 > > -- 0x562a3dbba680 con 0x562a3dd42c70 > > 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- >> [v2: > > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0 > > 0x562a3dd35490 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 > crypto > > rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supporte > > d=3 required=0 > > 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- >> [v2: > > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0 > > 0x562a3dd35490 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 > crypto > > rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300 > > /0 says I am v2:192.168.1.1:43298/0 (socket says 192.168.1.1:43298) > > 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 -- 192.168.1.1:0/3758799544 > > learned_addr learned my addr 192.168.1.1:0/3758799544 (peer_addr_for_me > v2: > > 192.168.1.1:0/0) > > 2023-04-17T21:52:43.623+0200 7f3f75e8c700 1 --2- > 192.168.1.1:0/3758799544 > >>> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0 > > 0x562a3dd42730 
unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 > crypto > > rx=0 tx=0 comp rx=0 tx=0)._handle_pe > > er_banner_payload supported=3 required=0 > > 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 -- 192.168.1.1:0/3758799544 > >> > > [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70 > > msgr2=0x562a3dd45050 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down > > 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- > 192.168.1.1:0/3758799544 > >>> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] > conn(0x562a3dd42c70 > > 0x562a3dd45050 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto > > rx=0 tx=0 comp rx=0 tx=0).stop > > 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 -- 192.168.1.1:0/3758799544 > >> > > [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0 > > msgr2=0x562a3dd42730 unknown :-1 s=STATE_CONNECTION_ESTABLISHED > > l=0).mark_down > > 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- >
[ceph-users] Re: ceph pg stuck - missing on 1 osd how to proceed
You may want to consider disabling deep scrubs and scrubs while attempting to complete a backfill operation.

On Tue, Apr 18, 2023, at 01:46, Eugen Block wrote: > I didn't mean you should split your PGs now, that won't help because > there is already backfilling going on. I would revert the pg_num > changes (since nothing actually happened yet there's no big risk) and > wait for the backfill to finish. You don't seem to have inactive PGs > so it shouldn't be an issue as long as nothing else breaks down. Do > you see progress of the backfilling? Do the numbers of misplaced > objects change? > > Quoting xadhoo...@gmail.com: > >> Thanks, I try to change the pg and pgp number to an higher value but >> pg do not increase >> ta: >> pools: 8 pools, 1085 pgs >> objects: 242.28M objects, 177 TiB >> usage: 553 TiB used, 521 TiB / 1.0 PiB avail >> pgs: 635281/726849381 objects degraded (0.087%) >> 91498351/726849381 objects misplaced (12.588%) >> 773 active+clean >> 288 active+remapped+backfilling >> 11 active+clean+scrubbing+deep >> 10 active+clean+scrubbing >> 3 active+undersized+degraded+remapped+backfilling >> >> >> still have those 3 pg in stuck
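The suggestion above maps onto two standard cluster flags; these flags only prevent new scrubs from starting, so PGs already scrubbing will finish first:

```shell
# Pause scheduling of new scrubs so backfill gets the disk I/O.
ceph osd set noscrub
ceph osd set nodeep-scrub

# ...wait for backfill to finish (watch "ceph -s"), then re-enable:
ceph osd unset noscrub
ceph osd unset nodeep-scrub
```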
[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool
Hi, In the meantime I did some further test. I've created a new erasure coded datapool 'ecpool_test' and if I create a new rbd image with this data pool I can create snapshots, but I can't create snapshots on both new and existing images with existing data pool 'ecpool_hdd' just one thought, could this be a caps mismatch? Is it the same user in those two pools who creates snaps (or tries to)? If those are different users could you share the auth caps? Zitat von Reto Gysi : That was all that it logged. In the meantime I did some further test. I've created a new erasure coded datapool 'ecpool_test' and if I create a new rbd image with this data pool I can create snapshots, but I can't create snapshots on both new and existing images with existing data pool 'ecpool_hdd' #create new image on existing erasure code data-pool ecpool_hdd root@zephir:~# rbd create -p rbd --data-pool ecpool_hdd test_ecpool_hdd --size 10G #create new image on new erasure code data-pool ecpool_test root@zephir:~# rbd create -p rbd --data-pool ecpool_test test_ecpool_test --size 10G # trying to create snap-shot of image with data pool ecpool_hdd -> fails root@zephir:~# rbd snap create test_ecpool_hdd@backup --debug-ms 1 --debug-rbd 20 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 Processor -- start 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- start start 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 --2- >> [v2: 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0 0x562a3dd35490 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 --2- >> [v2: 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0 0x562a3dd42730 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).connect 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 --2- >> [v2: 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70 0x562a3dd45050 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 
tx=0).connect 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- --> [v2: 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 -- 0x562a3dc24b70 con 0x562a3dd350c0 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- --> [v2: 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 -- 0x562a3dc1e1b0 con 0x562a3dd359d0 2023-04-17T21:52:43.623+0200 7f3f787174c0 1 -- --> [v2: 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1 -- 0x562a3dbba680 con 0x562a3dd42c70 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- >> [v2: 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0 0x562a3dd35490 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supporte d=3 required=0 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- >> [v2: 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0 0x562a3dd35490 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300 /0 says I am v2:192.168.1.1:43298/0 (socket says 192.168.1.1:43298) 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 -- 192.168.1.1:0/3758799544 learned_addr learned my addr 192.168.1.1:0/3758799544 (peer_addr_for_me v2: 192.168.1.1:0/0) 2023-04-17T21:52:43.623+0200 7f3f75e8c700 1 --2- 192.168.1.1:0/3758799544 [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0 0x562a3dd42730 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_pe er_banner_payload supported=3 required=0 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 -- 192.168.1.1:0/3758799544 >> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70 msgr2=0x562a3dd45050 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- 192.168.1.1:0/3758799544 [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70 0x562a3dd45050 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto 
rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 -- 192.168.1.1:0/3758799544 >> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0 msgr2=0x562a3dd42730 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- 192.168.1.1:0/3758799544 [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0 0x562a3dd42730 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).stop 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 -- 192.168.1.1:0/3758799544 --> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_subscribe({config=0+,monmap=0+}) v3 -- 0x562a3dbd1d60 con 0x562a3dd350c0 2023-04-17T21:52:43.623+0200 7f3f7668d700 1 --2- 192.168.1.1:0/3758799544 [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0 0x562a3dd35490 secure :-1 s=READY pgs=511 cs=0 l=1 rev1=1 crypto rx=0x7f3f6800a700 tx=0x7f3f68005b10 comp rx=0 tx=
[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services
Yes, thanks Robert. After installing ceph-common, the mount is working fine.

On Tue, Apr 18, 2023 at 2:10 PM Robert Sander wrote: > On 18.04.23 06:12, Lokendra Rathour wrote: > > > but if I try mounting from a normal Linux machine with connectivity > > enabled between Ceph mon nodes, it gives the error as stated before. > > Have you installed ceph-common on the "normal Linux machine"? > > Regards > -- > Robert Sander > Heinlein Support GmbH > Linux: Akademie - Support - Hosting > http://www.heinlein-support.de > > Tel: 030-405051-43 > Fax: 030-405051-19 > > Zwangsangaben lt. §35a GmbHG: > HRB 93818 B / Amtsgericht Berlin-Charlottenburg, > Geschäftsführer: Peer Heinlein -- Sitz: Berlin > -- ~ Lokendra skype: lokendrarathour
[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services
On 18.04.23 06:12, Lokendra Rathour wrote: but if I try mounting from a normal Linux machine with connectivity enabled between Ceph mon nodes, it gives the error as stated before. Have you installed ceph-common on the "normal Linux machine"? Regards -- Robert Sander Heinlein Support GmbH Linux: Akademie - Support - Hosting http://www.heinlein-support.de Tel: 030-405051-43 Fax: 030-405051-19 Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin
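For completeness, the client-side steps look roughly like this. A hedged sketch: the MON FQDNs, keyring path and Debian-style package manager are placeholders; the mount syntax is the classic kernel-client form.

```shell
# ceph-common provides mount.ceph plus the ceph/rbd CLIs on the client.
apt install ceph-common

# Kernel CephFS mount, addressing the MONs by FQDN:
mount -t ceph mon1.example.com:6789,mon2.example.com:6789:/ /mnt/cephfs \
      -o name=admin,secretfile=/etc/ceph/admin.secret
```

Without mount.ceph from ceph-common, a plain `mount -t ceph` on a stock machine typically fails with an unhelpful error, which matches the symptom in this thread.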
[ceph-users] Consequence of maintaining hundreds of clones of a single RBD image snapshot
Hello, My use-case involves creating hundreds of clones (~1,000) of a single RBD image snapshot. I assume watchers exist for each clone, due to the copy-on-write nature of clones. Should I expect a penalty for maintaining such a large number of clones (CPU, memory, performance)? If such a penalty does exist, we might opt to flatten some of the clones. Is consistency guaranteed during the flattening process? In other words, can I write to a clone while it is being flattened? Perspectivus
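The workflow being described is the standard snapshot/clone/flatten cycle. A sketch with hypothetical pool and image names; flattening copies the parent's data into the clone and detaches it, and it is an online operation:

```shell
# Create a protected snapshot of a "golden" image and clone it
# (protection is required before cloning on clusters without clone v2).
rbd snap create rbd/golden@base
rbd snap protect rbd/golden@base
rbd clone rbd/golden@base rbd/vm-0042

# Later: detach the clone from its parent, trading parent dependency
# for a full copy of the data.
rbd flatten rbd/vm-0042
```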
[ceph-users] Re: ceph pg stuck - missing on 1 osd how to proceed
I didn't mean you should split your PGs now, that won't help because there is already backfilling going on. I would revert the pg_num changes (since nothing actually happened yet there's no big risk) and wait for the backfill to finish. You don't seem to have inactive PGs so it shouldn't be an issue as long as nothing else breaks down. Do you see progress of the backfilling? Do the numbers of misplaced objects change?

Quoting xadhoo...@gmail.com:

Thanks, I tried to change the pg and pgp numbers to a higher value but the pg count does not increase:

data:
pools: 8 pools, 1085 pgs
objects: 242.28M objects, 177 TiB
usage: 553 TiB used, 521 TiB / 1.0 PiB avail
pgs: 635281/726849381 objects degraded (0.087%)
91498351/726849381 objects misplaced (12.588%)
773 active+clean
288 active+remapped+backfilling
11 active+clean+scrubbing+deep
10 active+clean+scrubbing
3 active+undersized+degraded+remapped+backfilling

still have those 3 pgs stuck
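The percentages in the pasted status are just the object counts divided by the total number of object copies, so tracking the raw misplaced count between `ceph -s` runs is the most direct way to answer Eugen's "do the numbers change?" question. A small Python sketch reproduces the figures above:

```python
# Recompute the degraded/misplaced percentages from the status output.
degraded_objects = 635_281
misplaced_objects = 91_498_351
total_copies = 726_849_381  # object replicas across the cluster

degraded_pct = round(degraded_objects / total_copies * 100, 3)
misplaced_pct = round(misplaced_objects / total_copies * 100, 3)

print(degraded_pct)   # 0.087  -> "objects degraded (0.087%)"
print(misplaced_pct)  # 12.588 -> "objects misplaced (12.588%)"
```

If the misplaced count shrinks on each check, backfill is making progress and the stuck PGs should eventually clear on their own.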