[ceph-users] HBA or RAID-0 + BBU

2023-04-18 Thread Murilo Morais
Good evening everyone!

Guys, I have a question about the operating mode of the P420 RAID controller:
which would be better for the OSD disks, HBA (pass-through) mode or single-disk
RAID-0 volumes with the BBU-backed write cache enabled?
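
For what it's worth, one way to compare the two modes empirically is to measure
single-queue sync write latency on one disk in each configuration, since that is
where a battery-backed write cache usually makes the biggest difference for OSDs.
A minimal sketch with fio (the device path is a placeholder, and the test is
destructive on the target device):

# Run once with the controller in HBA mode and once on a single-disk RAID-0
# volume with the BBU write cache enabled, then compare completion latencies.
fio --name=sync-write-lat --filename=/dev/sdX --direct=1 --sync=1 \
    --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based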

Thanks in advance!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Reto Gysi
Ah, yes, indeed I had disabled log-to-stderr in the cluster-wide config.
root@zephir:~# rbd -p rbd snap create ceph-dev@backup --id admin --debug-ms
1 --debug-rbd 20 --log-to-stderr=true >/home/rgysi/log.txt 2>&1
root@zephir:~#

Here's the log.txt


Am Di., 18. Apr. 2023 um 18:36 Uhr schrieb Ilya Dryomov :

> On Tue, Apr 18, 2023 at 5:45 PM Reto Gysi  wrote:
> >
> > Hi Ilya
> >
> > Sure.
> >
> > root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1
> --debug-rbd 20 >/home/rgysi/log.txt 2>&1
>
> You probably have custom log settings in the cluster-wide config.  Please
> append "--log-to-stderr true" and try again.
>
> Thanks,
>
> Ilya
>
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1  Processor -- start
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1 --  start start
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1 --2-  >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 
0x55637d589520 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp 
rx=0 tx=0).connect
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1 --2-  >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 
0x55637d5967c0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp 
rx=0 tx=0).connect
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1 --2-  >> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55637d596d00 
0x55637d5990e0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp 
rx=0 tx=0).connect
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1 --  --> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 -- 
0x55637d478b70 con 0x55637d589150
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1 --  --> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 -- 
0x55637d4721b0 con 0x55637d589a60
2023-04-18T23:25:42.707+0200 7f4a8d70f4c0  1 --  --> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1 
-- 0x55637d40e680 con 0x55637d596d00
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 --2-  >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 
0x55637d589520 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto 
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-18T23:25:42.707+0200 7f4a8ae3d700  1 --2-  >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 
0x55637d5967c0 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto 
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 --2-  >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 
0x55637d589520 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 
tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300/0 says I am 
v2:192.168.1.1:49004/0 (socket says 192.168.1.1:49004)
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 -- 192.168.1.1:0/3741665115 
learned_addr learned my addr 192.168.1.1:0/3741665115 (peer_addr_for_me 
v2:192.168.1.1:0/0)
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 -- 192.168.1.1:0/3741665115 >> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55637d596d00 
msgr2=0x55637d5990e0 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 --2- 192.168.1.1:0/3741665115 >> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55637d596d00 
0x55637d5990e0 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto rx=0 
tx=0 comp rx=0 tx=0).stop
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 -- 192.168.1.1:0/3741665115 >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 
msgr2=0x55637d5967c0 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 --2- 192.168.1.1:0/3741665115 >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 
0x55637d5967c0 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 
tx=0 comp rx=0 tx=0).stop
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 -- 192.168.1.1:0/3741665115 --> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- 
mon_subscribe({config=0+,monmap=0+}) v3 -- 0x55637d4197f0 con 0x55637d589150
2023-04-18T23:25:42.707+0200 7f4a8ae3d700  1 --2- 192.168.1.1:0/3741665115 >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55637d589a60 
0x55637d5967c0 unknown :-1 s=CLOSED pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0 comp 
rx=0 tx=0).handle_auth_done state changed!
2023-04-18T23:25:42.707+0200 7f4a8b63e700  1 --2- 192.168.1.1:0/3741665115 >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55637d589150 
0x55637d589520 secure :-1 s=READY pgs=939 cs=0 l=1 rev1=1 crypto 
rx=0x7f4a7c00a700 tx=0x7f4a7c005b10 comp rx=0 tx=0).ready entity=mon.0 
client_cookie=b76bfda048033b3b server_cookie=0 in_seq=0 out_seq=0
2023-04-18T23:25:42.707+0200 7f4a8a63c700  1 -- 192.168.1.1:0/3741665115 <== 
mon.0 v2:192.168.1.1:3300/0 1  mon_map magic: 0 v1  467+0+0 (secure 0 0 0) 0x7f4a7c0089b0 con 0x55637d589150

[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Reto Gysi
Hi Eugen

Yes, I used the default setting of rbd_default_pool='rbd'. I don't have
anything set for rbd_default_data_pool.
root@zephir:~# ceph config show-with-defaults mon.zephir | grep -E
"default(_data)*_pool"
osd_default_data_pool_replay_window  45    default
rbd_default_data_pool                      default
rbd_default_pool                     rbd   default
root@zephir:~#

If I don't specify a data pool during 'rbd create', it creates the image in
pool 'rbd' without a separate data pool. Pool 'rbd' is a replica 3 pool.
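
As a side note, the way I bind an image to the EC data pool explicitly, and how
it can be verified afterwards, is roughly the following (the image name here is
just an example; 'rbd info' only prints a 'data_pool:' line when a separate data
pool is attached):

# metadata/omap goes to the replicated pool 'rbd', data objects to the EC pool
rbd create rbd/test-image --size 10G --data-pool ecpool_hdd
# check which data pool the image actually uses
rbd info rbd/test-image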

adding '-p rbd' to the snap create command doesn't change/fix the error:
root@zephir:~# rbd -p rbd snap create ceph-dev@backup --id admin --debug-ms
1 --debug-rbd 20
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1  Processor -- start
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1 --  start start
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1 --2-  >> [v2:
192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x56151b58f2b0
0x56151b58f680 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
comp rx=0 tx=0).connect
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1 --2-  >> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50
0x56151b598320 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
comp rx=0 tx=0).connect
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1 --2-  >> [v2:
192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x56151b598860
0x56151b59ac40 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
comp rx=0 tx=0).connect
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1 --  --> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 --
0x56151b47cb70 con 0x56151b58fc50
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1 --  --> [v2:
192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 --
0x56151b4761b0 con 0x56151b58f2b0
2023-04-18T19:25:23.002+0200 7f1a036ff4c0  1 --  --> [v2:
192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1
-- 0x56151b412680 con 0x56151b598860
2023-04-18T19:25:23.002+0200 7f19f8d43700  1 --2-  >> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50
0x56151b598320 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-18T19:25:23.002+0200 7f1a01544700  1 --2-  >> [v2:
192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x56151b58f2b0
0x56151b58f680 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-18T19:25:23.002+0200 7f19f8d43700  1 --2-  >> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50
0x56151b598320 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto
rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300/0 says I am v2:192.168.1.1:35346/0 (socket says 192.168.1.1:35346)
2023-04-18T19:25:23.002+0200 7f19f8d43700  1 -- 192.168.1.1:0/2631157109
learned_addr learned my addr 192.168.1.1:0/2631157109 (peer_addr_for_me v2:
192.168.1.1:0/0)
2023-04-18T19:25:23.002+0200 7f1a01544700  1 -- 192.168.1.1:0/2631157109 >>
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x56151b598860
msgr2=0x56151b59ac40 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down
2023-04-18T19:25:23.002+0200 7f1a01544700  1 --2- 192.168.1.1:0/2631157109
>> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x56151b598860
0x56151b59ac40 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto
rx=0 tx=0 comp rx=0 tx=0).stop
2023-04-18T19:25:23.002+0200 7f1a01544700  1 -- 192.168.1.1:0/2631157109 >>
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50
msgr2=0x56151b598320 unknown :-1 s=STATE_CONNECTION_ESTABLISHED
l=0).mark_down
2023-04-18T19:25:23.002+0200 7f1a01544700  1 --2- 192.168.1.1:0/2631157109
>> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50
0x56151b598320 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto
rx=0 tx=0 comp rx=0 tx=0).stop
2023-04-18T19:25:23.002+0200 7f1a01544700  1 -- 192.168.1.1:0/2631157109
--> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] --
mon_subscribe({config=0+,monmap=0+}) v3 -- 0x56151b41d7f0 con
0x56151b58f2b0
2023-04-18T19:25:23.002+0200 7f19f8d43700  1 --2- 192.168.1.1:0/2631157109
>> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x56151b58fc50
0x56151b598320 unknown :-1 s=CLOSED pgs=0 cs=0 l=0 rev1=1 crypto rx=0 tx=0
comp rx=0 tx=0).handle_auth_done state changed!
2023-04-18T19:25:23.002+0200 7f1a01544700  1 --2- 192.168.1.1:0/2631157109
>> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x56151b58f2b0
0x56151b58f680 secure :-1 s=READY pgs=214 cs=0 l=1 rev1=1 crypto
rx=0x7f19f400a5d0 tx=0x7f19f4005d40 comp rx=0 tx=0).ready entity=mon.1 client_cookie=a9059b943a3e6f58 

[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Eugen Block
You don't seem to specify a pool name in the snap create command. Does
your rbd_default_pool match the desired pool? And does
rbd_default_data_pool match what you expect (if those values are even
set)? I've never used custom values for those configs, but if you don't
specify a pool name, Ceph assumes the default pool name "rbd". At least
that's my understanding.
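
If in doubt, the pool can also be given explicitly in the image spec, and the
effective defaults can be checked on the client, along these lines (a sketch;
pool and image names as in your earlier commands):

# fully qualified image spec: <pool>/<image>@<snap>
rbd snap create rbd/ceph-dev@backup
# show the defaults the client would use
ceph config get client rbd_default_pool
ceph config get client rbd_default_data_pool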


Zitat von Reto Gysi :


Hi Ilya

Sure.

root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1
--debug-rbd 20 >/home/rgysi/log.txt 2>&1
root@zephir:~#

Am Di., 18. Apr. 2023 um 16:19 Uhr schrieb Ilya Dryomov 
:



On Tue, Apr 18, 2023 at 3:21 PM Reto Gysi  wrote:
>
> Hi,
>
> Yes both snap create commands were executed as user admin:
> client.admin
>caps: [mds] allow *
>caps: [mgr] allow *
>caps: [mon] allow *
>caps: [osd] allow *
>
> deep scrubbing+repair of ecpool_hdd is still ongoing, but so far the
> problem still exists

Hi Reto,

Deep scrubbing is unlikely to help with an "Operation not supported"
error.

I really doubt that the output that you attached in one of the previous
emails is all that is logged.  Even in the successful case, there is not
a single RBD-related debug log.  I would suggest repeating the test
with an explicit redirection and attaching the file itself.

Thanks,

Ilya




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Ilya Dryomov
On Tue, Apr 18, 2023 at 5:45 PM Reto Gysi  wrote:
>
> Hi Ilya
>
> Sure.
>
> root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1 
> --debug-rbd 20 >/home/rgysi/log.txt 2>&1

You probably have custom log settings in the cluster-wide config.  Please
append "--log-to-stderr true" and try again.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Reto Gysi
Hi Ilya

Sure.

root@zephir:~# rbd snap create ceph-dev@backup --id admin --debug-ms 1
--debug-rbd 20 >/home/rgysi/log.txt 2>&1
root@zephir:~#

Am Di., 18. Apr. 2023 um 16:19 Uhr schrieb Ilya Dryomov :

> On Tue, Apr 18, 2023 at 3:21 PM Reto Gysi  wrote:
> >
> > Hi,
> >
> > Yes both snap create commands were executed as user admin:
> > client.admin
> >caps: [mds] allow *
> >caps: [mgr] allow *
> >caps: [mon] allow *
> >caps: [osd] allow *
> >
> > deep scrubbing+repair of ecpool_hdd is still ongoing, but so far the
> > problem still exists
>
> Hi Reto,
>
> Deep scrubbing is unlikely to help with an "Operation not supported"
> error.
>
> I really doubt that the output that you attached in one of the previous
> emails is all that is logged.  Even in the successful case, there is not
> a single RBD-related debug log.  I would suggest repeating the test
> with an explicit redirection and attaching the file itself.
>
> Thanks,
>
> Ilya
>
2023-04-18T17:37:03.454+0200 7f99393084c0  1  Processor -- start
2023-04-18T17:37:03.454+0200 7f99393084c0  1 --  start start
2023-04-18T17:37:03.454+0200 7f99393084c0  1 --2-  >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55ec6059b210 
0x55ec6059b5e0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp 
rx=0 tx=0).connect
2023-04-18T17:37:03.454+0200 7f99393084c0  1 --2-  >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 
0x55ec605a4280 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp 
rx=0 tx=0).connect
2023-04-18T17:37:03.454+0200 7f99393084c0  1 --2-  >> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55ec605a47c0 
0x55ec605a6ba0 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp 
rx=0 tx=0).connect
2023-04-18T17:37:03.454+0200 7f99393084c0  1 --  --> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 -- 
0x55ec60488b70 con 0x55ec6059bbb0
2023-04-18T17:37:03.454+0200 7f99393084c0  1 --  --> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 -- 
0x55ec604821b0 con 0x55ec6059b210
2023-04-18T17:37:03.454+0200 7f99393084c0  1 --  --> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1 
-- 0x55ec6041e680 con 0x55ec605a47c0
2023-04-18T17:37:03.454+0200 7f9936aaf700  1 --2-  >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 
0x55ec605a4280 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto 
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-18T17:37:03.454+0200 7f99372b0700  1 --2-  >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55ec6059b210 
0x55ec6059b5e0 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto 
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-18T17:37:03.454+0200 7f99372b0700  1 --2-  >> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x55ec6059b210 
0x55ec6059b5e0 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 
tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.10:3300/0 says I am 
v2:192.168.1.1:58756/0 (socket says 192.168.1.1:58756)
2023-04-18T17:37:03.454+0200 7f9936aaf700  1 --2-  >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 
0x55ec605a4280 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 
tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300/0 says I am 
v2:192.168.1.1:51640/0 (socket says 192.168.1.1:51640)
2023-04-18T17:37:03.454+0200 7f99372b0700  1 -- 192.168.1.1:0/229580484 
learned_addr learned my addr 192.168.1.1:0/229580484 (peer_addr_for_me 
v2:192.168.1.1:0/0)
2023-04-18T17:37:03.454+0200 7f99372b0700  1 -- 192.168.1.1:0/229580484 >> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55ec605a47c0 
msgr2=0x55ec605a6ba0 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down
2023-04-18T17:37:03.454+0200 7f99372b0700  1 --2- 192.168.1.1:0/229580484 >> 
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x55ec605a47c0 
0x55ec605a6ba0 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto rx=0 
tx=0 comp rx=0 tx=0).stop
2023-04-18T17:37:03.454+0200 7f99372b0700  1 -- 192.168.1.1:0/229580484 >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 
msgr2=0x55ec605a4280 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).mark_down
2023-04-18T17:37:03.454+0200 7f99372b0700  1 --2- 192.168.1.1:0/229580484 >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 
0x55ec605a4280 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto rx=0 
tx=0 comp rx=0 tx=0).stop
2023-04-18T17:37:03.454+0200 7f99372b0700  1 -- 192.168.1.1:0/229580484 --> 
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- 
mon_subscribe({config=0+,monmap=0+}) v3 -- 0x55ec60435d60 con 0x55ec6059b210
2023-04-18T17:37:03.454+0200 7f9936aaf700  1 --2- 192.168.1.1:0/229580484 >> 
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x55ec6059bbb0 
0x55ec605a4280 unknown :-1 

[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Ilya Dryomov
On Tue, Apr 18, 2023 at 3:21 PM Reto Gysi  wrote:
>
> Hi,
>
> Yes both snap create commands were executed as user admin:
> client.admin
>caps: [mds] allow *
>caps: [mgr] allow *
>caps: [mon] allow *
>caps: [osd] allow *
>
> deep scrubbing+repair of ecpool_hdd is still ongoing, but so far the
> problem still exists

Hi Reto,

Deep scrubbing is unlikely to help with an "Operation not supported"
error.

I really doubt that the output that you attached in one of the previous
emails is all that is logged.  Even in the successful case, there is not
a single RBD-related debug log.  I would suggest repeating the test
with an explicit redirection and attaching the file itself.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Reto Gysi
Hi,

Yes both snap create commands were executed as user admin:
client.admin
   caps: [mds] allow *
   caps: [mgr] allow *
   caps: [mon] allow *
   caps: [osd] allow *

Deep scrubbing + repair of ecpool_hdd is still ongoing, but so far the
problem persists.

Am Di., 18. Apr. 2023 um 13:43 Uhr schrieb Eugen Block :

> Hi,
>
> > In the meantime I did some further test. I've created a new erasure coded
> > datapool 'ecpool_test' and if I create a new rbd image with this data
> pool
> > I can create snapshots, but I can't create snapshots on both new and
> > existing images with existing data pool 'ecpool_hdd'
>
> just one thought, could this be a caps mismatch? Is it the same user
> in those two pools who creates snaps (or tries to)? If those are
> different users could you share the auth caps?
>
> Zitat von Reto Gysi :
>
> > That was all that it logged.
> > In the meantime I did some further test. I've created a new erasure coded
> > datapool 'ecpool_test' and if I create a new rbd image with this data
> pool
> > I can create snapshots, but I can't create snapshots on both new and
> > existing images with existing data pool 'ecpool_hdd'
> >
> > #create new image on existing erasure code data-pool ecpool_hdd
> > root@zephir:~# rbd create -p rbd --data-pool ecpool_hdd test_ecpool_hdd
> > --size 10G
> >
> > #create new image on new erasure code data-pool ecpool_test
> > root@zephir:~# rbd create -p rbd --data-pool ecpool_test
> test_ecpool_test
> > --size 10G
> >
> > # trying to create snap-shot of image with data pool ecpool_hdd -> fails
> > root@zephir:~# rbd snap create test_ecpool_hdd@backup --debug-ms 1
> > --debug-rbd 20
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1  Processor -- start
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  start start
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --2-  >> [v2:
> > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0
> > 0x562a3dd35490 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
> > comp rx=0 tx=0).connect
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --2-  >> [v2:
> > 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0
> > 0x562a3dd42730 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
> > comp rx=0 tx=0).connect
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --2-  >> [v2:
> > 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70
> > 0x562a3dd45050 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
> > comp rx=0 tx=0).connect
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  --> [v2:
> > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 --
> > 0x562a3dc24b70 con 0x562a3dd350c0
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  --> [v2:
> > 192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 --
> > 0x562a3dc1e1b0 con 0x562a3dd359d0
> > 2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  --> [v2:
> > 192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0
> v1
> > -- 0x562a3dbba680 con 0x562a3dd42c70
> > 2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2-  >> [v2:
> > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0
> > 0x562a3dd35490 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0
> crypto
> > rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supporte
> > d=3 required=0
> > 2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2-  >> [v2:
> > 192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0
> > 0x562a3dd35490 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1
> crypto
> > rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300
> > /0 says I am v2:192.168.1.1:43298/0 (socket says 192.168.1.1:43298)
> > 2023-04-17T21:52:43.623+0200 7f3f7668d700  1 -- 192.168.1.1:0/3758799544
> > learned_addr learned my addr 192.168.1.1:0/3758799544 (peer_addr_for_me
> v2:
> > 192.168.1.1:0/0)
> > 2023-04-17T21:52:43.623+0200 7f3f75e8c700  1 --2-
> 192.168.1.1:0/3758799544
> >>> [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0
> > 0x562a3dd42730 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0
> crypto
> > rx=0 tx=0 comp rx=0 tx=0)._handle_pe
> > er_banner_payload supported=3 required=0
> > 2023-04-17T21:52:43.623+0200 7f3f7668d700  1 -- 192.168.1.1:0/3758799544
> >>
> > [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70
> > msgr2=0x562a3dd45050 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down
> > 2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2-
> 192.168.1.1:0/3758799544
> >>> [v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0]
> conn(0x562a3dd42c70
> > 0x562a3dd45050 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto
> > rx=0 tx=0 comp rx=0 tx=0).stop
> > 2023-04-17T21:52:43.623+0200 7f3f7668d700  1 -- 192.168.1.1:0/3758799544
> >>
> > [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0
> > msgr2=0x562a3dd42730 unknown :-1 s=STATE_CONNECTION_ESTABLISHED
> > l=0).mark_down
> > 2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2-
> 

[ceph-users] Re: ceph pg stuck - missing on 1 osd how to proceed

2023-04-18 Thread David Orman
You may want to consider disabling deep scrubs and scrubs while attempting to 
complete a backfill operation.
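
Something along these lines (cluster-wide flags; remember to unset them once the
backfill has finished):

# pause new scrubs while backfill is running
ceph osd set noscrub
ceph osd set nodeep-scrub
# ... wait for backfill to complete, then re-enable
ceph osd unset noscrub
ceph osd unset nodeep-scrub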

On Tue, Apr 18, 2023, at 01:46, Eugen Block wrote:
> I didn't mean you should split your PGs now, that won't help because  
> there is already backfilling going on. I would revert the pg_num  
> changes (since nothing actually happened yet there's no big risk) and  
> wait for the backfill to finish. You don't seem to have inactive PGs  
> so it shouldn't be an issue as long as nothing else breaks down. Do  
> you see progress of the backfilling? Do the numbers of misplaced  
> objects change?
>
> Zitat von xadhoo...@gmail.com:
>
>> Thanks, I tried to change the pg and pgp number to a higher value but
>> the pg count does not increase.
>> data:
>> pools:   8 pools, 1085 pgs
>> objects: 242.28M objects, 177 TiB
>> usage:   553 TiB used, 521 TiB / 1.0 PiB avail
>> pgs: 635281/726849381 objects degraded (0.087%)
>>  91498351/726849381 objects misplaced (12.588%)
>>  773 active+clean
>>  288 active+remapped+backfilling
>>  11  active+clean+scrubbing+deep
>>  10  active+clean+scrubbing
>>  3   active+undersized+degraded+remapped+backfilling
>>
>>
>> still have those 3 pg in stuck
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [ceph 17.2.6] unable to create rbd snapshots for images with erasure code data-pool

2023-04-18 Thread Eugen Block

Hi,


In the meantime I did some further test. I've created a new erasure coded
datapool 'ecpool_test' and if I create a new rbd image with this data pool
I can create snapshots, but I can't create snapshots on both new and
existing images with existing data pool 'ecpool_hdd'


Just one thought: could this be a caps mismatch? Is it the same user
who creates (or tries to create) the snapshots in those two pools? If those
are different users, could you share their auth caps?
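
For comparison, something like this shows the caps of the users involved
(client.admin here is just an example entity name):

# list all auth entities and their caps
ceph auth ls
# or show one specific client
ceph auth get client.admin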


Zitat von Reto Gysi :


That was all that it logged.
In the meantime I did some further test. I've created a new erasure coded
datapool 'ecpool_test' and if I create a new rbd image with this data pool
I can create snapshots, but I can't create snapshots on both new and
existing images with existing data pool 'ecpool_hdd'

#create new image on existing erasure code data-pool ecpool_hdd
root@zephir:~# rbd create -p rbd --data-pool ecpool_hdd test_ecpool_hdd
--size 10G

#create new image on new erasure code data-pool ecpool_test
root@zephir:~# rbd create -p rbd --data-pool ecpool_test test_ecpool_test
--size 10G

# trying to create snap-shot of image with data pool ecpool_hdd -> fails
root@zephir:~# rbd snap create test_ecpool_hdd@backup --debug-ms 1
--debug-rbd 20
2023-04-17T21:52:43.623+0200 7f3f787174c0  1  Processor -- start
2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  start start
2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --2-  >> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0
0x562a3dd35490 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
comp rx=0 tx=0).connect
2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --2-  >> [v2:
192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0
0x562a3dd42730 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
comp rx=0 tx=0).connect
2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --2-  >> [v2:
192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70
0x562a3dd45050 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0
comp rx=0 tx=0).connect
2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  --> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] -- mon_getmap magic: 0 v1 --
0x562a3dc24b70 con 0x562a3dd350c0
2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  --> [v2:
192.168.1.10:3300/0,v1:192.168.1.10:6789/0] -- mon_getmap magic: 0 v1 --
0x562a3dc1e1b0 con 0x562a3dd359d0
2023-04-17T21:52:43.623+0200 7f3f787174c0  1 --  --> [v2:
192.168.43.208:3300/0,v1:192.168.43.208:6789/0] -- mon_getmap magic: 0 v1
-- 0x562a3dbba680 con 0x562a3dd42c70
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2-  >> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0
0x562a3dd35490 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2-  >> [v2:
192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0
0x562a3dd35490 unknown :-1 s=HELLO_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto
rx=0 tx=0 comp rx=0 tx=0).handle_hello peer v2:192.168.1.1:3300/0 says I am v2:192.168.1.1:43298/0 (socket says 192.168.1.1:43298)
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 -- 192.168.1.1:0/3758799544
learned_addr learned my addr 192.168.1.1:0/3758799544 (peer_addr_for_me v2:
192.168.1.1:0/0)
2023-04-17T21:52:43.623+0200 7f3f75e8c700  1 --2- 192.168.1.1:0/3758799544 >>
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0
0x562a3dd42730 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=0 rev1=0 crypto
rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 -- 192.168.1.1:0/3758799544 >>
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70
msgr2=0x562a3dd45050 unknown :-1 s=STATE_CONNECTING_RE l=0).mark_down
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2- 192.168.1.1:0/3758799544 >>
[v2:192.168.43.208:3300/0,v1:192.168.43.208:6789/0] conn(0x562a3dd42c70
0x562a3dd45050 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto
rx=0 tx=0 comp rx=0 tx=0).stop
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 -- 192.168.1.1:0/3758799544 >>
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0
msgr2=0x562a3dd42730 unknown :-1 s=STATE_CONNECTION_ESTABLISHED
l=0).mark_down
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2- 192.168.1.1:0/3758799544 >>
[v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0] conn(0x562a3dd359d0
0x562a3dd42730 unknown :-1 s=AUTH_CONNECTING pgs=0 cs=0 l=0 rev1=1 crypto
rx=0 tx=0 comp rx=0 tx=0).stop
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 -- 192.168.1.1:0/3758799544
--> [v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] --
mon_subscribe({config=0+,monmap=0+}) v3 -- 0x562a3dbd1d60 con
0x562a3dd350c0
2023-04-17T21:52:43.623+0200 7f3f7668d700  1 --2- 192.168.1.1:0/3758799544 >>
[v2:192.168.1.1:3300/0,v1:192.168.1.1:6789/0] conn(0x562a3dd350c0
0x562a3dd35490 secure :-1 s=READY pgs=511 cs=0 l=1 rev1=1 crypto
rx=0x7f3f6800a700 tx=0x7f3f68005b10 comp rx=0 tx=

[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services

2023-04-18 Thread Lokendra Rathour
Yes, thanks Robert.
After installing ceph-common, the mount is working fine.
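
For anyone finding this later, a kernel CephFS mount against the mon FQDNs looks
roughly like this (hostnames, mount point, filesystem name and keyring path are
placeholders):

# mount.ceph is provided by the ceph-common package
mount -t ceph mon1.example.com,mon2.example.com,mon3.example.com:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret
# for a non-default filesystem, add fs=<name> (mds_namespace=<name> on older kernels)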


On Tue, Apr 18, 2023 at 2:10 PM Robert Sander 
wrote:

> On 18.04.23 06:12, Lokendra Rathour wrote:
>
> > but if I try mounting from a normal Linux machine with connectivity
> > enabled between Ceph mon nodes, it gives the error as stated before.
>
> Have you installed ceph-common on the "normal Linux machine"?
>
> Regards
> --
> Robert Sander
> Heinlein Support GmbH
> Linux: Akademie - Support - Hosting
> http://www.heinlein-support.de
>
> Tel: 030-405051-43
> Fax: 030-405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
~ Lokendra
skype: lokendrarathour
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services

2023-04-18 Thread Robert Sander

On 18.04.23 06:12, Lokendra Rathour wrote:

but if I try mounting from a normal Linux machine that has connectivity to
the Ceph mon nodes, it gives the error stated before.


Have you installed ceph-common on the "normal Linux machine"?
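
For example, depending on the distribution of the client machine, something like:

# Debian/Ubuntu
apt install ceph-common
# RHEL/CentOS/Rocky
dnf install ceph-common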

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de

Tel: 030-405051-43
Fax: 030-405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein  -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Consequence of maintaining hundreds of clones of a single RBD image snapshot

2023-04-18 Thread Eyal Barlev
Hello,

My use-case involves creating hundreds of clones (~1,000) of a single RBD
image snapshot.

I assume watchers exist for each clone, due to the copy-on-write nature of
clones.

Should I expect a penalty for maintaining such a large number of clones, in
terms of CPU, memory, or performance?

If such a penalty exists, we might opt to flatten some of the clones. Is
consistency guaranteed during the flattening process? In other words, can I
write to a clone while it is being flattened?
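
For context, the workflow in question is roughly the following (pool, image and
snapshot names are placeholders):

# one protected snapshot, many copy-on-write clones
rbd snap create pool/base-image@golden
rbd snap protect pool/base-image@golden
rbd clone pool/base-image@golden pool/clone-0001
# later, detach a clone from its parent by copying all data into it
rbd flatten pool/clone-0001
# clients that currently have an image open show up as watchers
rbd status pool/clone-0001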

Perspectivus
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph pg stuck - missing on 1 osd how to proceed

2023-04-18 Thread Eugen Block
I didn't mean you should split your PGs now; that won't help because
there is already backfilling going on. I would revert the pg_num
changes (since nothing has actually happened yet, there's no big risk) and
wait for the backfill to finish. You don't seem to have inactive PGs,
so it shouldn't be an issue as long as nothing else breaks down. Do
you see progress in the backfilling? Do the numbers of misplaced
objects change?
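
Concretely, something like this (pool name and the previous pg_num value are
placeholders; on recent releases pgp_num follows pg_num automatically):

# revert the pg_num/pgp_num change to the previous value
ceph osd pool set <pool> pg_num <previous_value>
ceph osd pool set <pool> pgp_num <previous_value>
# then watch the misplaced/degraded object counts go down
ceph -s
ceph pg stat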


Zitat von xadhoo...@gmail.com:

Thanks, I tried to change the pg and pgp number to a higher value but
the pg count does not increase.

data:
pools:   8 pools, 1085 pgs
objects: 242.28M objects, 177 TiB
usage:   553 TiB used, 521 TiB / 1.0 PiB avail
pgs: 635281/726849381 objects degraded (0.087%)
 91498351/726849381 objects misplaced (12.588%)
 773 active+clean
 288 active+remapped+backfilling
 11  active+clean+scrubbing+deep
 10  active+clean+scrubbing
 3   active+undersized+degraded+remapped+backfilling


still have those 3 pg in stuck
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io