[ceph-users] Initialization timeout, failed to initialize

2023-05-03 Thread Vitaly Goot
playing with MULTI-SITE zones for CEPH Object Gateway

ceph version: 17.2.5 
my setup: 3 zone multi-site; 3-way full sync mode; 
each zone has 3 machines -> RGW+MON+OSD
running load test:  3000 concurrent uploads of 1M object 

after about 3-4 minutes of load the RGW machines get stuck; in 2 zones out of 3 
the RGW is not responding (e.g. curl $RGW:80) 
an attempt to restart RGW ends up with `Initialization timeout, failed to 
initialize`

here is a gdb backtrace showing where it hangs after the restart:

(gdb) inf thr
  Id   Target Id   Frame
* 1Thread 0x7fa7d3abbcc0 (LWP 30791) "radosgw" 
futex_wait_cancelable (private=, expected=0, 
futex_word=0x7ffc7f7a2438) at ../sysdeps/nptl/futex-internal.h:183
...

(gdb) bt
#0  futex_wait_cancelable (private=, expected=0, 
futex_word=0x7ffc7f7a2438) at ../sysdeps/nptl/futex-internal.h:183
#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x7ffc7f7a2488, 
cond=0x7ffc7f7a2410) at pthread_cond_wait.c:508
#2  __pthread_cond_wait (cond=cond@entry=0x7ffc7f7a2410, mutex=0x7ffc7f7a2488) 
at pthread_cond_wait.c:647
#3  0x7fa7d7097e42 in ceph::condition_variable_debug::wait 
(this=this@entry=0x7ffc7f7a2410, lock=...) at ../src/common/mutex_debug.h:148
#4  0x7fa7d7953cba in 
ceph::condition_variable_debug::wait > (pred=..., 
lock=..., this=0x7ffc7f7a2410) at ../src/librados/IoCtxImpl.cc:672
#5  librados::IoCtxImpl::operate (this=this@entry=0x558347c21010, oid=..., 
o=0x558347e12310, pmtime=, flags=) at 
../src/librados/IoCtxImpl.cc:672
#6  0x7fa7d792bd55 in librados::v14_2_0::IoCtx::operate 
(this=this@entry=0x558347e44760, oid="notify.0", o=o@entry=0x7ffc7f7a2690, 
flags=flags@entry=0) at ../src/librados/librados_cxx.cc:1536
#7  0x7fa7d9490ad1 in rgw_rados_operate (dpp=, ioctx=..., 
oid="notify.0", op=op@entry=0x7ffc7f7a2690, y=..., flags=0) at 
../src/rgw/rgw_tools.cc:277
#8  0x7fa7d9627e0f in RGWSI_RADOS::Obj::operate 
(this=this@entry=0x558347e44710, dpp=, 
op=op@entry=0x7ffc7f7a2690, y=..., flags=flags@entry=0) at 
../src/rgw/services/svc_rados.h:112
#9  0x7fa7d96209a5 in RGWSI_Notify::init_watch 
(this=this@entry=0x558347c49530, dpp=, y=...) at 
../src/rgw/services/svc_notify.cc:214
#10 0x7fa7d962161b in RGWSI_Notify::do_start (this=0x558347c49530, y=..., 
dpp=) at ../src/rgw/services/svc_notify.cc:277
#11 0x7fa7d8f17bcf in RGWServiceInstance::start (this=0x558347c49530, 
y=..., dpp=) at ../src/rgw/rgw_service.cc:331
#12 0x7fa7d8f1a260 in RGWServices_Def::init 
(this=this@entry=0x558347de90a0, cct=, have_cache=, raw=raw@entry=false, run_sync=, y=..., dpp=) at /usr/include/c++/9/bits/unique_ptr.h:360
#13 0x7fa7d8f1cc40 in RGWServices::do_init (this=this@entry=0x558347de90a0, 
_cct=, have_cache=, raw=raw@entry=false, 
run_sync=, y=..., dpp=) at 
../src/rgw/rgw_service.cc:284
#14 0x7fa7d92a7b1f in RGWServices::init (dpp=, y=..., 
run_sync=, have_cache=, cct=, 
this=0x558347de90a0) at ../src/rgw/rgw_service.h:153
#15 RGWRados::init_svc (this=this@entry=0x558347de8dc0, raw=raw@entry=false, 
dpp=) at ../src/rgw/rgw_rados.cc:1380
#16 0x7fa7d930f241 in RGWRados::initialize (this=0x558347de8dc0, 
dpp=) at ../src/rgw/rgw_rados.cc:1400
#17 0x7fa7d944f85f in RGWRados::initialize (dpp=, 
_cct=0x558347c6a320, this=) at ../src/rgw/rgw_rados.h:586
#18 StoreManager::init_storage_provider (dpp=, 
dpp@entry=0x7ffc7f7a2e90, cct=cct@entry=0x558347c6a320, svc="rados", 
use_gc_thread=use_gc_thread@entry=true, use_lc_thread=use_lc_thread@entry=true, 
quota_threads=quota_threads@entry=true, run_sync_thread=true, 
run_reshard_thread=true, use_cache=true,
use_gc=true) at ../src/rgw/rgw_sal.cc:55
#19 0x7fa7d8e7367a in StoreManager::get_storage (use_gc=true, 
use_cache=true, run_reshard_thread=true, run_sync_thread=true, 
quota_threads=true, use_lc_thread=true, use_gc_thread=true, svc="rados", 
cct=0x558347c6a320, dpp=0x7ffc7f7a2e90) at 
/usr/include/c++/9/bits/basic_string.h:267
#20 radosgw_Main (argc=, argv=) at 
../src/rgw/rgw_main.cc:372
#21 0x558347883f56 in main (argc=, argv=) at 
../src/rgw/radosgw.cc:12
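
For reference, the backtrace shows radosgw blocked in RGWSI_Notify::init_watch
doing a synchronous operate on "notify.0" in the zone's control pool. A hedged
diagnostic sketch (the pool name default.rgw.control is an assumption, use your
zone's actual control pool) to check whether those objects respond and which
clients still hold watches on them:

  rados -p default.rgw.control ls                     # should list notify.0 .. notify.7
  rados -p default.rgw.control listwatchers notify.0  # clients still holding watches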

[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-03 Thread Radoslaw Zarzynski
rados approved.

Big thanks to Laura for helping with this!

On Thu, Apr 27, 2023 at 11:21 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/59542#note-1
> Release Notes - TBD
>
> Seeking approvals for:
>
> smoke - Radek, Laura
> rados - Radek, Laura
>   rook - Sébastien Han
>   cephadm - Adam K
>   dashboard - Ernesto
>
> rgw - Casey
> rbd - Ilya
> krbd - Ilya
> fs - Venky, Patrick
> upgrade/octopus-x (pacific) - Laura (look the same as in 16.2.8)
> upgrade/pacific-p2p - Laura
> powercycle - Brad (SELinux denials)
> ceph-volume - Guillaume, Adam K
>
> Thx
> YuriW
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS crash on FAILED ceph_assert(cur->is_auth())

2023-05-03 Thread Emmanuel Jaep
Hi,

did you finally figure out what happened?
I do have the same behavior and we can't get the mds to start again...

Thanks,

Emmanuel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Unable to restart mds - mds crashes almost immediately after finishing recovery

2023-05-03 Thread Emmanuel Jaep
Hi,

I just inherited a ceph storage. Therefore, my level of confidence with the 
tool is certainly less than ideal.

We currently have an mds server that refuses to come back online. While 
reviewing the logs, I can see that, upon mds start, the recovery goes well:
```
   -10> 2023-05-03T08:12:43.632+0200 7f345d00b700  1 mds.4.2638711 cluster 
recovered.
```

However, right after this message, ceph handles a couple of client requests:
```
-9> 2023-05-03T08:12:43.632+0200 7f345d00b700  4 mds.4.2638711 
set_osd_epoch_barrier: epoch=249241
-8> 2023-05-03T08:12:43.632+0200 7f3459003700  2 mds.4.cache Memory usage:  
total 2739784, rss 2321188, heap 348412, baseline 315644, 0 / 765023 inodes 
have caps, 0 caps, 0 caps per inode
-7> 2023-05-03T08:12:43.688+0200 7f3458802700  4 mds.4.server 
handle_client_request client_request(client.108396030:57271 lookup 
#0x70001516236/012385530.npy 2023-05-02T20:37:19.675666+0200 RETRY=6 
caller_uid=135551, caller_gid=11157{0,4,27,11157,}) v5
-6> 2023-05-03T08:12:43.688+0200 7f3458802700  4 mds.4.server 
handle_client_request client_request(client.104073212:5109945 readdir 
#0x70001516236 2023-05-02T20:36:29.517066+0200 RETRY=6 caller_uid=180090, 
caller_gid=11157{0,4,27,11157,}) v5
-5> 2023-05-03T08:12:43.688+0200 7f3458802700  4 mds.4.server 
handle_client_request client_request(client.104288735:3008344 readdir 
#0x70001516236 2023-05-02T20:36:29.520801+0200 RETRY=6 caller_uid=135551, 
caller_gid=11157{0,4,27,11157,}) v5
-4> 2023-05-03T08:12:43.688+0200 7f3458802700  4 mds.4.server 
handle_client_request client_request(client.8558540:46306346 readdir 
#0x700019ba15e 2023-05-01T21:26:34.303697+0200 RETRY=49 caller_uid=0, 
caller_gid=0{}) v2
-3> 2023-05-03T08:12:43.688+0200 7f3458802700  4 mds.4.server 
handle_client_request client_request(client.96913903:2156912 create 
#0x1000b37db9a/street-photo-3.png 2023-05-01T17:27:37.454042+0200 RETRY=59 
caller_uid=271932, caller_gid=30034{}) v2
-2> 2023-05-03T08:12:43.688+0200 7f345d00b700  5 mds.icadmin006 
handle_mds_map old map epoch 2638715 <= 2638715, discarding
```

and crashes:
```
-1> 2023-05-03T08:12:43.692+0200 7f345d00b700 -1 
/build/ceph-16.2.10/src/mds/Server.cc: In function 'void 
Server::handle_client_open(MDRequestRef&)' thread 7f345d00b700 time 
2023-05-03T08:12:43.694660+0200
/build/ceph-16.2.10/src/mds/Server.cc: 4240: FAILED ceph_assert(cur->is_auth())

 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific 
(stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x152) [0x7f3462533d65]
 2: /usr/lib/ceph/libceph-common.so.2(+0x265f6d) [0x7f3462533f6d]
 3: (Server::handle_client_open(boost::intrusive_ptr&)+0x1834) 
[0x558323c89f04]
 4: (Server::handle_client_openc(boost::intrusive_ptr&)+0x28f) 
[0x558323c925ef]
 5: 
(Server::dispatch_client_request(boost::intrusive_ptr&)+0xa45) 
[0x558323cc3575]
 6: (MDCache::dispatch_request(boost::intrusive_ptr&)+0x3d) 
[0x558323d7460d]
 7: (MDSContext::complete(int)+0x61) [0x558323f68681]
 8: (MDCache::_open_remote_dentry_finish(CDentry*, inodeno_t, MDSContext*, 
bool, int)+0x3e) [0x558323d3edce]
 9: (C_MDC_OpenRemoteDentry::finish(int)+0x3e) [0x558323de6cce]
 10: (MDSContext::complete(int)+0x61) [0x558323f68681]
 11: (MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&, int)+0xcf) 
[0x558323d5ff2f]
 12: (MDCache::_open_ino_traverse_dir(inodeno_t, MDCache::open_ino_info_t&, 
int)+0xbf) [0x558323d602df]
 13: (MDSContext::complete(int)+0x61) [0x558323f68681]
 14: (MDSRank::_advance_queues()+0x88) [0x558323c23c38]
 15: (MDSRank::_dispatch(boost::intrusive_ptr const&, 
bool)+0x1fa) [0x558323c24a1a]
 16: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr 
const&)+0x5e) [0x558323c254fe]
 17: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr const&)+0x1d6) 
[0x558323bfd906]
 18: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr 
const&)+0x460) [0x7f34627854e0]
 19: (DispatchQueue::entry()+0x58f) [0x7f3462782d7f]
 20: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f346284eee1]
 21: /lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7f3462278609]
 22: clone()

 0> 2023-05-03T08:12:43.700+0200 7f345d00b700 -1 *** Caught signal 
(Aborted) **
 in thread 7f345d00b700 thread_name:ms_dispatch

 ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific 
(stable)
 1: /lib/x86_64-linux-gnu/libpthread.so.0(+0x143c0) [0x7f34622843c0]
 2: gsignal()
 3: abort()
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x1ad) [0x7f3462533dc0]
 5: /usr/lib/ceph/libceph-common.so.2(+0x265f6d) [0x7f3462533f6d]
 6: (Server::handle_client_open(boost::intrusive_ptr&)+0x1834) 
[0x558323c89f04]
 7: (Server::handle_client_openc(boost::intrusive_ptr&)+0x28f) 
[0x558323c925ef]
 8: 
(Server::dispatch_client_request(boost::intrusive_ptr&)+0xa45) 
[0x558323cc3575]
 9: (MDCache::dis

[ceph-users] pg upmap primary

2023-05-03 Thread Nguetchouang Ngongang Kevin
Hello, I have a question: what happens when I delete a pg on which I have
set a particular osd as primary using the pg-upmap-primary command?
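
For reference, a hedged sketch of the commands involved (the pg id and osd id
are just placeholders):

  ceph osd pg-upmap-primary 2.f 0     # pin osd.0 as primary for pg 2.f
  ceph osd rm-pg-upmap-primary 2.f    # drop the primary pinning again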

-- 
Nguetchouang Ngongang Kevin
ENS de Lyon
https://perso.ens-lyon.fr/kevin.nguetchouang/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] test

2023-05-03 Thread Dan Mick

making some DMARC-related changes.  Ignore this please.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-03 Thread Guillaume Abrioux
The failure seen in ceph-volume tests isn't related.
That being said, it needs to be fixed to have a better view of the current
status.

On Wed, 3 May 2023 at 21:00, Laura Flores  wrote:

> upgrade/octopus-x (pacific) is approved. Went over failures with Adam King
> and it was decided they are not release blockers.
>
> On Wed, May 3, 2023 at 1:53 PM Yuri Weinstein  wrote:
>
>> upgrade/octopus-x (pacific) - Laura
>> ceph-volume - Guillaume
>>
>> + 2 PRs are the remaining issues
>>
>> Josh FYI
>>
>> On Wed, May 3, 2023 at 11:50 AM Radoslaw Zarzynski 
>> wrote:
>> >
>> > rados approved.
>> >
>> > Big thanks to Laura for helping with this!
>> >
>> > On Thu, Apr 27, 2023 at 11:21 PM Yuri Weinstein 
>> wrote:
>> > >
>> > > Details of this release are summarized here:
>> > >
>> > > https://tracker.ceph.com/issues/59542#note-1
>> > > Release Notes - TBD
>> > >
>> > > Seeking approvals for:
>> > >
>> > > smoke - Radek, Laura
>> > > rados - Radek, Laura
>> > >   rook - Sébastien Han
>> > >   cephadm - Adam K
>> > >   dashboard - Ernesto
>> > >
>> > > rgw - Casey
>> > > rbd - Ilya
>> > > krbd - Ilya
>> > > fs - Venky, Patrick
>> > > upgrade/octopus-x (pacific) - Laura (look the same as in 16.2.8)
>> > > upgrade/pacific-p2p - Laura
>> > > powercycle - Brad (SELinux denials)
>> > > ceph-volume - Guillaume, Adam K
>> > >
>> > > Thx
>> > > YuriW
>> > > ___
>> > > Dev mailing list -- d...@ceph.io
>> > > To unsubscribe send an email to dev-le...@ceph.io
>> >
>> ___
>> Dev mailing list -- d...@ceph.io
>> To unsubscribe send an email to dev-le...@ceph.io
>>
>
>
> --
>
> Laura Flores
>
> She/Her/Hers
>
> Software Engineer, Ceph Storage 
>
> Chicago, IL
>
> lflo...@ibm.com | lflo...@redhat.com 
> M: +17087388804
>
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>


-- 

*Guillaume Abrioux*
Senior Software Engineer
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-03 Thread Laura Flores
upgrade/octopus-x (pacific) is approved. Went over failures with Adam King
and it was decided they are not release blockers.

On Wed, May 3, 2023 at 1:53 PM Yuri Weinstein  wrote:

> upgrade/octopus-x (pacific) - Laura
> ceph-volume - Guillaume
>
> + 2 PRs are the remaining issues
>
> Josh FYI
>
> On Wed, May 3, 2023 at 11:50 AM Radoslaw Zarzynski 
> wrote:
> >
> > rados approved.
> >
> > Big thanks to Laura for helping with this!
> >
> > On Thu, Apr 27, 2023 at 11:21 PM Yuri Weinstein 
> wrote:
> > >
> > > Details of this release are summarized here:
> > >
> > > https://tracker.ceph.com/issues/59542#note-1
> > > Release Notes - TBD
> > >
> > > Seeking approvals for:
> > >
> > > smoke - Radek, Laura
> > > rados - Radek, Laura
> > >   rook - Sébastien Han
> > >   cephadm - Adam K
> > >   dashboard - Ernesto
> > >
> > > rgw - Casey
> > > rbd - Ilya
> > > krbd - Ilya
> > > fs - Venky, Patrick
> > > upgrade/octopus-x (pacific) - Laura (look the same as in 16.2.8)
> > > upgrade/pacific-p2p - Laura
> > > powercycle - Brad (SELinux denials)
> > > ceph-volume - Guillaume, Adam K
> > >
> > > Thx
> > > YuriW
> > > ___
> > > Dev mailing list -- d...@ceph.io
> > > To unsubscribe send an email to dev-le...@ceph.io
> >
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>


-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-03 Thread Yuri Weinstein
upgrade/octopus-x (pacific) - Laura
ceph-volume - Guillaume

+ 2 PRs are the remaining issues

Josh FYI

On Wed, May 3, 2023 at 11:50 AM Radoslaw Zarzynski  wrote:
>
> rados approved.
>
> Big thanks to Laura for helping with this!
>
> On Thu, Apr 27, 2023 at 11:21 PM Yuri Weinstein  wrote:
> >
> > Details of this release are summarized here:
> >
> > https://tracker.ceph.com/issues/59542#note-1
> > Release Notes - TBD
> >
> > Seeking approvals for:
> >
> > smoke - Radek, Laura
> > rados - Radek, Laura
> >   rook - Sébastien Han
> >   cephadm - Adam K
> >   dashboard - Ernesto
> >
> > rgw - Casey
> > rbd - Ilya
> > krbd - Ilya
> > fs - Venky, Patrick
> > upgrade/octopus-x (pacific) - Laura (look the same as in 16.2.8)
> > upgrade/pacific-p2p - Laura
> > powercycle - Brad (SELinux denials)
> > ceph-volume - Guillaume, Adam K
> >
> > Thx
> > YuriW
> > ___
> > Dev mailing list -- d...@ceph.io
> > To unsubscribe send an email to dev-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-03 Thread Yuri Weinstein
I know of two PRs that have been requested to be cherry-picked in 16.2.13

https://github.com/ceph/ceph/pull/51232 -- fs
https://github.com/ceph/ceph/pull/51200 -- rgw

Casey, Venky - would you approve it?

On Wed, May 3, 2023 at 6:41 AM Venky Shankar  wrote:
>
> On Tue, May 2, 2023 at 8:25 PM Yuri Weinstein  wrote:
> >
> > Venky, I did plan to cherry-pick this PR if you approve this (this PR
> > was used for a rerun)
>
> OK. The fs suite failure is being looked into
> (https://tracker.ceph.com/issues/59626).
>
> >
> > On Tue, May 2, 2023 at 7:51 AM Venky Shankar  wrote:
> > >
> > > Hi Yuri,
> > >
> > > On Fri, Apr 28, 2023 at 2:53 AM Yuri Weinstein  
> > > wrote:
> > > >
> > > > Details of this release are summarized here:
> > > >
> > > > https://tracker.ceph.com/issues/59542#note-1
> > > > Release Notes - TBD
> > > >
> > > > Seeking approvals for:
> > > >
> > > > smoke - Radek, Laura
> > > > rados - Radek, Laura
> > > >   rook - Sébastien Han
> > > >   cephadm - Adam K
> > > >   dashboard - Ernesto
> > > >
> > > > rgw - Casey
> > > > rbd - Ilya
> > > > krbd - Ilya
> > > > fs - Venky, Patrick
> > >
> > > There are a couple of new failures which are qa/test related - I'll
> > > have a look at those (they _do not_ look serious).
> > >
> > > Also, Yuri, do you plan to merge
> > >
> > > https://github.com/ceph/ceph/pull/51232
> > >
> > > into the pacific-release branch although it's tagged with one of your
> > > other pacific runs?
> > >
> > > > upgrade/octopus-x (pacific) - Laura (look the same as in 16.2.8)
> > > > upgrade/pacific-p2p - Laura
> > > > powercycle - Brad (SELinux denials)
> > > > ceph-volume - Guillaume, Adam K
> > > >
> > > > Thx
> > > > YuriW
> > > > ___
> > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> > >
> > >
> > > --
> > > Cheers,
> > > Venky
> > >
> >
>
>
> --
> Cheers,
> Venky
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd map: corrupt full osdmap (-22) when

2023-05-03 Thread Ilya Dryomov
On Wed, May 3, 2023 at 11:24 AM Kamil Madac  wrote:
>
> Hi,
>
> We deployed pacific cluster 16.2.12 with cephadm. We experience following
> error during rbd map:
>
> [Wed May  3 08:59:11 2023] libceph: mon2 (1)[2a00:da8:ffef:1433::]:6789
> session established
> [Wed May  3 08:59:11 2023] libceph: another match of type 1 in addrvec
> [Wed May  3 08:59:11 2023] libceph: corrupt full osdmap (-22) epoch 200 off
> 1042 (9876284d of 0cb24b58-80b70596)
> [Wed May  3 08:59:11 2023] osdmap: : 08 07 7d 10 00 00 09 01 5d 09
> 00 00 a2 22 3b 86  ..}.]";.
> [Wed May  3 08:59:11 2023] osdmap: 0010: e4 f5 11 ed 99 ee 47 75 ca 3c
> ad 23 c8 00 00 00  ..Gu.<.#
> [Wed May  3 08:59:11 2023] osdmap: 0020: 21 68 4a 64 98 d2 5d 2e 84 fd
> 50 64 d9 3a 48 26  !hJd..]...Pd.:H&
> [Wed May  3 08:59:11 2023] osdmap: 0030: 02 00 00 00 01 00 00 00 00 00
> 00 00 1d 05 71 01  ..q.
> 
>
> Linux Kernel is 6.1.13 and the important thing is that we are using ipv6
> addresses for connection to ceph nodes.
> We were able to map rbd from client with kernel 5.10, but in prod
> environment we are not allowed to use that kernel.
>
> What could be the reason for such behavior on newer kernels and how to
> troubleshoot it?
>
> Here is output of ceph osd dump:
>
> # ceph osd dump
> epoch 200
> fsid a2223b86-e4f5-11ed-99ee-4775ca3cad23
> created 2023-04-27T12:18:41.777900+
> modified 2023-05-02T12:09:40.642267+
> flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
> crush_version 34
> full_ratio 0.95
> backfillfull_ratio 0.9
> nearfull_ratio 0.85
> require_min_compat_client luminous
> min_compat_client jewel
> require_osd_release pacific
> stretch_mode_enabled false
> pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 183
> flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application
> mgr_devicehealth
> pool 2 'idp' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins
> pg_num 32 pgp_num 32 autoscale_mode on last_change 48 flags
> hashpspool,selfmanaged_snaps stripe_width 0 application rbd
> max_osd 3
> osd.0 up   in  weight 1 up_from 176 up_thru 182 down_at 172
> last_clean_interval [170,171)
> [v2:[2a00:da8:ffef:1431::]:6800/805023868,v1:[2a00:da8:ffef:1431::]:6801/805023868,v2:
> 0.0.0.0:6802/805023868,v1:0.0.0.0:6803/805023868]
> [v2:[2a00:da8:ffef:1431::]:6804/805023868,v1:[2a00:da8:ffef:1431::]:6805/805023868,v2:
> 0.0.0.0:6806/805023868,v1:0.0.0.0:6807/805023868] exists,up
> e8fd0ee2-ea63-4d02-8f36-219d36869078
> osd.1 up   in  weight 1 up_from 136 up_thru 182 down_at 0
> last_clean_interval [0,0)
> [v2:[2a00:da8:ffef:1432::]:6800/2172723816,v1:[2a00:da8:ffef:1432::]:6801/2172723816,v2:
> 0.0.0.0:6802/2172723816,v1:0.0.0.0:6803/2172723816]
> [v2:[2a00:da8:ffef:1432::]:6804/2172723816,v1:[2a00:da8:ffef:1432::]:6805/2172723816,v2:
> 0.0.0.0:6806/2172723816,v1:0.0.0.0:6807/2172723816] exists,up
> 0b7b5628-9273-4757-85fb-9c16e8441895
> osd.2 up   in  weight 1 up_from 182 up_thru 182 down_at 178
> last_clean_interval [123,177)
> [v2:[2a00:da8:ffef:1433::]:6800/887631330,v1:[2a00:da8:ffef:1433::]:6801/887631330,v2:
> 0.0.0.0:6802/887631330,v1:0.0.0.0:6803/887631330]
> [v2:[2a00:da8:ffef:1433::]:6804/887631330,v1:[2a00:da8:ffef:1433::]:6805/887631330,v2:
> 0.0.0.0:6806/887631330,v1:0.0.0.0:6807/887631330] exists,up
> 21f8d0d5-6a3f-4f78-96c8-8ec4e4f78a01

Hi Kamil,

The issue is bogus 0.0.0.0 addresses.  This came up before, see [1] and
later messages from Stefan in the thread.  You would need to ensure that
ms_bind_ipv4 is set to false and restart OSDs.
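
A minimal sketch of that change via the config store (assuming the OSDs should
be IPv6-only):

  ceph config set osd ms_bind_ipv4 false
  ceph config set osd ms_bind_ipv6 true
  # then restart the OSD daemons so they re-register without the 0.0.0.0 addrs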

[1] 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/Q6VYRJBPHQI63OQTBJG2N3BJD2KBEZM4/

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.13 pacific QE validation status

2023-05-03 Thread Venky Shankar
On Tue, May 2, 2023 at 8:25 PM Yuri Weinstein  wrote:
>
> Venky, I did plan to cherry-pick this PR if you approve this (this PR
> was used for a rerun)

OK. The fs suite failure is being looked into
(https://tracker.ceph.com/issues/59626).

>
> On Tue, May 2, 2023 at 7:51 AM Venky Shankar  wrote:
> >
> > Hi Yuri,
> >
> > On Fri, Apr 28, 2023 at 2:53 AM Yuri Weinstein  wrote:
> > >
> > > Details of this release are summarized here:
> > >
> > > https://tracker.ceph.com/issues/59542#note-1
> > > Release Notes - TBD
> > >
> > > Seeking approvals for:
> > >
> > > smoke - Radek, Laura
> > > rados - Radek, Laura
> > >   rook - Sébastien Han
> > >   cephadm - Adam K
> > >   dashboard - Ernesto
> > >
> > > rgw - Casey
> > > rbd - Ilya
> > > krbd - Ilya
> > > fs - Venky, Patrick
> >
> > There are a couple of new failures which are qa/test related - I'll
> > have a look at those (they _do not_ look serious).
> >
> > Also, Yuri, do you plan to merge
> >
> > https://github.com/ceph/ceph/pull/51232
> >
> > into the pacific-release branch although it's tagged with one of your
> > other pacific runs?
> >
> > > upgrade/octopus-x (pacific) - Laura (look the same as in 16.2.8)
> > > upgrade/pacific-p2p - Laura
> > > powercycle - Brad (SELinux denials)
> > > ceph-volume - Guillaume, Adam K
> > >
> > > Thx
> > > YuriW
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> >
> >
> > --
> > Cheers,
> > Venky
> >
>


-- 
Cheers,
Venky
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] To drain unmanaged OSDs by an oversight

2023-05-03 Thread Ackermann, Christoph
Hello List,

I made a mistake and drained a host instead of putting it into Maintenance
Mode (for an OS reboot). :-/
After "Stop Drain" and restoring the original "crush reweight" values, so far
everything looks fine.
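
For reference, a hedged sketch of the CLI equivalents of those dashboard
actions (assuming a cephadm-managed cluster; the OSD ids are placeholders):

  ceph orch osd rm status        # lists OSDs still scheduled for removal ("deleting")
  ceph orch osd rm stop 12 13    # cancel the scheduled removal for those OSDs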

 cluster:
   health: HEALTH_OK
 services:
[..]
   osd: 79 osds: 78 up (since 3h), 78 in (since 6w); 166 remapped pgs
[..]
And some minor objects being misplaced..

But I can't get rid of the annoying "deleting" status shown for this host's
OSDs.
[image: Auswahl_2023-05-03_14-17.png]

Does someone have a hint for me?

Thanks,
Christoph
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-03 Thread Patrick Donnelly
On Wed, May 3, 2023 at 4:33 AM Janek Bevendorff
 wrote:
>
> Hi Patrick,
>
> > I'll try that tomorrow and let you know, thanks!
>
> I was unable to reproduce the crash today. Even with
> mds_abort_on_newly_corrupt_dentry set to true, all MDS booted up
> correctly (though they took forever to rejoin with logs set to 20).
>
> To me it looks like the issue has resolved itself overnight. I had run a
> recursive scrub on the file system and another snapshot was taken, in
> case any of those might have had an effect on this. It could also be the
> case that the (supposedly) corrupt journal entry has simply been
> committed now and hence doesn't trigger the assertion any more. Is there
> any way I can verify this?

You can run:

https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py

Just do:

python3 first-damage.py --memo run.1 

No need to do any of the other steps if you just want a read-only check.

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Manual Upgrade from Octopus Ubuntu 18.04 to Quincy 20.04

2023-05-03 Thread Iban Cabrillo
Dear Cephers, 
We are planning the dist upgrade from Octopus to Quincy in the next weeks. 
The first step is the Linux version upgrade from Ubuntu 18.04 to Ubuntu 20.04 
on some big OSD servers running this OS version. 
We just had a look at "Upgrading non-cephadm clusters"
(https://ceph.io/en/news/blog/2022/v17-2-0-quincy-released/#upgrading-non-cephadm-clusters):
https://ceph.io/en/news/blog/2022/v17-2-0-quincy-released/ 
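
For context, a hedged sketch of the upgrade order described in that post for
non-cephadm clusters (not a complete procedure, just the skeleton):

  ceph osd set noout                     # avoid rebalancing while daemons restart
  # upgrade packages and restart daemons in order: mons -> mgrs -> osds -> mds/rgw
  ceph versions                          # confirm all daemons report 17.2.x
  ceph osd require-osd-release quincy    # only once every OSD runs Quincy
  ceph osd unset noout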

Is there any advice or suggestion before starting the procedure? 

regards, I 
-- 

 
Ibán Cabrillo Bartolomé 
Instituto de Física de Cantabria (IFCA-CSIC) 
Santander, Spain 
Tel: +34942200969/+34669930421 
Responsible for advanced computing service (RSC) 
=
 
=
 
All our suppliers must know and accept IFCA policy available at: 

https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
 
==
 


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Flushing stops as copy-from message being throttled

2023-05-03 Thread Eugen Block

Hi,

I doubt that you will get a satisfying response for cache-tier related  
questions. It hasn't been maintained for quite some time and is  
considered deprecated for years. It will be removed in one of the  
upcoming releases, maybe Reef.


Regards,
Eugen

Zitat von lingu2008 :


Hi all,

On one server with a cache tier on Samsung PM983 SSDs for an EC base  
tier on HDDs, I find the cache tier stops flushing or evicting when  
the cache tier is near full. With quite some gdb-debugging, I find  
the problem may be with the throttling mechanism. When the write  
traffic is high, the cache tier quickly fills its maximum request  
count and throttles further requests. Then flush stops because  
copy-from requests are throttled by the cache tier OSD. Ironically,  
the 256 requests already accepted by the cache tier cannot proceed,  
either, because the cache tier is full and cannot flush/evict.


While we may advise cache tier should not go full, this deadlock  
situation is not entirely comprehensible to me because a full cache  
usually can flush/evict as long as the base tier has space.


I wonder whether there has been some specific reasons for this  
behavior. My test environment is with version 15.2.17 but the code  
in 17.2.2 appears to handle this part of logic in the same way.


Cheers,

lin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: quincy 17.2.6 - write performance continuously slowing down until OSD restart needed

2023-05-03 Thread Igor Fedotov



On 5/2/2023 9:02 PM, Nikola Ciprich wrote:


hewever, probably worh noting, historically we're using following OSD options:
ceph config set osd bluestore_rocksdb_options 
compression=kNoCompression,max_write_buffer_number=32,min_write_buffer_number_to_merge=2,recycle_log_file_num=32,compaction_style=kCompactionStyleLevel,write_buffer_size=67108864,target_file_size_base=67108864,max_background_compactions=31,level0_file_num_compaction_trigger=8,level0_slowdown_writes_trigger=32,level0_stop_writes_trigger=64,max_bytes_for_level_base=536870912,compaction_threads=32,max_bytes_for_level_multiplier=8,flusher_threads=8,compaction_readahead_size=2MB
ceph config set osd bluestore_cache_autotune 0
ceph config set osd bluestore_cache_size_ssd 2G
ceph config set osd bluestore_cache_kv_ratio 0.2
ceph config set osd bluestore_cache_meta_ratio 0.8
ceph config set osd osd_min_pg_log_entries 10
ceph config set osd osd_max_pg_log_entries 10
ceph config set osd osd_pg_log_dups_tracked 10
ceph config set osd osd_pg_log_trim_min 10

so maybe I'll start resetting those to defaults (ie enabling cache autotune etc)
as a first step..


Generally I wouldn't recommend using non-default settings unless there 
are explicit rationales. So yeah better to revert to defaults whenever 
possible.
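
As a hedged sketch (assuming the overrides were set via `ceph config set osd ...`
as quoted above), reverting them could look like:

  for opt in bluestore_rocksdb_options bluestore_cache_autotune \
             bluestore_cache_size_ssd bluestore_cache_kv_ratio \
             bluestore_cache_meta_ratio osd_min_pg_log_entries \
             osd_max_pg_log_entries osd_pg_log_dups_tracked osd_pg_log_trim_min; do
      ceph config rm osd "$opt"   # remove the override, fall back to the default
  done
  ceph config dump | grep -E 'bluestore|pg_log'   # verify nothing is left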


I doubt this is a root cause for your issue though..







Thanks,

Igor

On 5/2/2023 11:32 AM, Nikola Ciprich wrote:

Hello dear CEPH users and developers,

we're dealing with strange problems.. we're having 12 node alma linux 9 cluster,
initially installed CEPH 15.2.16, then upgraded to 17.2.5. It's running bunch
of KVM virtual machines accessing volumes using RBD.

everything is working well, but there is strange and for us quite serious issue
   - speed of write operations (both sequential and random) is constantly 
degrading
   drastically to almost unusable numbers (in ~1week it drops from ~70k 4k 
writes/s
   from 1 VM  to ~7k writes/s)

When I restart all OSD daemons, numbers immediately return to normal..

volumes are stored on replicated pool of 4 replicas, on top of 7*12 = 84
INTEL SSDPE2KX080T8 NVMEs.

I've updated cluster to 17.2.6 some time ago, but the problem persists. This is
especially annoying in connection with https://tracker.ceph.com/issues/56896
as restarting OSDs is quite painfull when half of them crash..

I don't see anything suspicious, nodes load is quite low, no logs errors,
network latency and throughput is OK too

Anyone having simimar issue?

I'd like to ask for hints on what should I check further..

we're running lots of 14.2.x and 15.2.x clusters, none showing similar
issue, so I'm suspecting this is something related to quincy

thanks a lot in advance

with best regards

nikola ciprich




--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx


--
Igor Fedotov
Ceph Lead Developer

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD mirroring, asking for clarification

2023-05-03 Thread Eugen Block

Hi,

the question is if both sites are used as primary clusters from  
different clients or if it's for disaster recovery only (site1 fails,  
make site2 primary). If both clusters are used independently with  
different clients I would prefer to separate the pools, so this option:



PoolA (site1)  -> PoolA (site2)
PoolB (site1) <-  PoolB (site2)


That means for images in poolA site1 is the primary site while site2  
is the backup site. And for images in poolB site2 is the primary site.
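
A hedged sketch of what that could look like with image-mode mirroring (pool,
image and site names are placeholders; an rbd-mirror daemon must run on the
receiving site):

  # on both sites: enable per-image mirroring for the pool
  rbd mirror pool enable PoolA image
  # on site1 (primary for PoolA): create a bootstrap token
  rbd mirror pool peer bootstrap create --site-name site1 PoolA > token
  # on site2: import the token, receive-only for this pool
  rbd mirror pool peer bootstrap import --site-name site2 --direction rx-only PoolA token
  # then pick the VM images to replicate, e.g. snapshot-based:
  rbd mirror image enable PoolA/vm-disk-1 snapshot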


Zitat von wodel youchi :


Hi,

The goal is to sync some VMs from site1 - to - site2 and vice-versa sync
some VMs in the other way.
I am thinking of using rdb mirroring for that. But I have little experience
with Ceph management.

I am searching for the best way to do that.

I could create two pools on each site, and cross sync the pools.
PoolA (site1)  -> PoolA (site2)
PoolB (site1) <-  PoolB (site2)

Or create one pool on each site and cross sync the VMs I need.
PoolA (site1) <-> PoolA (site2)


The first option seems to be the safest and the easiest to manage.

Regards.



Le mer. 3 mai 2023 à 08:21, Eugen Block  a écrit :


Hi,

just to clarify, you mean in addition to the rbd mirroring you want to
have another sync of different VMs between those clusters (potentially
within the same pools) or are you looking for one option only? Please
clarify. Anyway, I would use dedicated pools for rbd mirroring and
then add more pools for different use-cases.

Regards,
Eugen

Zitat von wodel youchi :

> Hi,
>
> Thanks
> I am trying to find out what is the best way to synchronize VMS between
two
> HCI Proxmox clusters.
> Each cluster will contain 3 compute/storage nodes and each node will
> contain 4 nvme osd disks.
>
> There will be a 10gbs link between the two platforms.
>
> The idea is to be able to sync VMS between the two platforms in case of
> disaster bring the synced VMS up.
>
> Would you recommend to create a dedicated pool in each platform to
> synchronization?
>
> Regards.
>
> On Tue, May 2, 2023, 13:30 Eugen Block  wrote:
>
>> Hi,
>>
>> while your assumptions are correct (you can use the rest of the pool
>> for other non-mirrored images), at least I'm not aware of any
>> limitations, can I ask for the motivation behind this question? Mixing
>> different use-cases doesn't seem like a good idea to me. There's
>> always a chance that a client with caps for that pool deletes or
>> modifies images or even the entire pool. Why not simply create a
>> different pool and separate those clients?
>>
>> Thanks,
>> Eugen
>>
>> Zitat von wodel youchi :
>>
>> > Hi,
>> >
>> > When using rbd mirroring, the mirroring concerns the images only, not
the
>> > whole pool? So, we don't need to have a dedicated pool in the
destination
>> > site to be mirrored, the only obligation is that the mirrored pools
must
>> > have the same name.
>> >
>> > In other words, We create two pools with the same name, one on the
source
>> > site the other on the destination site, we create the mirror link (one
>> way
>> > or two ways replication), then we choose what images to sync.
>> >
>> > Both pools can be used simultaneously on both sites, it's the mirrored
>> > images that cannot be used simultaneously, only promoted ones.
>> >
>> > Is this correct?
>> >
>> > Regards.
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>







___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd map: corrupt full osdmap (-22) when

2023-05-03 Thread Eugen Block

Hi,

seems like a recurring issue, e.g. [1] [2], but it seems to be  
triggered by something different than [1] since you don't seem to have  
discontinuous OSD numbers. Maybe a regression, I don't really know,  
maybe file a new tracker issue for that?


Thanks,
Eugen

[1] https://tracker.ceph.com/issues/51112
[2] https://www.spinics.net/lists/ceph-users/msg64969.html

Zitat von Kamil Madac :


Hi,

We deployed pacific cluster 16.2.12 with cephadm. We experience following
error during rbd map:

[Wed May  3 08:59:11 2023] libceph: mon2 (1)[2a00:da8:ffef:1433::]:6789
session established
[Wed May  3 08:59:11 2023] libceph: another match of type 1 in addrvec
[Wed May  3 08:59:11 2023] libceph: corrupt full osdmap (-22) epoch 200 off
1042 (9876284d of 0cb24b58-80b70596)
[Wed May  3 08:59:11 2023] osdmap: : 08 07 7d 10 00 00 09 01 5d 09
00 00 a2 22 3b 86  ..}.]";.
[Wed May  3 08:59:11 2023] osdmap: 0010: e4 f5 11 ed 99 ee 47 75 ca 3c
ad 23 c8 00 00 00  ..Gu.<.#
[Wed May  3 08:59:11 2023] osdmap: 0020: 21 68 4a 64 98 d2 5d 2e 84 fd
50 64 d9 3a 48 26  !hJd..]...Pd.:H&
[Wed May  3 08:59:11 2023] osdmap: 0030: 02 00 00 00 01 00 00 00 00 00
00 00 1d 05 71 01  ..q.


Linux Kernel is 6.1.13 and the important thing is that we are using ipv6
addresses for connection to ceph nodes.
We were able to map rbd from client with kernel 5.10, but in prod
environment we are not allowed to use that kernel.

What could be the reason for such behavior on newer kernels and how to
troubleshoot it?

Here is output of ceph osd dump:

# ceph osd dump
epoch 200
fsid a2223b86-e4f5-11ed-99ee-4775ca3cad23
created 2023-04-27T12:18:41.777900+
modified 2023-05-02T12:09:40.642267+
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 34
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client jewel
require_osd_release pacific
stretch_mode_enabled false
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 183
flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application
mgr_devicehealth
pool 2 'idp' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins
pg_num 32 pgp_num 32 autoscale_mode on last_change 48 flags
hashpspool,selfmanaged_snaps stripe_width 0 application rbd
max_osd 3
osd.0 up   in  weight 1 up_from 176 up_thru 182 down_at 172
last_clean_interval [170,171)
[v2:[2a00:da8:ffef:1431::]:6800/805023868,v1:[2a00:da8:ffef:1431::]:6801/805023868,v2:
0.0.0.0:6802/805023868,v1:0.0.0.0:6803/805023868]
[v2:[2a00:da8:ffef:1431::]:6804/805023868,v1:[2a00:da8:ffef:1431::]:6805/805023868,v2:
0.0.0.0:6806/805023868,v1:0.0.0.0:6807/805023868] exists,up
e8fd0ee2-ea63-4d02-8f36-219d36869078
osd.1 up   in  weight 1 up_from 136 up_thru 182 down_at 0
last_clean_interval [0,0)
[v2:[2a00:da8:ffef:1432::]:6800/2172723816,v1:[2a00:da8:ffef:1432::]:6801/2172723816,v2:
0.0.0.0:6802/2172723816,v1:0.0.0.0:6803/2172723816]
[v2:[2a00:da8:ffef:1432::]:6804/2172723816,v1:[2a00:da8:ffef:1432::]:6805/2172723816,v2:
0.0.0.0:6806/2172723816,v1:0.0.0.0:6807/2172723816] exists,up
0b7b5628-9273-4757-85fb-9c16e8441895
osd.2 up   in  weight 1 up_from 182 up_thru 182 down_at 178
last_clean_interval [123,177)
[v2:[2a00:da8:ffef:1433::]:6800/887631330,v1:[2a00:da8:ffef:1433::]:6801/887631330,v2:
0.0.0.0:6802/887631330,v1:0.0.0.0:6803/887631330]
[v2:[2a00:da8:ffef:1433::]:6804/887631330,v1:[2a00:da8:ffef:1433::]:6805/887631330,v2:
0.0.0.0:6806/887631330,v1:0.0.0.0:6807/887631330] exists,up
21f8d0d5-6a3f-4f78-96c8-8ec4e4f78a01


Thank you.
--
Kamil Madac
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd map: corrupt full osdmap (-22) when

2023-05-03 Thread Kamil Madac
Hi,

We deployed pacific cluster 16.2.12 with cephadm. We experience following
error during rbd map:

[Wed May  3 08:59:11 2023] libceph: mon2 (1)[2a00:da8:ffef:1433::]:6789
session established
[Wed May  3 08:59:11 2023] libceph: another match of type 1 in addrvec
[Wed May  3 08:59:11 2023] libceph: corrupt full osdmap (-22) epoch 200 off
1042 (9876284d of 0cb24b58-80b70596)
[Wed May  3 08:59:11 2023] osdmap: : 08 07 7d 10 00 00 09 01 5d 09
00 00 a2 22 3b 86  ..}.]";.
[Wed May  3 08:59:11 2023] osdmap: 0010: e4 f5 11 ed 99 ee 47 75 ca 3c
ad 23 c8 00 00 00  ..Gu.<.#
[Wed May  3 08:59:11 2023] osdmap: 0020: 21 68 4a 64 98 d2 5d 2e 84 fd
50 64 d9 3a 48 26  !hJd..]...Pd.:H&
[Wed May  3 08:59:11 2023] osdmap: 0030: 02 00 00 00 01 00 00 00 00 00
00 00 1d 05 71 01  ..q.


Linux Kernel is 6.1.13 and the important thing is that we are using ipv6
addresses for connection to ceph nodes.
We were able to map rbd from client with kernel 5.10, but in prod
environment we are not allowed to use that kernel.

What could be the reason for such behavior on newer kernels and how to
troubleshoot it?

Here is output of ceph osd dump:

# ceph osd dump
epoch 200
fsid a2223b86-e4f5-11ed-99ee-4775ca3cad23
created 2023-04-27T12:18:41.777900+
modified 2023-05-02T12:09:40.642267+
flags sortbitwise,recovery_deletes,purged_snapdirs,pglog_hardlimit
crush_version 34
full_ratio 0.95
backfillfull_ratio 0.9
nearfull_ratio 0.85
require_min_compat_client luminous
min_compat_client jewel
require_osd_release pacific
stretch_mode_enabled false
pool 1 'device_health_metrics' replicated size 3 min_size 2 crush_rule 0
object_hash rjenkins pg_num 1 pgp_num 1 autoscale_mode on last_change 183
flags hashpspool stripe_width 0 pg_num_max 32 pg_num_min 1 application
mgr_devicehealth
pool 2 'idp' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins
pg_num 32 pgp_num 32 autoscale_mode on last_change 48 flags
hashpspool,selfmanaged_snaps stripe_width 0 application rbd
max_osd 3
osd.0 up   in  weight 1 up_from 176 up_thru 182 down_at 172
last_clean_interval [170,171)
[v2:[2a00:da8:ffef:1431::]:6800/805023868,v1:[2a00:da8:ffef:1431::]:6801/805023868,v2:
0.0.0.0:6802/805023868,v1:0.0.0.0:6803/805023868]
[v2:[2a00:da8:ffef:1431::]:6804/805023868,v1:[2a00:da8:ffef:1431::]:6805/805023868,v2:
0.0.0.0:6806/805023868,v1:0.0.0.0:6807/805023868] exists,up
e8fd0ee2-ea63-4d02-8f36-219d36869078
osd.1 up   in  weight 1 up_from 136 up_thru 182 down_at 0
last_clean_interval [0,0)
[v2:[2a00:da8:ffef:1432::]:6800/2172723816,v1:[2a00:da8:ffef:1432::]:6801/2172723816,v2:
0.0.0.0:6802/2172723816,v1:0.0.0.0:6803/2172723816]
[v2:[2a00:da8:ffef:1432::]:6804/2172723816,v1:[2a00:da8:ffef:1432::]:6805/2172723816,v2:
0.0.0.0:6806/2172723816,v1:0.0.0.0:6807/2172723816] exists,up
0b7b5628-9273-4757-85fb-9c16e8441895
osd.2 up   in  weight 1 up_from 182 up_thru 182 down_at 178
last_clean_interval [123,177)
[v2:[2a00:da8:ffef:1433::]:6800/887631330,v1:[2a00:da8:ffef:1433::]:6801/887631330,v2:
0.0.0.0:6802/887631330,v1:0.0.0.0:6803/887631330]
[v2:[2a00:da8:ffef:1433::]:6804/887631330,v1:[2a00:da8:ffef:1433::]:6805/887631330,v2:
0.0.0.0:6806/887631330,v1:0.0.0.0:6807/887631330] exists,up
21f8d0d5-6a3f-4f78-96c8-8ec4e4f78a01


Thank you.
-- 
Kamil Madac
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD mirroring, asking for clarification

2023-05-03 Thread wodel youchi
Hi,

The goal is to sync some VMs from site1 - to - site2 and vice-versa sync
some VMs in the other way.
I am thinking of using rdb mirroring for that. But I have little experience
with Ceph management.

I am searching for the best way to do that.

I could create two pools on each site, and cross sync the pools.
PoolA (site1)  -> PoolA (site2)
PoolB (site1) <-  PoolB (site2)

Or create one pool on each site and cross sync the VMs I need.
PoolA (site1) <-> PoolA (site2)


The first option seems to be the safest and the easiest to manage.

Regards.



Le mer. 3 mai 2023 à 08:21, Eugen Block  a écrit :

> Hi,
>
> just to clarify, you mean in addition to the rbd mirroring you want to
> have another sync of different VMs between those clusters (potentially
> within the same pools) or are you looking for one option only? Please
> clarify. Anyway, I would use dedicated pools for rbd mirroring and
> then add more pools for different use-cases.
>
> Regards,
> Eugen
>
> Zitat von wodel youchi :
>
> > Hi,
> >
> > Thanks
> > I am trying to find out what is the best way to synchronize VMS between
> two
> > HCI Proxmox clusters.
> > Each cluster will contain 3 compute/storage nodes and each node will
> > contain 4 nvme osd disks.
> >
> > There will be a 10gbs link between the two platforms.
> >
> > The idea is to be able to sync VMS between the two platforms in case of
> > disaster bring the synced VMS up.
> >
> > Would you recommend to create a dedicated pool in each platform to
> > synchronization?
> >
> > Regards.
> >
> > On Tue, May 2, 2023, 13:30 Eugen Block  wrote:
> >
> >> Hi,
> >>
> >> while your assumptions are correct (you can use the rest of the pool
> >> for other non-mirrored images), at least I'm not aware of any
> >> limitations, can I ask for the motivation behind this question? Mixing
> >> different use-cases doesn't seem like a good idea to me. There's
> >> always a chance that a client with caps for that pool deletes or
> >> modifies images or even the entire pool. Why not simply create a
> >> different pool and separate those clients?
> >>
> >> Thanks,
> >> Eugen
> >>
> >> Zitat von wodel youchi :
> >>
> >> > Hi,
> >> >
> >> > When using rbd mirroring, the mirroring concerns the images only, not
> the
> >> > whole pool? So, we don't need to have a dedicated pool in the
> destination
> >> > site to be mirrored, the only obligation is that the mirrored pools
> must
> >> > have the same name.
> >> >
> >> > In other words, We create two pools with the same name, one on the
> source
> >> > site the other on the destination site, we create the mirror link (one
> >> way
> >> > or two ways replication), then we choose what images to sync.
> >> >
> >> > Both pools can be used simultaneously on both sites, it's the mirrored
> >> > images that cannot be used simultaneously, only promoted ones.
> >> >
> >> > Is this correct?
> >> >
> >> > Regards.
> >> > ___
> >> > ceph-users mailing list -- ceph-users@ceph.io
> >> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> >>
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
>
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How can I use not-replicated pool (replication 1 or raid-0)

2023-05-03 Thread Frank Schilder
Hi mhnx.

> I also agree with you, Ceph is not designed for this kind of use case
> but I tried to continue what I know.
If your only tool is a hammer ...
Sometimes its worth looking around.

While your tests show that a rep-1 pool is faster than a rep-2 pool, the values 
are not exactly impressive. There are 2 things that are relevant here: ceph is 
a high latency system, its software stack is quite heavy-weight. Even for a 
rep-1 pool its doing a lot to ensure data integrity. BeeGFS is a lightweight 
low-latency system skipping a lot of magic, which makes it very suited for 
performance critical tasks but less for long-term archival applications.

The second is that the device /dev/urandom is actually very slow (and even 
unpredictable on some systems, it might wait for more entropy to be created). 
Your times are almost certainly affected by that. If you want to have 
comparable and close to native storage performance, create the files you want 
to write to storage first in RAM and then copy from RAM to storage. Using 
random data is a good idea to bypass potential built-in accelerations for 
special data, like all-zeros. However, exclude the random number generator from 
the benchmark and generate the data first before timing its use.
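
A hedged sketch of what that could look like (paths and file counts are placeholders):

  # generate the random data once, in RAM, outside the timed section
  mkdir -p /dev/shm/bench
  for i in $(seq 1 1000); do head -c 1K /dev/urandom > /dev/shm/bench/randfile$i; done

  # only time the copy from RAM to the ceph-backed mount
  time cp -r /dev/shm/bench /mnt/cephfs/testdir

  # clean up the RAM copy afterwards
  rm -rf /dev/shm/bench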

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: mhnx 
Sent: Tuesday, May 2, 2023 5:25 PM
To: Frank Schilder
Cc: Janne Johansson; Ceph Users
Subject: Re: [ceph-users] Re: How can I use not-replicated pool (replication 1 
or raid-0)

Thank you for the explanation Frank.

I also agree with you, Ceph is not designed for this kind of use case
but I tried to continue what I know.
My idea was exactly what you described, I was trying to automate
cleaning or recreating on any failure.

As you can see below, rep1 pool is very fast:
- Create: time for i in {1..9}; do head -c 1K /dev/urandom > randfile$i; done
replication 2 : 31m59.917s
replication 1 : 7m6.046s

- Delete: time rm -rf testdir/
replication 2 : 11m56.994s
replication 1 : 0m40.756s
-

I started learning DRBD, I will also check BeeGFS thanks for the advice.

Regards.

Frank Schilder , 1 May 2023 Pzt, 10:27 tarihinde şunu yazdı:
>
> I think you misunderstood Janne's reply. The main statement is at the end, 
> ceph is not designed for an "I don't care about data" use case. If you need 
> speed for temporary data where you can sustain data loss, go for something 
> simpler. For example, we use beegfs with great success for a burst buffer for 
> an HPC cluster. It is very lightweight and will pull out all performance your 
> drives can offer. In case of disaster it is easily possible to clean up. 
> Beegfs does not care about lost data, such data will simply become 
> inaccessible while everything else just moves on. It will not try to 
> self-heal either. It doesn't even scrub data, so no competition of users with 
> admin IO.
>
> Its pretty much your use case. We clean it up every 6-8 weeks and if 
> something breaks we just redeploy the whole thing from scratch. Performance 
> is great and its a very simple and economic system to administrate. No need 
> for the whole ceph daemon engine with large RAM requirements and extra admin 
> daemons.
>
> Use ceph for data you want to survive a nuclear blast. Don't use it for 
> things its not made for and then complain.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: mhnx 
> Sent: Saturday, April 29, 2023 5:48 AM
> To: Janne Johansson
> Cc: Ceph Users
> Subject: [ceph-users] Re: How can I use not-replicated pool (replication 1 or 
> raid-0)
>
> Hello Janne, thank you for your response.
>
> I understand your advice and be sure that I've designed too many EC
> pools and I know the mess. This is not an option because I need SPEED.
>
> Please let me tell you, my hardware first to meet the same vision.
> Server: R620
> Cpu: 2 x Xeon E5-2630 v2 @ 2.60GHz
> Ram: 128GB - DDR3
> Disk1: 20x Samsung SSD 860 2TB
> Disk2: 10x Samsung SSD 870 2TB
>
> My ssds does not have PLP. Because of that, every ceph write also
> waits for TRIM. I want to know how much latency we are talking about
> because I'm thinking of adding PLP NVME for wal+db cache to gain some
> speed.
> As you can see, I even try to gain from every TRIM command.
> Currently I'm testing replication 2 pool and even this speed is not
> enough for my use case.
> Now I'm trying to boost the deletion speed because I'm writing and
> deleting files all the time and this never ends.
> I write this mail because replication 1 will decrease the deletion
> speed but still I'm trying to tune some MDS+ODS parameters to increase
> delete speed.
>
> Any help and idea will be great for me. Thanks.
> Regards.
>
>
>
> Janne Johansson , 12 Nis 2023 Çar, 10:10
> tarihinde şunu yazdı:
> >
> > Den mån 10 apr. 2023

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-03 Thread Janek Bevendorff

Hi Patrick,


I'll try that tomorrow and let you know, thanks!


I was unable to reproduce the crash today. Even with 
mds_abort_on_newly_corrupt_dentry set to true, all MDS booted up 
correctly (though they took forever to rejoin with logs set to 20).


To me it looks like the issue has resolved itself overnight. I had run a 
recursive scrub on the file system and another snapshot was taken, in 
case any of those might have had an effect on this. It could also be the 
case that the (supposedly) corrupt journal entry has simply been 
committed now and hence doesn't trigger the assertion any more. Is there 
any way I can verify this?


Janek
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Help needed to configure erasure coding LRC plugin

2023-05-03 Thread Eugen Block
I think I got it wrong with the locality setting, I'm still limited by  
the number of hosts I have available in my test cluster, but as far as  
I got with failure-domain=osd I believe k=6, m=3, l=3 with  
locality=datacenter could fit your requirement, at least with regards  
to the recovery bandwidth usage between DCs, but the resiliency would  
not match your requirement (one DC failure). That profile creates 3  
groups of 4 chunks (3 data/coding chunks and one parity chunk) across  
three DCs, in total 12 chunks. The min_size=7 would not allow an  
entire DC to go down, I'm afraid, you'd have to reduce it to 6 to  
allow reads/writes in a disaster scenario. I'm still not sure if I got  
it right this time, but maybe you're better off without the LRC plugin  
with the limited number of hosts. Instead you could use the jerasure  
plugin with a profile like k=4 m=5 allowing an entire DC to fail  
without losing data access (we have one customer using that).
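
A hedged sketch of what those two profiles could look like (profile names are
placeholders, untested here):

  # LRC variant mentioned above
  ceph osd erasure-code-profile set lrc633 plugin=lrc k=6 m=3 l=3 \
      crush-failure-domain=osd crush-locality=datacenter
  # jerasure alternative (k=4, m=5); spreading 3 of the 9 chunks per DC would
  # still need a matching CRUSH rule (3 datacenters, 3 hosts/OSDs in each)
  ceph osd erasure-code-profile set jer45 plugin=jerasure k=4 m=5 \
      crush-failure-domain=host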


Zitat von Eugen Block :


Hi,

disclaimer: I haven't used LRC in a real setup yet, so there might  
be some misunderstandings on my side. But I tried to play around  
with one of my test clusters (Nautilus). Because I'm limited in the  
number of hosts (6 across 3 virtual DCs) I tried two different  
profiles with lower numbers to get a feeling for how that works.


# first attempt
ceph:~ # ceph osd erasure-code-profile set LRCprofile plugin=lrc k=4  
m=2 l=3 crush-failure-domain=host


For every third OSD one parity chunk is added, so 2 more chunks to  
store ==> 8 chunks in total. Since my failure-domain is host and I  
only have 6 I get incomplete PGs.


# second attempt
ceph:~ # ceph osd erasure-code-profile set LRCprofile plugin=lrc k=2  
m=2 l=2 crush-failure-domain=host


This gives me 6 chunks in total to store across 6 hosts which works:

ceph:~ # ceph pg ls-by-pool lrcpool
PG   OBJECTS DEGRADED MISPLACED UNFOUND BYTES OMAP_BYTES* OMAP_KEYS*  
LOG STATESINCE VERSION REPORTED UPACTING  
   SCRUB_STAMPDEEP_SCRUB_STAMP
50.0   10 0   0   619   0  0  
  1 active+clean   72s 18410'1 18415:54   [27,13,0,2,25,7]p27
[27,13,0,2,25,7]p27 2023-05-02 14:53:54.322135 2023-05-02  
14:53:54.322135
50.1   00 0   0 0   0  0  
  0 active+clean6m 0'0 18414:26 [27,33,22,6,13,34]p27  
[27,33,22,6,13,34]p27 2023-05-02 14:53:54.322135 2023-05-02  
14:53:54.322135
50.2   00 0   0 0   0  0  
  0 active+clean6m 0'0 18413:25   [1,28,14,4,31,21]p1
[1,28,14,4,31,21]p1 2023-05-02 14:53:54.322135 2023-05-02  
14:53:54.322135
50.3   00 0   0 0   0  0  
  0 active+clean6m 0'0 18413:24   [8,16,26,33,7,25]p8
[8,16,26,33,7,25]p8 2023-05-02 14:53:54.322135 2023-05-02  
14:53:54.322135


After stopping all OSDs on one host I was still able to read and  
write into the pool, but after stopping a second host one PG from  
that pool went "down". That I don't fully understand yet, but I just  
started to look into it.
With your setup (12 hosts) I would recommend to not utilize all of  
them so you have capacity to recover, let's say one "spare" host per  
DC, leaving 9 hosts in total. A profile with k=3 m=3 l=2 could make  
sense here, resulting in 9 total chunks (one more parity chunks for  
every other OSD), min_size 4. But as I wrote, it probably doesn't  
have the resiliency for a DC failure, so that needs some further  
investigation.


Regards,
Eugen

Zitat von Michel Jouvin :


Hi,

No... our current setup is 3 datacenters with the same  
configuration, i.e. 1 mon/mgr + 4 OSD servers with 16 OSDs each.  
Thus a total of 12 OSD servers. As with the LRC plugin, k+m must be  
a multiple of l, and I found that k=9/m=6/l=5 with  
crush-locality=datacenter was achieving my goal of being resilient  
to a datacenter failure. Because I had this, I considered that  
lowering the crush failure domain to osd was not a major issue in  
my case (as it would not be worst than a datacenter failure if all  
the shards are on the same server in a datacenter) and was working  
around the lack of hosts for k=9/m=6 (15 OSDs).


Maybe it helps if I give the erasure code profile used:

crush-device-class=hdd
crush-failure-domain=osd
crush-locality=datacenter
crush-root=default
k=9
l=5
m=6
plugin=lrc
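
If my understanding of the LRC plugin is correct, the chunk count for this profile works out as follows (just a sanity check on paper, not verified on the cluster):

k + m = 9 + 6 = 15 data/coding chunks
(k + m) / l = 15 / 5 = 3 local parity chunks (one per group of l=5)
total = 15 + 3 = 18 chunks, i.e. 6 chunks per datacenter with crush-locality=datacenter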

The previously mentioned strange number for min_size for the pool
created with this profile has vanished after the Quincy upgrade, as
this parameter is no longer in the CRUSH rule, and the `ceph osd pool
get` command reports the expected number (10):


-


ceph osd pool get fink-z1.rgw.buckets.data min_size

min_size: 10


Cheers,

Michel

On 29/04/2023 at 20:36, Curt wrote:

Hello,

What is your current setup, 1 server per data center with 12 OSDs
each? What is your current crush rule and LRC crush rule?



On Fri, Apr 28

[ceph-users] Re: 17.2.6 fs 'ls' ok, but 'cat' 'operation not permitted' puzzle

2023-05-03 Thread Eugen Block

Hi,

we had the NFS discussion a few weeks back [2] and at the Cephalocon I  
talked to Zac about it.


@Zac: seems like not only NFS over CephFS is affected but CephFS in  
general. Could you add that note about the application metadata to the  
general CephFS docs as well?


Thanks,
Eugen

[2]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/2NL2Q57HTSGDDBLARLRCVRVX2PE6FKDA/


Zitat von Harry G Coin :

This problem of file systems being inaccessible post-upgrade to
clients other than client.admin dates back to v14 and carries on
through v17.  It also applies to any case of specifying other than
the default pool names for new file systems.  Solved because Curt
remembered a link on this list.  (Thanks Curt!) Here's what the
official ceph docs ought to have provided, for others who hit this.  YMMV:


   IF

   you have ceph file systems whose data and metadata pools were
   specified in the 'ceph fs new' command (meaning not left to the
   defaults, which create them for you),

   OR

   you have an existing ceph file system and are upgrading to a new
   major version of ceph

   THEN

   for the documented 'ceph fs authorize...' commands to work as
   documented (and to avoid strange 'operation not permitted' errors
   when doing file I/O, or similar security-related problems, for all
   users except client.admin), you must first run:

   ceph osd pool application set <metadata pool name> cephfs
   metadata <filesystem name>

   and

   ceph osd pool application set <data pool name> cephfs data
   <filesystem name>

   Otherwise, when the OSDs get a request to read or write data (not
   the directory info, but file data) they won't know which ceph file
   system name to look up, never mind the names you may have chosen for
   the pools, as the 'defaults' themselves changed in the major
   releases, from

   data pool=fsname
   metadata pool=fsname_metadata

   to

   data pool=fsname.data and
   metadata pool=fsname.meta

   as the ceph revisions came and went.  Any setup that just used
   'client.admin' for all mounts didn't see the problem as the admin
   key gave blanket permission.

   A temporary 'fix' is to change mount requests to use 'client.admin'
   and its associated key.  A less drastic, but still only half-fix, is
   to change the osd cap for your user to just 'caps osd = "allow rw"'
   and delete the '"tag cephfs data=<filesystem name>"' part.

The only documentation I could find for this upgrade-related,
security-related, ceph-ending catastrophe was in the NFS docs, not
the cephfs docs:


https://docs.ceph.com/en/latest/cephfs/nfs/

and the genius-level, much-appreciated pointer from Curt here:


On 5/2/23 14:21, Curt wrote:
This thread might be of use; it's about an older version of ceph (14), but
might still apply:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/23FDDSYBCDVMYGCUTALACPFAJYITLOHJ/
?


On Tue, May 2, 2023 at 11:06 PM Harry G Coin  wrote:

   In 17.2.6, is there a security requirement that the pool names
   supporting a ceph fs filesystem match the filesystem name.data for
   the data and name.meta for the associated metadata pool? (Multiple
   file systems are enabled.)

   I have filesystems from older versions with the data pool name
   matching
   the filesystem and appending _metadata for that,

   and even older filesystems with the pool name as in 'library' and
   'library_metadata' supporting a filesystem called 'libraryfs'

   The pools all have the cephfs tag.

   But using the documented:

   ceph fs authorize libraryfs client.basicuser / rw

   command allows the root user to mount and browse the library
   directory
   tree, but fails with 'operation not permitted' when even reading
   any file.

   However, changing the client.basicuser osd auth to 'allow rw'
   instead of
   'allow rw tag...' allows normal operations.

   So:

   [client.basicuser]
   key = ==
   caps mds = "allow rw fsname=libraryfs"
   caps mon = "allow r fsname=libraryfs"
   caps osd = "allow rw"

   works, but the same with

       caps osd = "allow rw tag cephfs data=libraryfs"

   leads to 'operation not permitted' on read, write, or any actual
   access.

   It remains a puzzle.  Help appreciated!

   Were there upgrade instructions about that? Any help pointing me
   to them?

   Thanks

   Harry Coin
   Rock Stable Systems



[ceph-users] Re: RBD mirroring, asking for clarification

2023-05-03 Thread Eugen Block

Hi,

just to clarify: do you mean that, in addition to the rbd mirroring, you
want another sync of different VMs between those clusters (potentially
within the same pools), or are you looking for one option only? Please
clarify. Anyway, I would use dedicated pools for rbd mirroring and
then add more pools for different use-cases.
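
If it helps, a dedicated mirrored pool could be set up roughly like this (untested sketch with snapshot-based mirroring; the pool name 'vm-mirror', the site names and the image name are just placeholders, and an rbd-mirror daemon has to run on the receiving side):

# on both clusters: enable per-image mirroring for the pool
rbd mirror pool enable vm-mirror image

# on site-a: create a bootstrap token, then import it on site-b
rbd mirror pool peer bootstrap create --site-name site-a vm-mirror > /tmp/bootstrap-token
rbd mirror pool peer bootstrap import --site-name site-b vm-mirror /tmp/bootstrap-token

# enable mirroring per image and add a snapshot schedule
rbd mirror image enable vm-mirror/vm-disk-1 snapshot
rbd mirror snapshot schedule add --pool vm-mirror 10m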


Regards,
Eugen

Quoting wodel youchi:


Hi,

Thanks
I am trying to find out the best way to synchronize VMs between two
HCI Proxmox clusters.
Each cluster will contain 3 compute/storage nodes, and each node will
contain 4 NVMe OSD disks.

There will be a 10 Gb/s link between the two platforms.

The idea is to be able to sync VMs between the two platforms and, in case of
disaster, bring the synced VMs up.

Would you recommend creating a dedicated pool in each platform for
synchronization?

Regards.

On Tue, May 2, 2023, 13:30 Eugen Block  wrote:


Hi,

while your assumptions are correct (you can use the rest of the pool
for other non-mirrored images), at least I'm not aware of any
limitations; can I ask for the motivation behind this question? Mixing
different use-cases doesn't seem like a good idea to me. There's
always a chance that a client with caps for that pool deletes or
modifies images, or even the entire pool. Why not simply create a
different pool and separate those clients?

Thanks,
Eugen

Quoting wodel youchi:

> Hi,
>
> When using rbd mirroring, does the mirroring concern the images only, not the
> whole pool? So we don't need to have a dedicated pool in the destination
> site to be mirrored; the only obligation is that the mirrored pools must
> have the same name.
>
> In other words, we create two pools with the same name, one on the source
> site and the other on the destination site, we create the mirror link
> (one-way or two-way replication), then we choose which images to sync.
>
> Both pools can be used simultaneously on both sites; it's the mirrored
> images that cannot be used simultaneously, only the promoted ones.
>
> Is this correct?
>
> Regards.

