[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread Dan van der Ster
Hi, I just want to echo what the others are saying. Keep in mind that RADOS needs to guarantee read-after-write consistency for the higher level apps to work (RBD, RGW, CephFS). If you corrupt VM block devices, S3 objects or bucket metadata/indexes, or CephFS metadata, you're going to suffer some

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-20 Thread Nizamudeen A
dashboard approved. Our e2e specs are passing, but the suite failed because of a different error: "cluster [WRN] Health check failed: 1 stray daemon(s) not managed by cephadm (CEPHADM_STRAY_DAEMON)" in cluster log. On Tue, Feb 20, 2024 at 9:29 PM Yuri Weinstein wrote: > We have restarted QE valida

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-20 Thread Venky Shankar
Hi Yuri, On Tue, Feb 20, 2024 at 9:29 PM Yuri Weinstein wrote: > > We have restarted QE validation after fixing issues and merging several PRs. > The new Build 3 (rebase of pacific) tests are summarized in the same > note (see Build 3 runs) https://tracker.ceph.com/issues/64151#note-1 > > Seeking

[ceph-users] User + Dev Meetup February 22 - CephFS Snapshots story!

2024-02-20 Thread Neha Ojha
Hi everyone, You are invited to join us at the User + Dev meeting this week Thursday, February 22 at 10:00 AM Eastern Time! Focus Topic: CephFS Snapshots Evaluation Presented by: Enrico Bocchi and Abhishek Lekshmanan, Ceph operators from CERN From the presenters: Ceph at CERN provides block, o

[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread Alex Gorbachev
I would be against such an option, because it introduces a significant risk of data loss. Ceph has made a name for itself as a very reliable system, where almost no one has lost data, no matter how bad a decision they made with architecture and design. This is what you pay for in commercial system

[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread Anthony D'Atri
Cache tiering is deprecated. > On Feb 20, 2024, at 17:03, Özkan Göksu wrote: > > Hello. > > I didn't test it personally but what about rep 1 write cache pool with nvme > backed by another rep 2 pool? > > It has the potential exactly what you are looking for in theory. > > > 1 Şub 2024 Per 20

[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread Özkan Göksu
Hello. I didn't test it personally, but what about a rep 1 write cache pool on nvme, backed by another rep 2 pool? In theory it has exactly the potential you are looking for. On Thu, Feb 1, 2024 at 20:54, quag...@bol.com.br wrote: > > > Ok Anthony, > > I understood what you said. I a
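
For readers unfamiliar with the mechanism being proposed: below is a rough sketch of how such a write-back cache tier would be wired up with the stock Ceph CLI. Pool names, PG counts, and the tuning values are illustrative placeholders, and, as noted elsewhere in this thread, cache tiering is deprecated, so this is shown only to clarify the idea, not as a recommendation.

    # Placeholder names: "base-pool" is the existing rep 2 pool,
    # "cache-pool" would be the NVMe-backed pool used as the tier.
    ceph osd pool create cache-pool 128 128 replicated
    ceph osd tier add base-pool cache-pool
    ceph osd tier cache-mode cache-pool writeback
    ceph osd tier set-overlay base-pool cache-pool
    # Basic tier tuning; values are examples only.
    ceph osd pool set cache-pool hit_set_type bloom
    ceph osd pool set cache-pool target_max_bytes 1000000000000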

[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread Anthony D'Atri
> Hi Anthony, > Did you decide that it's not a feature to be implemented? That isn't up to me. > I'm asking about this so I can offer options here. > > I'd not be comfortable enabling "mon_allow_pool_size_one" for a specific > pool. > > It would be better if this feature could
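
For context, a minimal sketch of how size-1 pools are enabled today; the pool name is a placeholder. Note that mon_allow_pool_size_one is a cluster-wide monitor option rather than a per-pool setting, which is exactly the limitation being discussed here.

    # Cluster-wide switch on the monitors (not scoped to one pool).
    ceph config set mon mon_allow_pool_size_one true
    # Each size change still requires an explicit confirmation flag.
    ceph osd pool set mypool size 1 --yes-i-really-mean-it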

[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread quag...@bol.com.br

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-20 Thread Ilya Dryomov
On Tue, Feb 20, 2024 at 4:59 PM Yuri Weinstein wrote: > > We have restarted QE validation after fixing issues and merging several PRs. > The new Build 3 (rebase of pacific) tests are summarized in the same > note (see Build 3 runs) https://tracker.ceph.com/issues/64151#note-1 > > Seeking approvals

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-20 Thread Yuri Weinstein
We have restarted QE validation after fixing issues and merging several PRs. The new Build 3 (rebase of pacific) tests are summarized in the same note (see Build 3 runs) https://tracker.ceph.com/issues/64151#note-1 Seeking approvals: rados - Radek, Junior, Travis, Ernesto, Adam King rgw - Casey f

[ceph-users] Re: Scrub stuck and 'pg has invalid (post-split) stat'

2024-02-20 Thread Eugen Block
Please don't drop the list from your response. The first question that comes to mind is: why do you have a cache-tier if all your pools are on nvme devices anyway? I don't see any benefit here. Did you try the suggested workaround and disable the cache-tier? Quoting Cedric: Thanks Eugen, see

[ceph-users] Re: Scrub stuck and 'pg has invalid (post-split) stat'

2024-02-20 Thread Eugen Block
Hi, some more details would be helpful, for example: what is the pool size of the cache pool? Did you issue a PG split before or during the upgrade? This thread [1] deals with the same problem; the described workaround was to set hit_set_count to 0 and disable the cache layer until that is
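
A rough sketch of the workaround described above, assuming a writeback cache tier named "cache-pool" in front of "base-pool" (both names are placeholders); the exact sequence should be checked against the cache-tiering documentation for the running release.

    # Stop accumulating hit sets on the cache pool.
    ceph osd pool set cache-pool hit_set_count 0
    # To take the tier out of the data path: switch it to proxy mode,
    # flush/evict its contents, then detach it from the base pool.
    ceph osd tier cache-mode cache-pool proxy
    rados -p cache-pool cache-flush-evict-all
    ceph osd tier remove-overlay base-pool
    ceph osd tier remove base-pool cache-pool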

[ceph-users] Re: RoCE?

2024-02-20 Thread Jan Marek
Hello, we've found the problem: the systemd unit for the OSD is missing this line in the [Service] section: LimitMEMLOCK=infinity. When I added this line to the systemd unit, the OSD daemon started and we have a HEALTH_OK state in the cluster status. Sincerely, Jan Marek. On Mon, Feb 05, 2024 at 11:10:21 CET
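
For anyone hitting the same RDMA/RoCE memlock limit, a sketch of how that line can be added via a systemd drop-in instead of editing the packaged unit file directly (the drop-in file name and the OSD id are placeholders):

    # Create an override for the OSD unit rather than editing it in place.
    mkdir -p /etc/systemd/system/ceph-osd@.service.d
    cat > /etc/systemd/system/ceph-osd@.service.d/memlock.conf <<'EOF'
    [Service]
    LimitMEMLOCK=infinity
    EOF
    # Reload systemd and restart the affected OSD (id 0 as an example).
    systemctl daemon-reload
    systemctl restart ceph-osd@0.service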