[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-23 Thread Ken Dreyer
How much more time would we need to get PR 50549 in if we delayed v17.2.6?

- Ken

On Thu, Mar 23, 2023 at 2:44 PM Laura Flores  wrote:

> We are all good on the Core end of things.
> https://github.com/ceph/ceph/pull/50549 is needed for downstream, but it
> should not block upstream.
>
> On Thu, Mar 23, 2023 at 12:59 PM Laura Flores  wrote:
>
>> https://github.com/ceph/ceph/pull/50575 was also merged.
>>
>> On Thu, Mar 23, 2023 at 12:36 PM Yuri Weinstein 
>> wrote:
>>
>>> We are still working on core PRs:
>>>
>>> https://github.com/ceph/ceph/pull/50549
>>> https://github.com/ceph/ceph/pull/50625 - merged
>>> https://github.com/ceph/ceph/pull/50575
>>>
>>> Will update as soon as we are ready for the next steps.
>>>
>>> On Thu, Mar 23, 2023 at 10:34 AM Casey Bodley 
>>> wrote:
>>> >
>>> > On Wed, Mar 22, 2023 at 9:27 AM Casey Bodley 
>>> wrote:
>>> > >
>>> > > On Tue, Mar 21, 2023 at 4:06 PM Yuri Weinstein 
>>> wrote:
>>> > > >
>>> > > > Details of this release are summarized here:
>>> > > >
>>> > > > https://tracker.ceph.com/issues/59070#note-1
>>> > > > Release Notes - TBD
>>> > > >
>>> > > > The reruns were in the queue for 4 days because of some slowness
>>> issues.
>>> > > > The core team (Neha, Radek, Laura, and others) are trying to narrow
>>> > > > down the root cause.
>>> > > >
>>> > > > Seeking approvals/reviews for:
>>> > > >
>>> > > > rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
>>> test
>>> > > > and merge at least one PR https://github.com/ceph/ceph/pull/50575
>>> for
>>> > > > the core)
>>> > > > rgw - Casey
>>> > >
>>> > > there were some java_s3test failures related to
>>> > > https://tracker.ceph.com/issues/58554. i've added the fix to
>>> > > https://github.com/ceph/java_s3tests/commits/ceph-quincy, so a rerun
>>> > > should resolve those failures
>>> > > there were also some 'Failed to fetch package version' failures in
>>> the
>>> > > rerun that warranted another rerun anyway
>>> > >
>>> > > there's also an urgent priority bug fix in
>>> > > https://github.com/ceph/ceph/pull/50625 that i'd really like to add
>>> to
>>> > > this release; sorry for the late notice
>>> >
>>> > this fix merged, so rgw is now approved. thanks Yuri
>>> >
>>> > >
>>> > > > fs - Venky (the fs suite has an unusually high number of failed
>>> jobs,
>>> > > > any reason to suspect it in the observed slowness?)
>>> > > > orch - Adam King
>>> > > > rbd - Ilya
>>> > > > krbd - Ilya
>>> > > > upgrade/octopus-x - Laura is looking into failures
>>> > > > upgrade/pacific-x - Laura is looking into failures
>>> > > > upgrade/quincy-p2p - Laura is looking into failures
>>> > > > client-upgrade-octopus-quincy-quincy - missing packages, Adam
>>> Kraitman
>>> > > > is looking into it
>>> > > > powercycle - Brad
>>> > > > ceph-volume - needs a rerun on merged
>>> > > > https://github.com/ceph/ceph-ansible/pull/7409
>>> > > >
>>> > > > Please reply to this email with approval and/or trackers of known
>>> > > > issues/PRs to address them.
>>> > > >
>>> > > > Also, share any findings or hypotheses about the slowness in the
>>> > > > execution of the suite.
>>> > > >
>>> > > > Josh, Neha - gibba and LRC upgrades pending major suites approvals.
>>> > > > RC release - pending major suites approvals.
>>> > > >
>>> > > > Thx
>>> > > > YuriW
>>> > > > ___
>>> > > > ceph-users mailing list -- ceph-users@ceph.io
>>> > > > To unsubscribe send an email to ceph-users-le...@ceph.io
>>> > > >
>>> >
>>
>>
>> --
>>
>> Laura Flores
>>
>> She/Her/Hers
>>
>> Software Engineer, Ceph Storage 
>>
>> Chicago, IL
>>
>> lflo...@ibm.com | lflo...@redhat.com 
>> M: +17087388804
>>
>>
>>
>
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
>


[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-23 Thread Laura Flores
I will have a review of the rados suite ready soon.



[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-23 Thread Laura Flores
We are all good on the Core end of things.
https://github.com/ceph/ceph/pull/50549 is needed for downstream, but it
should not block upstream.



[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-23 Thread Laura Flores
https://github.com/ceph/ceph/pull/50575 was also merged.



[ceph-users] Re: ln: failed to create hard link 'file name': Read-only file system

2023-03-23 Thread Frank Schilder
Hi Xiubo and Gregory,

sorry for the slow reply, I did some more debugging and didn't have too much
time. First, some questions about collecting logs; please also see below for
how to reproduce the issue yourselves.

I can reproduce it reliably but need some input for these:

> enabling the kclient debug logs and
How do I do that? I thought the kclient ignores the ceph.conf and I'm not aware 
of a mount option to this effect. Is there a "ceph config set ..." setting I 
can change for a specific client (by host name/IP) and how exactly?

> also the mds debug logs
I guess here I should set a higher loglevel for the MDS serving this directory 
(it is pinned to a single rank) or is it something else?
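For anyone following along in the archive: the kernel client indeed does not read ceph.conf, so its logging is typically turned on via the kernel's dynamic debug facility on the client host, while the MDS level can be raised with `ceph config set`. A hedged sketch of the usual commands follows; the daemon name `mds.tceph-02` is only an example taken from the test cluster below, and the exact subsystems and levels are worth double-checking against the Ceph documentation for your release:

```shell
# Kernel client (on the client host): enable dynamic debug for the
# ceph and libceph kernel modules. Requires debugfs; very verbose.
# Output lands in the kernel log (dmesg / journalctl -k).
echo 'module ceph +p' > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control

# MDS side: raise the debug level for the rank serving the pinned
# directory (adjust the daemon name to your cluster).
ceph config set mds.tceph-02 debug_mds 20
ceph config set mds.tceph-02 debug_ms 1

# Revert afterwards, since these settings generate a lot of log data.
echo 'module ceph -p' > /sys/kernel/debug/dynamic_debug/control
echo 'module libceph -p' > /sys/kernel/debug/dynamic_debug/control
ceph config rm mds.tceph-02 debug_mds
ceph config rm mds.tceph-02 debug_ms
```

This targets one specific MDS daemon rather than all clients, which sidesteps the question of per-client (host/IP) settings for the kclient entirely.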

The issue seems to require a certain load to show up. I created a minimal tar
file mimicking the problem: two directories, with a hard link from a file in
the first to a new name in the second. This does not cause any problems, so
it's not that easy to reproduce.

How you can reproduce it:

As an alternative to my limited skills at pulling logs out, I am making the
tgz archive available to you both. You will receive an e-mail from our
OneDrive with a download link. If you untar the archive on an NFS client
directory that is a re-export of a kclient mount, after some time you should
see the errors showing up.

I can reliably reproduce these errors on our production cluster as well as on
our test cluster. You should be able to reproduce it too with the tgz file.

Here is a result on our set-up:

- production cluster (executed in a sub-dir conda to make cleanup easy):

$ time tar -xzf ../conda.tgz
tar: mambaforge/pkgs/libstdcxx-ng-9.3.0-h6de172a_18/lib/libstdc++.so.6.0.28: 
Cannot hard link to ‘envs/satwindspy/lib/libstdc++.so.6.0.28’: Read-only file 
system
[...]
tar: mambaforge/pkgs/boost-cpp-1.72.0-h9d3c048_4/lib/libboost_log.so.1.72.0: 
Cannot hard link to ‘envs/satwindspy/lib/libboost_log.so.1.72.0’: Read-only 
file system
^C

real    1m29.008s
user    0m0.612s
sys     0m6.870s

By this time there are already hard links created, so it doesn't fail right 
away:
$ find -type f -links +1
./mambaforge/pkgs/libev-4.33-h516909a_1/share/man/man3/ev.3
./mambaforge/pkgs/libev-4.33-h516909a_1/include/ev++.h
./mambaforge/pkgs/libev-4.33-h516909a_1/include/ev.h
...

- test cluster (octopus latest stable, 3 OSD hosts with 3 HDD OSDs each, simple 
ceph-fs):

# ceph fs status
fs - 2 clients
==============
RANK  STATE   MDS       ACTIVITY      DNS    INOS
 0    active  tceph-02  Reqs: 0 /s    1807k  1739k
   POOL      TYPE      USED    AVAIL
fs-meta1   metadata   18.3G    156G
fs-meta2   data       0        156G
fs-data    data       1604G    312G
STANDBY MDS
 tceph-01
 tceph-03
MDS version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)

It's the new recommended three-pool layout, with fs-data being a 4+2 EC pool.

$ time tar -xzf / ... /conda.tgz
tar: mambaforge/ssl/cacert.pem: Cannot hard link to 
‘envs/satwindspy/ssl/cacert.pem’: Read-only file system
[...]
tar: mambaforge/lib/engines-1.1/padlock.so: Cannot hard link to 
‘envs/satwindspy/lib/engines-1.1/padlock.so’: Read-only file system
^C

real    6m23.522s
user    0m3.477s
sys     0m25.792s

Same story here: a large number of hard links have already been created before
it starts failing:

$ find -type f -links +1
./mambaforge/lib/liblzo2.so.2.0.0
...

Looking at the output of find in both cases it also looks a bit 
non-deterministic when it starts failing.
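Each failing step in the tar run boils down to a plain link(2) call. The following minimal stand-in (plain Python on a local filesystem, so it will not reproduce the EROFS error itself) shows the operation in isolation and what a successful link looks like afterwards:

```python
import os
import tempfile

# Stand-in for the failing step: extract a file under one directory,
# then hard-link it under a second name elsewhere. On the affected
# kclient/NFS re-export the os.link() call is what fails with EROFS
# ("Read-only file system"); on a healthy filesystem it leaves a
# single inode with a link count of 2, which is exactly what
# `find -type f -links +1` reports afterwards.
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "envs", "ffi.h")
    dst = os.path.join(d, "pkgs", "ffi.h")
    os.makedirs(os.path.dirname(src))
    os.makedirs(os.path.dirname(dst))
    with open(src, "w") as f:
        f.write("header contents\n")
    os.link(src, dst)                  # the operation that fails with EROFS
    nlink = os.stat(src).st_nlink      # 2: both names point at one inode
    same = os.path.samefile(src, dst)  # True

print("nlink:", nlink, "samefile:", same)  # → nlink: 2 samefile: True
```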

It would be great if you could reproduce the issue on a similar test setup
using the archive conda.tgz. If not, I'm happy to collect any type of logs on
our test cluster.

We now have one user who has problems with rsync to an NFS share, and it would
be much appreciated if this could be sorted out.

Thanks for your help and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Xiubo Li 
Sent: Thursday, March 23, 2023 2:41 AM
To: Frank Schilder; Gregory Farnum
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ln: failed to create hard link 'file name': 
Read-only file system

Hi Frank,

Could you reproduce it again by enabling the kclient debug logs and also
the mds debug logs?

I need to know what exactly happened on the kclient and mds sides.
Locally I couldn't reproduce it.

Thanks

- Xiubo

On 22/03/2023 23:27, Frank Schilder wrote:
> Hi Gregory,
>
> thanks for your reply. First a quick update. Here is how I get ln to work
> after it failed; there seems to be no timeout:
>
> $ ln envs/satwindspy/include/ffi.h 
> mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h
> ln: failed to create hard link 
> 'mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h': Read-only file system
> $ ls -l envs/satwindspy/include mambaforge/pkgs/libffi-3.3-h58526e2_2
> envs/satwindspy/include:
> total 7664
> -rw-rw-r--.   1 rit rit959 Mar  5  2021 ares_build.h
> [...]
> $ ln envs/satwindspy/include/ffi.h 
> mambaforge/pkgs/libffi-3.3-h58526e2_2/in

[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-23 Thread Yuri Weinstein
We are still working on core PRs:

https://github.com/ceph/ceph/pull/50549
https://github.com/ceph/ceph/pull/50625 - merged
https://github.com/ceph/ceph/pull/50575

Will update as soon as we are ready for the next steps.



[ceph-users] Re: quincy v17.2.6 QE Validation status

2023-03-23 Thread Casey Bodley
this fix merged, so rgw is now approved. thanks Yuri


[ceph-users] With Ceph Quincy, the "ceph" package does not include ceph-volume anymore

2023-03-23 Thread Geert Kloosterman
Hi all,

Until Ceph Pacific, installing just the "ceph" package was enough to get 
everything needed to deploy Ceph.

However, with Quincy, ceph-volume was split off into its own package, and it is 
not automatically installed anymore.

Here we can see it is not listed as a dependency:

$ rpm -q --requires ceph
binutils
ceph-mds = 2:17.2.5-0.el8
ceph-mgr = 2:17.2.5-0.el8
ceph-mon = 2:17.2.5-0.el8
ceph-osd = 2:17.2.5-0.el8
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1
systemd

Should I file a bug for this?

Best regards,
Geert Kloosterman
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Mgr/Dashboard Python depedencies: a new approach

2023-03-23 Thread Casey Bodley
hi Ernesto and lists,

> [1] https://github.com/ceph/ceph/pull/47501

are we planning to backport this to quincy so we can support centos 9
there? enabling that upgrade path on centos 9 was one of the
conditions for dropping centos 8 support in reef, which i'm still keen
to do

if not, can we find another resolution to
https://tracker.ceph.com/issues/58832? as i understand it, all of
those python packages exist in centos 8. do we know why they were
dropped for centos 9? have we looked into making those available in
epel? (cc Ken and Kaleb)

On Fri, Sep 2, 2022 at 12:01 PM Ernesto Puerta  wrote:
>
> Hi Kevin,
>
>>
>> Isn't this one of the reasons containers were pushed, so that the packaging 
>> isn't as big a deal?
>
>
> Yes, but the Ceph community has a strong commitment to provide distro 
> packages for those users who are not interested in moving to containers.
>
>> Is it the continued push to support lots of distros without using containers 
>> that is the problem?
>
>
> If not a problem, it definitely makes it more challenging. Compiled 
> components often sort this out by statically linking deps whose packages are 
> not widely available in distros. The approach we're proposing here would be 
> the closest equivalent to static linking for interpreted code (bundling).
>
> Thanks for sharing your questions!
>
> Kind regards,
> Ernesto
> ___
> Dev mailing list -- d...@ceph.io
> To unsubscribe send an email to dev-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unexpected ceph pool creation error with Ceph Quincy

2023-03-23 Thread Geert Kloosterman
Hi,

Thanks again for your input.  The value of mon_max_pool_pg_num was at
its default.

It turns out I had missed a few steps in my earlier effort:

After I removed the old default settings for osd_pool_default_pg_num
and osd_pool_default_pgp_num from ceph.conf on *all* nodes, and
restarted all ceph services on all nodes, I finally got rid of the
error.

The error appears to return when setting osd_pool_default_pgp_num to
anything else but 0.

Even with 

  osd_pool_default_pg_num = 32
  osd_pool_default_pgp_num = 32

the error returns:
  
   Error ERANGE: 'pgp_num' must be greater than 0 and lower or equal
than 'pg_num', which in this case is 1


Not sure whether this is intended behavior (to my understanding pg_num
and pgp_num should be equal), but at least I can get rid of the error
now by not putting these settings in ceph.conf anymore.
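The rule the monitor enforces can be restated in a few lines. This is only a paraphrase of the observed error message, not the actual mon source; note also that the working default of `osd_pool_default_pgp_num = 0` evidently means "unset" to the monitor rather than a literal zero:

```python
def validate_pool_pg_settings(pg_num, pgp_num):
    """Paraphrase of the observed ERANGE check: pgp_num must be
    greater than 0 and no larger than pg_num."""
    if pgp_num <= 0 or pgp_num > pg_num:
        return ("Error ERANGE: 'pgp_num' must be greater than 0 and lower "
                "or equal than 'pg_num', which in this case is %d" % pg_num)
    return None  # settings are acceptable

# Matching explicit values pass the check ...
ok = validate_pool_pg_settings(32, 32)
# ... while pgp_num = 32 against an effective pg_num of 1 (the
# situation behind the error quoted above) is rejected.
bad = validate_pool_pg_settings(1, 32)
print(ok is None, bad is not None)  # → True True
```

The "in this case is 1" wording in the original error suggests the configured pg_num of 32 was not being picked up, consistent with the stale ceph.conf entries described above.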

Cheers,
Geert


On Tue, 2023-03-21 at 08:03 +, Eugen Block wrote:
> External email: Use caution opening links or attachments
> 
> 
> Sorry, hit send too early. It seems I could reproduce it by reducing
> the value to 1:
> 
> host1:~ # ceph config set mon mon_max_pool_pg_num 1
> host1:~ # ceph config get mon mon_max_pool_pg_num
> 1
> host1:~ # ceph osd pool create pool3
> Error ERANGE: 'pg_num' must be greater than 0 and less than or equal
> to 1 (you may adjust 'mon max pool pg num' for higher values)
> 
> The default is 65536. Can you verify if this is your issue?
> 
> Zitat von Eugen Block :
> 
> > Did you ever adjust mon_max_pool_pg_num? Can you check what your
> > current config value is?
> > 
> > host1:~ # ceph config get mon mon_max_pool_pg_num
> > 65536
> > 
> > Zitat von Geert Kloosterman :
> > 
> > > Hi,
> > > 
> > > Thanks Eugen for checking this.  I get the same default values as
> > > you when I remove the entries from my ceph.conf:
> > > 
> > >   [root@gjk-ceph ~]# ceph-conf -D | grep default_pg
> > >   osd_pool_default_pg_autoscale_mode = on
> > >   osd_pool_default_pg_num = 32
> > >   osd_pool_default_pgp_num = 0
> > > 
> > > However, in my case, the pool creation error remains:
> > > 
> > >   [root@gjk-ceph ~]# ceph osd pool create asdf
> > >   Error ERANGE: 'pgp_num' must be greater than 0 and lower or
> > > equal
> > >   than 'pg_num', which in this case is 1
> > > 
> > > But I can create the pool when passing the same pg_num and
> > > pgp_num
> > > values explicity:
> > > 
> > >   [root@gjk-ceph ~]# ceph osd pool create asdf 32 0
> > >   pool 'asdf' created
> > > 
> > > Does anyone have an idea how I can debug this further?
> > > 
> > > I'm running Ceph on a virtualized Rocky 8.7 test cluster, with
> > > Ceph
> > > rpms installed from 
> > > http://download.ceph.com/rpm-quincy/el8/
> > > 
> > > Best regards,
> > > Geert Kloosterman
> > > 
> > > 
> > > On Wed, 2023-03-15 at 13:42 +, Eugen Block wrote:
> > > > External email: Use caution opening links or attachments
> > > > 
> > > > 
> > > > Hi,
> > > > 
> > > > I could not confirm this in a virtual lab cluster, also on
> > > > 17.2.5:
> > > > 
> > > > host1:~ # ceph osd pool create asdf
> > > > pool 'asdf' created
> > > > 
> > > > host1:~ # ceph-conf -D | grep 'osd_pool_default_pg'
> > > > osd_pool_default_pg_autoscale_mode = on
> > > > osd_pool_default_pg_num = 32
> > > > osd_pool_default_pgp_num = 0
> > > > 
> > > > So it looks quite similar except the pgp_num value (I can't
> > > > remember
> > > > having that modified). This is an upgraded Nautilus cluster.
> > > > 
> > > > Zitat von Geert Kloosterman :
> > > > 
> > > > > Hi all,
> > > > > 
> > > > > I'm trying out Ceph Quincy (17.2.5) for the first time and
> > > > > I'm
> > > > > running into unexpected behavior of "ceph osd pool create".
> > > > > 
> > > > > When not passing any pg_num and pgp_num values, I get the
> > > > > following
> > > > > error with Quincy:
> > > > > 
> > > > > [root@gjk-ceph ~]# ceph osd pool create asdf
> > > > > Error ERANGE: 'pgp_num' must be greater than 0 and lower
> > > > > or
> > > > > equal than 'pg_num', which in this case is 1
> > > > > 
> > > > > I checked with Ceph Pacific (16.2.11) and there the extra
> > > > > arguments
> > > > > are not needed.
> > > > > 
> > > > > I expected it would use osd_pool_default_pg_num and
> > > > > osd_pool_default_pgp_num as defined in my ceph.conf:
> > > > > 
> > > > > [root@gjk-ceph ~]# ceph-conf -D | grep
> > > > > 'osd_pool_default_pg'
> > > > > osd_pool_default_pg_autoscale_mode = on
> > > > > osd_pool_default_pg_num = 8
> > > > > osd_pool_default_pgp_num = 8
> > > > > 
> > > > > At least, this is what appears to be used with Pacific.
> > > > > 
> > > > > Is this an inte

[ceph-users] Re: Almalinux 9

2023-03-23 Thread Dario Graña
I ran some tests in a virtual environment with mons, mds and OSDs; the OSDs
were 3 VMs with 3 disks each.
Now I'm testing Ceph Quincy on AlmaLinux 9 without problems in a test
environment. I'm using VMs for the mons (3) and mds (2), but the OSDs (8) are
all physical nodes with 24 HDDs.
The installation worked out of the box using the Ceph orchestrator, and a
production cluster is on our roadmap. If all our tests are successful, we
will install this new cluster on Alma 9.
Regards!



On Mon, Mar 20, 2023 at 12:37 PM Michael Lipp  wrote:

>
> > Has anyone used almalinux 9 to install ceph. Have you encountered
> problems? Other tips on this installation are also welcome.
>
> I have installed Ceph on AlmaLinux 9.1 (both Ceph and later Ceph/Rook)
> on a three node VM cluster and then a three node bare metal cluster
> (with 4 OSDs each) without any problems. (Rook based required some
> research but there were no problems -- see
> https://github.com/mnlipp/kubernetes-experience .)
>
> Mind that this is, obviously, a small cluster and although the bare
> metal cluster is eventually intended for production, it has not
> thoroughly been tested yet. But installation went smoothly and I think
> that any problems that I might encounter in the future would occur with
> any OS. What really has surprised me is the resilience. Especially when
> experimenting with the VM-based cluster I have brought nodes down in a
> "power off" way several times (not because of problems with Ceph but
> because of k8s configuration changes) and my file systems have recovered
> every time.
>
>   - Michael
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io