We've done 14.04 -> 16.04 -> 18.04 -> 20.04, all at various stages of our
Ceph cluster's life.
The latest, 18.04 to 20.04, was painless and we ran:

apt update && apt dist-upgrade -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold"
do-release-upgrade --allow-third-party -f
Hey all,
Recently upgraded to Ceph Octopus (15.2.14). We also run Zabbix
5.0.15 and have had Ceph/Zabbix monitoring for a long time. After the
Ceph Octopus update I installed the latest version of the Ceph
template in Zabbix … and we seem to be humming along nicely now.
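For completeness, the ceph-mgr Zabbix sender we pair with that template is enabled roughly like this (a sketch; the Zabbix server hostname and the identifier are placeholders, and zabbix_sender must be installed on the mgr hosts):

```shell
# Enable the built-in zabbix module on the mgr
ceph mgr module enable zabbix
# Point the sender at the Zabbix server/proxy (placeholder hostname)
ceph zabbix config-set zabbix_host zabbix.example.com
# Must match the host name configured on the Zabbix side (placeholder)
ceph zabbix config-set identifier ceph-cluster
# Push one round of values immediately to verify the pipeline works
ceph zabbix send
```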
On Tue, Oct 5, 2021 at 4:55 PM shubjero wrote:
Just upgraded from Ceph Nautilus to Ceph Octopus on Ubuntu 18.04 using
standard ubuntu packages from the Ceph repo.
Upgrade has gone OK, but we are having issues with our radosgw service,
which eventually fails under some load. Here's what we see in the logs:
2021-10-05T15:55:16.328-0400 7fa47700
ax_chunk_size > rgw_put_obj_min_window_size,
> because we try to write in units of chunk size but the window is too
> small to write a single chunk.
>
Will do Matt
On Tue, Sep 8, 2020 at 5:36 PM Matt Benjamin wrote:
>
> thanks, Shubjero
>
> Would you consider creating a ceph tracker issue for this?
>
> regards,
>
> Matt
>
> On Tue, Sep 8, 2020 at 4:13 PM shubjero wrote:
> >
> > I had been looking
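For reference, the rgw_max_chunk_size vs rgw_put_obj_min_window_size constraint quoted above can be checked, and the window widened, with something like the following (a sketch assuming the centralized config database from Nautilus onward; the 16 MiB value is purely illustrative, not a recommendation):

```shell
# Inspect the current values for the rgw instances
ceph config get client.rgw rgw_max_chunk_size
ceph config get client.rgw rgw_put_obj_min_window_size
# Make sure the window can hold at least one chunk (illustrative value)
ceph config set client.rgw rgw_put_obj_min_window_size 16777216
# Then restart the radosgw services so the change is definitely picked up
```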
Hey all,
I'm creating a new post for this issue as we've narrowed the problem
down to a part-size limitation on multipart uploads. We have discovered
that in our production Nautilus (14.2.11) cluster and our lab Nautilus
(14.2.10) cluster that multipart uploads with a configured part size
of
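As a client-side reproduction aid, the part size can be pinned explicitly; e.g. with the AWS CLI (a sketch; the endpoint and bucket are placeholders, and 16 MB is just an example value):

```shell
# Force multipart and pin the part (chunk) size for the default profile
aws configure set default.s3.multipart_threshold 16MB
aws configure set default.s3.multipart_chunksize 16MB
# Upload a large object through radosgw (placeholder endpoint/bucket)
aws --endpoint-url https://objects.example.com s3 cp ./bigfile s3://test-bucket/bigfile
```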
We have our object-storage endpoint FQDN DNS round-robining to 2 IPs.
Those 2 IPs are managed by keepalived across 3 servers running
haproxy, where each haproxy instance listens on one of the round-robined
IPs and load-balances to 5 servers running radosgw.
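In config terms, each haproxy instance looks roughly like this (a minimal sketch; the IPs, names, certificate path, and health check are placeholders, not our actual config):

```
frontend rgw_frontend
    bind 203.0.113.10:443 ssl crt /etc/haproxy/rgw.pem
    default_backend rgw_backend

backend rgw_backend
    balance roundrobin
    option httpchk GET /
    server rgw1 10.0.0.11:7480 check
    server rgw2 10.0.0.12:7480 check
    server rgw3 10.0.0.13:7480 check
    server rgw4 10.0.0.14:7480 check
    server rgw5 10.0.0.15:7480 check
```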
On Wed, Sep 2, 2020 at 3:15 PM shubjero wrote:
Good day,
I am having an issue with some multipart uploads to radosgw. I
recently upgraded my cluster from Mimic to Nautilus and began having
problems with multipart uploads from clients using the Java AWS SDK
(specifically 1.11.219). I do NOT have issues with multipart uploads
with other clients
Hi all,
I have a 39 node, 1404 spinning disk Ceph Mimic cluster across 6 racks
for a total of 9.1PiB raw and about 40% utilized. These storage nodes
started their life on Ubuntu 14.04 and were in-place upgraded to 16.04
two years ago; however, I have started a project to do fresh installs of
each OSD node
I've reported stability problems with ceph-mgr w/ prometheus plugin
enabled on all versions we ran in production which were several
versions of Luminous and Mimic. Our solution was to disable the
prometheus exporter. I am using Zabbix instead. Our cluster is 1404
OSD's in size with about 9PB raw
I talked to some guys on IRC about taking the OSDs with a non-1
reweight and setting them back to 1.
I went from a standard deviation of 2+ to 0.5.
Awesome.
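For anyone hitting this thread later, finding and clearing the legacy reweights is a sketch like this (the OSD id is a placeholder, and the awk column index can differ between releases):

```shell
# Show OSDs whose legacy REWEIGHT column is not 1.00000
ceph osd df | awk '$4 != "1.00000" {print $1, $4}'   # column index may vary by release
# Set a given OSD's reweight back to 1 (repeat per OSD, a few at a time)
ceph osd reweight 42 1.0
```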
On Wed, Feb 26, 2020 at 10:08 AM shubjero wrote:
Right, but should I be proactively returning any reweighted OSDs that
are not 1.0 back to 1.0?
Hi all,
I'm running a Ceph Mimic cluster 13.2.6 and we use the ceph-balancer
in upmap mode. This cluster is fairly old and pre-Mimic we used to set
osd reweights to balance the standard deviation of the cluster. Since
moving to Mimic about 9 months ago I enabled the ceph-balancer with
upmap mode
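For context, enabling it was essentially just this (a sketch, assuming all clients are at least Luminous so upmap is usable):

```shell
# upmap requires every client to speak at least the Luminous protocol
ceph osd set-require-min-compat-client luminous
# Switch the balancer to upmap mode and turn it on
ceph balancer mode upmap
ceph balancer on
# Check what it is doing
ceph balancer status
```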
Hey all,
Yesterday our cluster went in to HEALTH_WARN due to 1 large omap
object in the .usage pool (I've posted about this in the past). Last
time we resolved the issue by trimming the usage log below the alert
threshold, but this time it seems like the alert won't clear even after
trimming and
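The trim itself looks roughly like this (the date range and PG id are placeholders); note that large-omap warnings are only re-evaluated when the affected PG is deep-scrubbed, which may be why the alert lingers after trimming:

```shell
# Trim old usage-log records (placeholder date range)
radosgw-admin usage trim --start-date=2019-01-01 --end-date=2021-01-01
# The warning is recalculated on deep scrub, so deep-scrub the PG
# holding the large omap object (placeholder PG id)
ceph pg deep-scrub 5.0
```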
Good day,
We have a Ceph cluster and make use of object-storage and integrate
with OpenStack. Each OpenStack project/tenant is given a radosgw user
which allows all keystone users of that project to access the
object-storage as that single radosgw user. The radosgw user is the
project id of the
I'm having a similar issue with ceph-mgr stability problems since
upgrading from 13.2.5 to 13.2.6. I have isolated the crashing to the
prometheus module being enabled and notice much better stability when
the prometheus module is NOT enabled. No more failovers, however I do
notice that even with
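The workaround in our case was simply this (sketch):

```shell
# Disable the prometheus exporter on the mgr
ceph mgr module disable prometheus
# Confirm the enabled module list afterwards
ceph mgr module ls
```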