Guys,
We have set up a five-node Ceph cluster. Four are OSD servers and the other
one runs the MON and MGR.
Recently, during an RGW stability test, the RGW default
pool default.rgw.buckets.data was accidentally written full. As a result,
RGW got stuck. We didn't know the exact steps to recover, so we deleted a
I am really not sure why the monitor marked the OSD down:
"Monitor daemon marked osd.9 down, but it is still running"
2018-01-30 16:07 GMT+08:00 blackpiglet J. :
> Guys,
>
> We have set up a five-node Ceph cluster. Four are OSD servers and the
> other one runs the MON and MGR.
> Recently, during RGW s
I found some log entries saying osd.91 is down; I think the same should hold for osd.9.
I am not sure what would cause the OSD process to be treated by its peers as down.
2018-01-30 06:39:33.767747 7f2402ef9700 1 mon.ubuntuser8@0(leader).log
v424396 check_sub sending message to client.164108 10.1.248.8:0/3388257888
wi
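In case it helps anyone hitting the same full-pool deadlock mentioned at the top of this thread: one commonly used escape hatch is to raise the cluster full ratio temporarily so that deletes can go through, then restore it afterwards. Below is a minimal sketch via the rados Python bindings; the conffile path and the 0.96 value are assumptions, and this is a stopgap, not a fix:

import json
import rados

# Temporarily raise the full ratio (equivalent to the CLI
# `ceph osd set-full-ratio 0.96` on Luminous) so that deletions in the
# full pool can proceed. Remember to set it back afterwards.
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')  # path is an assumption
cluster.connect()
try:
    cmd = json.dumps({'prefix': 'osd set-full-ratio', 'ratio': 0.96})
    ret, out, errs = cluster.mon_command(cmd, b'')
    print(ret, errs)
finally:
    cluster.shutdown()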
On 11/03/2017 02:43 PM, Mark Nelson wrote:
On 11/03/2017 08:25 AM, Wido den Hollander wrote:
On 3 November 2017 at 13:33, Mark Nelson wrote:
On 11/03/2017 02:44 AM, Wido den Hollander wrote:
On 3 November 2017 at 0:09, Nigel Williams wrote:
On 3 November 2017 at 07:45, Martin
Hi,
We hit a nasty issue during our Jewel->Luminous upgrade. The MON/MGR and
OSD part went well, but the first of our RGWs is incompatible with the
others:
The problem:
Some buckets are not accessible from the Luminous gateway. The metadata
for those buckets seemed OK, but listing was not possible.
Hi,
Is there an ETA yet for 12.2.3? Looking at the tracker, there aren't that
many outstanding issues: http://tracker.ceph.com/projects/ceph/roadmap
On Github we have more outstanding PRs though for the Luminous
milestone: https://github.com/ceph/ceph/milestone/10
Are we expecting 12.2.3 in F
On Tue, Jan 30, 2018 at 12:50 AM, Paul Kunicki wrote:
> I know that snapshots on CephFS are experimental and that a known
> issue exists with multiple filesystems on one pool, but I was surprised
> at the result of the following:
>
> I attempted to take a snapshot of a directory in a pool with a si
On Tue, Jan 30, 2018 at 10:22 AM, John Spray wrote:
> On Tue, Jan 30, 2018 at 12:50 AM, Paul Kunicki
> wrote:
>> I know that snapshots on CephFS are experimental and that a known
>> issue exists with multiple filesystems on one pool, but I was surprised
>> at the result of the following:
>>
>> I
Hi,
we have a Ceph cluster with 3 cluster nodes and 20 OSDs, with 6-7-7 2
TB HDDs per node.
In the long term we want to use 7-9 pools, and for 20 OSDs and 8 pools I
calculate that the ideal pg_num is 250 (20 * 100 / 8).
In that case each OSD would normally store 100 PGs, which is the recommended value.
I hav
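For reference, a minimal Python sketch of that rule-of-thumb calculation. Note that the pgcalc tool on ceph.com also divides by the pool's replica size, which the arithmetic above omits; the replica size of 3 and the round-up-to-a-power-of-two step are assumptions based on common defaults:

def suggested_pg_num(num_osds, pool_size, num_pools, target_per_osd=100):
    # Total PG budget for the cluster; replicas count against it too,
    # so divide by the pool's replica size.
    total = num_osds * target_per_osd / float(pool_size)
    per_pool = total / num_pools
    # Round up to the next power of two, as pgcalc does.
    pg_num = 1
    while pg_num < per_pool:
        pg_num *= 2
    return pg_num

# 20 OSDs, size-3 pools, 8 pools -> 128 PGs per pool
print(suggested_pg_num(20, 3, 8))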
Hi,
We observe high apply_latency (ms) and, I believe, poor write performance.
In the logs there are repetitive slow-request warnings involving different OSDs
and servers.
ceph versions 12.2.2
Cluster HW description:
9x Dell PowerEdge R730xd
1x Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz (10C/20T)
256 GB
Hi,
several times a day we have different OSDs, running Luminous 12.2.2 with
BlueStore, crashing with errors like this:
starting osd.2 at - osd_data /var/lib/ceph/osd/ceph-2
/var/lib/ceph/osd/ceph-2/journal
2018-01-30 13:45:28.440883 7f1e193cbd00 -1 osd.2 107082 log_to_monitors
{default=true}
Unfortunately, any snapshots created prior to 12.2.2 against a separate
data pool were incorrectly associated with the base image pool instead of the
data pool. Was the base RBD pool used only for data-pool associated images
(i.e. all the snapshots that exist within the pool can be safely deleted)?
Hi Jason,
>> Was the base RBD pool used only for data-pool associated images
Yes, it is only used for storing metadata of ecpool.
We use 2 pools for erasure coding
ecpool - erasure coded datapool
vm - replicated pool to store metadata
Karun Josy
On Tue, Jan 30, 2018 at 8:00 PM, Jason Dillama
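Before deleting anything in bulk, it may be worth auditing which images and snapshots actually live in the base pool. A minimal sketch using the rbd/rados Python bindings; the pool name 'vm' is taken from this thread, while the conffile path is an assumption:

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')  # path is an assumption
cluster.connect()
try:
    ioctx = cluster.open_ioctx('vm')  # base/metadata pool named above
    try:
        # List every image in the pool and its snapshots.
        for name in rbd.RBD().list(ioctx):
            image = rbd.Image(ioctx, name, read_only=True)
            try:
                snaps = [s['name'] for s in image.list_snaps()]
                print('%s: %s' % (name, snaps))
            finally:
                image.close()
    finally:
        ioctx.close()
finally:
    cluster.shutdown()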
OK, at least it should be pretty straightforward to correct
programmatically. I can throw together a quick program to clean your pools
but you will need to compile it (you will need the librbd1-devel package
installed) since unfortunately the rados Python API doesn't provide access
to self-managed
Hi all,
We are still very new to running a Ceph cluster and have run an RGW cluster for
a while now (6-ish months); it mainly holds large DB backups (write once, read
once, delete after N days). The system is now warning us about an OSD that is
near_full, so we went to look at the usage across O
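To see how uneven the usage actually is, here is a minimal sketch that fetches the same data as `ceph osd df` through the rados Python bindings and prints the extremes; the conffile path and the exact JSON field names (as emitted by Luminous) are assumptions:

import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    cmd = json.dumps({'prefix': 'osd df', 'format': 'json'})
    ret, out, errs = cluster.mon_command(cmd, b'')
    nodes = json.loads(out)['nodes']
    # Sort OSDs by their percentage utilization and show the spread.
    by_util = sorted(nodes, key=lambda n: n['utilization'])
    print('least used: %s at %.1f%%' % (by_util[0]['name'], by_util[0]['utilization']))
    print('most used:  %s at %.1f%%' % (by_util[-1]['name'], by_util[-1]['utilization']))
finally:
    cluster.shutdown()

If the spread is large, `ceph osd reweight-by-utilization` is the usual CLI knob for evening it out.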
Sorry, obviously should have been Luminous 12.2.2,
-B
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bryan
Banister
Sent: Tuesday, January 30, 2018 10:24 AM
To: Ceph Users
Subject: [ceph-users] Help rebalancing OSD usage, Luminus 1.2.2
Note: External Email
Thanks for the fast reply. I started recording a session where I
unmounted and re-mounted the file system and could not duplicate the
issue. I am going to do some more testing and report back any relevant
findings. For now here are some details about our setup where files
contained in snapshots wer
On Tue, Jan 30, 2018 at 10:32:04AM +0100, Ingo Reimann wrote:
> The problem:
> Some buckets are not accessible from the Luminous gateway. The metadata
> for those buckets seemed OK, but listing was not possible. A local s3cmd
> got "404 NoSuchKey". I exported and imported the metadata for one instan
It appears that this issue is somewhat intermittent as it took several
tries to reproduce. What follows is a session where a snapshot is used
to successfully retrieve the original version of a hosts file prior to
the entire etc directory being snapped. The version is available even
after the entire
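For anyone reproducing this, a minimal sketch of the mechanism being exercised: CephFS snapshots are created by making a directory under the hidden .snap directory, and the snapshotted contents remain readable under that path. The mount point and snapshot name here are assumptions:

import os

mount = '/mnt/cephfs'  # assumed CephFS mount point
target = os.path.join(mount, 'etc')

# Create a snapshot of the directory: a mkdir inside .snap is all it takes.
os.mkdir(os.path.join(target, '.snap', 'before-change'))

# Even after files under etc/ change or disappear, the pre-snapshot
# version stays readable through the snapshot path.
with open(os.path.join(target, '.snap', 'before-change', 'hosts')) as f:
    original_hosts = f.read()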
When I run 'ceph osd perf' on my cluster (12.2.2), for BlueStore OSDs the
commit latency and apply latency are always equal. This is not the case on
FileStore OSDs, where commit is lower than or equal to apply. Are the BlueStore
numbers expected behavior?
On 2018/01/29 2:31 pm, Alfredo Deza wrote:
So I'm wondering what my options are at this point. Perhaps rebuild
this
OSD node, using ceph-volume and 'simple', but would not be able to use
encryption?
Ungh, I forgot to mention that there is no encryption support.
However, ceph-volume lvm gain
Sorry I hadn't RTFML archive before posting this... Looking at the following
thread for guidance:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008626.html
Not the exact same situation (e.g. didn't add larger OSD disks later on) but
seems like the same recommendations from thi
Your PG count per OSD looks really low, that might be why. I think in Luminous,
you should aim for about 200. I'd use the pgcalc on ceph.com to verify.
From: ceph-users on behalf of Bryan
Banister
Sent: Wednesday, 31 January 2018 8:28:06 AM
To: Ceph Users
Subje
Yes, this is expected. BlueStore's internal design differs a great deal
from FileStore's, and it doesn't have a distinction between the journaled and
readable data the way FileStore does with its separate journal and then
normal XFS filesystem.
-Greg
On Tue, Jan 30, 2018 at 11:05 AM Shawn Edwards
wr
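For anyone who wants to compare the two numbers on their own cluster, a minimal sketch that pulls what `ceph osd perf` shows through the rados Python bindings; the conffile path and the JSON field names (as emitted by Luminous) are assumptions:

import json
import rados

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    cmd = json.dumps({'prefix': 'osd perf', 'format': 'json'})
    ret, out, errs = cluster.mon_command(cmd, b'')
    for info in json.loads(out)['osd_perf_infos']:
        stats = info['perf_stats']
        # Print commit and apply latency side by side for each OSD.
        print('osd.%d commit=%sms apply=%sms' % (
            info['id'],
            stats['commit_latency_ms'],
            stats['apply_latency_ms']))
finally:
    cluster.shutdown()

On BlueStore OSDs the two columns should come out identical, per Greg's explanation above.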
Dear Ceph community,
I have a very small ceph cluster for testing with this configuration:
* 2x compute nodes, each with:
* a dual-port 25 GbE NIC
* 2x sockets (56 cores with hyperthreading)
* 10x Intel NVMe DC P3500 drives
* 512 GB RAM
One of the nodes is
Hi Robin,
thanks for your reply.
Concerning "https://tracker.ceph.com/issues/22756 - buckets showing as
empty": our cluster is rather old (Argonaut), but the affected bucket and
user were created under Jewel.
If you need more data, I may post it.
Best regards,
Ingo
-Original Message-
On Wed, Jan 31, 2018 at 07:39:02AM +0100, Ingo Reimann wrote:
> Hi Robin,
>
> thanks for your reply.
>
> Concerning "https://tracker.ceph.com/issues/22756 - buckets showing as
> empty": our cluster is rather old (Argonaut), but the affected bucket and
> user were created under Jewel.
>
> If you ne
On 2018-01-31 08:14, Manuel Sopena Ballesteros wrote:
> Dear Ceph community,
>
> I have a very small ceph cluster for testing with this configuration:
>
> * 2x compute nodes, each with:
>
> * a dual-port 25 GbE NIC
>
> * 2x sockets (56 cores with hyperthreading)
>
> *