Hi,
I'm sending the log file as an attachment. I can find no error messages
or anything problematic…
I didn't see any log file attached to the email.
Another question: Is there a link between the VMs that fail to write
to CephFS and the hypervisors? Are all failing clients on the same
hypervi
Hello,
Yesterday I encountered a strange OSD crash which led to cluster
flapping. I had to set the nodown flag on the cluster to stop the
flapping. The first OSD crashed with:
2018-08-02 17:23:23.275417 7f87ec8d7700 1 heartbeat_map is_healthy
'OSD::osd_op_tp thread 0x7f8803dfb700' had
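For reference, the nodown flag mentioned above can be set and cleared like this (a minimal sketch, only useful while investigating the flapping):
  ceph osd set nodown      # stop the monitors from marking flapping OSDs down
  # ... investigate / restart the affected OSDs ...
  ceph osd unset nodown    # remember to clear the flag afterwards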
I have moved the pool, but the strange thing is that if I do something like
this:
for object in `cat out`; do rados -p fs_meta get $object /dev/null ; done
I do not see any activity on the SSD drives with something like dstat
(checked on all nodes (sdh)):
net/eth4.60-net/eth4.52
--dsk/sda-d
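If the goal is to confirm the fs_meta objects really live on the SSD OSDs, a rough sketch (the object name is a placeholder for one line of the "out" file; crush_rule is the Luminous name, on Jewel it was crush_ruleset):
  ceph osd pool get fs_meta crush_rule            # which CRUSH rule the pool uses
  ceph osd map fs_meta <some_object_from_out>     # PG and acting OSDs for one object
  dstat -d -D sdh                                 # disk activity for sdh only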
Hello,
Another OSD started randomly crashing with a segmentation fault. I haven't
managed to add the last 3 OSDs back to the cluster as the daemons keep
crashing.
---
-2> 2018-08-03 12:12:52.670076 7f12b6b15700 4 rocksdb:
EVENT_LOG_v1 {"time_micros": 1533287572670073, "job": 3, "event":
"table_
Try mdtest (from the IOR benchmark suite) for metadata performance.
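A minimal sketch of an mdtest run (directory and rank count are just examples; mdtest is normally launched under MPI):
  mpirun -np 8 mdtest -d /mnt/cephfs/mdtest -n 1000 -i 3   # 8 ranks, 1000 items each, 3 iterations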
From: ceph-users on behalf of Marc Roos
Sent: Friday, 3 August 2018 7:49:13 PM
To: dcsysengineer
Cc: ceph-users
Subject: Re: [ceph-users] Cephfs meta data pool to ssd and measuring
performance difference
I have moved t
hello!
We did maintenance work (cluster shrinking) on one cluster (jewel), and
after shutting down one of the OSDs we found a situation where recovery of a PG
can't start because it is "querying" one of its peers. We restarted this OSD,
tried marking it out and in. Nothing helped, finally we moved out data (the
On 08/03/2018 01:45 PM, Pawel S wrote:
> hello!
>
> We did maintenance works (cluster shrinking) on one cluster (jewel)
> and after shutting one of osds down we found this situation where
> recover of pg can't be started because of "querying" one of peers. We
> restarted this OSD, tried to out and
Thanks, that's useful to know. I've pasted the output you asked for
below, thanks for taking a look.
Here's the output of dump_mempools:
{
    "mempool": {
        "by_pool": {
            "bloom_filter": {
                "items": 4806709,
                "bytes": 4806709
            },
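For context, that output typically comes from the OSD admin socket, e.g. (the osd id is just an example):
  ceph daemon osd.0 dump_mempools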
On Fri, Aug 3, 2018 at 8:53 PM Benjeman Meekhof wrote:
>
> Thanks, that's useful to know. I've pasted the output you asked for
> below, thanks for taking a look.
>
> Here's the output of dump_mempools:
>
> {
> "mempool": {
> "by_pool": {
> "bloom_filter": {
>
On Fri, Aug 3, 2018 at 2:07 PM Paweł Sadowsk wrote:
> On 08/03/2018 01:45 PM, Pawel S wrote:
> > hello!
> >
> > We did maintenance works (cluster shrinking) on one cluster (jewel)
> > and after shutting one of osds down we found this situation where
> > recover of pg can't be started because of "
Hi,
we have a full BlueStore cluster and had to deal with read errors on
the SSD holding the block.db. Something like the following helped us recreate a
pre-existing OSD without rebalancing, just refilling the PGs. I would
zap the journal device and let it be recreated. It's very similar to your
ceph-d
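A rough sketch of that kind of procedure on Luminous (the OSD id and device paths are placeholders, not from the original mail; --osd-id needs a reasonably recent ceph-volume):
  ceph osd set norebalance                          # avoid data movement while the OSD is rebuilt
  systemctl stop ceph-osd@21
  ceph osd destroy 21 --yes-i-really-mean-it        # keep the id/CRUSH position, drop the old data
  ceph-volume lvm create --osd-id 21 --data /dev/sdX --block.db /dev/sdY
  ceph osd unset norebalance                        # PGs now backfill onto the recreated OSD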
Thanks Eugen!
I was looking into running all the commands manually, following the docs for
adding/removing an OSD, but tried ceph-disk first.
I actually made it work by changing the id part in ceph-disk (it was checking
the wrong journal device, which was owned by root:root). The next part was
that I
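In case it helps others hitting the same thing, the usual fix for a journal partition owned by root:root is simply to hand it back to the ceph user (device path is hypothetical):
  chown ceph:ceph /dev/disk/by-partuuid/<journal-partuuid>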
Hi,
Can anyone help us answer these questions?
2018-08-03 8:36 GMT+07:00 Sam Huracan :
> Hi Cephers,
>
> We intend to upgrade our Cluster from Jewel to Luminous (or Mimic?)
>
> Our model is currently using OSD File Store with SSD Journal (1 SSD for 7
> SATA 7.2K)
>
> My question are:
>
>
> 1.S
I am currently unable to write any data to this bucket in its current
state. Does anyone have any ideas for reverting to the original index
shards and cancelling the reshard processes happening on the bucket?
On Thu, Aug 2, 2018 at 12:32 PM David Turner wrote:
> I upgraded my last cluster to Lumin
I suppose I may have found the solution I was unaware existed.
> balancer optimize <plan> {<pools> [<pools>...]} : Run optimizer to create a
> new plan
So apparently you can create a plan specific to a pool (or pools).
So just to double-check this, I created two plans: plan1 with the hdd pool (and
not the ssd pool); plan
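A short sketch of what that looks like in practice (pool and plan names are examples):
  ceph balancer optimize plan1 hdd-pool    # plan restricted to one pool
  ceph balancer eval plan1                 # expected score after applying the plan
  ceph balancer show plan1                 # the individual steps the plan would take
  ceph balancer execute plan1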
Is it actually resharding, or is it just stuck in that state?
On Fri, Aug 3, 2018 at 7:55 AM, David Turner wrote:
> I am currently unable to write any data to this bucket in this current
> state. Does anyone have any ideas for reverting to the original index
> shards and cancel the reshard proce
Oh, also -- one thing that might work is running bucket check --fix on
the bucket. That should overwrite the reshard status field in the
bucket index.
Let me know if it happens to fix the issue for you.
Yehuda.
On Fri, Aug 3, 2018 at 9:46 AM, Yehuda Sadeh-Weinraub wrote:
> Is it actually reshar
I came across you mentioning bucket check --fix before, but I totally
forgot that I should be passing --bucket=mybucket with the command for it to
actually do anything. I'm running this now and it seems to actually be
doing something. My guess was that it was stuck in that state, and now that
I can clean
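For anyone searching the archives later, the commands involved were roughly the following (bucket name is the placeholder used above; reshard cancel is my addition and may differ per version):
  radosgw-admin reshard status --bucket=mybucket
  radosgw-admin bucket check --fix --bucket=mybucket
  radosgw-admin reshard cancel --bucket=mybucket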
Hi Sam,
Not having done any benchmarks myself (we only use SSDs or NVMes), it is my
understanding that on Luminous (I would not recommend upgrading production to
Mimic yet, but I'm quite conservative) BlueStore is going to be slower for
writes than FileStore with SSD journals.
You could try dmcache, bca
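If you do go to BlueStore, the rough equivalent of the SSD journal is putting the RocksDB/WAL (block.db) on the SSD; a minimal sketch, device paths are examples only:
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1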
Hi all.
We have an issue with some down+peering PGs (I think); when I try to
mount or access data, the requests are blocked:
114891/7509353 objects degraded (1.530%)
887 stale+active+clean
1 peering
54 active+recovery_wait
19609
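A few commands that usually help narrow down which PGs and OSDs are involved (a sketch; the pg id is a placeholder):
  ceph health detail            # lists the stuck/peering PGs and blocked requests
  ceph pg dump_stuck unclean
  ceph pg <pgid> query          # recovery_state shows which OSD the PG is waiting on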
Hi,
You can export and import PGs using ceph-objectstore-tool, but if the OSD
won't start you may have trouble exporting a PG.
It may be useful to share the errors you get when trying to start the OSD.
Thanks
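A sketch of the export/import, assuming the OSDs are stopped and using made-up paths and pg id:
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-21 \
      --journal-path /var/lib/ceph/osd/ceph-21/journal \
      --pgid 1.2f3 --op export --file /tmp/pg.1.2f3.export
  # then on a healthy OSD (also stopped):
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
      --journal-path /var/lib/ceph/osd/ceph-7/journal \
      --op import --file /tmp/pg.1.2f3.export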
On Fri, Aug 3, 2018 at 10:13 PM, Sean Patronis wrote:
>
>
> Hi all.
>
> We have an i
Hi,
I run a cluster with 7 OSDs. The cluster does not have much traffic on it, but
every few days I get a HEALTH_ERR because of inconsistent PGs:
root@Sam ~ # ceph status
  cluster:
    id:     c4bfc288-8ba8-4c3a-b3a6-ed95503f50b7
    health: HEALTH_ERR
            3 scrub errors
            Possibl
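The usual way to track these down (a sketch; the pg id is whatever "ceph health detail" reports):
  ceph health detail                                # which PGs are inconsistent
  rados list-inconsistent-obj <pgid> --format=json-pretty
  ceph pg repair <pgid>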
Forgive the wall of text; I shortened it a little. Here is the OSD
log from when I attempt to start the OSD:
2018-08-04 03:53:28.917418 7f3102aa87c0 0
xfsfilestorebackend(/var/lib/ceph/osd/ceph-21) detect_feature: extsize is
disabled by conf
2018-08-04 03:53:28.977564 7f3102aa87c0 0
filestore(