Hi all
We are looking at implementing Ceph/CephFS for a project. Over time, we may
wish to add additional replicas to our cluster. If we modify a CRUSH map, is
there a way of then requesting Ceph to re-evaluate the placement of objects
across the cluster according to the modified CRUSH map?
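For what it's worth, a rough sketch of the usual workflow (the pool name is a
placeholder); Ceph re-evaluates placement and starts rebalancing on its own as
soon as a new CRUSH map or replica count is injected:
# dump, decompile, edit and re-inject the CRUSH map
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# ... edit crush.txt ...
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new
# or, for additional replicas, simply raise the pool size
ceph osd pool set mypool size 3
# watch the resulting data movement
ceph -w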
I have just done a test on rbd-mirror, following these steps:
1. deploy two new clusters, clusterA and clusterB
2. configure one-way replication from clusterA to clusterB with rbd-mirror
3. write data to rbd_blk on clusterA once every 5 seconds
4. get information with 'rbd mirror image status rbd_blk',
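In case it helps others reproduce this, a sketch of what a one-way,
journal-based setup like step 2 typically looks like (pool name 'rbd' and the
client name are placeholders):
# enable pool-mode mirroring on both clusters
rbd --cluster clusterA mirror pool enable rbd pool
rbd --cluster clusterB mirror pool enable rbd pool
# the image needs the journaling feature
rbd --cluster clusterA feature enable rbd/rbd_blk journaling
# one-way: add the peer only on clusterB and run rbd-mirror only there
rbd --cluster clusterB mirror pool peer add rbd client.admin@clusterA
# then poll replication state as in step 4
rbd --cluster clusterB mirror image status rbd/rbd_blk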
Hi guys,
we have a Ceph cluster running Luminous 12.2.13, and recently we
encountered a problem. Here is some log information:
2020-06-08 12:33:52.706070 7f4097e2d700 0 log_channel(cluster) log [WRN] : slow request 30.518930 seconds old, received at 2020-06-08 12:33:22.186924:
Thanks Mark & Marc.
We will do more testing, including the kernel client, as well as testing the
block storage performance first.
We just did some direct raw performance tests on a single spinning disk
(formatted as ext4) and it could deliver 200-300 MB/s throughput in various
write and mixed-workload tests. But FUSE
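For comparison, a direct raw test like that is commonly run with fio along
these lines (file path, sizes and queue depth are placeholders):
# sequential 4M writes with direct I/O, bypassing the page cache
fio --name=seqwrite --filename=/mnt/test/fio.dat --rw=write --bs=4M \
    --size=4G --direct=1 --ioengine=libaio --iodepth=16 \
    --runtime=60 --time_based --group_reporting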
Well, I'm afraid that the image didn't replay continuously, which means I
have lost some data.
The "rbd mirror image status" output shows the image is replayed and its
timestamp is just before I demoted the primary image. I lost about 24 hours
of data and I'm not sure whether there is an interval
between the
Greetings,
I'm using Ceph (14.2.2) in conjunction with Proxmox. Currently I'm just doing
tests and ran into an issue relating to high I/O waits. Just to give a little
bit of background, specifically relating to my current Ceph configuration:
we have 6 nodes, each consisting of 2 OSDs (each
It is rather unsatisfactory not to know where it really went wrong, but
after completely removing all traces of peer settings and auth keys, I redid
the peer bootstrap, and this did result in a working sync.
My initial mirror config stemmed from Nautilus and was configured for
journaling on a pool. Perhaps
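For the archives, a sketch of the bootstrap sequence (site names and pool name
are placeholders):
# on the primary cluster: create a bootstrap token
rbd mirror pool peer bootstrap create --site-name site-a mypool > token
# on the secondary cluster: import it; rx-only gives one-way replication
rbd mirror pool peer bootstrap import --site-name site-b --direction rx-only mypool token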
Hi all,
Does the ceph manager prometheus module export bluestore rocksdb
compaction times per OSD? I couldn't find anything.
thx
Frank
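As a possible workaround (a sketch only, counter names differ between
releases), the per-OSD admin socket exposes the RocksDB perf counters:
# dump the rocksdb perf counters of one OSD (requires jq)
ceph daemon osd.0 perf dump | jq '.rocksdb'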
Is there any quick fix for the issue listed here:
https://tracker.ceph.com/issues/45032
I hit this on a Ceph upgrade to 15.2.3.
OK, now we are talking. It is quite possible that trimming will not start
until this operation is completed.
If there are enough shards/copies to recover the lost objects, you should try a
pg repair first. If you did lose too many replicas, there are ways to flush
this PG out of the
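A rough sketch of the commands involved (the PG id 2.5 is a placeholder):
# check what is recoverable
ceph pg 2.5 query
ceph pg 2.5 list_unfound
# try a repair first if enough shards survive
ceph pg repair 2.5
# last resort: give up on the unfound objects
ceph pg 2.5 mark_unfound_lost revert
# or delete them if reverting is not possible
ceph pg 2.5 mark_unfound_lost delete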
Never mind, I didn't see that Octopus isn't really supported on C7, so I'll
just stick with what I have until I want to upgrade to C8.
Thanks,
-Drew
-Original Message-
From: Drew Weaver
Sent: Monday, June 8, 2020 1:38 PM
To: 'ceph-users@ceph.io'
Subject: [ceph-users] Trying to upgrade to
That's strange. Maybe there is another problem. Do you have any other health
warnings that might be related? Is there some recovery/rebalancing going on?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Francois
Hi, the cluster is version 14.2.9,
ceph-deploy v2.0.1.
Using the command 'ceph-deploy install --release octopus mon0 mon1 mon2'
results in this command being run:
sudo yum remove -y ceph-release
which removes this package:
ceph-release.noarch  1-1.el7  @/ceph-release-1-0.el7.noarch
Then it tries to
Hi Francois,
this sounds great. At least it's operational. I guess it is still using a lot of
swap while trying to replay operations.
I would cleanly disconnect all clients if you didn't do so already, even any
read-only clients. Any extra load will just slow down recovery. My best guess
is,
Okay, that was not clear to me, thanks for clearing that up. Do you
see martians logged?
Quoting Amjad Kotobi:
On 8. Jun 2020, at 18:39, Eugen Block wrote:
Your client machine is in 136.172.26.0/24 while your ceph public
network is defined as 10.60.1.0/24. Clients need to be able to
Your client machine is in 136.172.26.0/24 while your ceph public
network is defined as 10.60.1.0/24. Clients need to be able to reach
the MONs public IP, that's where they get their information from (e.g.
which OSDs they have to talk to). So you'll probably need some routing
to reach the
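A quick sketch for verifying reachability from a client (the MON IP and ports
are placeholders; msgr v2 listens on 3300, v1 on 6789):
ping -c 3 10.60.1.1
nc -zv 10.60.1.1 3300
nc -zv 10.60.1.1 6789
# does the ceph CLI get a response within a bounded time?
ceph -s --connect-timeout 10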
> On 8. Jun 2020, at 18:01, Eugen Block wrote:
>
> Clients need access to the public network in order to access ceph resources.
What do you mean by access?
E.g., should any machine in the IP subnet 10.60.1.0/24 be able to access Ceph
and perform any operations?
Amjad
>
>
>
> Quoting Amjad Kotobi:
Clients need access to the public network in order to access ceph resources.
Quoting Amjad Kotobi:
Hi,
I’m trying to access the Ceph cluster from client machines via RBD for
operations like listing pools, creating images, etc.
I put all the needed keys/configuration in the ceph directory and when
Hi,
I’m trying to access the Ceph cluster from client machines via RBD for operations
like listing pools, creating images, etc.
I put all the needed keys/configuration in the ceph directory, and when I try to
list a pool, rbd hangs/waits forever.
The client machine is in the IP range 136.172.26.0/24.
The command e.g. I’m
https://drive.switch.ch/index.php/s/Jwk0Kgy7Q1EIxuE
On 08.06.20 17:30, Igor Fedotov wrote:
I think it's better to put the log on some public cloud and paste the
link here.
On 6/8/2020 6:27 PM, Harald Staub wrote:
(really sorry for spamming, but it is still waiting for moderator, so
trying
Hi all,
We have a CephFS with the data pool in erasure coding (3+2) and 1024 PGs
(Nautilus 14.2.8).
One of the PGs is partially destroyed (we lost 3 OSDs, thus 3 shards); it
has 143 unfound objects and is stuck in the state
"active+recovery_unfound+undersized+degraded+remapped".
We then lost some data
I think it's better to put the log on some public cloud and paste the
link here.
On 6/8/2020 6:27 PM, Harald Staub wrote:
(really sorry for spamming, but it is still waiting for moderator, so
trying with xz ...)
On 08.06.20 17:21, Harald Staub wrote:
(and now with trimmed attachment
(really sorry for spamming, but it is still waiting for moderator, so
trying with xz ...)
On 08.06.20 17:21, Harald Staub wrote:
(and now with trimmed attachment because of size restriction: only the
debug log)
On 08.06.20 16:53, Harald Staub wrote:
(and now with attachment ...)
On
Hi Igor
Thank you for looking into this! I attached the complete log of today,
with the preceding "ceph_assert(h->file->fnode.ino != 1)" at
13:13:22.609, the first "FAILED ceph_assert(is_valid_io(off, len))" at
13:44:52.059, the debug log starting at 16:42:20.883.
Cheers
Harry
On 08.06.20
I already had some discussion on the list about this problem. But I
should ask again.
We really lost some objects and there are not enough shards to
reconstruct them (it's an erasure-coded data pool)... so it cannot be
fixed anymore and we know we have data loss! I did not mark the PG
out
Hi Harald,
was this exact OSD suffering from "ceph_assert(h->file->fnode.ino != 1)"?
Could you please collect an extended log with debug-bluefs set to 20?
Thanks,
Igor
On 6/8/2020 4:48 PM, Harald Staub wrote:
This is again about our bad cluster, with far too many objects. Now
another OSD
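In case it helps, a sketch of how to raise that debug level for one OSD (the
OSD id 244 is just an example taken from a later log excerpt):
ceph config set osd.244 debug_bluefs 20
# or, since the OSD dies at startup, set it in ceph.conf on that host:
# [osd.244]
# debug bluefs = 20
# then restart the OSD and grab /var/log/ceph/ceph-osd.244.log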
When I try
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-$OSD fsck
I get:
2020-06-08 16:05:39.393 7fc589500d80 1 bluestore(/var/lib/ceph/osd/ceph-244) _mount path /var/lib/ceph/osd/ceph-244
2020-06-08 16:05:39.393 7fc589500d80 1 bdev create path /var/lib/ceph/osd/ceph-244/block type
There is no recovery going on, but indeed we have a damaged PG (with
some lost objects due to a major crash a few weeks ago)... and there are
some shards of this PG on OSD 27!
That's also why we are migrating all the data out of this FS!
It's certainly related, and I guess that it's trying to
Hi Francois,
did you manage to get any further with this?
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Frank Schilder
Sent: 06 June 2020 15:21:59
To: ceph-users; f...@lpnhe.in2p3.fr
Subject: [ceph-users] Re:
This is again about our bad cluster, with far too many objects. Now
another OSD crashes immediately at startup:
/build/ceph-14.2.8/src/os/bluestore/KernelDevice.cc: 944: FAILED ceph_assert(is_valid_io(off, len))
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152)
Thanks again for the hint!
Indeed, I did a
ceph daemon mds.lpnceph-mds02.in2p3.fr objecter_requests
and it seems that OSD 27 is more or less stuck with an op of age 34987.5
(while the other OSDs have ages < 1).
I tried a 'ceph osd down 27', which resulted in resetting the age, but I can
notice that the age
Hi,
We had this issue in a 14.2.8 cluster, although it appeared after
resizing db device to a larger one.
After some time (weeks), spillover was gone...
Cheers
Eneko
On 6/6/20 at 0:07, Reed Dier wrote:
I'm going to piggyback on this somewhat.
I've battled RocksDB spillovers over
Reed,
No, "ceph-kvstore-tool stats" isn't be of any interest.
For the sake of better issue understanding it might be interesting to
have bluefs log dump obtained via ceph-bluestore-tool's bluefs-log-dump
command. This will give some insight what RocksDB files are spilled
over. It's still
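For reference, a sketch of collecting that dump (the OSD id is a placeholder;
the OSD should be stopped, and I believe the output goes to stdout):
systemctl stop ceph-osd@0
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-0 bluefs-log-dump > bluefs-log.txt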
Hi Frank,
Finally I did:
ceph config set global mds_beacon_grace 60
and create /etc/sysctl.d/sysctl-ceph.conf with
vm.min_free_kbytes=4194303
and then
sysctl --system
After that, the MDS stayed in rejoin for a very long time (almost 24
hours) with errors like:
2020-06-07 04:10:36.802
On Sun, Jun 7, 2020 at 8:06 AM Hans van den Bogert wrote:
>
> Hi list,
>
> I've awaited Octopus for a long time to be able to use mirroring with
> snapshotting, since my setup does not allow for journal-based
> mirroring. (K8s/Rook 1.3.x with Ceph 15.2.2)
>
> However, I seem to be stuck; I've come
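For context, the snapshot-based path on 15.2.x roughly looks like this
(pool/image names are placeholders):
# the pool must be in image mode for snapshot-based mirroring
rbd mirror pool enable mypool image
rbd mirror image enable mypool/myimage snapshot
# mirror-snapshots are created on demand or on a schedule
rbd mirror image snapshot mypool/myimage
rbd mirror snapshot schedule add --pool mypool --image myimage 1h
rbd mirror image status mypool/myimage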
Hi,
you can specify a customized ceph.conf before the 'cephadm bootstrap'
command [1] to add a dedicated cluster network (if you really need
that, it has been discussed extensively on the list):
---snip---
octo1:~ # cat < /root/ceph.conf
[global]
public network = 192.168.124.0/24
cluster
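The prepared file is then passed to bootstrap, roughly (the MON IP is a
placeholder):
cephadm bootstrap --config /root/ceph.conf --mon-ip 192.168.124.10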
Just a wild guess, but how many OSDs have you deployed on the MGR
node? The inventory page shows a default of 10 entries; maybe you can
increase that (upper right corner)? Do you see more OSDs on the "OSDs"
page (also limited to 10 entries)?
Quoting Amudhan P:
Hi,
I am using Ceph
Hello,
I know that NFS on Octopus is still a bit under development.
I'm trying to deploy NFS daemons and have some issues with the orchestrator.
For the other daemons, for example monitors, I can issue the command "ceph orch
apply mon 3".
This will tell the orchestrator to deploy or remove
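For comparison, on Octopus the nfs service spec takes a service id, a RADOS
pool and optionally a namespace; a sketch with placeholder names:
ceph orch apply nfs mynfs nfs-ganesha-pool --namespace nfs-ns
# check what the orchestrator scheduled
ceph orch ls nfs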