We see the same messages and are similarly on a 4.4 KRBD version that is
affected by this.
So far I have seen no impact from it that I am aware of.
> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Jason Dillaman
> Sent: Thursday, 5 October
On 10/04/2017 09:19 PM, Gregory Farnum wrote:
Oh, hmm, you're right. I see synchronization starts but it seems to
progress very slowly, and it certainly doesn't complete in that 2.5
minute logging window. I don't see any clear reason why it's so slow; it
might be more clear if you could provide
On Wed, 4 Oct 2017, Sage Weil wrote:
> Hi everyone,
>
> After further discussion we are targeting 9 months for Mimic 13.2.0:
>
> - Mar 16, 2018 feature freeze
> - May 1, 2018 release
>
> Upgrades for Mimic will be from Luminous only (we've already made that a
> required stop), but we plan to
Hi everyone,
After further discussion we are targeting 9 months for Mimic 13.2.0:
- Mar 16, 2018 feature freeze
- May 1, 2018 release
Upgrades for Mimic will be from Luminous only (we've already made that a
required stop), but we plan to allow Luminous -> Nautilus too (and Mimic
-> O).
Nau
Oh, hmm, you're right. I see synchronization starts but it seems to
progress very slowly, and it certainly doesn't complete in that 2.5 minute
logging window. I don't see any clear reason why it's so slow; it might be
more clear if you could provide logs from the other monitors at the same time
(especial
Some more detail:
when restarting the monitor on server1, it stays in synchronizing state
forever.
However, the other two monitors change into the electing state.
I have double-checked that there are no (host) firewalls active and
that the times on the hosts are within 1 second of each other (they all
I am evaluating multi-active MDS (2 active and 1 standby) and the
failover seems to take a long time.
1. Is there any way to reduce the failover time, during which the mount point
(ceph-fuse) hangs?
2. Is mds_standby_replay a valid config option to enable warming up the
metadata cache with multi-MDS?
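For reference, the settings I mean are the per-daemon ceph.conf options, roughly
like the sketch below (the daemon section name and the rank binding are only
illustrative):
[mds.standby-mds]
# illustrative section name for the standby daemon
mds_standby_replay = true
# follow the journal of one active rank; rank 0 is just an example
mds_standby_for_rank = 0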
Regards
Kri
Hello Gregory,
the logfile I produced already has debug mon = 20 set:
[21:03:51] server1:~# grep "debug mon" /etc/ceph/ceph.conf
debug mon = 20
It is clear that server1 is out of quorum, but how do we make it
part of the quorum again?
I expected that the quorum finding process is tri
Perhaps this is related to a known issue on some 4.4 and later kernels
[1] where the stable write flag was not preserved by the kernel?
[1] http://tracker.ceph.com/issues/19275
On Wed, Oct 4, 2017 at 2:36 PM, Gregory Farnum wrote:
> That message indicates that the checksums of messages between y
That message indicates that the checksums of messages between your kernel
client and OSD are incorrect. It could be actual physical transmission
errors, but if you don't see other issues then this isn't fatal; they can
recover from it.
On Wed, Oct 4, 2017 at 8:52 AM Josy wrote:
> Hi,
>
> We have
This says it's actually missing one object, and a repair won't fix that (if
it could, the object wouldn't be missing!). There should be more details
somewhere in the logs about which object.
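If the cluster is Jewel or later, the scrub results can also be queried directly
rather than dug out of the logs; a sketch, using the pg id from the health
detail output:
# lists the objects (and shards) the last scrub flagged as inconsistent
rados list-inconsistent-obj 5.144 --format=json-pretty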
On Wed, Oct 4, 2017 at 5:03 AM Kenneth Waegeman
wrote:
> Hi,
>
> We have some inconsistency / scrub error
You'll need to change the config so that it's running "debug mon = 20" for
the log to be very useful here. It does say that it's dropping client
connections because it's been out of quorum for too long, which is the
correct behavior in general. I'd imagine that you've got clients trying to
connect
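If editing ceph.conf and restarting is awkward, the debug level can usually also
be raised at runtime; a sketch, assuming the mon id server1 from the earlier
mails:
ceph tell mon.server1 injectargs '--debug-mon 20'
# or, on the monitor host itself, through the admin socket:
ceph daemon mon.server1 config set debug_mon 20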
On Wed, Oct 4, 2017 at 9:14 AM, Benjeman Meekhof wrote:
> Wondering if anyone can tell me how to summarize recovery
> bytes/ops/objects from counters available in the ceph-mgr python
> interface? To put it another way, how does the ceph -s command put
> together that information and can I access that
On Wed, Oct 04, 2017 at 03:02:09AM -0300, Leonardo Vaz wrote:
> On Thu, Sep 28, 2017 at 12:08:00AM -0300, Leonardo Vaz wrote:
> > Hey Cephers,
> >
> > This is just a friendly reminder that the next Ceph Developer Monthly
> > meeting is coming up:
> >
> > http://wiki.ceph.com/Planning
> >
> > If
Wondering if anyone can tell me how to summarize recovery
bytes/ops/objects from counters available in the ceph-mgr python
interface? To put it another way, how does the ceph -s command put
together that information and can I access that information from a
counter queryable by the ceph-mgr python modu
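As far as I can tell, the figures ceph -s prints come from the PG map the mgr
maintains rather than from any single daemon perf counter, so one quick way to
peek at the raw fields (field names here are from memory and vary by release):
# during recovery the pgmap section carries fields such as
# recovering_objects_per_sec / recovering_bytes_per_sec
ceph status --format json-pretty | grep -i recover
Inside a mgr python module the same map should be reachable through
MgrModule.get() (the pg_summary / pg_dump style keys), while get_counter()
covers per-daemon perf counters; worth double-checking against the plugin docs
for your release.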
Hi,
We have set up a cluster with 8 OSD servers (31 disks).
Ceph health is OK.
--
[root@las1-1-44 ~]# ceph -s
cluster:
id: de296604-d85c-46ab-a3af-add3367f0e6d
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-las-mon-a1,ceph-las-mon-a2,ceph-las-mon-a3
mg
Hi,
We have an inconsistency / scrub error on an erasure-coded pool that I
can't seem to solve.
[root@osd008 ~]# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 5.144 is active+clean+inconsistent, acting
[81,119,148,115,142,100,25,63,48,11,43]
1 scrub errors
In the log
Good morning,
we have recently upgraded our kraken cluster to luminous and since then
noticed an odd behaviour: we cannot add a monitor anymore.
As soon as we start a new monitor (server2), ceph -s and ceph -w start to hang.
The situation became worse, since one of our staff stopped an existing
Hi,
> Did you edit the code before trying Luminous?
Yes, I'm still on jewel.
> I also noticed from your original mail that it appears you're using multiple
> active metadata servers? If so, that's not stable in Jewel. You may have tripped
> on one of many bugs fixed in Luminous for that conf
ok, thanks for the feedback Piotr and Dan!
MJ
On 4-10-2017 9:38, Dan van der Ster wrote:
Since Jewel (AFAIR), when (re)starting OSDs, pg status is reset to "never
contacted", resulting in "pgs are stuck inactive for more than 300 seconds"
being reported until osds regain connections between the
On Wed, Oct 4, 2017 at 9:08 AM, Piotr Dałek wrote:
> On 17-10-04 08:51 AM, lists wrote:
>>
>> Hi,
>>
>> Yesterday I chowned our /var/lib/ceph to ceph, to completely finalize our
>> jewel migration, and noticed something interesting.
>>
>> After I brought back up the OSDs I just chowned, the system ha
On 17-10-04 08:51 AM, lists wrote:
Hi,
Yesterday I chowned our /var/lib/ceph to ceph, to completely finalize our jewel
migration, and noticed something interesting.
After I brought back up the OSDs I just chowned, the system had some
recovery to do. During that recovery, the system went to HEAL