>>> I am curious if those OSDs are flapping all at once? If a single host
>>> is affected I would consider the network connectivity (bottlenecks and
>>> misconfigured bonds can generate strange situations), the storage
>>> controller and the firmware.
>>>
>>> Cheers,
>>>
>>> Maxime
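On the misconfigured-bond point: the bond state and per-slave link status can be read straight off the kernel, which is a quick first check. A minimal sketch, with bond0 and eth0 as placeholders for whatever the cluster network actually uses:

    cat /proc/net/bonding/bond0
    ethtool eth0 | grep -E 'Speed|Duplex|Link detected'

A slave negotiated down to a lower speed, or a link that keeps bouncing, is exactly the kind of thing that shows up here.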
From: ceph-users on behalf of Vlad Blando
Date: Sunday 2 April 2017 16:28
To: ceph-users
Subject: [ceph-users] Flapping OSDs

Hi,

One of my Ceph nodes has flapping OSDs. The network between the nodes is
fine; it's a 10GBase-T network. I don't see anything wrong with the network,
but these OSDs keep going up and down.

[root@avatar0-ceph4 ~]# ceph osd tree
# id    weight  type name       up/down reweight
-1      174.7   root defa
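A quick way to see the flapping from the cluster side, rather than per node, is to check which OSDs the monitors are marking down and whether the daemons themselves think they were wrongly failed. A rough sketch, assuming default log locations and using osd.12 purely as a placeholder:

    ceph health detail
    ceph -w          # watch the up/down transitions as they happen
    grep 'wrongly marked me down' /var/log/ceph/ceph-osd.12.log

If an OSD logs "wrongly marked me down", the process was alive but its heartbeats weren't getting through, which points back at the network or a stalled disk rather than a crashed daemon.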
>> 'filestore xattr use omap' is no longer used in Ceph – can
>> anybody confirm this?
>> I could not find any usage in the Ceph source code except that the value
>> is set in some of the test software…
>>
>> Paul
>>
>>
>> From: ceph-users on behalf of Tom Christensen <pav...@gmail.com>
>> Date: Monday, 30 November 2015 at 23:20
>> To: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
>> Subject: Re: [ceph-users] Flapping OSDs, Large meta directories in OSDs
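If anyone wants to double-check that themselves, the two obvious places to look are the source tree and a running OSD's admin socket; the option name below is just the one under discussion and osd.0 is an example:

    # in a clone of https://github.com/ceph/ceph
    git grep -n filestore_xattr_use_omap

    # what a running OSD thinks the option is set to, if it still recognises it
    ceph daemon osd.0 config show | grep filestore_xattr_use_omap

If the second command returns nothing on a recent release, that is consistent with the option having been dropped.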

What counts as ancient? Concurrent with our hammer upgrade we went from
3.16 -> 3.19 on Ubuntu 14.04. We are looking to revert to the 3.16 kernel
we'd been running, because we're also seeing an intermittent (it's happened
twice in 2 weeks) massive load spike that completely hangs the OSD node
(we're ta…

The trick with debugging heartbeat errors is to grep back through the log
to find the last thing the affected thread was doing, e.g. is
0x7f5affe72700 stuck in messaging, writing to the disk, reading through the
omap, etc.
I agree this doesn't look to be network related, but if you want to rule it
out…
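To make that concrete, a minimal sketch of that grep workflow, assuming the default log path and using osd.12 and the thread address above purely as placeholders:

    # find the heartbeat_map timeouts and note the thread address
    grep heartbeat_map /var/log/ceph/ceph-osd.12.log | tail -n 20

    # every OSD log line carries the thread id, so this walks back through
    # what that thread was doing just before it stalled
    grep 7f5affe72700 /var/log/ceph/ceph-osd.12.log | less

This generally needs debug_osd (and sometimes debug_filestore) turned up, otherwise the thread won't have logged much between heartbeats.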
No, CPU and memory look normal. We haven't been fast/lucky enough with
iostat to see if we're just slamming the disk itself; I keep trying to
catch one, get logged into the node, find the disk and get iostat running
before the OSD comes back up. We haven't flapped that many OSDs, and most…
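One way to avoid having to win that race is to leave stats collection running on the node and read it back after the fact. A rough sketch, with the interval and output path picked arbitrarily:

    # log extended per-device stats every 5 seconds, with timestamps
    nohup iostat -x -t 5 > /var/tmp/iostat-$(hostname).log 2>&1 &

After the next flap, scroll to the matching timestamp and look at %util and await for the OSD's disk; sar from the same sysstat package covers similar ground if it is already enabled.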
On 11/30/2015 08:56 PM, Tom Christensen wrote:
> We recently upgraded to 0.94.3 from firefly and now for the last week
> have had intermittent slow requests and flapping OSDs. We have been
> unable to nail down the cause, but it's feeling like it may be related to
> our osdmaps not getting deleted properly. Most of our osds are now
> storing over 100GB…
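On the osdmap angle, two quick checks (the meta path is the FileStore layout, osd.0 is a placeholder, and the second command has to run on the node hosting that OSD):

    # how much the meta directory, where old osdmaps live on FileStore, has grown
    du -sh /var/lib/ceph/osd/ceph-*/current/meta

    # the range of maps the OSD is still holding on to (oldest_map / newest_map)
    ceph daemon osd.0 status

If newest_map minus oldest_map keeps growing into the tens of thousands, the old maps aren't being trimmed and the meta directories will keep swelling.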
> Anything in dmesg?

Just

[188924.137100] init: ceph-osd (ceph/6) main process (8262) killed by
ABRT signal
[188924.137138] init: ceph-osd (ceph/6) main process ended, respawning

> When you say restart, do you mean a physical restart, or just
> restarting the daemon? If it takes a physical re…
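An ABRT from ceph-osd usually means the daemon hit a failed assertion, and the backtrace lands in the OSD's own log rather than in dmesg. A rough way to pull it out, assuming the default log path and osd 6 from the upstart lines above:

    grep -B 2 -A 30 'FAILED assert' /var/log/ceph/ceph-osd.6.log | less

If there is no assert there, core dumps (when enabled) are the next thing worth looking at.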
Anything in dmesg? When you say restart, do you mean a physical
restart, or just restarting the daemon? If it takes a physical restart
and you're using Intel NICs, it might be worth upgrading the network
drivers. Old versions have some bugs that cause them to just drop traffic.

On 5/14/2014 9:0…
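For the NIC-driver angle, the driver and firmware versions, plus any silent drops, can be read with ethtool; eth0 is a placeholder for whichever interface carries the cluster network:

    ethtool -i eth0                          # driver, version, firmware-version
    ethtool -S eth0 | grep -iE 'drop|err|discard'

Comparing those counters before and after a flap is a cheap way to rule the NIC in or out.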
I have 4 OSDs that won't stay in the cluster. I restart them, they join
for a bit, then get kicked out because they stop responding to pings
from the other OSDs.
I don't know what the issue is. The disks look fine. SMART reports no
errors or reallocated sectors. iostat says the disks are n…
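When OSDs keep getting kicked out for missing pings while the host and disks look healthy, it also helps to see which peers are filing the failure reports and whether they cluster on one host or switch. The cluster log on the monitors records this; the exact wording varies a bit between releases:

    grep 'reported failed' /var/log/ceph/ceph.log | tail -n 50

If every report comes from OSDs behind one particular host or switch, that narrows it to a specific network path rather than to the flapping OSDs themselves.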