On Behalf Of Max A. Krasilnikov
Sent: Monday, July 27, 2015 4:07 AM
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up
Hello!
On Tue, Jul 07, 2015 at 02:21:56PM +0530, mallikarjuna.biradar wrote:
>
> Hi all,
> Setup details:
> Two storage enclosures each connected to 4 OSD nodes (Shared storage).
> Failure domain is Chassis (enclosure) level. Replication count is 2.
> Each host is allotted 4 drives.
Sorry, autocorrect. Decompiled crush map.
Robert LeBlanc
Sent from a mobile device please excuse any typos.
On Jul 24, 2015 9:44 AM, "Robert LeBlanc" wrote:
> Please provide the recompiled crush map.
>
> Robert LeBlanc
>
> Sent from a mobile device please excuse any typos.
> On Jul 23, 2015 7:0
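For anyone following along, the usual way to pull and decompile the map is:

    ceph osd getcrushmap -o crushmap.bin          # dump the compiled crush map
    crushtool -d crushmap.bin -o crushmap.txt     # decompile it to readable text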
proceed"},
{ "osd": 10,
"current_lost_at": 0,
"comment": "starting or marking this osd lost may let us
proceed"}]},
{ "name": "Started",
"enter_time": "2
You may want to check your min_size value for your pools. If it is
set to the pool size value, then the cluster will not do I/O if you
lose a chassis.
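For reference, the current values can be checked per pool (replace <pool> with the pool name):

    ceph osd pool get <pool> size
    ceph osd pool get <pool> min_size
    ceph osd dump | grep 'replicated size'    # shows size/min_size for all pools at once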
On Sun, Jul 5, 2015 at 11:04 PM, Mallikarjun Biradar
wrote:
> Hi all,
>
> Setup details:
> Two storage enclosures each connected to 4 OSD nodes
Hi all,
Setup details:
Two storage enclosures each connected to 4 OSD nodes (Shared storage).
Failure domain is Chassis (enclosure) level. Replication count is 2.
Each host is allotted 4 drives.
I have active client IO running on the cluster (random write profile with
4M block size & 64 queue depth).
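The post doesn't name the benchmark tool; if it was fio against an RBD image, a roughly equivalent job would look like this (pool and image names are placeholders):

    fio --name=randwrite-4m --ioengine=rbd --clientname=admin \
        --pool=<pool> --rbdname=<image> \
        --rw=randwrite --bs=4M --iodepth=64 --time_based --runtime=600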
cluster state:
     osdmap e3240: 24 osds: 12 up, 12 in
      pgmap v46050: 1088 pgs, 2 pools, 20322 GB data, 5080 kobjects
            4 GB used, 61841 GB / 84065 GB avail
            4745644/10405374 objects degraded (45.608%);
            3688079/10405374 objects misplaced (35.444%)
Sorry for the delay in replying to this; I was doing some retries on
this issue so I could summarise.
Tony,
Setup details:
Two storage boxes (each with 12 drives), each connected to 4 hosts.
Each host owns 3 disks from its storage box, for a total of 24 OSDs.
Failure domain is at Chassis level.
OSD tree:
-1
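For a chassis-level failure domain, the replicated rule in the decompiled crush map usually looks something like the sketch below (the rule name and root are illustrative, not taken from this cluster):

    rule replicated_chassis {
            ruleset 0
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type chassis
            step emit
    }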
Sounds to me like you've put yourself at too much risk - *if* I'm reading
your message right about your configuration, you have multiple hosts
accessing OSDs that are stored on a single shared box - so if that single
shared box (single point of failure for multiple nodes) goes down it's
possible fo
Your first point of troubleshooting is pretty much always to look at
"ceph -s" and see what it says. In this case it's probably telling you
that some PGs are down, and then you can look at why (but perhaps it's
something else).
-Greg
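Concretely, something along these lines will show which PGs are stuck and which OSDs they are waiting for (the pg id is a placeholder):

    ceph -s
    ceph health detail
    ceph pg dump_stuck inactive
    ceph pg <pgid> query    # check "peering_blocked_by" / "down_osds_we_would_probe"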
On Thu, Jul 9, 2015 at 12:22 PM, Mallikarjun Biradar
wrote:
> Y
Yeah. All OSDs are down and the monitors are still up.
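A quick way to confirm that state from the CLI:

    ceph osd stat         # e.g. "24 osds: 12 up, 12 in"
    ceph osd tree         # shows which hosts/chassis the down OSDs belong to
    ceph quorum_status    # confirms the monitors still have quorum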
On Thu, Jul 9, 2015 at 4:51 PM, Jan Schermer wrote:
> And are the OSDs getting marked down during the outage?
> Are all the MONs still up?
>
> Jan
>
>> On 09 Jul 2015, at 13:20, Mallikarjun Biradar
>> wrote:
>>
>> I have size=2 & min_size=1 and IO is
And are the OSDs getting marked down during the outage?
Are all the MONs still up?
Jan
> On 09 Jul 2015, at 13:20, Mallikarjun Biradar
> wrote:
>
> I have size=2 & min_size=1 and IO is paused till all hosts come back.
>
> On Thu, Jul 9, 2015 at 4:41 PM, Jan Schermer wrote:
>> What is the min_
I have size=2 & min_size=1 and IO is paused till all hosts come back.
On Thu, Jul 9, 2015 at 4:41 PM, Jan Schermer wrote:
> What is the min_size setting for the pool? If you have size=2 and min_size=2,
> then all your data is safe when one replica is down, but the IO is paused. If
> you want to
What is the min_size setting for the pool? If you have size=2 and min_size=2,
then all your data is safe when one replica is down, but the IO is paused. If
you want to continue IO, you need to set min_size=1.
But be aware that a single failure after that causes you to lose all the data,
you’d hav
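The corresponding commands, with the pool name as a placeholder (and keeping Jan's warning in mind about running on a single replica):

    ceph osd pool set <pool> min_size 1
    # revert once the failed chassis is back and recovery has finished:
    ceph osd pool set <pool> min_size 2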