Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-08-02 Thread Somnath Roy
.com] On Behalf Of Max A. Krasilnikov Sent: Monday, July 27, 2015 4:07 AM Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up Hello! On Tue, Jul 07, 2015 at 02:21:56PM +0530, mallikarjuna.biradar wrote: >

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-27 Thread Max A. Krasilnikov
Hello! On Tue, Jul 07, 2015 at 02:21:56PM +0530, mallikarjuna.biradar wrote: > Hi all, > Setup details: > Two storage enclosures each connected to 4 OSD nodes (Shared storage). > Failure domain is Chassis (enclosure) level. Replication count is 2. > Each host is allotted 4 drives.

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-24 Thread Robert LeBlanc
Sorry, autocorrect. Decompiled crush map. Robert LeBlanc Sent from a mobile device please excuse any typos. On Jul 24, 2015 9:44 AM, "Robert LeBlanc" wrote: > Please provide the recompiled crush map. > > Robert LeBlanc > > Sent from a mobile device please excuse any typos. > On Jul 23, 2015 7:0
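The decompiled CRUSH map Robert asks for is obtained with the standard `crushtool` round-trip; a minimal sketch, assuming a working `ceph` admin client (file names are illustrative):

```shell
# Extract the binary CRUSH map from the cluster and decompile it to text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# After inspecting/editing crushmap.txt, recompile and inject it back
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
```

The text form (`crushmap.txt`) is what readers on the list can review for failure-domain mistakes.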

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-23 Thread Varada Kari
s proceed"}, { "osd": 10, "current_lost_at": 0, "comment": "starting or marking this osd lost may let us proceed"}]}, { "name": "Started", "enter_time": "2

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-23 Thread Eric Eastman
You may want to check your min_size value for your pools. If it is set to the pool size value, then the cluster will not do I/O if you lose a chassis. On Sun, Jul 5, 2015 at 11:04 PM, Mallikarjun Biradar wrote: > Hi all, > > Setup details: > Two storage enclosures each connected to 4 OSD nodes
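Checking the values Eric mentions is a one-liner per pool; a sketch assuming a standard ceph CLI (the pool name "rbd" is just an example):

```shell
# Compare replication count against the minimum replicas required for IO;
# if min_size == size, losing one chassis pauses client IO
ceph osd pool get rbd size
ceph osd pool get rbd min_size
```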

[ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-23 Thread Mallikarjun Biradar
Hi all, Setup details: Two storage enclosures each connected to 4 OSD nodes (Shared storage). Failure domain is Chassis (enclosure) level. Replication count is 2. Each host is allotted 4 drives. I have active client IO running on cluster. (Random write profile with 4M block size & 64 Queue


Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-15 Thread Mallikarjun Biradar
cluster state:
  osdmap e3240: 24 osds: 12 up, 12 in
  pgmap v46050: 1088 pgs, 2 pools, 20322 GB data, 5080 kobjects
  4 GB used, 61841 GB / 84065 GB avail
  4745644/10405374 objects degraded (45.608%)
  3688079/10405374 objects misplaced (35.444%)
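The degraded and misplaced percentages in the pgmap line are simple ratios of the object counts it reports; a quick sanity check:

```shell
# Recompute the percentages reported in the pgmap summary
awk 'BEGIN { printf "degraded:  %.3f%%\n", 100 * 4745644 / 10405374 }'
awk 'BEGIN { printf "misplaced: %.3f%%\n", 100 * 3688079 / 10405374 }'
# degraded:  45.608%
# misplaced: 35.444%
```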

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-15 Thread Mallikarjun Biradar
Sorry for the delay in replying; I was rerunning some retries on this issue so I could summarise. Tony, Setup details: Two storage boxes (each with 12 drives), each connected to 4 hosts. Each host owns 3 disks from the storage box. Total of 24 OSDs. Failure domain is at Chassis level. OSD tree: -1

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-09 Thread Tony Harris
Sounds to me like you've put yourself at too much risk - *if* I'm reading your message right about your configuration, you have multiple hosts accessing OSDs that are stored on a single shared box - so if that single shared box (single point of failure for multiple nodes) goes down it's possible fo

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-09 Thread Gregory Farnum
Your first point of troubleshooting is pretty much always to look at "ceph -s" and see what it says. In this case it's probably telling you that some PGs are down, and then you can look at why (but perhaps it's something else). -Greg On Thu, Jul 9, 2015 at 12:22 PM, Mallikarjun Biradar wrote: > Y
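Greg's troubleshooting path maps to a handful of commands; a sketch of the usual first-stop checks on a live cluster:

```shell
# Overall health and PG state summary
ceph -s

# Which PGs are unhealthy, and why
ceph health detail

# PGs stuck inactive, unclean, or stale
ceph pg dump_stuck
```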

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-09 Thread Mallikarjun Biradar
Yeah. All OSDs are down and the monitors are still up. On Thu, Jul 9, 2015 at 4:51 PM, Jan Schermer wrote: > And are the OSDs getting marked down during the outage? > Are all the MONs still up? > > Jan > >> On 09 Jul 2015, at 13:20, Mallikarjun Biradar >> wrote: >> >> I have size=2 & min_size=1 and IO is

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-09 Thread Jan Schermer
And are the OSDs getting marked down during the outage? Are all the MONs still up? Jan > On 09 Jul 2015, at 13:20, Mallikarjun Biradar > wrote: > > I have size=2 & min_size=1 and IO is paused till all hosts come back. > > On Thu, Jul 9, 2015 at 4:41 PM, Jan Schermer wrote: >> What is the min_

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-09 Thread Mallikarjun Biradar
I have size=2 & min_size=1 and IO is paused till all hosts come back. On Thu, Jul 9, 2015 at 4:41 PM, Jan Schermer wrote: > What is the min_size setting for the pool? If you have size=2 and min_size=2, > then all your data is safe when one replica is down, but the IO is paused. If > you want to

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-09 Thread Jan Schermer
What is the min_size setting for the pool? If you have size=2 and min_size=2, then all your data is safe when one replica is down, but the IO is paused. If you want to continue IO you need to set min_size=1. But be aware that a single failure after that causes you to lose all the data, you’d hav
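Jan's suggestion is a single pool setting; a sketch assuming a standard ceph CLI (the pool name "rbd" is illustrative):

```shell
# Let IO continue with a single surviving replica -- accepting the risk Jan
# describes: one more failure while degraded means data loss
ceph osd pool set rbd min_size 1
```

Once the failed chassis is back and recovery finishes, min_size can be raised again with the same command.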

[ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-09 Thread Mallikarjun Biradar
Hi all, Setup details: Two storage enclosures each connected to 4 OSD nodes (Shared storage). Failure domain is Chassis (enclosure) level. Replication count is 2. Each host is allotted 4 drives. I have active client IO running on cluster. (Random write profile with 4M block size & 64 Queue
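A chassis-level failure domain like the one described is expressed in the CRUSH rule by choosing leaves across chassis buckets; a minimal sketch in decompiled CRUSH map syntax (rule name and ruleset number are illustrative):

```
rule replicated_chassis {
    ruleset 1
    type replicated
    min_size 1
    max_size 10
    step take default
    # place each replica under a different chassis bucket
    step chooseleaf firstn 0 type chassis
    step emit
}
```

With replication count 2 and only two chassis, this rule puts one copy in each enclosure, which is why losing an enclosure leaves exactly one replica.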