Re: [ceph-users] Multiple journals and an OSD on one SSD doable?

2015-06-08 Thread Cameron . Scrace
Just used the method in the link you sent me to test one of the EVO 850s;
with one job it reached a speed of around 2.5MB/s, but it didn't max out
until around 32 jobs, at 24MB/s:

sudo fio --filename=/dev/sdh --direct=1 --sync=1 --rw=write --bs=4k 
--numjobs=32 --iodepth=1 --runtime=60 --time_based --group_reporting 
--name=journal-test
write: io=1507.4MB, bw=25723KB/s, iops=6430, runt= 60007msec

Also tested a Micron 550 we had sitting around, and it maxed out at
2.5MB/s. Both results conflict with the chart.
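
(For reference, the single-job variant of the same command, numjobs=1, is
the figure comparable to the chart in the linked post; a sketch against the
same assumed device, and note it writes to the raw disk:)

sudo fio --filename=/dev/sdh --direct=1 --sync=1 --rw=write --bs=4k \
  --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting \
  --name=journal-test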

Regards,

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Christian Balzer ch...@gol.com
To: ceph-us...@ceph.com ceph-us...@ceph.com
Cc: cameron.scr...@solnet.co.nz
Date:   08/06/2015 02:40 p.m.
Subject:Re: [ceph-users] Multiple journals and an OSD on one SSD 
doable?



On Mon, 8 Jun 2015 14:30:17 +1200 cameron.scr...@solnet.co.nz wrote:

 Thanks for all the feedback. 
 
 What makes the EVOs unusable? They should have plenty of speed, but your
 link has them at 1.9MB/s. Is it just the way they handle O_DIRECT and
 D_SYNC?
 
Precisely. 
Read that ML thread for details.
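
(The quick way to see this behaviour yourself, in the spirit of the blog
post linked further down in this thread, is a dsync dd run; a sketch only,
where the device name is an assumption and the command destroys data on it:)

# 4k synchronous direct writes; consumer SSDs often collapse to a few MB/s.
dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync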

And once more, they are also not very durable.
So depending on your usage pattern and the write amplification from Ceph
(Ceph itself and the underlying FS), their TBW/$ will be horrible, costing
you more in the end than more expensive but an order of magnitude more
durable DC SSDs.

 Not sure if we will be able to spend any more; we may just have to take
 the performance hit until we can get more money for the project.

You could cheap out with 200GB DC S3700s (half the price), but they will
definitely become the bottleneck at a combined max speed of about 700MB/s,
as opposed to the 400GB ones at 900MB/s combined.
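
(Roughly the per-drive sequential write spec times two, assuming Intel's
datasheet figures of ~365MB/s for the 200GB model and ~460MB/s for the
400GB model:)

2 x ~365 MB/s (200GB DC S3700) = ~730 MB/s combined
2 x ~460 MB/s (400GB DC S3700) = ~920 MB/s combined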
 
Christian

 Thanks,
 
 Cameron Scrace
 Infrastructure Engineer
 
 Mobile +64 22 610 4629
 Phone  +64 4 462 5085 
 Email  cameron.scr...@solnet.co.nz
 Solnet Solutions Limited
 Level 12, Solnet House
 70 The Terrace, Wellington 6011
 PO Box 397, Wellington 6140
 
 www.solnet.co.nz
 
 
 
 From:   Christian Balzer ch...@gol.com
 To: ceph-us...@ceph.com ceph-us...@ceph.com
 Cc: cameron.scr...@solnet.co.nz
 Date:   08/06/2015 02:00 p.m.
 Subject:Re: [ceph-users] Multiple journals and an OSD on one SSD 

 doable?
 
 
 
 
 Cameron,
 
 To offer at least some constructive advice here instead of just all doom
 and gloom, here's what I'd do:
 
 Replace the OS SSDs with 2 400GB Intel DC S3700s (or S3710s).
 They have enough BW to nearly saturate your network.
 
 Put all your journals on them (3 SSD OSDs and 3 HDD OSDs per).
 While that's a bad move from a failure domain perspective, your budget
 probably won't allow for anything better, and those are VERY reliable and,
 just as important, durable SSDs.
 
 This will give you the speed your current setup is capable of, probably
 limited by the CPU when it comes to SSD pool operations.
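
 (A journal on a separate SSD is typically laid out with ceph-disk along
 these lines; a sketch only, the device names are assumptions:)

 # Data on the OSD disk, journal partition auto-created on the shared S3700:
 ceph-disk prepare /dev/sdb /dev/sdg
 ceph-disk activate /dev/sdb1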
 
 Christian
 
 On Mon, 8 Jun 2015 10:44:06 +0900 Christian Balzer wrote:
 
  
  Hello Cameron,
  
  On Mon, 8 Jun 2015 13:13:33 +1200 cameron.scr...@solnet.co.nz wrote:
  
   Hi Christian,
   
  Yes, we have purchased all our hardware; it was very hard to convince
  management/finance to approve it, so some of the stuff we have is a
  bit cheap.
   
  Unfortunate. Both the done deal and the cheapness. 
  
  We have four storage nodes, each with 6 x 6TB Western Digital Red
  SATA drives (WD60EFRX-68M), 6 x 1TB Samsung EVO 850 SSDs, and
  2 x 250GB Samsung EVO 850s (for OS RAID).
  CPUs are Intel Atom C2750 @ 2.40GHz (8 cores) with 32 GB of RAM.
  We have a 10Gig network.
  
 I wish there was a nice way to say this, but it unfortunately boils
 down to a "You're fooked".
  
  There have been many discussions about which SSDs are usable with 
Ceph,
  very recently as well.
 Samsung EVOs (the non-DC type for sure) are basically unusable for
 journals. See the recent thread "Possible improvements for a slow write
 speed (excluding independent SSD journals)" and:
  
 
http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

 
  for reference.
  
 I presume your intention for the 1TB SSDs is for an SSD backed pool?
 Note that the EVOs have a pretty low (guaranteed) endurance, so aside
 from needing journal SSDs that actually can do the job, you're looking at
 wearing them out rather quickly (depending on your use case of course).
 
 Now with SSD based OSDs, or even HDD based OSDs with SSD journals, that
 CPU looks a bit anemic.
  
  More below:
   The two options we are considering are:
   
  1) Use two of the 1TB SSDs for the spinning disk journals (3 each) and
  then use the remaining 900+GB of each drive as an OSD to be part of
  the cache pool.
  
  2) Put the spinning disk journals on the OS SSDs and use the 2 1TB
  SSDs for the cache pool.

Re: [ceph-users] Multiple journals and an OSD on one SSD doable?

2015-06-07 Thread Cameron . Scrace
The other option we were considering was putting the journals on the OS
SSDs; they are only 250GB and the rest would be for the OS. Is that a
decent option?

Thanks!

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Somnath Roy somnath@sandisk.com
To: cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz, 
ceph-us...@ceph.com ceph-us...@ceph.com
Date:   08/06/2015 09:34 a.m.
Subject:RE: [ceph-users] Multiple journals and an OSD on one SSD 
doable?



Cameron,
Generally, it’s not a good idea.
You want to protect the SSDs used as journals: if anything goes wrong with
that disk, you will lose all of the dependent OSDs.
I don’t think a bigger journal will gain you much performance, so the
default 5 GB journal size should be good enough. If you want to reduce the
fault domain and still put 3 journals on an SSD, go for minimum-size,
high-endurance SSDs for that.
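
(The default mentioned corresponds to the osd journal size setting, in
megabytes; a minimal ceph.conf sketch:)

[osd]
osd journal size = 5120    # MB, i.e. the 5 GB default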
Now, if you want to use the rest of the space on the 1 TB SSDs, creating
just OSDs there will not gain you much (though you may get some burst
performance). You may want to consider the following.
 
1. If your spindle OSD size is much bigger than 900 GB, you won’t be able
to make all OSDs of similar sizes; a cache pool could be one of your
options. But remember, a cache pool can wear out your SSDs faster, as
presently I guess it is not optimizing the extra writes. Sorry, I don’t
have exact data as I am yet to test that out.
 
2. If you want to make all the OSDs of similar sizes, and you would be able
to create a substantial number of OSDs with your unused SSDs (depending on
how big the cluster is), you may want to put all of your primary OSDs on
SSD and gain a significant performance boost for reads. Also, in this case,
I don’t think you will be getting any burst performance.
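
(One concrete mechanism for "primary OSDs on SSD" is primary affinity; a
sketch, where the OSD ids are assumptions and the option has to be enabled
first:)

# Bias CRUSH toward the SSD OSDs as primaries so reads are served from SSD.
ceph tell mon.\* injectargs '--mon_osd_allow_primary_affinity true'
ceph osd primary-affinity osd.6 0     # HDD OSD: avoid as primary
ceph osd primary-affinity osd.0 1     # SSD OSD: preferred primary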
 
Thanks & Regards
Somnath
 
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
cameron.scr...@solnet.co.nz
Sent: Sunday, June 07, 2015 1:49 PM
To: ceph-us...@ceph.com
Subject: [ceph-users] Multiple journals and an OSD on one SSD doable?
 
Setting up a Ceph cluster, and we want the journals for our spinning disks
to be on SSDs, but all of our SSDs are 1TB. We were planning on putting 3
journals on each SSD, but that leaves 900+GB unused on the drive. Is it
possible to use the leftover space as another OSD, or will it affect
performance too much?
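
(Mechanically the split is simple; a GPT sketch with sgdisk, where the
device name and journal sizes are assumptions, with the remainder left for
an OSD data partition:)

sgdisk -n 1:0:+10G -c 1:"journal-0" /dev/sdX
sgdisk -n 2:0:+10G -c 2:"journal-1" /dev/sdX
sgdisk -n 3:0:+10G -c 3:"journal-2" /dev/sdX
sgdisk -n 4:0:0 -c 4:"osd-data" /dev/sdX    # rest of the drive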

Thanks, 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz




Re: [ceph-users] Multiple journals and an OSD on one SSD doable?

2015-06-07 Thread Cameron . Scrace
Hi Christian,

Yes, we have purchased all our hardware; it was very hard to convince
management/finance to approve it, so some of the stuff we have is a bit
cheap.

We have four storage nodes, each with 6 x 6TB Western Digital Red SATA
drives (WD60EFRX-68M), 6 x 1TB Samsung EVO 850 SSDs, and 2 x 250GB
Samsung EVO 850s (for OS RAID).
CPUs are Intel Atom C2750 @ 2.40GHz (8 cores) with 32 GB of RAM.
We have a 10Gig network.

The two options we are considering are:

1) Use two of the 1TB SSDs for the spinning disk journals (3 each) and 
then use the remaining 900+GB of each drive as an OSD to be part of the 
cache pool.

2) Put the spinning disk journals on the OS SSDs and use the 2 1TB SSDs 
for the cache pool.

In both cases the other 4 1TB SSDs will be part of their own tier.

Thanks a lot!

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Christian Balzer ch...@gol.com
To: ceph-us...@ceph.com ceph-us...@ceph.com
Cc: cameron.scr...@solnet.co.nz
Date:   08/06/2015 12:18 p.m.
Subject:Re: [ceph-users] Multiple journals and an OSD on one SSD 
doable?




Hello,


On Mon, 8 Jun 2015 09:55:56 +1200 cameron.scr...@solnet.co.nz wrote:

 The other option we were considering was putting the journals on the OS
 SSDs; they are only 250GB and the rest would be for the OS. Is that a
 decent option?

You'll get a LOT better advice if you tell us more details.

For starters, have you bought the hardware yet?
Tell us about your design, how many initial storage nodes, how many
HDDs/SSDs per node, what CPUs/RAM/network?

What SSDs are we talking about? Exact models, please.
(Neither of the sizes you mentioned rings a bell for DC level SSDs I'm
aware of.)

That said, I'm using Intel DC S3700s for mixed OS and journal use with good
results.
In your average Ceph storage node, normal OS activity (mostly logging) is a
minute drop in the bucket for any decent SSD, so nearly all of its
resources are available to journals.

You want to match the number of journals per SSD according to the
capabilities of your SSD, HDDs and network.

For example, 8 HDD OSDs with 2 200GB DC S3700s and a 10Gb/s network is a
decent match.
The two SSDs at 900MB/s would appear to be the bottleneck, but in reality
I'd expect the HDDs to be it.
Never mind that you'd more likely be IOPS bound than bandwidth bound.
 
Regards,

Christian

 Thanks!
 
 Cameron Scrace
 Infrastructure Engineer
 
 Mobile +64 22 610 4629
 Phone  +64 4 462 5085 
 Email  cameron.scr...@solnet.co.nz
 Solnet Solutions Limited
 Level 12, Solnet House
 70 The Terrace, Wellington 6011
 PO Box 397, Wellington 6140
 
 www.solnet.co.nz
 
 
 
 From:   Somnath Roy somnath@sandisk.com
 To: cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz, 
 ceph-us...@ceph.com ceph-us...@ceph.com
 Date:   08/06/2015 09:34 a.m.
 Subject:RE: [ceph-users] Multiple journals and an OSD on one SSD 

 doable?
 
 
 
 Cameron,
 Generally, it’s not a good idea.
 You want to protect the SSDs used as journals: if anything goes wrong
 with that disk, you will lose all of the dependent OSDs.
 I don’t think a bigger journal will gain you much performance, so the
 default 5 GB journal size should be good enough. If you want to reduce
 the fault domain and still put 3 journals on an SSD, go for minimum-size,
 high-endurance SSDs for that.
 Now, if you want to use the rest of the space on the 1 TB SSDs, creating
 just OSDs there will not gain you much (though you may get some burst
 performance). You may want to consider the following.
 
 1. If your spindle OSD size is much bigger than 900 GB, you won’t be able
 to make all OSDs of similar sizes; a cache pool could be one of your
 options. But remember, a cache pool can wear out your SSDs faster, as
 presently I guess it is not optimizing the extra writes. Sorry, I don’t
 have exact data as I am yet to test that out.
 
 2. If you want to make all the OSDs of similar sizes, and you would be
 able to create a substantial number of OSDs with your unused SSDs
 (depending on how big the cluster is), you may want to put all of your
 primary OSDs on SSD and gain a significant performance boost for reads.
 Also, in this case, I don’t think you will be getting any burst
 performance.
 Thanks & Regards
 Somnath
 
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 

 cameron.scr...@solnet.co.nz
 Sent: Sunday, June 07, 2015 1:49 PM
 To: ceph-us...@ceph.com
 Subject: [ceph-users] Multiple journals and an OSD on one SSD doable?
 
 Setting up a Ceph cluster, and we want the journals for our spinning
 disks to be on SSDs, but all of our SSDs are 1TB. We were planning on
 putting 3 journals on each SSD, but that leaves 900+GB unused on the
 drive. Is it possible to use the leftover space as another OSD, or will
 it affect performance too much?

Re: [ceph-users] Multiple journals and an OSD on one SSD doable?

2015-06-07 Thread Cameron . Scrace
Thanks for all the feedback. 

What makes the EVOs unusable? They should have plenty of speed, but your
link has them at 1.9MB/s. Is it just the way they handle O_DIRECT and
D_SYNC?

Not sure if we will be able to spend any more; we may just have to take the
performance hit until we can get more money for the project.

Thanks,

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Christian Balzer ch...@gol.com
To: ceph-us...@ceph.com ceph-us...@ceph.com
Cc: cameron.scr...@solnet.co.nz
Date:   08/06/2015 02:00 p.m.
Subject:Re: [ceph-users] Multiple journals and an OSD on one SSD 
doable?




Cameron,

To offer at least some constructive advice here instead of just all doom
and gloom, here's what I'd do:

Replace the OS SSDs with 2 400GB Intel DC S3700s (or S3710s).
They have enough BW to nearly saturate your network.

Put all your journals on them (3 SSD OSDs and 3 HDD OSDs per).
While that's a bad move from a failure domain perspective, your budget
probably won't allow for anything better, and those are VERY reliable and,
just as important, durable SSDs.

This will give you the speed your current setup is capable of, probably
limited by the CPU when it comes to SSD pool operations.

Christian

On Mon, 8 Jun 2015 10:44:06 +0900 Christian Balzer wrote:

 
 Hello Cameron,
 
 On Mon, 8 Jun 2015 13:13:33 +1200 cameron.scr...@solnet.co.nz wrote:
 
  Hi Christian,
  
  Yes, we have purchased all our hardware; it was very hard to convince
  management/finance to approve it, so some of the stuff we have is a
  bit cheap.
  
 Unfortunate. Both the done deal and the cheapness. 
 
  We have four storage nodes, each with 6 x 6TB Western Digital Red SATA
  drives (WD60EFRX-68M), 6 x 1TB Samsung EVO 850 SSDs, and 2 x 250GB
  Samsung EVO 850s (for OS RAID).
  CPUs are Intel Atom C2750 @ 2.40GHz (8 cores) with 32 GB of RAM.
  We have a 10Gig network.
 
 I wish there was a nice way to say this, but it unfortunately boils down
 to a "You're fooked".
 
 There have been many discussions about which SSDs are usable with Ceph,
 very recently as well.
 Samsung EVOs (the non-DC type for sure) are basically unusable for
 journals. See the recent thread "Possible improvements for a slow write
 speed (excluding independent SSD journals)" and:
 
http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

 for reference.
 
 I presume your intention for the 1TB SSDs is for an SSD backed pool?
 Note that the EVOs have a pretty low (guaranteed) endurance, so aside
 from needing journal SSDs that actually can do the job, you're looking at
 wearing them out rather quickly (depending on your use case of course).
 
 Now with SSD based OSDs, or even HDD based OSDs with SSD journals, that
 CPU looks a bit anemic.
 
 More below:
  The two options we are considering are:
  
  1) Use two of the 1TB SSDs for the spinning disk journals (3 each) and
  then use the remaining 900+GB of each drive as an OSD to be part of
  the cache pool.
  
  2) Put the spinning disk journals on the OS SSDs and use the 2 1TB
  SSDs for the cache pool.
  
 Cache pools aren't all that speedy currently (research the ML archives),
 even less so with the SSDs you have.
 
 Christian
 
  In both cases the other 4 1TB SSDs will be part of their own tier.
  
  Thanks a lot!
  
  Cameron Scrace
  Infrastructure Engineer
  
  Mobile +64 22 610 4629
  Phone  +64 4 462 5085 
  Email  cameron.scr...@solnet.co.nz
  Solnet Solutions Limited
  Level 12, Solnet House
  70 The Terrace, Wellington 6011
  PO Box 397, Wellington 6140
  
  www.solnet.co.nz
  
  
  
  From:   Christian Balzer ch...@gol.com
  To: ceph-us...@ceph.com ceph-us...@ceph.com
  Cc: cameron.scr...@solnet.co.nz
  Date:   08/06/2015 12:18 p.m.
  Subject:Re: [ceph-users] Multiple journals and an OSD on one
  SSD doable?
  
  
  
  
  Hello,
  
  
  On Mon, 8 Jun 2015 09:55:56 +1200 cameron.scr...@solnet.co.nz wrote:
  
   The other option we were considering was putting the journals on the
   OS SSDs; they are only 250GB and the rest would be for the OS. Is
   that a decent option?
  
  You'll get a LOT better advice if you tell us more details.
  
  For starters, have you bought the hardware yet?
  Tell us about your design, how many initial storage nodes, how many
  HDDs/SSDs per node, what CPUs/RAM/network?
  
   What SSDs are we talking about? Exact models, please.
   (Neither of the sizes you mentioned rings a bell for DC level SSDs
   I'm aware of.)
  
  That said, I'm using Intel DC S3700s for mixed OS and journal use with
  good results.
  In your average Ceph storage node, normal OS activity (mostly logging)
  is a minute drop in the bucket for any decent SSD, so nearly all of its
  resources are available to journals.

Re: [ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)

2015-06-03 Thread Cameron . Scrace
It most likely is the model of switch. In its settings the minimum frame
size you can set is 1518 and the default MTU is 1500; it seems the switch
wants the 18-byte difference.
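
(That matches Ethernet framing: max frame size = MTU + 14-byte header +
4-byte FCS = MTU + 18. A quick way to verify what actually passes, with the
address and MTU assumed:)

# 8192 MTU - 20 (IP header) - 8 (ICMP header) = 8164-byte ping payload;
# -M do sets don't-fragment, so an oversized frame fails instead of hiding.
ping -M do -s 8164 10.1.226.65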

We are using a pair of Netgear XS712Ts and bonded pairs of Intel
10-Gigabit X540-AT2 (rev 01) NICs with 3 VLANs.

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Somnath Roy somnath@sandisk.com
To: cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz, Jan 
Schermer j...@schermer.cz
Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com
Date:   04/06/2015 11:13 a.m.
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic)



Hmm… thanks for sharing this.
Any chance it depends on the switch?
Could you please share what NIC card and switch you are using?
 
Thanks & Regards
Somnath
 
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz] 
Sent: Wednesday, June 03, 2015 4:07 PM
To: Somnath Roy; Jan Schermer
Cc: ceph-users@lists.ceph.com; ceph-users
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)
 
The interface MTU has to be 18 or more bytes lower than the switch MTU or 
it just stops working. As far as I know the monitor communication is not 
being encapsulated by any SDN. 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz 



From:Somnath Roy somnath@sandisk.com 
To:Jan Schermer j...@schermer.cz, cameron.scr...@solnet.co.nz 
cameron.scr...@solnet.co.nz 
Cc:ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com 
Date:04/06/2015 02:58 a.m. 
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic) 




The TCP_NODELAY issue was with kernel rbd, *not* with the OSD. The Ceph
messenger codebase sets it by default.
BTW, I doubt TCP_NODELAY has anything to do with it.
  
Thanks & Regards
Somnath 
  
From: Jan Schermer [mailto:j...@schermer.cz] 
Sent: Wednesday, June 03, 2015 1:37 AM
To: cameron.scr...@solnet.co.nz
Cc: Somnath Roy; ceph-users@lists.ceph.com; ceph-users
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic) 
 
Interface and switch should have the same MTU, and that should not cause
any issues (setting the switch MTU higher is always safe, though).
Aren’t you encapsulating the mon communication in some SDN like
Open vSwitch? Is that a straight L2 connection?
 
I think this is worth investigating. For example, are the mons properly
setting TCP_NODELAY on the sockets that are latency sensitive? (I just
tried finding out, and lsof/netstat doesn’t report that to me; I’d need to
restart and strace it… I vaguely remember there was an issue with NODELAY
that was fixed on the OSD side.)
 
Jan 
 
 
On 03 Jun 2015, at 06:30, cameron.scr...@solnet.co.nz wrote: 
  
Seems to be something to do with our switch. If the interface MTU is too 
close to the switch MTU it stops working. Thanks for all your help :) 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz 



From:Somnath Roy somnath@sandisk.com 
To:cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz 
Cc:ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com, Joao Eduardo Luis 
j...@suse.de 
Date:03/06/2015 11:49 a.m. 
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic) 





I doubt this has anything to do with Ceph. I hope you checked that your
switch supports jumbo frames and that you have set MTU 9000 on all the
devices in between. It’s better to ping your devices (all the devices
participating in the cluster) the way it is described in the following
articles, just in case you are not sure.
 
http://www.mylesgray.com/hardware/test-jumbo-frames-working/ 
http://serverfault.com/questions/234311/testing-whether-jumbo-frames-are-actually-working
 

 
Hope this helps, 
 
Thanks & Regards
Somnath 
 
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz] 
Sent: Tuesday, June 02, 2015 4:32 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic) 
 
Setting the MTU to 1500 worked; the monitors reach quorum right away.

[ceph-users] ceph-deploy osd prepare/activate failing with journal on raid device.

2015-06-03 Thread Cameron . Scrace
I'm trying to set up some OSDs, and if I try to use a raid device for the
journal disk it fails: http://pastebin.com/mTw6xzNV

The main issue I see is that the symlink in /dev/disk/by-partuuid is not
being made correctly. When I make it manually and try to activate, I still
get errors; it seems to think that the journal and OSD are meant to be on
the same drive: http://pastebin.com/CEk8Teys
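
(The manual workaround is presumably along these lines; a sketch only, the
md device, partition number, and GUID extraction are assumptions:)

# Recreate the by-partuuid symlink that udev fails to make for md partitions.
uuid=$(sgdisk -i 1 /dev/md0 | awk '/unique GUID/ {print tolower($NF)}')
ln -s /dev/md0p1 "/dev/disk/by-partuuid/$uuid"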

Has anyone had this issue before, or any suggestions on how to fix it?
When I use a non-raided device it works perfectly.

Thanks,

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz


Re: [ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)

2015-06-02 Thread Cameron . Scrace
Thanks for the links; jumbo frames are definitely working, although we had
to set the MTU to 8192 because one of the components doesn't support an
MTU higher than that.

Thanks for the help. Looks like we may just have to deal with jumbo frames 
being off.

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Somnath Roy somnath@sandisk.com
To: cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz
Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com, Joao Eduardo Luis 
j...@suse.de
Date:   03/06/2015 11:49 a.m.
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic)



I doubt this has anything to do with Ceph. I hope you checked that your
switch supports jumbo frames and that you have set MTU 9000 on all the
devices in between. It’s better to ping your devices (all the devices
participating in the cluster) the way it is described in the following
articles, just in case you are not sure.
 
http://www.mylesgray.com/hardware/test-jumbo-frames-working/
http://serverfault.com/questions/234311/testing-whether-jumbo-frames-are-actually-working
 
Hope this helps,
 
Thanks & Regards
Somnath
 
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz] 
Sent: Tuesday, June 02, 2015 4:32 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)
 
Setting the MTU to 1500 worked; the monitors reach quorum right away.
Unfortunately we really want Jumbo Frames to be on. Any ideas on how to
get Ceph to work with them on?

Thanks! 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz 



From:Somnath Roy somnath@sandisk.com 
To:cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz 
Cc:ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com, Joao Eduardo Luis 
j...@suse.de 
Date:03/06/2015 10:34 a.m. 
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic) 




We have seen some communication issues with that; try making all the
servers MTU 1500 and try again…
  
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz] 
Sent: Tuesday, June 02, 2015 3:31 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic) 
  
We are running with Jumbo Frames turned on. Is that likely to be the 
issue? Do I need to configure something in ceph? 

The mon maps are fine and after setting debug to 10 and debug ms to 1, I 
see probe timeouts in the logs: http://pastebin.com/44M1uJZc 
I just set probe timeout to 10 (up from 2) and it still times out. 

Thanks! 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz 



From:Somnath Roy somnath@sandisk.com 
To:Joao Eduardo Luis j...@suse.de, ceph-users@lists.ceph.com 
ceph-users@lists.ceph.com 
Date:03/06/2015 03:49 a.m. 
Subject:Re: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic) 
Sent by:ceph-users ceph-users-boun...@lists.ceph.com 





By any chance, are you running with jumbo frames turned on?

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Joao Eduardo Luis
Sent: Tuesday, June 02, 2015 12:52 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)

On 06/02/2015 01:42 AM, cameron.scr...@solnet.co.nz wrote:
 I am trying to deploy a new ceph cluster and my monitors are not
 reaching quorum. SELinux is off, firewalls are off, and I can see traffic
 between the nodes on port 6789, but when I use the admin socket to
 force a re-election, only the monitor I send the request to shows the
 new election in its logs. My logs are filled entirely with the following
 two lines:

 2015-06-02 11:31:56.447975 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd='mon_status' args=[]:
 dispatch
 2015-06-02 11:31:56.448272 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd

Re: [ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)

2015-06-02 Thread Cameron . Scrace
Setting the MTU to 1500 worked; the monitors reach quorum right away.
Unfortunately we really want Jumbo Frames to be on. Any ideas on how to
get Ceph to work with them on?

Thanks!

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Somnath Roy somnath@sandisk.com
To: cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz
Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com, Joao Eduardo Luis 
j...@suse.de
Date:   03/06/2015 10:34 a.m.
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic)



We have seen some communication issues with that; try making all the
servers MTU 1500 and try again…
 
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz] 
Sent: Tuesday, June 02, 2015 3:31 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)
 
We are running with Jumbo Frames turned on. Is that likely to be the 
issue? Do I need to configure something in ceph? 

The mon maps are fine and after setting debug to 10 and debug ms to 1, I 
see probe timeouts in the logs: http://pastebin.com/44M1uJZc 
I just set probe timeout to 10 (up from 2) and it still times out. 

Thanks! 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz 



From:Somnath Roy somnath@sandisk.com 
To:Joao Eduardo Luis j...@suse.de, ceph-users@lists.ceph.com 
ceph-users@lists.ceph.com 
Date:03/06/2015 03:49 a.m. 
Subject:Re: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic) 
Sent by:ceph-users ceph-users-boun...@lists.ceph.com 




By any chance, are you running with jumbo frames turned on?

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Joao Eduardo Luis
Sent: Tuesday, June 02, 2015 12:52 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)

On 06/02/2015 01:42 AM, cameron.scr...@solnet.co.nz wrote:
 I am trying to deploy a new ceph cluster and my monitors are not
 reaching quorum. SELinux is off, firewalls are off, and I can see traffic
 between the nodes on port 6789, but when I use the admin socket to
 force a re-election, only the monitor I send the request to shows the
 new election in its logs. My logs are filled entirely with the following
 two lines:

 2015-06-02 11:31:56.447975 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd='mon_status' args=[]:
 dispatch
 2015-06-02 11:31:56.448272 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd=mon_status args=[]:
 finished

You are running on default debug levels, so you'll hardly get anything 
more than that.  I suggest setting 'debug mon = 10' and 'debug ms = 1'
for added verbosity and come back to us with the logs.
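
(Since the mons are out of quorum, the admin socket is the practical way to
raise these at runtime; a sketch, using the mon name from the output below:)

ceph daemon mon.wcm1 config set debug_mon 10
ceph daemon mon.wcm1 config set debug_ms 1
# or persistently in ceph.conf, under [mon]:
#   debug mon = 10
#   debug ms = 1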

There are many reasons for this, but the more common ones are due to the
monitors not being able to communicate with each other.  Given you see
traffic between the monitors, I'm inclined to assume that the other two
monitors do not have each other in the monmap or, if they do know each
other, either 1) the monitors' auth keys do not match, or 2) the probe
timeout is being triggered before they successfully manage to find enough
monitors to trigger an election -- which may be due to latency.

Logs will tell us more.

 -Joao

 Querying the admin socket with mon_status (the other two are similar
 but with their hostnames and ranks):

 {
     "name": "wcm1",
     "rank": 0,
     "state": "probing",
     "election_epoch": 1,
     "quorum": [],
     "outside_quorum": [
         "wcm1"
     ],
     "extra_probe_peers": [],
     "sync_provider": [],
     "monmap": {
         "epoch": 0,
         "fsid": "adb8c500-122e-49fd-9c1e-a99af7832307",
         "modified": "2015-06-02 10:43:41.467811",
         "created": "2015-06-02 10:43:41.467811",
         "mons": [
             {
                 "rank": 0,
                 "name": "wcm1",
                 "addr": "10.1.226.64:6789\/0"
             },
             {
                 "rank": 1,
                 "name": "wcm2",
                 "addr": "10.1.226.65:6789\/0"
             },
             {
                 "rank": 2,
                 "name": "wcm3",
                 "addr": "10.1.226.66:6789\/0"
             }
         ]
     }
 }


Re: [ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)

2015-06-02 Thread Cameron . Scrace
We are running with Jumbo Frames turned on. Is that likely to be the 
issue? Do I need to configure something in ceph?

The mon maps are fine and after setting debug to 10 and debug ms to 1, I 
see probe timeouts in the logs: http://pastebin.com/44M1uJZc
I just set probe timeout to 10 (up from 2) and it still times out.
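
(Presumably raised via something like this in ceph.conf; a sketch, with 2
seconds being the default referred to:)

[mon]
mon probe timeout = 10    # seconds; default 2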

Thanks!

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Somnath Roy somnath@sandisk.com
To: Joao Eduardo Luis j...@suse.de, ceph-users@lists.ceph.com 
ceph-users@lists.ceph.com
Date:   03/06/2015 03:49 a.m.
Subject:Re: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic)
Sent by:ceph-users ceph-users-boun...@lists.ceph.com



By any chance, are you running with jumbo frames turned on?

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Joao Eduardo Luis
Sent: Tuesday, June 02, 2015 12:52 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)

On 06/02/2015 01:42 AM, cameron.scr...@solnet.co.nz wrote:
 I am trying to deploy a new ceph cluster and my monitors are not
 reaching quorum. SELinux is off, firewalls are off, and I can see traffic
 between the nodes on port 6789, but when I use the admin socket to
 force a re-election, only the monitor I send the request to shows the
 new election in its logs. My logs are filled entirely with the following
 two lines:

 2015-06-02 11:31:56.447975 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd='mon_status' args=[]:
 dispatch
 2015-06-02 11:31:56.448272 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd=mon_status args=[]:
 finished

You are running on default debug levels, so you'll hardly get anything 
more than that.  I suggest setting 'debug mon = 10' and 'debug ms = 1'
for added verbosity and come back to us with the logs.

There are many reasons for this, but the more common ones are due to the
monitors not being able to communicate with each other.  Given you see
traffic between the monitors, I'm inclined to assume that the other two
monitors do not have each other in the monmap or, if they do know each
other, either 1) the monitors' auth keys do not match, or 2) the probe
timeout is being triggered before they successfully manage to find enough
monitors to trigger an election -- which may be due to latency.

Logs will tell us more.

  -Joao

 Querying the admin socket with mon_status (the other two are similar
 but with their hostnames and ranks):

 {
     "name": "wcm1",
     "rank": 0,
     "state": "probing",
     "election_epoch": 1,
     "quorum": [],
     "outside_quorum": [
         "wcm1"
     ],
     "extra_probe_peers": [],
     "sync_provider": [],
     "monmap": {
         "epoch": 0,
         "fsid": "adb8c500-122e-49fd-9c1e-a99af7832307",
         "modified": "2015-06-02 10:43:41.467811",
         "created": "2015-06-02 10:43:41.467811",
         "mons": [
             {
                 "rank": 0,
                 "name": "wcm1",
                 "addr": "10.1.226.64:6789\/0"
             },
             {
                 "rank": 1,
                 "name": "wcm2",
                 "addr": "10.1.226.65:6789\/0"
             },
             {
                 "rank": 2,
                 "name": "wcm3",
                 "addr": "10.1.226.66:6789\/0"
             }
         ]
     }
 }


Re: [ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)

2015-06-02 Thread Cameron . Scrace
Seems to be something to do with our switch. If the interface MTU is too 
close to the switch MTU it stops working. Thanks for all your help :)

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz



From:   Somnath Roy somnath@sandisk.com
To: cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz
Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com, Joao Eduardo Luis 
j...@suse.de
Date:   03/06/2015 11:49 a.m.
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic)



I doubt this has anything to do with Ceph. I hope you checked that your
switch supports jumbo frames and that you have set MTU 9000 on all the
devices in between. It’s better to ping your devices (all the devices
participating in the cluster) the way it is described in the following
articles, just in case you are not sure.
 
http://www.mylesgray.com/hardware/test-jumbo-frames-working/
http://serverfault.com/questions/234311/testing-whether-jumbo-frames-are-actually-working
 
Hope this helps,
 
Thanks & Regards
Somnath
 
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz] 
Sent: Tuesday, June 02, 2015 4:32 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: RE: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)
 
Setting the MTU to 1500 worked; the monitors reach quorum right away.
Unfortunately we really want Jumbo Frames to be on. Any ideas on how to
get Ceph to work with them on?

Thanks! 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz 



From:Somnath Roy somnath@sandisk.com 
To:cameron.scr...@solnet.co.nz cameron.scr...@solnet.co.nz 
Cc:ceph-users@lists.ceph.com ceph-users@lists.ceph.com, 
ceph-users ceph-users-boun...@lists.ceph.com, Joao Eduardo Luis 
j...@suse.de 
Date:03/06/2015 10:34 a.m. 
Subject:RE: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic) 




We have seen some communication issues with that; try making all the
servers MTU 1500 and try again…
  
From: cameron.scr...@solnet.co.nz [mailto:cameron.scr...@solnet.co.nz] 
Sent: Tuesday, June 02, 2015 3:31 PM
To: Somnath Roy
Cc: ceph-users@lists.ceph.com; ceph-users; Joao Eduardo Luis
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic) 
  
We are running with Jumbo Frames turned on. Is that likely to be the 
issue? Do I need to configure something in ceph? 

The mon maps are fine and after setting debug to 10 and debug ms to 1, I 
see probe timeouts in the logs: http://pastebin.com/44M1uJZc 
I just set probe timeout to 10 (up from 2) and it still times out. 

Thanks! 

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz 



From:Somnath Roy somnath@sandisk.com 
To:Joao Eduardo Luis j...@suse.de, ceph-users@lists.ceph.com 
ceph-users@lists.ceph.com 
Date:03/06/2015 03:49 a.m. 
Subject:Re: [ceph-users] Monitors not reaching quorum. (SELinux 
off, IPtables off, can see tcp traffic) 
Sent by:ceph-users ceph-users-boun...@lists.ceph.com 





By any chance, are you running with jumbo frames turned on?

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Joao Eduardo Luis
Sent: Tuesday, June 02, 2015 12:52 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Monitors not reaching quorum. (SELinux off, 
IPtables off, can see tcp traffic)

On 06/02/2015 01:42 AM, cameron.scr...@solnet.co.nz wrote:
 I am trying to deploy a new ceph cluster and my monitors are not
 reaching quorum. SELinux is off, firewalls are off, and I can see traffic
 between the nodes on port 6789, but when I use the admin socket to
 force a re-election, only the monitor I send the request to shows the
 new election in its logs. My logs are filled entirely with the following
 two lines:

 2015-06-02 11:31:56.447975 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd='mon_status' args=[]:
 dispatch
 2015-06-02 11:31:56.448272 7f795b17a700  0 log_channel(audit) log
 [DBG]
 : from='admin socket' entity='admin socket' cmd=mon_status args=[]:
 finished

You are running on default debug levels, so you'll hardly get anything 
more than that.

[ceph-users] Monitors not reaching quorum. (SELinux off, IPtables off, can see tcp traffic)

2015-06-01 Thread Cameron . Scrace
I am trying to deploy a new ceph cluster and my monitors are not reaching
quorum. SELinux is off, firewalls are off, and I can see traffic between
the nodes on port 6789, but when I use the admin socket to force a
re-election, only the monitor I send the request to shows the new election
in its logs. My logs are filled entirely with the following two lines:

2015-06-02 11:31:56.447975 7f795b17a700  0 log_channel(audit) log [DBG] : 
from='admin socket' entity='admin socket' cmd='mon_status' args=[]: 
dispatch
2015-06-02 11:31:56.448272 7f795b17a700  0 log_channel(audit) log [DBG] : 
from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished

Querying the admin socket with mon_status (the other two are similar
but with their hostnames and ranks):

{
    "name": "wcm1",
    "rank": 0,
    "state": "probing",
    "election_epoch": 1,
    "quorum": [],
    "outside_quorum": [
        "wcm1"
    ],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": "adb8c500-122e-49fd-9c1e-a99af7832307",
        "modified": "2015-06-02 10:43:41.467811",
        "created": "2015-06-02 10:43:41.467811",
        "mons": [
            {
                "rank": 0,
                "name": "wcm1",
                "addr": "10.1.226.64:6789\/0"
            },
            {
                "rank": 1,
                "name": "wcm2",
                "addr": "10.1.226.65:6789\/0"
            },
            {
                "rank": 2,
                "name": "wcm3",
                "addr": "10.1.226.66:6789\/0"
            }
        ]
    }
}
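
(Queried via the admin socket; presumably something like the following,
where the socket path is the stock default and an assumption:)

ceph daemon mon.wcm1 mon_status
# equivalent long form:
ceph --admin-daemon /var/run/ceph/ceph-mon.wcm1.asok mon_status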

Any suggestions on what could be the issue?

Regards,

Cameron Scrace
Infrastructure Engineer

Mobile +64 22 610 4629
Phone  +64 4 462 5085 
Email  cameron.scr...@solnet.co.nz
Solnet Solutions Limited
Level 12, Solnet House
70 The Terrace, Wellington 6011
PO Box 397, Wellington 6140

www.solnet.co.nz

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com