[ceph-users] Incomplete MON removal
Ceph newbie here; Ceph 0.94.2, CentOS 6.6 x86_64, kernel 2.6.32. Initial test cluster of five OSD nodes, 3 MONs, 1 MDS. Working well.

I was testing the removal of two MONs, just to see how it works. The second MON was stopped and removed: no problems. The third MON was stopped and removed: apparently no problems, and ceph told me that only one MON remained. However, "ceph -s", along with many other commands, now hangs for 5 minutes and then gives me an authentication timeout. On the initial MON node, anderson, I get:

  # ceph daemon mon.anderson mon_status
  {
      "name": "anderson",
      "rank": 1,
      "state": "probing",
      "election_epoch": 0,
      "quorum": [],
      "outside_quorum": [
          "anderson"
      ],
      "extra_probe_peers": [],
      "sync_provider": [],
      "monmap": {
          "epoch": 4,
          "fsid": "b9aeb134-fe63-46b4-a939-152a6c188f6a",
          "modified": "2015-07-07 17:18:02.816853",
          "created": "0.00",
          "mons": [
              {
                  "rank": 0,
                  "name": "benford",
                  "addr": "10.22.200.13:6789\/0"
              },
              {
                  "rank": 1,
                  "name": "anderson",
                  "addr": "10.22.200.16:6789\/0"
              }
          ]
      }
  }

So, no quorum. Here benford is the third MON, which I had already removed; it is still listed in the monmap, so the removal, which initially appeared to work, evidently did not complete fully. I cannot start a MON on benford, however ("mon.benford not present in monmap"), and I cannot start the OSDs on any node.

How do I recover from this situation?

Steve
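One documented recovery path for this state (the procedure for removing monitors from an unhealthy cluster) is to edit the monmap by hand: stop the surviving monitor, extract its monmap, remove the stale entry, and inject the edited map back. A sketch, assuming the monitor names from the post and a scratch file at /tmp/monmap:

  # stop the last surviving monitor
  service ceph stop mon.anderson

  # extract the current monmap from its store
  ceph-mon -i anderson --extract-monmap /tmp/monmap

  # drop the monitor that was supposed to be gone
  monmaptool /tmp/monmap --rm benford

  # inject the edited map and bring the monitor back up
  ceph-mon -i anderson --inject-monmap /tmp/monmap
  service ceph start mon.anderson

With benford gone from the map, anderson forms a quorum of one, "ceph -s" should respond again, and the OSDs can then be started normally.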
Re: [ceph-users] Deadly slow Ceph cluster revisited
On Fri, 17 Jul 2015, J David wrote:
> f16 inbound: 6Gbps
> f16 outbound: 6Gbps
> f17 inbound: 6Gbps
> f17 outbound: 6Gbps
> f18 inbound: 6Gbps
> f18 outbound: 1.2Mbps

Unless the network was very busy when you ran this, I think that 6 Gb/s may not be very good either; iperf will usually give you much more than that on 10GbE. For example, between two of my OSD nodes I get 9.4 Gb/s, or up to 9.9 Gb/s when nothing else is happening.

Steve
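To measure each direction independently, run an iperf server on one end and point the client at it from the other; a sketch, with hostnames assumed from the quoted figures:

  # on f16:
  iperf -s

  # on f18, to test f18 -> f16 (the slow direction above):
  iperf -c f16 -t 30

  # then swap server and client to test f16 -> f18;
  # parallel streams can help saturate a 10GbE link:
  iperf -c f16 -t 30 -P 4

A 6 Gb/s single-stream result can simply mean one TCP stream is not filling the pipe, but 1.2 Mb/s in one direction only usually points at a bad cable, port, or negotiated speed/duplex somewhere on that path.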
[ceph-users] Ceph experiences
Ceph newbie (three weeks). Ceph 0.94.2, CentOS 6.6 x86_64, kernel 2.6.32. Twelve identical OSDs (1 TB each), three MONs, one active MDS and two standby MDSs. 10GbE cluster network, 1GbE public network. Using CephFS on a single client via the 4.1.1 kernel from elrepo; using rsync to copy data to the Ceph file system (mostly small files). Only one client (me). All set up with ceph-deploy.

For this test setup, the OSDs are spread across two quad-core 3.16GHz hosts with 16GB memory each: six OSDs per node. Journals are on the OSD drives for now. The two hosts are not user-accessible, and so are doing mostly OSD duty only (they also carry some light-duty iSCSI targets).

First surprise: I have noticed that the OSD drives do not fill at the same rate. For example, when the Ceph file system was 71% full, one OSD went into the full state at 95%, while another OSD was only 51% full, and another 60%.

Second surprise: one full OSD results in ENOSPC for *all* writes, even though there is plenty of space available on other OSDs.

I marked the full OSD as out to attempt a rebalance ("ceph osd out osd.0"). This appeared to be working, albeit very slowly, so I stopped client writes.

Third surprise: when I restarted client writes after about an hour, data was still being written to the full OSD, but the full condition was no longer recognized; it reached 96% before I stopped the client writes once more. That was yesterday evening; today it is down to 91%. The file system is not going to be usable until the rebalance completes (which looks like taking days).

I did not expect any of this. Any thoughts?

Steve

--
----------------------------------------------------------------------------
Steve Thompson                 E-mail:      smt AT vgersoft DOT com
Voyager Software LLC           Web:         http://www DOT vgersoft DOT com
39 Smugglers Path              VSW Support: support AT vgersoft DOT com
Ithaca, NY 14850
   "186,282 miles per second: it's not just a good idea, it's the law"
----------------------------------------------------------------------------
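On Hammer, the per-OSD utilization spread can be inspected and nudged with reweighting rather than marking the full OSD out; a sketch (the 110 threshold and 0.85 weight are illustrative values, not recommendations):

  # show per-OSD utilization (available since Hammer)
  ceph osd df

  # reduce the weight of any OSD more than 10% above the mean
  # utilization, so data migrates off it
  ceph osd reweight-by-utilization 110

  # or reweight a single overloaded OSD by hand
  ceph osd reweight 0 0.85

  # as a temporary escape hatch only: raise the full ratio a
  # little so writes can resume while the rebalance proceeds
  ceph pg set_full_ratio 0.97

Unlike "ceph osd out", which drains the OSD completely, reweighting keeps it serving data while shrinking its share, so it tends to trigger much less data movement.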