[ceph-users] memory stats

2015-10-05 Thread Serg M
What is the difference between the memory statistics reported by "ceph tell {daemon}.{id} heap stats", ps aux | grep "ceph-", and "ceph {daemon} perf dump mds_mem"? Which is better to use for monitoring a cluster? Or is there a better way to check the memory usage of Ceph daemons more precisely, perhaps with the help of Python scripts?
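As a starting point, here is a minimal Python sketch that puts two of those views side by side: the resident set size from /proc (what ps reports) and the daemon's own counters read over the admin socket. The daemon name, the use of pgrep, and admin-socket access are assumptions for the example; run it on the host where the daemon lives.

    import json
    import subprocess

    DAEMON = "mds.a"  # placeholder daemon type.id; adjust for your cluster

    def rss_from_proc(daemon):
        # Resident set size in kB, i.e. what ps aux shows as RSS.
        pid = subprocess.check_output(
            ["pgrep", "-o", "-f", "ceph-" + daemon.split(".")[0]]).split()[0].decode()
        with open("/proc/%s/status" % pid) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])

    def perf_dump(daemon):
        # Counters the daemon reports about itself via its admin socket.
        out = subprocess.check_output(["ceph", "daemon", daemon, "perf", "dump"])
        return json.loads(out.decode())

    if __name__ == "__main__":
        print("VmRSS (kB):", rss_from_proc(DAEMON))
        print("mds_mem counters:", perf_dump(DAEMON).get("mds_mem", {}))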

Re: [ceph-users] Correct method to deploy on jessie

2015-10-05 Thread Dmitry Ogorodnikov
Good day, I think I will use wheezy for now for tests. The bad thing is that full support for wheezy ends in 5 months, so wheezy is not OK for a long-lived production cluster. I can't find out what the Ceph team offers Debian users: move to another distro? Is there any 'official' answer?.. Best regards, Dmitry 02 о

Re: [ceph-users] A tiny question about the object id

2015-10-05 Thread Mark Kirkwood
If you look at the rados API (e.g. http://docs.ceph.com/docs/master/rados/api/python/), there is no explicit call for the object id; the closest is the 'key', which is actually the object's name. If you are using the Python bindings you can see this by calling dir() on a rados object and look
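To make that concrete, here is a small sketch with the Python bindings; the pool name "data", the object name, and the conffile path are assumptions for the example.

    import rados

    # Connect with the default config; adjust conffile and pool for your cluster.
    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx("data")

    ioctx.write_full("myobject", b"hello")

    for obj in ioctx.list_objects():
        # dir(obj) lists what the binding exposes; 'key' holds the object's name,
        # and there is no separate numeric object id attribute.
        print(obj.key, dir(obj))
        break

    ioctx.close()
    cluster.shutdown()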

Re: [ceph-users] Potential OSD deadlock?

2015-10-05 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 With some off-list help, we have adjusted osd_client_message_cap=1. This seems to have helped a bit and we have seen some OSDs have a value up to 4,000 for client messages. But it does not solve the problem with the blocked I/O. One thing that I
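For reference, that kind of runtime adjustment can be applied through injectargs; a sketch using Python's subprocess follows. The value shown is purely a placeholder, not a recommendation, and a matching entry under [osd] in ceph.conf is needed to make it survive restarts.

    import subprocess

    # Inject a new client message cap into every OSD at runtime.
    # The value below is illustrative only; pick one appropriate for your cluster.
    subprocess.check_call([
        "ceph", "tell", "osd.*", "injectargs",
        "--osd_client_message_cap=10000",
    ])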

[ceph-users] Seeing huge number of open pipes per OSD process

2015-10-05 Thread Eric Eastman
I am testing a Ceph cluster running Ceph v9.0.3 on Trusty using the 4.3rc4 kernel, and I am seeing a huge number of open pipes on my OSD processes as I run a sequential load on the system using a single Ceph file system client. An "lsof -n > file.txt" on one of the OSD servers produced a 9GB file wi
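As a lighter-weight way to watch this than a multi-gigabyte lsof dump, here is a small Python sketch (run as root on an OSD host; only the /proc layout is assumed) that counts pipe file descriptors per ceph-osd process:

    import glob
    import os

    # Count pipe file descriptors held by each ceph-osd process.
    for comm in glob.glob("/proc/[0-9]*/comm"):
        pid = comm.split("/")[2]
        try:
            with open(comm) as f:
                if f.read().strip() != "ceph-osd":
                    continue
            fds = glob.glob("/proc/%s/fd/*" % pid)
            pipes = sum(1 for fd in fds if os.readlink(fd).startswith("pipe:"))
            print("pid %s: %d open fds, %d pipes" % (pid, len(fds), pipes))
        except (IOError, OSError):
            # Process exited or an fd went away while we were looking; skip it.
            continue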

[ceph-users] Can't mount cephfs to host outside of cluster

2015-10-05 Thread Egor Kartashov
Hello! I have a cluster of 3 machines with ceph 0.80.10 (the package shipped with Ubuntu Trusty). Ceph successfully mounts on all of them. On an external machine I'm receiving the error "can't read superblock" and dmesg shows records like: [1485389.625036] libceph: mon0 [...]:6789 socket closed (con state CON
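For reference, a sketch of the usual kernel-client mount driven from Python; the monitor address, mount point, and secret file are placeholders for the example.

    import subprocess

    MON = "192.168.0.1:6789"                  # placeholder monitor address
    MOUNTPOINT = "/mnt/cephfs"                # placeholder mount point
    SECRETFILE = "/etc/ceph/admin.secret"     # file holding the client key (placeholder)

    # Equivalent to:
    #   mount -t ceph 192.168.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
    subprocess.check_call([
        "mount", "-t", "ceph", MON + ":/", MOUNTPOINT,
        "-o", "name=admin,secretfile=" + SECRETFILE,
    ])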

Re: [ceph-users] Read performance in VMs

2015-10-05 Thread Nick Fisk
You will want to increase your readahead setting, but please see the recent thread regarding this, as readahead has effectively been rendered useless for the last couple of years. You may also be able to tune the librbd readahead settings, but it would make more sense to do it in the VM if it is a
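As an illustration of the in-guest approach, here is a short sketch (the device name and the new value are assumptions; readahead is given in 512-byte sectors) that checks and raises the block-device readahead with blockdev:

    import subprocess

    DEV = "/dev/vda"    # assumed virtio disk name inside the guest
    NEW_RA = "4096"     # readahead in 512-byte sectors; value is illustrative only

    # Show the current readahead, then raise it for the running system.
    current = subprocess.check_output(["blockdev", "--getra", DEV]).strip()
    print("current readahead (sectors):", current.decode())
    subprocess.check_call(["blockdev", "--setra", NEW_RA, DEV])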

[ceph-users] Read performance in VMs

2015-10-05 Thread Martin Bureau
Hello, Is there a way to improve sequential reads in a VM? My understanding is that a read will be served from a single OSD at a time, so it can't be faster than the drive that OSD is running on. Is that correct? Are there any ways to improve this? Thanks for any answers, Martin

Re: [ceph-users] CephFS "corruption" -- Nulled bytes

2015-10-05 Thread Sage Weil
On Mon, 5 Oct 2015, Adam Tygart wrote: > Okay, this has happened several more times. Always seems to be a small > file that should be read-only (perhaps simultaneously) on many > different clients. It is just through the cephfs interface that the > files are corrupted, the objects in the cachepool

Re: [ceph-users] CephFS "corruption" -- Nulled bytes

2015-10-05 Thread Adam Tygart
Okay, this has happened several more times. It always seems to be a small file that should be read-only (perhaps simultaneously) on many different clients. It is only through the cephfs interface that the files are corrupted; the objects in the cachepool and erasure coded pool are still correct. I am
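One way to double-check that kind of mismatch is to compare the bytes a CephFS mount returns with the first backing object in RADOS. A hedged sketch follows; it assumes the default file layout (first object named <inode-hex>.00000000, 4 MB object size), a data pool called "cephfs_data", a small file, and the standard conffile path.

    import os
    import rados

    PATH = "/mnt/cephfs/somefile"    # file as seen through the CephFS mount (placeholder)
    POOL = "cephfs_data"             # data pool name is an assumption
    OBJ_SIZE = 4 * 1024 * 1024       # default CephFS object size

    # First backing object under the default layout: <inode in hex>.00000000
    objname = "%x.%08x" % (os.stat(PATH).st_ino, 0)

    with open(PATH, "rb") as f:
        via_cephfs = f.read(OBJ_SIZE)

    cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
    cluster.connect()
    ioctx = cluster.open_ioctx(POOL)
    via_rados = ioctx.read(objname, OBJ_SIZE)
    ioctx.close()
    cluster.shutdown()

    print("object:", objname)
    print("match" if via_cephfs == via_rados
          else "MISMATCH between cephfs read and RADOS object")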

Re: [ceph-users] Write barriers, controller cache and disk cache.

2015-10-05 Thread Jan Schermer
The controller might do several different things; you should first test how it behaves. Does it flush data from cache to disk when instructed to? Most controllers ignore flushes in writeback mode (that is why they are so fast), which is both bad and good: bad when the controller dies but good f
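A rough way to test that is to time a burst of small synchronous writes: a spinning disk that really flushes can only sustain on the order of one to two hundred fsyncs per second, so thousands per second suggest the controller is acknowledging flushes out of its cache. A minimal sketch, assuming a throwaway file on the storage under test:

    import os
    import time

    PATH = "/mnt/test/flushtest.bin"   # throwaway file on the device being tested (placeholder)
    WRITES = 200

    fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o600)
    start = time.time()
    for _ in range(WRITES):
        os.write(fd, b"x" * 4096)
        os.fsync(fd)                   # ask the whole stack to flush to stable storage
    os.close(fd)
    elapsed = time.time() - start

    print("%d fsyncs in %.2fs (%.0f per second)" % (WRITES, elapsed, WRITES / elapsed))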

[ceph-users] Write barriers, controller cache and disk cache.

2015-10-05 Thread Frédéric Nass
Hello, We are building a new Ceph cluster and have a few questions regarding the use of write barriers, controller cache, and disk cache (buffer). Greg said that barriers should be used (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-July/002854.html) for data safety, which is the de

Re: [ceph-users] Potential OSD deadlock?

2015-10-05 Thread Josef Johansson
Hi, Looking over disks etc. and comparing to our setup, we have slightly different hardware, but it should be comparable. We are running Hitachi 4TB (HUS724040AL), Intel DC S3700 and SAS3008 instead. In our old cluster (almost the same hardware in new and old) we have overloaded the cluster and had to wait

Re: [ceph-users] Fwd: warn in a mds log(pipe/.fault, server, going to standby)

2015-10-05 Thread Jan Schermer
I don't think this is an issue? The log entry probably means that the client (your monitoring script/client) disconnected. No harm in that? It might indicate the connection was not closed cleanly, though... Jan > On 05 Oct 2015, at 11:48, Serg M wrote: > > Greetings! > i got some problem whic

[ceph-users] Fwd: warn in a mds log(pipe/.fault, server, going to standby)

2015-10-05 Thread Serg M
Greetings! I have got a problem which is described in http://tracker.ceph.com/issues/13267. When I check heap stats I get the warning only in the MDS's log; the mon and OSD logs don't contain such info. I use it without starting a profiler (even if I start/stop the profiler or do a heap dump, the same warning appears). p.s

Re: [ceph-users] RGW ERROR: endpoints not configured for upstream zone

2015-10-05 Thread Abhishek Varshney
Hi, I just resolved this issue. It was probably due to a faulty region map configuration, where more than one region was marked as default. After setting the is_master tag to false on all the non-master regions, doing a radosgw-admin region-map update, and restarting radosgw, things are working fine
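A sketch of that sequence driven from Python follows; the region name is a placeholder, and the JSON shape (is_master stored as a string) is an assumption based on that era's region dumps, so check your own output of radosgw-admin region get first.

    import json
    import subprocess
    import tempfile

    REGION = "eu"   # placeholder name of one non-master region

    # Dump the region, flip is_master off, and push the edited JSON back.
    region = json.loads(subprocess.check_output(
        ["radosgw-admin", "region", "get", "--rgw-region=" + REGION]).decode())
    region["is_master"] = "false"   # stored as a string in this era's region JSON

    with tempfile.NamedTemporaryFile(suffix=".json") as tmp:
        tmp.write(json.dumps(region).encode())
        tmp.flush()
        subprocess.check_call(
            ["radosgw-admin", "region", "set", "--rgw-region=" + REGION,
             "--infile", tmp.name])

    # Rebuild the region map; restart radosgw afterwards (the service name
    # depends on the distribution and init system).
    subprocess.check_call(["radosgw-admin", "region-map", "update"])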