Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Christian Eichelmann
Hi Christian, Hi Robert, thank you for your replies! I was already expecting something like this, but I am seriously worried about it! Just assume that this is happening at night. Our shift does not necessarily have enough knowledge to perform all the steps in Sébastien's article. And if we always hav

Re: [ceph-users] EC backend benchmark

2015-05-11 Thread Christian Balzer
Hello, Could you do another EC run with differing block sizes, as described here: http://lists.opennebula.org/pipermail/ceph-users-ceph.com/2014-October/043949.html and look for write amplification? I'd suspect that by the very nature of EC and the additional local checksums it (potentially) wr
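A rough way to check for that write amplification (a sketch under assumptions: the pool name 'ecpool', the /dev/sd[b-z] data-device pattern and the 4k block size are placeholders, not from the thread) is to compare the bytes the benchmark client wrote against the bytes the OSD data devices actually wrote:

    # run on each OSD host: snapshot bytes written at the block-device level
    # (field 10 of /proc/diskstats is sectors written, 512 bytes each)
    awk '$3 ~ /^sd[b-z]$/ {s += $10*512} END {print s}' /proc/diskstats > /tmp/bytes_before
    # run the small-block workload from a client host
    rados bench -p ecpool 60 write -b 4096 -t 16 --no-cleanup
    # snapshot again on each OSD host and take the difference
    awk '$3 ~ /^sd[b-z]$/ {s += $10*512} END {print s}' /proc/diskstats > /tmp/bytes_after
    echo "device-level bytes written: $(( $(cat /tmp/bytes_after) - $(cat /tmp/bytes_before) ))"

Summing that difference across all OSD hosts and dividing by what rados bench reports as written gives a rough amplification factor.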

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Christian Balzer
Hello, I can only nod emphatically to what Robert said: don't issue repairs unless you a) don't care about the data or b) have verified that your primary OSD is good. See this for some details on how to establish which replica(s) are actually good or not: http://www.sebastien-han.fr/blog/2015/04/
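For reference, the core of that approach (a sketch, assuming filestore OSDs under the default /var/lib/ceph path; the PG id 17.1c1 and OBJECTNAME are placeholders) is to checksum the object on every OSD in the acting set and see which replica disagrees:

    # run on each OSD host in the acting set of the inconsistent PG
    find /var/lib/ceph/osd/ceph-*/current/17.1c1_head/ -name '*OBJECTNAME*' -exec md5sum {} \;
    # the replica whose checksum differs from the majority is the bad one;
    # only if the primary's copy matches the majority is 'ceph pg repair 17.1c1' safe,
    # since repair copies the primary's version to the replicas.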

Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Patrik, At the moment you do not have any problems related to slow requests. 2015-05-12 8:56 GMT+03:00 Patrik Plank : > OK, understood. > > But what can I do if the scrubbing process has been hanging on one PG since last > night: > > > root@ceph01:~# ceph health detail > HEALTH_OK > > root@ceph01:~

Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Patrik Plank
OK, understood. But what can I do if the scrubbing process has been hanging on one PG since last night: root@ceph01:~# ceph health detail HEALTH_OK root@ceph01:~# ceph pg dump | grep scrub pg_stat    objects    mip    degr    misp    unf    bytes    log    disklog    state    state_stamp    v    r
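To at least see which PG the scrub is stuck on and which OSDs serve it (the PG id below is a placeholder), something like:

    ceph pg dump | grep -w scrubbing    # list PGs currently in a scrubbing state
    ceph pg map 2.1f                    # placeholder PG id: shows the up/acting set and the primary OSD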

Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Scrubbing heavily affects I/O and can cause slow requests on the OSDs. For more information, look at 'ceph health detail' and 'ceph pg dump | grep scrub'. 2015-05-12 8:42 GMT+03:00 Patrik Plank : > Hi, > > > is that the reason for the Health Warn or the scrubbing notification? > > > > thanks > > re

Re: [ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Irek Fasikhov
Hi, Patrik. You need to configure the I/O priority for scrubbing. http://dachary.org/?p=3268 2015-05-12 8:03 GMT+03:00 Patrik Plank : > Hi, > > > the Ceph cluster always shows scrubbing notifications, although it is not scrubbing. > > And what does the "HEALTH_WARN" mean? > > Does an
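What that post amounts to is lowering the I/O priority of the OSD disk threads, which perform the scrubbing. A sketch of applying it at runtime and persisting it; as far as I know these options only take effect when the OSD disks use the CFQ scheduler:

    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle'
    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_priority 7'

    # ceph.conf, [osd] section, to keep the setting across restarts
    osd disk thread ioprio class = idle
    osd disk thread ioprio priority = 7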

[ceph-users] HEALTH_WARN 6 requests are blocked

2015-05-11 Thread Patrik Plank
Hi, the Ceph cluster always shows scrubbing notifications, although it is not scrubbing. And what does the "HEALTH_WARN" mean? Does anybody have an idea why the warning is displayed? How can I solve this?  cluster 78227661-3a1b-4e56-addc-c2a272933ac2 health HEALTH_WARN 6 requests are

[ceph-users] Replicas handling

2015-05-11 Thread Anthony Levesque
Greetings, We have been testing a full-SSD Ceph cluster for a few weeks now and are still testing. One of the outcomes (we will post a full report on our tests soon, but for now this email is only about replicas) is that as soon as you keep more than 1 copy in the cluster, it kills the performance b

Re: [ceph-users] Shadow Files

2015-05-11 Thread Yehuda Sadeh-Weinraub
It's the wip-rgw-orphans branch. - Original Message - > From: "Daniel Hoffman" > To: "Yehuda Sadeh-Weinraub" > Cc: "Ben" , "David Zafman" , > "ceph-users" > Sent: Monday, May 11, 2015 4:30:11 PM > Subject: Re: [ceph-users] Shadow Files > > Thanks. > > Can you please let me know the s

Re: [ceph-users] Shadow Files

2015-05-11 Thread Daniel Hoffman
Thanks. Can you please let me know the suitable/best git version/tree to pull in order to compile and use this feature/patch? Thanks On Tue, May 12, 2015 at 4:38 AM, Yehuda Sadeh-Weinraub wrote: > > > -- > > *From: *"Daniel Hoffman" > *To: *"Yehuda Sadeh-Weinraub" >
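For what it's worth, a build sketch for that branch against the autotools tree of that era (not instructions from Yehuda; the clone URL and parallelism are assumptions):

    git clone --recursive https://github.com/ceph/ceph.git
    cd ceph
    git checkout wip-rgw-orphans
    ./install-deps.sh
    ./autogen.sh && ./configure && make -j$(nproc)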

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Anthony D'Atri
Agree that 99+% of the inconsistent PGs I see correlate directly to disk flern. Check /var/log/kern.log*, /var/log/messages*, etc., and I'll bet you find errors that correlate. -- Anthony
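A quick pattern for that correlation check (a sketch; /dev/sdX is a placeholder for the suspect drive, and SMART attribute names vary by vendor):

    zgrep -iE 'medium error|i/o error|unrecovered|sector' /var/log/kern.log* /var/log/messages*
    smartctl -a /dev/sdX | grep -iE 'realloc|pending|uncorrect'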

Re: [ceph-users] [cephfs][ceph-fuse] cache size or memory leak?

2015-05-11 Thread Gregory Farnum
On Fri, May 8, 2015 at 1:34 AM, Yan, Zheng wrote: > On Fri, May 8, 2015 at 11:15 AM, Dexter Xiong wrote: >> I tried "echo 3 > /proc/sys/vm/drop_caches" and dentry_pinned_count dropped. >> >> Thanks for your help. >> > > could you please try the attached patch I haven't followed the whole convers

Re: [ceph-users] ceph-fuse options: writeback cache

2015-05-11 Thread Gregory Farnum
On Mon, May 11, 2015 at 1:57 AM, Kenneth Waegeman wrote: > Hi all, > > I have a few questions about ceph-fuse options: > - Is the fuse writeback cache being used? How can we see this? Can it be > turned on with allow_wbcache somehow? I'm not quite sure what you mean here. ceph-fuse does maintain

[ceph-users] Inconsistent PGs because 0 copies of objects...

2015-05-11 Thread Aaron Ten Clay
Fellow Cephers, I'm scratching my head on this one. Somehow a bunch of objects were lost in my cluster, which is currently ceph version 0.87.1 (283c2e7cfa2457799f534744d7d549f83ea1335e). The symptoms are that "ceph -s" reports a bunch of inconsistent PGs: cluster 8a2c9e43-9f17-42e0-92fd-88a4

Re: [ceph-users] EC backend benchmark

2015-05-11 Thread Somnath Roy
Thanks Loic.. << inline Regards Somnath -Original Message- From: Loic Dachary [mailto:l...@dachary.org] Sent: Monday, May 11, 2015 3:02 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com; Ceph Development Subject: Re: EC backend benchmark Hi, [Sorry I missed the body of your questions, here

Re: [ceph-users] EC backend benchmark

2015-05-11 Thread Loic Dachary
Hi, [Sorry I missed the body of your questions, here is my answer ;-] On 11/05/2015 23:13, Somnath Roy wrote: > Summary: > > 1. It is doing pretty good in Reads and 4 Rados Bench clients are saturating > the 40 GB network. With more physical servers, it is scaling almost lin

[ceph-users] New Calamari server

2015-05-11 Thread Michael Kuriger
I had an issue with my calamari server, so I built a new one from scratch. I've been struggling trying to get the new server to start up and see my ceph cluster. I went so far as to remove salt and diamond from my ceph nodes and reinstalled again. On my calamari server, it sees the hosts connect

Re: [ceph-users] EC backend benchmark

2015-05-11 Thread Somnath Roy
Loic, I thought this one didn't go through! I have sent another mail with the attached doc. This is the data with rados bench. In case you missed it, could you please share your thoughts on the questions I posted below (way down in the mail; not sure how so much space came along!!)? Thanks & Rega

Re: [ceph-users] EC backend benchmark

2015-05-11 Thread Loic Dachary
Hi, Thanks for sharing :-) Have you published the tools that you used to gather these results? It would be great to have a way to reproduce the same measurements in different contexts. Cheers On 11/05/2015 23:13, Somnath Roy wrote: > > > Hi Loic and community, > > > > I have gathered the f

[ceph-users] EC backend benchmark

2015-05-11 Thread Somnath Roy
Hi Loic and community, I have gathered the following data on the EC backend (all flash). I have decided to use Jerasure since space saving is the utmost priority. Setup: 41 OSDs (each on 8 TB flash), 5-node Ceph cluster, 48-core HT-enabled CPU / 64 GB RAM. Tested with Rados Bench clients. R
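For anyone wanting to reproduce a run like this, a minimal rados bench sketch against an EC pool (pool name, PG count and the jerasure profile values are assumptions, not Somnath's exact setup; k+m chunks must fit your failure domains, here 3+2 for 5 hosts):

    ceph osd erasure-code-profile set ecprofile k=3 m=2 plugin=jerasure
    ceph osd pool create ecbench 512 512 erasure ecprofile
    rados bench -p ecbench 60 write -b 4194304 -t 32 --no-cleanup   # 4 MB objects
    rados bench -p ecbench 60 write -b 16384 -t 32 --no-cleanup     # 16 KB objects
    rados bench -p ecbench 60 seq -t 32                             # read back what was written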

Re: [ceph-users] Is CephFS ready for production?

2015-05-11 Thread Neil Levine
We are still laying the foundations for eventual VMware integration and indeed the Red Hat acquisition has made this more real now. The first step is iSCSI support and work is ongoing in the kernel to get HA iSCSI working with LIO and kRBD. See the blueprint and CDS sessions with Mike Christie for

Re: [ceph-users] Shadow Files

2015-05-11 Thread Yehuda Sadeh-Weinraub
- Original Message - > From: "Daniel Hoffman" > To: "Yehuda Sadeh-Weinraub" > Cc: "Ben" , "ceph-users" > Sent: Sunday, May 10, 2015 5:03:22 PM > Subject: Re: [ceph-users] Shadow Files > Any updates on when this is going to be released? > Daniel > On Wed, May 6, 2015 at 3:51 AM, Yehud

Re: [ceph-users] "too many PGs per OSD" in Hammer

2015-05-11 Thread Chris Armstrong
Thanks for the help! We've lowered the number of PGs per pool to 64, so with 12 pools and a replica count of 3, all 3 OSDs have a full 768 PGs. If anyone has any concerns or objections (particularly folks from the Ceph/Redhat team), please let me know. Thanks again! On Fri, May 8, 2015 at 1:21 P
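For the record, the arithmetic behind the 768 figure (assuming all 12 pools use pg_num 64 and size 3):

    PGs per OSD = (pools x pg_num x replica size) / number of OSDs
                = (12 x 64 x 3) / 3
                = 768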

Re: [ceph-users] civetweb lockups

2015-05-11 Thread Yehuda Sadeh-Weinraub
- Original Message - > From: "Daniel Hoffman" > To: "ceph-users" > Sent: Sunday, May 10, 2015 10:54:21 PM > Subject: [ceph-users] civetweb lockups > Hi All. > We have a weird issue where civetweb just locks up; it just fails to respond > to HTTP and a restart resolves the problem. This

Re: [ceph-users] osd does not start when object store is set to "newstore"

2015-05-11 Thread Srikanth Madugundi
Did not work. $ ls -l /usr/lib64/ | grep liburcu-bp lrwxrwxrwx 1 root root 19 May 10 05:27 liburcu-bp.so -> liburcu-bp.so.2.0.0 lrwxrwxrwx 1 root root 19 May 10 05:26 liburcu-bp.so.2 -> liburcu-bp.so.2.0.0 -rwxr-xr-x 1 root root 32112 Feb 25 20:27 liburcu-bp.so.2.0.0 Can you point

Re: [ceph-users] very different performance on two volumes in the same pool #2

2015-05-11 Thread Alexandre DERUMIER
Hi, I'm currently doing benchmarks too, and I don't see this behavior. >> I get very nice performance of up to 200k IOPS. However, once the volume has been written to (i.e. when I map it using rbd map and dd the whole volume with some random data) and I repeat the benchmark, random performance drops to ~23k

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Robert LeBlanc
Personally I would not just run this command automatically because, as you stated, it only copies the primary PGs to the replicas, and if the primary is corrupt, you will corrupt your secondaries. I think the monitor log shows which OSD has the problem, so if it is not your primary, then just issue the
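One way to see which OSD the error was reported against before deciding on a repair (a sketch; the log paths are the defaults, not necessarily what Robert uses):

    # on a monitor host: the cluster log records which PG/shard failed the (deep-)scrub
    grep -iE 'inconsistent|scrub.*err' /var/log/ceph/ceph.log | tail
    # on the primary OSD's host: the OSD log has the per-object detail
    grep -i ERR /var/log/ceph/ceph-osd.*.log | grep -i scrub | tail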

Re: [ceph-users] xfs corruption, data disaster!

2015-05-11 Thread Ric Wheeler
On 05/05/2015 04:13 AM, Yujian Peng wrote: Emmanuel Florac writes: On Mon, 4 May 2015 07:00:32 + (UTC), Yujian Peng 126.com> wrote: I'm encountering a data disaster. I have a ceph cluster with 145 OSDs. The data center had a power problem yesterday, and all of the ceph nodes were down.

Re: [ceph-users] Find out the location of OSD Journal

2015-05-11 Thread Sebastien Han
Under the OSD directory you can look at where the symlink points. It is generally called ‘journal’ and should point to a device. > On 06 May 2015, at 06:54, Patrik Plank wrote: > > Hi, > > I can't remember on which drive I installed which OSD journal :-|| > Is there any command to show this? > >
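A one-liner to list all of them at once (assumes the default /var/lib/ceph/osd layout):

    for j in /var/lib/ceph/osd/ceph-*/journal; do echo "$j -> $(readlink -f "$j")"; done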

Re: [ceph-users] OSD in ceph.conf

2015-05-11 Thread Robert LeBlanc
If you use ceph-disk (and I believe ceph-deploy) to create your OSDs, or you go through the manual steps to set up the partition UUIDs, then yes, udev and the init script will do all the magic. Your disks can be moved to another box without problems. I've moved disks to different ports on controller

Re: [ceph-users] very different performance on two volumes in the same pool #2

2015-05-11 Thread Mason, Michael
I had the same problem when doing benchmarks with small block sizes (<8k) to RBDs. These settings seemed to fix the problem for me. sudo ceph tell osd.* injectargs '--filestore_merge_threshold 40' sudo ceph tell osd.* injectargs '--filestore_split_multiple 8' After you apply the settings give it
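Note that injectargs only changes the running daemons; to keep the values across OSD restarts they would also need to go into ceph.conf, roughly:

    [osd]
    filestore merge threshold = 40
    filestore split multiple = 8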

Re: [ceph-users] OSD in ceph.conf

2015-05-11 Thread Georgios Dimitrakakis
Hi Robert, just to make sure I got it correctly: do you mean that the /etc/mtab entries are completely ignored, and that no matter what the order of the /dev/sdX devices is, Ceph will mount the correct osd/ceph-X by default? In addition, assuming that an OSD node fails for a reason other than

Re: [ceph-users] RFC: Deprecating ceph-tool commands

2015-05-11 Thread John Spray
On 09/05/2015 00:55, Joao Eduardo Luis wrote: A command being DEPRECATED must be: - clearly marked as DEPRECATED in usage; - kept around for at least 2 major releases; - kept compatible for the duration of the deprecation period. Once two major releases go by, the command will then ente

Re: [ceph-users] osd does not start when object store is set to "newstore"

2015-05-11 Thread Alexandre DERUMIER
>> I tried searching on the internet and could not find an el7 package with a >> liburcu-bp.la file; let me know which rpm package has this libtool archive. Hi, maybe you can try ./install-deps.sh to install the needed dependencies. - Original Message - From: "Srikanth Madugundi" To: "Somnath Roy" C

Re: [ceph-users] Crush rule freeze cluster

2015-05-11 Thread Georgios Dimitrakakis
Oops... too fast to answer... G. On Mon, 11 May 2015 12:13:48 +0300, Timofey Titovets wrote: Hey! I caught it again. It's a kernel bug. The kernel crashed when I tried to map an rbd device with a map like the one above! Hooray! 2015-05-11 12:11 GMT+03:00 Timofey Titovets : FYI and history Rule: # rules rule replicat

Re: [ceph-users] Crush rule freeze cluster

2015-05-11 Thread Georgios Dimitrakakis
Timofey, glad that you've managed to get it working :-) Best, George FYI and history Rule: # rules rule replicated_ruleset { ruleset 0 type replicated min_size 1 max_size 10 step take default step choose firstn 0 type room step choose firstn 0 type rack step choose firstn 0 t

Re: [ceph-users] Crush rule freeze cluster

2015-05-11 Thread Timofey Titovets
Hey! I caught it again. It's a kernel bug. The kernel crashed when I tried to map an rbd device with a map like the one above! Hooray! 2015-05-11 12:11 GMT+03:00 Timofey Titovets : > FYI and history > Rule: > # rules > rule replicated_ruleset { > ruleset 0 > type replicated > min_size 1 > max_size 10 > step ta

Re: [ceph-users] Crush rule freeze cluster

2015-05-11 Thread Timofey Titovets
FYI and history Rule: # rules rule replicated_ruleset { ruleset 0 type replicated min_size 1 max_size 10 step take default step choose firstn 0 type room step choose firstn 0 type rack step choose firstn 0 type host step chooseleaf firstn 0 type osd step emit } And after reset
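A way to dry-run a rule like this before pointing kernel clients at it (file names are placeholders) is crushtool's test mode, which checks whether the rule can actually produce the requested number of replicas without touching the cluster:

    ceph osd getcrushmap -o /tmp/crushmap.bin
    crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt              # decompile to inspect the rule text
    crushtool -i /tmp/crushmap.bin --test --rule 0 --num-rep 3 --show-bad-mappings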

Re: [ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Chris Hoy Poy
Hi Christian In my experience, inconsistent PGs are almost always related back to a bad drive somewhere. They are going to keep happening, and with that many drives you still need to be diligent/aggressive in dropping bad drives and replacing them. If a drive returns an incorrect read, it can

[ceph-users] ceph-fuse options: writeback cache

2015-05-11 Thread Kenneth Waegeman
Hi all, I have a few questions about ceph-fuse options: - Is the fuse writeback cache being used? How can we see this? Can it be turned on with allow_wbcache somehow? - What is the default of the big_writes option (as seen in /usr/bin/ceph-fuse --help)? Where can we see this? If we run cep

[ceph-users] Scrub Error / How does ceph pg repair work?

2015-05-11 Thread Christian Eichelmann
Hi all! We are experiencing approximately 1 scrub error / inconsistent PG every two days. As far as I know, to fix this you can issue a "ceph pg repair", which works fine for us. I have a few questions regarding the behavior of the ceph cluster in such a case: 1. After ceph detects the scrub error
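For context, the usual sequence to find and repair a reported inconsistency looks like the following (the PG id is a placeholder; see the replies in this thread about verifying the primary before repairing blindly):

    ceph health detail | grep inconsistent    # lists the affected PG(s) and their acting sets
    ceph pg repair 3.45                       # placeholder PG id: triggers a re-scrub and repair of that PG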

Re: [ceph-users] very different performance on two volumes in the same pool #2

2015-05-11 Thread Somnath Roy
Nik, If you increase num_jobs beyond 4, does it help further? Try 8 or so. Yeah, libsoft* is definitely consuming some cpu cycles, but I don't know how to resolve that. Also, acpi_processor_ffh_cstate_enter popped up and is consuming a lot of cpu. Try disabling C-states and running the cpu at maximum per