[ceph-users] ERROR: modinfo: could not find module rbd

2014-05-09 Thread Ease Lu
Hi All, I ran the command: [ceph@ceph-client ~]$ sudo rbd map ceph_block_dev --pool rbd --id admin -k /etc/ceph/ceph.client.admin.keyring ERROR: modinfo: could not find module rbd FATAL: Module rbd not found. rbd: modprobe rbd failed! (256) As you can see, it failed!
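A minimal sketch of how one might check whether the running kernel provides the rbd module before retrying the map; the pool, image, and keyring names are simply the ones from the command above:

    # does this kernel ship the rbd module at all?
    find /lib/modules/$(uname -r) -name 'rbd.ko*'
    # if a module file turns up, load it explicitly and retry the map
    sudo modprobe rbd
    sudo rbd map ceph_block_dev --pool rbd --id admin -k /etc/ceph/ceph.client.admin.keyring
    # if nothing is found, the kernel was built without rbd support and a
    # kernel that includes the module is needed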

Re: [ceph-users] Fwd: Bad performance of CephFS (first use)

2014-05-09 Thread Christian Balzer
Hello, On Fri, 09 May 2014 08:10:54 +0200 Michal Pazdera wrote: > Hi everyone, > > I am new to Ceph. I have a 5 PC test cluster on which I'd like to test > CephFS behavior and performance. I have used ceph-deploy on node pc1 and > installed the ceph software (emperor 0.72.2-0.el6) on all 5 machines

Re: [ceph-users] Delete pool .rgw.bucket and objects within it

2014-05-09 Thread Thanh Tran
Hi Irek, the default replication level in firefly is 3, while in emperor it is 2; this is the reason my cluster was unstable. I have another issue: the omap folder of some osds is very big, about 2GB - 8GB. Is there any way to clean up this folder? Best
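For the replication-level point above, a hedged sketch of how the pool size can be checked and, if desired, set back to the emperor default (the pool name is taken from the subject line; substitute the real one):

    ceph osd pool get .rgw.bucket size
    ceph osd pool set .rgw.bucket size 2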

Re: [ceph-users] Ceph Not getting into a clean state

2014-05-09 Thread Mark Kirkwood
Right, I've run into the situation where the system seems reluctant to reorganise after changing all the pool sizes - until the osds are restarted (essentially I just rebooted each host in turn) *then* the health went to OK. This was a while ago (pre 0.72), so something else may be going on w

Re: [ceph-users] Help -Ceph deployment in Single node Like Devstack

2014-05-09 Thread Sebastien Han
http://www.sebastien-han.fr/blog/2014/05/01/vagrant-up-install-ceph-in-one-command/ Sébastien Han Cloud Engineer "Always give 100%. Unless you're giving blood.” Phone: +33 (0)1 49 70 99 72 Mail: sebastien@enovance.com Address : 11 bis, rue Roquépine - 75008 Paris Web : www.enovanc

Re: [ceph-users] Ceph Not getting into a clean state

2014-05-09 Thread Martin B Nielsen
Hi, I experienced exactly the same with 14.04 and the 0.79 release. It was a fresh clean install with the default crushmap and a ceph-deploy install as per the quick-start guide. Oddly enough, after changing the replica size (incl. min_size) from 3 -> 2 (and 2 -> 1) and back again it worked. I didn't have time to l
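A sketch of the size toggle described above, using "rbd" as a placeholder pool name:

    ceph osd pool set rbd size 2
    ceph osd pool set rbd min_size 1
    # wait for peering/backfill to settle, then flip back
    ceph osd pool set rbd size 3
    ceph osd pool set rbd min_size 2
    ceph -s   # watch until HEALTH_OK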

[ceph-users] Migrate whole clusters

2014-05-09 Thread Gandalf Corvotempesta
Let's assume a test cluster is up and running with real data on it. What is the best way to migrate everything to a production (and larger) cluster? I'm thinking of adding production MONs to the test cluster, after that adding production OSDs to the test cluster, waiting for a full rebalance and then st

Re: [ceph-users] Bulk storage use case

2014-05-09 Thread Cédric Lemarchand
Another thought: I would hope that with EC, the spread of data chunks would benefit from the write capability of each drive on which they are stored. I have not received any reply so far! Does this kind of configuration (hardware & software) look crazy?! Am I missing something? Looking forward to your comments, t

Re: [ceph-users] NFS over CEPH - best practice

2014-05-09 Thread Andrei Mikhailovsky
No particular reason actually. Just thought it would be simpler. However, iSCSI looks simple enough from the howtos. Thanks for your suggestions and I will give it a shot - Original Message - From: "Stuart Longland" To: ceph-users@lists.ceph.com Sent: Friday, 9 May, 2014 12:26:17

Re: [ceph-users] NFS over CEPH - best practice

2014-05-09 Thread Andrei Mikhailovsky
Ideally I would like to have a setup with 2+ iscsi servers, so that I can perform maintenance if necessary without shutting down the vms running on the servers. I guess multipathing is what I need. Also, I will need to have more than one xenserver/vmware host server, so the iscsi LUNs will be

Re: [ceph-users] NFS over CEPH - best practice

2014-05-09 Thread Maciej Bonin
What you need is tgtd and this patch from the ceph site http://ceph.com/dev-notes/updates-to-ceph-tgt-iscsi-support/ for tgt-admin, then you can set up udev rules for persistent naming on the initiator and in turn set up multipathd – the caveat here is that we’ve never had any luck getting more
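As a rough illustration of the tgt side, a hedged /etc/tgt/targets.conf snippet using the rbd backing-store type from the patch linked above (the IQN, pool, and image names are made up):

    <target iqn.2014-05.com.example:rbd-export>
        driver iscsi
        bs-type rbd
        backing-store rbd/ceph_block_dev   # pool/image exported as a LUN
    </target>

    tgt-admin -e   # (re)load the configuration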

Re: [ceph-users] Replace journals disk

2014-05-09 Thread Sage Weil
On Fri, 9 May 2014, Indra Pramana wrote: > Hi Gandalf and Sage, > > Just would like to confirm if my steps below to replace a journal disk are > correct? Presuming the journal disk to be replaced is /dev/sdg and the two > affected OSDs using the disk as journals are osd.30 and osd.31: > > - ceph

Re: [ceph-users] Replace journals disk

2014-05-09 Thread Gandalf Corvotempesta
2014-05-09 15:55 GMT+02:00 Sage Weil : > This looks correct to me! Some command to automate this in ceph would be nice. For example, skipping the "mkjournal" step: ceph-osd -i 30 --mkjournal ceph-osd -i 31 --mkjournal ceph should be smart enough to automatically make journals if missing so that
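For completeness, one common journal-replacement sequence matching the scenario above (OSDs 30/31 with journals on a failed /dev/sdg); this is a sketch, not the exact steps from the quoted mail:

    ceph osd set noout
    service ceph stop osd.30; service ceph stop osd.31
    ceph-osd -i 30 --flush-journal; ceph-osd -i 31 --flush-journal
    # replace the disk, recreate the journal partitions, and repoint the
    # journal symlinks in /var/lib/ceph/osd/ceph-{30,31}/
    ceph-osd -i 30 --mkjournal; ceph-osd -i 31 --mkjournal
    service ceph start osd.30; service ceph start osd.31
    ceph osd unset noout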

[ceph-users] pgs not mapped to osds, tearing hair out

2014-05-09 Thread Jeff Bachtel
I'm working on http://tracker.ceph.com/issues/8310 , basically by bringing osds down and up I've come to a state where on-disk I have pgs, osds seem to scan the directories on boot, but the crush map isn't mapping the objects properly. In addition to that ticket, I've got a decompile of my cru

[ceph-users] issues with ceph

2014-05-09 Thread Aronesty, Erik
So we were attempting to stress test a cephfs installation, and last night, after copying 500GB of files, we got this: 570G in the "raw" directory q782657@usadc-seaxd01:/mounts/ceph1/pubdata/tcga$ ls -lh total 32M -rw-rw-r-- 1 q783775 pipeline 32M May 8 10:39 2014-02-25T12:00:01-0800_data_man

Re: [ceph-users] issues with ceph

2014-05-09 Thread Lincoln Bryant
Hi Erik, What happens if you try to stat one of the "missing" files (assuming you know the name of the file before you remount raw)? I had a problem where files would disappear and reappear in CephFS, which I believe was fixed in kernel 3.12. Cheers, Lincoln On May 9, 2014, at 9:30 AM, Arones

[ceph-users] Low latency values

2014-05-09 Thread Dan Ryder (daryder)
Hi, I'm seeing really low latency values, to the extent that they don't seem realistic. Snippet from the latest perf dump for this OSD: "op_r_latency": { "avgcount": 184229, "sum": 178.07771}, Long run avg = 178.07771/184229 = 0.00097 ms? Is it correct that latency values have m

Re: [ceph-users] pgs not mapped to osds, tearing hair out

2014-05-09 Thread Sage Weil
On Fri, 9 May 2014, Jeff Bachtel wrote: > I'm working on http://tracker.ceph.com/issues/8310 , basically by bringing > osds down and up I've come to a state where on-disk I have pgs, osds seem to > scan the directories on boot, but the crush map isn't mapping the objects > properly. > > In additio

Re: [ceph-users] Low latency values

2014-05-09 Thread Haomai Wang
178/184229=0.00097 s = 0.97ms On Fri, May 9, 2014 at 10:49 PM, Dan Ryder (daryder) wrote: > Hi, > > > > I’m seeing really low latency values, to the extent that they don’t seem > realistic. > > > > Snippet from the latest perf dump for this OSD: > > > > "op_r_latency": { "avgcount": 184229, > >
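In other words, the perf counters hold a running sum in seconds plus a count; the average can be recomputed from the admin socket like this (osd.0 is a placeholder id):

    ceph daemon osd.0 perf dump | grep -A 2 '"op_r_latency"'
    # avg = sum / avgcount = 178.07771 s / 184229 ops ~= 0.00097 s ~= 0.97 ms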

Re: [ceph-users] Suggestions on new cluster

2014-05-09 Thread Christian Balzer
Re-added ML. On Fri, 9 May 2014 14:50:54 + Carlos M. Perez wrote: > Christian, > > Thanks for the responses. See below for a few > responses/comments/further questions... > > > -Original Message- > > From: Christian Balzer [mailto:ch...@gol.com] > > Sent: Friday, May 9, 2014 1:35

Re: [ceph-users] issues with ceph

2014-05-09 Thread Aronesty, Erik
If I stat on that box, I get nothing: q782657@usadc-seaxd01:/mounts/ceph1/pubdata/tcga/raw$ cd BRCA -bash: cd: BRCA: No such file or directory perl -e 'print stat("BRCA")' If I access a mount on another machine, I can see the files: q782657@usadc-nasea05:/mounts/ceph1/pubdata/tcga$ ls -l raw t

Re: [ceph-users] issues with ceph

2014-05-09 Thread Aronesty, Erik
I can always remount and see them. But I wanted to preserve the "broken" state and see if I could figure out why it was happening. (strace isn't particularly revealing.) Some other things I noted were that - if I reboot the metadata server, nobody seems to "fail over" to the hot spare (eve

Re: [ceph-users] Low latency values

2014-05-09 Thread Dan Ryder (daryder)
Thanks Haomai, So are all latency values calculated in seconds? Dan -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Friday, May 09, 2014 11:20 AM To: Dan Ryder (daryder) Cc: ceph-us...@ceph.com Subject: Re: [ceph-users] Low latency values 178/184229=0.00097 s =

Re: [ceph-users] Low latency values

2014-05-09 Thread Haomai Wang
yes On Sat, May 10, 2014 at 12:19 AM, Dan Ryder (daryder) wrote: > Thanks Haomai, > > So are all latency values calculated in seconds? > > Dan > > -Original Message- > From: Haomai Wang [mailto:haomaiw...@gmail.com] > Sent: Friday, May 09, 2014 11:20 AM > To: Dan Ryder (daryder) > Cc: cep

Re: [ceph-users] Low latency values

2014-05-09 Thread Dan Ryder (daryder)
Ok, that makes sense for the OSD IO latency values. But I'm confused about the recoverystate_perf latency values. For example: "started_latency": { "avgcount": 296, "sum": 86047405.517876000}, "primary_latency": { "avgcount": 240, "sum": 53489945.22253}, If these values a

Re: [ceph-users] Migrate whole clusters

2014-05-09 Thread Gregory Farnum
I don't think anybody's done this before, but that will functionally work, yes. Depending on how much of the data in the cluster you actually care about, you might be better off just taking it out (rbd export/import or something) instead of trying to incrementally move all the data over, but...*shr
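A minimal sketch of the rbd export/import route Greg mentions, streaming an image straight into the new cluster over ssh (the image, pool, and host names are hypothetical):

    rbd -p rbd export myimage - | ssh prod-node 'rbd -p rbd import - myimage'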

Re: [ceph-users] too slowly upload on ceph object storage

2014-05-09 Thread Stephen Taylor
+1 with 0.80 on Ubuntu 14.04. I have a 7-node cluster with 10 OSDs per node on a 10Gbps network. 3 of the nodes are also acting as monitors. Each node has a single 6-core CPU, 32GB of memory, 1 SSD for journals, and 36 3TB hard drives. I'm currently using 11 of the hard drives in each node, one

Re: [ceph-users] issues with ceph

2014-05-09 Thread Gregory Farnum
I'm less current on the kernel client, so maybe there are some since-fixed bugs I'm forgetting, but: On Fri, May 9, 2014 at 8:55 AM, Aronesty, Erik wrote: > I can always remount and see them. > > But I wanted to preserve the "broken" state and see if I could figure out why > it was happening.

Re: [ceph-users] Low latency values

2014-05-09 Thread Gregory Farnum
The recovery_state "latencies" are all about how long your PGs are in various states of recovery; they're not per-operation latencies. 3 days still seems awfully long, but if you had a lot of data that needed to get recovered and were throttling it tightly enough that could happen. -Greg Software E
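Redoing the arithmetic on the started_latency sample quoted earlier makes the scale concrete:

    # sum / avgcount = 86047405.5 s / 296 ~= 290,700 s ~= 3.4 days per "started" interval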

Re: [ceph-users] Migrate whole clusters

2014-05-09 Thread Kyle Bader
> Let's assume a test cluster up and running with real data on it. > Which is the best way to migrate everything to a production (and > larger) cluster? > > I'm thinking to add production MONs to the test cluster, after that, > add productions OSDs to the test cluster, waiting for a full rebalance

Re: [ceph-users] Low latency values

2014-05-09 Thread Dan Ryder (daryder)
Thanks Greg, That makes sense. Can you also confirm that latency values are always in seconds? I haven't seen any documentation for it and want to be sure before I say it is one way or the other. Dan -Original Message- From: Gregory Farnum [mailto:g...@inktank.com] Sent: Friday, May

Re: [ceph-users] issues with ceph

2014-05-09 Thread Lincoln Bryant
FWIW, I believe the particular/similar bug I was thinking of was fixed by: commit 590fb51f1c (vfs: call d_op->d_prune() before unhashing dentry) --Lincoln On May 9, 2014, at 12:37 PM, Gregory Farnum wrote: > I'm less current on the kernel client, so maybe there are some > since-fixed bu

Re: [ceph-users] Low latency values

2014-05-09 Thread Gregory Farnum
On Fri, May 9, 2014 at 10:49 AM, Dan Ryder (daryder) wrote: > Thanks Greg, > > That makes sense. > > Can you also confirm that latency values are always in seconds? > I haven't seen any documentation for it and want to be sure before I say it > is one way or the other. I believe that's the case,

Re: [ceph-users] Fwd: Bad performance of CephFS (first use)

2014-05-09 Thread Michal Pazdera
On 9.5.2014 9:08, Christian Balzer wrote: Is that really just one disk? Yes, it's just one disk in all PCs. I know that the setup is bad, but I just want to get familiar with Ceph (and other parallel filesystems like Gluster or Lustre) and see what they can and cannot do. You have the reason for t

Re: [ceph-users] Suggestions on new cluster

2014-05-09 Thread Carlos M. Perez
Sorry about not sending to the list. Most of my other lists default to replying to the list, not the individual... Thanks for the links below. Equipment should be here next week, so we'll get to do some testing... Carlos M. Perez CMP Consulting Services 305-669-1515 > -Original Message- > From:

Re: [ceph-users] pgs not mapped to osds, tearing hair out

2014-05-09 Thread Jeff Bachtel
Wow I'm an idiot for getting the wrong reweight command. Thanks so much, Jeff On May 9, 2014 11:06 AM, "Sage Weil" wrote: > On Fri, 9 May 2014, Jeff Bachtel wrote: > > I'm working on http://tracker.ceph.com/issues/8310 , basically by > bringing > > osds down and up I've come to a state where on
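For readers hitting the same mix-up, the two reweight commands that are easy to confuse (the osd id and weights are placeholders):

    ceph osd crush reweight osd.0 1.0   # CRUSH weight used for data placement
    ceph osd reweight 0 1.0             # 0..1 override weight, a different knob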

Re: [ceph-users] v0.80 Firefly released

2014-05-09 Thread Mike Dawson
Andrey, In initial testing, it looks like it may work rather efficiently. 1) Upgrade all mon, osd, and clients to Firefly. Restart everything so no legacy ceph code is running. 2) Add "mon osd allow primary affinity = true" to ceph.conf, distribute ceph.conf to nodes. 3) Inject it into t
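A hedged sketch of what steps 2 and 3 might look like on the command line (the config key is the one quoted above; the osd id and affinity value are only examples):

    # ceph.conf on the monitor hosts
    [mon]
        mon osd allow primary affinity = true

    # or inject at runtime, then set an affinity on a sample osd
    ceph tell mon.* injectargs '--mon-osd-allow-primary-affinity=true'
    ceph osd primary-affinity osd.0 0.5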

Re: [ceph-users] Bulk storage use case

2014-05-09 Thread Craig Lewis
I'm still a noob too, so don't take anything I say with much weight. I was hoping that somebody with more experience would reply. I see a few potential problems. With that CPU to disk ratio, you're going to need to slow recovery down a lot to make sure you have enough CPU available after a n

Re: [ceph-users] Fwd: Bad performance of CephFS (first use)

2014-05-09 Thread Christian Balzer
On Fri, 09 May 2014 23:03:50 +0200 Michal Pazdera wrote: > On 9.5.2014 9:08, Christian Balzer wrote: > > Is that really just one disk? > > Yes, it's just one disk in all PCs. I know that the setup is bad, but I > just want to get > familiar with Ceph (and other parallel fs like Gluster or Lu