[ceph-users] krbd exclusive-lock

2017-03-21 Thread Mikaël Cluseau
Hi, There's something I don't understand about the exclusive-lock feature. I created an image: $ ssh host-3 Container Linux by CoreOS stable (1298.6.0) Update Strategy: No Reboots host-3 ~ # uname -a Linux host-3 4.9.9-coreos-r1 #1 SMP Tue Mar 14 21:09:42 UTC 2017 x86_64 Intel(R) Xeon(R) CPU E5
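
For reference, a minimal sketch of creating and mapping an image with only the features a 4.9 krbd client handles (pool and image names are made up, not from the thread):

  # create an image restricted to features the 4.9 kernel client supports
  rbd create --size 10G --image-feature layering,exclusive-lock rbd/test-img
  # map it; the kernel client only acquires the exclusive lock when it first needs to write
  rbd map rbd/test-img
  # check which features the image actually carries
  rbd info rbd/test-img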

Re: [ceph-users] More than 50% osds down, CPUs still busy; will the cluster recover without help?

2015-03-27 Thread Mikaël Cluseau
Hi, On 03/18/2015 03:01 PM, Gregory Farnum wrote: I think it tended to crash rather than hang like this so I'm a bit surprised, but if this op is touching a "broken" file or something that could explain it. FWIW, the last time I had the issue (on a 3.10.9 kernel), btrfs was freezing, waiting

Re: [ceph-users] Help with SSDs

2014-12-17 Thread Mikaël Cluseau
On 12/17/2014 02:58 AM, Bryson McCutcheon wrote: Is there a good work around if our SSDs are not handling D_SYNC very well? We invested a ton of money into Samsung 840 EVOS and they are not playing well with D_SYNC. Would really appreciate the help! Just in case it's linked with the recent pe
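
A quick way to check whether an SSD copes with synchronous, journal-style writes is a direct+dsync dd run (a rough sketch; /dev/sdX is a placeholder and the command destroys data on that device):

  # WARNING: writes straight to the device; use a scratch SSD only
  dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync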

Re: [ceph-users] Merging two active ceph clusters: suggestions needed

2014-09-23 Thread Mikaël Cluseau
On 09/22/2014 05:17 AM, Robin H. Johnson wrote: Can somebody else make comments about migrating S3 buckets with preserved mtime data (and all of the ACLs & CORS) then? I don't know how radosgw objects are stored, but have you considered a lower level rados export/import ? IMPORT AND EXPORT
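
For illustration, a minimal sketch of a low-level pool copy, assuming the rados tool in use has the export/import subcommands (pool and file names are placeholders; this moves raw objects only, not bucket ACLs/CORS or other radosgw metadata):

  # on the source cluster
  rados -p .rgw.buckets export rgw-buckets.dump
  # copy the dump file over, then on the destination cluster
  rados -p .rgw.buckets import rgw-buckets.dump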

[ceph-users] RBD over cache tier over EC pool: rbd rm doesn't remove objects

2014-09-19 Thread Mikaël Cluseau
Hi all, I have weird behaviour on my firefly "test + convenience storage" cluster. It consists of 2 nodes with a light imbalance in available space: # idweighttype nameup/downreweight -114.58root default -28.19host store-1 12.73osd.1up
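
One thing worth trying in this situation (a sketch, with a made-up cache pool name) is forcing the cache tier to flush and evict, so that deletions propagate to the backing EC pool:

  # flush dirty objects and evict clean ones from the cache tier
  rados -p rbd-cache cache-flush-evict-all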

Re: [ceph-users] Introductions

2014-08-13 Thread Mikaël Cluseau
On 08/11/2014 01:14 PM, Zach Hill wrote: Thanks for the info! Great data points. We will still recommend a separated solution, but it's good to know that some have tried to unify compute and storage and have had some success. Yes, and using drives on compute nodes for backup is a tempting idea

Re: [ceph-users] Introductions

2014-08-09 Thread Mikaël Cluseau
Hi Zach, On 08/09/2014 11:33 AM, Zach Hill wrote: Generally, we recommend strongly against such a deployment in order to ensure performance and failure isolation between the compute and storage sides of the system. But, I'm curious if anyone is doing this in practice and if they've found reaso

Re: [ceph-users] Is it still unsafe to map a RBD device on an OSD server?

2014-06-11 Thread Mikaël Cluseau
On 06/11/2014 08:20 AM, Sebastien Han wrote: Thanks for your answers. I have had that for an apt-cache for more than 1 year now and never had an issue. Of course, your question is not about having a krbd device backing an OSD of the same cluster ;-)

Re: [ceph-users] Gentoo ceph-deploy

2013-11-11 Thread Mikaël Cluseau
On 11/12/2013 04:20 AM, Aaron Ten Clay wrote: I'm interested in helping as well. I currently maintain ebuilds for the latest Ceph versions at an overlay called Nextoo, if anyone is interested: https://github.com/nextoo/portage-overlay/tree/master/sys-cluster/ceph Very nice :) Same approach a

Re: [ceph-users] Gentoo ceph-deploy

2013-11-10 Thread Mikaël Cluseau
Hi, pleased to see another gentoo user :) On 11/10/2013 07:02 PM, Philipp Strobl wrote: As i read at ceph-mailinglist you are trying to get ceph-deploy to work on gentoo ? I have been, but since I'm used to working "by hand" (I'm from the pre-ceph-deploy era), I didn't put much effort in this for

Re: [ceph-users] Kernel Panic / RBD Instability

2013-11-06 Thread Mikaël Cluseau
Hello, if you use kernel RBD, maybe your issue is linked to this one : http://tracker.ceph.com/issues/5760 Best regards, Mikael.

[ceph-users] Production locked: OSDs down

2013-10-14 Thread Mikaël Cluseau
Hi, I have a pretty big problem here... my OSDs are marked down (except one?!). I have ceph version 0.61.8 (a6fdcca3bddbc9f177e4e2bf0d9cdd85006b028b). I recently had full monitors so I had to remove them, but it seemed to work. # id  weight  type name  up/down  reweight   -1  15

Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
On 08/18/2013 08:53 AM, Sage Weil wrote: Yep! It's working without any change in the udev rules files ;)

Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
On 08/18/2013 08:53 AM, Sage Weil wrote: Yep! What distro is this? I'm working on Gentoo packaging to get a full stack of ceph and openstack. Overlay here: git clone https://git.isi.nc/cloud/cloud-overlay.git And a small fork of ceph-deploy to add gentoo support: git clone https://git.isi.nc

Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
On 08/18/2013 08:44 AM, Mikaël Cluseau wrote: On 08/18/2013 08:39 AM, Mikaël Cluseau wrote: # ceph-disk -v activate-all DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid Maybe /dev/disk/by-parttypeuuid is specific? # ls -l /dev/disk total 0 drwxr-xr-x 2 root root 1220 Aug 18 07

Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
On 08/18/2013 08:39 AM, Mikaël Cluseau wrote: # ceph-disk -v activate-all DEBUG:ceph-disk-python2.7:Scanning /dev/disk/by-parttypeuuid Maybe /dev/disk/by-parttypeuuid is specific? # ls -l /dev/disk total 0 drwxr-xr-x 2 root root 1220 Aug 18 07:01 by-id drwxr-xr-x 2 root root 60 Aug 18 07

Re: [ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
On 08/18/2013 08:35 AM, Sage Weil wrote: The ceph-disk activate-all command is looking for partitions that are marked with the ceph type uuid. Maybe the journals are missing? What does ceph-disk -v activate /dev/sdc1 say? Or ceph-disk -v activate-all Where does the 'journal' symlink in

[ceph-users] ceph 0.67, 0.67.1: ceph_init bug

2013-08-17 Thread Mikaël Cluseau
Hi, trouble with ceph_init (after a test reboot) # ceph_init restart osd # ceph_init restart osd.0 /usr/lib/ceph/ceph_init.sh: osd.0 not found (/etc/ceph/ceph.conf defines mon.xxx , /var/lib/ceph defines mon.xxx) 1 # ceph-disk list [...] /dev/sdc : /dev/sdc1 ceph data, prepared, cluster ceph
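
To see what ceph-disk has to work with, one can inspect the partition type GUID it matches on (a sketch; the device and partition number are placeholders):

  # print the "Partition GUID code" (the type uuid) for partition 1 of /dev/sdc
  sgdisk --info=1 /dev/sdc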

Re: [ceph-users] v0.67 Dumpling released

2013-08-16 Thread Mikaël Cluseau
On 08/17/2013 02:06 PM, Dan Mick wrote: That looks interesting, but I cannot browse without making an account; can you make your source freely available? gitlab's policy is the following : Public access If checked, this project can be cloned /without any/ authentication. It will also be list

Re: [ceph-users] v0.67 Dumpling released

2013-08-16 Thread Mikaël Cluseau
On 08/17/2013 02:06 PM, Dan Mick wrote: That looks interesting, but I cannot browse without making an account; can you make your source freely available? umm it seems the policy of gitlab is that you can clone but not browse online... but you can clone so it's freely available : $ GIT_SSL_N

Re: [ceph-users] v0.67 Dumpling released

2013-08-14 Thread Mikaël Cluseau
Hi lists, in this release I see that the ceph command is not compatible with python 3. The changes were not all trivial so I gave up, but for those using gentoo, I made my ceph git repository available here with an ebuild that forces the python version to 2.6 or 2.7 : git clone https://git.i

Re: [ceph-users] Issues going from 1 to 3 mons

2013-08-02 Thread Mikaël Cluseau
Hi Nelson, On 07/31/13 18:11, Jeppesen, Nelson wrote: ceph mon add 10.198.141.203:6789 was the monmap modified after the mon add ? I had a problem with bobtail, on my lab, going from 1 to 2 and back because of quorum loss, maybe it's the same. I had to get the monmap from the mon filesystem

[ceph-users] Kernel's rbd in 3.10.1

2013-07-24 Thread Mikaël Cluseau
Hi, I have a bug in the 3.10 kernel under debian, be it a self-compiled linux-stable from git (built with make-kpkg) or the sid package. I'm using format-2 images (ceph version 0.61.6 (59ddece17e36fef69ecf40e239aeffad33c9db35)) to make snapshots and clones of a database for development
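
For context, the usual format-2 snapshot/clone workflow looks roughly like this (image and snapshot names are invented, not the ones from the report):

  rbd snap create rbd/dbimage@before-tests
  # a snapshot must be protected before it can be cloned
  rbd snap protect rbd/dbimage@before-tests
  rbd clone rbd/dbimage@before-tests rbd/dbimage-dev
  # map the clone with krbd
  rbd map rbd/dbimage-dev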

Re: [ceph-users] SSD recommendations for OSD journals

2013-07-22 Thread Mikaël Cluseau
On 07/23/13 07:35, Charles 'Boyo wrote: Considering using a mSATA to PCIe adapter with a SATA III mSATA SSD. Any thoughts on what to expect from this combination? Going PCIe I think I would use an SSD card rather than adding yet another (relatively slow) bus. I haven't looked at the models but

Re: [ceph-users] SSD recommendations for OSD journals

2013-07-21 Thread Mikaël Cluseau
On 22/07/2013 08:03, Charles 'Boyo wrote: Counting on the kernel's cache, it appears I will be best served purchasing write-optimized SSDs? Can you share any information on the SSD you are using, is it PCIe connected? We are on a standard SAS bus so any SSD reaching 500MB/s and being stable o

Re: [ceph-users] 1 x raid0 or 2 x disk

2013-07-21 Thread Mikaël Cluseau
On 07/21/13 20:37, Wido den Hollander wrote: I'd say two disks and not raid0. Since when you are doing parallel I/O both disks can be doing something completely different. Completely agree, Ceph is already doing the striping :)

Re: [ceph-users] optimizing recovery throughput

2013-07-20 Thread Mikaël Cluseau
Hi, On 07/21/13 09:05, Dan van der Ster wrote: This is with a 10Gb network -- and we can readily get 2-3GBytes/s in "normal" rados bench tests across many hosts in the cluster. I wasn't too concerned with the overall MBps throughput in my question, but rather the objects/s recovery rate -- the
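
If the goal is to push the objects/s rate up, the knobs usually involved are backfill and recovery concurrency (a sketch; the values and the per-OSD tell are illustrative, and the exact syntax varies a bit across releases):

  # raise concurrency on one OSD; repeat per OSD, or set it in ceph.conf
  ceph tell osd.0 injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'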

Re: [ceph-users] latency when OSD falls out of cluster

2013-07-20 Thread Mikaël Cluseau
Hi, On 07/12/13 19:57, Edwin Peer wrote: Seconds of down time is quite severe, especially when it is a planned shut down or rejoining. I can understand if an OSD just disappears, that some requests might be directed to the now gone node, but I see similar latency hiccups on scheduled shut down
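
For a planned shutdown, the common precaution (a sketch, not necessarily what was done in this thread) is to keep the cluster from re-balancing while the node is away:

  # keep OSDs from being marked out during the maintenance window
  ceph osd set noout
  # ... stop the OSD(s), do the maintenance, start them again ...
  ceph osd unset noout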

Re: [ceph-users] Hadoop/Ceph and DFS IO tests

2013-07-20 Thread Mikaël Cluseau
Hi, On 07/11/13 12:23, ker can wrote: Unfortunately I currently do not have access to SSDs, so I had a separate disk for the journal for each data disk for now. you can try RAM as a journal (well... not in production of course), if you want an idea of the performance on SSDs. I tried this
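
A throwaway way to emulate a fast journal for benchmarking only (a sketch with a placeholder OSD id and the default data path; the journal contents live in RAM and are lost on reboot, so never do this with data you care about):

  # stop the OSD first, then flush and relocate its journal to tmpfs
  ceph-osd -i 0 --flush-journal
  ln -sf /dev/shm/osd.0.journal /var/lib/ceph/osd/ceph-0/journal
  ceph-osd -i 0 --mkjournal
  # start the OSD again; its journal is now backed by RAM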

Re: [ceph-users] optimizing recovery throughput

2013-07-19 Thread Mikaël Cluseau
Hi, On 07/19/13 07:16, Dan van der Ster wrote: and that gives me something like this: 2013-07-18 21:22:56.546094 mon.0 128.142.142.156:6789/0 27984 : [INF] pgmap v112308: 9464 pgs: 8129 active+clean, 398 active+remapped+wait_backfill, 3 active+recovery_wait, 933 active+remapped+backfilling, 1 a

[ceph-users] weird: "-23/116426 degraded (-0.020%)"

2013-07-17 Thread Mikaël Cluseau
Hi list, not a real problem but a weird thing under cuttlefish : 2013-07-18 10:51:01.597390 mon.0 [INF] pgmap v266324: 216 pgs: 215 active+clean, 1 active+remapped+backfilling; 144 GB data, 305 GB used, 453 GB / 766 GB avail; 3921KB/s rd, 2048KB/s wr, 288op/s; 1/116426 degraded (0.001%); recov

Re: [ceph-users] osd client op priority vs osd recovery op priority

2013-07-08 Thread Mikaël Cluseau
On 09/07/2013 14:57, Mikaël Cluseau wrote: I think I'll go for the second option because the problematic load spikes seem to have a period of 24h + epsilon... Seems good : the load drops below the 1.0 line, ceph starts to scrub, the scrub is fast, and the load goes back above 1.0; there'
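
If the load-gated setting involved is the scrub load threshold (an assumption on my part; the thread does not spell out which option was changed), it can be adjusted at runtime like this:

  # allow scrubbing only while the system load average is below 1.0
  ceph tell osd.0 injectargs '--osd-scrub-load-threshold 1.0'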

Re: [ceph-users] osd client op priority vs osd recovery op priority

2013-07-08 Thread Mikaël Cluseau
On 09/07/2013 14:41, Gregory Farnum wrote: On Mon, Jul 8, 2013 at 8:08 PM, Mikaël Cluseau wrote: Hi Greg, thank you for your (fast) answer. Please keep all messages on the list. :) oops, reply-to isn't set by default here ^^ I just realized you were talking about increased late

Re: [ceph-users] osd client op priority vs osd recovery op priority

2013-07-08 Thread Mikaël Cluseau
Hi Greg, thank you for your (fast) answer. Since we're going more in-depth, I must say : * we're running 2 Gentoo GNU/Linux servers doing both storage and virtualization (I know this is not recommended but we mostly have a low load and virtually no writes outside of ceph) * sys-cluster

[ceph-users] osd client op priority vs osd recovery op priority

2013-07-08 Thread Mikaël Cluseau
Hi dear list :) I have a small doubt about these two options, as the documentation states this : osd client op priority Description: The priority set for client operations. It is relative to osd recovery op priority. Default: 63 osd recovery op priority Description: The priority
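
In practice, the usual way to favour client I/O over recovery is to leave osd client op priority at its default and lower the recovery priority, e.g. at runtime (a sketch; the value 1 is just an example):

  # lower recovery op priority relative to client ops (client default is 63)
  ceph tell osd.0 injectargs '--osd-recovery-op-priority 1'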