Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Peter Maloney

Re: [ceph-users] Cluster hang (deep scrub bug? "waiting for scrub")

2017-11-10 Thread Peter Maloney

Re: [ceph-users] ceph-backed VM drive became corrupted after unexpected VM termination

2017-11-07 Thread Peter Maloney
debug_tp = 0/0 > debug_auth = 0/0 > debug_finisher = 0/0 > debug_heartbeatmap = 0/0 > debug_perfcounter = 0/0 > debug_asok = 0/0 > debug_throttle = 0/0 > debug_mon = 0/0 > debug_paxos = 0/0 > debug_rgw = 0/0 > > [osd] > osd op threads = 4 > osd disk threads = 2

Re: [ceph-users] Ceph not recovering after osd/host failure

2017-10-16 Thread Peter Maloney
item osd.60 weight 1.818 > item osd.62 weight 1.818 > item osd.64 weight 1.818 > item osd.67 weight 1.818 > item osd.70 weight 1.818 > item osd.68 weight 1.818 > item osd.72 weight 1.818 > item osd.74 weight 1.818 > item osd.7

Re: [ceph-users] Luminous CephFS on EC - how?

2017-08-30 Thread Peter Maloney
What kind of terrible mail client is this that sends a multipart message where one part is blank and that's the one Thunderbird chooses to show? (see blankness below) Yes you're on the right track. As long as the main fs is on a replicated pool (the one with omap), the ones below it (using file la
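
For reference, a minimal sketch of that Luminous layout: a replicated default data pool holding the omap metadata, with an erasure-coded pool attached for file data via a directory layout. The pool, filesystem and mount-point names here are illustrative, not from the thread:

    # EC data pools need overwrites enabled on Luminous
    ceph osd pool set ec_data allow_ec_overwrites true
    # add the EC pool to an existing CephFS whose default data pool is replicated
    ceph fs add_data_pool cephfs ec_data
    # steer a directory's files to the EC pool via a file layout
    setfattr -n ceph.dir.layout.pool -v ec_data /mnt/cephfs/bulk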

Re: [ceph-users] Ceph cluster in error state (full) with raw usage 32% of total capacity

2017-08-10 Thread Peter Maloney
"op": "chooseleaf_firstn", > >"num": 0, > >"type": "host" > >}, > >{ > >"op": "emit" > >} > >] > > } > > > # ceph osd crush rule dump ip-10-0-

Re: [ceph-users] ceph osd safe to remove

2017-08-03 Thread Peter Maloney
On 08/03/17 11:05, Dan van der Ster wrote: > On Fri, Jul 28, 2017 at 9:42 PM, Peter Maloney > wrote: >> Hello Dan, >> >> Based on what I know and what people told me on IRC, this means basicaly the >> condition that the osd is not acting nor up for any pg. And for on

Re: [ceph-users] ceph osd safe to remove

2017-07-28 Thread Peter Maloney
.X were to > be stopped). > > Does anyone have such a script that they'd like to share? > > Thanks! > > Dan

Re: [ceph-users] High iowait on OSD node

2017-07-27 Thread Peter Maloney
[quoted iostat output, truncated: the device in question shows ~0.5 writes/s at 8 ms await and 0.40% util; dm-1 is completely idle]

Re: [ceph-users] Yet another performance tuning for CephFS

2017-07-18 Thread Peter Maloney
... probably poor performance with sync writes on filestore, and not sure what would happen with bluestore... probably much better than filestore though if you use a large block size. > > > -Gencer. > > > -Original Message- > From: Peter Maloney [mailto:peter

Re: [ceph-users] Yet another performance tuning for CephFS

2017-07-18 Thread Peter Maloney
eed 200MB/s? What > prevents it? I'm really wondering about this. > > Gencer. > > On 2017-07-17 23:24, Peter Maloney wrote: >> You should have a separate public and cluster network. And journal or >> wal/db performance is important... are the devices fast NVMe? >> >> On

Re: [ceph-users] Yet another performance tuning for CephFS

2017-07-17 Thread Peter Maloney
You should have a separate public and cluster network. And journal or wal/db performance is important... are the devices fast NVMe? On 07/17/17 21:31, gen...@gencgiyen.com wrote: > > Hi, > > > > I located and applied almost every different tuning setting/config > over the internet. I couldn’t m
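
A sketch of the split networks being recommended; the subnets are placeholders, not values from the thread:

    [global]
    public network  = 10.0.0.0/24     # client and monitor traffic
    cluster network = 10.0.1.0/24     # OSD replication and heartbeat traffic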

Re: [ceph-users] missing feature 400000000000000 ?

2017-07-14 Thread Peter Maloney

Re: [ceph-users] Specifying a cache tier for erasure-coding?

2017-07-07 Thread Peter Maloney
On 07/07/17 14:03, David Turner wrote: > > So many of your questions depends on what your cluster is used for. We > don't even know rbd or cephfs from what you said and that still isn't > enough to fully answer your questions. I have a much smaller 3 node > cluster using Erasure coding for rbds as

Re: [ceph-users] Adding storage to exiting clusters with minimal impact

2017-07-06 Thread Peter Maloney
Here's my possibly unique method... I had 3 nodes with 12 disks each, and when adding 2 more nodes, I had issues with the common method you describe, totally blocking clients for minutes, but this worked great for me: > my own method > - osd max backfills = 1 and osd recovery max active = 1 > - cr
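
The rest of the method is cut off above; a hedged sketch of the throttling part plus the gradual crush-reweight step it likely pairs with (the OSD id and weights are made up):

    # limit recovery/backfill concurrency cluster-wide
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
    # new OSDs can start with no weight (ceph.conf: osd crush initial weight = 0)
    # and be brought in gradually:
    ceph osd crush reweight osd.36 0.5
    # ...wait for HEALTH_OK, then raise toward the disk's full weight
    ceph osd crush reweight osd.36 1.0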

Re: [ceph-users] Watch for fstrim running on your Ubuntu systems

2017-07-06 Thread Peter Maloney
Hey, I have some SAS Micron S630DC-400 which came with firmware M013 which did the same or worse (takes very long... 100% blocked for about 5min for 16GB trimmed), and works just fine with firmware M017 (4s for 32GB trimmed). So maybe you just need an update. Peter On 07/06/17 18:39, Reed Dier

Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread Peter Maloney
On 06/30/17 05:21, Sage Weil wrote: > We're having a series of problems with the valgrind included in xenial[1] > that have led us to restrict all valgrind tests to centos nodes. At the > same time, we're also seeing spurious ENOSPC errors from btrfs on both > centos and xenial kernels[2], makin

Re: [ceph-users] Very HIGH Disk I/O latency on instances

2017-06-29 Thread Peter Maloney
On 06/28/17 21:57, Gregory Farnum wrote: > > > On Wed, Jun 28, 2017 at 9:17 AM Peter Maloney wrote: > > On 06/28/17 16:52, keynes_...@wistron.com wrote: >> [.

Re: [ceph-users] Very HIGH Disk I/O latency on instances

2017-06-28 Thread Peter Maloney
On 06/28/17 16:52, keynes_...@wistron.com wrote: > > We were using HP Helion 2.1.5 ( OpenStack + Ceph ) > > The OpenStack version is *Kilo* and Ceph version is *firefly* > > > > The way we backup VMs is create a snapshot by Ceph commands (rbd > snapshot) then download (rbd export) it. > > > > W
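
The backup flow described (snapshot, then export) looks roughly like this; pool, image and snapshot names are placeholders:

    rbd snap create volumes/vm-disk@backup-2017-06-28
    rbd export volumes/vm-disk@backup-2017-06-28 /backup/vm-disk-2017-06-28.img
    rbd snap rm volumes/vm-disk@backup-2017-06-28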

Re: [ceph-users] Snapshot removed, cluster thrashed...

2017-06-26 Thread Peter Maloney
ble when doing snapshots and snap removal) And keep in mind all the "priority" stuff possibly doesn't have any effect without the cfq disk scheduler (at least in hammer... I think I've heard different for jewel and later). Check with: > grep . /sys/block/*/queue/scheduler --
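
A sketch of the scheduler check and switch, assuming the "priority" options meant here are the osd_disk_thread_ioprio_* settings, which only take effect under CFQ (the device name is an example):

    grep . /sys/block/*/queue/scheduler        # see which scheduler each disk uses
    echo cfq > /sys/block/sdb/queue/scheduler  # switch one disk to CFQ at runtime
    # in ceph.conf, [osd] section:
    #   osd disk thread ioprio class = idle
    #   osd disk thread ioprio priority = 7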

Re: [ceph-users] radosgw: scrub causing slow requests in the md log

2017-06-21 Thread Peter Maloney
with scrub. Restarting the osd that is mentioned there (osd.155 in your case) will fix it for now. And tuning scrub changes the way it behaves (defaults make it happen more rarely than what I had before). -- Peter Maloney Brockmann Consult Max-Planck
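
A sketch of the two actions mentioned, restarting the named OSD and softening scrub, assuming a systemd host; the values are examples rather than the poster's settings:

    systemctl restart ceph-osd@155
    # spread scrubbing out so it bites less often
    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
    ceph tell osd.* injectargs '--osd_scrub_begin_hour 22 --osd_scrub_end_hour 6'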

Re: [ceph-users] Prioritise recovery on specific PGs/OSDs?

2017-06-20 Thread Peter Maloney
sat waiting to see when the > ones I care about will finally be handled so I can get on with replacing > those disks. > > Rich

Re: [ceph-users] removing cluster name support

2017-06-11 Thread Peter Maloney
On 06/08/17 21:37, Sage Weil wrote: > Questions: > > - Does anybody on the list use a non-default cluster name? > - If so, do you have a reason not to switch back to 'ceph'? > > Thanks! > sage Will it still be possible for clients to use multiple clusters? Also how does this affect rbd mirroring

Re: [ceph-users] PG that should not be on undersized+degraded on multi datacenter Ceph cluster

2017-06-07 Thread Peter Maloney
On 06/06/17 19:23, Alejandro Comisario wrote: > Hi all, i have a multi datacenter 6 nodes (6 osd) ceph jewel cluster. > There are 3 pools in the cluster, all three with size 3 and min_size 2. > > Today, i shut down all three nodes (controlled and in order) on > datacenter "CPD2" just to validate th

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread Peter Maloney
On 06/02/17 12:25, koukou73gr wrote: > On 2017-06-02 13:01, Peter Maloney wrote: >>> Is it easy for you to reproduce it? I had the same problem, and the same >>> solution. But it isn't easy to reproduce... Jason Dillaman asked me for >>> a gcore dump of a hung p

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread Peter Maloney
recautions before > posting potentially sensitive data (for example, logs or data > directories that contain Ceph secrets). > -K. > > > On 2017-06-02 12:59, Peter Maloney wrote: >> On 06/01/17 17:12, koukou73gr wrote: >>> Hello list, >>> >>> Today I

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread Peter Maloney
On 06/02/17 11:59, Peter Maloney wrote: > On 06/01/17 17:12, koukou73gr wrote: >> Hello list, >> >> Today I had to create a new image for a VM. This was the first time, >> since our cluster was updated from Hammer to Jewel. So far I was just >> copying an existi

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread Peter Maloney
Could you > please point me to some documentation on this behaviour? > > Thanks for any feedback. > > -K.

Re: [ceph-users] Changing replica size of a running pool

2017-05-19 Thread Peter Maloney

Re: [ceph-users] mds slow requests

2017-05-12 Thread Peter Maloney
On 05/12/17 16:54, James Eckersall wrote: > Hi, > > We have an 11 node ceph cluster: 8 OSD nodes with 5 disks each and 3 > MDS servers. > Since upgrading from Jewel to Kraken last week, we are seeing the > active MDS constantly reporting a number of slow requests > 30 seconds. > The load on the Ceph

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread Peter Maloney
On 05/10/17 22:07, David Turner wrote: > Are you mounting your OSDs using fstab or anything else? Ceph uses > udev rules and partition identifiers to know what a disk is and where > to mount it, assuming that you have your GUIDs set properly on your > disks. ceph-deploy does this by default. > >
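
To check what is being described here, that the partition type GUIDs are what the udev rules key on, something like the following works with the pre-Luminous ceph-disk tooling (device names are examples):

    ceph-disk list                 # shows how ceph-disk/udev classifies each partition
    sgdisk -i 1 /dev/sdb           # prints the partition type GUID that udev matches on
    ceph-disk activate /dev/sdb1   # what the udev rule effectively runs at boot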

Re: [ceph-users] trouble starting ceph @ boot

2017-05-10 Thread Peter Maloney
On 05/10/17 15:34, vida.z...@gmail.com wrote: > > System: Ubuntu Trusty 14.04 > > Release : Kraken > > > Issue: > > When starting the ceph-osd daemon on boot via upstart, the error message in > /var/log/upstart/ceph-osd-ceph_#.log reports 3 attempts to start the > service with the error messages below > > >

Re: [ceph-users] corrupted rbd filesystems since jewel

2017-05-08 Thread Peter Maloney
On 05/08/17 14:50, Stefan Priebe - Profihost AG wrote: > Hi, > Am 08.05.2017 um 14:40 schrieb Jason Dillaman: >> You are saying that you had v2 RBD images created against Hammer OSDs >> and client libraries where exclusive lock, object map, etc were never >> enabled. You then upgraded the OSDs and

[ceph-users] CentOS 7 and ipv4 is trying to bind ipv6

2017-05-06 Thread Peter Maloney
I was testing ceph in a centos 7 VM, just so I could support someone that wants to use CentOS 7, and I ran into a problem... What am I doing wrong? First I tried just without any ipv6 changes, and the eth0 had an ipv6 address, but it still didn't work. I think it looked the same but didn't save th

Re: [ceph-users] Ceph Performance

2017-05-04 Thread Peter Maloney
ormance" mean - how have you determined > that (throughout, latency)? > > > Any changes or configuration require for OS kernel? > > Regards, > James > > ___

Re: [ceph-users] snapshot removal slows cluster

2017-04-26 Thread Peter Maloney
On 04/26/17 14:54, Vladimir Prokofev wrote: > Hello ceph-users. > > Short description: during snapshot removal osd usilisation goes up to > 100%, which leads to slow requests and VM failures due to IOPS stall. > > We're using Openstack Cinder with CEPH cluster as a volume backend. > CEPH version is

Re: [ceph-users] hung rbd requests for one pool

2017-04-24 Thread Peter Maloney
On 04/24/17 22:23, Phil Lacroute wrote: > Jason, > > Thanks for the suggestion. That seems to show it is not the OSD that > got stuck: > > ceph7:~$ sudo rbd -c debug/ceph.conf info app/image1 > … > 2017-04-24 13:13:49.761076 7f739aefc700 1 -- > 192.168.206.17:0/1250293899 --> 192.168.206.13:6804/

Re: [ceph-users] OSD disk concern

2017-04-19 Thread Peter Maloney
On 04/19/17 07:42, gjprabu wrote: > Hi Shuresh, > >Thanks for your reply, Is it ok to have OS on normal SATA > hard drive, volume and journal on same SSD. Mainly we are asking this > suggestion for performance purpose. > For performance, it's always best to make it as parallel as pos

Re: [ceph-users] slow requests and short OSD failures in small cluster

2017-04-18 Thread Peter Maloney
s again (<10% peaks, <5% average). > > So, either there is one VM/image that irritates the entire cluster or > we reached some kind of threshold or it's something completely > different. > > As for the well-known fact: Peter Maloney pointed that out in this > thre

Re: [ceph-users] Ceph extension - how to equilibrate ?

2017-04-18 Thread Peter Maloney
On 04/18/17 16:31, pascal.pu...@pci-conseil.net wrote: > > Hello, > > Just some advice: next time, I will extend my Jewel ceph cluster with a > fourth node. > > Actually, we have 3 x nodes of 12 x OSD with 4TB disks (36 x 4TB disks). > > I will add a new node with 12 x 8TB disks (will add 12 new OSD => 48 OS

Re: [ceph-users] Extremely high OSD memory utilization on Kraken 11.2.0 (with XFS -or- bluestore)

2017-04-15 Thread Peter Maloney
How many PGs do you have? And did you change any config, like mds cache size? Show your ceph.conf. On 04/15/17 07:34, Aaron Ten Clay wrote: > Hi all, > > Our cluster is experiencing a very odd issue and I'm hoping for some > guidance on troubleshooting steps and/or suggestions to mitigate the > is
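
The two checks being asked for can be answered with something like the following (osd.0 is just an example daemon to query on its own host):

    ceph osd dump | grep '^pool'                            # pg_num / pgp_num per pool
    ceph daemon osd.0 config show | grep -E 'cache|pg_log'  # spot non-obvious config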

Re: [ceph-users] slow requests and short OSD failures in small cluster

2017-04-13 Thread Peter Maloney
On 04/13/17 10:34, Jogi Hofmüller wrote: > Dear David, > > On Wednesday, 12.04.2017, at 13:46 +, David Turner wrote: >> I can almost guarantee what you're seeing is PG subfolder splitting. > Every day there's something new to learn about ceph ;) > >> When the subfolders in a PG get X number of

Re: [ceph-users] null characters at the end of the file on hard reboot of VM

2017-04-07 Thread Peter Maloney
You should describe your configuration... krbd? librbd? cephfs? is rbd_cache = true? rbd cache writethrough until flush = true? is it kvm? maybe the filesystem in the VM is relevant (I saw something similar testing cephfs... if I blacklisted a client and then force unmounted, I would get whole fi
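
The settings being asked about live on the client side; a sketch of what such a ceph.conf client section looks like (the values shown are the common ones, not the poster's):

    [client]
    rbd cache = true
    rbd cache writethrough until flush = true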

Re: [ceph-users] Preconditioning an RBD image

2017-04-06 Thread Peter Maloney
On 03/25/17 23:01, Nick Fisk wrote: > >> I think I owe you another graph later when I put all my VMs on there >> (probably finally fixed my rbd snapshot hanging VM issue ...worked around it >> by disabling exclusive-lock,object-map,fast-diff). The bandwidth hungry ones >> (which hung the most often
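
The workaround mentioned (dropping exclusive-lock and its dependent features) looks like this; the image spec is a placeholder, and the features have to be removed in dependency order:

    rbd info rbd/vm-disk                          # shows which features are enabled
    rbd feature disable rbd/vm-disk fast-diff
    rbd feature disable rbd/vm-disk object-map
    rbd feature disable rbd/vm-disk exclusive-lock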

Re: [ceph-users] disk timeouts in libvirt/qemu VMs...

2017-03-30 Thread Peter Maloney
On 03/28/17 17:28, Brian Andrus wrote: > Just adding some anecdotal input. It likely won't be ultimately > helpful other than a +1.. > > Seemingly, we also have the same issue since enabling exclusive-lock > on images. We experienced these messages at a large scale when making > a CRUSH map change

Re: [ceph-users] disk timeouts in libvirt/qemu VMs...

2017-03-27 Thread Peter Maloney
I can't guarantee it's the same as my issue, but from that it sounds the same. Jewel 10.2.4, 10.2.5 tested hypervisors are proxmox qemu-kvm, using librbd 3 ceph nodes with mon+osd on each -faster journals, more disks, bcache, rbd_cache, fewer VMs on ceph, iops and bw limits on client side, jumbo

Re: [ceph-users] Preconditioning an RBD image

2017-03-23 Thread Peter Maloney
che device and > rely on the writeback to provide journal performance, or do you still > create a separate partition on whatever SSD/NVME you use, effectively > giving triple write overhead? > > > > Nick

Re: [ceph-users] can a OSD affect performance from pool X when blocking/slow requests PGs from pool Y ?

2017-03-23 Thread Peter Maloney
I think Greg (who appears to be a ceph committer) basically said he was interested in looking at it, if only you had the pool that failed this way. Why not try to reproduce it, and make a log of your procedure so he can reproduce it too? What caused the slow requests... copy on write from snapshot

Re: [ceph-users] Preconditioning an RBD image

2017-03-22 Thread Peter Maloney

Re: [ceph-users] mkjournal error creating journal ... : (13) Permission denied

2017-03-15 Thread Peter Maloney
On 03/15/17 08:43, Gunwoo Gim wrote: > After a reboot, all the partitions of LVM don't show up in > /dev/mapper -nor in the /dev/dm- or /proc/partitions- though > the whole disks show up; I have to make the hosts run one 'partprobe' > every time they boot so as to have the partitions all show up.

Re: [ceph-users] mkjournal error creating journal ... : (13) Permission denied

2017-03-14 Thread Peter Maloney
> brw-rw 1 root disk 252, 7 Mar 14 13:46 /dev/dm-7 > brw-rw 1 root disk 252, 8 Mar 14 13:46 /dev/dm-8 > brw-rw 1 root disk 252, 9 Mar 14 13:47 /dev/dm-9 > > > Best Regards, > Nicholas. > --

Re: [ceph-users] re enable scrubbing

2017-03-08 Thread Peter Maloney
On 03/08/17 13:50, Laszlo Budai wrote: > > In my case we have 72 OSDs. We are experiencing some performance > issues. We believe that the reason is the scrubbing, so we want to > turn scrubbing off for a few days. > Given the default parameters of 1 day for scrub and 7 days for deep > scrub. We tur
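
Turning scrubbing off for a few days is just two cluster flags; scrubs already running finish, only new ones are blocked:

    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ...later, re-enable:
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub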

Re: [ceph-users] re enable scrubbing

2017-03-08 Thread Peter Maloney
ne on IRC gave me the above options and I tested and fiddled with them to see how ceph behaves. > > Thank you, > Laszlo

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-03-01 Thread Peter Maloney
roblem, which would imply the ceph.conf is wrong. > -Chris >> On Mar 1, 2017, at 9:31 AM, Peter Maloney wrote: >> >> On 03/01/17 14:41, Heller, Chris wrote: >>> That is a good question, and I'm

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-03-01 Thread Peter Maloney
put it in the ceph.conf. And since it said you have a non-block journal now, it probably means there is a file... you should remove that (rename it to journal.junk until you're sure it's not an important file, and delete it later). > -Chris > >> On Mar 1, 2017, at 8:37 AM

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-03-01 Thread Peter Maloney
On 02/28/17 18:55, Heller, Chris wrote: > Quick update. So I'm trying out the procedure as documented here. > > So far I've: > > 1. Stopped ceph-mds > 2. set noout, norecover, norebalance, nobackfill > 3. Stopped all ceph-osd > 4. Stopped ceph-mon > 5. Installed new OS > 6. Started ceph-mon > 7. St
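
The flag handling in step 2, and the matching cleanup afterwards, look like this:

    for f in noout norecover norebalance nobackfill; do ceph osd set $f; done
    # ...stop the daemons, reinstall the OS, bring mon/osd back up, then:
    for f in noout norecover norebalance nobackfill; do ceph osd unset $f; done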

Re: [ceph-users] Safely Upgrading OS on a live Ceph Cluster

2017-02-28 Thread Peter Maloney
On 02/27/17 18:01, Heller, Chris wrote: > First I bring down the Ceph FS via `ceph mds cluster_down`. > Second, to prevent OSDs from trying to repair data, I run `ceph osd > set noout` > Finally I stop the ceph processes in the following order: ceph-mds, > ceph-mon, ceph-osd > This is the wrong pro

Re: [ceph-users] SMR disks go 100% busy after ~15 minutes

2017-02-13 Thread Peter Maloney
Then you're not aware of what the SMR disks do. They are just slow for all writes, having to read the tracks around, then write it all again instead of just the one thing you really wanted to write, due to overlap. Then to partially mitigate this, they have some tiny write buffer like 8GB flash, an

Re: [ceph-users] osd_snap_trim_sleep keeps locks PG during sleep?

2017-02-05 Thread Peter Maloney
On 02/03/17 19:54, David Turner wrote: > Our current solution in Hammer involves a daemon monitoring the > cluster load and setting the osd_snap_trim_sleep accordingly between 0 > and 0.35 which does a good job of preventing IO blocking and help us > to clear out the snap_trim_q each day. These se
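
Adjusting that value at runtime, as the monitoring daemon described here does, is a one-liner (0.35 being the upper bound mentioned):

    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.35'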

Re: [ceph-users] Replacing an mds server

2017-01-26 Thread Peter Maloney
emory. I want to > replace it with a new system that has a lot more memory. How does one > go about replacing the ceph-mds server? I looked at the documentation, > figuring I could remove the current metadata server and add the new > one, but the remove metadata server section just says

Re: [ceph-users] [RBD][mirror]Can't remove mirrored image.

2017-01-24 Thread Peter Maloney

Re: [ceph-users] machine hangs & soft lockups with 10.2.2 / kernel 4.4.0

2017-01-24 Thread Peter Maloney
ould be appreciated! > > There's no obvious pattern to these problems, and we've had 2 freezes > and 1 soft lockup in the last ~1.5 weeks. > > Thanks, > > Matthew > > -- Pete

Re: [ceph-users] bluestore activation error on Ubuntu Xenial/Ceph Jewel

2017-01-12 Thread Peter Maloney
Hey there... resurrecting a dead apparently unanswered question. I had issues with this, and nobody online had any answers, and I accidentally ran into the solution. So I hope this helps someone. > Hello, > > I have been trying to deploy bluestore OSDs in a test cluster of 2x OSDs > and 3x mon (xe

Re: [ceph-users] docs.ceph.com down?

2017-01-03 Thread Peter Maloney

Re: [ceph-users] 2x replication: A BIG warning

2016-12-07 Thread Peter Maloney
t find it online "We have recently increase osd op threads from 2 (default value) to 16 because CPU usage on DN was very low. We have the impression it has increased overall ceph cluster performances and reduced block ops occurrences." --

Re: [ceph-users] Interpretation Guidance for Slow Requests

2016-12-07 Thread Peter Maloney
[quoted iostat output, truncated: sdl sustaining ~22,400 writes/s (~87 MB/s) at 0.04 ms await and ~79% util]

Re: [ceph-users] renaming ceph server names

2016-12-02 Thread Peter Maloney
On 12/02/16 12:33, Peter Maloney wrote: > # last section on the other mons (using the file produced on > the first) > # repeat on each monitor node > ceph-mon --cluster newname -i newhostname --inject-monmap > /tmp/monmap correction do t
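
For context, the usual shape of the monmap round trip around that inject command (cluster and host names follow the snippet above; the monitor must be stopped while injecting):

    ceph mon getmap -o /tmp/monmap       # dump the current monmap
    monmaptool --print /tmp/monmap       # sanity-check mon names and addresses
    # with the monitor daemon stopped, on each mon node:
    ceph-mon --cluster newname -i newhostname --inject-monmap /tmp/monmap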

Re: [ceph-users] renaming ceph server names

2016-12-02 Thread Peter Maloney

Re: [ceph-users] Deploying new OSDs in parallel or one after another

2016-11-28 Thread Peter Maloney
On 11/28/16 10:02, Kevin Olbrich wrote: > Hi! > > I want to deploy two nodes with 4 OSDs each. I already prepared OSDs > and only need to activate them. > What is better? One by one or all at once? > > Kind regards, > Kevin. I think the general statement is that if your cluster is very small, you m

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-26 Thread Peter Maloney
On 11/26/16 09:52, Peter Maloney wrote: > On 11/18/16 23:15, Peter Maloney wrote: >> BTW, my rebalance finished, and I guess the performance is a bit >> better, with load distributed a bit better, but blocked requests still >> happen if I use snapshot create + export-diff +

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-26 Thread Peter Maloney
On 11/18/16 23:15, Peter Maloney wrote: > > BTW, my rebalance finished, and I guess the performance is a bit > better, with load distributed a bit better, but blocked requests still > happen if I use snapshot create + export-diff + delete snapshot, and > make qemu clients hang

Re: [ceph-users] KVM / Ceph performance problems

2016-11-23 Thread Peter Maloney

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-18 Thread Peter Maloney
On 11/18/16 18:00, Thomas Danan wrote: > > I often read that small IO write and RBD are working better with > bigger filestore_max_sync_interval than default value. > > Default value is 5 sec and I saw many post saying they are using 30 sec. > > Also the slow request symptom is often linked to this
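
Trying the larger interval is either a ceph.conf entry or a runtime injection, 30 s being the figure quoted above:

    # ceph.conf, [osd] section:
    #   filestore max sync interval = 30
    ceph tell osd.* injectargs '--filestore_max_sync_interval 30'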

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-15 Thread Peter Maloney
6 different ceph RBD clients. > Snapshotting the RBD image is quite immediate while we are seeing the > issue continuously during the day... > > Will check all of this tomorrow... > > Thanks again > > Thomas

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-15 Thread Peter Maloney
On 11/15/16 14:05, Thomas Danan wrote: > Hi Peter, > > Ceph cluster version is 0.94.5 and we are running with Firefly tunables and > also we have 10K PGs instead of the 30K / 40K we should have. > The linux kernel version is 3.10.0-327.36.1.el7.x86_64 with RHEL 7.2 > > On our side we have the follow

Re: [ceph-users] ceph cluster having blocke requests very frequently

2016-11-15 Thread Peter Maloney
Which kernel version are you using? I have a similar issue... Ubuntu 14.04 kernel 3.13.0-96-generic, and ceph jewel 10.2.3. I get logs like this: 2016-11-15 13:13:57.295067 osd.9 10.3.0.132:6817/24137 98 : cluster [WRN] 16 slow requests, 5 included below; oldest blocked for > 7.957045 secs I set o
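
When chasing blocked requests like the one logged above, the usual places to look are (osd.9 taken from the log line; run the daemon commands on that OSD's host):

    ceph health detail                        # lists which OSDs have blocked requests
    ceph daemon osd.9 dump_ops_in_flight      # what those ops are currently waiting on
    ceph daemon osd.9 dump_historic_ops       # recent slow ops with per-stage timings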

[ceph-users] kernel versions and slow requests - WAS: Re: FW: Kernel 4.7 on OSD nodes

2016-11-15 Thread Peter Maloney
On 11/15/16 12:58, Оралов Алкексей wrote: > > > > Hello! > > > > I have problem with slow requests on kernel 4.4.0-45 , rolled back all > nodes to 4.4.0-42 > > > > Ubuntu 16.04.1 LTS (Xenial Xerus) > > ceph version 10.2.3 (ecc23778eb545d8dd55e2e4735b53cc93f92e65b) > > > Can you describe yo

Re: [ceph-users] ceph-mon crash after update from hammer 0.94.7 to jewel 10.2.3

2016-11-09 Thread Peter Maloney
aemons, > etc.) to ensure that they are all on the same release...

Re: [ceph-users] Question about writing a program that transfer snapshot diffs between ceph clusters

2016-11-01 Thread Peter Maloney
On 11/01/16 10:22, Peter Maloney wrote: > On 11/01/16 06:57, xxhdx1985126 wrote: >> Hi, everyone. >> >> I'm trying to write a program based on the librbd API that transfers >> snapshot diffs between ceph clusters without the need for a temporary >> stora

Re: [ceph-users] Question about writing a program that transfer snapshot diffs between ceph clusters

2016-11-01 Thread Peter Maloney
On 11/01/16 06:57, xxhdx1985126 wrote: > Hi, everyone. > > I'm trying to write a program based on the librbd API that transfers > snapshot diffs between ceph clusters without the need for a temporary > storage which is required if I use the "rbd export-diff" and "rbd > import-diff" pair. You don't
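
The export-diff/import-diff pair can already be piped without a temporary file, which is presumably what such a librbd program would replicate; image, snapshot and host names are placeholders:

    # destination image must already exist and carry snap1
    rbd export-diff --from-snap snap1 rbd/image@snap2 - | \
        ssh backuphost rbd import-diff - rbd/image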

Re: [ceph-users] After kernel upgrade OSD's on different disk.

2016-11-01 Thread Peter Maloney
On 11/01/16 00:10, jan hugo prins wrote: > After the kernel upgrade, I also upgraded the cluster to 10.2.3 from > 10.2.2. > Let's hope I only hit a bug and that this bug is now fixed, on the other > hand, I think I also saw the issue with a 10.2.3 node, but I'm not sure. It's not a bug for disks to

Re: [ceph-users] log file owner not right

2016-10-31 Thread Peter Maloney
On 10/31/16 05:56, xxhdx1985126 wrote: > Hi, everyone. > > Recently, I deployed a ceph cluster manually. And I found that, after > I start the ceph osd through "/etc/init.d/ceph -a start osd", the size > of the log file "ceph-osd.log" is 0, and its owner is not "ceph" which > I configured in /etc/ce
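
With sysvinit start scripts running the daemons as root, the usual fixes are to chown the ceph directories and/or have the daemons drop to the ceph user; a sketch, not the poster's eventual solution:

    chown -R ceph:ceph /var/log/ceph /var/lib/ceph
    # or start the daemon so it drops privileges itself:
    ceph-osd -i 0 --setuser ceph --setgroup ceph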

Re: [ceph-users] Ceph consultants?

2016-10-05 Thread Peter Maloney
What do you need help with specifically? Setting up ceph isn't very complicated... just fixing it when things go wrong should be. What type of scale are you working with, and do you already have hardware? Or is the problem more to do with integrating it with clients? On 10/05/16 20:16, Erick Perez

Re: [ceph-users] production cluster down :(

2016-09-30 Thread Peter Maloney
the switch config, so i had to > recreate all and forgot about the MTU on the uplink ports. Did you remember to save the current config as the bootup config? If not, then a reset of the switch will forget all your changes. > > Thank you!

Re: [ceph-users] Ceph Very Small Cluster

2016-09-29 Thread Peter Maloney
3rd osd, even an empty one (weight=0). (and I had an unresolved issue testing the same with cephfs on my virtual test cluster) > > BR, > Ranjan > > On 29.09.2016 at 13:00, Peter Maloney wrote: >> >> And also you could try: >> ceph osd down

Re: [ceph-users] Ceph Very Small Cluster

2016-09-29 Thread Peter Maloney
On 09/29/16 12:08, Ranjan Ghosh wrote: > Yes, all the pools have min_size 1: > > root@uhu2 /scripts # ceph osd lspools > 0 rbd,1 cephfs_data,2 cephfs_metadata, > root@uhu2 /scripts # ceph osd pool get cephfs_data min_size > min_size: 1 > root@uhu2 /scripts # ceph osd pool get cephfs_metadata min_si

Re: [ceph-users] crash of osd using cephfs jewel 10.2.2, and corruption

2016-09-21 Thread Peter Maloney
Samuel Just wrote: > Looks like the OSD didn't like an error return it got from the > underlying fs. Can you reproduce with > > debug filestore = 20 > debug osd = 20 > debug ms = 1 > > on the osd and post the whole log? > -Sam > > On Wed, Sep 21, 2016 at 12:10 AM,

[ceph-users] crash of osd using cephfs jewel 10.2.2, and corruption

2016-09-21 Thread Peter Maloney
Hi, I created a one disk osd with data and separate journal on the same lvm volume group just for test, one mon, one mds on my desktop. I managed to crash the osd just by mounting cephfs and doing cp -a of the linux-stable git tree into it. It crashed after copying 2.1G which only covers some of