Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-06-30 Thread Jan Schermer
Re: your previous question: I will not elaborate on this much more; I hope some of you will try it if you have NUMA systems and see for yourselves. But I can recommend some docs: http://globalsp.ts.fujitsu.com/dmsp/Publications/public/wp-ivy-bridge-ep-memory-performance-ww-en.pdf

Re: [ceph-users] Simple CephFS benchmark

2015-06-30 Thread Tuomas Juntunen
Hi. Our Ceph runs on the following hardware: 3 nodes with 36 OSDs and 18 SSDs (one SSD for two OSDs); each node has 64 GB mem & 2x 6-core CPUs; 4 monitors running on other servers; 40 Gbit InfiniBand with IPoIB. Here are my CephFS fio test results using the following file, changing the rw parameter [test
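
The actual job file is cut off in the archive preview. A minimal fio job of the kind described, with the rw parameter varied per run, might look like the sketch below; the mount point, file size, and queue depth are assumptions, not values from the original message.

    [cephfs-test]
    directory=/mnt/cephfs/fiotest   ; assumed CephFS mount point
    rw=randread                     ; changed per run: read, write, randread, randwrite
    bs=4k
    size=1g
    ioengine=libaio
    direct=1
    iodepth=64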

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Tuomas Juntunen
Hi For seq reads here's the latencies:
lat (usec) : 2=0.01%, 10=0.01%, 20=0.01%, 50=0.02%, 100=0.03%
lat (usec) : 250=1.02%, 500=87.09%, 750=7.47%, 1000=1.50%
lat (msec) : 2=0.76%, 4=1.72%, 10=0.19%, 20=0.19%
Random reads:
lat (usec) : 10=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 1

Re: [ceph-users] Ceph's RBD flattening and image options

2015-06-30 Thread Haomai Wang
On Tue, Jun 30, 2015 at 9:07 PM, Michał Chybowski wrote: > Hi, > > Lately I've been working on XEN RBD SM and I'm using RBD's built-in snapshot > functionality. > > My system looks like this: > base image -> snapshot -> snaphot is used to create XEN VM's volumes -> > volume snapshots (via rbd snap

Re: [ceph-users] Where is what type of IO generated?

2015-06-30 Thread Haomai Wang
On Wed, Jul 1, 2015 at 4:50 AM, Steffen Tilsch wrote: > Hello Cephers, > > I got some questions regarding where what type of IO is generated. > > > > As far as I understand it looks like this (please see picture: > http://imageshack.com/a/img673/4563/zctaGA.jpg ) : > > 1. Clients -> OSD (Journal):

Re: [ceph-users] CephFS posix test performance

2015-06-30 Thread Yan, Zheng
> On Jul 1, 2015, at 00:34, Dan van der Ster wrote: > > On Tue, Jun 30, 2015 at 11:37 AM, Yan, Zheng wrote: >> >>> On Jun 30, 2015, at 15:37, Ilya Dryomov wrote: >>> >>> On Tue, Jun 30, 2015 at 6:57 AM, Yan, Zheng wrote: I tried 4.1 kernel and 0.94.2 ceph-fuse. their performance are ab

Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-06-30 Thread Ray Sun
Jan, thanks a lot. I will contribute to this project if I can. Best Regards -- Ray On Tue, Jun 30, 2015 at 11:50 PM, Jan Schermer wrote: > Hi all, > our script is available on GitHub > > https://github.com/prozeta/pincpus > > I haven’t had much time to do a proper README, but I hope the

Re: [ceph-users] Simple CephFS benchmark

2015-06-30 Thread Mark Nelson
Two popular benchmarks in the HPC space for testing distributed file systems are IOR and mdtest. Both use MPI to coordinate processes on different clients. Another option may be to use fio or iozone. Netmist may also be an option, but I haven't used it myself and I'm not sure that it's fully
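
A rough sketch of how IOR and mdtest are typically invoked against a mounted CephFS; the rank count, transfer/block sizes, and mount point are assumptions and would need tuning for the cluster in question.

    # IOR: each MPI rank writes and then reads its own 1 GiB file in 1 MiB transfers
    mpirun -np 8 ior -w -r -F -t 1m -b 1g -o /mnt/cephfs/ior_testfile

    # mdtest: metadata (create/stat/unlink) benchmark with 1000 items per rank
    mpirun -np 8 mdtest -n 1000 -d /mnt/cephfs/mdtest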

[ceph-users] Where is what type of IO generated?

2015-06-30 Thread Steffen Tilsch
Hello Cephers, I have some questions about where which type of IO is generated. As far as I understand it looks like this (please see picture: http://imageshack.com/a/img673/4563/zctaGA.jpg ): 1. Clients -> OSD (Journal): - Is it sequential write? - Is it parallel due to the many open soc

[ceph-users] Simple CephFS benchmark

2015-06-30 Thread Hadi Montakhabi
I have set up a ceph storage cluster and I'd like to utilize the cephfs (I am assuming this is the only way one could use some other code without using the API). To do so, I have mounted my cephfs on the client node. I'd like to know what would be a good benchmark for measuring write and read perfo

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Stephen Mercier
I currently have about 250 VMs, ranging from 16GB to 2TB in size. What I found, after about a week of testing, sniffing, and observing, is that the larger read ahead buffer causes the VM to chunk reads over to ceph, and in doing so, allows it to better align with the 4MB block size that Ceph use

Re: [ceph-users] runtime Error for creating ceph MON via ceph-deploy

2015-06-30 Thread Alan Johnson
I use sudo visudo and then add a line under Defaults requiretty --> Defaults:<username> !requiretty, where <username> is the username. Hope this helps? Alan From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Vida Ahmadi Sent: Monday, June 22, 2015 6:31 AM To: ceph-users@lists.ceph.com Subj
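
As a sketch, the resulting sudoers fragment (edited via visudo) might look like this, assuming the deployment user is called "cephuser"; the NOPASSWD rule is the usual ceph-deploy companion setting, not something stated in the message above.

    Defaults:cephuser !requiretty
    cephuser ALL = (root) NOPASSWD:ALL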

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Mark Nelson
Seems reasonable. What's the latency distribution look like in your fio output file? Would be useful to know if it's universally slow or if some ops are taking much longer to complete than others. Mark On 06/30/2015 01:27 PM, Tuomas Juntunen wrote: I created a file which has the following p

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Tuomas Juntunen
I have already set readahead on the OSDs before; it is now 2048. This didn’t affect the random reads, but gave a lot more sequential performance. Br, T From: Somnath Roy [mailto:somnath@sandisk.com] Sent: 30 June 2015 21:00 To: Tuomas Juntunen; 'Stephen Mercier' Cc: 'ceph-users'
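
For reference, a sketch of how a 2048 KiB readahead is usually applied on the OSD data disks; the device names and udev rule path are assumptions.

    # runtime setting (value is in KiB)
    echo 2048 > /sys/block/sdb/queue/read_ahead_kb

    # persistent across reboots, e.g. /etc/udev/rules.d/80-osd-readahead.rules
    SUBSYSTEM=="block", KERNEL=="sd[b-z]", ACTION=="add|change", ATTR{queue/read_ahead_kb}="2048"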

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Tuomas Juntunen
I created a file which has the following parameters: [random-read] rw=randread size=128m directory=/root/asd ioengine=libaio bs=4k #numjobs=8 iodepth=64 Br, T -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: 30 June 2015 2
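
For readability, the same job file laid out the way fio reads it (content as quoted in the message; nothing added, the commented-out numjobs line kept as-is):

    [random-read]
    rw=randread
    size=128m
    directory=/root/asd
    ioengine=libaio
    bs=4k
    #numjobs=8
    iodepth=64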

[ceph-users] ceph osd out triggered the pg recovery process, but by the end, why not all pgs are active+clean?

2015-06-30 Thread Cory
Hi Ceph experts, I did some tests on my Ceph cluster recently with the following steps: 1. at the beginning, all PGs are active+clean; 2. stop an OSD; I observed a lot of PGs become degraded; 3. ceph osd out; 4. then I observed Ceph doing the recovery process. My question is: I expected that by the end, all pgs
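
A hedged sketch of the usual commands for finding out why the remaining PGs are not active+clean; the PG id below is a placeholder taken from the dump_stuck output.

    ceph health detail
    ceph pg dump_stuck unclean
    ceph pg 2.1f query        # inspect one of the stuck PGs reported above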

Re: [ceph-users] xattrs vs. omap with radosgw

2015-06-30 Thread Zhou, Yuan
FWIW, there was some discussion in OpenStack Swift and their performance tests showed 255 is not the best in recent XFS. They decided to use large xattr boundary size(65535). https://gist.github.com/smerritt/5e7e650abaa20599ff34 -Original Message- From: ceph-devel-ow...@vger.kernel.org

Re: [ceph-users] RGW access problem

2015-06-30 Thread I Kozin
Thank you Alex. This is useful. The Ceph documentation is a bit vague about the subject http://ceph.com/docs/master/radosgw/admin/#create-a-user whereas your link clearly states what is escaped by backslash. To answer your question, I was not using a parser. I was just copy/pasting. It has not imme

Re: [ceph-users] Explanation for "ceph osd set nodown" and "ceph osd cluster_snap"

2015-06-30 Thread Jan Schermer
Thanks. Nobody else knows anything about “cluster_snap”? It is mentioned in the docs, but that’s all… Jan > On 19 Jun 2015, at 12:49, Carsten Schmitt > wrote: > > Hi Jan, > > On 06/18/2015 12:48 AM, Jan Schermer wrote: >> 1) Flags available in ceph osd set are >> >> pause|noup|nodown|noout

[ceph-users] Performance issue.

2015-06-30 Thread Marcus Forness
Hi! Is anyone able to provide some tips on a performance issue on a newly installed all-flash Ceph cluster? When we do write tests we get 900MB/s write, but read tests are only 200MB/s. All servers are on 10GBit connections. [global] fsid = 453d2db9-c764-4921-8f3c-ee0f75412e19 mon_initial_members = ceph02,
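
Not part of the original message, but a common way to separate raw RADOS read performance from the client path is rados bench; the pool name and runtimes below are assumptions.

    # write test, keeping the objects around for the read tests
    rados bench -p testpool 60 write --no-cleanup

    # sequential and random reads of the objects written above
    rados bench -p testpool 60 seq
    rados bench -p testpool 60 rand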

[ceph-users] runtime Error for creating ceph MON via ceph-deploy

2015-06-30 Thread Vida Ahmadi
Hi all, I am a new user who wants to deploy a simple Ceph cluster. I started creating the Ceph monitor node via ceph-deploy and got an error: [*ceph_deploy*][*ERROR* ] RuntimeError: remote connection got closed, ensure ``requiretty`` is disabled for node1 I commented out requiretty and I have a password-less ac

Re: [ceph-users] low power single disk nodes

2015-06-30 Thread Xu (Simon) Chen
hth, Any idea what caused the pause? I am curious to know more details. Thanks. -Simon On Friday, April 10, 2015, 10 minus wrote: > Hi , > > Question is what do you want to use it for . As an OSD it wont cut it. > Maybe as an iscsi target and YMMV > > I played around with an OEM product from T

Re: [ceph-users] rbd_cache, limiting read on high iops around 40k

2015-06-30 Thread pushpesh sharma
Just an update: there seems to be no proper way to pass the iothread parameter from openstack-nova (at least not in the Juno release), so a single default iothread per VM is all we have. In conclusion, a Nova instance's max IOPS on Ceph RBD will be limited to 30-40K. On Tue, Jun 16, 2015 at 10:08 PM,
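
For completeness, the manual (non-Nova) way to hand a virtio disk its own iothread is via the libvirt domain XML; this is a sketch assuming a recent enough QEMU/libvirt and a virtio-blk disk, not something Nova generates.

    <domain type='kvm'>
      ...
      <iothreads>2</iothreads>
      <devices>
        <disk type='network' device='disk'>
          <driver name='qemu' type='raw' cache='writeback' iothread='1'/>
          ...
        </disk>
      </devices>
    </domain>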

Re: [ceph-users] v9.0.1 released

2015-06-30 Thread Yuri Weinstein
Sage, we are still running nightlies on next and branches. Just wanted to reaffirm that it is not yet time to start scheduling suites on "infernalis"? Thx YuriW - Original Message - From: "Sage Weil" To: ceph-annou...@ceph.com, ceph-de...@vger.kernel.org, ceph-us...@ceph.com, ceph-maint

[ceph-users] Ceph's RBD flattening and image options

2015-06-30 Thread Michał Chybowski
Hi, lately I've been working on XEN RBD SM and I'm using RBD's built-in snapshot functionality. My system looks like this: base image -> snapshot -> snapshot is used to create XEN VM's volumes -> volume snapshots (via rbd snap..) -> more VMs -> etc. I'd like to be able to delete one of th
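
The usual way to detach a clone from its parent chain is rbd flatten; a sketch with placeholder pool, image, and snapshot names (once a snapshot has no children left, it can be unprotected and removed):

    # which clones still depend on this snapshot?
    rbd children pool/base-image@snap1

    # copy the parent data into the clone so it no longer needs the chain
    rbd flatten pool/vm-volume

    # when no children remain, the snapshot can go away
    rbd snap unprotect pool/base-image@snap1
    rbd snap rm pool/base-image@snap1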

Re: [ceph-users] Unexpected disk write activity with btrfs OSDs

2015-06-30 Thread Jan Schermer
I don’t run Ceph on btrfs, but isn’t this related to the btrfs snapshotting feature ceph uses to ensure a consistent journal? Jan > On 19 Jun 2015, at 14:26, Lionel Bouton wrote: > > On 06/19/15 13:42, Burkhard Linke wrote: >> >> Forget the reply to the list... >> >> Forwarded Messa

Re: [ceph-users] 403-Forbidden error using radosgw

2015-06-30 Thread B, Naga Venkata
I am also having the same issue; can somebody help me out? But for me it is "HTTP/1.1 404 Not Found".

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Somnath Roy
Read_ahead_kb should help you in the case of a sequential workload, but if you are saying it helps your workload in the random case also, try it both on the VM side and on the OSD side and see if it makes any difference. Thanks & Regards Somnath From: Tuomas Juntunen [mailto:tuomas.juntu...@da

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Mark Nelson
Hi Tuomas, can you paste the command you ran to do the test? Thanks, Mark On 06/30/2015 12:18 PM, Tuomas Juntunen wrote: Hi It’s not probably hitting the disks, but that really doesn’t matter. The point is we have very responsive VM’s while writing and that is what the users will see. The io

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Tuomas Juntunen
Hi, this is something I was thinking about too, but it doesn’t take the problem away. Can you share your setup and how many VMs you are running? That would give us some starting point for sizing our setup. Thanks Br, Tuomas From: Stephen Mercier [mailto:stephen.merc...@attainia.com]

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Stephen Mercier
I ran into the same problem. What we did, and have been using since, is increased the read ahead buffer in the VMs to 16MB (The sweet spot we settled on after testing). This isn't a solution for all scenarios, but for our uses, it was enough to get performance inline with expectations. In Ubunt
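
The Ubuntu-specific part of the message is cut off; setting a 16 MiB readahead inside a VM is typically done like this (the device name is an assumption, and the setting has to be reapplied at boot, e.g. from rc.local or a udev rule):

    # read_ahead_kb is in KiB: 16 MiB = 16384
    echo 16384 > /sys/block/vda/queue/read_ahead_kb

    # equivalent with blockdev (value in 512-byte sectors: 16 MiB = 32768)
    blockdev --setra 32768 /dev/vda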

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Tuomas Juntunen
Hi, it’s probably not hitting the disks, but that really doesn’t matter. The point is we have very responsive VMs while writing, and that is what the users will see. The IOPS we get with sequential reads are good, but the random read is way too low. Is using SSDs as OSDs the only way to get

Re: [ceph-users] CephFS posix test performance

2015-06-30 Thread Dan van der Ster
On Tue, Jun 30, 2015 at 11:37 AM, Yan, Zheng wrote: > >> On Jun 30, 2015, at 15:37, Ilya Dryomov wrote: >> >> On Tue, Jun 30, 2015 at 6:57 AM, Yan, Zheng wrote: >>> I tried 4.1 kernel and 0.94.2 ceph-fuse. their performance are about the >>> same. >>> >>> fuse: >>> Files=191, Tests=1964, 60 wal

Re: [ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Somnath Roy
Break it down: try fio-rbd to see what performance you are getting. But I am really surprised you are getting > 100k IOPS for write; did you check that it is hitting the disks? Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Tuomas Juntunen Sen
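
A minimal fio job using the rbd ioengine (requires fio built with RBD support); the pool, image, and client names are placeholders and the test image must exist beforehand.

    [rbd-4k-randread]
    ioengine=rbd
    clientname=admin
    pool=rbd
    rbdname=fio-test
    rw=randread
    bs=4k
    iodepth=64
    direct=1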

Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-06-30 Thread Jan Schermer
Hi all, our script is available on GitHub https://github.com/prozeta/pincpus I haven’t had much time to do a proper README, but I hope the configuration is self explanatory enough for now. What it does is pin each OSD into the most “empty” cgroup assigned to
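
The script itself is in the linked repository; roughly, the pinning it automates boils down to cgroup cpuset operations like the following sketch (core/NUMA-node numbers and the pid-file path are assumptions):

    mkdir -p /sys/fs/cgroup/cpuset/osd.0
    echo 0-5 > /sys/fs/cgroup/cpuset/osd.0/cpuset.cpus   # cores on NUMA node 0
    echo 0   > /sys/fs/cgroup/cpuset/osd.0/cpuset.mems   # memory from node 0 only
    # move the whole OSD process (all threads) into the cgroup
    echo $(cat /var/run/ceph/osd.0.pid) > /sys/fs/cgroup/cpuset/osd.0/cgroup.procs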

[ceph-users] Very low 4k randread performance ~1000iops

2015-06-30 Thread Tuomas Juntunen
Hi, I have been trying to figure out why our 4k random reads in VMs are so bad. I am using fio to test this. Write: 170k IOPS; random write: 109k IOPS; read: 64k IOPS; random read: 1k IOPS. Our setup is: 3 nodes with 36 OSDs and 18 SSDs (one SSD for two OSDs); each node has 64 GB mem &

[ceph-users] CDS Jewel Wed/Thurs

2015-06-30 Thread Patrick McGarry
Hey cephers, Just a friendly reminder that our Ceph Developer Summit for Jewel planning is set to run tomorrow and Thursday. The schedule and dial in information is available on the new wiki: http://tracker.ceph.com/projects/ceph/wiki/CDS_Jewel Please let me know if you have any questions. Thank

Re: [ceph-users] Old vs New pool on same OSDs - Performance Difference

2015-06-30 Thread Nick Fisk
Answering the question myself, here are the contents of xattr for the object user.cephos.spill_out: 30 00 0. user.ceph._: 0F 08 05 01 00 00 04 03 41 00 00 00 00 00 00 00A... 0010 20 00 00 00 72 62 2E 30 2E 31 62 61 37 30

Re: [ceph-users] Old vs New pool on same OSDs - Performance Difference

2015-06-30 Thread Nick Fisk
> -Original Message- > From: Somnath Roy [mailto:somnath@sandisk.com] > Sent: 29 June 2015 23:29 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Subject: RE: [ceph-users] Old vs New pool on same OSDs - Performance > Difference > > Nick, > I think you are probably hitting the issue of

Re: [ceph-users] CephFS posix test performance

2015-06-30 Thread Yan, Zheng
> On Jun 30, 2015, at 15:37, Ilya Dryomov wrote: > > On Tue, Jun 30, 2015 at 6:57 AM, Yan, Zheng wrote: >> I tried 4.1 kernel and 0.94.2 ceph-fuse. their performance are about the >> same. >> >> fuse: >> Files=191, Tests=1964, 60 wallclock secs ( 0.43 usr 0.08 sys + 1.16 cusr >> 0.65 csys

[ceph-users] Node reboot -- OSDs not "logging off" from cluster

2015-06-30 Thread Daniel Schneller
Hi! We are seeing a strange - and problematic - behavior in our 0.94.1 cluster on Ubuntu 14.04.1. We have 5 nodes, 4 OSDs each. When rebooting one of the nodes (e.g. for a kernel upgrade) the OSDs do not seem to shut down correctly. Clients hang and ceph osd tree shows the OSDs of that node stil
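
Not a fix for OSDs that fail to announce their shutdown, but the usual reboot procedure that avoids hanging clients and needless rebalancing looks roughly like this (the upstart job names apply to Ubuntu 14.04 / hammer and are an assumption):

    ceph osd set noout          # don't start rebalancing while the node is down
    stop ceph-osd-all           # or: stop ceph-osd id=N, per OSD on this node
    reboot
    # after the OSDs have rejoined the cluster
    ceph osd unset noout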

Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-06-30 Thread Huang Zhiteng
On Tue, Jun 30, 2015 at 4:25 PM, Jan Schermer wrote: > Not having OSDs and KVMs compete against each other is one thing. > But there are more reasons to do this > > 1) not moving the processes and threads between cores that much (better > cache utilization) > 2) aligning the processes with memory

[ceph-users] adding a extra monitor with ceph-deploy

2015-06-30 Thread Makkelie, R (ITCDCC) - KLM
I'm trying to add an extra monitor with ceph-deploy; the current/first monitor was installed by hand. When I do ceph-deploy mon add HOST, the new monitor seems to assimilate the old monitor, so the old/first monitor is now in the same state as the new monitor and is not aware of anything. I needed t
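
One workaround is to add the monitor by hand instead of via ceph-deploy; a condensed sketch of the documented manual procedure, with "newmon" and the address used as placeholders:

    mkdir -p /var/lib/ceph/mon/ceph-newmon
    ceph auth get mon. -o /tmp/mon.keyring
    ceph mon getmap -o /tmp/monmap
    ceph-mon -i newmon --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring
    ceph-mon -i newmon --public-addr 192.168.0.2:6789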

Re: [ceph-users] RHEL 7.1 ceph-disk failures creating OSD with ver 0.94.2

2015-06-30 Thread HEWLETT, Paul (Paul)
We are using Ceph (Hammer) on CentOS 7 and RHEL 7.1 successfully. One secret is to ensure that the disk is cleaned prior to the ceph-disk command. Because GPT tables are used, one must use the 'sgdisk -Z' command to purge the disk of all partition tables. We usually issue this command in the RedHat kicks
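
A sketch of the cleanup-then-prepare sequence described above, with the device name as a placeholder:

    sgdisk -Z /dev/sdb            # zap GPT and MBR data structures
    ceph-disk prepare /dev/sdb
    ceph-disk activate /dev/sdb1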

Re: [ceph-users] How to use cgroup to bind ceph-osd to a specific cpu core?

2015-06-30 Thread Jan Schermer
Not having OSDs and KVMs compete against each other is one thing. But there are more reasons to do this 1) not moving the processes and threads between cores that much (better cache utilization) 2) aligning the processes with memory on NUMA systems (that means all modern dual socket systems) - y
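
Before pinning, the NUMA layout (which cores and memory belong to which node) can be checked with standard tools, e.g.:

    numactl --hardware      # nodes, their CPUs and memory sizes
    lscpu | grep -i numa    # quick per-node CPU list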

[ceph-users] which version of ceph with my kernel 3.14 ?

2015-06-30 Thread Pascal GREGIS
Hello, I installed Ceph Firefly (0.80) on my system last year. I run a 3.14.43 kernel (I upgraded it recently from 3.14.4). Ceph seems to be working well in most cases, though I haven't used it in real production mode as of now. The only thing I noticed recently was some Input/Output Erro

Re: [ceph-users] CephFS posix test performance

2015-06-30 Thread Ilya Dryomov
On Tue, Jun 30, 2015 at 6:57 AM, Yan, Zheng wrote: > I tried 4.1 kernel and 0.94.2 ceph-fuse. their performance are about the same. > > fuse: > Files=191, Tests=1964, 60 wallclock secs ( 0.43 usr 0.08 sys + 1.16 cusr > 0.65 csys = 2.32 CPU) > > kernel: > Files=191, Tests=2286, 61 wallclock se

Re: [ceph-users] krbd splitting large IO's into smaller IO's

2015-06-30 Thread Ilya Dryomov
On Tue, Jun 30, 2015 at 8:30 AM, Z Zhang wrote: > Hi Ilya, > > Thanks for your explanation. This makes sense. Will you make max_segments > configurable? Could you please point me to the fix you have made? We might help > to test it. [PATCH] rbd: bump queue_max_segments on ceph-devel. Thanks,
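
For anyone wanting to verify the behaviour on a mapped device, the relevant block-queue limits can be read from sysfs (rbd0 is a placeholder for the mapped image):

    cat /sys/block/rbd0/queue/max_segments
    cat /sys/block/rbd0/queue/max_sectors_kb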