Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Somnath Roy
Also, there has been a lot of discussion in the community about SSDs not being suitable for the Ceph write workload (with filestore), as they are not good at O_DIRECT/O_DSYNC kind of writes. Hope your SSDs are tolerant of that. -Original Message- From: Somnath Roy Sent: Friday, January 13, 2017 10:06 AM

Re: [ceph-users] All SSD cluster performance

2017-01-13 Thread Somnath Roy
<< Both OSDs are pinned to two cores on the system Is there any reason you are pinning OSDs like that? I would say for an object workload there is no need to pin OSDs. With the configuration you mentioned, Ceph with 4M object PUTs should be saturating your network first. Have you run say 4M object

Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Somnath Roy
I generally do a 1M seq write to fill up the device. Block size doesn’t matter here but bigger block size is faster to fill up and that’s why people use that. From: V Plus [mailto:v.plussh...@gmail.com] Sent: Sunday, December 11, 2016 7:03 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject

Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Somnath Roy
will be created beforehand. Thanks & Regards Somnath From: V Plus [mailto:v.plussh...@gmail.com] Sent: Sunday, December 11, 2016 6:01 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph performance is too good (impossible..)... Thanks Somnath! As you recommended, I exec

Re: [ceph-users] Ceph performance is too good (impossible..)...

2016-12-11 Thread Somnath Roy
Fill up the image with big writes (say 1M) first before reading and you should see sane throughput. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of V Plus Sent: Sunday, December 11, 2016 5:44 PM To: ceph-users@lists.ceph.com Subject: [ceph-users]
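
For illustration, the preconditioning step being recommended could look like the fio job below (pool/image names and client id are placeholders, assuming fio was built with rbd support; the image size is picked up from the image itself):

  # fill the whole image sequentially with 1M writes before running any read test
  fio --name=precondition --ioengine=rbd --clientname=admin --pool=rbd \
      --rbdname=testimg --rw=write --bs=1M --iodepth=16 --direct=1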

Re: [ceph-users] OSDs are flapping and marked down wrongly

2016-10-17 Thread Somnath Roy
To: Somnath Roy Cc: ceph-users@lists.ceph.com; ceph-de...@vger.kernel.org Subject: Re: [ceph-users] OSDs are flapping and marked down wrongly On Mon, Oct 17, 2016 at 3:16 PM, Somnath Roy <somnath@sandisk.com> wrote: > Hi Sage et. al, > > I know this issue is reported number of tim

Re: [ceph-users] OSDs are flapping and marked down wrongly

2016-10-17 Thread Somnath Roy
Somnath -Original Message- From: Piotr Dałek [mailto:bra...@predictor.org.pl] Sent: Monday, October 17, 2016 12:52 AM To: ceph-users@lists.ceph.com; Somnath Roy; ceph-de...@vger.kernel.org Subject: Re: OSDs are flapping and marked down wrongly On Mon, Oct 17, 2016 at 07:16:44AM +, Somnath

[ceph-users] OSDs are flapping and marked down wrongly

2016-10-17 Thread Somnath Roy
Hi Sage et. al, I know this issue has been reported a number of times in the community and attributed to either a network issue or unresponsive OSDs. Recently, we are seeing this issue when our all-SSD cluster (Jewel based) is stressed with large block sizes and very high QD. Lowering the QD, it is working just

Re: [ceph-users] Rbd map command doesn't work

2016-08-16 Thread Somnath Roy
This is the usual feature mismatch stuff; the in-box krbd you are using does not support Jewel. Try googling the error and I am sure you will find a lot of prior discussion around that.. From: EP Komarla [mailto:ep.koma...@flextronics.com] Sent: Tuesday, August 16, 2016 4:15 PM To: Somnath Roy

Re: [ceph-users] Rbd map command doesn't work

2016-08-16 Thread Somnath Roy
The default format of an rbd image in Jewel is 2, along with a bunch of other features enabled, so you have the following two options: 1. create a format 1 image with --image-format 1 2. Or, set this in the ceph.conf file under [client] or [global] before creating the image.. rbd_default_features = 3 Thanks & Regards
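
For illustration, the two options might look like this (pool/image name and size are placeholders):

  # option 1: create an old-style format 1 image explicitly
  rbd create --image-format 1 --size 10240 rbd/testimg

  # option 2: client-side ceph.conf, applied before creating images
  [client]
      rbd_default_features = 3   # layering + striping, krbd-compatible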

Re: [ceph-users] Multi-device BlueStore OSDs multiple fsck failures

2016-08-03 Thread Somnath Roy
the number of time we need to restart OSDs.. Thanks & Regards Somnath -Original Message- From: Gregory Farnum [mailto:gfar...@redhat.com] Sent: Wednesday, August 03, 2016 4:03 PM To: Somnath Roy Cc: Stillwell, Bryan J; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Multi-de

Re: [ceph-users] Multi-device BlueStore OSDs multiple fsck failures

2016-08-03 Thread Somnath Roy
Probably, it is better to move to the latest master and reproduce this defect. A lot of stuff has changed in between. This is a good test case and I doubt any of us are testing with fsck() enabled on mount/unmount. Thanks & Regards Somnath -Original Message- From: ceph-users

Re: [ceph-users] RocksDB compression

2016-07-28 Thread Somnath Roy
I am using snappy and it is working fine with Bluestore.. Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mark Nelson Sent: Thursday, July 28, 2016 2:03 PM To: ceph-users@lists.ceph.com Subject: Re: [ceph-users]

Re: [ceph-users] bluestore overlay write failure

2016-07-26 Thread Somnath Roy
Bluestore has evolved a long way and I don’t think we support this overlay anymore. Please try Bluestore with latest master.. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of ??? Sent: Tuesday, July 26, 2016 7:09 PM To: ceph-users@lists.ceph.com

Re: [ceph-users] Ceph performance pattern

2016-07-26 Thread Somnath Roy
<< Ceph performance in general (without read_ahead_kb) will be lower specially in all flash as the requests will be serialized within a PG I meant to say Ceph sequential performance..Sorry for the spam.. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Somnath Ro

Re: [ceph-users] Ceph performance pattern

2016-07-26 Thread Somnath Roy
will be serialized within a PG. Our test is with all flash though and take my comments with a grain of salt in case of ceph + HDD.. Thanks & Regards Somnath From: EP Komarla [mailto:ep.koma...@flextronics.com] Sent: Tuesday, July 26, 2016 4:50 PM To: Somnath Roy; ceph-users@lists.ceph.com Subject: RE:

Re: [ceph-users] Ceph performance pattern

2016-07-26 Thread Somnath Roy
Which OS/kernel are you running with? Try setting a bigger read_ahead_kb for sequential runs. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of EP Komarla Sent: Tuesday, July 26, 2016 4:38 PM To: ceph-users@lists.ceph.com Subject: [ceph-users] Ceph
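
A minimal sketch of raising the read-ahead on a block device (device name and value are illustrative; apply it to the relevant devices on the client and/or OSD nodes):

  cat /sys/block/sdb/queue/read_ahead_kb            # check the current value
  echo 2048 > /sys/block/sdb/queue/read_ahead_kb    # raise it for sequential workloads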

Re: [ceph-users] Too much pgs backfilling

2016-07-19 Thread Somnath Roy
The settings are per OSD and the messages you are seeing are aggregated across the cluster, with multiple OSDs doing backfill (working on multiple PGs in parallel).. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jimmy Goffaux Sent: Tuesday, July 19,

Re: [ceph-users] Multi-device BlueStore testing

2016-07-19 Thread Somnath Roy
I don't think ceph-disk has support for separating block.db and block.wal yet (?). You need to create the cluster manually by running mkfs. Or if you have the old mkcephfs script (which is sadly deprecated) you can point it at the db/wal paths and it will create the cluster for you. I am using that to configure

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-14 Thread Somnath Roy
From: Garg, Pankaj [mailto:pankaj.g...@cavium.com] Sent: Thursday, July 14, 2016 10:05 AM To: Somnath Roy; ceph-users@lists.ceph.com Subject: RE: Terrible RBD performance with Jewel Something in this section is causing all the 0 IOPS issue. Have not been able to nail down it yet. (I did comment out

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Somnath Roy
sday, July 13, 2016 6:40 PM To: Somnath Roy; ceph-users@lists.ceph.com Subject: RE: Terrible RBD performance with Jewel I agree, but I'm dealing with something else out here with this setup. I just ran a test, and within 3 seconds my IOPS went to 0, and stayed there for 90 secondsthen sta

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Somnath Roy
You should do that first to get stable performance out of filestore. A 1M seq write over the entire image should be sufficient to precondition it. From: Garg, Pankaj [mailto:pankaj.g...@cavium.com] Sent: Wednesday, July 13, 2016 6:04 PM To: Somnath Roy; ceph-users@lists.ceph.com Subject: RE

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Somnath Roy
...@cavium.com] Sent: Wednesday, July 13, 2016 5:55 PM To: Somnath Roy; ceph-users@lists.ceph.com Subject: RE: Terrible RBD performance with Jewel Thanks Somnath. I will try all these, but I think there is something else going on too. Firstly my test reaches 0 IOPS within 10 seconds sometimes. Secondly

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Somnath Roy
Also increase the following.. filestore_op_threads From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Somnath Roy Sent: Wednesday, July 13, 2016 5:47 PM To: Garg, Pankaj; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Terrible RBD performance with Jewel Pankaj

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-13 Thread Somnath Roy
Pankaj, it could be related to the new throttle parameters introduced in Jewel. By default these throttles are off; you need to tweak them according to your setup. What are your journal size and fio block size? If it is the default 5GB, with the rate you mentioned (assuming 4K RW) and considering 3X

Re: [ceph-users] Mounting Ceph RBD image to XenServer 7 as SR

2016-06-30 Thread Somnath Roy
It seems your client kernel is pretty old? Either upgrade your kernel to 3.15 or later, or disable CRUSH_TUNABLES3: ceph osd crush tunables bobtail or ceph osd crush tunables legacy should help. This will start rebalancing and you will also lose the improvements added in Firefly. So,

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-23 Thread Somnath Roy
Oops , typo , 128 GB :-)... -Original Message- From: Christian Balzer [mailto:ch...@gol.com] Sent: Thursday, June 23, 2016 5:08 PM To: ceph-users@lists.ceph.com Cc: Somnath Roy; Warren Wang - ISD; Wade Holler; Blair Bethwaite; Ceph Development Subject: Re: [ceph-users] Dramatic

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-23 Thread Somnath Roy
Or even vm.vfs_cache_pressure = 0 if you have sufficient memory to *pin* inodes/dentries in memory. We have been using that for a long time now (with 128 TB node memory) and it seems to help, especially for the random write workload, saving the xattr reads in between. Thanks & Regards Somnath -Original
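
A sketch of the sysctl being referred to (only sensible with plenty of spare RAM, since 0 tells the kernel to essentially never reclaim dentries/inodes):

  sysctl -w vm.vfs_cache_pressure=0        # runtime
  # persistent: add "vm.vfs_cache_pressure = 0" to /etc/sysctl.conf or /etc/sysctl.d/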

Re: [ceph-users] rbd ioengine for fio

2016-06-16 Thread Somnath Roy
What is your fio script? Make sure you do this.. 1. Run say 'ceph -s' from the server you are trying to connect from and see if it connects properly or not. If so, you don't have any keyring issues. 2. Now, make sure you have given the following param values properly based on your setup.
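
As a rough example, a minimal fio job file for the rbd ioengine might look like this (pool/image/client names are placeholders; the keyring for the given client must be readable by the user running fio):

  [global]
  ioengine=rbd
  clientname=admin      ; uses /etc/ceph/ceph.client.admin.keyring
  pool=rbd
  rbdname=testimg
  direct=1

  [randwrite-4k]
  rw=randwrite
  bs=4k
  iodepth=32
  runtime=120
  time_based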

Re: [ceph-users] Fio randwrite does not work on Centos 7.2 VM

2016-06-15 Thread Somnath Roy
You ran out of the fd limit..Increase it with ulimit.. From: Mansour Shafaei Moghaddam [mailto:mansoor.shaf...@gmail.com] Sent: Wednesday, June 15, 2016 2:08 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Fio randwrite does not work on Centos 7.2 VM It fails at "FileSto
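
For example (values are illustrative), assuming the limit is being hit by the process driving the load:

  ulimit -n              # check the current per-process limit
  ulimit -n 65536        # raise it for the current shell before re-running the test
  # for a permanent change, add nofile entries to /etc/security/limits.conf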

Re: [ceph-users] Fio randwrite does not work on Centos 7.2 VM

2016-06-15 Thread Somnath Roy
There should be a line in the log specifying which assert is failing , post that along with say 10 lines from top of that.. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mansour Shafaei Moghaddam Sent: Wednesday, June 15, 2016 1:57 PM To:

Re: [ceph-users] Ceph Pool JERASURE issue.

2016-06-01 Thread Somnath Roy
You need to either change the failure domain to osd or have at least 5 hosts to satisfy the host failure domain. Since the failure domain is not satisfied, PGs are undersized and degraded.. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Khang Nguy?n
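
As a sketch, on a Jewel-era release the osd failure domain would be set when creating the erasure-code profile and pool, roughly like this (profile/pool names and k/m values are placeholders; later releases renamed the option to crush-failure-domain):

  ceph osd erasure-code-profile set myprofile k=3 m=2 ruleset-failure-domain=osd
  ceph osd pool create ecpool 128 128 erasure myprofile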

Re: [ceph-users] NVRAM cards as OSD journals

2016-05-24 Thread Somnath Roy
If you are not tweaking ceph.conf settings when using NVRAM as the journal, I would highly recommend trying the following. 1. Since you have a very small journal, try to reduce filestore_max_sync_interval/min_sync_interval significantly. 2. If you are using Jewel, there are a bunch of filestore
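
A hedged ceph.conf sketch of point 1 (the values are illustrative starting points, not recommendations):

  [osd]
      # flush the small NVRAM journal to the filestore much more frequently
      filestore_min_sync_interval = 0.01
      filestore_max_sync_interval = 1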

Re: [ceph-users] Public and Private network over 1 interface

2016-05-23 Thread Somnath Roy
4MB block size EC-object use case scenario (Mostly for reads , not so much for writes) we saw some benefit separating public/cluster network for 40GbE. We didn’t use two NIC though. We configured two ports on a NIC. Both network can give up to 48Gb/s but with Mellanox card/Mellanox switch

Re: [ceph-users] using jemalloc in trusty

2016-05-23 Thread Somnath Roy
Yes, if you are using do_autogen , use -J option. If you are using config files directly , use --with-jemalloc -Original Message- From: Luis Periquito [mailto:periqu...@gmail.com] Sent: Monday, May 23, 2016 7:44 AM To: Somnath Roy Cc: Ceph Users Subject: Re: [ceph-users] using jemalloc

Re: [ceph-users] using jemalloc in trusty

2016-05-23 Thread Somnath Roy
You need to build ceph code base to use jemalloc for OSDs..LD_PRELOAD won't work.. Thanks & regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Luis Periquito Sent: Monday, May 23, 2016 7:30 AM To: Ceph Users Subject: [ceph-users]

Re: [ceph-users] OSD process doesn't die immediately after device disappears

2016-05-17 Thread Somnath Roy
Hi Marcel, FileStore doesn't subscribe for any such event from the device. Presently, it is relying on filesystem (for the FileStore assert) to return back error during IO and based on the error it is giving an assert. FileJournal assert you are getting in the aio path is relying on linux aio

Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread Somnath Roy
BTW, I am not saying latest tcmalloc will fix the issue , but worth trying. Thanks & Regards Somnath From: David [mailto:da...@visions.se] Sent: Friday, May 13, 2016 7:49 AM To: Somnath Roy Cc: ceph-users Subject: Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2 Linux osd11.storage 3.16

Re: [ceph-users] Weighted Priority Queue testing

2016-05-13 Thread Somnath Roy
Thanks Christian for the input. I will start digging the code and look for possible explanation. Regards Somnath -Original Message- From: Christian Balzer [mailto:ch...@gol.com] Sent: Thursday, May 12, 2016 11:52 PM To: Somnath Roy Cc: Scottix; ceph-users@lists.ceph.com; Nick Fisk

Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread Somnath Roy
What is the exact kernel version? Ubuntu has a newer tcmalloc incorporated from the 3.16.0.50 kernel onwards. If you are using an older kernel than that, it is better to upgrade the kernel or try building the latest tcmalloc and see if this is still happening there. Ceph is not packaging tcmalloc; it is using the

Re: [ceph-users] Weighted Priority Queue testing

2016-05-12 Thread Somnath Roy
FYI, in my test I used osd_max_backfills = 10, which is the Hammer default. Post-Hammer it has been changed to 1. Thanks & Regards Somnath -Original Message- From: Christian Balzer [mailto:ch...@gol.com] Sent: Thursday, May 12, 2016 10:40 PM To: Scottix Cc: Somnath Roy; ceph-users@lists.ceph
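
For reference, the knob can be pinned explicitly instead of relying on the release default; a sketch with illustrative values:

  # ceph.conf
  [osd]
      osd_max_backfills = 1

  # or injected into a running cluster
  ceph tell osd.* injectargs '--osd-max-backfills 1'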

Re: [ceph-users] Weighted Priority Queue testing

2016-05-11 Thread Somnath Roy
ds Somnath -Original Message- From: Mark Nelson [mailto:mnel...@redhat.com] Sent: Wednesday, May 11, 2016 5:16 AM To: Somnath Roy; Nick Fisk; Ben England; Kyle Bader Cc: Sage Weil; Samuel Just; ceph-users@lists.ceph.com Subject: Re: Weighted Priority Queue testing > 1. First scenario, only 4

Re: [ceph-users] Weighted Priority Queue testing

2016-05-11 Thread Somnath Roy
1 AM To: Somnath Roy Cc: Mark Nelson; Nick Fisk; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Weighted Priority Queue testing Hello, not sure if the Cc: to the users ML was intentional or not, but either way. The issue seen in the tracker: http://tracker.ceph.com/issues/15763 and what you

Re: [ceph-users] Weighted Priority Queue testing

2016-05-11 Thread Somnath Roy
. Thanks & Regards Somnath -Original Message- From: Somnath Roy Sent: Wednesday, May 04, 2016 11:47 AM To: 'Mark Nelson'; Nick Fisk; Ben England; Kyle Bader Cc: Sage Weil; Samuel Just Subject: RE: Weighted Priority Queue testing Thanks Mark, I will come back to you with some

Re: [ceph-users] performance drop a lot when running fio mix read/write

2016-05-02 Thread Somnath Roy
Yes, reads will be affected a lot for mix read/write scenarios as Ceph is serializing ops on a PG. Write path is inefficient and that is affecting reads in turn. Hope you are following all the config settings (shards/threads, pg numbers etc. etc.) already discussed in the community. You may

Re: [ceph-users] OSD Crashes

2016-04-29 Thread Somnath Roy
Check system log and search for the corresponding drive. It should have the information what is failing.. Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, Pankaj Sent: Friday, April 29, 2016 8:59 AM To: Samuel

Re: [ceph-users] krbd map on Jewel, sysfs write failed when rbd map

2016-04-26 Thread Somnath Roy
By default the image format is 2 in Jewel, which is not supported by krbd..try creating the image with --image-format 1 and it should be resolved.. Thanks Somnath Sent from my iPhone On Apr 25, 2016, at 9:38 PM, "wd_hw...@wistron.com"

Re: [ceph-users] On-going Bluestore Performance Testing Results

2016-04-22 Thread Somnath Roy
Yes, kernel should do read ahead , it's a block device setting..but if there is something extra xfs is doing for seq workload , not sure... Sent from my iPhone > On Apr 22, 2016, at 8:54 AM, Jan Schermer wrote: > > Having correlated graphs of CPU and block device usage would

Re: [ceph-users] Scrubbing a lot

2016-03-29 Thread Somnath Roy
We faced this issue too and figured out that in Jewel the default image format is 2. Not sure if changing the default was a good idea though, as almost all the LTS releases ship with older kernels and will face this incompatibility issue. Thanks & Regards Somnath -Original

Re: [ceph-users] SSD and Journal

2016-03-15 Thread Somnath Roy
Yes, if you can manage the *cost*, separating the journal onto a different device should improve write performance. But you need to evaluate how many OSD journals you can place on a single journal device, as at some point it will be bottlenecked by that journal device's bandwidth. Thanks & Regards Somnath From:

Re: [ceph-users] INFARNALIS with 64K Kernel PAGES

2016-03-01 Thread Somnath Roy
: - https://github.com/ceph/ceph/blob/infernalis/src/os/FileJournal.cc#L151 Which is hard coded 4096 Not sure why this is changed, Sam/Sage ? Thanks & Regards Somnath From: Garg, Pankaj [mailto:pankaj.g...@caviumnetworks.com] Sent: Tuesday, March 01, 2016 9:34 PM To: Somnath Roy; ceph-u

Re: [ceph-users] INFARNALIS with 64K Kernel PAGES

2016-03-01 Thread Somnath Roy
Did you recreate the OSDs on this setup, meaning did you do mkfs with the 64K page size? From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Garg, Pankaj Sent: Tuesday, March 01, 2016 9:07 PM To: ceph-users@lists.ceph.com Subject: [ceph-users] INFARNALIS with 64K Kernel PAGES Hi,

Re: [ceph-users] SSD Journal Performance Priorties

2016-02-26 Thread Somnath Roy
You need to make sure SSD O_DIRECT|O_DSYNC performance is good. Not all SSDs are good at it..Refer to the prior discussions in the community for that. << Presumably as long as the SSD read speed exceeds that of the spinners, that is sufficient. You probably meant the write speed of SSDs? Journal
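
A commonly used way to check a journal SSD's O_DIRECT|O_DSYNC behaviour is a small fio job like the sketch below (the device name is a placeholder; this writes to the raw device, so only run it on a disk whose contents you can destroy):

  fio --name=journal-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 \
      --runtime=60 --time_based --group_reporting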

Re: [ceph-users] How to recover from OSDs full in small cluster

2016-02-17 Thread Somnath Roy
If you are not sure about what weight to put , ‘ceph osd reweight-by-utilization’ should also do the job for you automatically.. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: Wednesday, February 17, 2016 12:48 PM To: Lukáš

Re: [ceph-users] Extra RAM to improve OSD write performance ?

2016-02-14 Thread Somnath Roy
I doubt it will do much good in the case of a 100% write workload. You can tweak your VM dirty ratio settings to help buffered writes, but the downside is that the more data it has to sync (when eventually flushing dirty buffers), the more spikiness it will induce..The write behavior won't be

Re: [ceph-users] Fwd: HEALTH_WARN pool vol has too few pgs

2016-02-03 Thread Somnath Roy
You can increase it, but that will trigger rebalancing, and depending on the amount of data it will take some time before the cluster comes back to a clean state. Client IO performance will be affected during this. BTW, this is not really an error, it is a warning, because performance on that pool will be
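
For illustration, on a Jewel-era cluster raising the PG count of the warned pool is a two-step change (the counts are placeholders; bump in modest increments to limit the rebalancing impact):

  ceph osd pool set vol pg_num 256
  ceph osd pool set vol pgp_num 256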

Re: [ceph-users] SSD Journal

2016-01-28 Thread Somnath Roy
Hi, Ceph needs to maintain a journal in the case of filestore, as underlying filesystems like XFS *don't have* any transactional semantics. Ceph has to do a transactional write of data and metadata in the write path. It does so in the following way. 1. It creates a transaction object having

Re: [ceph-users] SSD Journal

2016-01-28 Thread Somnath Roy
<mailto:j...@schermer.cz] Sent: Thursday, January 28, 2016 3:51 PM To: Somnath Roy Cc: Tyler Bishop; ceph-users@lists.ceph.com Subject: Re: SSD Journal Thanks for a great walkthrough explanation. I am not really going to (and capable) of commenting on everything but.. see below On 28 Jan 2

Re: [ceph-users] optimized SSD settings for hammer

2016-01-25 Thread Somnath Roy
] Sent: Monday, January 25, 2016 12:09 AM To: Somnath Roy; ceph-users@lists.ceph.com Subject: Re: optimized SSD settings for hammer Am 25.01.2016 um 08:54 schrieb Somnath Roy: > ms_nocrc options is changed to the following in Hammer.. > > ms_crc_data = false > ms_crc_he

Re: [ceph-users] optimized SSD settings for hammer

2016-01-24 Thread Somnath Roy
is reduced significantly and you may want to turn back on.. Thanks & Regards Somnath -Original Message- From: Stefan Priebe - Profihost AG [mailto:s.pri...@profihost.ag] Sent: Sunday, January 24, 2016 11:48 PM To: ceph-users@lists.ceph.com Cc: Somnath Roy Subject: optimized SSD sett

[ceph-users] Ceph scale testing

2016-01-20 Thread Somnath Roy
Hi, Here is the copy of the ppt I presented in today's performance meeting.. https://docs.google.com/presentation/d/1j4Lcb9fx0OY7eQlQ_iUI6TPVJ6t_orZWKJyhz0S_3ic/edit?usp=sharing Thanks & Regards Somnath ___ ceph-users mailing list

Re: [ceph-users] SSD OSDs - more Cores or more GHz

2016-01-20 Thread Somnath Roy
Yes, thanks for the data.. BTW, Nick, do we know what is more important: more CPU cores or higher frequency? For example, we have Xeon CPUs available with a bit lower frequency but more cores/socket, so which one should we be going with for OSD servers? Thanks & Regards Somnath

Re: [ceph-users] OSD size and performance

2016-01-03 Thread Somnath Roy
ards Somnath From: gjprabu [mailto:gjpr...@zohocorp.com] Sent: Sunday, January 03, 2016 10:53 PM To: gjprabu Cc: Somnath Roy; ceph-users; Siva Sokkumuthu Subject: Re: [ceph-users] OSD size and performance Hi Somnath, Just check the below details and let us know do you need any o

Re: [ceph-users] more performance issues :(

2015-12-30 Thread Somnath Roy
Sent: Wednesday, December 30, 2015 2:54 AM To: Somnath Roy Cc: Tyler Bishop; ceph-users@lists.ceph.com Subject: Re: more performance issues :( Hi all, again thanks for all the suggestions.. I have now narrowed it down to this problem: Data gets written to journal (SSD), but the journal, when flus

Re: [ceph-users] My OSDs are down and not coming UP

2015-12-29 Thread Somnath Roy
& Regards Somnath From: Jan Schermer [mailto:j...@schermer.cz] Sent: Tuesday, December 29, 2015 3:32 PM To: Ing. Martin Samek Cc: Somnath Roy; ceph-users@lists.ceph.com Subject: Re: [ceph-users] My OSDs are down and not coming UP Just try putting something like the following in ceph.conf: [gl

Re: [ceph-users] My OSDs are down and not coming UP

2015-12-29 Thread Somnath Roy
--pid-file /run/ceph/osd.0.pid -c /etc/ceph/ceph.conf starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal 2015-12-29 00:18:05.878954 7fd9892e7800 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of ai

Re: [ceph-users] My OSDs are down and not coming UP

2015-12-29 Thread Somnath Roy
PM To: Jan Schermer Cc: Somnath Roy; ceph-users@lists.ceph.com Subject: Re: [ceph-users] My OSDs are down and not coming UP Hi, No, never. It is my first attempt, first ceph cluster i try ever run. im not sure, if "mon initial members" should contain mon servers ids or hostnames

Re: [ceph-users] My OSDs are down and not coming UP

2015-12-29 Thread Somnath Roy
May be try commenting out mon_initial_members (or give mon host name) and see..It is certainly not correct as Jan pointed out.. From: Ing. Martin Samek [mailto:samek...@fel.cvut.cz] Sent: Tuesday, December 29, 2015 3:16 PM To: Somnath Roy; Jan Schermer Cc: ceph-users@lists.ceph.com Subject: Re

Re: [ceph-users] OSD size and performance

2015-12-29 Thread Somnath Roy
FYI , we are using 8TB SSD drive as OSD and not seeing any problem so far. Failure domain could be a concern for bigger OSDs. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of gjprabu Sent: Tuesday, December 29, 2015 9:38 PM To: ceph-users Cc:

Re: [ceph-users] My OSDs are down and not coming UP

2015-12-28 Thread Somnath Roy
It could be a network issue..Maybe related to MTU (?)..Try running with debug_ms = 1 and see if you find anything..Also, try running a command like 'traceroute' and see if it reports any errors.. Thanks & Regards Somnath -Original Message- From: ceph-users
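
One quick way to sanity-check MTU between nodes is a do-not-fragment ping sized for the expected MTU (the peer address is a placeholder; 8972 = 9000 minus 28 bytes of IP/ICMP headers for jumbo frames):

  ping -M do -s 8972 -c 3 <peer-ip>    # fails with "message too long" if a hop drops jumbo frames
  traceroute <peer-ip>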

Re: [ceph-users] more performance issues :(

2015-12-26 Thread Somnath Roy
FYI, osd_op_threads is not used in the main IO path anymore (since Giant). I don't think increasing it will do any good. If you want to tweak threads in the IO path, play with the following two: osd_op_num_threads_per_shard and osd_op_num_shards. But it may not be the problem with writes..Default value
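
A hedged ceph.conf sketch of those two settings (the values are illustrative; the right numbers depend on core count and on flash vs. HDD):

  [osd]
      osd_op_num_shards = 10
      osd_op_num_threads_per_shard = 2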

Re: [ceph-users] RGW pool contents

2015-12-22 Thread Somnath Roy
Thanks for responding back, unfortunately Cosbench setup is not there.. Good to know that there are cleanup steps for Cosbench data. Regards Somnath From: ghislain.cheval...@orange.com [mailto:ghislain.cheval...@orange.com] Sent: Tuesday, December 22, 2015 11:28 PM To: Somnath Roy; ceph-users

Re: [ceph-users] does anyone know what xfsaild and kworker are?they make osd disk busy. produce 100-200iops per osd disk?

2015-12-02 Thread Somnath Roy
104165 xfs_buf_rele: > 108383 xfs_iunlock: > > Could you please give me another hint? :) Thanks! > > On 2015年12月02日 05:14, Somnath Roy wrote: >> Sure..The following settings helped me minimizing the effect a bit >> for the PR https://github.com/ceph/ceph/pull/6670

Re: [ceph-users] does anyone know what xfsaild and kworker are?they make osd disk busy. produce 100-200iops per osd disk?

2015-12-01 Thread Somnath Roy
rker are?they make osd disk busy. produce 100-200iops per osd disk? On 2015年12月02日 01:31, Somnath Roy wrote: > This is xfs metadata sync process...when it is waking up and there are lot of > data to sync it will throttle all the process accessing the drive...There are > some xfs settin

Re: [ceph-users] does anyone know what xfsaild and kworker are?they make osd disk busy. produce 100-200iops per osd disk?

2015-12-01 Thread Somnath Roy
This is the xfs metadata sync process...when it wakes up and there is a lot of data to sync, it will throttle all the processes accessing the drive...There are some xfs settings to control the behavior, but you can't stop it Sent from my iPhone >> On Dec 1, 2015, at 8:26 AM, flisky

Re: [ceph-users] Ceph OSD: Memory Leak problem

2015-11-29 Thread Somnath Roy
It could be a network issue in your environment.. First thing to check is MTU (if you have changed it) and run tool like traceroute to see if all the cluster nodes are reachable from each other.. Thanks & Regards Somnath From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of

Re: [ceph-users] RGW pool contents

2015-11-26 Thread Somnath Roy
-users-boun...@lists.ceph.com] On Behalf Of Wido den Hollander Sent: Wednesday, November 25, 2015 10:56 PM To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] RGW pool contents On 11/24/2015 08:48 PM, Somnath Roy wrote: > Hi Yehuda/RGW experts, > > I have one cluster with RGW up an

[ceph-users] RGW pool contents

2015-11-24 Thread Somnath Roy
Hi Yehuda/RGW experts, I have one cluster with RGW up and running in the customer site. I did some heavy performance testing on that with CosBench and as a result written significant amount of data to showcase performance on that. Over time, customer also wrote significant amount of data using S3

Re: [ceph-users] about PG_Number

2015-11-13 Thread Somnath Roy
Use the following link, it should give an idea about pg_num. http://ceph.com/pgcalc/ The number of PGs/OSD has implications for the number of TCP connections in the system along with some CPU/memory resources. So, if you have lots of PGs/OSD it may degrade performance, I think mainly because of
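
The rule of thumb behind that calculator is roughly the following (a sketch, not an exact prescription):

  total PGs across all pools ≈ (number of OSDs * 100) / replica count, rounded up to a power of two
  example: 40 OSDs, 3x replication -> 40 * 100 / 3 ≈ 1333 -> 2048 PGs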

Re: [ceph-users] iSCSI over RDB is a good idea ?

2015-11-04 Thread Somnath Roy
We are using SCST over RBD and not seeing much of a degradation...Need to make sure you tune SCST properly and use multiple session.. Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Hugo Slabbert Sent: Wednesday,

Re: [ceph-users] Understanding the number of TCP connections between clients and OSDs

2015-11-04 Thread Somnath Roy
Hope this will be helpful.. Total connections per OSD = (target PGs per OSD) * (# of pool replicas) * 3 + (2 * #clients) + (min_hb_peer) # of pool replicas = configurable, default is 3 3 = the number of data communication messengers (cluster, hb_backend, hb_frontend) min_hb_peer = default
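
Plugging illustrative numbers into that formula (100 PGs per OSD, 3 replicas, 10 clients, a heartbeat minimum of 10 peers), and reading the client term as 2 connections per client:

  100 * 3 * 3 + 2 * 10 + 10 = 900 + 20 + 10 = 930 connections per OSD (an order-of-magnitude estimate)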

Re: [ceph-users] BAD nvme SSD performance

2015-10-26 Thread Somnath Roy
One thing, *don't* trust iostat disk util% in case of SSDs..100% doesn't mean you are saturating SSDs there..I have seen a large performance delta even if iostat is reporting 100% disk util in both the cases. Also, the ceph.conf file you are using is not optimal..Try to add these..

Re: [ceph-users] BAD nvme SSD performance

2015-10-26 Thread Somnath Roy
boun...@lists.ceph.com] On Behalf Of Somnath Roy Sent: Monday, October 26, 2015 9:20 AM To: Christian Balzer; ceph-users@lists.ceph.com Subject: Re: [ceph-users] BAD nvme SSD performance One thing, *don't* trust iostat disk util% in case of SSDs..100% doesn't mean you are saturating SSDs there..I have

Re: [ceph-users] Ceph journal - isn't it a bit redundant sometimes?

2015-10-14 Thread Somnath Roy
Jan, Journal helps FileStore to maintain the transactional integrity in the event of a crash. That's the main reason. Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: Wednesday, October 14, 2015 2:28

Re: [ceph-users] Ceph journal - isn't it a bit redundant sometimes?

2015-10-14 Thread Somnath Roy
there and only trimming from journal when it is actually applied (all the operation executed) and persisted in the backend. Thanks & Regards Somnath -Original Message- From: Jan Schermer [mailto:j...@schermer.cz] Sent: Wednesday, October 14, 2015 9:06 AM To: Somnath Roy Cc: ceph-u

Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-13 Thread Somnath Roy
Wang [mailto:haomaiw...@gmail.com] Sent: Monday, October 12, 2015 11:35 PM To: Somnath Roy Cc: Mark Nelson; ceph-devel; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results On Tue, Oct 13, 2015 at 12:18 PM, Somnath Roy <somn

Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-12 Thread Somnath Roy
Mark, Thanks for this data. This means probably simple messenger (not OSD core) is not doing optimal job of handling memory. Haomai, I am not that familiar with Async messenger code base, do you have an explanation of the behavior (like good performance with default tcmalloc) Mark reported ?

Re: [ceph-users] Ceph, SSD, and NVMe

2015-09-30 Thread Somnath Roy
ely to blow up in > our faces, it would be better to leave well enough alone. The current > performance is by no means bad, we're just always greedy for more. :) > ) > > Thanks for any advice/suggestions! Hi David, The single biggest performance improvement we've seen for SSDs has

Re: [ceph-users] OSD reaching file open limit - known issues?

2015-09-25 Thread Somnath Roy
Yes, known issue, make sure your system open file limit is pretty high.. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jan Schermer Sent: Friday, September 25, 2015 4:42 AM To: ceph-users@lists.ceph.com Subject: [ceph-users] OSD reaching file open limit - known issues?

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-11 Thread Somnath Roy
o:n...@fisk.me.uk>> > wrote: > > > > > >> -Original Message- >> From: ceph-users >> [mailto:ceph-users-boun...@lists.ceph.com<mailto:ceph-users-boun...@lists.ceph.com>] >> On Behalf Of >> Somnath Roy >> Sent: 11 September 2015 06:23 &

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Somnath Roy
6 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Hammer reduce recovery impact -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 Do the recovery options kick in when there is only backfill going on? - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Somnath Roy
.edu] Sent: Thursday, September 10, 2015 10:12 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] bad perf for librbd vs krbd using FIO Ok I ran the two tests again with direct=1, smaller block size (4k) and smaller total io (100m), disabled cache at ceph.conf side on cl

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Somnath Roy
It may be due to the rbd cache effect.. Try the following.. Run your test with direct=1 in both cases and rbd_cache = false (disable all other rbd cache options as well). This should give you a similar result to krbd. In the direct=1 case, we saw ~10-20% degradation if we make rbd_cache = true.
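
A sketch of the client-side change being suggested (client ceph.conf only, no OSD restart needed), with the fio run kept at direct=1 so results stay comparable to krbd:

  [client]
      rbd_cache = false
      rbd_cache_writethrough_until_flush = false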

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-10 Thread Somnath Roy
Only changing client side ceph.conf and rerunning the tests is sufficient. Thanks & Regards Somnath From: Rafael Lopez [mailto:rafael.lo...@monash.edu] Sent: Thursday, September 10, 2015 8:58 PM To: Somnath Roy Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] bad perf for librbd vs

Re: [ceph-users] Hammer reduce recovery impact

2015-09-10 Thread Somnath Roy
Try all these.. osd recovery max active = 1 osd max backfills = 1 osd recovery threads = 1 osd recovery op priority = 1 Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert LeBlanc Sent: Thursday, September 10, 2015
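
Those settings can also be pushed to a running cluster without restarting OSDs; a sketch (option names as in Hammer/Jewel-era releases):

  ceph tell osd.* injectargs '--osd-recovery-max-active 1 --osd-max-backfills 1 --osd-recovery-op-priority 1'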

Re: [ceph-users] maximum object size

2015-09-08 Thread Somnath Roy
I think the limit is 90 MB from OSD side, isn't it ? If so, how are you able to write object till 1.99 GB ? Am I missing anything ? Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of HEWLETT, Paul (Paul) Sent: Tuesday,

Re: [ceph-users] Extra RAM use as Read Cache

2015-09-07 Thread Somnath Roy
Vickey, OSDs sit on top of a filesystem, and that unused memory will automatically become part of the filesystem page cache. But the read performance improvement depends on the pattern in which the application reads data and the size of the working set. A sequential pattern will benefit most (you may need to tweak

Re: [ceph-users] Accelio & Ceph

2015-09-01 Thread Somnath Roy
dea what is the scale you are looking at for future deployment ? Regards Somnath From: German Anders [mailto:gand...@despegar.com] Sent: Tuesday, September 01, 2015 11:19 AM To: Somnath Roy Cc: Robert LeBlanc; ceph-users Subject: Re: [ceph-users] Accelio & Ceph Hi Roy, I understand, we are

Re: [ceph-users] Accelio & Ceph

2015-09-01 Thread Somnath Roy
Thanks ! 6 OSD daemons per server should be good. Vu, Could you please send out the doc you are maintaining ? Regards Somnath From: German Anders [mailto:gand...@despegar.com] Sent: Tuesday, September 01, 2015 11:36 AM To: Somnath Roy Cc: Robert LeBlanc; ceph-users Subject: Re: [ceph-users

Re: [ceph-users] Accelio & Ceph

2015-09-01 Thread Somnath Roy
Hi German, We are working on making it production ready ASAP. As you know, RDMA is very resource constrained but at the same time will outperform TCP. There will be a definite tradeoff between cost vs. performance. We are lacking ideas on how big an RDMA deployment could be and it
