Re: [ceph-users] Introducing DeepSea: A tool for deploying Ceph using Salt

2016-11-11 Thread Bill Sanders
I'm curious what the relationship is between python-ceph-cfg [0] and DeepSea,
which seem to have some overlap in contributors and functionality (and
supporting organizations?).

[0] https://github.com/oms4suse/python-ceph-cfg

Bill Sanders

On Wed, Nov 2, 2016 at 10:52 PM, Tim Serong  wrote:

> Hi All,
>
> I thought I should make a little noise about a project some of us at
> SUSE have been working on, called DeepSea.  It's a collection of Salt
> states, runners and modules for orchestrating deployment of Ceph
> clusters.  To help everyone get a feel for it, I've written a blog post
> which walks through using DeepSea to set up a small test cluster:
>
>   http://ourobengr.com/2016/11/hello-salty-goodness/
>
> If you'd like to try it out yourself, the code is on GitHub:
>
>   https://github.com/SUSE/DeepSea
>
> More detailed documentation can be found at:
>
>   https://github.com/SUSE/DeepSea/wiki/intro
>   https://github.com/SUSE/DeepSea/wiki/management
>   https://github.com/SUSE/DeepSea/wiki/policy
>
> Usual story: feedback, issues, pull requests are all welcome ;)
>
> Enjoy,
>
> Tim
> --
> Tim Serong
> Senior Clustering Engineer
> SUSE
> tser...@suse.com


Re: [ceph-users] Ceph & systemctl on Debian

2016-03-07 Thread Bill Sanders
Leaving aside the merits of SysV vs. systemd:

I went and grabbed the systemd init scripts (unit files, whatever)
from upstream.  As Christian suggests, answers will vary depending on
your ceph version (and where your packages came from), but if you go
to: https://github.com/ceph/ceph/tree/master/systemd and grab
*{service,target} and place those in the systemd directory
(/usr/lib/systemd/system/ on my EL systems), everything works *a lot*
better.

We're still on Hammer, so the above unit files required some minor
modifications: the newer units assume a dedicated ceph user account
that Hammer doesn't know about, so you have to remove the parts of the
command lines (--setuser/--setgroup) that specify it.
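
Roughly, the procedure looked like this (a sketch only; adjust the paths for
your distro, and the sed line assumes the stock ExecStart options shipped in
the upstream units):

  # Grab the upstream unit files and install them (shallow clone just to get
  # the systemd/ directory; you could also download the few files by hand).
  git clone --depth 1 https://github.com/ceph/ceph.git
  cp ceph/systemd/*.service ceph/systemd/*.target /usr/lib/systemd/system/

  # Hammer has no dedicated 'ceph' user, so strip the --setuser/--setgroup
  # options from the ExecStart lines before reloading systemd.
  sed -i 's/ --setuser ceph --setgroup ceph//' /usr/lib/systemd/system/ceph-*.service

  systemctl daemon-reload
  systemctl enable ceph.target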

Bill

On Mon, Mar 7, 2016 at 3:44 AM, Florent B  wrote:
> Hi,
>
> Yes, sorry: the versions are Jessie and the latest Infernalis package,
> 9.2.1-1~bpo80+1.
>
> I could switch to SysV, but I don't want to do that just for Ceph. All my
> other services are working well; I don't see why Ceph shouldn't.
>
> On 03/07/2016 12:41 PM, Christian Balzer wrote:
>> Hello,
>>
>> since everybody else who might actually have answers for you will ask
>> this:
>>
>> What version of Debian (Jessie one assumes)?
>> What Ceph packages, Debian ones or from the Ceph repository?
>> Exact versions please.
>>
>> As for me, I had similar experiences with Firefly (Debian package) under
>> Jessie and switched to SysV init for serene happiness.
>>
>> Christian
>>
>> On Mon, 7 Mar 2016 12:27:48 +0100 Florent B wrote:
>>
>>> Hi everyone,
>>>
>>> I try to understand how Ceph works with systemctl on Debian and it seems
>>> to be a mess.
>>>
>>> First question: why is /etc/init.d/ceph used as a "ceph" service?
>>> That's an old sysvinit script!
>>>
>>> Second question: why is this "ceph" service handling everything when it
>>> should be the new "ceph.target" unit? "ceph.target" is disabled!
>>>
>>> Third question: why are some components like OSDs handled via systemctl,
>>> whereas MONs cannot be handled via systemctl? Examples below:
>>>
>>> # systemctl status ceph-mon@* -l
>>> ● ceph-mon@\x2a.service - Ceph cluster monitor daemon
>>>Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled)
>>>*Active: failed (Result: exit-code) *since Mon 2016-03-07 12:16:50
>>> CET; 1min 55s ago
>>>   Process: 1861 ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id
>>> %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
>>>  Main PID: 1861 (code=exited, status=1/FAILURE)
>>>
>>> ● ceph-mon@5.service - Ceph cluster monitor daemon
>>>Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled)
>>>*Active: failed (Result: exit-code)* since Tue 2016-03-01 12:46:27
>>> CET; 5 days ago
>>>  Main PID: 10600 (code=exited, status=1/FAILURE)
>>>
>>> ● ceph-mon@host2.service - Ceph cluster monitor daemon
>>>Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled)
>>>*Active: failed (Result: exit-code)* since Mon 2016-03-07 12:07:48
>>> CET; 10min ago
>>>   Process: 31231 ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER}
>>> --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
>>>  Main PID: 31231 (code=exited, status=1/FAILURE)
>>>
>>>
>>> # systemctl status ceph-osd@*
>>> ● ceph-osd@15.service - Ceph object storage daemon
>>>Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled)
>>>*Active: active (running)* since Fri 2016-02-26 19:04:45 CET; 1 weeks
>>> 2 days ago
>>>  Main PID: 30211 (ceph-osd)
>>>CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@15.service
>>>└─30211 /usr/bin/ceph-osd -f --cluster ceph --id 15 --setuser
>>> ceph...
>>>
>>> ● ceph-osd@14.service - Ceph object storage daemon
>>>Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled)
>>>*Active: active (running)* since Tue 2016-03-01 12:44:32 CET; 5 days
>>> ago Main PID: 9586 (ceph-osd)
>>>CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@14.service
>>>└─9586 /usr/bin/ceph-osd -f --cluster ceph --id 14 --setuser
>>> ceph ...
>>>
>>> ceph-osd IDs are automatically found, but ceph-mon IDs are not!
>>>
>>> "systemctl start ceph.target" does not start the MONs on my systems!
>>>
>>> What's wrong? How do I sort out this mess?
>>>
>>> Thank you for your help
>>>
>>> Florent
>>
>


Re: [ceph-users] Separate hosts for osd and its journal

2016-02-10 Thread Bill Sanders
Going into a tiny bit more detail on what Michał said: one of the key
reasons for having the journal (and in particular for putting it on SSDs)
is to reduce write latency, the other being replay in the event of a
crash.  Even if the functionality existed, adding a network round trip to
this would be detrimental :)
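
For completeness, the usual pattern is to keep the journal on an SSD in the
*same* host as the OSD.  A rough sketch of moving an existing OSD's journal
onto a local SSD partition (the OSD id and the partition label here are only
examples, so check them against your own setup):

  # Stop the OSD and flush its current journal
  service ceph stop osd.2                 # or: systemctl stop ceph-osd@2
  ceph-osd -i 2 --flush-journal

  # Point the journal at a partition on the local SSD and recreate it
  ln -sf /dev/disk/by-partlabel/journal-osd-2 /var/lib/ceph/osd/ceph-2/journal
  ceph-osd -i 2 --mkjournal

  service ceph start osd.2

  # (Alternatively, set "osd journal = /path/to/partition" in the [osd.2]
  # section of ceph.conf before running --mkjournal.)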

Bill

On Wed, Feb 10, 2016 at 2:42 PM, Michał Chybowski
 wrote:
> "Remote journal"? No, don't do it even if it'd be possible via NFS or any
> kind of network-FS.
>
> You could always keep the journal on the HDD (yes, I know it's not what you
> wanted to achieve, but I don't think putting the journal on a remote machine
> would be a good idea in any way).
>
> Regards
> Michał
>
> W dniu 10.02.2016 o 22:54, Yu Xiang pisze:
>
> Dear list,
> We have a cluster with 2 nodes, one with an SSD and one without (host 1 has
> an SSD, host 2 does not).  Is there any possibility that host 2 can still
> use the SSD from host 1 for journaling?
> I see that we can change the journal path in ceph.conf, but that is a path
> for when the journal and OSD are on the same node, so I am not sure how to
> set the journal path when it is on a different node from the OSD.
>
> Any help would be highly appreciated! Just wondering if ceph could do this!
>
> Thanks in advance!
>
> Regards,
> Mavis
>
>


Re: [ceph-users] v10.0.2 released

2016-01-14 Thread Bill Sanders
Is there some information about rbd-nbd somewhere?  If it has feature
parity with librbd and is easier to maintain, will this eventually
deprecate krbd?  We're using the RBD kernel client right now, and so
this looks like something we might want to explore at my employer.
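
From what I can tell so far, usage looks roughly like this (untested on my
end; the pool/image names are placeholders):

  # Kernel RBD client (what we use today)
  rbd map rbd/myimage          # exposes /dev/rbd0

  # rbd-nbd: librbd running in userspace, exposed through the kernel's NBD module
  modprobe nbd
  rbd-nbd map rbd/myimage      # exposes /dev/nbd0
  rbd-nbd unmap /dev/nbd0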

Bill

On Thu, Jan 14, 2016 at 9:04 AM, Yehuda Sadeh-Weinraub
 wrote:
> On Thu, Jan 14, 2016 at 7:37 AM, Sage Weil  wrote:
>> This development release includes a raft of changes and improvements for
>> Jewel. Key additions include CephFS scrub/repair improvements, an AIX and
>> Solaris port of librados, many librbd journaling additions and fixes,
>> extended per-pool options, and NBD driver for RBD (rbd-nbd) that allows
>> librbd to present a kernel-level block device on Linux, multitenancy
>> support for RGW, RGW bucket lifecycle support, RGW support for Swift
>
> rgw bucket lifecycle isn't there; it still has some way to go
> before we merge it in.
>
> Yehuda
>
>> static large objects (SLO), and RGW support for Swift bulk delete.
>>
>> There are also lots of smaller optimizations and performance fixes going
>> in all over the tree, particular in the OSD and common code.
>>
>> Notable Changes
>> ---
>>
>> See
>>
>> http://ceph.com/releases/v10-0-2-released/
>>
>> [I'd include the changelog here but I'm missing a oneliner that renders
>> the rst in email-suitable form...]
>>
>> Getting Ceph
>> 
>>
>> * Git at git://github.com/ceph/ceph.git
>> * Tarball at http://download.ceph.com/tarballs/ceph-10.0.2.tar.gz
>> * For packages, see http://ceph.com/docs/master/install/get-packages
>> * For ceph-deploy, see 
>> http://ceph.com/docs/master/install/install-ceph-deploy


[ceph-users] Ceph job posting

2015-12-01 Thread Bill Sanders
Just dropping a note to say that Teradata (I work there!) is hiring to
build out a small-at-first Ceph team in our Rancho Bernardo office (near
San Diego, CA).

We're looking for engineers interested in getting Ceph to spin like a top
for our data warehouse applications.  You should know C/C++,
virtualization, and of course Ceph.  There are a lot of exciting projects at
Teradata right now, and an increased interest in open source.

We have a couple of junior-level and a couple of senior-level positions open
right now.  Take a peek if you're interested:

http://teradata.jobs/jobs/?location=San+Diego%2C+CA&q=ceph+software+defined

Or, if you'd like to know more, send me an email.  (Disclaimer: I work for
Teradata on this team, but I'm not the hiring manager.)

Bill


Re: [ceph-users] Performance question

2015-11-24 Thread Bill Sanders
I think what Nick is suggesting is that you create N x 5GB partitions on the
SSDs (where N is the number of OSDs you want fast journals for), and use the
rest of the space for OSDs that would form the SSD pool.
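
Something like this, for example, for one SSD feeding three spinners plus an
SSD-pool OSD (the device name and journal size are only illustrative):

  DEV=/dev/sdf
  sgdisk --new=1:0:+5G --change-name=1:'ceph journal' $DEV
  sgdisk --new=2:0:+5G --change-name=2:'ceph journal' $DEV
  sgdisk --new=3:0:+5G --change-name=3:'ceph journal' $DEV
  sgdisk --new=4:0:0   --change-name=4:'ceph data'    $DEV   # remainder for the SSD-pool OSD

  # Then hand each spinner one of the journal partitions, e.g. with ceph-deploy:
  #   ceph-deploy osd prepare node1:/dev/sdb:/dev/sdf1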

Bill

On Tue, Nov 24, 2015 at 10:56 AM, Marek Dohojda <
mdoho...@altitudedigital.com> wrote:

> Oh, well, in that case you've made my life easier; I like that :)
>
> I thought the journal needed to be on a physical device though, not within a
> raw rbd pool.  Was I mistaken?
>
> On Tue, Nov 24, 2015 at 11:51 AM, Nick Fisk  wrote:
>
>> Ok, but it's probably a bit of a waste. The journals for each disk will
>> probably require 200-300 IOPS from each SSD and maybe 5GB of space.
>> Personally I would keep the SSD pool, maybe use it for high-performance VMs?
>>
>>
>>
>> Typically VMs will generate smaller, more random IOs, so a default rados
>> bench might not be a true example of expected performance.
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Marek Dohojda
>> *Sent:* 24 November 2015 18:47
>> *To:* Nick Fisk 
>>
>> *Cc:* ceph-users@lists.ceph.com
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> I dunno, I think I just go into my Lotus and mull this over ;) (I wish)
>>
>> This is storage for KVM, and we have quite a few boxes.  While none are
>> suffering from IO load right now, I am seeing slowdowns personally and know
>> that sooner or later others will notice as well.
>>
>>
>>
>> I think what I will do is remove the SSD from the cluster, and put
>> journals on it.
>>
>>
>>
>> On Tue, Nov 24, 2015 at 11:42 AM, Nick Fisk  wrote:
>>
>> Separate would be best, but as with many things in life we are not all
>> driving around in sports cars!!
>>
>>
>>
>> Moving the journals to the SSD’s that are also OSD’s themselves will be
>> fine. SSD’s tend to be more bandwidth limited than IOPs and the reverse is
>> true for Disks, so you will get maybe 2x improvement for the disk pool and
>> you probably won’t even notice the impact on the SSD pool.
>>
>>
>>
>> Can I just ask what your workload will be? There maybe other things that
>> can be done.
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Marek Dohojda
>> *Sent:* 24 November 2015 18:32
>> *To:* Alan Johnson 
>> *Cc:* ceph-users@lists.ceph.com; Nick Fisk 
>>
>>
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> Thank you! I will do that.  Would you suggest getting another SSD drive
>> or move the journal to the SSD OSD?
>>
>>
>>
>> (Sorry for a stupid question, if that is such).
>>
>>
>>
>> On Tue, Nov 24, 2015 at 11:25 AM, Alan Johnson 
>> wrote:
>>
>> Or separate the journals, as this will bring the workload on the spinners
>> down to 3X rather than 6X.
>>
>>
>>
>> *From:* Marek Dohojda [mailto:mdoho...@altitudedigital.com]
>> *Sent:* Tuesday, November 24, 2015 1:24 PM
>> *To:* Nick Fisk
>> *Cc:* Alan Johnson; ceph-users@lists.ceph.com
>>
>>
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> Crad I think you are 100% correct:
>>
>>
>>
>> rrqm/s   wrqm/s    r/s      w/s   rkB/s      wkB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm   %util
>>   0.00   369.00  33.00  1405.00  132.00  135656.00    188.86      5.61   4.02    21.94     3.60   0.70  100.00
>>
>>
>>
>> I was kinda wondering whether this might be the case, which is why I was
>> wondering how much troubleshooting I should really be doing.
>>
>>
>>
>> So basically what you are saying is that I need to wait for a new version?
>>
>>
>>
>>
>>
>> Thank you very much everybody!
>>
>>
>>
>>
>>
>> On Tue, Nov 24, 2015 at 9:35 AM, Nick Fisk  wrote:
>>
>> You haven’t stated what size replication you are running. Keep in mind
>> that with a replication factor of 3, you will be writing 6x the amount of
>> data down to disks than what the benchmark says (3x replication x2 for
>> data+journal write).
>>
>>
>>
>> You might actually be near the hardware maximums. What does iostat looks
>> like whilst you are running rados bench, are the disks getting maxed out?
>>
>>
>>
>> *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf
>> Of *Marek Dohojda
>> *Sent:* 24 November 2015 16:27
>> *To:* Alan Johnson 
>>
>>
>> *Cc:* ceph-users@lists.ceph.com
>> *Subject:* Re: [ceph-users] Performance question
>>
>>
>>
>> 7 servers total, with a 20 Gb pipe between servers, carrying both reads and
>> writes.  The network itself has plenty of headroom; it is averaging 40 Mbit/s.
>>
>>
>>
>> Rados Bench SAS 30 writes
>>
>>  Total time run: 30.591927
>>
>> Total writes made:  386
>>
>> Write size: 4194304
>>
>> Bandwidth (MB/sec): 50.471
>>
>>
>>
>> Stddev Bandwidth:   48.1052
>>
>> Max bandwidth (MB/sec): 160
>>
>> Min bandwidth (MB/sec): 0
>>
>> Average Latency:1.25908
>>
>> Stddev Latency: 2.62018
>>
>> Max latency:21.2809
>>
>> Min latency:0.029227
>>
>>
>>
>> Rados Bench SSD writes
>>
>>  Total time run: 20.425192
>>
>> Total write

Re: [ceph-users] bad perf for librbd vs krbd using FIO

2015-09-11 Thread Bill Sanders
Is there a thread on the mailing list (or LKML?) with some background about
tcp_low_latency and TCP_NODELAY?
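
(In case it's useful context for others landing on this thread: the knob Jan
mentions below is an ordinary sysctl, so trying it and persisting it looks
roughly like this; nothing here is Ceph-specific.)

  # Try it on the fly
  sysctl -w net.ipv4.tcp_low_latency=1   # same as: echo 1 > /proc/sys/net/ipv4/tcp_low_latency

  # Check the current value
  sysctl net.ipv4.tcp_low_latency

  # Persist across reboots
  echo 'net.ipv4.tcp_low_latency = 1' >> /etc/sysctl.conf
  sysctl -p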

Bill

On Fri, Sep 11, 2015 at 2:30 AM, Jan Schermer  wrote:

> Can you try
>
> echo 1 > /proc/sys/net/ipv4/tcp_low_latency
>
> And see if it improves things? I remember there being an option to disable
> nagle completely, but it's gone apparently.
>
> Jan
>
> > On 11 Sep 2015, at 10:43, Nick Fisk  wrote:
> >
> >
> >
> >
> >
> >> -Original Message-
> >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of
> >> Somnath Roy
> >> Sent: 11 September 2015 06:23
> >> To: Rafael Lopez 
> >> Cc: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] bad perf for librbd vs krbd using FIO
> >>
> >> That's probably because the krbd version you are using doesn't have the
> >> TCP_NODELAY patch. We have submitted it (and you can build it from the
> >> latest rbd source), but I am not sure when it will be in the Linux mainline.
> >
> > From memory it landed in 3.19, but there are also several issues with max
> > IO size, max nr_requests and readahead. For testing, I would suggest trying
> > one of these:
> >
> > http://gitbuilder.ceph.com/kernel-deb-precise-x86_64-basic/ref/ra-bring-back/
> >
> >
> >>
> >> Thanks & Regards
> >> Somnath
> >>
> >> From: Rafael Lopez [mailto:rafael.lo...@monash.edu]
> >> Sent: Thursday, September 10, 2015 10:12 PM
> >> To: Somnath Roy
> >> Cc: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] bad perf for librbd vs krbd using FIO
> >>
> >> OK, I ran the two tests again with direct=1, a smaller block size (4k) and
> >> a smaller total IO (100m), and disabled the cache on the client side in
> >> ceph.conf by adding:
> >>
> >> [client]
> >> rbd cache = false
> >> rbd cache max dirty = 0
> >> rbd cache size = 0
> >> rbd cache target dirty = 0
> >>
> >>
> >> The result seems to have swapped around; now the librbd job is running
> >> ~50% faster than the krbd job!
> >>
> >> ### krbd job:
> >>
> >> [root@rcprsdc1r72-01-ac rafaell]# fio ext4_test
> >> job1: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=16
> >> fio-2.2.8
> >> Starting 1 process
> >> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/571KB/0KB /s] [0/142/0 iops] [eta 00m:00s]
> >> job1: (groupid=0, jobs=1): err= 0: pid=29095: Fri Sep 11 14:48:21 2015
> >>  write: io=102400KB, bw=647137B/s, iops=157, runt=162033msec
> >>clat (msec): min=2, max=25, avg= 6.32, stdev= 1.21
> >> lat (msec): min=2, max=25, avg= 6.32, stdev= 1.21
> >>clat percentiles (usec):
> >> |  1.00th=[ 2896],  5.00th=[ 4320], 10.00th=[ 4768], 20.00th=[
> 5536],
> >> | 30.00th=[ 5920], 40.00th=[ 6176], 50.00th=[ 6432], 60.00th=[
> 6624],
> >> | 70.00th=[ 6816], 80.00th=[ 7136], 90.00th=[ 7584], 95.00th=[
> 7968],
> >> | 99.00th=[ 9024], 99.50th=[ 9664], 99.90th=[15808],
> 99.95th=[17536],
> >> | 99.99th=[19328]
> >>bw (KB  /s): min=  506, max= 1171, per=100.00%, avg=632.22,
> stdev=104.77
> >>lat (msec) : 4=2.88%, 10=96.69%, 20=0.43%, 50=0.01%
> >>  cpu  : usr=0.17%, sys=0.71%, ctx=25634, majf=0, minf=35
> >>  IO depths: 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
> >> submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >> complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
> >> issued: total=r=0/w=25600/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
> >> latency   : target=0, window=0, percentile=100.00%, depth=16
> >>
> >> Run status group 0 (all jobs):
> >>  WRITE: io=102400KB, aggrb=631KB/s, minb=631KB/s, maxb=631KB/s, mint=162033msec, maxt=162033msec
> >>
> >> Disk stats (read/write):
> >>  rbd0: ios=0/25638, merge=0/32, ticks=0/160765, in_queue=160745, util=99.11%
> >> [root@rcprsdc1r72-01-ac rafaell]#
> >>
> >> ## librb job:
> >>
> >> [root@rcprsdc1r72-01-ac rafaell]# fio fio_rbd_test
> >> job1: (g=0): rw=rw, bs=4K-4K/4K-4K/4K-4K, ioengine=rbd, iodepth=16
> >> fio-2.2.8
> >> Starting 1 process
> >> rbd engine: RBD version: 0.1.9
> >> Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/703KB/0KB /s] [0/175/0 iops] [eta 00m:00s]
> >> job1: (groupid=0, jobs=1): err= 0: pid=30568: Fri Sep 11 14:50:24 2015
> >>  write: io=102400KB, bw=950141B/s, iops=231, runt=110360msec
> >>slat (usec): min=70, max=992, avg=115.05, stdev=30.07
> >>clat (msec): min=13, max=117, avg=67.91, stdev=24.93
> >> lat (msec): min=13, max=117, avg=68.03, stdev=24.93
> >>clat percentiles (msec):
> >> |  1.00th=[   19],  5.00th=[   26], 10.00th=[   38], 20.00th=[   40],
> >> | 30.00th=[   46], 40.00th=[   62], 50.00th=[   77], 60.00th=[   85],
> >> | 70.00th=[   88], 80.00th=[   91], 90.00th=[   95], 95.00th=[   99],
> >> | 99.00th=[  105], 99.50th=[  110], 99.90th=[  116], 99.95th=[  117],
> >> | 99.99th=[  118]
> >>bw (KB  /s): min=  565, max= 3174, per=100.00%, avg=935.74, stdev=407.67
> >>lat (msec) : 20=2.41%, 50=29.85%, 100=64.46%, 250=3.29%
> >>  c

Re: [ceph-users] Ceph cluster NO read / write performance :: Ops are blocked

2015-09-09 Thread Bill Sanders
We were experiencing something similar in our setup (rados bench does some
work, then comes to a screeching halt).  There was no pattern to which OSDs
were causing the problem, though.  Sounds like similar hardware (this was on
a Dell R720xd, and yeah, that controller is suuuper frustrating).

For us, setting tcp_moderate_rcvbuf to 0 on all nodes solved the issue.

echo 0 > /proc/sys/net/ipv4/tcp_moderate_rcvbuf

Or set it in /etc/sysctl.conf:

net.ipv4.tcp_moderate_rcvbuf = 0
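
If it helps, rolling it out and verifying it across all nodes was just a loop
along these lines (the hostnames are made up; use whatever fan-out tooling you
prefer):

  for host in ceph-node1 ceph-node2 ceph-node3 ceph-node4; do
      ssh "$host" 'echo "net.ipv4.tcp_moderate_rcvbuf = 0" >> /etc/sysctl.conf && sysctl -p'
  done

  # Verify the running value everywhere
  for host in ceph-node1 ceph-node2 ceph-node3 ceph-node4; do
      ssh "$host" sysctl net.ipv4.tcp_moderate_rcvbuf
  done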

We figured this out independently after I posted this thread, "Slow/Hung
IOs":
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-January/045674.html

Hope this helps

Bill Sanders

On Wed, Sep 9, 2015 at 11:09 AM, Lincoln Bryant 
wrote:

> Hi Jan,
>
> I’ll take a look at all of those things and report back (hopefully :))
>
> I did try setting all of my OSDs to writethrough instead of writeback on
> the controller, which was significantly more consistent in performance
> (from 1100MB/s down to 300MB/s, but still occasionally dropping to 0MB/s).
> Still plenty of blocked ops.
>
> I was wondering if not-so-nicely failing OSD(s) might be the cause. My
> controller (PERC H730 Mini) seems frustratingly terse with SMART
> information, but at least one disk has a “Non-medium error count” of over
> 20,000..
>
> I’ll try disabling offloads as well.
>
> Thanks much for the suggestions!
>
> Cheers,
> Lincoln
>
> > On Sep 9, 2015, at 3:59 AM, Jan Schermer  wrote:
> >
> > Just to recapitulate - the nodes are doing "nothing" when it drops to
> zero? Not flushing something to drives (iostat)? Not cleaning pagecache
> (kswapd and similiar)? Not out of any type of memory (slab,
> min_free_kbytes)? Not network link errors, no bad checksums (those are hard
> to spot, though)?
> >
> > Unless you find something I suggest you try disabling offloads on the
> NICs and see if the problem goes away.
> >
> > Jan
> >
> >> On 08 Sep 2015, at 18:26, Lincoln Bryant  wrote:
> >>
> >> For whatever it’s worth, my problem has returned and is very similar to
> yours. Still trying to figure out what’s going on over here.
> >>
> >> Performance is nice for a few seconds, then goes to 0. This is a
> similar setup to yours (12 OSDs per box, Scientific Linux 6, Ceph 0.94.3,
> etc)
> >>
> >> 384  16  29520  29504  307.287  1188  0.0492006  0.208259
> >> 385  16  29813  29797  309.532  1172  0.0469708  0.206731
> >> 386  16  30105  30089  311.756  1168  0.0375764  0.205189
> >> 387  16  30401  30385  314.009  1184  0.036142   0.203791
> >> 388  16  30695  30679  316.231  1176  0.0372316  0.202355
> >> 389  16  30987  30971  318.42   1168  0.0660476  0.200962
> >> 390  16  31282  31266  320.628  1180  0.0358611  0.199548
> >> 391  16  31568  31552  322.734  1144  0.0405166  0.198132
> >> 392  16  31857  31841  324.859  1156  0.0360826  0.196679
> >> 393  16  32090  32074  326.404   932  0.0416869  0.19549
> >> 394  16  32205  32189  326.743   460  0.0251877  0.194896
> >> 395  16  32302  32286  326.897   388  0.0280574  0.194395
> >> 396  16  32348  32332  326.537   184  0.0256821  0.194157
> >> 397  16  32385  32369  326.087   148  0.0254342  0.193965
> >> 398  16  32424  32408  325.659   156  0.0263006  0.193763
> >> 399  16  32445  32429  325.054    84  0.0233839  0.193655
> >> 2015-09-08 11:22:31.940164 min lat: 0.0165045 max lat: 67.6184 avg lat: 0.193655
> >> sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
> >> 400  16  32445  32429  324.241     0  -  0.193655
> >> 401  16  32445  32429  323.433     0  -  0.193655
> >> 402  16  32445  32429  322.628     0  -  0.193655
> >> 403  16  32445  32429  321.828     0  -  0.193655
> >> 404  16  32445  32429  321.031     0  -  0.193655
> >> 405  16  32445  32429  320.238     0  -  0.193655
> >> 406  16  32445  32429  319.45      0  -  0.193655
> >> 407  16  32445  32429  318.665     0  -  0.193655
> >>
> >> needless to say, very strange.
> >>
> >> —Lincoln
> >>
> >>
> >>> On Sep 7, 2015, at 3:35 PM, Vickey Singh 
> wrote:
> >>>
> >>> Adding ceph-users.
> >>>
> &g

Re: [ceph-users] Cache tier best practices

2015-08-13 Thread Bill Sanders
I think you're looking for this.

http://ceph.com/docs/master/man/8/rbd/#cmdoption-rbd--order

It's set when you create the RBD images: 1 MB is order=20, 512 KB is order=19.
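
For example (pool/image names and sizes are placeholders; note the order is
fixed at image creation time):

  rbd create --size 10240 --order 19 rbd/image-512k   # 2^19 = 512 KB objects
  rbd create --size 10240 --order 20 rbd/image-1m     # 2^20 = 1 MB objects

  # Check what an existing image was created with
  rbd info rbd/image-512k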

Thanks,
Bill Sanders


On Thu, Aug 13, 2015 at 1:31 AM, Vickey Singh 
wrote:

> Thanks Nick for your suggestion.
>
> Can you also tell me how I can reduce the RBD block size to 512K or 1M? Do I
> need to put something in the clients' ceph.conf (what parameter do I need to set)?
>
> Thanks once again
>
> - Vickey
>
> On Wed, Aug 12, 2015 at 4:49 PM, Nick Fisk  wrote:
>
>> > -Original Message-
>> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
>> Of
>> > Dominik Zalewski
>> > Sent: 12 August 2015 14:40
>> > To: ceph-us...@ceph.com
>> > Subject: [ceph-users] Cache tier best practices
>> >
>> > Hi,
>> >
>> > I would like to hear from people who use cache tier in Ceph about best
>> > practices and things I should avoid.
>> >
>> > I remember hearing that it wasn't that stable back then. Has that changed
>> > in the Hammer release?
>>
>> It's not so much the stability, but the performance. If your working set
>> will sit mostly in the cache tier and won't tend to change then you might
>> be alright. Otherwise you will find that performance is very poor.
>>
>> Only tip I can really give is that I have found dropping the RBD block
>> size down to 512kb-1MB helps quite a bit as it makes the cache more
>> effective and also minimises the amount of data transferred on each
>> promotion/flush.
>>
>> >
>> > Any tips and tricks are much appreciated!
>> >
>> > Thanks
>> >
>> > Dominik
>>
>>
>>
>>


Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Bill Sanders
Forgive me if this is unhelpful, but could it be something to do with
permissions of the directory and not Ceph at all?

http://superuser.com/a/528467
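
A quick way to reproduce that symptom locally, as a regular (non-root) user,
with nothing Ceph involved (output is approximate):

  mkdir /tmp/demo && touch /tmp/demo/file
  chmod 644 /tmp/demo     # readable but not searchable: names list, stat fails
  ls -la /tmp/demo
  # ls: cannot access /tmp/demo/file: Permission denied
  # -????????? ? ? ? ?            ? file
  chmod 755 /tmp/demo     # restore search permission and the question marks go away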

Bill

On Mon, Mar 2, 2015 at 3:47 PM, Gregory Farnum  wrote:

> On Mon, Mar 2, 2015 at 3:39 PM, Scottix  wrote:
> > We have a file system running CephFS, and for a while we have had this
> > issue where doing an ls -la gives question marks in the response.
> >
> > -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
> > data.2015-02-08_00-00-00.csv.bz2
> > -? ? ?  ?   ??
> > data.2015-02-09_00-00-00.csv.bz2
> >
> > If we do another directory listing it show up fine.
> >
> > -rw-r--r-- 1 wwwrun root14761 Feb  9 16:06
> > data.2015-02-08_00-00-00.csv.bz2
> > -rw-r--r-- 1 wwwrun root13675 Feb 10 15:21
> > data.2015-02-09_00-00-00.csv.bz2
> >
> > It hasn't been a problem, but I just wanted to see if this is an issue;
> > could the attributes be timing out? We do have a lot of files in the
> > filesystem, so that could be a possible bottleneck.
>
> Huh, that's not something I've seen before. Are the systems you're
> doing this on the same? What distro and kernel version? Is it reliably
> one of them showing the question marks, or does it jump between
> systems?
> -Greg
>
> >
> > We are using the ceph-fuse mount.
> > ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
> > We are planning to do the update to 0.87.1 soon.
> >
> > Thanks
> > Scottie
> >
> >