Re: [ceph-users] Setting up a proper mirror system for Ceph

2015-08-05 Thread David Moreau Simard
Would love to be a part of this Wido, we currently have a mirror at 
ceph.mirror.iweb.ca based on the script you provided me a while back. It is 
already available over http, rsync, IPv4 and IPv6.

The way we currently mirror it does feel a bit clunky and I would welcome a 
better way to mirror Ceph.

We’re on the eastern coast of Canada (Montreal) and we’re already official 
mirrors for several projects/distros @ http://mirror.iweb.ca/
I feel it complements the US Ceph mirror well; if I remember correctly, that one is 
located on the southwestern coast of the USA.

Let me know if you have any questions !

- dms





On 2015-08-05, 10:15 AM, ceph-users on behalf of Wido den Hollander 
ceph-users-boun...@lists.ceph.com on behalf of w...@42on.com wrote:

Hi,

One of the first things I want to do as the Ceph User Committee is set
up a proper mirror system for Ceph.

Currently there is ceph.com, eu.ceph.com and au.ceph.com (thanks
Matthew!), but this isn't the way I want to see it.

I want to set up a series of localized mirrors from which you can easily
synchronize:
* Ceph source releases
* Debian packages
* RPM packages
* Docs

To do so there will be a few official mirrors:
* ceph.com / us.ceph.com
* eu.ceph.com (EU / Amsterdam)
* au.ceph.com (AU / Sydney)

Next to that I would like to set up:
* br.ceph.com (Brazil)
* cn.ceph.com (China)
* jp.ceph.com (Japan)

To get these locations online I'm looking for sponsors who are willing
to sponsor a mirror.

I haven't fully worked this out, it's still in the early stages, but for
a mirror you should be able to provide:
* HTTP access
* Rsync access
* 1TB of storage
* IPv4 AND IPv6 connectivity

If there is anybody who is willing to sponsor a mirror in one of the above
locations, or has a suggestion for another mirror, please contact me!

The official mirrors will all synchronize from a private source. Users
can then sync from one of their local mirrors via rsync.

In the ceph Git repository there will be an official script which you can
use to synchronize Ceph to your local mirror.

Any suggestions or comments?

I'll be working on this over the coming weeks. Starting with the root so
that we have a proper master/golden source to sync from.

Wido

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Load balancing RGW and Scaleout

2015-06-11 Thread David Moreau Simard
What I've seen work well is to set multiple A records for your RGW endpoint.
Then, with something like corosync, you ensure that these multiple IP
addresses are always bound somewhere.

You can then have as many nodes in active-active mode as you want.
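
For illustration, a minimal sketch of one way to do this with pacemaker/corosync
(hostnames, IPs and resource names below are made up, adjust to your environment):

; DNS: two A records for the same RGW endpoint
rgw.example.com.  300  IN  A  192.0.2.11
rgw.example.com.  300  IN  A  192.0.2.12

# Pacemaker/corosync: keep both floating IPs bound somewhere in the cluster
pcs resource create rgw-vip1 ocf:heartbeat:IPaddr2 ip=192.0.2.11 cidr_netmask=24 op monitor interval=30s
pcs resource create rgw-vip2 ocf:heartbeat:IPaddr2 ip=192.0.2.12 cidr_netmask=24 op monitor interval=30s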

-- 
David Moreau Simard

On 2015-06-11 11:29 AM, Florent MONTHEL wrote:
 Hi Team

 Is it possible for you to share your radosgw setup, in order to use the maximum 
 network bandwidth and have no SPOF?
 
 I have 5 servers on a 10Gb network and 3 radosgw on them.
 We would like to set up HAProxy on 1 node in front of the 3 RGWs, but:
 - The HAProxy node becomes a SPOF
 - Max bandwidth will be capped by the HAProxy node (10Gb/s)

 Thanks

 Sent from my iPhone
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph repo - RSYNC?

2015-04-15 Thread David Moreau Simard
Hey, you're right.

Thanks for bringing that to my attention, it's syncing now :)

Should be available soon.

David Moreau Simard

On 2015-04-15 12:17 PM, Paul Mansfield wrote:
 Sorry for starting a new thread, I've only just subscribed to the list
 and the archive on the mail listserv is far from complete at the moment.

 On 8th March, David Moreau Simard said 
http://www.spinics.net/lists/ceph-users/msg16334.html
 that there was an rsync'able mirror of the ceph repo at
 http://ceph.mirror.iweb.ca/


 My problem is that the repo doesn't include Hammer. Is there someone who
 can get that added to the mirror?

 thanks very much
 Paul
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph release timeline

2015-03-16 Thread David Moreau Simard
Great work !

David Moreau Simard

On 2015-03-15 06:29 PM, Loic Dachary wrote:
 Hi Ceph,

 In an attempt to clarify which Ceph release is stable, LTS or development, a 
 new page was added to the documentation: 
 http://ceph.com/docs/master/releases/. It is a matrix where each cell is a 
 release number linked to the release notes from 
 http://ceph.com/docs/master/release-notes/ - one line per month and one column 
 per release.

 Cheers



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph repo - RSYNC?

2015-03-08 Thread David Moreau Simard
Hi,

With the help of Inktank we have been providing a Ceph mirror at
ceph.mirror.iweb.ca.
Quick facts:
- Located on the east coast of Canada (Montreal, Quebec)
- Syncs every four hours directly off of the official repositories
- Available over http (http://ceph.mirror.iweb.ca/) and rsync
(rsync://mirror.iweb.ca/ceph)
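
For example, pulling a local copy over rsync could look something like this (the
destination path is just an example):

# mirror the whole repo locally, deleting files that were removed upstream
rsync -av --delete rsync://mirror.iweb.ca/ceph/ /srv/mirror/ceph/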

We're working on a brand new, faster and improved infrastructure for all
of our mirrors, and it will be backed by Ceph - so the Ceph mirror will
soon be stored on a Ceph cluster :)

Feel free to use it !
--
David Moreau Simard


On 2015-03-05, 1:14 PM, Brian Rak b...@gameservers.com wrote:

Do any of the Ceph repositories run rsync?  We generally mirror the
repository locally so we don't encounter any unexpected upgrades.

eu.ceph.com used to run this, but it seems to be down now.

# rsync rsync://eu.ceph.com
rsync: failed to connect to eu.ceph.com: Connection refused (111)
rsync error: error in socket IO (code 10) at clientserver.c(124)
[receiver=3.0.6]

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-12-08 Thread David Moreau Simard
Haven't tried other iSCSI implementations (yet).

LIO/targetcli makes it very easy to implement/integrate/wrap/automate around so 
I'm really trying to get this right.

A PCI-E SSD cache tier in front of a spindle-backed erasure coded pool, with 10 Gbps 
across the board, yields results slightly better than or very similar to two spindles 
in hardware RAID-0 with writeback caching.
With that in mind, the performance is not outright awful by any means; there's 
just a lot of overhead we have to be reminded about.

What I'd like to test further, but am unable to right now, is what happens 
if you scale up the cluster. Right now I'm testing on only two nodes.
Do IOPS scale linearly with an increasing number of OSDs/servers, or is it 
more about capacity?

Perhaps someone else can chime in; I'm really curious.
--
David Moreau Simard

 On Dec 6, 2014, at 11:18 AM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi David,
 
 Very strange, but  I'm glad you managed to finally get the cluster working
 normally. Thank you for posting the benchmarks figures, it's interesting to
 see the overhead of LIO over pure RBD performance. 
 
 I should have the hardware for our cluster up and running early next year, I
 will be in a better position to test the iSCSI performance then. I will
 report back once I have some numbers.
 
 Just out of interest, have you tried any of the other iSCSI implementations
 to see if they show the same performance drop?
 
 Nick
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 David Moreau Simard
 Sent: 05 December 2014 16:03
 To: Nick Fisk
 Cc: ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target
 
 I've flushed everything - data, pools, configs and reconfigured the whole
 thing.
 
 I was particularly careful with cache tiering configurations (almost leaving
 defaults when possible) and it's not locking anymore.
 It looks like the cache tiering configuration I had was causing the problem
 ? I can't put my finger on exactly what/why and I don't have the luxury of
 time to do this lengthy testing again.
 
 Here's what I dumped as far as config goes before wiping:
 
 # for var in size min_size pg_num pgp_num crush_ruleset
 erasure_code_profile; do ceph osd pool get volumes $var; done
 size: 5
 min_size: 2
 pg_num: 7200
 pgp_num: 7200
 crush_ruleset: 1
 erasure_code_profile: ecvolumes
 
 # for var in size min_size pg_num pgp_num crush_ruleset hit_set_type
 hit_set_period hit_set_count target_max_objects target_max_bytes
 cache_target_dirty_ratio cache_target_full_ratio cache_min_flush_age
 cache_min_evict_age; do ceph osd pool get volumecache $var; done
 size: 2
 min_size: 1
 pg_num: 7200
 pgp_num: 7200
 crush_ruleset: 4
 hit_set_type: bloom
 hit_set_period: 3600
 hit_set_count: 1
 target_max_objects: 0
 target_max_bytes: 1000
 cache_target_dirty_ratio: 0.5
 cache_target_full_ratio: 0.8
 cache_min_flush_age: 600
 cache_min_evict_age: 1800
 
 # ceph osd erasure-code-profile get ecvolumes
 directory=/usr/lib/ceph/erasure-code
 k=3
 m=2
 plugin=jerasure
 ruleset-failure-domain=osd
 technique=reed_sol_van
 
 
 And now:
 
 # for var in size min_size pg_num pgp_num crush_ruleset
 erasure_code_profile; do ceph osd pool get volumes $var; done
 size: 5
 min_size: 3
 pg_num: 2048
 pgp_num: 2048
 crush_ruleset: 1
 erasure_code_profile: ecvolumes
 
 # for var in size min_size pg_num pgp_num crush_ruleset hit_set_type
 hit_set_period hit_set_count target_max_objects target_max_bytes
 cache_target_dirty_ratio cache_target_full_ratio cache_min_flush_age
 cache_min_evict_age; do ceph osd pool get volumecache $var; done
 size: 2
 min_size: 1
 pg_num: 2048
 pgp_num: 2048
 crush_ruleset: 4
 hit_set_type: bloom
 hit_set_period: 3600
 hit_set_count: 1
 target_max_objects: 0
 target_max_bytes: 1500
 cache_target_dirty_ratio: 0.5
 cache_target_full_ratio: 0.8
 cache_min_flush_age: 0
 cache_min_evict_age: 1800
 
 # ceph osd erasure-code-profile get ecvolumes
 directory=/usr/lib/ceph/erasure-code
 k=3
 m=2
 plugin=jerasure
 ruleset-failure-domain=osd
 technique=reed_sol_van
 
 
 Crush map hasn't really changed before and after.
 
 FWIW, the benchmarks I pulled out of the setup:
 https://gist.github.com/dmsimard/2737832d077cfc5eff34
 Definite overhead going from krbd to krbd + LIO...
 --
 David Moreau Simard
 
 
 On Nov 20, 2014, at 4:14 PM, Nick Fisk n...@fisk.me.uk wrote:
 
 Here you go:-
 
 Erasure Profile
 k=2
 m=1
 plugin=jerasure
 ruleset-failure-domain=osd
 ruleset-root=hdd
 technique=reed_sol_van
 
 Cache Settings
 hit_set_type: bloom
 hit_set_period: 3600
 hit_set_count: 1
 target_max_objects
 target_max_objects: 0
 target_max_bytes: 10
 cache_target_dirty_ratio: 0.4
 cache_target_full_ratio: 0.8
 cache_min_flush_age: 0
 cache_min_evict_age: 0
 
 Crush Dump
 # begin crush map
 tunable choose_local_tries 0
 tunable choose_local_fallback_tries 0
 tunable

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread David Moreau Simard
What are the kernel versions involved ?

We have Ubuntu precise clients talking to an Ubuntu trusty cluster without 
issues - with tunables optimal.
0.88 (Giant) and 0.89 have been working well for us as far as the client and 
OpenStack are concerned.

This link provides some insight as to the possible problems:
http://cephnotes.ksperis.com/blog/2014/01/21/feature-set-mismatch-error-on-ceph-kernel-client

Things to look for:
- Kernel versions
- Cache tiering
- Tunables
- hashpspool
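
Something along these lines should help check those (pool names aside):

# kernel version on the client
uname -r
# current CRUSH tunables on the cluster
ceph osd crush show-tunables
# pool flags (hashpspool) and cache tier settings show up in the pool dump
ceph osd dump | grep '^pool'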

--
David Moreau Simard


 On Dec 5, 2014, at 4:36 AM, Antonio Messina antonio.mess...@s3it.uzh.ch 
 wrote:
 
 On Fri, Dec 5, 2014 at 2:24 AM, Anthony Alba ascanio.al...@gmail.com wrote:
 Hi Cephers,
 
 Have anyone of you decided to put Giant into production instead of Firefly?
 
 This is very interesting to me too: we are going to deploy a large
 ceph cluster on Ubuntu 14.04 LTS, and so far what I have found is that
 the rbd module in Ubuntu Trusty doesn't seem compatible with giant:
 
feature set mismatch, my 4a042a42 < server's 2104a042a42, missing 210
 
 I tried with different ceph osd tunables but nothing seems to fix the issue
 
 However, this cluster will be mainly used for OpenStack, and qemu is
 able to access the rbd volume, so this might not be a big problem for
 me.
 
 .a.
 
 -- 
 antonio.mess...@s3it.uzh.ch +41 (0)44 635 42 22
 antonio.s.mess...@gmail.com
 S3IT: Service and Support for Science IT   http://www.s3it.uzh.ch/
 University of Zurich
 Winterthurerstrasse 190
 CH-8057 Zurich Switzerland
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-12-05 Thread David Moreau Simard
I've flushed everything - data, pools, configs and reconfigured the whole thing.

I was particularly careful with cache tiering configurations (almost leaving 
defaults when possible) and it's not locking anymore.
It looks like the cache tiering configuration I had was causing the problem ? I 
can't put my finger on exactly what/why and I don't have the luxury of time to 
do this lengthy testing again.

Here's what I dumped as far as config goes before wiping:

# for var in size min_size pg_num pgp_num crush_ruleset erasure_code_profile; 
do ceph osd pool get volumes $var; done
size: 5
min_size: 2
pg_num: 7200
pgp_num: 7200
crush_ruleset: 1
erasure_code_profile: ecvolumes

# for var in size min_size pg_num pgp_num crush_ruleset hit_set_type 
hit_set_period hit_set_count target_max_objects target_max_bytes 
cache_target_dirty_ratio cache_target_full_ratio cache_min_flush_age 
cache_min_evict_age; do ceph osd pool get volumecache $var; done
size: 2
min_size: 1
pg_num: 7200
pgp_num: 7200
crush_ruleset: 4
hit_set_type: bloom
hit_set_period: 3600
hit_set_count: 1
target_max_objects: 0
target_max_bytes: 1000
cache_target_dirty_ratio: 0.5
cache_target_full_ratio: 0.8
cache_min_flush_age: 600
cache_min_evict_age: 1800

# ceph osd erasure-code-profile get ecvolumes
directory=/usr/lib/ceph/erasure-code
k=3
m=2
plugin=jerasure
ruleset-failure-domain=osd
technique=reed_sol_van


And now:

# for var in size min_size pg_num pgp_num crush_ruleset erasure_code_profile; 
do ceph osd pool get volumes $var; done
size: 5
min_size: 3
pg_num: 2048
pgp_num: 2048
crush_ruleset: 1
erasure_code_profile: ecvolumes

# for var in size min_size pg_num pgp_num crush_ruleset hit_set_type 
hit_set_period hit_set_count target_max_objects target_max_bytes 
cache_target_dirty_ratio cache_target_full_ratio cache_min_flush_age 
cache_min_evict_age; do ceph osd pool get volumecache $var; done
size: 2
min_size: 1
pg_num: 2048
pgp_num: 2048
crush_ruleset: 4
hit_set_type: bloom
hit_set_period: 3600
hit_set_count: 1
target_max_objects: 0
target_max_bytes: 1500
cache_target_dirty_ratio: 0.5
cache_target_full_ratio: 0.8
cache_min_flush_age: 0
cache_min_evict_age: 1800

# ceph osd erasure-code-profile get ecvolumes
directory=/usr/lib/ceph/erasure-code
k=3
m=2
plugin=jerasure
ruleset-failure-domain=osd
technique=reed_sol_van


Crush map hasn't really changed before and after.

FWIW, the benchmarks I pulled out of the setup: 
https://gist.github.com/dmsimard/2737832d077cfc5eff34
Definite overhead going from krbd to krbd + LIO...
--
David Moreau Simard


 On Nov 20, 2014, at 4:14 PM, Nick Fisk n...@fisk.me.uk wrote:
 
 Here you go:-
 
 Erasure Profile
 k=2
 m=1
 plugin=jerasure
 ruleset-failure-domain=osd
 ruleset-root=hdd
 technique=reed_sol_van
 
 Cache Settings
 hit_set_type: bloom
 hit_set_period: 3600
 hit_set_count: 1
 target_max_objects
 target_max_objects: 0
 target_max_bytes: 10
 cache_target_dirty_ratio: 0.4
 cache_target_full_ratio: 0.8
 cache_min_flush_age: 0
 cache_min_evict_age: 0
 
 Crush Dump
 # begin crush map
 tunable choose_local_tries 0
 tunable choose_local_fallback_tries 0
 tunable choose_total_tries 50
 tunable chooseleaf_descend_once 1
 
 # devices
 device 0 osd.0
 device 1 osd.1
 device 2 osd.2
 device 3 osd.3
 
 # types
 type 0 osd
 type 1 host
 type 2 chassis
 type 3 rack
 type 4 row
 type 5 pdu
 type 6 pod
 type 7 room
 type 8 datacenter
 type 9 region
 type 10 root
 
 # buckets
 host ceph-test-hdd {
id -5   # do not change unnecessarily
# weight 2.730
alg straw
hash 0  # rjenkins1
item osd.1 weight 0.910
item osd.2 weight 0.910
item osd.0 weight 0.910
 }
 root hdd {
id -3   # do not change unnecessarily
# weight 2.730
alg straw
hash 0  # rjenkins1
item ceph-test-hdd weight 2.730
 }
 host ceph-test-ssd {
id -6   # do not change unnecessarily
# weight 1.000
alg straw
hash 0  # rjenkins1
item osd.3 weight 1.000
 }
 root ssd {
id -4   # do not change unnecessarily
# weight 1.000
alg straw
hash 0  # rjenkins1
item ceph-test-ssd weight 1.000
 }
 
 # rules
 rule hdd {
ruleset 0
type replicated
min_size 0
max_size 10
step take hdd
step chooseleaf firstn 0 type osd
step emit
 }
 rule ssd {
ruleset 1
type replicated
min_size 0
max_size 4
step take ssd
step chooseleaf firstn 0 type osd
step emit
 }
 rule ecpool {
ruleset 2
type erasure
min_size 3
max_size 20
step set_chooseleaf_tries 5
step take hdd
step chooseleaf indep 0 type osd
step emit
 }
 
 
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 David Moreau Simard
 Sent: 20 November 2014 20:03

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-20 Thread David Moreau Simard
Nick,

Can you share more details on the configuration you are using? I'll try to 
duplicate those configurations in my environment and see what happens.
I'm mostly interested in:
- Erasure code profile (k, m, plugin, ruleset-failure-domain)
- Cache tiering pool configuration (ex: hit_set_type, hit_set_period, 
hit_set_count, target_max_objects, target_max_bytes, cache_target_dirty_ratio, 
cache_target_full_ratio, cache_min_flush_age, cache_min_evict_age)

The crush rulesets would also be helpful.
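
Something along these lines should dump all of it (profile and pool names are
placeholders):

# erasure code profile
ceph osd erasure-code-profile get <profile>
# cache tier pool settings
for var in hit_set_type hit_set_period hit_set_count target_max_objects \
           target_max_bytes cache_target_dirty_ratio cache_target_full_ratio \
           cache_min_flush_age cache_min_evict_age; do
    ceph osd pool get <cachepool> $var
done
# decompiled crush map
ceph osd getcrushmap -o /tmp/crushmap && crushtool -d /tmp/crushmap -o /tmp/crushmap.txt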

Thanks,
--
David Moreau Simard

 On Nov 20, 2014, at 12:43 PM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi David,
 
 I've just finished running the 75GB fio test you posted a few days back on
 my new test cluster.
 
 The cluster is as follows:-
 
 Single server with 3x hdd and 1 ssd
 Ubuntu 14.04 with 3.16.7 kernel
 2+1 EC pool on hdds below a 10G ssd cache pool. SSD is also partitioned to
 provide journals for hdds.
 150G RBD mapped locally
 
 The fio test seemed to run without any problems. I want to run a few more
 tests with different settings to see if I can reproduce your problem. I will
 let you know if I find anything.
 
 If there is anything you would like me to try, please let me know.
 
 Nick
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 David Moreau Simard
 Sent: 19 November 2014 10:48
 To: Ramakrishna Nishtala (rnishtal)
 Cc: ceph-users@lists.ceph.com; Nick Fisk
 Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target
 
 Rama,
 
 Thanks for your reply.
 
 My end goal is to use iSCSI (with LIO/targetcli) to export rbd block
 devices.
 
 I was encountering issues with iSCSI which are explained in my previous
 emails.
 I ended up being able to reproduce the problem at will on various Kernel and
 OS combinations, even on raw RBD devices - thus ruling out the hypothesis
 that it was a problem with iSCSI but rather with Ceph.
 I'm even running 0.88 now and the issue is still there.
 
 I haven't isolated the issue just yet.
 My next tests involve disabling the cache tiering.
 
 I do have client krbd cache as well, i'll try to disable it too if cache
 tiering isn't enough.
 --
 David Moreau Simard
 
 
 On Nov 18, 2014, at 8:10 PM, Ramakrishna Nishtala (rnishtal)
 rnish...@cisco.com wrote:
 
 Hi Dave
 Did you say iscsi only? The tracker issue does not say though.
 I am on giant, with both client and ceph on RHEL 7 and seems to work ok,
 unless I am missing something here. RBD on baremetal with kmod-rbd and
 caching disabled.
 
 [root@compute4 ~]# time fio --name=writefile --size=100G 
 --filesize=100G --filename=/dev/rbd0 --bs=1M --nrfiles=1 --direct=1 
 --sync=0 --randrepeat=0 --rw=write --refill_buffers --end_fsync=1 
 --iodepth=200 --ioengine=libaio
 writefile: (g=0): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, 
 iodepth=200
 fio-2.1.11
 Starting 1 process
 Jobs: 1 (f=1): [W(1)] [100.0% done] [0KB/853.0MB/0KB /s] [0/853/0 
 iops] [eta 00m:00s] ...
 Disk stats (read/write):
  rbd0: ios=184/204800, merge=0/0, ticks=70/16164931, 
 in_queue=16164942, util=99.98%
 
 real1m56.175s
 user0m18.115s
 sys 0m10.430s
 
 Regards,
 
 Rama
 
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf 
 Of David Moreau Simard
 Sent: Tuesday, November 18, 2014 3:49 PM
 To: Nick Fisk
 Cc: ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target
 
 Testing without the cache tiering is the next test I want to do when I
 have time..
 
 When it's hanging, there is no activity at all on the cluster.
 Nothing in ceph -w, nothing in ceph osd pool stats.
 
 I'll provide an update when I have a chance to test without tiering.
 --
 David Moreau Simard
 
 
 On Nov 18, 2014, at 3:28 PM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi David,
 
 Have you tried on a normal replicated pool with no cache? I've seen 
 a number of threads recently where caching is causing various things to
 block/hang.
 It would be interesting to see if this still happens without the 
 caching layer, at least it would rule it out.
 
 Also is there any sign that as the test passes ~50GB that the cache 
 might start flushing to the backing pool causing slow performance?
 
 I am planning a deployment very similar to yours so I am following 
 this with great interest. I'm hoping to build a single node test 
 cluster shortly, so I might be in a position to work with you on 
 this issue and hopefully get it resolved.
 
 Nick
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On 
 Behalf Of David Moreau Simard
 Sent: 18 November 2014 19:58
 To: Mike Christie
 Cc: ceph-users@lists.ceph.com; Christopher Spearman
 Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target
 
 Thanks guys. I looked at http://tracker.ceph.com/issues/8818 and 
 chatted with dis on #ceph-devel.
 
 I ran a LOT of tests on a LOT of combinations of kernels (sometimes 
 with tunables legacy). I

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-18 Thread David Moreau Simard
Thanks guys. I looked at http://tracker.ceph.com/issues/8818 and chatted with 
dis on #ceph-devel.

I ran a LOT of tests on a LOT of combinations of kernels (sometimes with 
tunables legacy). I haven't found a magical combination in which the following 
test does not hang:
fio --name=writefile --size=100G --filesize=100G --filename=/dev/rbd0 --bs=1M 
--nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers 
--end_fsync=1 --iodepth=200 --ioengine=libaio

Either directly on a mapped rbd device, on a mounted filesystem (over rbd), 
exported through iSCSI.. nothing.
I guess that rules out a potential issue with iSCSI overhead.

Now, something I noticed out of pure luck is that I am unable to reproduce the 
issue if I drop the size of the test to 50GB. Tests will complete in under 2 
minutes.
75GB will hang right at the end and take more than 10 minutes.

TL;DR of tests:
- 3x fio --name=writefile --size=50G --filesize=50G --filename=/dev/rbd0 
--bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write 
--refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
-- 1m44s, 1m49s, 1m40s

- 3x fio --name=writefile --size=75G --filesize=75G --filename=/dev/rbd0 
--bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write 
--refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
-- 10m12s, 10m11s, 10m13s

Details of tests here: http://pastebin.com/raw.php?i=3v9wMtYP

Does that ring a bell for you guys?

--
David Moreau Simard


 On Nov 13, 2014, at 3:31 PM, Mike Christie mchri...@redhat.com wrote:
 
 On 11/13/2014 10:17 AM, David Moreau Simard wrote:
 Running into weird issues here as well in a test environment. I don't have a 
 solution either but perhaps we can find some things in common..
 
 Setup in a nutshell:
 - Ceph cluster: Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 (OSDs with separate 
 public/cluster network in 10 Gbps)
 - iSCSI Proxy node (targetcli/LIO): Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 
 (10 Gbps)
 - Client node: Ubuntu 12.04, Kernel 3.11 (10 Gbps)
 
 Relevant cluster config: Writeback cache tiering with NVME PCI-E cards (2 
 replica) in front of a erasure coded pool (k=3,m=2) backed by spindles.
 
 I'm following the instructions here: 
 http://www.hastexo.com/resources/hints-and-kinks/turning-ceph-rbd-images-san-storage-devices
 No issues with creating and mapping a 100GB RBD image and then creating the 
 target.
 
 I'm interested in finding out the overhead/performance impact of 
 re-exporting through iSCSI so the idea is to run benchmarks.
 Here's a fio test I'm trying to run on the client node on the mounted iscsi 
 device:
 fio --name=writefile --size=100G --filesize=100G --filename=/dev/sdu --bs=1M 
 --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers 
 --end_fsync=1 --iodepth=200 --ioengine=libaio
 
 The benchmark will eventually hang towards the end of the test for some long 
 seconds before completing.
 On the proxy node, the kernel complains with iscsi portal login timeout: 
 http://pastebin.com/Q49UnTPr and I also see irqbalance errors in syslog: 
 http://pastebin.com/AiRTWDwR
 
 
 You are hitting a different issue. German Anders is most likely correct
 and you hit the rbd hang. That then caused the iscsi/scsi command to
 timeout which caused the scsi error handler to run. In your logs we see
 the LIO error handler has received a task abort from the initiator and
 that timed out which caused the escalation (iscsi portal login related
 messages).

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stackforge Puppet Module

2014-11-18 Thread David Moreau Simard
Great find Nick.

I've discussed it on IRC and it does look like a real issue: 
https://github.com/enovance/edeploy-roles/blob/master/puppet-master.install#L48-L52
I've pushed the fix for review: https://review.openstack.org/#/c/135421/

--
David Moreau Simard


 On Nov 18, 2014, at 3:32 PM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi David,
 
 Just to let you know I finally managed to get to the bottom of this.
 
 In the repo.pp, one of the authors has a non-ASCII character in his name; for
 whatever reason this was tripping up my puppet environment. After removing
 the following line:-
 
 # Author: François Charlier francois.charl...@enovance.com
 
 The module proceeds further, I'm now getting an error about a missing arg
 parameter, but I hope this should be pretty easy to solve.
 
 Nick
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 David Moreau Simard
 Sent: 12 November 2014 14:25
 To: Nick Fisk
 Cc: ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] Stackforge Puppet Module
 
 What comes to mind is that you need to make sure that you've cloned the git
 repository to /etc/puppet/modules/ceph and not
 /etc/puppet/modules/puppet-ceph.
 
 Feel free to hop on IRC to discuss about puppet-ceph on freenode in
 #puppet-openstack.
 You can find me there as dmsimard.
 
 --
 David Moreau Simard
 
 On Nov 12, 2014, at 8:58 AM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi David,
 
 Many thanks for your reply.
 
 I must admit I have only just started looking at puppet, but a lot of 
 what you said makes sense to me and understand the reason for not 
 having the module auto discover disks.
 
 I'm currently having a problem with the ceph::repo class when trying 
 to push this out to a test server:-
 
 Error: Could not retrieve catalog from remote server: Error 400 on SERVER:
 Could not find class ceph::repo for ceph-puppet-test on node 
 ceph-puppet-test
 Warning: Not using cache on failed catalog
 Error: Could not retrieve catalog; skipping run
 
 I'm a bit stuck but will hopefully work out why it's not working soon 
 and then I can attempt your idea of using a script to dynamically pass 
 disks to the puppet module.
 
 Thanks,
 Nick
 
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf 
 Of David Moreau Simard
 Sent: 11 November 2014 12:05
 To: Nick Fisk
 Cc: ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] Stackforge Puppet Module
 
 Hi Nick,
 
 The great thing about puppet-ceph's implementation on Stackforge is 
 that it is both unit and integration tested.
 You can see the integration tests here:
 https://github.com/ceph/puppet-ceph/tree/master/spec/system
 
 Where I'm getting at is that the tests allow you to see how you can 
 use the module to a certain extent.
 For example, in the OSD integration tests:
 -
 https://github.com/ceph/puppet-ceph/blob/master/spec/system/ceph_osd_s
 pec.rb
 #L24 and then:
 -
 https://github.com/ceph/puppet-ceph/blob/master/spec/system/ceph_osd_s
 pec.rb
 #L82-L110
 
 There's no auto discovery mechanism built-in the module right now. 
 It's kind of dangerous, you don't want to format the wrong disks.
 
 Now, this doesn't mean you can't discover the disks yourself and 
 pass them to the module from your site.pp or from a composition layer.
 Here's something I have for my CI environment that uses the 
 $::blockdevices fact to discover all devices, split that fact into a 
 list of the devices and then reject the drives I don't want (such as the
 OS disk):
 
   # Assume OS is installed on xvda/sda/vda.
   # On an Openstack VM, vdb is ephemeral, we don't want to use vdc.
   # WARNING: ALL OTHER DISKS WILL BE FORMATTED/PARTITIONED BY CEPH!
   $block_devices = reject(split($::blockdevices, ','),
 '(xvda|sda|vda|vdc|sr0)')
   $devices = prefix($block_devices, '/dev/')
 
 And then you can pass $devices to the module.
 
 Let me know if you have any questions !
 --
 David Moreau Simard
 
 On Nov 11, 2014, at 6:23 AM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi,
 
 I'm just looking through the different methods of deploying Ceph and 
 I particularly liked the idea that the stackforge puppet module 
 advertises of using discover to automatically add new disks. I 
 understand the principle of how it should work; using ceph-disk list 
 to find unknown disks, but I would like to see in a little more 
 detail on
 how it's been implemented.
 
 I've been looking through the puppet module on Github, but I can't 
 see anyway where this discovery is carried out.
 
 Could anyone confirm if this puppet modules does currently support 
 the auto discovery and where  in the code its carried out?
 
 Many Thanks,
 Nick
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-18 Thread David Moreau Simard
Testing without the cache tiering is the next test I want to do when I have 
time..

When it's hanging, there is no activity at all on the cluster.
Nothing in ceph -w, nothing in ceph osd pool stats.

I'll provide an update when I have a chance to test without tiering. 
--
David Moreau Simard


 On Nov 18, 2014, at 3:28 PM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi David,
 
 Have you tried on a normal replicated pool with no cache? I've seen a number
 of threads recently where caching is causing various things to block/hang.
 It would be interesting to see if this still happens without the caching
 layer, at least it would rule it out.
 
 Also is there any sign that as the test passes ~50GB that the cache might
 start flushing to the backing pool causing slow performance?
 
 I am planning a deployment very similar to yours so I am following this with
 great interest. I'm hoping to build a single node test cluster shortly, so
 I might be in a position to work with you on this issue and hopefully get it
 resolved.
 
 Nick
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 David Moreau Simard
 Sent: 18 November 2014 19:58
 To: Mike Christie
 Cc: ceph-users@lists.ceph.com; Christopher Spearman
 Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target
 
 Thanks guys. I looked at http://tracker.ceph.com/issues/8818 and chatted
 with dis on #ceph-devel.
 
 I ran a LOT of tests on a LOT of combinations of kernels (sometimes with
 tunables legacy). I haven't found a magical combination in which the
 following test does not hang:
 fio --name=writefile --size=100G --filesize=100G --filename=/dev/rbd0
 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write
 --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
 
 Either directly on a mapped rbd device, on a mounted filesystem (over rbd),
 exported through iSCSI.. nothing.
 I guess that rules out a potential issue with iSCSI overhead.
 
 Now, something I noticed out of pure luck is that I am unable to reproduce
 the issue if I drop the size of the test to 50GB. Tests will complete in
 under 2 minutes.
 75GB will hang right at the end and take more than 10 minutes.
 
 TL;DR of tests:
 - 3x fio --name=writefile --size=50G --filesize=50G --filename=/dev/rbd0
 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write
 --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
 -- 1m44s, 1m49s, 1m40s
 
 - 3x fio --name=writefile --size=75G --filesize=75G --filename=/dev/rbd0
 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write
 --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
 -- 10m12s, 10m11s, 10m13s
 
 Details of tests here: http://pastebin.com/raw.php?i=3v9wMtYP
 
  Does that ring a bell for you guys?
 
 --
 David Moreau Simard
 
 
 On Nov 13, 2014, at 3:31 PM, Mike Christie mchri...@redhat.com wrote:
 
 On 11/13/2014 10:17 AM, David Moreau Simard wrote:
 Running into weird issues here as well in a test environment. I don't
 have a solution either but perhaps we can find some things in common..
 
 Setup in a nutshell:
 - Ceph cluster: Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 (OSDs with 
 separate public/cluster network in 10 Gbps)
 - iSCSI Proxy node (targetcli/LIO): Ubuntu 14.04, Kernel 3.16.7, Ceph 
 0.87-1 (10 Gbps)
 - Client node: Ubuntu 12.04, Kernel 3.11 (10 Gbps)
 
 Relevant cluster config: Writeback cache tiering with NVME PCI-E cards (2
 replica) in front of a erasure coded pool (k=3,m=2) backed by spindles.
 
 I'm following the instructions here: 
 http://www.hastexo.com/resources/hints-and-kinks/turning-ceph-rbd-ima
 ges-san-storage-devices No issues with creating and mapping a 100GB 
 RBD image and then creating the target.
 
 I'm interested in finding out the overhead/performance impact of
 re-exporting through iSCSI so the idea is to run benchmarks.
 Here's a fio test I'm trying to run on the client node on the mounted
 iscsi device:
 fio --name=writefile --size=100G --filesize=100G --filename=/dev/sdu 
 --bs=1M --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write 
 --refill_buffers --end_fsync=1 --iodepth=200 --ioengine=libaio
 
 The benchmark will eventually hang towards the end of the test for some
 long seconds before completing.
 On the proxy node, the kernel complains with iscsi portal login 
 timeout: http://pastebin.com/Q49UnTPr and I also see irqbalance 
 errors in syslog: http://pastebin.com/AiRTWDwR
 
 
 You are hitting a different issue. German Anders is most likely 
 correct and you hit the rbd hang. That then caused the iscsi/scsi 
 command to timeout which caused the scsi error handler to run. In your 
 logs we see the LIO error handler has received a task abort from the 
 initiator and that timed out which caused the escalation (iscsi portal 
 login related messages).
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-13 Thread David Moreau Simard
Running into weird issues here as well in a test environment. I don't have a 
solution either but perhaps we can find some things in common..

Setup in a nutshell:
- Ceph cluster: Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 (OSDs with separate 
public/cluster network in 10 Gbps)
- iSCSI Proxy node (targetcli/LIO): Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 
(10 Gbps)
- Client node: Ubuntu 12.04, Kernel 3.11 (10 Gbps)

Relevant cluster config: Writeback cache tiering with NVME PCI-E cards (2 
replicas) in front of an erasure coded pool (k=3,m=2) backed by spindles.
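
For context, the general shape of how such a tier gets wired up (pool names, PG
counts and the profile name below are placeholders, not my exact commands):

ceph osd erasure-code-profile set ecprofile k=3 m=2 ruleset-failure-domain=osd
ceph osd pool create ecvolumes 2048 2048 erasure ecprofile
ceph osd pool create cachevolumes 2048 2048
ceph osd tier add ecvolumes cachevolumes
ceph osd tier cache-mode cachevolumes writeback
ceph osd tier set-overlay ecvolumes cachevolumes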

I'm following the instructions here: 
http://www.hastexo.com/resources/hints-and-kinks/turning-ceph-rbd-images-san-storage-devices
No issues with creating and mapping a 100GB RBD image and then creating the 
target.

I'm interested in finding out the overhead/performance impact of re-exporting 
through iSCSI so the idea is to run benchmarks.
Here's a fio test I'm trying to run on the client node on the mounted iscsi 
device:
fio --name=writefile --size=100G --filesize=100G --filename=/dev/sdu --bs=1M 
--nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers 
--end_fsync=1 --iodepth=200 --ioengine=libaio

The benchmark will eventually hang towards the end of the test for some long 
seconds before completing.
On the proxy node, the kernel complains with iscsi portal login timeout: 
http://pastebin.com/Q49UnTPr and I also see irqbalance errors in syslog: 
http://pastebin.com/AiRTWDwR

Doing the same test on the machines directly (raw, rbd, on the osd filesystem) 
doesn't yield any issues.

I've tried a couple things to see if I could get things to work...
- Set irqbalance --hintpolicy=ignore (http://sourceforge.net/p/e1000/bugs/394/ & 
https://bugs.launchpad.net/ubuntu/+source/irqbalance/+bug/1321425)
- Changed size on cache pool to 1 (for the sake of testing, improved 
performance but still hangs)
- Set crush tunables to legacy (and back to optimal)
- Various package and kernel versions and putting the proxy node on Ubuntu 
precise
- Formatting and mounting the iscsi block device and running the test on the 
formatted filesystem
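
In command form, the pool size and tunables changes correspond to something like
this (the cache pool name is a placeholder):

ceph osd pool set <cachepool> size 1
ceph osd crush tunables legacy
ceph osd crush tunables optimal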

I don't think it's related, but I don't remember running into issues before 
I swapped out the SSDs for the NVME cards for the cache pool.
I don't have time *right now* but I definitely want to test if I am able to 
reproduce the issue on the SSDs..

Let me know if this gives you any ideas, I'm all ears.
--
David Moreau Simard

 On Oct 28, 2014, at 4:07 PM, Christopher Spearman neromaver...@gmail.com 
 wrote:
 
 Sage:
 
 That'd be my assumption; performance looked pretty fantastic over loop until 
 it started being used heavily
 
 Mike:
 
 The configs you asked for are at the end of this message. I've subtracted & 
 changed some info (iqn/wwn/portal) for security purposes. The raw & loop 
 target configs are all in one since I'm running both types of configs 
 currently. I also included the running config (ls /) of targetcli for anyone 
 interested in what it looks like from the console.
 
 The tool I used was dd, I ran through various options using dd but didn't 
 really see much difference. The one on top is my go to command for my first 
 test
 
 time dd if=/dev/zero of=test bs=32M count=32 oflag=direct,sync
 time dd if=/dev/zero of=test bs=32M count=128 oflag=direct,sync
 time dd if=/dev/zero of=test bs=8M count=512 oflag=direct,sync
 time dd if=/dev/zero of=test bs=4M count=1024 oflag=direct,sync
 
 
 ---ls / from current targetcli (no mounted ext4 - image file config)---
 
 /iscsi> ls /
 o- / ...................................................................... [...]
   o- backstores ........................................................... [...]
   | o- block ................................................ [Storage Objects: 2]
   | | o- ceph_lun0 .................... [/dev/loop0 (2.0TiB) write-thru activated]
   | | o- ceph_noloop00 ..... [/dev/rbd/vmiscsi/noloop00 (1.0TiB) write-thru activated]
   | o- fileio ............................................... [Storage Objects: 0]
   | o- pscsi ................................................ [Storage Objects: 0]
   | o- ramdisk .............................................. [Storage Objects: 0]
   o- iscsi ............................................................ [Targets: 2]
   | o- iqn.gateway2_01 ................................................. [TPGs: 1]
   | | o- tpg1

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-13 Thread David Moreau Simard
That's interesting.

Although I'm running 3.16.7 and I'd expect the patch to be in already, 
I'll downgrade to the working 3.16.0 kernel and report back if this fixes the 
issue.

Thanks for the pointer.
--
David Moreau Simard


 On Nov 13, 2014, at 1:15 PM, German Anders gand...@despegar.com wrote:
 
 Is it possible that you hit bug #8818?
 
  
 German Anders
 
 --- Original message --- 
 Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target 
 From: David Moreau Simard dmsim...@iweb.com 
 To: Christopher Spearman neromaver...@gmail.com 
 Cc: ceph-users@lists.ceph.com ceph-users@lists.ceph.com 
 Date: Thursday, 13/11/2014 13:17
 
 Running into weird issues here as well in a test environment. I don't have a 
 solution either but perhaps we can find some things in common..
 
 Setup in a nutshell:
 - Ceph cluster: Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 (OSDs with separate 
 public/cluster network in 10 Gbps)
 - iSCSI Proxy node (targetcli/LIO): Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 
 (10 Gbps)
 - Client node: Ubuntu 12.04, Kernel 3.11 (10 Gbps)
 
 Relevant cluster config: Writeback cache tiering with NVME PCI-E cards (2 
 replica) in front of a erasure coded pool (k=3,m=2) backed by spindles.
 
 I'm following the instructions here: 
 http://www.hastexo.com/resources/hints-and-kinks/turning-ceph-rbd-images-san-storage-devices
 No issues with creating and mapping a 100GB RBD image and then creating the 
 target.
 
 I'm interested in finding out the overhead/performance impact of 
 re-exporting through iSCSI so the idea is to run benchmarks.
 Here's a fio test I'm trying to run on the client node on the mounted iscsi 
 device:
 fio --name=writefile --size=100G --filesize=100G --filename=/dev/sdu --bs=1M 
 --nrfiles=1 --direct=1 --sync=0 --randrepeat=0 --rw=write --refill_buffers 
 --end_fsync=1 --iodepth=200 --ioengine=libaio
 
 The benchmark will eventually hang towards the end of the test for some long 
 seconds before completing.
 On the proxy node, the kernel complains with iscsi portal login timeout: 
 http://pastebin.com/Q49UnTPr and I also see irqbalance errors in syslog: 
 http://pastebin.com/AiRTWDwR
 
 Doing the same test on the machines directly (raw, rbd, on the osd 
 filesystem) doesn't yield any issues.
 
 I've tried a couple things to see if I could get things to work...
  - Set irqbalance --hintpolicy=ignore 
  (http://sourceforge.net/p/e1000/bugs/394/ & 
  https://bugs.launchpad.net/ubuntu/+source/irqbalance/+bug/1321425)
 - Changed size on cache pool to 1 (for the sake of testing, improved 
 performance but still hangs)
 - Set crush tunables to legacy (and back to optimal)
 - Various package and kernel versions and putting the proxy node on Ubuntu 
 precise
 - Formatting and mounting the iscsi block device and running the test on the 
 formatted filesystem
 
 I don't think it's related .. but I don't remember running into issues 
 before I've swapped out SSDs for the NVME cards for the cache pool.
 I don't have time *right now* but I definitely want to test if I am able to 
 reproduce the issue on the SSDs..
 
 Let me know if this gives you any ideas, I'm all ears.
 --
 David Moreau Simard
 
 On Oct 28, 2014, at 4:07 PM, Christopher Spearman neromaver...@gmail.com 
 wrote:
 
 Sage:
 
  That'd be my assumption; performance looked pretty fantastic over loop 
  until it started being used heavily
 
 Mike:
 
  The configs you asked for are at the end of this message. I've subtracted & 
  changed some info (iqn/wwn/portal) for security purposes. The raw & loop 
  target configs are all in one since I'm running both types of configs 
  currently. I also included the running config (ls /) of targetcli for 
  anyone interested in what it looks like from the console.
 
 The tool I used was dd, I ran through various options using dd but didn't 
 really see much difference. The one on top is my go to command for my first 
 test
 
 time dd if=/dev/zero of=test bs=32M count=32 oflag=direct,sync
 time dd if=/dev/zero of=test bs=32M count=128 oflag=direct,sync
 time dd if=/dev/zero of=test bs=8M count=512 oflag=direct,sync
 time dd if=/dev/zero of=test bs=4M count=1024 oflag=direct,sync
 
 
 ---ls / from current targetcli (no mounted ext4 - image file config)---
 
  /iscsi> ls /
  o- / ..................................................................... [...]
  o- backstores ............................................................ [...]
  | o- block ................................................. [Storage Objects: 2]
  | | o- ceph_lun0 ..................... [/dev/loop0 (2.0TiB) write-thru activated]
  | | o- ceph_noloop00 ...... [/dev/rbd/vmiscsi/noloop00 (1.0TiB) write-thru activated]
  | o

Re: [ceph-users] Stackforge Puppet Module

2014-11-12 Thread David Moreau Simard
What comes to mind is that you need to make sure that you've cloned the git 
repository to /etc/puppet/modules/ceph and not /etc/puppet/modules/puppet-ceph.

Feel free to hop on IRC to discuss about puppet-ceph on freenode in 
#puppet-openstack.
You can find me there as dmsimard.

--
David Moreau Simard

 On Nov 12, 2014, at 8:58 AM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi David,
 
 Many thanks for your reply.
 
 I must admit I have only just started looking at puppet, but a lot of what
 you said makes sense to me and understand the reason for not having the
 module auto discover disks.
 
 I'm currently having a problem with the ceph::repo class when trying to push
 this out to a test server:-
 
 Error: Could not retrieve catalog from remote server: Error 400 on SERVER:
 Could not find class ceph::repo for ceph-puppet-test on node
 ceph-puppet-test
 Warning: Not using cache on failed catalog
 Error: Could not retrieve catalog; skipping run
 
 I'm a bit stuck but will hopefully work out why it's not working soon and
 then I can attempt your idea of using a script to dynamically pass disks to
 the puppet module.
 
 Thanks,
 Nick
 
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
 David Moreau Simard
 Sent: 11 November 2014 12:05
 To: Nick Fisk
 Cc: ceph-users@lists.ceph.com
 Subject: Re: [ceph-users] Stackforge Puppet Module
 
 Hi Nick,
 
 The great thing about puppet-ceph's implementation on Stackforge is that it
 is both unit and integration tested.
 You can see the integration tests here:
 https://github.com/ceph/puppet-ceph/tree/master/spec/system
 
 Where I'm getting at is that the tests allow you to see how you can use the
 module to a certain extent.
 For example, in the OSD integration tests:
 -
 https://github.com/ceph/puppet-ceph/blob/master/spec/system/ceph_osd_spec.rb
 #L24 and then:
 -
 https://github.com/ceph/puppet-ceph/blob/master/spec/system/ceph_osd_spec.rb
 #L82-L110
 
 There's no auto discovery mechanism built-in the module right now. It's kind
 of dangerous, you don't want to format the wrong disks.
 
 Now, this doesn't mean you can't discover the disks yourself and pass them
 to the module from your site.pp or from a composition layer.
 Here's something I have for my CI environment that uses the $::blockdevices
 fact to discover all devices, split that fact into a list of the devices and
 then reject the drives I don't want (such as the OS disk):
 
# Assume OS is installed on xvda/sda/vda.
# On an Openstack VM, vdb is ephemeral, we don't want to use vdc.
# WARNING: ALL OTHER DISKS WILL BE FORMATTED/PARTITIONED BY CEPH!
$block_devices = reject(split($::blockdevices, ','),
 '(xvda|sda|vda|vdc|sr0)')
$devices = prefix($block_devices, '/dev/')
 
 And then you can pass $devices to the module.
 
 Let me know if you have any questions !
 --
 David Moreau Simard
 
 On Nov 11, 2014, at 6:23 AM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi,
 
 I'm just looking through the different methods of deploying Ceph and I 
 particularly liked the idea that the stackforge puppet module 
 advertises of using discover to automatically add new disks. I 
 understand the principle of how it should work; using ceph-disk list 
 to find unknown disks, but I would like to see in a little more detail on
 how it's been implemented.
 
 I've been looking through the puppet module on Github, but I can't see 
 anyway where this discovery is carried out.
 
 Could anyone confirm if this puppet modules does currently support the 
 auto discovery and where  in the code its carried out?
 
 Many Thanks,
 Nick
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 
 
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stackforge Puppet Module

2014-11-11 Thread David Moreau Simard
Hi Nick,

The great thing about puppet-ceph's implementation on Stackforge is that it is 
both unit and integration tested.
You can see the integration tests here: 
https://github.com/ceph/puppet-ceph/tree/master/spec/system

What I'm getting at is that the tests allow you to see how you can use the 
module to a certain extent.
For example, in the OSD integration tests:
- 
https://github.com/ceph/puppet-ceph/blob/master/spec/system/ceph_osd_spec.rb#L24
 and then:
- 
https://github.com/ceph/puppet-ceph/blob/master/spec/system/ceph_osd_spec.rb#L82-L110

There's no auto-discovery mechanism built into the module right now. It's kind of 
dangerous; you don't want to format the wrong disks.

Now, this doesn't mean you can't discover the disks yourself and pass them to 
the module from your site.pp or from a composition layer.
Here's something I have for my CI environment that uses the $::blockdevices 
fact to discover all devices, split that fact into a list of the devices and 
then reject the drives I don't want (such as the OS disk):

# Assume OS is installed on xvda/sda/vda.
# On an Openstack VM, vdb is ephemeral, we don't want to use vdc.
# WARNING: ALL OTHER DISKS WILL BE FORMATTED/PARTITIONED BY CEPH!
$block_devices = reject(split($::blockdevices, ','), 
'(xvda|sda|vda|vdc|sr0)')
$devices = prefix($block_devices, '/dev/')

And then you can pass $devices to the module.
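
For example, something like this should work, assuming ceph::osd accepts a device
path as its resource title (as the integration tests linked above suggest); treat
it as a sketch rather than a tested snippet:

# Declare one ceph::osd resource per discovered device.
ceph::osd { $devices: }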

Let me know if you have any questions !
--
David Moreau Simard

 On Nov 11, 2014, at 6:23 AM, Nick Fisk n...@fisk.me.uk wrote:
 
 Hi,
 
 I'm just looking through the different methods of deploying Ceph and I
 particularly liked the idea that the stackforge puppet module advertises of
 using discovery to automatically add new disks. I understand the principle of
 how it should work (using ceph-disk list to find unknown disks), but I would
 like to see in a little more detail how it's been implemented.
 
 I've been looking through the puppet module on Github, but I can't see
 anyway where this discovery is carried out.
 
 Could anyone confirm whether this puppet module currently supports the auto
 discovery, and where in the code it's carried out?
 
 Many Thanks,
 Nick
 
 
 
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Trying to figure out usable space on erasure coded pools

2014-11-10 Thread David Moreau Simard
Hi,

It's easy to calculate the amount of raw storage vs actual storage on 
replicated pools.
Example with 4x 2TB disks:
- 8TB raw
- 4TB usable (when using 2 replicas)

I understand how erasure coded pools reduce the overhead of storage required 
for data redundancy and resiliency, and how it depends on the erasure coding 
profile you use.
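
My rough back-of-the-envelope so far, assuming usable is roughly raw * k/(k+m)
and ignoring full ratios and uneven distribution:

# e.g. 8 TB raw with a k=3, m=2 profile (assumed numbers)
echo "scale=2; 8 * 3 / (3 + 2)" | bc
# -> 4.80 TB usable, before full_ratio and distribution overhead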

Do you guys have an easy way to figure out the amount of usable storage ?

Thanks !
--
David Moreau Simard



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Trying to figure out usable space on erasure coded pools

2014-11-10 Thread David Moreau Simard
Oh, that's interesting - I didn't know that.

Thanks.
--
David Moreau Simard


 On Nov 10, 2014, at 6:06 PM, Sage Weil s...@newdream.net wrote:
 
 On Mon, 10 Nov 2014, David Moreau Simard wrote:
 Hi,
 
 It's easy to calculate the amount of raw storage vs actual storage on 
 replicated pools.
 Example with 4x 2TB disks:
 - 8TB raw
 - 4TB usable (when using 2 replicas)
 
 I understand how erasure coded pools reduces the overhead of storage 
 required for data redundancy and resiliency and how it depends on the 
 erasure coding profile you use.
 
 Do you guys have an easy way to figure out the amount of usable storage ?
 
 The 'ceph df' command now has a 'MAX AVAIL' column that factors in either 
 the replication factor or erasure k/(k+m) ratio.  It also takes into 
 account the projected distribution of data across disks from the CRUSH 
 rule and uses the 'first OSD to fill up' as the target.
 
 What it doesn't take into account is the expected variation in utilization 
 or the 'full_ratio' and 'near_full_ratio' which will stop writes sometime 
 before that point.
 
 sage

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RadosGW over HTTPS

2014-10-08 Thread David Moreau Simard
Hi Marco,

While I do not have a RadosGW implementation right now, I do have a successful 
setup with tengine and Swift - it should be pretty similar.

What version of tengine are you trying to use ?
It dates back to a while.. but I remember having issues with the 2.0.x branch 
of tengine. We package our own version of 1.5.x.
In hindsight, the issues I got might've been because of the SPDY implementation 
but I didn't put much thought into it at the time.

On my end, the config is in fact very simple and looks a bit like this:

server {
  listen ip:443;

  server_name swift.tld;

  access_log /var/log/nginx/swift_https_access.log;
  error_log /var/log/nginx/swift_https_error.log;

  ssl on;
  ssl_certificate /etc/nginx/ssl/swift.crt;
  ssl_certificate_key /etc/nginx/ssl/swift.key;

  chunkin on;

  error_page 502 503 504 = @errors;
  error_page 411 = @chunk_411_error;
  location @chunk_411_error {
  chunkin_resume;
  }

  proxy_cache swift;
  location / {
proxy_pass http://swift;
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }

  location @errors {
proxy_pass http://127.0.0.1;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host 127.0.0.1;
  }
}

Regarding the HTTP thing, maybe you could set up a redirection and see what 
happens - a bit like this:
server {
  listen ip:80;

  server_name rgw.tld;

  access_log /var/log/nginx/rgw_http_access.log;
  error_log /var/log/nginx/rgw_http_error.log;

  error_page 502 503 504 = @errors;

  if ( $scheme = 'http' ) {
rewrite ^ https://$server_name$request_uri? permanent;
  }

  location @errors {
proxy_pass http://127.0.0.1;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header Host 127.0.0.1;
  }
}
--
David Moreau Simard

On Oct 8, 2014, at 7:53 AM, Marco Garcês ma...@garces.cc wrote:

 Hi there,
 
 I am using RadosGW over NGINX with the Swift API, and everything is
 working great over HTTP, but with HTTPS I keep getting errors, and
 I'm guessing it's something on the gateway itself.
 
 Does anyone have a working HTTPS gateway with nginx? Can you provide
 it, so I can compare to mine?
 
 If I do a HTTP request, using Swift client from my machine, I get the
 response ok, but If I try it with HTTPS, I get:
 
 Account HEAD failed: http://gateway.local/swift/v1 400 Bad Request
 
 and on nginx side:
 
 2014/10/08 13:37:34 [info] 18198#0: *50 client sent plain HTTP request
 to HTTPS port while reading client request headers, client:
 10.5.5.222, server: *.gateway.local, request: HEAD /swift/v1 HTTP/1.1,
 host: gateway.local:443
 2014/10/08 13:37:34 [info] 18197#0: *48 client 10.5.5.222 closed
 keepalive connection
 
 I have wiresharked my connection, and there is no evidence that HTTP
 traffic is going out when I make the request via HTTPS, so that's why
 I believe that the issue is on the gateway end.
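 
 A quick way to double-check that TLS is actually being negotiated on port
 443 is something like this:
 
 $ openssl s_client -connect gateway.local:443 -servername gateway.local < /dev/null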
 
 NGINX Config:
 server {
     listen 80;
     listen 443 ssl default;
 
     server_name *.gateway.bcitestes.local gateway.bcitestes.local;
     error_log logs/error_https.log debug;
     client_max_body_size 10g;
 
     # This is the important option that tengine has, but nginx does not
     fastcgi_request_buffering off;
 
     ssl_certificate      /etc/pki/tls/certs/ca_rgw.crt;
     ssl_certificate_key  /etc/pki/tls/private/ca_rgw.key;
 
     ssl_session_timeout  5m;
 
     ssl_protocols  SSLv2 SSLv3 TLSv1;
     ssl_ciphers  HIGH:!aNULL:!MD5;
     ssl_prefer_server_ciphers  on;
 
     location / {
         fastcgi_pass_header Authorization;
         fastcgi_pass_request_headers on;
         fastcgi_param HTTPS on;
 
         if ($request_method = PUT) {
             rewrite ^ /PUT$request_uri;
         }
 
         include fastcgi_params;
         fastcgi_param HTTPS on;
 
         fastcgi_pass unix:/var/run/ceph/ceph.radosgw.gateway.fastcgi.sock;
     }
 
     location /PUT/ {
         internal;
         fastcgi_pass_header Authorization;
         fastcgi_pass_request_headers on;
 
         include fastcgi_params;
         fastcgi_param CONTENT_LENGTH $content_length;
         fastcgi_param HTTPS on;
 
         fastcgi_pass unix:/var/run/ceph/ceph.radosgw.gateway.fastcgi.sock;
     }
 }
 
 Ceph config:
 [client.radosgw.gw]
 host = GATEWAY
 keyring = /etc/ceph/keyring.radosgw.gw
 rgw socket path = /var/run/ceph/ceph.radosgw.gateway.fastcgi.sock
 log file = /var/log/ceph/client.radosgw.gateway.log
 rgw print continue = false
 rgw dns name = gateway.bcitestes.local
 rgw enable ops log = false
 rgw enable usage log = true
 rgw usage log tick interval = 30
 rgw usage log flush threshold = 1024
 rgw usage max shards = 32
 rgw usage max user shards = 1
 rgw cache lru size = 15000
 rgw thread pool size = 2048

Re: [ceph-users] [Ceph-community] Paris Ceph meetup : september 18th, 2014

2014-09-01 Thread David Moreau Simard
This reminds me that we should also schedule some sort of meetup during
the OpenStack summit, which is also in Paris !

-- 
David Moreau Simard



On 2014-09-01, 8:06 AM, « Loic Dachary » l...@dachary.org wrote:

Hi Ceph,

The next Paris Ceph meetup is scheduled immediately after the Ceph day.

   http://www.meetup.com/Ceph-in-Paris/events/204412892/

I'll be there and hope to discuss the Giant features on this occasion :-)

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre

___
Ceph-community mailing list
ceph-commun...@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-community-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [Single OSD performance on SSD] Can't go over 3, 2K IOPS

2014-08-28 Thread David Moreau Simard
That's definitely interesting.

Are these changes meant to be released in a Firefly dot release or will they
land in Giant ?
-- 
David Moreau Simard


On 2014-08-28, 1:49 PM, « Somnath Roy » somnath@sandisk.com wrote:

Yes, Mark, all of my changes are in ceph main now and we are getting
significant RR performance improvement with that.

Thanks  Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Mark Nelson
Sent: Thursday, August 28, 2014 10:43 AM
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] [Single OSD performance on SSD] Can't go over
3, 2K IOPS

On 08/28/2014 12:39 PM, Somnath Roy wrote:
 Hi Sebastian,
 If you are trying with the latest Ceph master, there are some changes
we made that will increase your read performance from SSD by a factor
of ~5X when the IOs are hitting the disks; when serving from memory, the
improvement is even bigger. With an increasing number of clients a single
OSD will eventually become CPU-bound, in both the disk and memory
scenarios. Some new config options were introduced; here they are:

  osd_op_num_threads_per_shard
  osd_op_num_shards
  throttler_perf_counter
  osd_enable_op_tracker
  filestore_fd_cache_size
  filestore_fd_cache_shards

 The work pool for the IO path is now sharded and the above options
control this. osd_op_threads is no longer in the IO path. Also, the
filestore FD cache is sharded now.
 In my setup (64GB RAM, 40-core CPU with HT enabled) the following
config file on a single OSD is giving the optimum result for 4K RR reads.

 [global]

  filestore_xattr_use_omap = true

  debug_lockdep = 0/0
  debug_context = 0/0
  debug_crush = 0/0
  debug_buffer = 0/0
  debug_timer = 0/0
  debug_filer = 0/0
  debug_objecter = 0/0
  debug_rados = 0/0
  debug_rbd = 0/0
  debug_journaler = 0/0
  debug_objectcatcher = 0/0
  debug_client = 0/0
  debug_osd = 0/0
  debug_optracker = 0/0
  debug_objclass = 0/0
  debug_filestore = 0/0
  debug_journal = 0/0
  debug_ms = 0/0
  debug_monc = 0/0
  debug_tp = 0/0
  debug_auth = 0/0
  debug_finisher = 0/0
  debug_heartbeatmap = 0/0
  debug_perfcounter = 0/0
  debug_asok = 0/0
  debug_throttle = 0/0
  debug_mon = 0/0
  debug_paxos = 0/0
  debug_rgw = 0/0
  osd_op_threads = 5
  osd_op_num_threads_per_shard = 1
  osd_op_num_shards = 25
  #osd_op_num_sharded_pool_threads = 25
  filestore_op_threads = 4

  ms_nocrc = true
  filestore_fd_cache_size = 64
  filestore_fd_cache_shards = 32
  cephx sign messages = false
  cephx require signatures = false

  ms_dispatch_throttle_bytes = 0
  throttler_perf_counter = false


 [osd]
  osd_client_message_size_cap = 0
  osd_client_message_cap = 0
  osd_enable_op_tracker = false


 What I saw is that the optracker is one of the major bottlenecks and we
are in the process of optimizing it. For now, code to enable/disable the
optracker has been introduced. Also, several bottlenecks at the filestore
level have been removed.
 Unfortunately, we have yet to optimize the write path. All of this
should help the write path as well, but the write path improvement will
not be visible until all the lock serialization is removed.

This is what I'm waiting for. :)  I've been meaning to ask you Somnath,
how goes progress?

Mark


 Thanks  Regards
 Somnath
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
 Of Sebastien Han
 Sent: Thursday, August 28, 2014 9:12 AM
 To: ceph-users
 Cc: Mark Nelson
 Subject: [ceph-users] [Single OSD performance on SSD] Can't go over 3,
 2K IOPS

 Hey all,

 It has been a while since the last performance-related thread on the ML
:p I've been running some experiments to see how much I can get from an
SSD on a Ceph cluster.
 To achieve that I did something pretty simple:

 * Debian wheezy 7.6
 * kernel from debian 3.14-0.bpo.2-amd64
 * 1 cluster, 3 mons (I'd like to keep this realistic since in a real
 deployment I'll use 3)
 * 1 OSD backed by an SSD (journal and osd data on the same device)
 * 1 replica count of 1
 * partitions are perfectly aligned
 * io scheduler is set to noop but deadline was showing the same
 results
 * no updatedb running

 About the box:

 * 32GB of RAM
 * 12 cores with HT @ 2,4 GHz
 * WB cache is enabled on the controller
 * 10Gbps network (doesn't help here)

 The SSD is a 200G Intel DC S3700 and is capable of delivering around
29K IOPS with random 4K writes (my fio results). As a benchmark tool I
used fio with the rbd engine (thanks Deutsche Telekom guys!).
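 
 For reference, a 4k random write job with the rbd engine looks roughly
 like this (pool, image and client names are placeholders, and fio has to
 be built with rbd support for the engine to be available):
 
 $ fio --ioengine=rbd --clientname=admin --pool=rbd --rbdname=fio-test \
       --rw=randwrite --bs=4k --iodepth=32 --numjobs=1 \
       --runtime=60 --time_based --name=rbd-4k-randwrite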

 O_DIRECT and D_SYNC don't seem to be a problem for the SSD:

 # dd if=/dev/urandom of=rand.file
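 
 A minimal write test along those lines (the target path and size are made
 up) would be:
 
 # dd if=rand.file of=/mnt/ssd/ddtest bs=4k count=65536 oflag=direct,dsync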

Re: [ceph-users] Fixed all active+remapped PGs stuck forever (but I have no clue why)

2014-08-14 Thread David Moreau Simard
Ah, I was afraid it would be related to the number of replicas versus the
number of host buckets.

Makes sense. I was unable to reproduce the issue with three hosts and one OSD 
on each host.

Thanks.
--
David Moreau Simard

On Aug 14, 2014, at 12:36 AM, Christian Balzer 
ch...@gol.com wrote:


Hello,

On Thu, 14 Aug 2014 03:38:11 + David Moreau Simard wrote:

Hi,

Trying to update my continuous integration environment.. same deployment
method with the following specs:
- Ubuntu Precise, Kernel 3.2, Emperor (0.72.2) - Yields a successful,
healthy cluster.
- Ubuntu Trusty, Kernel 3.13, Firefly (0.80.5) - I have stuck placement
groups.

Here’s some relevant bits from the Trusty/Firefly setup before I move on
to what I’ve done/tried: http://pastebin.com/eqQTHcxU — This was about
halfway through PG healing.

So, the setup is three monitors, two other hosts on which there are 9
OSDs each. At the beginning, all my placement groups were stuck unclean.

And there's your reason why the firefly install failed.
The default replication is 3 and you have just 2 storage nodes; combined
with the default CRUSH rules, that's exactly what will happen.
To avoid this from the start either use 3 nodes or set
---
osd_pool_default_size = 2
osd_pool_default_min_size = 1
---
in your ceph.conf very early on, before creating anything, especially
OSDs.

Setting the replication for all your pools to 2 with 'ceph osd pool set
<name> size 2' as the first step after your install should have worked, too.
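
For example, repeated for each of your pools (the pool name here is just
the default 'rbd' pool, adjust to yours):

  ceph osd pool set rbd size 2
  ceph osd pool set rbd min_size 1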

But with all the things you tried, I can't really tell you why things
behaved they way they did for you.

Christian

I tried the easy things first:
- set crush tunables to optimal
- run repairs/scrub on OSDs
- restart OSDs

Nothing happened. All ~12000 PGs remained stuck unclean since forever
active+remapped. Next, I played with the crush map. I deleted the
default replicated_ruleset rule and created a (basic) rule for each pool
for the time being. I set the pools to use their respective rule and
also reduced their size to 2 and min_size to 1.

Still nothing, all PGs stuck.
I’m not sure why but I tried setting the crush tunables to legacy - I
guess in a trial and error attempt.

Half my PGs healed almost immediately. 6082 PGs remained in
active+remapped. I try running scrubs/repairs - it won’t heal the other
half. I set the tunables back to optimal, still nothing.

I set tunables to legacy again and most of them end up healing with only
1335 left in active+remapped.

The remainder of the PGs healed when I restarted the OSDs.

Does anyone have a clue why this happened ?
It looks like switching back and forth between tunables fixed the stuck
PGs ?

I can easily reproduce this if anyone wants more info.

Let me know !
--
David Moreau Simard

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Christian BalzerNetwork/Systems Engineer
ch...@gol.com    Global OnLine Japan/Fusion Communications
http://www.gol.com/

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Fixed all active+remapped PGs stuck forever (but I have no clue why)

2014-08-13 Thread David Moreau Simard
Hi,

Trying to update my continuous integration environment.. same deployment method 
with the following specs:
- Ubuntu Precise, Kernel 3.2, Emperor (0.72.2) - Yields a successful, healthy 
cluster.
- Ubuntu Trusty, Kernel 3.13, Firefly (0.80.5) - I have stuck placement groups.

Here’s some relevant bits from the Trusty/Firefly setup before I move on to 
what I’ve done/tried:
http://pastebin.com/eqQTHcxU — This was about halfway through PG healing.

So, the setup is three monitors, two other hosts on which there are 9 OSDs each.
At the beginning, all my placement groups were stuck unclean.

I tried the easy things first:
- set crush tunables to optimal
- run repairs/scrub on OSDs
- restart OSDs

Nothing happened. All ~12000 PGs remained stuck unclean since forever 
active+remapped.
Next, I played with the crush map. I deleted the default replicated_ruleset 
rule and created a (basic) rule for each pool for the time being.
I set the pools to use their respective rule and also reduced their size to 2 
and min_size to 1.

Still nothing, all PGs stuck.
I’m not sure why but I tried setting the crush tunables to legacy - I guess in 
a trial and error attempt.

Half my PGs healed almost immediately. 6082 PGs remained in active+remapped.
I tried running scrubs/repairs - that wouldn't heal the other half. I set the tunables 
back to optimal, still nothing.

I set tunables to legacy again and most of them end up healing with only 1335 
left in active+remapped.

The remainder of the PGs healed when I restarted the OSDs.

Does anyone have a clue why this happened ?
It looks like switching back and forth between tunables fixed the stuck PGs ?

I can easily reproduce this if anyone wants more info.

Let me know !
--
David Moreau Simard

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] New Ceph mirror on the east coast

2014-07-31 Thread David Moreau Simard
Hi there,

I’m glad to announce that Ceph is now part of the mirrors iWeb provides.

It is available over both IPv4 and IPv6:
- http on http://mirror.iweb.ca/ or directly on http://ceph.mirror.iweb.ca/
- rsync on ceph.mirror.iweb.ca::ceph
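
For example, to pull a full copy over rsync (the destination directory is up
to you):

$ rsync -avr --stats --progress ceph.mirror.iweb.ca::ceph /srv/mirrors/ceph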

The mirror provides 4 Gbps of connectivity and is located on the eastern coast 
of Canada, more precisely in Montreal, Quebec.
We feel this complements the principal Ceph mirror at ceph.com on the west 
coast and the European mirror at eu.ceph.com very well.

Feel free to give it a try and let me know if you see any problems !

--
David Moreau Simard
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] EU mirror now supports rsync

2014-07-17 Thread David Moreau Simard
(taking this back to ceph-users, not sure why I posted to ceph-devel?)

Thanks for the info, I sent them a message to inquire about access.
In the meantime, the mirror is already synchronized (sync every 4 hours) and 
available on http://mirror.iweb.ca or directly on http://ceph.mirror.iweb.ca.

David Moreau Simard

On Jul 17, 2014, at 5:21 AM, Wido den Hollander 
w...@42on.com wrote:

On 07/16/2014 09:48 PM, David Moreau Simard wrote:
Hi,

Thanks for making this available.
I am currently synchronizing off of it and will make it available on our 4 Gbps 
mirror on the Canadian east coast by the end of this week.


Cool! More mirrors is always better.

Are you able to share how you are synchronizing from the Ceph repositories ?
It would probably be better for us to synchronize from the source rather than 
the Europe mirror.


I have an SSH account on the ceph.com server and use that to 
sync the packages. I've set this up with the community guys Ross and Patrick, 
you might want to ping them.

There is no official distribution mechanism for this right now. I simply set up 
a rsyncd to provide this the community.

Wido

--
David Moreau Simard

On Apr 9, 2014, at 2:04 AM, Wido den Hollander 
w...@42on.com wrote:

Hi,

I just enabled rsync on the eu.ceph.com mirror.

eu.ceph.com mirrors from Ceph.com every 3 hours.

Feel free to rsync all the contents to your local environment, might be useful
for some large deployments where you want to save external bandwidth by not
having each machine fetch the Deb/RPM packages from the internet.

Rsync is available over IPv4 and IPv6, simply sync with this command:
$ mkdir cephmirror
$ rsync -avr --stats --progress eu.ceph.com::ceph cephmirror

I ask you all to be gentle. It's a free service, so don't start hammering the
server by setting your Cron to sync every 5 minutes. Once every couple of hours
should be sufficient.

Also, please don't all start syncing at the first minute of the hour. When
setting up the Cron, select a random minute from the hour. This way the load on
the system can be spread out.
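
For example, a crontab entry along these lines (the minute and the local
destination path are just examples, pick your own random minute):

# m h dom mon dow  command
42 */4 * * *  rsync -avr --stats eu.ceph.com::ceph /srv/mirrors/ceph > /dev/null 2>&1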

Should you have any questions or issues, let me know!

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Wido den Hollander
Ceph consultant and trainer
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Calamari Goes Open Source

2014-05-30 Thread David Moreau Simard
Awesome.

--
David Moreau Simard

On May 30, 2014, at 6:04 PM, Patrick McGarry patr...@inktank.com wrote:

 Hey cephers,
 
 Sorry to push this announcement so late on a Friday but...
 
 Calamari has arrived!
 
 The source code bits have been flipped, the ticket tracker has been
 moved, and we have even given you a little bit of background from both
 a technical and vision point of view:
 
 Technical (ceph.com):
 http://ceph.com/community/ceph-calamari-goes-open-source/
 
 Vision (inktank.com):
 http://www.inktank.com/software/future-of-calamari/
 
 The ceph.com link should give you everything you need to know about
 what tech comprises Calamari, where the source lives, and where the
 discussions will take place.  If you have any questions feel free to
 hit the new ceph-calamari list or stop by IRC and we'll get you
 started.  Hope you all enjoy the GUI!
 
 
 
 Best Regards,
 
 Patrick McGarry
 Director, Community || Inktank
 http://ceph.com  ||  http://inktank.com
 @scuttlemonkey || @ceph || @inktank
 --
 To unsubscribe from this list: send the line unsubscribe ceph-devel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] The Kraken has been released!

2014-01-12 Thread David Moreau Simard
Hey guys, just wanted to say that python-cephclient is packaged on pypi now:
https://pypi.python.org/pypi/python-cephclient

It is still in its early stages, so feel free to provide me some feedback on 
github:
https://github.com/dmsimard/python-cephclient
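
Under the hood it talks to ceph-rest-api, so a quick way to poke at the same
data by hand looks like this (the port and URL prefix below are the
ceph-rest-api defaults and may differ in your setup):

$ ceph-rest-api -n client.admin &
$ curl http://localhost:5000/api/v0.1/health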

David Moreau Simard
IT Architecture Specialist

On Jan 9, 2014, at 2:04 PM, David Moreau Simard dmsim...@iweb.com wrote:

 Awesome.
 
 You guys feel free to contribute to 
 https://github.com/dmsimard/python-cephclient as well :)
 
 David Moreau Simard
 IT Architecture Specialist
 
 http://iweb.com
 
 On Jan 9, 2014, at 12:31 AM, Don Talton (dotalton) dotal...@cisco.com wrote:
 
 The first phase of Kraken (free) dashboard for Ceph cluster monitoring is 
 complete. You can grab it here (https://github.com/krakendash/krakendash)
  
 Pictures here http://imgur.com/a/JoVPy
  
 Current features:
  
    MON statuses
    OSD statuses
      OSD detail drilldown
    Pool statuses
      Pool detail drilldown
  
 Upcoming features:
   Advanced metrics via collectd
   Cluster management (eg write) operations
   Multi-cluster support
   Hardware node monitoring
  
 Dave Simard has contributed a wrapper for the Ceph API here 
 (https://github.com/dmsimard/python-cephclient) which Kraken will begin 
 using shortly.
  
 Pull requests are welcome! The more the merrier, I’d love to get more 
 features developed.
  
 Donald Talton
 Cloud Systems Development
 Cisco Systems
  
  
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
 
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] The Kraken has been released!

2014-01-09 Thread David Moreau Simard
Awesome.

You guys feel free to contribute to 
https://github.com/dmsimard/python-cephclient as well :)

David Moreau Simard
IT Architecture Specialist

http://iweb.com

On Jan 9, 2014, at 12:31 AM, Don Talton (dotalton) dotal...@cisco.com wrote:

 The first phase of Kraken (free) dashboard for Ceph cluster monitoring is 
 complete. You can grab it here (https://github.com/krakendash/krakendash)
  
 Pictures here http://imgur.com/a/JoVPy
  
 Current features:
  
    MON statuses
    OSD statuses
      OSD detail drilldown
    Pool statuses
      Pool detail drilldown
  
 Upcoming features:
   Advanced metrics via collectd
   Cluster management (eg write) operations
   Multi-cluster support
   Hardware node monitoring
  
 Dave Simard has contributed a wrapper for the Ceph API here 
 (https://github.com/dmsimard/python-cephclient) which Kraken will begin using 
 shortly.
  
 Pull requests are welcome! The more the merrier, I’d love to get more 
 features developed.
  
 Donald Talton
 Cloud Systems Development
 Cisco Systems
  
  
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com