Re: [ceph-users] performance issue with jewel on ubuntu xenial (kernel)

2016-06-22 Thread Florian Haas
On Wed, Jun 22, 2016 at 10:56 AM, Yoann Moulin wrote: > Hello Florian, > >> On Tue, Jun 21, 2016 at 3:11 PM, Yoann Moulin wrote: >>> Hello, >>> >>> I found a performance drop between kernel 3.13.0-88 (default kernel on >>> Ubuntu >>> Trusty 14.04) and

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Christian Balzer
Hello Blair, hello Wade (see below), On Thu, 23 Jun 2016 12:55:17 +1000 Blair Bethwaite wrote: > On 23 June 2016 at 12:37, Christian Balzer wrote: > > Case in point, my main cluster (RBD images only) with 18 5+TB OSDs on 3 > > servers (64GB RAM each) has 1.8 million 4MB RBD

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
On 23 June 2016 at 12:37, Christian Balzer wrote: > Case in point, my main cluster (RBD images only) with 18 5+TB OSDs on 3 > servers (64GB RAM each) has 1.8 million 4MB RBD objects using about 7% of > the available space. > Don't think I could hit this problem before running out

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Christian Balzer
On Thu, 23 Jun 2016 11:33:05 +1000 Blair Bethwaite wrote: > Wade, good to know. > > For the record, what does this work out to roughly per OSD? And how > much RAM and how many PGs per OSD do you have? > > What's your workload? I wonder whether for certain workloads (e.g. > RBD) it's better to

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
Hi Christian, Ah ok, I didn't see object size mentioned earlier. But I guess direct rados small objects would be a rare-ish use-case, which explains the very high object counts. I'm interested in finding the right balance for RBD, given object size is another variable that can be tweaked there. I

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Wade Holler
No. Our application writes very small objects. On Wed, Jun 22, 2016 at 10:01 PM, Blair Bethwaite wrote: > On 23 June 2016 at 11:41, Wade Holler wrote: >> Workload is native librados with python. ALL 4k objects. > > Was that meant to be 4MB? >

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Christian Balzer
On Thu, 23 Jun 2016 12:01:38 +1000 Blair Bethwaite wrote: > On 23 June 2016 at 11:41, Wade Holler wrote: > > Workload is native librados with python. ALL 4k objects. > > Was that meant to be 4MB? > Nope, he means 4K, he's putting lots of small objects via a python

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
On 23 June 2016 at 11:41, Wade Holler wrote: > Workload is native librados with python. ALL 4k objects. Was that meant to be 4MB? -- Cheers, ~Blairo

Re: [ceph-users] Ceph 10.1.1 rbd map fail

2016-06-22 Thread Brad Hubbard
On Wed, Jun 22, 2016 at 3:20 PM, 王海涛 wrote: > I find this message in dmesg: > [83090.212918] libceph: mon0 192.168.159.128:6789 feature set mismatch, my > 4a042a42 < server's 2004a042a42, missing 200 > > According to >
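
The two usual ways out of a feature-set mismatch like this are to run a newer kernel on the client or to lower the cluster's CRUSH tunables so the kernel's feature set suffices. A minimal sketch of the latter (which profile is appropriate depends on the exact missing bit, and lowering tunables triggers data movement):

    # illustrative: fall back to an older tunables profile the kernel client understands
    ceph osd crush tunables hammer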

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Wade Holler
Blairo, We'll speak in pre-replication numbers, replication for this pool is 3. 23.3 Million Objects / OSD pg_num 2048 16 OSDs / Server 3 Servers 660 GB RAM Total, 179 GB Used (free -t) / Server vm.swappiness = 1 vm.vfs_cache_pressure = 100 Workload is native librados with python. ALL 4k

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Blair Bethwaite
Wade, good to know. For the record, what does this work out to roughly per OSD? And how much RAM and how many PGs per OSD do you have? What's your workload? I wonder whether for certain workloads (e.g. RBD) it's better to increase default object size somewhat before pushing the split/merge up a

[ceph-users] about image's largest size

2016-06-22 Thread Ops Cloud
We want to run a backup server, which has huge storage as its backend. If we use the rbd client to mount block storage from Ceph, how large can a single image be? xxx TB or PB? Thank you. -- Ops Cloud o...@19cloud.net

Re: [ceph-users] Dramatic performance drop at certain number of objects in pool

2016-06-22 Thread Wade Holler
Based on everyone's suggestions: the first modification, to 50 / 16, enabled our config to get to ~645 million objects before the behavior in question was observed (~330 million was the previous ceiling). The subsequent modification to 50 / 24 has enabled us to get to 1.1 billion+. Thank you all very much for your
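
For readers of the archive: per the rest of the thread, the "50 / 16" and "50 / 24" values refer to the filestore merge/split settings. A hedged ceph.conf sketch, assuming that mapping; the split point works out to roughly filestore_split_multiple * 16 * abs(filestore_merge_threshold) objects per PG subdirectory, and the OSDs need a restart to pick the values up:

    [osd]
    filestore merge threshold = 50
    filestore split multiple = 24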

Re: [ceph-users] Cache Tiering with Same Cache Pool

2016-06-22 Thread Christian Balzer
Hello, On Wed, 22 Jun 2016 15:40:40 +0700 Lazuardi Nasution wrote: > Hi Christian, > > If I have several cache pools on the same SSD OSDs (by using the same ruleset), > those cache pools always show the same Max Available in the "ceph df detail" > output, That's true for all pools that share the same

Re: [ceph-users] Ceph RBD object-map and discard in VM

2016-06-22 Thread Jason Dillaman
I'm not sure why I never received the original list email, so I apologize for the delay. Is /dev/sda1, from your example, fresh with no data to actually discard or does it actually have lots of data to discard? Thanks, On Wed, Jun 22, 2016 at 1:56 PM, Brian Andrus wrote: >

[ceph-users] cephfs snapshots

2016-06-22 Thread Gregory Farnum
[re-adding ceph-users] Yes, it can corrupt the metadata and require use of filesystem repair tools. I really don't recommend using snapshots except on toy clusters. On Wednesday, June 22, 2016, Brady Deetz > wrote: > Snapshots

Re: [ceph-users] cephfs snapshots

2016-06-22 Thread Gregory Farnum
On Wednesday, June 22, 2016, Kenneth Waegeman wrote: > Hi all, > > In Jewel ceph fs snapshots are still experimental. Does someone have a clue > when this will become stable, or how experimental it is? > We're not sure yet. Probably it will follow stable multi-MDS;

[ceph-users] issues with misplaced object and PG that won't clean

2016-06-22 Thread Mike Shaffer
Hi, I'm running into an issue with a PG that will not become clean and seems to be blocking requests. When I restart an OSD that the PG is on, I never reach recovery. Removing the OSD in question only seems to move the problem. Additionally, ceph pg query hangs forever unless the OSD in question is

Re: [ceph-users] Ceph RBD object-map and discard in VM

2016-06-22 Thread Brian Andrus
I've created a downstream bug for this same issue. https://bugzilla.redhat.com/show_bug.cgi?id=1349116 On Wed, Jun 15, 2016 at 6:23 AM, wrote: > Hello guys, > > We are currently testing Ceph Jewel with object-map feature enabled: > > rbd image 'disk-22920': > size

[ceph-users] libceph dns resolution

2016-06-22 Thread Willi Fehler
Hello, I'm trying to mount Ceph storage. It seems that libceph does not resolve names from /etc/hosts? libceph: parse_ips bad ip 'linsrv001.willi-net.local' Regards - Willi
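
The kernel client (libceph) wants monitor addresses as IPs rather than names, so the usual workaround is to mount by IP. An illustrative sketch with a hypothetical monitor address and secret file path:

    mount -t ceph 192.168.0.10:6789:/ /mnt/cephfs \
        -o name=admin,secretfile=/etc/ceph/admin.secret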

[ceph-users] Use of legacy bobtail tunables and potential performance impact to "jewel"?

2016-06-22 Thread Yang X
Our ceph client is using the legacy bobtail tunables and in particular, "chooseleaf_vary_r" is set to 0. My question is how this would impact CRUSH, and hence performance, when deploying "jewel" on the server side along with the experimental "bluestore" backend. Does it only affect data placement or does it
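
A quick way to see which tunables the cluster currently advertises, and to dry-run placements against the CRUSH map before changing anything, is sketched below (file names are arbitrary):

    ceph osd crush show-tunables                 # shows chooseleaf_vary_r among others
    ceph osd getcrushmap -o crush.bin
    crushtool -i crush.bin --test --show-statistics --rule 0 --num-rep 3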

[ceph-users] Error EPERM when running ceph tell command

2016-06-22 Thread Andrei Mikhailovsky
Hi, I am trying to run an OSD-level benchmark but get the following error: # ceph tell osd.3 bench Error EPERM: problem getting command descriptions from osd.3 I am running Jewel 10.2.2 on Ubuntu 16.04 servers. Has the syntax changed, or do I have an issue? Cheers Andrei
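
The syntax is unchanged in Jewel, so one thing worth ruling out is the capabilities of the key being used; a hedged sketch:

    ceph auth get client.admin          # confirm the key carries 'osd allow *'
    ceph --id admin tell osd.3 bench    # retry explicitly with the admin key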

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-22 Thread Andrei Mikhailovsky
Hi Daniel, Many thanks, I will keep this in mind while performing updates in the future. Note to documentation manager - perhaps it makes sense to add this solution as a note/tip to the Upgrade section of the release notes? Andrei - Original Message - > From: "Daniel Swarbrick"

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-22 Thread Daniel Swarbrick
On 22/06/16 17:54, Andrei Mikhailovsky wrote: > Hi Daniel, > > Many thanks for your useful tests and your results. > > How much IO wait do you have on your client vms? Has it significantly > increased or not? > Hi Andrei, Bearing in mind that this cluster is tiny (four nodes, each with four

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-22 Thread Andrei Mikhailovsky
Hi Daniel, Many thanks for your useful tests and your results. How much IO wait do you have on your client vms? Has it significantly increased or not? Many thanks Andrei - Original Message - > From: "Daniel Swarbrick" > To: "ceph-users"

[ceph-users] ceph osd create - how to detect changes

2016-06-22 Thread George Shuklin
I'm writing a ceph playbook and I see some issues with the ceph osd create command. When I call it with all arguments, I can't distinguish between it creating a new OSD and it confirming that the OSD already exists. ceph osd create 5ecc7a8c-388a-11e6-b8ad-5f3ab2552b13 22; echo $? 22 0 ceph osd create
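
Because "ceph osd create" with a UUID is idempotent, the exit status alone does not say whether anything changed. A minimal sketch of one way a playbook can detect it, by checking the OSD map for the UUID first (UUID and id taken from the example above):

    UUID=5ecc7a8c-388a-11e6-b8ad-5f3ab2552b13
    if ceph osd dump | grep -q "$UUID"; then
        echo "osd with uuid $UUID already exists"
    else
        ceph osd create "$UUID" 22
        echo "created new osd"
    fi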

Re: [ceph-users] problems mounting from fstab on boot

2016-06-22 Thread David Riedl
You have to add _netdev to your line in fstab. Example: localhost:/data /var/data glusterfs _netdev 0 0 from https://blog.sleeplessbeastie.eu/2013/05/10/centos-6-_netdev-fstab-option-and-netfs-service/ On 22.06.2016 15:16, Daniel Davidson wrote: When I add my ceph system to
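
For a CephFS kernel mount the same idea applies; an illustrative fstab line (monitor address, mount point and secret file path are placeholders):

    192.168.0.10:6789:/  /mnt/cephfs  ceph  name=admin,secretfile=/etc/ceph/admin.secret,_netdev  0  0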

[ceph-users] problems mounting from fstab on boot

2016-06-22 Thread Daniel Davidson
When I add my ceph system to fstab, I can mount it by referencing it, but when I restart the system, it stops during boot because the mount failed. I am guessing this is because fstab is processed before the network starts? Using CentOS 7. Thanks for the help, Dan

Re: [ceph-users] cluster down during backfilling, Jewel tunables and client IO optimisations

2016-06-22 Thread Daniel Swarbrick
On 20/06/16 19:51, Gregory Farnum wrote: > On Mon, Jun 20, 2016 at 8:33 AM, Daniel Swarbrick >> >> At this stage, I have a strong suspicion that it is the introduction of >> "require_feature_tunables5 = 1" in the tunables. This seems to require >> all RADOS connections to be re-established. > >

[ceph-users] cephfs snapshots

2016-06-22 Thread Kenneth Waegeman
Hi all, In Jewel, ceph fs snapshots are still experimental. Does someone have a clue when this will become stable, or how experimental it is?

[ceph-users] ceph-release RPM has broken URL

2016-06-22 Thread Oleksandr Natalenko
Hello. ceph-release-1-1.el7.noarch.rpm [1] is broken right now because it contains the wrong baseurl: === baseurl=http://ceph.com/rpm-hammer/rhel7/$basearch === That leads to a 404 for yum when trying to use it. I believe "rhel7" should be replaced by "el7", and
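
Until a fixed package is published, the baseurl can be patched locally; a hedged one-liner, assuming the repo file installed by ceph-release lives at the usual path:

    sed -i 's|rpm-hammer/rhel7|rpm-hammer/el7|' /etc/yum.repos.d/ceph.repo
    yum clean metadata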

Re: [ceph-users] stuck unclean since forever

2016-06-22 Thread min fang
Thanks. Actually, I created a pool with more PGs and also hit this problem. Following is my crush map; please help point out how to change the crush ruleset. Thanks. #begin crush map tunable choose_local_tries 0 tunable choose_local_fallback_tries 0 tunable choose_total_tries 50 tunable
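
The general workflow for changing rules in a CRUSH map is to extract it, decompile, edit, recompile and inject it back; a sketch (file names are arbitrary):

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit the rule(s) in crush.txt, then:
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new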

Re: [ceph-users] Inconsistent PGs

2016-06-22 Thread Paweł Sadowski
Querying those PGs hangs forever. We ended up using *ceph-objectstore-tool* *mark-complete* on those PGs. On 06/22/2016 11:45 AM, 施柏安 wrote: > Hi, > You can use the command 'ceph pg query' to check what's going on with the > PGs which have problems, and use "ceph-objectstore-tool" to recover that PG. > >
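
For the record, mark-complete is an offline operation run against the stopped OSD's data store; a hedged sketch of the command's shape (OSD id and PG id are placeholders):

    systemctl stop ceph-osd@12
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --journal-path /var/lib/ceph/osd/ceph-12/journal \
        --pgid 3.1a7 --op mark-complete
    systemctl start ceph-osd@12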

Re: [ceph-users] stuck unclean since forever

2016-06-22 Thread Burkhard Linke
Hi, On 06/22/2016 12:10 PM, min fang wrote: Hi, I created a new ceph cluster and created a pool, but I see "stuck unclean since forever" errors happening (as follows). Can you help point out the possible reasons for this? Thanks. ceph -s cluster 602176c1-4937-45fc-a246-cc16f1066f65

Re: [ceph-users] stuck unclean since forever

2016-06-22 Thread Oliver Dzombic
Hi Min, as it's written there: too few PGs per OSD (2 < min 30) You have to raise the number of PGs. -- Mit freundlichen Gruessen / Best regards Oliver Dzombic IP-Interactive mailto:i...@ip-interactive.de Anschrift: IP Interactive UG ( haftungsbeschraenkt ) Zum Sonnenberg 1-3 63571 Gelnhausen HRB
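
A minimal sketch of raising the PG count on the affected pool (pool name and target value are illustrative; pg_num can only be increased, and pgp_num should follow it):

    ceph osd pool set rbd pg_num 128
    ceph osd pool set rbd pgp_num 128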

[ceph-users] stuck unclean since forever

2016-06-22 Thread min fang
Hi, I created a new ceph cluster and created a pool, but I see "stuck unclean since forever" errors happening (as follows). Can you help point out the possible reasons for this? Thanks. ceph -s cluster 602176c1-4937-45fc-a246-cc16f1066f65 health HEALTH_WARN 8 pgs degraded

Re: [ceph-users] Ceph deployment

2016-06-22 Thread Oliver Dzombic
Hi Fran, public_network = the network the clients use to access ceph resources cluster_network = the network ceph uses to keep the OSDs synchronizing among themselves So if you want your ceph cluster to be available on public internet addresses, you will have to assign IPs from a real public
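
In ceph.conf terms this maps to two options; a sketch assuming the 192.168.1.0/24 side is where clients connect and 10.0.0.0/24 carries OSD replication traffic:

    [global]
    public network = 192.168.1.0/24
    cluster network = 10.0.0.0/24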

Re: [ceph-users] Inconsistent PGs

2016-06-22 Thread 施柏安
Hi, You can use the command 'ceph pg query' to check what's going on with the PGs which have problems, and use "ceph-objectstore-tool" to recover that PG. 2016-06-21 19:09 GMT+08:00 Paweł Sadowski : > Already restarted those OSDs and then the whole cluster (rack by rack, > failure domain is

[ceph-users] Ceph deployment

2016-06-22 Thread Fran Barrera
Hi all, I have a couple of questions about the deployment of Ceph. This is what I plan: Private Net - 10.0.0.0/24 Public Net - 192.168.1.0/24 Ceph server: - eth1: 192.168.1.67 - eth2: 10.0.0.67 Openstack server: - eth1: 192.168.1.65 - eth2: 10.0.0.65 ceph.conf - mon_host: 10.0.0.67

Re: [ceph-users] performance issue with jewel on ubuntu xenial (kernel)

2016-06-22 Thread Yoann Moulin
Hello Florian, > On Tue, Jun 21, 2016 at 3:11 PM, Yoann Moulin wrote: >> Hello, >> >> I found a performance drop between kernel 3.13.0-88 (default kernel on Ubuntu >> Trusty 14.04) and kernel 4.4.0.24.14 (default kernel on Ubuntu Xenial 16.04) >> >> ceph version is Jewel

Re: [ceph-users] Cache Tiering with Same Cache Pool

2016-06-22 Thread Lazuardi Nasution
Hi Christian, If I have several cache pools on the same SSD OSDs (by using the same ruleset), those cache pools always show the same Max Available in the "ceph df detail" output. What should I put in target_max_bytes of the cache tiering configuration for each cache pool? Should it be the same and use the Max Available
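
target_max_bytes is a per-pool property, so it is set individually on each cache pool; pools sharing the same SSD OSDs generally should not add up to more than the usable capacity of those OSDs. An illustrative sketch (pool names and the 100 GiB value are placeholders):

    ceph osd pool set cache-pool-a target_max_bytes 107374182400
    ceph osd pool set cache-pool-b target_max_bytes 107374182400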

Re: [ceph-users] Ceph Performance vs Entry Level San Arrays

2016-06-22 Thread Oliver Dzombic
Hi Denver, it's like Christian said. On top of that, I would add that iSCSI is a more native protocol. You don't have to go through as many layers as you do -by design- with software-defined storage. So you can always expect better performance with hardware-accelerated iSCSI. If