Re: [ceph-users] Bluestore + erasure coding memory usage

2016-11-04 Thread bobobo1...@gmail.com
> Then you can view the output data with ms_print or with massif-visualizer. > This may help narrow down where in the code we are using the memory. Done! I've dumped the output from ms_print here: http://ix.io/1CrS It seems most of the memory comes from here: 92.78% (998,248,799B) (heap alloca

Re: [ceph-users] VM disk operation blocked during OSDs failures

2016-11-04 Thread Christian Wuerdig
What are your pool size and min_size settings? An object with less than min_size replicas will not receive I/O ( http://docs.ceph.com/docs/jewel/rados/operations/pools/#set-the-number-of-object-replicas). So if size=2 and min_size=1 then an OSD failure means blocked operations to all objects locate
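
As a rough illustration of the min_size rule stated in the preview above, the following Python sketch models whether an object would still receive I/O. The function name `accepts_io` and the sample pool values are hypothetical, and the model ignores peering and recovery. It is not Ceph code.

```python
def accepts_io(available_replicas: int, min_size: int) -> bool:
    """Return True if I/O would be served under the min_size rule.

    Simplified model (an assumption, not Ceph's implementation): I/O is
    allowed only while the number of available replicas is at least min_size.
    """
    if min_size < 1:
        raise ValueError("min_size must be at least 1")
    return available_replicas >= min_size


# Hypothetical pool with size=3 and min_size=2.
print(accepts_io(3, 2))  # all replicas available -> True
print(accepts_io(2, 2))  # one replica lost -> True
print(accepts_io(1, 2))  # two replicas lost -> False
```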

[ceph-users] Configuring Ceph RadosGW with SLA based rados pools

2016-11-04 Thread Andrey Ptashnik
Hello Ceph team! I’m trying to create different pools in Ceph in order to have different tiers (some are fast, small and expensive and others are plain big and cheap), so certain users will be tied to one pool or another. - I created two additional pools. .rgw.factor-2.buckets.data .rgw.fact

[ceph-users] VM disk operation blocked during OSDs failures

2016-11-04 Thread fcid
Dear Ceph community, I'm working on a small Ceph deployment for testing purposes, in which I want to test the high availability features of Ceph and how clients are affected during outages in the cluster. This small cluster is deployed using 3 servers on which are running 2 OSDs and 1 monito

[ceph-users] Graceful shutdown issue

2016-11-04 Thread Brendan Moloney
Hi, I have my monitors running on three out of my eight OSD servers (I know the docs recommend separate hardware but I don't think that makes sense for a small/dense cluster like this). I noticed that when I reboot the server that hosts the primary monitor, the OSDs on that server are not grace

Re: [ceph-users] Adjust PG PGP placement groups on the fly

2016-11-04 Thread Andrey Ptashnik
Thank you everyone for your feedback! Regards, Andrey Ptashnik From: David Turner <david.tur...@storagecraft.com> Date: Friday, November 4, 2016 at 12:13 PM To: Andrey Ptashnik <aptash...@cccis.com>, Vasu Kulkarni <vakul...@redhat.com> Cc: "ceph-users@lists.ceph.com

Re: [ceph-users] Adjust PG PGP placement groups on the fly

2016-11-04 Thread David Turner
You are correct, you do not need to wait for the rebalance to finish before continuing. There is a lot of information about increasing your pg_num on a production system. The general consensus is to not increase it by more than 256 at a time. That is to prevent too much peering from happening at
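
The "no more than 256 at a time" guideline from the preview above can be sketched as a small planning helper. This is a minimal Python sketch under stated assumptions: the function names `pg_num_steps` and `pool_commands` and the pool name `mypool` are hypothetical. It only prints `ceph osd pool set` command lines and does not run anything against a cluster.

```python
from typing import List


def pg_num_steps(current: int, target: int, max_step: int = 256) -> List[int]:
    """Return intermediate pg_num values from current up to target.

    Each step raises pg_num by at most max_step (256 is the guideline
    quoted in the thread, not a hard limit).
    """
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    if target < current:
        raise ValueError("target must not be smaller than current")
    steps = []
    value = current
    while value < target:
        value = min(value + max_step, target)
        steps.append(value)
    return steps


def pool_commands(pool: str, steps: List[int]) -> List[str]:
    """Build the pg_num / pgp_num command pair for every step."""
    commands = []
    for n in steps:
        commands.append(f"ceph osd pool set {pool} pg_num {n}")
        commands.append(f"ceph osd pool set {pool} pgp_num {n}")
    return commands


# Hypothetical pool growing from 1024 to 2048 placement groups.
for line in pool_commands("mypool", pg_num_steps(1024, 2048)):
    print(line)
```

Generating the commands instead of executing them leaves room to check cluster health between steps.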

Re: [ceph-users] Adjust PG PGP placement groups on the fly

2016-11-04 Thread Andrey Ptashnik
Hi Vasu, Thank you for your input! I was very hesitant to change those on a live system. As I understand it, I don’t need to wait for the cluster to rebalance between the PG and PGP commands, right? Regards, Andrey Ptashnik From: Vasu Kulkarni <vakul...@redhat.com> Date: Friday, November 4, 2

Re: [ceph-users] Adjust PG PGP placement groups on the fly

2016-11-04 Thread Vasu Kulkarni
from the docs (also important to read what pgp_num does): http://docs.ceph.com/docs/jewel/rados/operations/placement-groups/ To set the number of placement groups in a pool, you must specify the number of placement groups at the time you create the pool. See Create a Pool for details. Once you’ve

[ceph-users] Adjust PG PGP placement groups on the fly

2016-11-04 Thread Andrey Ptashnik
Hello Ceph team, Is it possible to increase the number of placement groups on a live system without any issues or data loss? If so, what is the correct sequence of steps? Regards, Andrey Ptashnik

[ceph-users] Replication strategy, write throughput

2016-11-04 Thread Andreas Gerstmayr
Hello, I'd like to understand how replication works. In the paper [1] several replication strategies are described, and according to a (bit old) mailing list post [2] primary-copy is used. Therefore the primary OSD waits until the object is persisted and then updates all replicas in parallel. Cur

Re: [ceph-users] MDS Problems

2016-11-04 Thread Patrick Donnelly
Hello Nick, On Fri, Nov 4, 2016 at 9:54 AM, Nick Fisk wrote: > I upgraded to 10.2.3 today and after restarting the MDS, the same or very > similar problem occurred. I didn't see any of the symlink > errors, so I think that was fixed in the upgrade but I was still seeing > looping and crashing u

Re: [ceph-users] MDS Problems

2016-11-04 Thread Nick Fisk
Hi John, thanks for your response > -Original Message- > From: John Spray [mailto:jsp...@redhat.com] > Sent: 04 November 2016 14:26 > To: n...@fisk.me.uk > Cc: Ceph Users > Subject: Re: [ceph-users] MDS Problems > > On Fri, Nov 4, 2016 at 2:54 PM, Nick Fisk wrote: > > I upgraded to 10.2

Re: [ceph-users] suddenly high memory usage for ceph-mon process

2016-11-04 Thread igor.podo...@ts.fujitsu.com
Maybe you hit this: https://github.com/ceph/ceph/pull/10238, which is still waiting to be merged. This will occur only if you have a ceph-mds process in your cluster but it's not configured (you do not need to be using MDS; the process could be running on just one node). Check your monitor logs for something like: "

Re: [ceph-users] MDS Problems

2016-11-04 Thread John Spray
On Fri, Nov 4, 2016 at 2:54 PM, Nick Fisk wrote: > I upgraded to 10.2.3 today and after restarting the MDS, the same or very > similar problem occurred. I didn't see any of the symlink > errors, so I think that was fixed in the upgrade but I was still seeing > looping and crashing until I killed

Re: [ceph-users] suddenly high memory usage for ceph-mon process

2016-11-04 Thread David Turner
We have half a dozen clusters of varying sizes and all of them have high memory usage on the mons every 1-3 months. I've thought about opening a ticket with Ceph Enterprise support or bringing it up here, but there's no way for us to really get logs on it because we can't run with high logging f

[ceph-users] MDS Problems

2016-11-04 Thread Nick Fisk
I upgraded to 10.2.3 today and after restarting the MDS, the same or very similar problem occurred. I didn't see any of the symlink errors, so I think that was fixed in the upgrade but I was still seeing looping and crashing until I killed all clients and evicted sessions. Just after restarting

Re: [ceph-users] Multi-tenancy and sharing CephFS data pools with other RADOS users

2016-11-04 Thread John Spray
On Wed, Nov 2, 2016 at 9:21 PM, Dan Jakubiec wrote: > We currently have one master RADOS pool in our cluster that is shared among > many applications. All objects stored in the pool are currently stored using > specific namespaces -- nothing is stored in the default namespace. > > We would like

Re: [ceph-users] CephFS in existing pool namespace

2016-11-04 Thread John Spray
On Wed, Nov 2, 2016 at 9:10 PM, Dan Jakubiec wrote: > Hi John, > > How does one configure namespaces for file/dir layouts? I'm looking here, > but am not seeing any mentions of namespaces: > > http://docs.ceph.com/docs/jewel/cephfs/file-layouts/ The field is called "pool_namespace" (set it the

[ceph-users] suddenly high memory usage for ceph-mon process

2016-11-04 Thread mj
Hi, Running Ceph 0.94.9 on jessie (Proxmox), three hosts, 4 OSDs per host, SSD journal, 10G cluster network. Hosts have 65G RAM. The cluster is generally not very busy. Suddenly we were getting HEALTH_WARN today, with two OSDs (both on the same server) being slow. Looking into this, we notic

Re: [ceph-users] Monitor troubles

2016-11-04 Thread Joao Eduardo Luis
On 11/04/2016 01:39 AM, Tracy Reed wrote: > After a lot of messing about I have manually created a monmap and got the two new monitors working for a total of three. But to do that I had to delete the first monitor, which for some reason was coming up with a bogus fsid after I manipulated the monmap wh

[ceph-users] nfs-ganesha and rados gateway, Cannot find supported RGW runtime. Disabling RGW fsal build

2016-11-04 Thread 于 姜
ceph version 10.2.3 ubuntu 14.04 server nfs-ganesha 2.4.1 ntirpc 1.4.3 cmake -DUSE_FSAL_RGW=ON ../src/ -- Found rgw libraries: /usr/lib -- Could NOT find RGW: Found unsuitable version ".", but required is at least "1.1" (found /usr) CMake Warning at CMakeLists.txt:571 (message): Cannot find supp