Re: [ceph-users] osd_recovery_max_chunk value

2018-02-05 Thread Karun Josy
Hi Christian, Thank you for your help. Ceph version is 12.2.2. So is this value bad ? Do you have any suggestions ? So to reduce the max chunk, I assume I can choose something like 7 << 20, i.e. 7340032 ? Karun Josy On Tue, Feb 6, 2018 at 1:15 PM, Christian Balzer wrote: > On T
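For reference, 8 << 20 is 8388608 bytes (8 MiB, the default) and 7 << 20 is 7340032 bytes (7 MiB). A minimal sketch of applying and checking the value at runtime, assuming the Luminous injectargs and admin-socket forms (the second command has to run on the host carrying osd.0):
$ ceph tell osd.* injectargs '--osd-recovery-max-chunk 7340032'
$ ceph daemon osd.0 config get osd_recovery_max_chunk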

[ceph-users] osd_recovery_max_chunk value

2018-02-05 Thread Karun Josy
chunk Description: The maximum size of a recovered chunk of data to push. Type: 64-bit Unsigned Integer Default: 8 << 20 I am confused. Can anyone let me know what value I have to give to reduce this parameter ? Karun Josy

Re: [ceph-users] High RAM usage in OSD servers

2018-02-03 Thread Karun Josy
Can it be this bug : http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021676.html In most of the OSDs buffer anon is high }, "buffer_anon": { "items": 268443, "bytes": 1421912265 Karun Josy On Sun, Feb 4, 2018 at 7:03 AM, Kar

Re: [ceph-users] High RAM usage in OSD servers

2018-02-03 Thread Karun Josy
And can see this in error log : Feb 2 16:41:28 ceph-las1-a4-osd kernel: bstore_kv_sync: page allocation stalls for 14188ms, order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null) Karun Josy On Sun, Feb 4, 2018 at 6:19 AM, Karun Josy wrote: > Hi, > > We are

[ceph-users] High RAM usage in OSD servers

2018-02-03 Thread Karun Josy
cannot turn this node off as it will force some pgs into incomplete state. Any help would be really appreciated. Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Snapshot trimming

2018-01-30 Thread Karun Josy
Hi Jason, >> Was the base RBD pool used only for data-pool associated images Yes, it is only used for storing metadata of ecpool. We use 2 pools for erasure coding ecpool - erasure coded datapool vm - replicated pool to store metadata Karun Josy On Tue, Jan 30, 2018 at 8:00 PM,

Re: [ceph-users] lease_timeout

2018-01-29 Thread Karun Josy
gets updated correctly. Karun Josy On Tue, Jan 30, 2018 at 1:35 AM, John Spray wrote: > On Mon, Jan 29, 2018 at 6:58 PM, Gregory Farnum > wrote: > > The lease timeout means this (peon) monitor hasn't heard from the leader > > monitor in too long; its read lease on the s

Re: [ceph-users] Snapshot trimming

2018-01-29 Thread Karun Josy
version 12.2.1 and then updated to .2. Karun Josy On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy wrote: > Thank you for your response. > > We don't think there is an issue with the cluster being behind snap > trimming. We just don't think snaptrim is occurring at all. > &g

Re: [ceph-users] Snapshot trimming

2018-01-29 Thread Karun Josy
ng. But the space is not being reclaimed. All clusters are on the same hardware. Some have more disks and servers than others. The only major difference is that this particular cluster, the one with the problem, had the noscrub and nodeep-scrub flags set for many weeks. Karun Josy On Mon, Jan 29, 2018 at 6:27 PM, D

Re: [ceph-users] Snapshot trimming

2018-01-29 Thread Karun Josy
fast-diff map is not enabled for RBD images. Could that be a reason trimming is not happening ? Karun Josy On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy wrote: > Hi David, > > Thank you for your reply! I really appreciate it. > > The images are in pool id 55. It is an era

Re: [ceph-users] POOL_NEARFULL

2018-01-29 Thread Karun Josy
In the Luminous version, we have to use the osd set-ratio commands -- ceph osd set-backfillfull-ratio .89 ceph osd set-nearfull-ratio .84 ceph osd set-full-ratio .96 -- Karun Josy On Thu, Dec 21, 2017 at 4:29 PM, Konstantin Shalygin wrote: > Update your ceph.conf file > > This is
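A minimal sketch of confirming the new ratios took effect, assuming the Luminous osd dump output that lists them (the exact output format may differ slightly by release; values shown are the ones set above):
$ ceph osd dump | grep ratio
full_ratio 0.96
backfillfull_ratio 0.89
nearfull_ratio 0.84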

Re: [ceph-users] Limit deep scrub

2018-01-28 Thread Karun Josy
Yes, scrubbing is slower now, but there has been no osd flapping or slow requests! Thanks for all your help! Karun Josy On Sun, Jan 28, 2018 at 9:25 PM, David Turner wrote: > Use a get with the second syntax to see the currently running config. > > On Sun, Jan 28, 2018, 3:41 AM Karun J

Re: [ceph-users] lease_timeout

2018-01-28 Thread Karun Josy
The issue is still continuing. Has anyone else noticed it ? When this happens, the Ceph Dashboard GUI gets stuck and we have to restart the manager daemon to make it work again Karun Josy On Wed, Jan 17, 2018 at 6:16 AM, Karun Josy wrote: > Hello, > > In one of our cluster set up,

Re: [ceph-users] Limit deep scrub

2018-01-28 Thread Karun Josy
b_sleep .1 While using both, it shows (not observed, change may require restart). So is it not set ? Karun Josy On Mon, Jan 15, 2018 at 7:16 AM, shadow_lin wrote: > hi, > you can try adjusting osd_scrub_chunk_min, osd_scrub_chunk_max and > osd_scrub_sleep. > > > osd scrub sleep &g
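A minimal sketch of the check David Turner suggests in the reply (a get with the daemon syntax shows the value actually in effect), assuming the admin-socket config get/set forms and run on the host carrying osd.3; osd id and value are illustrative:
$ ceph daemon osd.3 config get osd_scrub_sleep
$ ceph daemon osd.3 config set osd_scrub_sleep 0.1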

Re: [ceph-users] Snapshot trimming

2018-01-27 Thread Karun Josy
if it is snaptrimming ? I am sorry, I feel like I am pestering you too much. But in the mailing lists I can see you have dealt with similar issues with snapshots, so I think you can help me figure this mess out. Karun Josy On Sat, Jan 27, 2018 at 7:15 PM, David Turner wrote: > Prove* a positive >

Re: [ceph-users] Snapshot trimming

2018-01-26 Thread Karun Josy
Is scrubbing and deep scrubbing necessary for Snaptrim operation to happen ? Karun Josy On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy wrote: > Thank you for your quick response! > > I used the command to fetch the snap_trimq from many pgs, however it seems > they don't

[ceph-users] Snapshot trimming

2018-01-25 Thread Karun Josy
Hi, We have set the noscrub and nodeep-scrub flags on a ceph cluster. When we delete snapshots we are not seeing any change in used space. I understand that Ceph OSDs delete data asynchronously, so deleting a snapshot doesn’t free up the disk space immediately. But we are not seeing any change

Re: [ceph-users] Full Ratio

2018-01-24 Thread Karun Josy
Thank you! Ceph version is 12.2 Also, can you let me know the format to set osd_backfill_full_ratio ? Is it " ceph osd set -backfillfull-ratio .89 " ? Karun Josy On Thu, Jan 25, 2018 at 1:29 AM, Jean-Charles Lopez wrote: > Hi, > > if you are using an olde

[ceph-users] Full Ratio

2018-01-24 Thread Karun Josy
Hi, I am trying to increase the full ratio of OSDs in a cluster. While adding a new node, one of the new disks got backfilled to more than 95% and the cluster froze. So I am trying to avoid it from happening again. Tried the pg set command but it is not working : $ ceph pg set_nearfull_ratio 0.88 Error
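In Luminous the pg set_*_ratio commands were replaced by osd-level commands, the same ones quoted elsewhere in this archive; a minimal sketch (only 0.88 comes from the question, the other two values are illustrative, and nearfull must stay below backfillfull and full):
$ ceph osd set-nearfull-ratio 0.88
$ ceph osd set-backfillfull-ratio 0.92
$ ceph osd set-full-ratio 0.95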

[ceph-users] PG inactive, peering

2018-01-22 Thread Karun Josy
Hi, We added a new host to cluster and it was rebalancing. And one PG became "inactive, peering" for very long time which created lot of slow requests and poor performance to the whole cluster. When I queried that PG, it showed this : "recovery_state": [ { "name": "Started/Pr

[ceph-users] lease_timeout

2018-01-16 Thread Karun Josy
Hello, In one of our cluster set up, there is frequent monitor elections happening. In the logs of one of the monitor, there is "lease_timeout" message before that happens. Can anyone help me to figure it out ? (When this happens, the Ceph Dashboard GUI gets stuck and we have to restart the manage

[ceph-users] Limit deep scrub

2018-01-14 Thread Karun Josy
Hello, It appears that the cluster is having many slow requests while it is scrubbing and deep scrubbing. Also, sometimes we can see osds flapping. So we have set the flags : noscrub,nodeep-scrub When we unset them, 5 PGs start to scrub. Is there a way to limit it to one at a time? # ceph daemon osd.3
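A minimal sketch of the usual throttles, assuming injectargs is acceptable here: osd_max_scrubs caps concurrent scrubs per OSD (default 1, so several PGs can still scrub if they sit on different OSDs), and osd_scrub_sleep slows each scrub down (values illustrative):
$ ceph tell osd.* injectargs '--osd-max-scrubs 1 --osd-scrub-sleep 0.1'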

[ceph-users] rbd: map failed

2018-01-09 Thread Karun Josy
rbd: map failed: (1) Operation not permitted Is it because that user has only read permission in the templates pool ? Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] How to evict a client in rbd

2018-01-03 Thread Karun Josy
It happens randomly. Karun Josy On Wed, Jan 3, 2018 at 7:07 AM, Jason Dillaman wrote: > I tried to reproduce this for over an hour today using the specified > versions w/o any success. Is this something that you can repeat > on-demand or was this a one-time occurance? > > On Sa

Re: [ceph-users] Increasing PG number

2018-01-02 Thread Karun Josy
Should I increase it gradually, or set pg_num to 512 in one step ? Karun Josy On Tue, Jan 2, 2018 at 9:26 PM, Hans van den Bogert wrote: > Please refer to standard documentation as much as possible, > > http://docs.ceph.com/docs/jewel/rados/operations/ > placement-groups/#set-t

[ceph-users] Increasing PG number

2018-01-02 Thread Karun Josy
Hi, The initial PG count was not properly planned while setting up the cluster, so now there are fewer than 50 PGs per OSD. What are the best practices to increase the PG number of a pool ? We have replicated pools as well as EC pools. Or is it better to create a new pool with a higher PG number?
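A minimal sketch of the usual two-step increase, assuming pg_num is raised first and pgp_num is raised to match once the new PGs are created (pool name and target are placeholders; pg_num can only be increased, never decreased):
$ ceph osd pool set <pool> pg_num 512
$ ceph osd pool set <pool> pgp_num 512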

Re: [ceph-users] PG active+clean+remapped status

2018-01-02 Thread Karun Josy
Hi, We added some more osds to the cluster and it was fixed. Karun Josy On Tue, Jan 2, 2018 at 6:21 AM, 한승진 wrote: > Are all odsd are same version? > I recently experienced similar situation. > > I upgraded all osds to exact same version and reset of pool configuration > like

Re: [ceph-users] Cache tiering on Erasure coded pools

2017-12-28 Thread Karun Josy
intensive than EC computing? Karun Josy On Wed, Dec 27, 2017 at 3:42 AM, David Turner wrote: > Please use the version of the docs for your installed version of ceph. > Note the Jewel in your URL and the Luminous in mine. In Luminous you no > longer need a cache tier to use EC with RBDs.

[ceph-users] Cache tiering on Erasure coded pools

2017-12-26 Thread Karun Josy
pool to store metadata and ecpool as the data pool. Is it possible to set up cache tiering since there is already a replicated pool that is being used ? Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com

Re: [ceph-users] How to evict a client in rbd

2017-12-26 Thread Karun Josy
Any help is really appreciated. Karun Josy On Sun, Dec 24, 2017 at 2:18 AM, Karun Josy wrote: > Hello, > > The image is not mapped. > > # ceph --version > ceph version 12.2.1 luminous (stable) > # uname -r > 4.14.0-1.el7.elrepo.x86_64 > > > Karun Josy &g

Re: [ceph-users] How to evict a client in rbd

2017-12-23 Thread Karun Josy
Hello, The image is not mapped. # ceph --version ceph version 12.2.1 luminous (stable) # uname -r 4.14.0-1.el7.elrepo.x86_64 Karun Josy On Sat, Dec 23, 2017 at 6:51 PM, Jason Dillaman wrote: > What Ceph and what kernel version are you using? Are you positive that > the image ha

[ceph-users] How to evict a client in rbd

2017-12-22 Thread Karun Josy
Hello, I am unable to delete this abandoned image. Rbd info shows a watcher ip. The image is not mapped. The image has no snapshots. rbd status cvm/image --id clientuser Watchers: watcher=10.255.0.17:0/3495340192 client.390908 cookie=18446462598732841114 How can I evict or blacklist a watcher cl
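A minimal sketch of evicting the watcher by blacklisting its address, assuming the Luminous-era ceph osd blacklist commands (the address is the one reported above; the entry can be removed again once the image is cleaned up):
$ ceph osd blacklist add 10.255.0.17:0/3495340192
$ ceph osd blacklist ls
$ ceph osd blacklist rm 10.255.0.17:0/3495340192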

Re: [ceph-users] Proper way of removing osds

2017-12-22 Thread Karun Josy
Thank you! Karun Josy On Thu, Dec 21, 2017 at 3:51 PM, Konstantin Shalygin wrote: > Is this the correct way to remove OSDs, or am I doing something wrong ? >> > The generic way for maintenance (e.g. disk replacement) is to rebalance by changing the osd > weight: > > > ceph os

[ceph-users] Proper way of removing osds

2017-12-21 Thread Karun Josy
Hi, This is how I remove an OSD from the cluster - Take it out: ceph osd out osdid Wait for the balancing to finish - Mark it down: ceph osd down osdid Then purge it: ceph osd purge osdid --yes-i-really-mean-it While purging, I can see there is another rebalancing occurring. I
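A compact sketch of the same sequence with the reweight-first variant Konstantin mentions in the reply, so the data only moves once (osd id illustrative); the second rebalance seen during purge happens because purge also removes the OSD from the CRUSH map:
$ ceph osd crush reweight osd.12 0
$ ceph osd out 12
(wait for the cluster to become healthy again)
$ ceph osd purge 12 --yes-i-really-mean-it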

Re: [ceph-users] POOL_NEARFULL

2017-12-19 Thread Karun Josy
restart) mon.mon-a2: injectargs:mon_osd_nearfull_ratio = '0.86' (not observed, change may require restart) mon.mon-a3: injectargs:mon_osd_nearfull_ratio = '0.86' (not observed, change may require restart) Karun Josy On Tue, Dec 19, 2017 at 10:05 PM, Jean-Charles Lopez wrote: > OK

Re: [ceph-users] POOL_NEARFULL

2017-12-19 Thread Karun Josy
No, I haven't. Interestingly, the POOL_NEARFULL flag is shown only when there is OSD_NEARFULL flag. I have recently upgraded to Luminous 12.2.2, haven't seen this flag in 12.2.1 Karun Josy On Tue, Dec 19, 2017 at 9:27 PM, Jean-Charles Lopez wrote: > Hi > > did you set qu

[ceph-users] POOL_NEARFULL

2017-12-19 Thread Karun Josy
Hello, In one of our clusters, health is showing these warnings : - OSD_NEARFULL 1 nearfull osd(s) osd.22 is near full POOL_NEARFULL 3 pool(s) nearfull pool 'templates' is nearfull pool 'cvm' is nearfull pool 'ecpool' is nearfull One osd is above 85% used, whi

Re: [ceph-users] PG active+clean+remapped status

2017-12-18 Thread Karun Josy
in the active+remapped state It's a small cluster with an unequal number of osds, and one of the OSD disks failed and I had taken it out. I have already purged it, so I cannot use the reweight option mentioned in that link. So any other workarounds ? Will adding more disks clear it ? Karun Josy

Re: [ceph-users] PG active+clean+remapped status

2017-12-17 Thread Karun Josy
Tried restarting all osds. Still no luck. Will adding a new disk to any of the servers force a rebalance and fix it? Karun Josy On Sun, Dec 17, 2017 at 12:22 PM, Cary wrote: > Karun, > > Could you paste in the output from "ceph health detail"? Which OSD > was just adde

Re: [ceph-users] Adding new host

2017-12-17 Thread Karun Josy
-alignment=false k=5 m=3 plugin=jerasure technique=reed_sol_van w=8 Karun Josy On Sun, Dec 17, 2017 at 11:26 PM, David Turner wrote: > I like to avoid adding disks from more than 1 failure domain at a time in > case some of the new disks are bad. In your example of only adding 1 new > nod

[ceph-users] Adding new host

2017-12-17 Thread Karun Josy
Hi, We have a live cluster with 8 OSD nodes, each having 5-6 disks. We would like to add a new host and expand the cluster. We have 4 pools - 3 replicated pools with replication factor 5 and 3 - 1 erasure coded pool with k=5, m=3 So my concern is, are there any precautions that are needed to
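One common precaution is to slow backfill while the new host's OSDs come in; a minimal sketch assuming injectargs (the value 1 is illustrative and can be raised back afterwards):
$ ceph tell osd.* injectargs '--osd-max-backfills 1'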

Re: [ceph-users] PG active+clean+remapped status

2017-12-16 Thread Karun Josy
Any help would be appreciated! Karun Josy On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy wrote: > Hi, > > Repair didnt fix the issue. > > In the pg dump details, I notice this None. Seems pg is missing from one > of the OSD > > [0,2,NONE,4,12,10,5,1] > [0,2,1,4,12,

Re: [ceph-users] PG active+clean+remapped status

2017-12-16 Thread Karun Josy
ing, or restarting while Ceph is repairing or > recovering. If possible, wait until the cluster is in a healthy state > first. > > Cary > -Dynamic > > On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy wrote: > > Hi Cary, > > > > No, I didnt try to repair it. >

Re: [ceph-users] PG active+clean+remapped status

2017-12-16 Thread Karun Josy
Hi Cary, No, I didn't try to repair it. I am comparatively new to ceph. Is it okay to try to repair it ? Or should I take any precautions while doing it ? Karun Josy On Sat, Dec 16, 2017 at 2:08 PM, Cary wrote: > Karun, > > Did you attempt a "ceph pg repair "? Replace wi

[ceph-users] PG active+clean+remapped status

2017-12-16 Thread Karun Josy
Hello, I added 1 disk to the cluster and after rebalancing, it shows 1 PG is in remapped state. How can I correct it ? (I had to restart some osds during the rebalancing as there were some slow requests) $ ceph pg dump | grep remapped dumped all 3.4 981 00
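A minimal sketch of where to look for the reason the PG stays remapped (pg id 3.4 taken from the dump above):
$ ceph health detail
$ ceph pg 3.4 query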

Re: [ceph-users] Health Error : Request Stuck

2017-12-13 Thread Karun Josy
ce. And it worked! To be honest, I am not exactly sure it's the correct way. P.S : I had upgraded to Luminous 12.2.2 yesterday. Karun Josy On Wed, Dec 13, 2017 at 4:31 PM, Nick Fisk wrote: > Hi Karun, > > > > I too am experiencing something very similar with a PG stuck in > activa

Re: [ceph-users] Health Error : Request Stuck

2017-12-12 Thread Karun Josy
] 2 1.1e activating+remapped [2,3,1,10,0] 2 [2,3,1,10,5] 2 2.29 activating+remapped [1,0,13] 1 [1,8,11] 1 1.6f activating+remapped [8,2,0,4,13] 8 [8,2,4,13,1] 8 1.74 activating+remapped [7,13,2,0,4] 7 [7,13,2,4,1] 7 Karun Josy

[ceph-users] Health Error : Request Stuck

2017-12-12 Thread Karun Josy
Hello, We added a new disk to the cluster and while rebalancing we are getting error warnings. = Overall status: HEALTH_ERR REQUEST_SLOW: 1824 slow requests are blocked > 32 sec REQUEST_STUCK: 1022 stuck requests are blocked > 4096 sec == The load in the servers seems to

Re: [ceph-users] HEALTH_ERR : PG_DEGRADED_FULL

2017-12-07 Thread Karun Josy
Hi Lars, Sean, Thank you for your response. The cluster health is ok now! :) Karun Josy On Thu, Dec 7, 2017 at 3:35 PM, Sean Redmond wrote: > Can you share - ceph osd tree / crushmap and `ceph health detail` via > pastebin? > > Is recovery stuck or it is on going? > > O

[ceph-users] HEALTH_ERR : PG_DEGRADED_FULL

2017-12-06 Thread Karun Josy
Hello, I am seeing health error in our production cluster. health: HEALTH_ERR 1105420/11038158 objects misplaced (10.015%) Degraded data redundancy: 2046/11038158 objects degraded (0.019%), 102 pgs unclean, 2 pgs degraded Degraded data redundancy (low space):

Re: [ceph-users] Adding multiple OSD

2017-12-04 Thread Karun Josy
1.0 894G 423G 470G 47.34 1.09 117 18 ssd 0.87320 1.0 894G 403G 490G 45.18 1.04 120 21 ssd 0.87320 1.0 894G 444G 450G 49.67 1.15 130 TOTAL 23490G 10170G 13320G 43.30 Karun Josy On Tue, Dec 5, 2017 at 4:42 AM, Karun Josy wrote: > Th

Re: [ceph-users] Adding multiple OSD

2017-12-04 Thread Karun Josy
Thank you for the detailed explanation! Got another doubt: This is the total space available in the cluster : TOTAL 23490G Use 10170G Avail : 13320G But ecpool shows max avail as just 3 TB. Karun Josy On Tue, Dec 5, 2017 at 1:06 AM, David Turner wrote: > No, I would only add disks t

Re: [ceph-users] Adding multiple OSD

2017-12-04 Thread Karun Josy
nodes, with 3 disks each. We are planning to add 2 more on each node. If I understand correctly, then I can add 3 disks at once, right, assuming 3 disks can fail at a time as per the ec code profile. Karun Josy On Tue, Dec 5, 2017 at 12:06 AM, David Turner wrote: > Depending on how well you b

[ceph-users] Adding multiple OSD

2017-12-04 Thread Karun Josy
Hi, Is it recommended to add OSD disks one by one, or can I add a couple of disks at a time ? The current cluster size is about 4 TB. Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] OSD down ( rocksdb: submit_transaction error: Corruption: block checksum mismatch)

2017-11-23 Thread Karun Josy
Hi, One OSD in the cluster is down. Tried to restart the service, but its still failing. I can see the below error in log file. Can this be a hardware issue ? - -9> 2017-11-23 09:47:37.768969 7f368686a700 3 rocksdb: [/home/jenkins-build/build/workspace/ceph

[ceph-users] Admin server

2017-11-23 Thread Karun Josy
Hi, Just a not so significant doubt :) We have a cluster with 1 admin server and 3 monitors and 8 OSD nodes. Admin server is used to deploy the cluster. What if the admin server permanently fails? Will it affect the cluster ? Karun ___ ceph-users mai

Re: [ceph-users] How to set osd_max_backfills in Luminous

2017-11-22 Thread Karun Josy
Thanks! Karun Josy On Wed, Nov 22, 2017 at 5:44 AM, Jean-Charles Lopez wrote: > Hi, > > to check a current value use the following command on the machine where > the OSD you want to check is running > > ceph daemon osd.{id} config show | grep {parameter} > Or > cep

[ceph-users] How to set osd_max_backfills in Luminous

2017-11-21 Thread Karun Josy
Hello, We added a couple of OSDs to the cluster and the recovery is taking much time. So I tried to increase the osd_max_backfills value dynamically. But it says the change may need a restart. $ ceph tell osd.* injectargs '--osd-max-backfills 5' osd.0: osd_max_backfills = '5' osd_objectstore = 'b
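A minimal sketch of checking the value actually in effect despite the "(not observed)" message, assuming the admin-socket form JC quotes in the reply (run on the host carrying the OSD; osd id illustrative):
$ ceph daemon osd.0 config show | grep osd_max_backfills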

Re: [ceph-users] Reuse pool id

2017-11-15 Thread Karun Josy
Any suggestions ? Karun Josy On Mon, Nov 13, 2017 at 10:06 PM, Karun Josy wrote: > Hi, > > Is there anyway we can change or reuse pool id ? > I had created and deleted lot of test pools. So the IDs kind of look like > this now: > > --- > $ ceph osd lspools >

Re: [ceph-users] Incorrect pool usage statistics

2017-11-14 Thread Karun Josy
29848 43 149240 0 00 66138389 2955G 1123900 539G Karun Josy On Tue, Nov 14, 2017 at 4:16 AM, Karun Josy wrote: > Hello, > > Recently, I deleted all the disks from an erasure pool 'ecpool'. > The pool is empty. However the space usage show

[ceph-users] Incorrect pool usage statistics

2017-11-13 Thread Karun Josy
imagepool 34 90430M 0.62 4684G 22789 Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] Reuse pool id

2017-11-13 Thread Karun Josy
Hi, Is there any way we can change or reuse a pool id ? I had created and deleted a lot of test pools. So the IDs kind of look like this now: --- $ ceph osd lspools 34 imagepool,37 cvmpool,40 testecpool,41 ecpool1, -- Can I change them to 0,1,2,3 etc ? Karun ___

[ceph-users] Disconnect a client Hypervisor

2017-11-08 Thread Karun Josy
Hi, Do you think there is a way for ceph to disconnect an HV client from a cluster? We want to prevent the possibility that two hvs are running the same vm. When a hv crashes, we have to make sure that when the vms are started on a new hv, the disk is not open in the crashed hv. I can see

[ceph-users] OSD daemons active in nodes after removal

2017-10-25 Thread Karun Josy
Hello everyone! :) I have an interesting problem. For a few weeks, we've been testing Luminous in a cluster made up of 8 servers and with about 20 SSD disks almost evenly distributed. It is running erasure coding. Yesterday, we decided to bring the cluster to a minimum of 8 servers and 1 disk per

Re: [ceph-users] Erasure code profile

2017-10-24 Thread Karun Josy
h.com/docs/master/rados/operations/erasure-code-profile/ Is there a better article that I can refer to ? Karun Josy On Tue, Oct 24, 2017 at 1:23 AM, David Turner wrote: > This can be changed to a failure domain of OSD in which case it could > satisfy the criteria. The problem wit

Re: [ceph-users] Erasure code profile

2017-10-23 Thread Karun Josy
Thank you for the reply. There are 8 OSD nodes with 23 OSDs in total. (However, they are not distributed equally on all nodes) So it satisfies that criteria, right? Karun Josy On Tue, Oct 24, 2017 at 12:30 AM, LOPEZ Jean-Charles wrote: > Hi, > > yes you need as many OSDs that k+m

[ceph-users] Erasure code profile

2017-10-23 Thread Karun Josy
Hi, While creating a pool with erasure code profile k=10, m=4, I get PG status as "200 creating+incomplete" While creating a pool with profile k=5, m=3 it works fine. The cluster has 8 OSD nodes with 23 disks in total. Are there any requirements for setting the first profile ? Karun _
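With 8 hosts and the default host failure domain, the k+m = 14 chunks cannot all be placed, which matches the replies in this thread. A minimal sketch of a profile that drops the failure domain to osd instead (profile name illustrative; the trade-off is that several chunks may then land on one host):
$ ceph osd erasure-code-profile set k10m4-osd k=10 m=4 crush-failure-domain=osd
$ ceph osd erasure-code-profile get k10m4-osd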

Re: [ceph-users] Not able to start OSD

2017-10-20 Thread Josy
, Brad Hubbard wrote: On Fri, Oct 20, 2017 at 6:32 AM, Josy wrote: Hi, have you checked the output of "ceph-disk list” on the nodes where the OSDs are not coming back on? Yes, it shows all the disk correctly mounted. And finally inspect /var/log/ceph/ceph-osd.${id}.log to see messages prod

Re: [ceph-users] Not able to start OSD

2017-10-19 Thread Josy
“ceph-disk list” on the nodes where the OSDs are not coming back on? This should give you a hint on what's going on. Also use dmesg to search for any error message And finally inspect /var/log/ceph/ceph-osd.${id}.log to see messages produced by the OSD itself when it starts. Regards JC On Oct

[ceph-users] Not able to start OSD

2017-10-19 Thread Josy
Hi, I am not able to start some of the OSDs in the cluster. This is a test cluster and it had 8 OSDs. One node was taken out for maintenance. I set the noout flag, and after the server came back up I unset the noout flag. Suddenly a couple of OSDs went down. And now I can start the OSDs manually

Re: [ceph-users] Erasure code settings

2017-10-19 Thread Josy
Please ignore. I found the mistake. On 19-10-2017 21:08, Josy wrote: Hi, I created a testprofile, but not able to create a pool using it == $ ceph osd erasure-code-profile get testprofile1 crush-device-class= crush-failure-domain=host crush-root=default jerasure-per-chunk-alignment

Re: [ceph-users] Erasure code settings

2017-10-19 Thread Josy
create ecpool 100 100 testprofile1 Error ENOENT: specified rule testprofile1 doesn't exist On 19-10-2017 19:54, Josy wrote: Hi, I would like to set up an erasure code profile with k=10 amd m=4 settings. Is there any minimum requirement of OSD nodes and OSDs to achieve this setting ?
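The ENOENT suggests the profile name was parsed as a crush rule name; a hedged guess at the intended form, assuming the pool type keyword was simply missing:
$ ceph osd pool create ecpool 100 100 erasure testprofile1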

[ceph-users] Erasure code settings

2017-10-19 Thread Josy
Hi, I would like to set up an erasure code profile with k=10 and m=4 settings. Is there any minimum requirement of OSD nodes and OSDs to achieve this setting ? Can I create a pool with 8 OSD servers, with one disk each in it ? ___ ceph-users mail

Re: [ceph-users] To check RBD cache enabled

2017-10-17 Thread Josy
. 'ls /var/run/ceph/' lists no files in the client server On 18-10-2017 04:07, Jason Dillaman wrote: On Tue, Oct 17, 2017 at 6:30 PM, Josy wrote: Hi, I am running the command from the admin server. Because there are no asok files in the client server ls /var/run/ceph/ lists no

Re: [ceph-users] To check RBD cache enabled

2017-10-17 Thread Josy
f file to take effect. You mean restart the client server ? (I am sorry, this is something new for me. I have just started learning ceph.) On 18-10-2017 03:32, Jean-Charles Lopez wrote: Hi Josy, just a doubt but it looks like your ASOK file is the one from a Ceph Manager. So my suspicion i

Re: [ceph-users] To check RBD cache enabled

2017-10-17 Thread Josy
cal Instructor, Global Storage Consulting Practice Red Hat, Inc. jelo...@redhat.com +1 408-680-6959 On Oct 17, 2017, at 12:50, Josy wrote: Hi, I am following this article : http://ceph.com/geen-categorie/ceph-v

[ceph-users] To check RBD cache enabled

2017-10-17 Thread Josy
Hi, I am following this article : http://ceph.com/geen-categorie/ceph-validate-that-the-rbd-cache-is-active/ I have enabled this flag in ceph.conf [client] admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok log file = /var/log/ceph/ But the command to show the conf is not
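A minimal sketch of the check the article describes, run on the client host after the librbd client has been restarted so the socket exists (the .asok filename below is illustrative; take the real one from the directory listing):
$ ls /var/run/ceph/
$ ceph --admin-daemon /var/run/ceph/ceph-client.admin.1234.5678.asok config show | grep rbd_cache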

Re: [ceph-users] Erasure coding with RBD

2017-10-12 Thread Josy
your image you will see how your replicated metadata pool grows a little bit when the actual Erasure data pool grows way more! You can get information about your image using rbd info {replicated pool}/{image name} Original message ---- From: Josy Date: 12/10/17 8:40 PM (GMT+0

Re: [ceph-users] Erasure coding with RBD

2017-10-12 Thread Josy
about your image using rbd info {replicated pool}/{image name} Original message ---- From: Josy Date: 12/10/17 8:40 PM (GMT+01:00) To: David Turner , dilla...@redhat.com Cc: ceph-users Subject: Re: [ceph-users] Erasure coding with RBD Thank you for your reply. I created an erasure

Re: [ceph-users] Erasure coding with RBD

2017-10-12 Thread Josy
only the data blocks can be stored in an EC pool. Therefore, when creating the image, you should provide the "--data-pool " option to specify the EC pool name. On Thu, Oct 12, 2017 at 2:06 PM, Josy wrote: > Hi, >

[ceph-users] Erasure coding with RBD

2017-10-12 Thread Josy
Hi, I am trying to set up an erasure coded pool with an rbd image. The ceph version is Luminous 12.2.1, and I understand that since Luminous, RBD and Cephfs can store their data in an erasure coded pool without the use of cache tiering. I created a pool ecpool and when trying to create an rbd image, gets
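A minimal sketch of the Luminous-era form described in the replies: overwrites are enabled on the EC pool, the image metadata lives in a replicated pool, and --data-pool points at the EC pool (image name and size are illustrative; pool names are ones used elsewhere in this archive):
$ ceph osd pool set ecpool allow_ec_overwrites true
$ rbd create --size 10240 --data-pool ecpool vm/testimage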

[ceph-users] MGR Dahhsboard hostname missing

2017-10-12 Thread Josy
Hello, After taking down a couple of OSDs, the dashboard is not showing the corresponding hostname. It shows correctly in the ceph osd tree output -- -15    3.49280 host ceph-las1-a7-osd  21   ssd  0.87320 osd.21    up  1.0 1.0  22   ssd  0.87320

Re: [ceph-users] Snapshot space

2017-10-09 Thread Josy
e image in your response. The clone is the image chained from a parent image snapshot. Is "e01f31e94a65cf7e786972b915e07364-1@d-1" the clone's parent? On Mon, Oct 9, 2017 at 5:14 PM, Josy wrote: Thank you for your response! If the cloned VM had written around 10Gbs of data, wouldn'

Re: [ceph-users] Snapshot space

2017-10-09 Thread Josy
st snapshot. If the "fast-diff" feature is enabled, note that it only calculates usage in object size chunks (defaults to 4MB) -- which means that even writing 1 byte to a 4MB object would flag the object as dirty (it's an upper-bound estimate). On Sun, Oct 8, 2017 at 12:01 PM, Josy

[ceph-users] Snapshot space

2017-10-08 Thread Josy
Hello, I noticed that when we create a snapshot of a clone, the first snapshot seems to be quite large. For example: Clone VM is taking up 136MBs according to rbd du First snapshot: 10GBs Second snapshot: 104MBs Third snapshot: 57MBs The clone is a Windows virtual machine, which does take ar

[ceph-users] Real disk usage of clone images

2017-10-07 Thread Josy
Hi, Not sure if this is a good/valid question. I have deployed a lot of VMs in a ceph cluster that were cloned from an original rbd image. I want to see how much of the original image a new VM (cloned image) is using. Is there any command to get such details? == $ rbd info
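A minimal sketch using the tools mentioned in this thread: rbd info shows the clone's parent, and rbd du reports the space the clone itself has allocated on top of that parent (pool and image names illustrative):
$ rbd info cvm/cloned-vm
$ rbd du cvm/cloned-vm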

Re: [ceph-users] bad crc/signature errors

2017-10-05 Thread Josy
m Cc: ceph-users ; Josy Subject: Re: [ceph-users] bad crc/signature errors Perhaps this is related to a known issue on some 4.4 and later kernels [1] where the stable write flag was not preserved by the kernel? [1] http://tracker.ceph.com/issues/19275 The stable pages bug manifests as multiple spo

[ceph-users] bad crc/signature errors

2017-10-04 Thread Josy
Hi, We have set up a cluster with 8 OSD servers (31 disks) Ceph health is Ok. -- [root@las1-1-44 ~]# ceph -s   cluster:     id: de296604-d85c-46ab-a3af-add3367f0e6d     health: HEALTH_OK   services:     mon: 3 daemons, quorum ceph-las-mon-a1,ceph-las-mon-a2,ceph-las-mon-a3     mg