Re: [ceph-users] osd_recovery_max_chunk value
Hi Christian, Thank you for your help. Ceph version is 12.2.2. So is this value bad? Do you have any suggestions? So to reduce the max chunk, I assume I can choose something like 7 << 20, i.e. 7340032? Karun Josy On Tue, Feb 6, 2018 at 1:15 PM, Christian Balzer wrote: > On Tue, 6 Feb 2018 13:01:12 +0530 Karun Josy wrote: > > > Hello, > > > > We are seeing slow requests while the recovery process is going on. > > > > I am trying to slow down the recovery process. I set > osd_recovery_max_active > > and osd_recovery_sleep as below: > > -- > > ceph tell osd.* injectargs '--osd_recovery_max_active 1' > > ceph tell osd.* injectargs '--osd_recovery_sleep .1' > > -- > What version of Ceph? In some versions "sleep" values will make things _worse_! > Would be nice if that was documented in like, the documentation... > > > > > But I am confused with the osd_recovery_max_chunk. Currently, it shows > > 8388608. > > > > # ceph daemon osd.4 config get osd_recovery_max_chunk > > { > > "osd_recovery_max_chunk": "8388608" > > > > > > In the Ceph documentation, it shows > > > > --- > > osd recovery max chunk > > Description: The maximum size of a recovered chunk of data to push. > > Type: 64-bit Unsigned Integer > > Default: 8 << 20 > > > > > > I am confused. Can anyone let me know what value I have to give > > to reduce this parameter? > > > This is what you get when programmers write docs. > > The above is a left-shift operation, see for example: > http://bit-calculator.com/bit-shift-calculator > > Now if shrinking that value is beneficial for reducing recovery load, > that's for you to find out. > > Christian > > > > > > > Karun Josy > > > -- > Christian Balzer    Network/Systems Engineer > ch...@gol.com    Rakuten Communications > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
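For reference, the "8 << 20" default is a plain left shift: 8 * 2^20 = 8388608 bytes, i.e. 8 MiB. The arithmetic can be checked in any shell, and a smaller value can be injected the same way as the other recovery settings in this thread; the 7 MiB figure below is only the value proposed above, not a tested recommendation:
--
$ echo $((8 << 20))   # the documented default, 8 MiB in bytes
8388608
$ echo $((7 << 20))   # the proposed smaller chunk, 7 MiB
7340032
$ ceph tell osd.* injectargs '--osd_recovery_max_chunk 7340032'
--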
[ceph-users] osd_recovery_max_chunk value
Hello, We are seeing slow requests while the recovery process is going on. I am trying to slow down the recovery process. I set osd_recovery_max_active and osd_recovery_sleep as below: -- ceph tell osd.* injectargs '--osd_recovery_max_active 1' ceph tell osd.* injectargs '--osd_recovery_sleep .1' -- But I am confused with the osd_recovery_max_chunk. Currently, it shows 8388608. # ceph daemon osd.4 config get osd_recovery_max_chunk { "osd_recovery_max_chunk": "8388608" In the Ceph documentation, it shows --- osd recovery max chunk Description: The maximum size of a recovered chunk of data to push. Type: 64-bit Unsigned Integer Default: 8 << 20 I am confused. Can anyone let me know what value I have to give to reduce this parameter? Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] High RAM usage in OSD servers
Can it be this bug: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021676.html In most of the OSDs, buffer_anon is high: }, "buffer_anon": { "items": 268443, "bytes": 1421912265 Karun Josy On Sun, Feb 4, 2018 at 7:03 AM, Karun Josy wrote: > And I can see this in the error log: > > Feb 2 16:41:28 ceph-las1-a4-osd kernel: bstore_kv_sync: page allocation > stalls for 14188ms, order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), > nodemask=(null) > > > Karun Josy > > On Sun, Feb 4, 2018 at 6:19 AM, Karun Josy wrote: > >> Hi, >> >> We are using an EC profile in our cluster. >> We are seeing very high RAM usage in 1 OSD server. >> Sometimes free memory goes too low and the server hangs. We have to restart the >> daemons, which frees up the memory, but in a very short time it gets used up again. >> >> Memory usage of daemons from the problem server >> - >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ >> COMMAND >> >> 16918 ceph 20 0 15.780g 0.013t 7928 S 28.2 21.9 67:29.09 >> ceph-osd >> 18568 ceph 20 0 25.833g 0.023t 26096 S 24.9 36.8 9:15.58 >> ceph-osd >> 22630 ceph 20 0 12.520g 0.011t 26660 S 22.3 18.3 5:49.03 >> ceph-osd >> 2796 ceph 20 0 11.091g 9.851g 8900 S 13.6 15.7 25:17.68 >> ceph-osd >> >> >> Memory usage from another server: >> --- >> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ >> COMMAND >> 11649 ceph 20 0 12.788g 9.563g 25068 S 107.6 7.6 12285:54 >> ceph-osd >> 18295 ceph 20 0 11.028g 6.069g 26212 S 54.0 4.8 2122:18 >> ceph-osd >> 30974 ceph 20 0 13.860g 0.010t 24956 S 46.4 8.1 10984:47 >> ceph-osd >> >> >> We are using EC profile 5/3. And there are 2 failed disks in the cluster >> on other nodes (I have marked them down, but not out), so I cannot turn >> this node off as it will force some PGs into an incomplete state. >> >> Any help would be really appreciated. >> >> >> Karun Josy >> > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
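For anyone debugging similar symptoms: the mempool figures quoted above come from the OSD admin socket, and on BlueStore the cache can be capped. The 1 GiB value below is purely illustrative, not a sizing recommendation, and the injected change may require an OSD restart to fully take effect:
--
# dump per-OSD memory pool accounting (buffer_anon, bluestore_cache_*, etc.)
ceph daemon osd.4 dump_mempools
# cap the BlueStore cache; value is in bytes
ceph tell osd.* injectargs '--bluestore_cache_size 1073741824'
--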
Re: [ceph-users] High RAM usage in OSD servers
And I can see this in the error log: Feb 2 16:41:28 ceph-las1-a4-osd kernel: bstore_kv_sync: page allocation stalls for 14188ms, order:0, mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null) Karun Josy On Sun, Feb 4, 2018 at 6:19 AM, Karun Josy wrote: > Hi, > > We are using an EC profile in our cluster. > We are seeing very high RAM usage in 1 OSD server. > Sometimes free memory goes too low and the server hangs. We have to restart the daemons, > which frees up the memory, but in a very short time it gets used up again. > > Memory usage of daemons from the problem server > - > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > > 16918 ceph 20 0 15.780g 0.013t 7928 S 28.2 21.9 67:29.09 > ceph-osd > 18568 ceph 20 0 25.833g 0.023t 26096 S 24.9 36.8 9:15.58 > ceph-osd > 22630 ceph 20 0 12.520g 0.011t 26660 S 22.3 18.3 5:49.03 > ceph-osd > 2796 ceph 20 0 11.091g 9.851g 8900 S 13.6 15.7 25:17.68 > ceph-osd > > > Memory usage from another server: > --- > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND > 11649 ceph 20 0 12.788g 9.563g 25068 S 107.6 7.6 12285:54 > ceph-osd > 18295 ceph 20 0 11.028g 6.069g 26212 S 54.0 4.8 2122:18 > ceph-osd > 30974 ceph 20 0 13.860g 0.010t 24956 S 46.4 8.1 10984:47 > ceph-osd > > > We are using EC profile 5/3. And there are 2 failed disks in the cluster > on other nodes (I have marked them down, but not out), so I cannot turn > this node off as it will force some PGs into an incomplete state. > > Any help would be really appreciated. > > Karun Josy > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] High RAM usage in OSD servers
Hi, We are using an EC profile in our cluster. We are seeing very high RAM usage in 1 OSD server. Sometimes free memory goes too low and the server hangs. We have to restart the daemons, which frees up the memory, but in a very short time it gets used up again. Memory usage of daemons from the problem server - PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 16918 ceph 20 0 15.780g 0.013t 7928 S 28.2 21.9 67:29.09 ceph-osd 18568 ceph 20 0 25.833g 0.023t 26096 S 24.9 36.8 9:15.58 ceph-osd 22630 ceph 20 0 12.520g 0.011t 26660 S 22.3 18.3 5:49.03 ceph-osd 2796 ceph 20 0 11.091g 9.851g 8900 S 13.6 15.7 25:17.68 ceph-osd Memory usage from another server: --- PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 11649 ceph 20 0 12.788g 9.563g 25068 S 107.6 7.6 12285:54 ceph-osd 18295 ceph 20 0 11.028g 6.069g 26212 S 54.0 4.8 2122:18 ceph-osd 30974 ceph 20 0 13.860g 0.010t 24956 S 46.4 8.1 10984:47 ceph-osd We are using EC profile 5/3. And there are 2 failed disks in the cluster on other nodes (I have marked them down, but not out), so I cannot turn this node off as it will force some PGs into an incomplete state. Any help would be really appreciated. Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Snapshot trimming
Hi Jason, >> Was the base RBD pool used only for data-pool associated images Yes, it is only used for storing the metadata of ecpool. We use 2 pools for erasure coding: ecpool (the erasure-coded data pool) and vm (a replicated pool to store metadata). Karun Josy On Tue, Jan 30, 2018 at 8:00 PM, Jason Dillaman wrote: > Unfortunately, any snapshots created prior to 12.2.2 against a separate > data pool were incorrectly associated to the base image pool instead of the > data pool. Was the base RBD pool used only for data-pool associated images > (i.e. all the snapshots that exist within the pool can be safely deleted)? > > On Mon, Jan 29, 2018 at 11:50 AM, Karun Josy wrote: > >> >> The problem we are experiencing is described here: >> >> https://bugzilla.redhat.com/show_bug.cgi?id=1497332 >> >> However, we are running 12.2.2. >> >> Across our 6 ceph clusters, this one with the problem was first version >> 12.2.0, then upgraded to .1 and then to .2. >> >> The other 5 ceph installations started as version 12.2.1 and then updated >> to .2. >> >> Karun Josy >> >> On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy wrote: >> >>> Thank you for your response. >>> >>> We don't think there is an issue with the cluster being behind snap >>> trimming. We just don't think snaptrim is occurring at all. >>> >>> We have 6 individual ceph clusters. When we delete old snapshots for >>> clients, we can see space being made available. In this particular one >>> however, with 300 virtual machines, 28TBs of data (this is our largest >>> ceph), I can delete hundreds of snapshots, and not a single gigabyte >>> becomes available after doing that. >>> >>> In our other 5, smaller Ceph clusters, we can see hundreds of gigabytes >>> becoming available again after doing massive deletions of snapshots. >>> >>> The Luminous GUI also never shows "snaptrimming" occurring in the EC >>> pool. While the other 5 Luminous clusters, their GUI will show >>> snaptrimming occurring for the EC pool. Within minutes we can see the >>> additional space becoming available. >>> >>> This isn't an issue of the trimming queue being behind schedule. The system >>> shows there is no trimming scheduled in the queue, ever. >>> >>> However, when using ceph du on particular virtual machines, we can see >>> that snapshots we delete are indeed no longer listed in ceph du's output. >>> >>> So, they seem to be deleting. But the space is not being reclaimed. >>> >>> All clusters are same hardware. Some have more disks and servers than >>> others. The only major difference is that this particular Ceph with this >>> problem, it had the noscrub and nodeep-scrub flags set for many weeks. >>> >>> >>> Karun Josy >>> >>> On Mon, Jan 29, 2018 at 6:27 PM, David Turner >>> wrote: >>> >>>> I don't know why you keep asking the same question about snap trimming. >>>> You haven't shown any evidence that your cluster is behind on that. Have >>>> you looked into fstrim inside of your VMs? >>>> >>>> On Mon, Jan 29, 2018, 4:30 AM Karun Josy wrote: >>>> >>>>> fast-diff map is not enabled for RBD images. >>>>> Can it be a reason for trimming not happening? >>>>> >>>>> Karun Josy >>>>> >>>>> On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy >>>>> wrote: >>>>> >>>>>> Hi David, >>>>>> >>>>>> Thank you for your reply! I really appreciate it. >>>>>> >>>>>> The images are in pool id 55. It is an erasure coded pool.
>>>>>> >>>>>> --- >>>>>> $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | >>>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>>> 0 >>>>>> $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | >>>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>>> 0 >>>>>> $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | >>>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>>> 0 >>>>>> -- >>>>>> >>>>>> Current snap_trim_sleep value is default. >>>>>> "osd_snap_trim_sleep": "0.00".
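Rather than spot-checking individual PGs with the one-liner quoted above, every PG in the pool can be swept in a loop; this sketch assumes the first column of ceph pg ls-by-pool output (after the header row) is the PG id:
--
# print only PGs that still have entries queued for snap trimming
for pg in $(ceph pg ls-by-pool ecpool | awk 'NR>1 {print $1}'); do
  n=$(( $(ceph pg $pg query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 ))
  [ "$n" -gt 0 ] && echo "$pg: $n"
done
--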
Re: [ceph-users] lease_timeout
Thank you for looking into it. Yes, I believe it is the same issue as reported in the bug. Sorry I was not specific. - The Health section is not updated. - The Activity values under the Pools section (right side) get stuck: they show old data and are not updated. However, the Cluster log section gets updated correctly. Karun Josy On Tue, Jan 30, 2018 at 1:35 AM, John Spray wrote: > On Mon, Jan 29, 2018 at 6:58 PM, Gregory Farnum > wrote: > > The lease timeout means this (peon) monitor hasn't heard from the leader > > monitor in too long; its read lease on the system state has expired. So > it > > calls a new election since that means the leader is down or misbehaving. > Do > > the other monitors have a similar problem at this stage? > > > > The manager freezing until you restart it is a separate bug, but I'm not > > sure what the dashboard/mgr people will want to see there. John? > > There is a bug where the mgr will stop getting updates from the mon in > some situations (http://tracker.ceph.com/issues/22142), which is fixed > in master but not backported to luminous yet. > > However, I don't know what "gets stuck" means in this context. Karun, > can you be more specific? Is it rendering but old data? Is the page > not loading at all? > > John > > > -Greg > > > > On Sun, Jan 28, 2018 at 9:11 AM Karun Josy wrote: > >> > >> The issue is still continuing. Has anyone else noticed it? > >> > >> > >> When this happens, the Ceph Dashboard GUI gets stuck and we have to > >> restart the manager daemon to make it work again > >> > >> Karun Josy > >> > >> On Wed, Jan 17, 2018 at 6:16 AM, Karun Josy > wrote: > >>> > >>> Hello, > >>> > >>> In one of our cluster setups, there are frequent monitor elections > >>> happening. > >>> In the logs of one of the monitors, there is a "lease_timeout" message > >>> before that happens. Can anyone help me to figure it out?
> >>> (When this happens, the Ceph Dashboard GUI gets stuck and we have to > >>> restart the manager daemon to make it work again) > >>> > >>> Ceph version : Luminous 12.2.2 > >>> > >>> Log : > >>> = > >>> > >>> 2018-01-16 16:33:08.001937 7f0cfbaad700 4 rocksdb: > >>> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_ > 64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/ > centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ > ceph-12.2.2/src/rocksdb/db/compaction_job.cc:1173] > >>> [default] [JOB 885] Compacted 1@0 + 1@1 files to L1 => 20046585 bytes > >>> 2018-01-16 16:33:08.015891 7f0cfbaad700 4 rocksdb: (Original Log Time > >>> 2018/01/16-16:33:08.015826) > >>> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_ > 64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/ > centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ > ceph-12.2.2/src/rocksdb/db/compaction_job.cc:621] > >>> [default] compacted to: base level 1 max bytes base 268435456 files[0 > 1 0 0 > >>> 0 0 0] max score 0.07, MB/sec: 32.7 rd, 30.9 wr, level 1, files in(1, > 1) > >>> out(1) MB in(1.3, 18.9) out(19.1), read-write-amplify(31.0) > >>> write-amplify(15.1) OK, records in: 4305, records dropped: 515 > >>> > >>> 2018-01-16 16:33:08.015897 7f0cfbaad700 4 rocksdb: (Original Log Time > >>> 2018/01/16-16:33:08.015840) EVENT_LOG_v1 {"time_micros": > 1516149188015833, > >>> "job": 885, "event": "compaction_finished", "compaction_time_micros": > >>> 647876, "output_level": 1, "num_output_files": 1, "total_output_size": > >>> 20046585, "num_input_records": 4305, "num_output_records": 3790, > >>> "num_subcompactions": 1, "num_single_delete_mismatches": 0, > >>> "num_single_delete_fallthrough": 0, "lsm_state": [0, 1, 0, 0, 0, 0, > 0]} > >>> 2018-01-16 16:33:08.016131 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 > >>> {"time_micros": 1516149188016128, "job": 885, "event": > >>> "table_file_deletion", "file_number": 2419} > >>> 2018-01-16 16:33:08.018147 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 > >>> {"time_micros": 1516149188018146, "job": 885, "event": > >>> "table_file_deletion", "file_numb
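Until the fix for http://tracker.ceph.com/issues/22142 lands in a Luminous release, the workaround described in this thread amounts to checking mon quorum and bouncing the stuck manager; the mgr unit name below is the systemd default and the instance id is a placeholder:
--
# confirm which mons are in quorum and who the current leader is
ceph quorum_status -f json-pretty
# restart the stuck manager daemon (replace mgr-id with your mgr instance)
systemctl restart ceph-mgr@mgr-id
--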
Re: [ceph-users] Snapshot trimming
The problem we are experiencing is described here: https://bugzilla.redhat.com/show_bug.cgi?id=1497332 However, we are running 12.2.2. Across our 6 ceph clusters, this one with the problem was first version 12.2.0, then upgraded to .1 and then to .2. The other 5 ceph installations started as version 12.2.1 and then updated to .2. Karun Josy On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy wrote: > Thank you for your response. > > We don't think there is an issue with the cluster being behind snap > trimming. We just don't think snaptrim is occurring at all. > > We have 6 individual ceph clusters. When we delete old snapshots for > clients, we can see space being made available. In this particular one > however, with 300 virtual machines, 28TBs of data (this is our largest > ceph), I can delete hundreds of snapshots, and not a single gigabyte > becomes available after doing that. > > In our other 5, smaller Ceph clusters, we can see hundreds of gigabytes > becoming available again after doing massive deletions of snapshots. > > The Luminous gui also never shows "snaptrimming" occurring in the EC > pool. While the other 5 Luminous clusters, their GUI will show > snaptrimming occurring for the EC pool. Within minutes we can see the > additional space becoming available. > > This isn't an issue of the trimming queue behind schedule. The system > shows there is no trimming scheduled in the queue, ever. > > However, when using ceph du on particular virtual machines, we can see > that snapshots we delete are indeed no longer listed in ceph du's output. > > So, they seem to be deleting. But the space is not being reclaimed. > > All clusters are same hardware. Some have more disks and servers than > others. The only major difference is that this particular Ceph with this > problem, it had the noscrub and nodeep-scrub flags set for many weeks. > > > Karun Josy > > On Mon, Jan 29, 2018 at 6:27 PM, David Turner > wrote: > >> I don't know why you keep asking the same question about snap trimming. >> You haven't shown any evidence that your cluster is behind on that. Have >> you looked into fstrim inside of your VMs? >> >> On Mon, Jan 29, 2018, 4:30 AM Karun Josy wrote: >> >>> fast-diff map is not enabled for RBD images. >>> Can it be a reason for Trimming not happening ? >>> >>> Karun Josy >>> >>> On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy >>> wrote: >>> >>>> Hi David, >>>> >>>> Thank you for your reply! I really appreciate it. >>>> >>>> The images are in pool id 55. It is an erasure coded pool. >>>> >>>> --- >>>> $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> -- >>>> >>>> Current snap_trim_sleep value is default. >>>> "osd_snap_trim_sleep": "0.00". I assume it means there is no delay. >>>> (Can't find any documentation related to it) >>>> Will changing its value initiate snaptrimming, like >>>> ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05' >>>> >>>> Also, we are using an rbd user with the below profile. It is used while >>>> deleting snapshots >>>> --- >>>> caps: [mon] profile rbd >>>> caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, >>>> profile rbd-read-only pool=templates >>>> --- >>>> >>>> Can it be a reason ? 
>>>> >>>> Also, can you let me know which all logs to check while deleting >>>> snapshots to see if it is snaptrimming ? >>>> I am sorry I feel like pestering you too much. >>>> But in mailing lists, I can see you have dealt with similar issues with >>>> Snapshots >>>> So I think you can help me figure this mess out. >>>> >>>> >>>> Karun Josy >>>> >>>> On Sat, Jan 27, 2018 at 7:15 PM, David Turner >>>> wrote: >>>> >>>>> Prove* a positive >>>>> >>>>> On Sat,
Re: [ceph-users] Snapshot trimming
Thank you for your response. We don't think there is an issue with the cluster being behind snap trimming. We just don't think snaptrim is occurring at all. We have 6 individual ceph clusters. When we delete old snapshots for clients, we can see space being made available. In this particular one however, with 300 virtual machines, 28TBs of data (this is our largest ceph), I can delete hundreds of snapshots, and not a single gigabyte becomes available after doing that. In our other 5, smaller Ceph clusters, we can see hundreds of gigabytes becoming available again after doing massive deletions of snapshots. The Luminous gui also never shows "snaptrimming" occurring in the EC pool. While the other 5 Luminous clusters, their GUI will show snaptrimming occurring for the EC pool. Within minutes we can see the additional space becoming available. This isn't an issue of the trimming queue behind schedule. The system shows there is no trimming scheduled in the queue, ever. However, when using ceph du on particular virtual machines, we can see that snapshots we delete are indeed no longer listed in ceph du's output. So, they seem to be deleting. But the space is not being reclaimed. All clusters are same hardware. Some have more disks and servers than others. The only major difference is that this particular Ceph with this problem, it had the noscrub and nodeep-scrub flags set for many weeks. Karun Josy On Mon, Jan 29, 2018 at 6:27 PM, David Turner wrote: > I don't know why you keep asking the same question about snap trimming. > You haven't shown any evidence that your cluster is behind on that. Have > you looked into fstrim inside of your VMs? > > On Mon, Jan 29, 2018, 4:30 AM Karun Josy wrote: > >> fast-diff map is not enabled for RBD images. >> Can it be a reason for Trimming not happening ? >> >> Karun Josy >> >> On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy >> wrote: >> >>> Hi David, >>> >>> Thank you for your reply! I really appreciate it. >>> >>> The images are in pool id 55. It is an erasure coded pool. >>> >>> --- >>> $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut >>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>> 0 >>> $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut >>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>> 0 >>> $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut >>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>> 0 >>> -- >>> >>> Current snap_trim_sleep value is default. >>> "osd_snap_trim_sleep": "0.00". I assume it means there is no delay. >>> (Can't find any documentation related to it) >>> Will changing its value initiate snaptrimming, like >>> ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05' >>> >>> Also, we are using an rbd user with the below profile. It is used while >>> deleting snapshots >>> --- >>> caps: [mon] profile rbd >>> caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, >>> profile rbd-read-only pool=templates >>> --- >>> >>> Can it be a reason ? >>> >>> Also, can you let me know which all logs to check while deleting >>> snapshots to see if it is snaptrimming ? >>> I am sorry I feel like pestering you too much. >>> But in mailing lists, I can see you have dealt with similar issues with >>> Snapshots >>> So I think you can help me figure this mess out. 
>>> >>> >>> Karun Josy >>> >>> On Sat, Jan 27, 2018 at 7:15 PM, David Turner >>> wrote: >>> >>>> Prove* a positive >>>> >>>> On Sat, Jan 27, 2018, 8:45 AM David Turner >>>> wrote: >>>> >>>>> Unless you have things in your snap_trimq, your problem isn't snap >>>>> trimming. That is currently how you can check snap trimming and you say >>>>> you're caught up. >>>>> >>>>> Are you certain that you are querying the correct pool for the images >>>>> you are snapshotting. You showed that you tested 4 different pools. You >>>>> should only need to check the pool with the images you are dealing with. >>>>> >>>>> You can inversely price a positive by changing your snap_trim settings >>>>> to not do any cleanup and see if the appropriate PGs have anything in >>>>>
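On the fstrim point raised above: space freed inside a guest filesystem is only returned to the RBD image if discard/TRIM reaches it, which requires the virtual disk to be attached with discard support enabled. A quick check from inside a VM:
--
# trim all mounted filesystems that support discard, with per-fs output
fstrim -av
--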
Re: [ceph-users] Snapshot trimming
fast-diff map is not enabled for RBD images. Can it be a reason for Trimming not happening ? Karun Josy On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy wrote: > Hi David, > > Thank you for your reply! I really appreciate it. > > The images are in pool id 55. It is an erasure coded pool. > > --- > $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut > -d] -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut -d] > -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut > -d] -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > -- > > Current snap_trim_sleep value is default. > "osd_snap_trim_sleep": "0.00". I assume it means there is no delay. > (Can't find any documentation related to it) > Will changing its value initiate snaptrimming, like > ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05' > > Also, we are using an rbd user with the below profile. It is used while > deleting snapshots > --- > caps: [mon] profile rbd > caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, profile > rbd-read-only pool=templates > --- > > Can it be a reason ? > > Also, can you let me know which all logs to check while deleting snapshots > to see if it is snaptrimming ? > I am sorry I feel like pestering you too much. > But in mailing lists, I can see you have dealt with similar issues with > Snapshots > So I think you can help me figure this mess out. > > > Karun Josy > > On Sat, Jan 27, 2018 at 7:15 PM, David Turner > wrote: > >> Prove* a positive >> >> On Sat, Jan 27, 2018, 8:45 AM David Turner wrote: >> >>> Unless you have things in your snap_trimq, your problem isn't snap >>> trimming. That is currently how you can check snap trimming and you say >>> you're caught up. >>> >>> Are you certain that you are querying the correct pool for the images >>> you are snapshotting. You showed that you tested 4 different pools. You >>> should only need to check the pool with the images you are dealing with. >>> >>> You can inversely price a positive by changing your snap_trim settings >>> to not do any cleanup and see if the appropriate PGs have anything in their >>> q. >>> >>> On Sat, Jan 27, 2018, 12:06 AM Karun Josy wrote: >>> >>>> Is scrubbing and deep scrubbing necessary for Snaptrim operation to >>>> happen ? >>>> >>>> Karun Josy >>>> >>>> On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy >>>> wrote: >>>> >>>>> Thank you for your quick response! >>>>> >>>>> I used the command to fetch the snap_trimq from many pgs, however it >>>>> seems they don't have any in queue ? 
>>>>> >>>>> For eg : >>>>> >>>>> $ echo $(( $(ceph pg 55.4a query | grep snap_trimq | cut -d[ -f2 | >>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>> 0 >>>>> $ echo $(( $(ceph pg 55.5a query | grep snap_trimq | cut -d[ -f2 | >>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>> 0 >>>>> $ echo $(( $(ceph pg 55.88 query | grep snap_trimq | cut -d[ -f2 | >>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>> 0 >>>>> $ echo $(( $(ceph pg 55.55 query | grep snap_trimq | cut -d[ -f2 | >>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>> 0 >>>>> $ echo $(( $(ceph pg 54.a query | grep snap_trimq | cut -d[ -f2 | cut >>>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>> 0 >>>>> $ echo $(( $(ceph pg 34.1d query | grep snap_trimq | cut -d[ -f2 | >>>>> cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>> 0 >>>>> $ echo $(( $(ceph pg 1.3f query | grep snap_trimq | cut -d[ -f2 | cut >>>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>>> 0 >>>>> = >>>>> >>>>> >>>>> While going through the PG query, I find that these PGs have no value >>>>> in purged_snaps section too. >>>>> For eg : >>>>>
Re: [ceph-users] POOL_NEARFULL
In the Luminous version, we have to use the osd set-*-ratio commands: -- ceph osd set-backfillfull-ratio .89 ceph osd set-nearfull-ratio .84 ceph osd set-full-ratio .96 -- Karun Josy On Thu, Dec 21, 2017 at 4:29 PM, Konstantin Shalygin wrote: > Update your ceph.conf file > > This also does not help. I created a ticket: http://tracker.ceph.com/issues/22520 > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
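The effective values can be verified afterwards from the OSD map; in Luminous they are printed near the top of the dump:
--
ceph osd dump | grep -E 'full_ratio|backfillfull_ratio|nearfull_ratio'
--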
Re: [ceph-users] Limit deep scrub
Hi, I used these settings and there are no more slow requests in the cluster. - ceph tell osd.* injectargs '--osd_scrub_sleep 0.1' ceph tell osd.* injectargs '--osd_scrub_load_threshold 0.3' ceph tell osd.* injectargs '--osd_scrub_chunk_max 6' -- Yes, scrubbing is slower now, but there has been no OSD flapping and no slow requests! Thanks for all your help! Karun Josy On Sun, Jan 28, 2018 at 9:25 PM, David Turner wrote: > Use a get with the second syntax to see the currently running config. > > On Sun, Jan 28, 2018, 3:41 AM Karun Josy wrote: > >> Hello, >> >> Sorry for bringing this up again. >> >> What is the proper way to adjust the scrub settings? >> Can I use injectargs? >> --- >> ceph tell osd.* injectargs '--osd_scrub_sleep .1' >> --- >> >> Or do I have to use set manually on each OSD daemon? >> --- >> ceph daemon osd.21 set osd_scrub_sleep .1 >> >> >> While using both, it shows (not observed, change may require restart). >> So is it not set? >> >> >> Karun Josy >> >> On Mon, Jan 15, 2018 at 7:16 AM, shadow_lin wrote: >> >>> hi, >>> you can try adjusting osd_scrub_chunk_min, osd_scrub_chunk_max and >>> osd_scrub_sleep. >>> >>> >>> osd scrub sleep >>> >>> Description: Time to sleep before scrubbing next group of chunks. >>> Increasing this value will slow down whole scrub operation while client >>> operations will be less impacted. >>> Type: Float >>> Default: 0 >>> >>> osd scrub chunk min >>> >>> Description: The minimal number of object store chunks to scrub during >>> single operation. Ceph blocks writes to single chunk during scrub. >>> Type: 32-bit Integer >>> Default: 5 >>> >>> >>> 2018-01-15 >>> -- >>> lin.yunfan >>> -- >>> >>> *From:* Karun Josy >>> *Sent:* 2018-01-15 06:53 >>> *Subject:* [ceph-users] Limit deep scrub >>> *To:* "ceph-users" >>> *Cc:* >>> >>> Hello, >>> >>> It appears that the cluster is having many slow requests while it is >>> scrubbing and deep scrubbing. Also sometimes we can see osds flapping. >>> >>> So we have put the flags : noscrub,nodeep-scrub >>> >>> When we unset it, 5 PGs start to scrub. >>> Is there a way to limit it to one at a time?
>>> >>> # ceph daemon osd.35 config show | grep scrub >>> "mds_max_scrub_ops_in_progress": "5", >>> "mon_scrub_inject_crc_mismatch": "0.00", >>> "mon_scrub_inject_missing_keys": "0.00", >>> "mon_scrub_interval": "86400", >>> "mon_scrub_max_keys": "100", >>> "mon_scrub_timeout": "300", >>> "mon_warn_not_deep_scrubbed": "0", >>> "mon_warn_not_scrubbed": "0", >>> "osd_debug_scrub_chance_rewrite_digest": "0", >>> "osd_deep_scrub_interval": "604800.00", >>> "osd_deep_scrub_randomize_ratio": "0.15", >>> "osd_deep_scrub_stride": "524288", >>> "osd_deep_scrub_update_digest_min_age": "7200", >>> "osd_max_scrubs": "1", >>> "osd_op_queue_mclock_scrub_lim": "0.001000", >>> "osd_op_queue_mclock_scrub_res": "0.00", >>> "osd_op_queue_mclock_scrub_wgt": "1.00", >>> "osd_requested_scrub_priority": "120", >>> "osd_scrub_auto_repair": "false", >>> "osd_scrub_auto_repair_num_errors": "5", >>> "osd_scrub_backoff_ratio": "0.66", >>> "osd_scrub_begin_hour": "0", >>> "osd_scrub_chunk_max": "25", >>> "osd_scrub_chunk_min": "5", >>> "osd_scrub_cost": "52428800", >>> "osd_scrub_during_recovery": "false", >>> "osd_scrub_end_hour": "24", >>> "osd_scrub_interval_randomize_ratio": "0.50", >>> "osd_scrub_invalid_stats": "true", >>> "osd_scrub_load_threshold": "0.50", >>> "osd_scrub_max_interval": "604800.00", >>> "osd_scrub_min_interval": "86400.00", >>> "osd_scrub_priority": "5", >>> "osd_scrub_sleep": "0.00", >>> >>> >>> Karun >>> >>> >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
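Note that injectargs changes are runtime-only; to make the scrub settings from this thread survive OSD restarts, the equivalent lines can go into ceph.conf on each OSD host (the values shown are the ones used above):
--
[osd]
osd scrub sleep = 0.1
osd scrub load threshold = 0.3
osd scrub chunk max = 6
--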
Re: [ceph-users] lease_timeout
Still the issue is continuing. Any one else has noticed it ? When this happens, the Ceph Dashboard GUI gets stuck and we have to restart the manager daemon to make it work again Karun Josy On Wed, Jan 17, 2018 at 6:16 AM, Karun Josy wrote: > Hello, > > In one of our cluster set up, there is frequent monitor elections > happening. > In the logs of one of the monitor, there is "lease_timeout" message before > that happens. Can anyone help me to figure it out ? > (When this happens, the Ceph Dashboard GUI gets stuck and we have to > restart the manager daemon to make it work again) > > Ceph version : Luminous 12.2.2 > > Log : > = > > 2018-01-16 16:33:08.001937 7f0cfbaad700 4 rocksdb: > [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_ > 64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/ > centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ > ceph-12.2.2/src/rocksdb/db/compaction_job.cc:1173] [default] [JOB 885] > Compacted 1@0 + 1@1 files to L1 => 20046585 bytes > 2018-01-16 16:33:08.015891 7f0cfbaad700 4 rocksdb: (Original Log Time > 2018/01/16-16:33:08.015826) [/home/jenkins-build/build/ > workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/ > AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/ > release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/compaction_job.cc:621] > [default] compacted to: base level 1 max bytes base 268435456 files[0 1 0 0 > 0 0 0] max score 0.07, MB/sec: 32.7 rd, 30.9 wr, level 1, files in(1, 1) > out(1) MB in(1.3, 18.9) out(19.1), read-write-amplify(31.0) > write-amplify(15.1) OK, records in: 4305, records dropped: 515 > > 2018-01-16 16:33:08.015897 7f0cfbaad700 4 rocksdb: (Original Log Time > 2018/01/16-16:33:08.015840) EVENT_LOG_v1 {"time_micros": 1516149188015833, > "job": 885, "event": "compaction_finished", "compaction_time_micros": > 647876, "output_level": 1, "num_output_files": 1, "total_output_size": > 20046585, "num_input_records": 4305, "num_output_records": 3790, > "num_subcompactions": 1, "num_single_delete_mismatches": 0, > "num_single_delete_fallthrough": 0, "lsm_state": [0, 1, 0, 0, 0, 0, 0]} > 2018-01-16 16:33:08.016131 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 > {"time_micros": 1516149188016128, "job": 885, "event": > "table_file_deletion", "file_number": 2419} > 2018-01-16 16:33:08.018147 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 > {"time_micros": 1516149188018146, "job": 885, "event": > "table_file_deletion", "file_number": 2417} > 2018-01-16 16:33:11.051010 7f0d042be700 0 > mon.ceph-mon3@2(peon).data_health(436) > update_stats avail 84% total 20918 MB, used 2179 MB, avail 17653 MB > 2018-01-16 16:33:17.269954 7f0d042be700 1 mon.ceph-mon3@2(peon).paxos(paxos > active c 84337..84838) lease_timeout -- calling new election > 2018-01-16 16:33:17.291096 7f0d01ab9700 0 log_channel(cluster) log [INF] > : mon.ceph-sgp-mon3 calling new monitor election > 2018-01-16 16:33:17.291182 7f0d01ab9700 1 > mon.ceph-mon3@2(electing).elector(436) > init, last seen epoch 436 > 2018-01-16 16:33:20.834853 7f0d01ab9700 1 mon.ceph-mon3@2(peon).log > v23189 check_sub sending message to client.65755 10.255.0.95:0/2603001850 > with 8 entries (version 23189) > > > > Karun > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Limit deep scrub
Hello, Sorry for bringing this up again. What is the proper way to adjust the scrub settings? Can I use injectargs? --- ceph tell osd.* injectargs '--osd_scrub_sleep .1' --- Or do I have to use set manually on each OSD daemon? --- ceph daemon osd.21 set osd_scrub_sleep .1 While using both, it shows (not observed, change may require restart). So is it not set? Karun Josy On Mon, Jan 15, 2018 at 7:16 AM, shadow_lin wrote: > hi, > you can try adjusting osd_scrub_chunk_min, osd_scrub_chunk_max and > osd_scrub_sleep. > > > osd scrub sleep > > Description: Time to sleep before scrubbing next group of chunks. > Increasing this value will slow down whole scrub operation while client > operations will be less impacted. > Type: Float > Default: 0 > > osd scrub chunk min > > Description: The minimal number of object store chunks to scrub during > single operation. Ceph blocks writes to single chunk during scrub. > Type: 32-bit Integer > Default: 5 > > > 2018-01-15 > -- > lin.yunfan > -- > > *From:* Karun Josy > *Sent:* 2018-01-15 06:53 > *Subject:* [ceph-users] Limit deep scrub > *To:* "ceph-users" > *Cc:* > > Hello, > > It appears that the cluster is having many slow requests while it is > scrubbing and deep scrubbing. Also sometimes we can see osds flapping. > > So we have put the flags : noscrub,nodeep-scrub > > When we unset it, 5 PGs start to scrub. > Is there a way to limit it to one at a time? > > # ceph daemon osd.35 config show | grep scrub > "mds_max_scrub_ops_in_progress": "5", > "mon_scrub_inject_crc_mismatch": "0.00", > "mon_scrub_inject_missing_keys": "0.00", > "mon_scrub_interval": "86400", > "mon_scrub_max_keys": "100", > "mon_scrub_timeout": "300", > "mon_warn_not_deep_scrubbed": "0", > "mon_warn_not_scrubbed": "0", > "osd_debug_scrub_chance_rewrite_digest": "0", > "osd_deep_scrub_interval": "604800.00", > "osd_deep_scrub_randomize_ratio": "0.15", > "osd_deep_scrub_stride": "524288", > "osd_deep_scrub_update_digest_min_age": "7200", > "osd_max_scrubs": "1", > "osd_op_queue_mclock_scrub_lim": "0.001000", > "osd_op_queue_mclock_scrub_res": "0.00", > "osd_op_queue_mclock_scrub_wgt": "1.00", > "osd_requested_scrub_priority": "120", > "osd_scrub_auto_repair": "false", > "osd_scrub_auto_repair_num_errors": "5", > "osd_scrub_backoff_ratio": "0.66", > "osd_scrub_begin_hour": "0", > "osd_scrub_chunk_max": "25", > "osd_scrub_chunk_min": "5", > "osd_scrub_cost": "52428800", > "osd_scrub_during_recovery": "false", > "osd_scrub_end_hour": "24", > "osd_scrub_interval_randomize_ratio": "0.50", > "osd_scrub_invalid_stats": "true", > "osd_scrub_load_threshold": "0.50", > "osd_scrub_max_interval": "604800.00", > "osd_scrub_min_interval": "86400.00", > "osd_scrub_priority": "5", > "osd_scrub_sleep": "0.00", > > > Karun > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Snapshot trimming
Hi David, Thank you for your reply! I really appreciate it. The images are in pool id 55. It is an erasure coded pool. --- $ echo $(( $(ceph pg 55.58 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) 0 $ echo $(( $(ceph pg 55.a query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) 0 $ echo $(( $(ceph pg 55.65 query | grep snap_trimq | cut -d[ -f2 | cut -d] -f1 | tr ',' '\n' | wc -l) - 1 )) 0 -- Current snap_trim_sleep value is default. "osd_snap_trim_sleep": "0.00". I assume it means there is no delay. (Can't find any documentation related to it) Will changing its value initiate snaptrimming, like ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.05' Also, we are using an rbd user with the below profile. It is used while deleting snapshots --- caps: [mon] profile rbd caps: [osd] profile rbd pool=ecpool, profile rbd pool=vm, profile rbd-read-only pool=templates --- Can it be a reason ? Also, can you let me know which all logs to check while deleting snapshots to see if it is snaptrimming ? I am sorry I feel like pestering you too much. But in mailing lists, I can see you have dealt with similar issues with Snapshots So I think you can help me figure this mess out. Karun Josy On Sat, Jan 27, 2018 at 7:15 PM, David Turner wrote: > Prove* a positive > > On Sat, Jan 27, 2018, 8:45 AM David Turner wrote: > >> Unless you have things in your snap_trimq, your problem isn't snap >> trimming. That is currently how you can check snap trimming and you say >> you're caught up. >> >> Are you certain that you are querying the correct pool for the images you >> are snapshotting. You showed that you tested 4 different pools. You should >> only need to check the pool with the images you are dealing with. >> >> You can inversely price a positive by changing your snap_trim settings to >> not do any cleanup and see if the appropriate PGs have anything in their q. >> >> On Sat, Jan 27, 2018, 12:06 AM Karun Josy wrote: >> >>> Is scrubbing and deep scrubbing necessary for Snaptrim operation to >>> happen ? >>> >>> Karun Josy >>> >>> On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy >>> wrote: >>> >>>> Thank you for your quick response! >>>> >>>> I used the command to fetch the snap_trimq from many pgs, however it >>>> seems they don't have any in queue ? >>>> >>>> For eg : >>>> >>>> $ echo $(( $(ceph pg 55.4a query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 55.5a query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 55.88 query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 55.55 query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 54.a query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 34.1d query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> $ echo $(( $(ceph pg 1.3f query | grep snap_trimq | cut -d[ -f2 | cut >>>> -d] -f1 | tr ',' '\n' | wc -l) - 1 )) >>>> 0 >>>> = >>>> >>>> >>>> While going through the PG query, I find that these PGs have no value >>>> in purged_snaps section too. 
>>>> For eg : >>>> ceph pg 55.80 query >>>> -- >>>> --- >>>> --- >>>> { >>>> "peer": "83(3)", >>>> "pgid": "55.80s3", >>>> "last_update": "43360'15121927", >>>> "last_complete": "43345'15073146", >>>> "log_tail": "43335'15064480", >>>> "last_user_version": 15066124, >>>> "last_backfill": "MAX", >>>> "last_backfill_bitwise": 1,
Re: [ceph-users] Snapshot trimming
Is scrubbing and deep scrubbing necessary for Snaptrim operation to happen ? Karun Josy On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy wrote: > Thank you for your quick response! > > I used the command to fetch the snap_trimq from many pgs, however it seems > they don't have any in queue ? > > For eg : > > $ echo $(( $(ceph pg 55.4a query | grep snap_trimq | cut -d[ -f2 | cut > -d] -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 55.5a query | grep snap_trimq | cut -d[ -f2 | cut > -d] -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 55.88 query | grep snap_trimq | cut -d[ -f2 | cut > -d] -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 55.55 query | grep snap_trimq | cut -d[ -f2 | cut > -d] -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 54.a query | grep snap_trimq | cut -d[ -f2 | cut -d] > -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 34.1d query | grep snap_trimq | cut -d[ -f2 | cut > -d] -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > $ echo $(( $(ceph pg 1.3f query | grep snap_trimq | cut -d[ -f2 | cut -d] > -f1 | tr ',' '\n' | wc -l) - 1 )) > 0 > = > > > While going through the PG query, I find that these PGs have no value in > purged_snaps section too. > For eg : > ceph pg 55.80 query > -- > --- > --- > { > "peer": "83(3)", > "pgid": "55.80s3", > "last_update": "43360'15121927", > "last_complete": "43345'15073146", > "log_tail": "43335'15064480", > "last_user_version": 15066124, > "last_backfill": "MAX", > "last_backfill_bitwise": 1, > "purged_snaps": [], > "history": { > "epoch_created": 5950, > "epoch_pool_created": 5950, > "last_epoch_started": 43339, > "last_interval_started": 43338, > "last_epoch_clean": 43340, > "last_interval_clean": 43338, > "last_epoch_split": 0, > "last_epoch_marked_full": 42032, > "same_up_since": 43338, > "same_interval_since": 43338, > "same_primary_since": 43276, > "last_scrub": "35299'13072533", > "last_scrub_stamp": "2018-01-18 14:01:19.557972", > "last_deep_scrub": "31372'12176860", > "last_deep_scrub_stamp": "2018-01-15 12:21:17.025305", > "last_clean_scrub_stamp": "2018-01-18 14:01:19.557972" > }, > > Not sure if it is related. > > The cluster is not open to any new clients. However we see a steady growth > of space usage every day. > And worst case scenario, it might grow faster than we can add more space, > which will be dangerous. > > Any help is really appreciated. > > Karun Josy > > On Fri, Jan 26, 2018 at 8:23 PM, David Turner > wrote: > >> "snap_trimq": "[]", >> >> That is exactly what you're looking for to see how many objects a PG >> still had that need to be cleaned up. I think something like this should >> give you the number of objects in the snap_trimq for a PG. >> >> echo $(( $(ceph pg $pg query | grep snap_trimq | cut -d[ -f2 | cut -d] >> -f1 | tr ',' '\n' | wc -l) - 1 )) >> >> Note, I'm not at a computer and topping this from my phone so it's not >> pretty and I know of a few ways to do that better, but that should work all >> the same. >> >> For your needs a visual inspection of several PGs should be sufficient to >> see if there is anything in the snap_trimq to begin with. >> >> On Fri, Jan 26, 2018, 9:18 AM Karun Josy wrote: >> >>> Hi David, >>> >>> Thank you for the response. To be honest, I am afraid it is going to be >>> a issue in our cluster. >>> It seems snaptrim has not been going on for sometime now , maybe because >>> we were expanding the cluster adding nodes for the past few weeks. >>> >>> I would be really glad if you can
[ceph-users] Snapshot trimming
Hi, We have set the noscrub and nodeep-scrub flags on a Ceph cluster. When we are deleting snapshots, we are not seeing any change in used space. I understand that Ceph OSDs delete data asynchronously, so deleting a snapshot doesn’t free up the disk space immediately. But we are not seeing any change for some time. What can be the possible reason? Any suggestions would be really helpful, as the cluster size seems to be growing each day even though snapshots are deleted. Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Full Ratio
Thank you! Ceph version is 12.2. Also, can you let me know the format to set osd_backfill_full_ratio? Is it " ceph osd set -backfillfull-ratio .89 "? Karun Josy On Thu, Jan 25, 2018 at 1:29 AM, Jean-Charles Lopez wrote: > Hi, > > if you are using an older Ceph version note that > mon_osd_nearfull_ratio and mon_osd_full_ratio must be set in the config > file on the MON hosts first and then the MONs restarted one after the > other. > > If using a recent version there are the commands ceph osd set-full-ratio and > ceph osd set-nearfull-ratio > > Regards > JC > > > On Jan 24, 2018, at 11:07, Karun Josy wrote: > > > > Hi, > > > > I am trying to increase the full ratio of OSDs in a cluster. > > While adding a new node, one of the new disks got backfilled to more than > 95% and the cluster froze. So I am trying to avoid it from happening again. > > > > > > I tried the pg set command but it is not working: > > $ ceph pg set_nearfull_ratio 0.88 > > Error ENOTSUP: this command is obsolete > > > > I had increased the full ratio in the OSDs using injectargs initially, but it > didn't work: when the disk reached 95% it showed OSD full status. > > > > $ ceph tell osd.* injectargs '--mon_osd_full_ratio 0.97' > > osd.0: mon_osd_full_ratio = '0.97' (not observed, change may require > restart) > > osd.1: mon_osd_full_ratio = '0.97' (not observed, change may require > restart) > > > > > > > > How can I set the full ratio to more than 95%? > > > > Karun > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
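For the record, the Luminous subcommands are single hyphenated words with no space after "set":
--
ceph osd set-nearfull-ratio 0.84
ceph osd set-backfillfull-ratio 0.89
ceph osd set-full-ratio 0.96
--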
[ceph-users] Full Ratio
Hi, I am trying to increase the full ratio of OSDs in a cluster. While adding a new node, one of the new disks got backfilled to more than 95% and the cluster froze. So I am trying to avoid it from happening again. I tried the pg set command but it is not working: $ ceph pg set_nearfull_ratio 0.88 Error ENOTSUP: this command is obsolete I had increased the full ratio in the OSDs using injectargs initially, but it didn't work: when the disk reached 95% it showed OSD full status. $ ceph tell osd.* injectargs '--mon_osd_full_ratio 0.97' osd.0: mon_osd_full_ratio = '0.97' (not observed, change may require restart) osd.1: mon_osd_full_ratio = '0.97' (not observed, change may require restart) How can I set the full ratio to more than 95%? Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] PG inactive, peering
Hi, We added a new host to the cluster and it was rebalancing. One PG became "inactive, peering" for a very long time, which created a lot of slow requests and poor performance for the whole cluster. When I queried that PG, it showed this: "recovery_state": [ { "name": "Started/Primary/Peering/GetMissing", "enter_time": "2018-01-22 18:40:04.777654", "peer_missing_requested": [ { "osd": "77(7)", So I assumed it was stuck getting information from osd.77 and so I marked osd.77 down. The status of the PG changed to "active+undersized+degraded" and the PG became active again. Does anyone know why this happened? If I start osd.77 again, the PG goes back to the inactive, peering state. Is it because osd.77 is bad? Or will the same happen when the PG tries to peer again with another disk? Any help is really appreciated. Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
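For reference, the inspection and workaround described above correspond to roughly the following commands; the PG id is a placeholder since it is not named in the message:
--
# see what the peering PG is blocked on
ceph pg <pgid> query | grep -A 10 recovery_state
# temporarily mark the suspect OSD down, as was done here
ceph osd down 77
# if the disk really is failing, take it out so data rebalances off it
ceph osd out 77
--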
[ceph-users] lease_timeout
Hello, In one of our cluster set up, there is frequent monitor elections happening. In the logs of one of the monitor, there is "lease_timeout" message before that happens. Can anyone help me to figure it out ? (When this happens, the Ceph Dashboard GUI gets stuck and we have to restart the manager daemon to make it work again) Ceph version : Luminous 12.2.2 Log : = 2018-01-16 16:33:08.001937 7f0cfbaad700 4 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/compaction_job.cc:1173] [default] [JOB 885] Compacted 1@0 + 1@1 files to L1 => 20046585 bytes 2018-01-16 16:33:08.015891 7f0cfbaad700 4 rocksdb: (Original Log Time 2018/01/16-16:33:08.015826) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.2/rpm/el7/BUILD/ceph-12.2.2/src/rocksdb/db/compaction_job.cc:621] [default] compacted to: base level 1 max bytes base 268435456 files[0 1 0 0 0 0 0] max score 0.07, MB/sec: 32.7 rd, 30.9 wr, level 1, files in(1, 1) out(1) MB in(1.3, 18.9) out(19.1), read-write-amplify(31.0) write-amplify(15.1) OK, records in: 4305, records dropped: 515 2018-01-16 16:33:08.015897 7f0cfbaad700 4 rocksdb: (Original Log Time 2018/01/16-16:33:08.015840) EVENT_LOG_v1 {"time_micros": 1516149188015833, "job": 885, "event": "compaction_finished", "compaction_time_micros": 647876, "output_level": 1, "num_output_files": 1, "total_output_size": 20046585, "num_input_records": 4305, "num_output_records": 3790, "num_subcompactions": 1, "num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 0, "lsm_state": [0, 1, 0, 0, 0, 0, 0]} 2018-01-16 16:33:08.016131 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516149188016128, "job": 885, "event": "table_file_deletion", "file_number": 2419} 2018-01-16 16:33:08.018147 7f0cfbaad700 4 rocksdb: EVENT_LOG_v1 {"time_micros": 1516149188018146, "job": 885, "event": "table_file_deletion", "file_number": 2417} 2018-01-16 16:33:11.051010 7f0d042be700 0 mon.ceph-mon3@2(peon).data_health(436) update_stats avail 84% total 20918 MB, used 2179 MB, avail 17653 MB 2018-01-16 16:33:17.269954 7f0d042be700 1 mon.ceph-mon3@2(peon).paxos(paxos active c 84337..84838) lease_timeout -- calling new election 2018-01-16 16:33:17.291096 7f0d01ab9700 0 log_channel(cluster) log [INF] : mon.ceph-sgp-mon3 calling new monitor election 2018-01-16 16:33:17.291182 7f0d01ab9700 1 mon.ceph-mon3@2(electing).elector(436) init, last seen epoch 436 2018-01-16 16:33:20.834853 7f0d01ab9700 1 mon.ceph-mon3@2(peon).log v23189 check_sub sending message to client.65755 10.255.0.95:0/2603001850 with 8 entries (version 23189) Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Limit deep scrub
Hello, It appears that cluster is having many slow requests while it is scrubbing and deep scrubbing. Also sometimes we can see osds flapping. So we have put the flags : noscrub,nodeep-scrub When we unset it, 5 PGs start to scrub. Is there a way to limit it to one at a time? # ceph daemon osd.35 config show | grep scrub "mds_max_scrub_ops_in_progress": "5", "mon_scrub_inject_crc_mismatch": "0.00", "mon_scrub_inject_missing_keys": "0.00", "mon_scrub_interval": "86400", "mon_scrub_max_keys": "100", "mon_scrub_timeout": "300", "mon_warn_not_deep_scrubbed": "0", "mon_warn_not_scrubbed": "0", "osd_debug_scrub_chance_rewrite_digest": "0", "osd_deep_scrub_interval": "604800.00", "osd_deep_scrub_randomize_ratio": "0.15", "osd_deep_scrub_stride": "524288", "osd_deep_scrub_update_digest_min_age": "7200", "osd_max_scrubs": "1", "osd_op_queue_mclock_scrub_lim": "0.001000", "osd_op_queue_mclock_scrub_res": "0.00", "osd_op_queue_mclock_scrub_wgt": "1.00", "osd_requested_scrub_priority": "120", "osd_scrub_auto_repair": "false", "osd_scrub_auto_repair_num_errors": "5", "osd_scrub_backoff_ratio": "0.66", "osd_scrub_begin_hour": "0", "osd_scrub_chunk_max": "25", "osd_scrub_chunk_min": "5", "osd_scrub_cost": "52428800", "osd_scrub_during_recovery": "false", "osd_scrub_end_hour": "24", "osd_scrub_interval_randomize_ratio": "0.50", "osd_scrub_invalid_stats": "true", "osd_scrub_load_threshold": "0.50", "osd_scrub_max_interval": "604800.00", "osd_scrub_min_interval": "86400.00", "osd_scrub_priority": "5", "osd_scrub_sleep": "0.00", Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] rbd: map failed
Hello, We have a user "testuser" with the below permissions: $ ceph auth get client.testuser exported keyring for client.testuser [client.testuser] key = == caps mon = "profile rbd" caps osd = "profile rbd pool=ecpool, profile rbd pool=cv, profile rbd-read-only pool=templates" But when we try to map an image in the pool 'templates' we get the below error: -- # rbd map templates/centos.7-4.x86-64.2017 --id testuser rbd: sysfs write failed In some cases useful info is found in syslog - try "dmesg | tail". rbd: map failed: (1) Operation not permitted Is it because that user has only read permission on the templates pool? Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
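If the intent is only to read from the templates pool, mapping the image read-only may work with those caps; this is a guess based on the rbd-read-only profile, untested here:
--
rbd map templates/centos.7-4.x86-64.2017 --id testuser --read-only
--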
Re: [ceph-users] How to evict a client in rbd
It happens randomly. Karun Josy On Wed, Jan 3, 2018 at 7:07 AM, Jason Dillaman wrote: > I tried to reproduce this for over an hour today using the specified > versions w/o any success. Is this something that you can repeat > on-demand or was this a one-time occurance? > > On Sat, Dec 23, 2017 at 3:48 PM, Karun Josy wrote: > > Hello, > > > > The image is not mapped. > > > > # ceph --version > > ceph version 12.2.1 luminous (stable) > > # uname -r > > 4.14.0-1.el7.elrepo.x86_64 > > > > > > Karun Josy > > > > On Sat, Dec 23, 2017 at 6:51 PM, Jason Dillaman > wrote: > >> > >> What Ceph and what kernel version are you using? Are you positive that > >> the image has been unmapped from 10.255.0.17? > >> > >> On Fri, Dec 22, 2017 at 7:14 PM, Karun Josy > wrote: > >> > Hello, > >> > > >> > I am unable to delete this abandoned image.Rbd info shows a watcher ip > >> > Image is not mapped > >> > Image has no snapshots > >> > > >> > > >> > rbd status cvm/image --id clientuser > >> > Watchers: > >> > watcher=10.255.0.17:0/3495340192 client.390908 > >> > cookie=18446462598732841114 > >> > > >> > How can I evict or black list a watcher client so that image can be > >> > deleted > >> > http://docs.ceph.com/docs/master/cephfs/eviction/ > >> > I see this is possible in Cephfs > >> > > >> > > >> > > >> > Karun > >> > > >> > ___ > >> > ceph-users mailing list > >> > ceph-users@lists.ceph.com > >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > >> > > >> > >> > >> > >> -- > >> Jason > > > > > > > > -- > Jason > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
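For reference, a stale watcher can be blocked at the OSD level using the address printed by rbd status; the blacklist entry expires after one hour by default:
--
# blacklist the stale client, then verify and retry the image removal
ceph osd blacklist add 10.255.0.17:0/3495340192
ceph osd blacklist ls
--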
Re: [ceph-users] Increasing PG number
https://access.redhat.com/solutions/2457321 It says it is a very intensive process and can affect cluster performance. Our version is Luminous 12.2.2. We are using an erasure coding profile for a pool 'ecpool' with k=5 and m=3. The current PG number is 256 and it has about 20 TB of data. Should I increase it gradually, or set pg_num to 512 in one step? Karun Josy On Tue, Jan 2, 2018 at 9:26 PM, Hans van den Bogert wrote: > Please refer to standard documentation as much as possible, > > http://docs.ceph.com/docs/jewel/rados/operations/placement-groups/#set-the-number-of-placement-groups > > Sebastien Han's post is also incomplete, since you also need to change 'pgp_num' > as well. > > Regards, > > Hans > > On Jan 2, 2018, at 4:41 PM, Vladimir Prokofev wrote: > > Increased number of PGs in multiple pools in a production cluster on > 12.2.2 recently - zero issues. > CEPH claims that increasing pg_num and pgp_num are safe operations, which > are essential for its ability to scale, and this sounds pretty reasonable > to me. [1] > > > [1] https://www.sebastien-han.fr/blog/2013/03/12/ceph-change-pg-number-on-the-fly/ > > 2018-01-02 18:21 GMT+03:00 Karun Josy : > >> Hi, >> >> The initial PG count was not properly planned while setting up the cluster, >> so now there are fewer than 50 PGs per OSD. >> >> What are the best practices to increase the PG number of a pool? >> We have replicated pools as well as EC pools. >> >> Or is it better to create a new pool with a higher PG number? >> >> >> Karun >> >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> >> > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
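A gradual bump on the pool from this thread would look like the following; pgp_num has to follow pg_num at each step, and the intermediate value of 320 is only an illustrative stage between 256 and 512:
--
ceph osd pool set ecpool pg_num 320
ceph osd pool set ecpool pgp_num 320
# wait for the cluster to settle (ceph -s) before the next step
ceph osd pool set ecpool pg_num 512
ceph osd pool set ecpool pgp_num 512
--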
[ceph-users] Increasing PG number
Hi, The initial PG count was not properly planned while setting up the cluster, so now there are fewer than 50 PGs per OSD. What are the best practices to increase the PG number of a pool ? We have replicated pools as well as EC pools. Or is it better to create a new pool with a higher PG number? Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] PG active+clean+remapped status
Hi, We added some more osds to the cluster and it was fixed. Karun Josy On Tue, Jan 2, 2018 at 6:21 AM, 한승진 wrote: > Are all odsd are same version? > I recently experienced similar situation. > > I upgraded all osds to exact same version and reset of pool configuration > like below > > ceph osd pool set min_size 5 > > I have 5+2 erasure code the important thing is not the number of min_size > but re-configuration I think. > I hope this help you. > > 2017. 12. 19. 오전 5:25에 "Karun Josy" 님이 작성: > > I think what happened is this : >> >> http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/ >> >> >> Note >> >> >> Sometimes, typically in a “small” cluster with few hosts (for instance >> with a small testing cluster), the fact to take out the OSD can spawn a >> CRUSH corner case where some PGs remain stuck in the active+remapped >> state >> >> Its a small cluster with unequal number of osds and one of the OSD disk >> failed and I had taken it out. >> I have already purged it, so I cannot use the reweight option mentioned >> in that link. >> >> >> So any other workarounds ? >> Will adding more disks will clear it ? >> >> Karun Josy >> >> On Mon, Dec 18, 2017 at 9:06 AM, David Turner >> wrote: >> >>> Maybe try outing the disk that should have a copy of the PG, but >>> doesn't. Then mark it back in. It might check that it has everything >>> properly and pull a copy of the data it's missing. I dunno. >>> >>> On Sun, Dec 17, 2017, 10:00 PM Karun Josy wrote: >>> >>>> Tried restarting all osds. Still no luck. >>>> >>>> Will adding a new disk to any of the server forces a rebalance and fix >>>> it? >>>> >>>> Karun Josy >>>> >>>> On Sun, Dec 17, 2017 at 12:22 PM, Cary wrote: >>>> >>>>> Karun, >>>>> >>>>> Could you paste in the output from "ceph health detail"? Which OSD >>>>> was just added? >>>>> >>>>> Cary >>>>> -Dynamic >>>>> >>>>> On Sun, Dec 17, 2017 at 4:59 AM, Karun Josy >>>>> wrote: >>>>> > Any help would be appreciated! >>>>> > >>>>> > Karun Josy >>>>> > >>>>> > On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy >>>>> wrote: >>>>> >> >>>>> >> Hi, >>>>> >> >>>>> >> Repair didnt fix the issue. >>>>> >> >>>>> >> In the pg dump details, I notice this None. Seems pg is missing >>>>> from one >>>>> >> of the OSD >>>>> >> >>>>> >> [0,2,NONE,4,12,10,5,1] >>>>> >> [0,2,1,4,12,10,5,1] >>>>> >> >>>>> >> There is no way Ceph corrects this automatically ? I have to edit/ >>>>> >> troubleshoot it manually ? >>>>> >> >>>>> >> Karun >>>>> >> >>>>> >> On Sat, Dec 16, 2017 at 10:44 PM, Cary >>>>> wrote: >>>>> >>> >>>>> >>> Karun, >>>>> >>> >>>>> >>> Running ceph pg repair should not cause any problems. It may not >>>>> fix >>>>> >>> the issue though. If that does not help, there is more information >>>>> at >>>>> >>> the link below. >>>>> >>> http://ceph.com/geen-categorie/ceph-manually-repair-object/ >>>>> >>> >>>>> >>> I recommend not rebooting, or restarting while Ceph is repairing or >>>>> >>> recovering. If possible, wait until the cluster is in a healthy >>>>> state >>>>> >>> first. >>>>> >>> >>>>> >>> Cary >>>>> >>> -Dynamic >>>>> >>> >>>>> >>> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy >>>>> wrote: >>>>> >>> > Hi Cary, >>>>> >>> > >>>>> >>> > No, I didnt try to repair it. >>>>> >>> > I am comparatively new in ceph. Is it okay to try to repair it ? >>>>> >>> > Or should I take any precautions while doing it ? >&g
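For reference, when one slot of an EC PG shows NONE as above, these read-only commands show why CRUSH could not fill the slot (3.4 was the PG in this thread):

ceph pg map 3.4
ceph pg 3.4 query

The up set is what CRUSH wants; the acting set is what currently serves I/O. With 8 hosts, a k=5 m=3 profile and failure domain host, every PG needs 8 distinct hosts, so CRUSH has almost no room to retry when its first choice is rejected — which is why adding OSDs, as done here, cleared it.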
Re: [ceph-users] Cache tiering on Erasure coded pools
Hello David, Thank you! We set up 2 pools to use EC with RBD: one ecpool and one normal replicated pool. However, would it still be advantageous to add a replicated cache tier in front of an EC one, even though it is not required anymore? I would still assume that replication would be less intensive than EC computing? Karun Josy On Wed, Dec 27, 2017 at 3:42 AM, David Turner wrote: > Please use the version of the docs for your installed version of ceph. > Note the Jewel in your URL and the Luminous in mine. In Luminous you no > longer need a cache tier to use EC with RBDs. > > http://docs.ceph.com/docs/luminous/rados/operations/cache-tiering/ > > On Tue, Dec 26, 2017, 4:21 PM Karun Josy wrote: > >> Hi, >> >> We are using erasure coded pools in a Ceph cluster for RBD images. >> Ceph version is 12.2.2 Luminous. >> >> - >> http://docs.ceph.com/docs/jewel/rados/operations/cache-tiering/ >> - >> >> Here it says we can use cache tiering in front of EC pools. >> To use erasure coding with RBD we have a replicated pool to store metadata >> and an ecpool as the data pool. >> >> Is it possible to set up cache tiering since there is already a replicated >> pool that is being used ? >> >> Karun Josy >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
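For completeness, a minimal sketch of the Luminous-native setup David refers to, assuming BlueStore OSDs, the replicated pool 'cvm' for metadata and 'ecpool' for data:

ceph osd pool set ecpool allow_ec_overwrites true
rbd create cvm/image1 --size 100G --data-pool ecpool

The image header and metadata then live in the replicated pool while the data objects go to the EC pool, so a cache tier is no longer needed just to run RBD on EC. Whether a replicated cache tier would still help performance is workload-dependent and best measured.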
[ceph-users] Cache tiering on Erasure coded pools
Hi, We are using erasure coded pools in a Ceph cluster for RBD images. Ceph version is 12.2.2 Luminous. - http://docs.ceph.com/docs/jewel/rados/operations/cache-tiering/ - Here it says we can use cache tiering in front of EC pools. To use erasure coding with RBD we have a replicated pool to store metadata and an ecpool as the data pool. Is it possible to set up cache tiering since there is already a replicated pool that is being used ? Karun Josy ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] How to evict a client in rbd
Any help is really appreciated. Karun Josy On Sun, Dec 24, 2017 at 2:18 AM, Karun Josy wrote: > Hello, > > The image is not mapped. > > # ceph --version > ceph version 12.2.1 luminous (stable) > # uname -r > 4.14.0-1.el7.elrepo.x86_64 > > > Karun Josy > > On Sat, Dec 23, 2017 at 6:51 PM, Jason Dillaman > wrote: > >> What Ceph and what kernel version are you using? Are you positive that >> the image has been unmapped from 10.255.0.17? >> >> On Fri, Dec 22, 2017 at 7:14 PM, Karun Josy wrote: >> > Hello, >> > >> > I am unable to delete this abandoned image.Rbd info shows a watcher ip >> > Image is not mapped >> > Image has no snapshots >> > >> > >> > rbd status cvm/image --id clientuser >> > Watchers: >> > watcher=10.255.0.17:0/3495340192 client.390908 >> > cookie=18446462598732841114 >> > >> > How can I evict or black list a watcher client so that image can be >> deleted >> > http://docs.ceph.com/docs/master/cephfs/eviction/ >> > I see this is possible in Cephfs >> > >> > >> > >> > Karun >> > >> > ___ >> > ceph-users mailing list >> > ceph-users@lists.ceph.com >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > >> >> >> >> -- >> Jason >> > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] How to evict a client in rbd
Hello, The image is not mapped. # ceph --version ceph version 12.2.1 luminous (stable) # uname -r 4.14.0-1.el7.elrepo.x86_64 Karun Josy On Sat, Dec 23, 2017 at 6:51 PM, Jason Dillaman wrote: > What Ceph and what kernel version are you using? Are you positive that > the image has been unmapped from 10.255.0.17? > > On Fri, Dec 22, 2017 at 7:14 PM, Karun Josy wrote: > > Hello, > > > > I am unable to delete this abandoned image.Rbd info shows a watcher ip > > Image is not mapped > > Image has no snapshots > > > > > > rbd status cvm/image --id clientuser > > Watchers: > > watcher=10.255.0.17:0/3495340192 client.390908 > > cookie=18446462598732841114 > > > > How can I evict or black list a watcher client so that image can be > deleted > > http://docs.ceph.com/docs/master/cephfs/eviction/ > > I see this is possible in Cephfs > > > > > > > > Karun > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > > > > -- > Jason > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] How to evict a client in rbd
Hello, I am unable to delete this abandoned image. Rbd info shows a watcher IP. The image is not mapped and has no snapshots. rbd status cvm/image --id clientuser Watchers: watcher=10.255.0.17:0/3495340192 client.390908 cookie=18446462598732841114 How can I evict or blacklist a watcher client so that the image can be deleted? http://docs.ceph.com/docs/master/cephfs/eviction/ I see this is possible in CephFS. Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Proper way of removing osds
Thank you! Karun Josy On Thu, Dec 21, 2017 at 3:51 PM, Konstantin Shalygin wrote: > Is this the correct way to removes OSDs, or am I doing something wrong ? >> > Generic way for maintenance (e.g. disk replace) is rebalance by change osd > weight: > > > ceph osd crush reweight osdid 0 > > cluster migrate data "from this osd" > > > When HEALTH_OK you can safe remove this OSD: > > ceph osd out osd_id > systemctl stop ceph-osd@osd_id > ceph osd crush remove osd_id > ceph auth del osd_id > ceph osd rm osd_id > > > > k > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
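Applied to a hypothetical osd.5, the sequence looks like this; the initial reweight drains the OSD while it is still up and serving, so the later removal steps cause little extra movement:

ceph osd crush reweight osd.5 0
# wait until ceph -s reports HEALTH_OK, then:
ceph osd out 5
systemctl stop ceph-osd@5
ceph osd crush remove osd.5
ceph auth del osd.5
ceph osd rm 5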
[ceph-users] Proper way of removing osds
Hi, This is how I remove an OSD from the cluster - Take it out: ceph osd out osdid Wait for the rebalancing to finish - Mark it down: ceph osd down osdid Then purge it: ceph osd purge osdid --yes-i-really-mean-it While purging I can see another rebalance occurring. Is this the correct way to remove OSDs, or am I doing something wrong ? Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] POOL_NEARFULL
Hi , That makes sense. How can I adjust the osd nearfull ratio ? I tried this, however it didnt change. $ ceph tell mon.* injectargs "--mon_osd_nearfull_ratio .86" mon.mon-a1: injectargs:mon_osd_nearfull_ratio = '0.86' (not observed, change may require restart) mon.mon-a2: injectargs:mon_osd_nearfull_ratio = '0.86' (not observed, change may require restart) mon.mon-a3: injectargs:mon_osd_nearfull_ratio = '0.86' (not observed, change may require restart) Karun Josy On Tue, Dec 19, 2017 at 10:05 PM, Jean-Charles Lopez wrote: > OK so it’s telling you that the near full OSD holds PGs for these three > pools. > > JC > > On Dec 19, 2017, at 08:05, Karun Josy wrote: > > No, I haven't. > > Interestingly, the POOL_NEARFULL flag is shown only when there is OSD_NEARFULL > flag. > I have recently upgraded to Luminous 12.2.2, haven't seen this flag in > 12.2.1 > > > > Karun Josy > > On Tue, Dec 19, 2017 at 9:27 PM, Jean-Charles Lopez > wrote: > >> Hi >> >> did you set quotas on these pools? >> >> See this page for explanation of most error messages: >> http://docs.ceph.com/docs/master/rados/operations/ >> health-checks/#pool-near-full >> >> JC >> >> On Dec 19, 2017, at 01:48, Karun Josy wrote: >> >> Hello, >> >> In one of our clusters, health is showing these warnings : >> - >> OSD_NEARFULL 1 nearfull osd(s) >> osd.22 is near full >> POOL_NEARFULL 3 pool(s) nearfull >> pool 'templates' is nearfull >> pool 'cvm' is nearfull >> pool 'ecpool' is nearfull >> >> >> One osd is above 85% used, which I know caused the OSD_Nearfull flag. >> But what does pool(s) nearfull mean ? >> And how can I correct it ? >> >> ]$ ceph df >> GLOBAL: >> SIZE AVAIL RAW USED %RAW USED >> 31742G 11147G 20594G 64.88 >> POOLS: >> NAMEID USED %USED MAX AVAIL OBJECTS >> templates 5196G 23.28 645G 50202 >> cvm 66528 0 1076G 770 >> ecpool 7 10260G 83.56 2018G 3004031 >> >> >> >> Karun >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> >> >> > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
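The injectargs route does not take effect here because, since Luminous, the full/nearfull ratios live in the OSDMap rather than in the monitor configuration; mon_osd_nearfull_ratio is only consulted when the cluster is first created. A sketch of the map-level command, assuming 0.86 is the intended threshold:

ceph osd set-nearfull-ratio 0.86
ceph osd dump | grep ratio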
Re: [ceph-users] POOL_NEARFULL
No, I haven't. Interestingly, the POOL_NEARFULL flag is shown only when there is OSD_NEARFULL flag. I have recently upgraded to Luminous 12.2.2, haven't seen this flag in 12.2.1 Karun Josy On Tue, Dec 19, 2017 at 9:27 PM, Jean-Charles Lopez wrote: > Hi > > did you set quotas on these pools? > > See this page for explanation of most error messages: http://docs.ceph. > com/docs/master/rados/operations/health-checks/#pool-near-full > > JC > > On Dec 19, 2017, at 01:48, Karun Josy wrote: > > Hello, > > In one of our clusters, health is showing these warnings : > - > OSD_NEARFULL 1 nearfull osd(s) > osd.22 is near full > POOL_NEARFULL 3 pool(s) nearfull > pool 'templates' is nearfull > pool 'cvm' is nearfull > pool 'ecpool' is nearfull > > > One osd is above 85% used, which I know caused the OSD_Nearfull flag. > But what does pool(s) nearfull mean ? > And how can I correct it ? > > ]$ ceph df > GLOBAL: > SIZE AVAIL RAW USED %RAW USED > 31742G 11147G 20594G 64.88 > POOLS: > NAMEID USED %USED MAX AVAIL OBJECTS > templates 5196G 23.28 645G 50202 > cvm 66528 0 1076G 770 > ecpool 7 10260G 83.56 2018G 3004031 > > > > Karun > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] POOL_NEARFULL
Hello, In one of our clusters, health is showing these warnings : - OSD_NEARFULL 1 nearfull osd(s) osd.22 is near full POOL_NEARFULL 3 pool(s) nearfull pool 'templates' is nearfull pool 'cvm' is nearfull pool 'ecpool' is nearfull One osd is above 85% used, which I know caused the OSD_Nearfull flag. But what does pool(s) nearfull mean ? And how can I correct it ? ]$ ceph df GLOBAL: SIZE AVAIL RAW USED %RAW USED 31742G 11147G 20594G 64.88 POOLS: NAMEID USED %USED MAX AVAIL OBJECTS templates 5196G 23.28 645G 50202 cvm 66528 0 1076G 770 ecpool 7 10260G 83.56 2018G 3004031 Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] PG active+clean+remapped status
I think what happened is this : http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/ Note Sometimes, typically in a “small” cluster with few hosts (for instance with a small testing cluster), the fact to take out the OSD can spawn a CRUSH corner case where some PGs remain stuck in the active+remapped state Its a small cluster with unequal number of osds and one of the OSD disk failed and I had taken it out. I have already purged it, so I cannot use the reweight option mentioned in that link. So any other workarounds ? Will adding more disks will clear it ? Karun Josy On Mon, Dec 18, 2017 at 9:06 AM, David Turner wrote: > Maybe try outing the disk that should have a copy of the PG, but doesn't. > Then mark it back in. It might check that it has everything properly and > pull a copy of the data it's missing. I dunno. > > On Sun, Dec 17, 2017, 10:00 PM Karun Josy wrote: > >> Tried restarting all osds. Still no luck. >> >> Will adding a new disk to any of the server forces a rebalance and fix it? >> >> Karun Josy >> >> On Sun, Dec 17, 2017 at 12:22 PM, Cary wrote: >> >>> Karun, >>> >>> Could you paste in the output from "ceph health detail"? Which OSD >>> was just added? >>> >>> Cary >>> -Dynamic >>> >>> On Sun, Dec 17, 2017 at 4:59 AM, Karun Josy >>> wrote: >>> > Any help would be appreciated! >>> > >>> > Karun Josy >>> > >>> > On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy >>> wrote: >>> >> >>> >> Hi, >>> >> >>> >> Repair didnt fix the issue. >>> >> >>> >> In the pg dump details, I notice this None. Seems pg is missing from >>> one >>> >> of the OSD >>> >> >>> >> [0,2,NONE,4,12,10,5,1] >>> >> [0,2,1,4,12,10,5,1] >>> >> >>> >> There is no way Ceph corrects this automatically ? I have to edit/ >>> >> troubleshoot it manually ? >>> >> >>> >> Karun >>> >> >>> >> On Sat, Dec 16, 2017 at 10:44 PM, Cary >>> wrote: >>> >>> >>> >>> Karun, >>> >>> >>> >>> Running ceph pg repair should not cause any problems. It may not fix >>> >>> the issue though. If that does not help, there is more information at >>> >>> the link below. >>> >>> http://ceph.com/geen-categorie/ceph-manually-repair-object/ >>> >>> >>> >>> I recommend not rebooting, or restarting while Ceph is repairing or >>> >>> recovering. If possible, wait until the cluster is in a healthy state >>> >>> first. >>> >>> >>> >>> Cary >>> >>> -Dynamic >>> >>> >>> >>> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy >>> wrote: >>> >>> > Hi Cary, >>> >>> > >>> >>> > No, I didnt try to repair it. >>> >>> > I am comparatively new in ceph. Is it okay to try to repair it ? >>> >>> > Or should I take any precautions while doing it ? >>> >>> > >>> >>> > Karun Josy >>> >>> > >>> >>> > On Sat, Dec 16, 2017 at 2:08 PM, Cary >>> wrote: >>> >>> >> >>> >>> >> Karun, >>> >>> >> >>> >>> >> Did you attempt a "ceph pg repair "? Replace with >>> the pg >>> >>> >> ID that needs repaired, 3.4. >>> >>> >> >>> >>> >> Cary >>> >>> >> -D123 >>> >>> >> >>> >>> >> On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy >> > >>> >>> >> wrote: >>> >>> >> > Hello, >>> >>> >> > >>> >>> >> > I added 1 disk to the cluster and after rebalancing, it shows 1 >>> PG >>> >>> >> > is in >>> >>> >> > remapped state. How can I correct it ? >>> >>> >> > >>> >>> >> > (I had to restart some osds during the rebalancing as there were >>> >>> >> > some >>> >>> >> > slow >>> >>> >> > requests) >>> >>> >> &
Re: [ceph-users] PG active+clean+remapped status
Tried restarting all osds. Still no luck. Will adding a new disk to any of the server forces a rebalance and fix it? Karun Josy On Sun, Dec 17, 2017 at 12:22 PM, Cary wrote: > Karun, > > Could you paste in the output from "ceph health detail"? Which OSD > was just added? > > Cary > -Dynamic > > On Sun, Dec 17, 2017 at 4:59 AM, Karun Josy wrote: > > Any help would be appreciated! > > > > Karun Josy > > > > On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy > wrote: > >> > >> Hi, > >> > >> Repair didnt fix the issue. > >> > >> In the pg dump details, I notice this None. Seems pg is missing from one > >> of the OSD > >> > >> [0,2,NONE,4,12,10,5,1] > >> [0,2,1,4,12,10,5,1] > >> > >> There is no way Ceph corrects this automatically ? I have to edit/ > >> troubleshoot it manually ? > >> > >> Karun > >> > >> On Sat, Dec 16, 2017 at 10:44 PM, Cary wrote: > >>> > >>> Karun, > >>> > >>> Running ceph pg repair should not cause any problems. It may not fix > >>> the issue though. If that does not help, there is more information at > >>> the link below. > >>> http://ceph.com/geen-categorie/ceph-manually-repair-object/ > >>> > >>> I recommend not rebooting, or restarting while Ceph is repairing or > >>> recovering. If possible, wait until the cluster is in a healthy state > >>> first. > >>> > >>> Cary > >>> -Dynamic > >>> > >>> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy > wrote: > >>> > Hi Cary, > >>> > > >>> > No, I didnt try to repair it. > >>> > I am comparatively new in ceph. Is it okay to try to repair it ? > >>> > Or should I take any precautions while doing it ? > >>> > > >>> > Karun Josy > >>> > > >>> > On Sat, Dec 16, 2017 at 2:08 PM, Cary > wrote: > >>> >> > >>> >> Karun, > >>> >> > >>> >> Did you attempt a "ceph pg repair "? Replace with the > pg > >>> >> ID that needs repaired, 3.4. > >>> >> > >>> >> Cary > >>> >> -D123 > >>> >> > >>> >> On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy > >>> >> wrote: > >>> >> > Hello, > >>> >> > > >>> >> > I added 1 disk to the cluster and after rebalancing, it shows 1 PG > >>> >> > is in > >>> >> > remapped state. How can I correct it ? > >>> >> > > >>> >> > (I had to restart some osds during the rebalancing as there were > >>> >> > some > >>> >> > slow > >>> >> > requests) > >>> >> > > >>> >> > $ ceph pg dump | grep remapped > >>> >> > dumped all > >>> >> > 3.4 981 00 0 0 > >>> >> > 2655009792 > >>> >> > 1535 1535 active+clean+remapped 2017-12-15 22:07:21.663964 > >>> >> > 2824'785115 > >>> >> > 2824:2297888 [0,2,NONE,4,12,10,5,1] 0 > [0,2,1,4,12,10,5,1] > >>> >> > 0 2288'767367 2017-12-14 11:00:15.576741 417'518549 > 2017-12-08 > >>> >> > 03:56:14.006982 > >>> >> > > >>> >> > That PG belongs to an erasure pool with k=5, m =3 profile, failure > >>> >> > domain is > >>> >> > host. > >>> >> > > >>> >> > === > >>> >> > > >>> >> > $ ceph osd tree > >>> >> > ID CLASS WEIGHT TYPE NAMESTATUS REWEIGHT > PRI-AFF > >>> >> > -1 16.94565 root default > >>> >> > -32.73788 host ceph-a1 > >>> >> > 0 ssd 1.86469 osd.0up 1.0 > 1.0 > >>> >> > 14 ssd 0.87320 osd.14 up 1.0 > 1.0 > >>> >> > -52.73788 host ceph-a2 > >>> >> > 1 ssd 1.86469 osd.1up 1.0 > 1.0 > >>> >> > 15 ssd 0.87320
Re: [ceph-users] Adding new host
Hi David, Thank you for your response. Failure domain for ec profile is 'host'. So I guess it is okay to add a node and activate 5 disks at a time ? $ ceph osd erasure-code-profile get profile5by3 crush-device-class= crush-failure-domain=host crush-root=default jerasure-per-chunk-alignment=false k=5 m=3 plugin=jerasure technique=reed_sol_van w=8 Karun Josy On Sun, Dec 17, 2017 at 11:26 PM, David Turner wrote: > I like to avoid adding disks from more than 1 failure domain at a time in > case some of the new disks are bad. In your example of only adding 1 new > node, I would say that adding all of the disks at the same time is the > better way to do it. > > Adding only 1 disk in the new node at a time would actually be worse for > the balance of the cluster as it would only have 1 disk while the rest have > all 5 or more. > > The EC profile shouldn't play into account as you already have enough > hosts to fulfill it. > > On Sun, Dec 17, 2017, 11:57 AM Karun Josy wrote: > >> Hi, >> >> We have a live cluster with 8 OSD nodes all having 5-6 disks each. >> >> We would like to add a new host and expand the cluster. >> >> We have 4 pools >> - 3 replicated pools with replication factor 5 and 3 >> - 1 erasure coded pool with k=5, m=3 >> >> So my concern is, is there any precautions that are needed to add the new >> host since the ec profile is 5+3. >> >> And can we add multiple disks at the same time in the new host ? Or >> should it be 1 at a time ? >> >> >> >> Karun >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
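One thing worth doing before activating several disks at once is capping backfill so client I/O stays responsive while the new host fills; a conservative sketch:

ceph tell osd.* injectargs '--osd_max_backfills 1'
ceph tell osd.* injectargs '--osd_recovery_max_active 1'

The values can be raised again once the rebalance is mostly done.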
[ceph-users] Adding new host
Hi, We have a live cluster with 8 OSD nodes, each having 5-6 disks. We would like to add a new host and expand the cluster. We have 4 pools - 3 replicated pools with replication factors 5 and 3 - 1 erasure coded pool with k=5, m=3 So my concern is: are there any precautions needed when adding the new host, since the EC profile is 5+3? And can we add multiple disks at the same time in the new host, or should it be 1 at a time ? Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] PG active+clean+remapped status
Any help would be appreciated! Karun Josy On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy wrote: > Hi, > > Repair didnt fix the issue. > > In the pg dump details, I notice this None. Seems pg is missing from one > of the OSD > > [0,2,NONE,4,12,10,5,1] > [0,2,1,4,12,10,5,1] > > There is no way Ceph corrects this automatically ? I have to edit/ > troubleshoot it manually ? > > Karun > > On Sat, Dec 16, 2017 at 10:44 PM, Cary wrote: > >> Karun, >> >> Running ceph pg repair should not cause any problems. It may not fix >> the issue though. If that does not help, there is more information at >> the link below. >> http://ceph.com/geen-categorie/ceph-manually-repair-object/ >> >> I recommend not rebooting, or restarting while Ceph is repairing or >> recovering. If possible, wait until the cluster is in a healthy state >> first. >> >> Cary >> -Dynamic >> >> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy wrote: >> > Hi Cary, >> > >> > No, I didnt try to repair it. >> > I am comparatively new in ceph. Is it okay to try to repair it ? >> > Or should I take any precautions while doing it ? >> > >> > Karun Josy >> > >> > On Sat, Dec 16, 2017 at 2:08 PM, Cary wrote: >> >> >> >> Karun, >> >> >> >> Did you attempt a "ceph pg repair "? Replace with the pg >> >> ID that needs repaired, 3.4. >> >> >> >> Cary >> >> -D123 >> >> >> >> On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy >> wrote: >> >> > Hello, >> >> > >> >> > I added 1 disk to the cluster and after rebalancing, it shows 1 PG >> is in >> >> > remapped state. How can I correct it ? >> >> > >> >> > (I had to restart some osds during the rebalancing as there were some >> >> > slow >> >> > requests) >> >> > >> >> > $ ceph pg dump | grep remapped >> >> > dumped all >> >> > 3.4 981 00 0 0 >> 2655009792 >> >> > 1535 1535 active+clean+remapped 2017-12-15 22:07:21.663964 >> >> > 2824'785115 >> >> > 2824:2297888 [0,2,NONE,4,12,10,5,1] 0 [0,2,1,4,12,10,5,1] >> >> > 0 2288'767367 2017-12-14 11:00:15.576741 417'518549 2017-12-08 >> >> > 03:56:14.006982 >> >> > >> >> > That PG belongs to an erasure pool with k=5, m =3 profile, failure >> >> > domain is >> >> > host. >> >> > >> >> > === >> >> > >> >> > $ ceph osd tree >> >> > ID CLASS WEIGHT TYPE NAMESTATUS REWEIGHT PRI-AFF >> >> > -1 16.94565 root default >> >> > -32.73788 host ceph-a1 >> >> > 0 ssd 1.86469 osd.0up 1.0 1.0 >> >> > 14 ssd 0.87320 osd.14 up 1.0 1.0 >> >> > -52.73788 host ceph-a2 >> >> > 1 ssd 1.86469 osd.1up 1.0 1.0 >> >> > 15 ssd 0.87320 osd.15 up 1.0 1.0 >> >> > -71.86469 host ceph-a3 >> >> > 2 ssd 1.86469 osd.2up 1.0 1.0 >> >> > -91.74640 host ceph-a4 >> >> > 3 ssd 0.87320 osd.3up 1.0 1.0 >> >> > 4 ssd 0.87320 osd.4up 1.0 1.0 >> >> > -111.74640 host ceph-a5 >> >> > 5 ssd 0.87320 osd.5up 1.0 1.0 >> >> > 6 ssd 0.87320 osd.6up 1.0 1.0 >> >> > -131.74640 host ceph-a6 >> >> > 7 ssd 0.87320 osd.7up 1.0 1.0 >> >> > 8 ssd 0.87320 osd.8up 1.0 1.0 >> >> > -151.74640 host ceph-a7 >> >> > 9 ssd 0.87320 osd.9up 1.0 1.0 >> >> > 10 ssd 0.87320 osd.10 up 1.0 1.0 >> >> > -172.61960 host ceph-a8 >> >> > 11 ssd 0.87320 osd.11 up 1.0 1.0 >> >> > 12 ssd 0.87320 osd.12 up 1.0 1.0 >> >> > 13 ssd 0.87320 osd.13 up 1.0 1.0 >> >> > >> >> > >> >> > >> >> > Karun >> >> > >> >> > ___ >> >> > ceph-users mailing list >> >> > ceph-users@lists.ceph.com >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> >> > >> > >> > >> > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] PG active+clean+remapped status
Hi, Repair didnt fix the issue. In the pg dump details, I notice this None. Seems pg is missing from one of the OSD [0,2,NONE,4,12,10,5,1] [0,2,1,4,12,10,5,1] There is no way Ceph corrects this automatically ? I have to edit/ troubleshoot it manually ? Karun On Sat, Dec 16, 2017 at 10:44 PM, Cary wrote: > Karun, > > Running ceph pg repair should not cause any problems. It may not fix > the issue though. If that does not help, there is more information at > the link below. > http://ceph.com/geen-categorie/ceph-manually-repair-object/ > > I recommend not rebooting, or restarting while Ceph is repairing or > recovering. If possible, wait until the cluster is in a healthy state > first. > > Cary > -Dynamic > > On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy wrote: > > Hi Cary, > > > > No, I didnt try to repair it. > > I am comparatively new in ceph. Is it okay to try to repair it ? > > Or should I take any precautions while doing it ? > > > > Karun Josy > > > > On Sat, Dec 16, 2017 at 2:08 PM, Cary wrote: > >> > >> Karun, > >> > >> Did you attempt a "ceph pg repair "? Replace with the pg > >> ID that needs repaired, 3.4. > >> > >> Cary > >> -D123 > >> > >> On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy > wrote: > >> > Hello, > >> > > >> > I added 1 disk to the cluster and after rebalancing, it shows 1 PG is > in > >> > remapped state. How can I correct it ? > >> > > >> > (I had to restart some osds during the rebalancing as there were some > >> > slow > >> > requests) > >> > > >> > $ ceph pg dump | grep remapped > >> > dumped all > >> > 3.4 981 00 0 0 > 2655009792 > >> > 1535 1535 active+clean+remapped 2017-12-15 22:07:21.663964 > >> > 2824'785115 > >> > 2824:2297888 [0,2,NONE,4,12,10,5,1] 0 [0,2,1,4,12,10,5,1] > >> > 0 2288'767367 2017-12-14 11:00:15.576741 417'518549 2017-12-08 > >> > 03:56:14.006982 > >> > > >> > That PG belongs to an erasure pool with k=5, m =3 profile, failure > >> > domain is > >> > host. > >> > > >> > === > >> > > >> > $ ceph osd tree > >> > ID CLASS WEIGHT TYPE NAMESTATUS REWEIGHT PRI-AFF > >> > -1 16.94565 root default > >> > -32.73788 host ceph-a1 > >> > 0 ssd 1.86469 osd.0up 1.0 1.0 > >> > 14 ssd 0.87320 osd.14 up 1.0 1.0 > >> > -52.73788 host ceph-a2 > >> > 1 ssd 1.86469 osd.1up 1.0 1.0 > >> > 15 ssd 0.87320 osd.15 up 1.0 1.0 > >> > -71.86469 host ceph-a3 > >> > 2 ssd 1.86469 osd.2up 1.0 1.0 > >> > -91.74640 host ceph-a4 > >> > 3 ssd 0.87320 osd.3up 1.0 1.0 > >> > 4 ssd 0.87320 osd.4up 1.0 1.0 > >> > -111.74640 host ceph-a5 > >> > 5 ssd 0.87320 osd.5up 1.0 1.0 > >> > 6 ssd 0.87320 osd.6up 1.0 1.0 > >> > -131.74640 host ceph-a6 > >> > 7 ssd 0.87320 osd.7up 1.0 1.0 > >> > 8 ssd 0.87320 osd.8up 1.0 1.0 > >> > -151.74640 host ceph-a7 > >> > 9 ssd 0.87320 osd.9up 1.0 1.0 > >> > 10 ssd 0.87320 osd.10 up 1.0 1.0 > >> > -172.61960 host ceph-a8 > >> > 11 ssd 0.87320 osd.11 up 1.0 1.0 > >> > 12 ssd 0.87320 osd.12 up 1.0 1.0 > >> > 13 ssd 0.87320 osd.13 up 1.0 1.0 > >> > > >> > > >> > > >> > Karun > >> > > >> > ___ > >> > ceph-users mailing list > >> > ceph-users@lists.ceph.com > >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > >> > > > > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] PG active+clean+remapped status
Hi Cary, No, I didnt try to repair it. I am comparatively new in ceph. Is it okay to try to repair it ? Or should I take any precautions while doing it ? Karun Josy On Sat, Dec 16, 2017 at 2:08 PM, Cary wrote: > Karun, > > Did you attempt a "ceph pg repair "? Replace with the pg > ID that needs repaired, 3.4. > > Cary > -D123 > > On Sat, Dec 16, 2017 at 8:24 AM, Karun Josy wrote: > > Hello, > > > > I added 1 disk to the cluster and after rebalancing, it shows 1 PG is in > > remapped state. How can I correct it ? > > > > (I had to restart some osds during the rebalancing as there were some > slow > > requests) > > > > $ ceph pg dump | grep remapped > > dumped all > > 3.4 981 00 0 0 2655009792 > > 1535 1535 active+clean+remapped 2017-12-15 22:07:21.663964 > 2824'785115 > > 2824:2297888 [0,2,NONE,4,12,10,5,1] 0 [0,2,1,4,12,10,5,1] > > 0 2288'767367 2017-12-14 11:00:15.576741 417'518549 2017-12-08 > > 03:56:14.006982 > > > > That PG belongs to an erasure pool with k=5, m =3 profile, failure > domain is > > host. > > > > === > > > > $ ceph osd tree > > ID CLASS WEIGHT TYPE NAMESTATUS REWEIGHT PRI-AFF > > -1 16.94565 root default > > -32.73788 host ceph-a1 > > 0 ssd 1.86469 osd.0up 1.0 1.0 > > 14 ssd 0.87320 osd.14 up 1.0 1.0 > > -52.73788 host ceph-a2 > > 1 ssd 1.86469 osd.1up 1.0 1.0 > > 15 ssd 0.87320 osd.15 up 1.0 1.0 > > -71.86469 host ceph-a3 > > 2 ssd 1.86469 osd.2up 1.0 1.0 > > -91.74640 host ceph-a4 > > 3 ssd 0.87320 osd.3up 1.0 1.0 > > 4 ssd 0.87320 osd.4up 1.0 1.0 > > -111.74640 host ceph-a5 > > 5 ssd 0.87320 osd.5up 1.0 1.0 > > 6 ssd 0.87320 osd.6up 1.0 1.0 > > -131.74640 host ceph-a6 > > 7 ssd 0.87320 osd.7up 1.0 1.0 > > 8 ssd 0.87320 osd.8up 1.0 1.0 > > -151.74640 host ceph-a7 > > 9 ssd 0.87320 osd.9up 1.0 1.0 > > 10 ssd 0.87320 osd.10 up 1.0 1.0 > > -172.61960 host ceph-a8 > > 11 ssd 0.87320 osd.11 up 1.0 1.0 > > 12 ssd 0.87320 osd.12 up 1.0 1.0 > > 13 ssd 0.87320 osd.13 up 1.0 1.0 > > > > > > > > Karun > > > > ___ > > ceph-users mailing list > > ceph-users@lists.ceph.com > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] PG active+clean+remapped status
Hello, I added 1 disk to the cluster and after rebalancing, it shows 1 PG is in remapped state. How can I correct it ? (I had to restart some osds during the rebalancing as there were some slow requests) $ ceph pg dump | grep remapped dumped all 3.4 981 00 0 0 2655009792 1535 1535 active+clean+remapped 2017-12-15 22:07:21.663964 2824'785115 2824:2297888 [0,2,NONE,4,12,10,5,1] 0 [0,2,1,4,12,10,5,1] 0 2288'767367 2017-12-14 11:00:15.576741 417'518549 2017-12-08 03:56:14.006982 That PG belongs to an erasure pool with k=5, m =3 profile, failure domain is host. === $ ceph osd tree ID CLASS WEIGHT TYPE NAMESTATUS REWEIGHT PRI-AFF -1 16.94565 root default -32.73788 host ceph-a1 0 ssd 1.86469 osd.0up 1.0 1.0 14 ssd 0.87320 osd.14 up 1.0 1.0 -52.73788 host ceph-a2 1 ssd 1.86469 osd.1up 1.0 1.0 15 ssd 0.87320 osd.15 up 1.0 1.0 -71.86469 host ceph-a3 2 ssd 1.86469 osd.2up 1.0 1.0 -91.74640 host ceph-a4 3 ssd 0.87320 osd.3up 1.0 1.0 4 ssd 0.87320 osd.4up 1.0 1.0 -111.74640 host ceph-a5 5 ssd 0.87320 osd.5up 1.0 1.0 6 ssd 0.87320 osd.6up 1.0 1.0 -131.74640 host ceph-a6 7 ssd 0.87320 osd.7up 1.0 1.0 8 ssd 0.87320 osd.8up 1.0 1.0 -151.74640 host ceph-a7 9 ssd 0.87320 osd.9up 1.0 1.0 10 ssd 0.87320 osd.10 up 1.0 1.0 -172.61960 host ceph-a8 11 ssd 0.87320 osd.11 up 1.0 1.0 12 ssd 0.87320 osd.12 up 1.0 1.0 13 ssd 0.87320 osd.13 up 1.0 1.0 Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Health Error : Request Stuck
Hi Nick, Finally, was able to correct the issue! We found that there were many slow requests in ceph health detail. And found that some osds were slowing the cluster down. Initially the cluster was unusable when there were 10 PGs with "activating+remapped" status and slow requests. Slow requests were mainly on 2 osds. And we restarted osd daemons one by one, which cleared the block requests. And that made the cluster reusable. However, there were 4 PGs still in inactive state. So I took down one of the osd with slow requests for some time, and allowed the cluster to rebalance. And it worked! To be honest, not exactly sure its the correct way. P.S : I had upgraded to Luminous 12.2.2 yesterday. Karun Josy On Wed, Dec 13, 2017 at 4:31 PM, Nick Fisk wrote: > Hi Karun, > > > > I too am experiencing something very similar with a PG stuck in > activating+remapped state after re-introducing a OSD back into the cluster > as Bluestore. Although this new OSD is not the one listed against the PG’s > stuck activating. I also see the same thing as you where the up set is > different to the acting set. > > > > Can I just ask what ceph version you are running and the output of ceph > osd tree? > > > > *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf > Of *Karun Josy > *Sent:* 13 December 2017 07:06 > *To:* ceph-users > *Subject:* Re: [ceph-users] Health Error : Request Stuck > > > > Cluster is unusable because of inactive PGs. How can we correct it? > > > > = > > ceph pg dump_stuck inactive > > ok > > PG_STAT STATE UP UP_PRIMARY ACTING > ACTING_PRIMARY > > 1.4bactivating+remapped [5,2,0,13,1] 5 [5,2,13,1,4] > 5 > > 1.35activating+remapped [2,7,0,1,12] 2 [2,7,1,12,9] > 2 > > 1.12activating+remapped [1,3,5,0,7] 1 [1,3,5,7,2] > 1 > > 1.4eactivating+remapped [1,3,0,9,2] 1 [1,3,0,9,5] > 1 > > 2.3bactivating+remapped [13,1,0] 13 [13,1,2] >13 > > 1.19activating+remapped [2,13,8,9,0] 2 [2,13,8,9,1] > 2 > > 1.1eactivating+remapped [2,3,1,10,0] 2 [2,3,1,10,5] > 2 > > 2.29activating+remapped [1,0,13] 1 [1,8,11] > 1 > > 1.6factivating+remapped [8,2,0,4,13] 8 [8,2,4,13,1] > 8 > > 1.74activating+remapped [7,13,2,0,4] 7 [7,13,2,4,1] > 7 > > > > > Karun Josy > > > > On Wed, Dec 13, 2017 at 8:27 AM, Karun Josy wrote: > > Hello, > > > > We added a new disk to the cluster and while rebalancing we are getting > error warnings. > > > > = > > Overall status: HEALTH_ERR > > REQUEST_SLOW: 1824 slow requests are blocked > 32 sec > > REQUEST_STUCK: 1022 stuck requests are blocked > 4096 sec > > == > > > > The load in the servers seems to be very low. > > > > How can I correct it? > > > > > > Karun > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
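For anyone debugging the same symptom, the admin socket shows exactly what a suspect OSD is blocked on; a sketch for a hypothetical osd.12:

ceph health detail | grep -i slow
ceph daemon osd.12 dump_ops_in_flight
ceph daemon osd.12 dump_historic_ops

The historic ops dump carries per-event timestamps for each op, which makes it easier to tell whether the delay is in the local disk, the network, or a peer OSD.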
Re: [ceph-users] Health Error : Request Stuck
Cluster is unusable because of inactive PGs. How can we correct it? = ceph pg dump_stuck inactive ok PG_STAT STATE UP UP_PRIMARY ACTING ACTING_PRIMARY 1.4bactivating+remapped [5,2,0,13,1] 5 [5,2,13,1,4] 5 1.35activating+remapped [2,7,0,1,12] 2 [2,7,1,12,9] 2 1.12activating+remapped [1,3,5,0,7] 1 [1,3,5,7,2] 1 1.4eactivating+remapped [1,3,0,9,2] 1 [1,3,0,9,5] 1 2.3bactivating+remapped [13,1,0] 13 [13,1,2] 13 1.19activating+remapped [2,13,8,9,0] 2 [2,13,8,9,1] 2 1.1eactivating+remapped [2,3,1,10,0] 2 [2,3,1,10,5] 2 2.29activating+remapped [1,0,13] 1 [1,8,11] 1 1.6factivating+remapped [8,2,0,4,13] 8 [8,2,4,13,1] 8 1.74activating+remapped [7,13,2,0,4] 7 [7,13,2,4,1] 7 Karun Josy On Wed, Dec 13, 2017 at 8:27 AM, Karun Josy wrote: > Hello, > > We added a new disk to the cluster and while rebalancing we are getting > error warnings. > > = > Overall status: HEALTH_ERR > REQUEST_SLOW: 1824 slow requests are blocked > 32 sec > REQUEST_STUCK: 1022 stuck requests are blocked > 4096 sec > == > > The load in the servers seems to be very low. > > How can I correct it? > > > Karun > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Health Error : Request Stuck
Hello, We added a new disk to the cluster, and while it rebalances we are getting health errors. = Overall status: HEALTH_ERR REQUEST_SLOW: 1824 slow requests are blocked > 32 sec REQUEST_STUCK: 1022 stuck requests are blocked > 4096 sec == The load on the servers seems to be very low. How can I correct it? Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] HEALTH_ERR : PG_DEGRADED_FULL
Hi Lars, Sean, Thank you for your response. The cluster health is ok now! :) Karun Josy On Thu, Dec 7, 2017 at 3:35 PM, Sean Redmond wrote: > Can you share - ceph osd tree / crushmap and `ceph health detail` via > pastebin? > > Is recovery stuck or it is on going? > > On 7 Dec 2017 07:06, "Karun Josy" wrote: > >> Hello, >> >> I am seeing health error in our production cluster. >> >> health: HEALTH_ERR >> 1105420/11038158 objects misplaced (10.015%) >> Degraded data redundancy: 2046/11038158 objects degraded >> (0.019%), 102 pgs unclean, 2 pgs degraded >> Degraded data redundancy (low space): 4 pgs backfill_toofull >> >> The cluster space was running out. >> So I was in the process of adding a disk. >> Since I got this error, we deleted some of the data to create more space. >> >> >> This is the current usage, after clearing some space, earlier 3 disks >> were at 85%. >> >> >> $ ceph osd df >> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS >> 0 ssd 1.86469 1.0 1909G 851G 1058G 44.59 0.78 265 >> 16 ssd 0.87320 1.0 894G 361G 532G 40.43 0.71 112 >> 1 ssd 0.87320 1.0 894G 586G 307G 65.57 1.15 163 >> 2 ssd 0.87320 1.0 894G 490G 403G 54.84 0.96 145 >> 17 ssd 0.87320 1.0 894G 163G 731G 18.24 0.32 58 >> 3 ssd 0.87320 1.0 894G 616G 277G 68.98 1.21 176 >> 4 ssd 0.87320 1.0 894G 593G 300G 66.42 1.17 179 >> 5 ssd 0.87320 1.0 894G 419G 474G 46.89 0.82 130 >> 6 ssd 0.87320 1.0 894G 422G 472G 47.21 0.83 129 >> 7 ssd 0.87320 1.0 894G 397G 496G 44.50 0.78 115 >> 8 ssd 0.87320 1.0 894G 656G 237G 73.44 1.29 184 >> 9 ssd 0.87320 1.0 894G 560G 333G 62.72 1.10 170 >> 10 ssd 0.87320 1.0 894G 623G 270G 69.78 1.22 183 >> 11 ssd 0.87320 1.0 894G 586G 307G 65.57 1.15 172 >> 12 ssd 0.87320 1.0 894G 610G 283G 68.29 1.20 172 >> 13 ssd 0.87320 1.0 894G 597G 296G 66.87 1.17 180 >> 14 ssd 0.87320 1.0 894G 597G 296G 66.79 1.17 168 >> 15 ssd 0.87320 1.0 894G 610G 283G 68.32 1.20 179 >> TOTAL 17110G 9746G 7363G 56.97 >> >> How to fix this? Please help! >> >> Karun >> >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> >> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
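When one OSD runs much hotter than the rest, reweighting is the usual fix short of adding capacity; a sketch that previews the change before applying it:

ceph osd test-reweight-by-utilization 120
ceph osd reweight-by-utilization 120

The argument is a utilization threshold as a percentage of the cluster average; only OSDs above it get their reweight lowered. A single OSD can also be adjusted directly, e.g. ceph osd reweight 22 0.9.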
[ceph-users] HEALTH_ERR : PG_DEGRADED_FULL
Hello, I am seeing health error in our production cluster. health: HEALTH_ERR 1105420/11038158 objects misplaced (10.015%) Degraded data redundancy: 2046/11038158 objects degraded (0.019%), 102 pgs unclean, 2 pgs degraded Degraded data redundancy (low space): 4 pgs backfill_toofull The cluster space was running out. So I was in the process of adding a disk. Since I got this error, we deleted some of the data to create more space. This is the current usage, after clearing some space, earlier 3 disks were at 85%. $ ceph osd df ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS 0 ssd 1.86469 1.0 1909G 851G 1058G 44.59 0.78 265 16 ssd 0.87320 1.0 894G 361G 532G 40.43 0.71 112 1 ssd 0.87320 1.0 894G 586G 307G 65.57 1.15 163 2 ssd 0.87320 1.0 894G 490G 403G 54.84 0.96 145 17 ssd 0.87320 1.0 894G 163G 731G 18.24 0.32 58 3 ssd 0.87320 1.0 894G 616G 277G 68.98 1.21 176 4 ssd 0.87320 1.0 894G 593G 300G 66.42 1.17 179 5 ssd 0.87320 1.0 894G 419G 474G 46.89 0.82 130 6 ssd 0.87320 1.0 894G 422G 472G 47.21 0.83 129 7 ssd 0.87320 1.0 894G 397G 496G 44.50 0.78 115 8 ssd 0.87320 1.0 894G 656G 237G 73.44 1.29 184 9 ssd 0.87320 1.0 894G 560G 333G 62.72 1.10 170 10 ssd 0.87320 1.0 894G 623G 270G 69.78 1.22 183 11 ssd 0.87320 1.0 894G 586G 307G 65.57 1.15 172 12 ssd 0.87320 1.0 894G 610G 283G 68.29 1.20 172 13 ssd 0.87320 1.0 894G 597G 296G 66.87 1.17 180 14 ssd 0.87320 1.0 894G 597G 296G 66.79 1.17 168 15 ssd 0.87320 1.0 894G 610G 283G 68.32 1.20 179 TOTAL 17110G 9746G 7363G 56.97 How to fix this? Please help! Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Adding multiple OSD
Thank you for detailed explanation! Got one another doubt, This is the total space available in the cluster : TOTAL : 23490G Use : 10170G Avail : 13320G But ecpool shows max avail as just 3 TB. What am I missing ? == $ ceph df GLOBAL: SIZE AVAIL RAW USED %RAW USED 23490G 13338G 10151G 43.22 POOLS: NAMEID USED %USED MAX AVAIL OBJECTS ostemplates 1 162G 2.79 1134G 42084 imagepool 34 122G 2.11 1891G 34196 cvm154 8058 0 1891G 950 ecpool1 55 4246G 42.77 3546G 1232590 $ ceph osd df ID CLASS WEIGHT REWEIGHT SIZE USEAVAIL %USE VAR PGS 0 ssd 1.86469 1.0 1909G 625G 1284G 32.76 0.76 201 1 ssd 1.86469 1.0 1909G 691G 1217G 36.23 0.84 208 2 ssd 0.87320 1.0 894G 587G 306G 65.67 1.52 156 11 ssd 0.87320 1.0 894G 631G 262G 70.68 1.63 186 3 ssd 0.87320 1.0 894G 605G 288G 67.73 1.56 165 14 ssd 0.87320 1.0 894G 635G 258G 71.07 1.64 177 4 ssd 0.87320 1.0 894G 419G 474G 46.93 1.08 127 15 ssd 0.87320 1.0 894G 373G 521G 41.73 0.96 114 16 ssd 0.87320 1.0 894G 492G 401G 55.10 1.27 149 5 ssd 0.87320 1.0 894G 288G 605G 32.25 0.74 87 6 ssd 0.87320 1.0 894G 342G 551G 38.28 0.88 102 7 ssd 0.87320 1.0 894G 300G 593G 33.61 0.78 93 22 ssd 0.87320 1.0 894G 343G 550G 38.43 0.89 104 8 ssd 0.87320 1.0 894G 267G 626G 29.90 0.69 77 9 ssd 0.87320 1.0 894G 376G 518G 42.06 0.97 118 10 ssd 0.87320 1.0 894G 322G 571G 36.12 0.83 102 19 ssd 0.87320 1.0 894G 339G 554G 37.95 0.88 109 12 ssd 0.87320 1.0 894G 360G 534G 40.26 0.93 112 13 ssd 0.87320 1.0 894G 404G 489G 45.21 1.04 120 20 ssd 0.87320 1.0 894G 342G 551G 38.29 0.88 103 23 ssd 0.87320 1.0 894G 148G 745G 16.65 0.38 61 17 ssd 0.87320 1.0 894G 423G 470G 47.34 1.09 117 18 ssd 0.87320 1.0 894G 403G 490G 45.18 1.04 120 21 ssd 0.87320 1.0 894G 444G 450G 49.67 1.15 130 TOTAL 23490G 10170G 13320G 43.30 Karun Josy On Tue, Dec 5, 2017 at 4:42 AM, Karun Josy wrote: > Thank you for detailed explanation! > > Got one another doubt, > > This is the total space available in the cluster : > > TOTAL 23490G > Use 10170G > Avail : 13320G > > > But ecpool shows max avail as just 3 TB. > > > > Karun Josy > > On Tue, Dec 5, 2017 at 1:06 AM, David Turner > wrote: > >> No, I would only add disks to 1 failure domain at a time. So in your >> situation where you're adding 2 more disks to each node, I would recommend >> adding the 2 disks into 1 node at a time. Your failure domain is the >> crush-failure-domain=host. So you can lose a host and only lose 1 copy of >> the data. If all of your pools are using the k=5 m=3 profile, then I would >> say it's fine to add the disks into 2 nodes at a time. If you have any >> replica pools for RGW metadata or anything, then I would stick with the 1 >> host at a time. >> >> On Mon, Dec 4, 2017 at 2:29 PM Karun Josy wrote: >> >>> Thanks for your reply! >>> >>> I am using erasure coded profile with k=5, m=3 settings >>> >>> $ ceph osd erasure-code-profile get profile5by3 >>> crush-device-class= >>> crush-failure-domain=host >>> crush-root=default >>> jerasure-per-chunk-alignment=false >>> k=5 >>> m=3 >>> plugin=jerasure >>> technique=reed_sol_van >>> w=8 >>> >>> >>> Cluster has 8 nodes, with 3 disks each. We are planning to add 2 more on >>> each nodes. >>> >>> If I understand correctly, then I can add 3 disks at once right , >>> assuming 3 disks can fail at a time as per the ec code profile. >>> >>> Karun Josy >>> >>> On Tue, Dec 5, 2017 at 12:06 AM, David Turner >>> wrote: >>> >>>> Depending on how well you burn-in/test your new disks, I like to only >>>> add 1 failure domain of disks at a time in case you have bad disks that >>>> you're adding. 
If you are confident that your disks aren't likely to fail >>>> during the backfilling, then you can go with more. I just added 8 servers >>>> (16 OSDs each) to a cluster with 15 servers (16 OSDs each) all at the same >>>> time, but we spent 2 weeks testing the hardware before adding the new nodes >>>> to the cluster. >>>> >>>> If you add 1 failure domain at a time, then any DoA disks in the new nodes >>>> will only be able to fail with 1 copy of your data instead of across >>>> multiple nodes.
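On the MAX AVAIL question above: ceph df does not report raw free space per pool. For an EC pool the figure is scaled by k/(k+m) — 5/8 usable for this profile — and it is projected from the fullest OSDs the pool maps to, not from the cluster-wide average. As a rough worked example under those assumptions, the reported 3546G usable corresponds to about 3546 x 8/5 ≈ 5674G of raw space, well below the 13320G total avail, because several OSDs are already around 70% used and the projection assumes the current imbalance persists.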
Re: [ceph-users] Adding multiple OSD
Thank you for detailed explanation! Got one another doubt, This is the total space available in the cluster : TOTAL 23490G Use 10170G Avail : 13320G But ecpool shows max avail as just 3 TB. Karun Josy On Tue, Dec 5, 2017 at 1:06 AM, David Turner wrote: > No, I would only add disks to 1 failure domain at a time. So in your > situation where you're adding 2 more disks to each node, I would recommend > adding the 2 disks into 1 node at a time. Your failure domain is the > crush-failure-domain=host. So you can lose a host and only lose 1 copy of > the data. If all of your pools are using the k=5 m=3 profile, then I would > say it's fine to add the disks into 2 nodes at a time. If you have any > replica pools for RGW metadata or anything, then I would stick with the 1 > host at a time. > > On Mon, Dec 4, 2017 at 2:29 PM Karun Josy wrote: > >> Thanks for your reply! >> >> I am using erasure coded profile with k=5, m=3 settings >> >> $ ceph osd erasure-code-profile get profile5by3 >> crush-device-class= >> crush-failure-domain=host >> crush-root=default >> jerasure-per-chunk-alignment=false >> k=5 >> m=3 >> plugin=jerasure >> technique=reed_sol_van >> w=8 >> >> >> Cluster has 8 nodes, with 3 disks each. We are planning to add 2 more on >> each nodes. >> >> If I understand correctly, then I can add 3 disks at once right , >> assuming 3 disks can fail at a time as per the ec code profile. >> >> Karun Josy >> >> On Tue, Dec 5, 2017 at 12:06 AM, David Turner >> wrote: >> >>> Depending on how well you burn-in/test your new disks, I like to only >>> add 1 failure domain of disks at a time in case you have bad disks that >>> you're adding. If you are confident that your disks aren't likely to fail >>> during the backfilling, then you can go with more. I just added 8 servers >>> (16 OSDs each) to a cluster with 15 servers (16 OSDs each) all at the same >>> time, but we spent 2 weeks testing the hardware before adding the new nodes >>> to the cluster. >>> >>> If you add 1 failure domain at a time, then any DoA disks in the new >>> nodes will only be able to fail with 1 copy of your data instead of across >>> multiple nodes. >>> >>> On Mon, Dec 4, 2017 at 12:54 PM Karun Josy wrote: >>> >>>> Hi, >>>> >>>> Is it recommended to add OSD disks one by one or can I add couple of >>>> disks at a time ? >>>> >>>> Current cluster size is about 4 TB. >>>> >>>> >>>> >>>> Karun >>>> ___ >>>> ceph-users mailing list >>>> ceph-users@lists.ceph.com >>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>> >>> >> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Adding multiple OSD
Thanks for your reply! I am using erasure coded profile with k=5, m=3 settings $ ceph osd erasure-code-profile get profile5by3 crush-device-class= crush-failure-domain=host crush-root=default jerasure-per-chunk-alignment=false k=5 m=3 plugin=jerasure technique=reed_sol_van w=8 Cluster has 8 nodes, with 3 disks each. We are planning to add 2 more on each nodes. If I understand correctly, then I can add 3 disks at once right , assuming 3 disks can fail at a time as per the ec code profile. Karun Josy On Tue, Dec 5, 2017 at 12:06 AM, David Turner wrote: > Depending on how well you burn-in/test your new disks, I like to only add > 1 failure domain of disks at a time in case you have bad disks that you're > adding. If you are confident that your disks aren't likely to fail during > the backfilling, then you can go with more. I just added 8 servers (16 > OSDs each) to a cluster with 15 servers (16 OSDs each) all at the same > time, but we spent 2 weeks testing the hardware before adding the new nodes > to the cluster. > > If you add 1 failure domain at a time, then any DoA disks in the new nodes > will only be able to fail with 1 copy of your data instead of across > multiple nodes. > > On Mon, Dec 4, 2017 at 12:54 PM Karun Josy wrote: > >> Hi, >> >> Is it recommended to add OSD disks one by one or can I add couple of >> disks at a time ? >> >> Current cluster size is about 4 TB. >> >> >> >> Karun >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >> > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Adding multiple OSD
Hi, Is it recommended to add OSD disks one by one, or can I add a couple of disks at a time ? The current cluster size is about 4 TB. Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] OSD down ( rocksdb: submit_transaction error: Corruption: block checksum mismatch)
Hi, One OSD in the cluster is down. Tried to restart the service, but its still failing. I can see the below error in log file. Can this be a hardware issue ? - -9> 2017-11-23 09:47:37.768969 7f368686a700 3 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_compaction_flush.cc:1591] Compaction error: Corruption: block checksum mismatch -8> 2017-11-23 09:47:37.768980 7f368686a700 4 rocksdb: (Original Log Time 2017/11/23-09:47:37.768936) [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/compaction_job.cc:621] [default] compacted to: base level 1 max bytes base 268435456 files[11 1 0 0 0 0 0] max score 0.00, MB/sec: 2.3 rd, 2.0 wr, level 1, files in(11, 1) out(1) MB in(0.1, 7.8) out(7.0), read-write-amplify(202.0) write-amplify(94.6) Corruption: block checksum mismatch, records in: 42 -7> 2017-11-23 09:47:37.768984 7f368686a700 4 rocksdb: (Original Log Time 2017/11/23-09:47:37.768963) EVENT_LOG_v1 {"time_micros": 1511459257768950, "job": 3, "event": "compaction_finished", "compaction_time_micros": 3667366, "output_level": 1, "num_output_files": 1, "total_output_size": 7317366, "num_input_records": 38738, "num_output_records": 37539, "num_subcompactions": 1, "num_single_delete_mismatches": 0, "num_single_delete_fallthrough": 0, "lsm_state": [11, 1, 0, 0, 0, 0, 0]} -6> 2017-11-23 09:47:37.768988 7f368686a700 2 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_compaction_flush.cc:1275] Waiting after background compaction error: Corruption: block checksum mismatch, Accumulated background error counts: 1 -5> 2017-11-23 09:47:38.245022 7f369a708d00 5 osd.6 pg_epoch: 324 pg[3.98s5(unlocked)] enter Initial -4> 2017-11-23 09:47:38.245256 7f369a708d00 5 osd.6 pg_epoch: 324 pg[3.98s5( empty local-lis/les=323/324 n=0 ec=69/69 lis/c 323/323 les/c/f 324/324/0 323/323/69) [2,11,7,1,0,6,9,3] r=5 lpr=0 crt=0'0 unknown NOTIFY] exit Initial 0.000235 0 0.00 -3> 2017-11-23 09:47:38.245275 7f369a708d00 5 osd.6 pg_epoch: 324 pg[3.98s5( empty local-lis/les=323/324 n=0 ec=69/69 lis/c 323/323 les/c/f 324/324/0 323/323/69) [2,11,7,1,0,6,9,3] r=5 lpr=0 crt=0'0 unknown NOTIFY] enter Reset -2> 2017-11-23 09:47:38.245288 7f369a708d00 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 4294967295'18446744073709551615, trimmed: , trimmed_dups: , clear_divergent_priors: 0 -1> 2017-11-23 09:47:38.245355 7f368806d700 -1 rocksdb: submit_transaction error: Corruption: block checksum mismatch code = 2 Rocksdb transaction: Put( Prefix = M key = 0x052c'.can_rollback_to' Value size = 12) Put( Prefix = M key = 0x052c'.rollback_info_trimmed_to' Value size = 12) Put( Prefix = O key = 0x858003190021213dfffe'o' Value size = 29) 0> 2017-11-23 09:47:38.247357 7f368806d700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::_kv_sync_thread()' thread 7f368806d700 time 2017-11-23 09:47:38.245386 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/os/bluestore/BlueStore.cc: 8453: FAILED assert(r == 0) Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
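A block checksum mismatch surfacing from RocksDB usually means BlueStore read back different bytes than it wrote, so the underlying device is the first suspect. A sketch of the usual checks, assuming the OSD sits on a hypothetical /dev/sdX (substitute the real device):

dmesg | egrep -i 'sdX|ata|medium|i/o error'
smartctl -a /dev/sdX

If the media checks out clean, the remaining option is generally to destroy and redeploy the OSD and let the cluster backfill it.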
[ceph-users] Admin server
Hi, Just a minor doubt :) We have a cluster with 1 admin server, 3 monitors, and 8 OSD nodes. The admin server is used to deploy the cluster. What if the admin server permanently fails? Will it affect the cluster ? Karun ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] How to set osd_max_backfills in Luminous
Thanks! Karun Josy On Wed, Nov 22, 2017 at 5:44 AM, Jean-Charles Lopez wrote: > Hi, > > to check a current value use the following command on the machine where > the OSD you want to check is running > > ceph daemon osd.{id} config show | grep {parameter} > Or > ceph daemon osd.{id} config get {parameter} > > What you are seeing is actually a known glitch where you are being told it > has no effect when in fact it does. See capture below > [root@luminous ceph-deploy]# ceph daemon osd.0 config get > osd_max_backfills > { > "osd_max_backfills": "1" > } > [root@luminous ceph-deploy]# ceph tell osd.* injectargs > '--osd_max_backfills 2' > osd.0: osd_max_backfills = '2' rocksdb_separate_wal_dir = 'false' (not > observed, change may require restart) > osd.1: osd_max_backfills = '2' rocksdb_separate_wal_dir = 'false' (not > observed, change may require restart) > osd.2: osd_max_backfills = '2' rocksdb_separate_wal_dir = 'false' (not > observed, change may require restart) > [root@luminous ceph-deploy]# ceph daemon osd.0 config get > osd_max_backfills > { > "osd_max_backfills": "2" > } > > Regards > JC > > On Nov 21, 2017, at 15:17, Karun Josy wrote: > > Hello, > > We added couple of OSDs to the cluster and the recovery is taking much > time. > > So I tried to increase the osd_max_backfills value dynamically. But its > saying the change may need restart. > > $ ceph tell osd.* injectargs '--osd-max-backfills 5' > osd.0: osd_max_backfills = '5' osd_objectstore = 'bluestore' (not > observed, change may require restart) rocksdb_separate_wal_dir = 'false' > (not observed, change may require restart) > > > = > > The value seems to be not changed too. > > [cephuser@ceph-las-admin-a1 home]$ ceph -n osd.0 --show-config | grep > osd_max_backfills > osd_max_backfills = 1 > > Do I have to really restart all the OSD daemons ? > > > > Karun > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com > > > ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] How to set osd_max_backfills in Luminous
Hello,

We added a couple of OSDs to the cluster and the recovery is taking a long time. So I tried to increase the osd_max_backfills value dynamically, but it says the change may require a restart.

$ ceph tell osd.* injectargs '--osd-max-backfills 5'
osd.0: osd_max_backfills = '5' osd_objectstore = 'bluestore' (not observed, change may require restart) rocksdb_separate_wal_dir = 'false' (not observed, change may require restart)

=

The value does not seem to have changed either.

[cephuser@ceph-las-admin-a1 home]$ ceph -n osd.0 --show-config | grep osd_max_backfills
osd_max_backfills = 1

Do I really have to restart all the OSD daemons?

Karun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Reuse pool id
Any suggestions?

Karun Josy

On Mon, Nov 13, 2017 at 10:06 PM, Karun Josy wrote:
> Hi,
>
> Is there anyway we can change or reuse pool id ?
> I had created and deleted lot of test pools. So the IDs kind of look like this now:
>
> ---
> $ ceph osd lspools
> 34 imagepool,37 cvmpool,40 testecpool,41 ecpool1,
> ---
>
> Can I change it to 0,1,2,3 etc ?
>
> Karun

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Incorrect pool usage statistics
Help?! There seem to be many objects still present in the pool:

-
$ rados df
POOL_NAME USED   OBJECTS CLONES COPIES  MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS    RD    WR_OPS    WR
vm        886    105     0      315     0                  0       0        943399    1301M 39539     30889M
ecpool    403G   388652  316701 2720564 0                  0       0        156972536 1081G 203383441 4074G
imagepool 89014M 22485   0      67455   0                  0       0        7856029   708G  13140767  602G
template  115G   29848   43     149240  0                  0       0        66138389  2955G 1123900   539G

Karun Josy

On Tue, Nov 14, 2017 at 4:16 AM, Karun Josy wrote:
> Hello,
>
> Recently, I deleted all the disks from an erasure pool 'ecpool'.
> The pool is empty. However the space usage shows around 400GB.
> What might be wrong?
>
> $ rbd ls -l ecpool
> $
> $ ceph df
> GLOBAL:
>     SIZE   AVAIL  RAW USED %RAW USED
>     19019G 16796G 2223G    11.69
> POOLS:
>     NAME      ID USED   %USED MAX AVAIL OBJECTS
>     template  1  227G   1.59  2810G     58549
>     vm        21 0      0     4684G     2
>     ecpool    33 403G   2.79  10038G    388652
>     imagepool 34 90430M 0.62  4684G     22789
>
> Karun Josy

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
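For the archives: when a pool that should be empty still reports objects, sampling what is actually left usually identifies the culprit. A small sketch, assuming the pool is named ecpool:

--
# sample the leftover object names
rados -p ecpool ls | head -20

# count objects by name prefix (rbd_data, rbd_header, ...)
rados -p ecpool ls | cut -d. -f1 | sort | uniq -c | sort -rn
--

The large CLONES count in the rados df output above hints that snapshot clones may still be pinning the space even though the images themselves were deleted.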
[ceph-users] Incorrect pool usage statistics
Hello,

Recently, I deleted all the disks from an erasure pool 'ecpool'. The pool is empty. However the space usage shows around 400GB. What might be wrong?

$ rbd ls -l ecpool
$
$ ceph df
GLOBAL:
    SIZE   AVAIL  RAW USED %RAW USED
    19019G 16796G 2223G    11.69
POOLS:
    NAME      ID USED   %USED MAX AVAIL OBJECTS
    template  1  227G   1.59  2810G     58549
    vm        21 0      0     4684G     2
    ecpool    33 403G   2.79  10038G    388652
    imagepool 34 90430M 0.62  4684G     22789

Karun Josy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Reuse pool id
Hi,

Is there any way we can change or reuse a pool id? I had created and deleted a lot of test pools, so the IDs now look like this:

---
$ ceph osd lspools
34 imagepool,37 cvmpool,40 testecpool,41 ecpool1,
---

Can I change them to 0, 1, 2, 3, etc.?

Karun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Disconnect a client Hypervisor
Hi,

Do you think there is a way for Ceph to disconnect a hypervisor (HV) client from a cluster? We want to prevent the possibility of two HVs running the same VM. When an HV crashes, we have to make sure that when its VMs are started on a new HV, the disks are not still open on the crashed HV.

I can see 'eviction' for the filesystem: http://docs.ceph.com/docs/master/cephfs/eviction/
But we are implementing RBD on an erasure coded profile.

Karun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
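For RBD, the usual fencing approach is the exclusive-lock image feature combined with client blacklisting: with exclusive-lock enabled only one client can write to an image at a time, and blacklisting a dead client's address keeps it from ever reacquiring the lock. A sketch of what fencing a crashed hypervisor might look like (pool name, image name, and address are placeholders):

--
# see which client currently holds the lock on the image
rbd lock list vmpool/vm-disk-1

# fence the crashed hypervisor's client address so it can no longer write
ceph osd blacklist add 10.0.0.50:0/3271659458

# the stale lock can then be removed and the image attached elsewhere
rbd lock remove vmpool/vm-disk-1 <lock-id> <locker>
--

The <lock-id> and <locker> values come from the "rbd lock list" output. This works the same whether the image data lives in a replicated or an erasure coded pool.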
[ceph-users] OSD daemons active in nodes after removal
Hello everyone! :)

I have an interesting problem. For a few weeks, we've been testing Luminous in a cluster made up of 8 servers and about 20 SSD disks almost evenly distributed. It is running erasure coding.

Yesterday, we decided to bring the cluster down to a minimum of 8 servers and 1 disk per server. So, we went ahead and removed the additional disks from the Ceph cluster, by executing commands like this from the admin server:

---
$ ceph osd out osd.20
osd.20 is already out.
$ ceph osd down osd.20
marked down osd.20.
$ ceph osd purge osd.20 --yes-i-really-mean-it
Error EBUSY: osd.20 is not `down`.
---

So I logged in to the host it resides on and killed it:

systemctl stop ceph-osd@26

$ ceph osd purge osd.20 --yes-i-really-mean-it
purged osd.20

We waited for the cluster to be healthy once again, and I physically removed the disks (hot swap, connected to an LSI 3008 controller). A few minutes after that, I needed to turn off one of the OSD servers to swap out a piece of hardware inside. So, I issued:

ceph osd set noout

And proceeded to turn off that 1 OSD server. But the interesting thing happened then. Once that 1 server came back up, the cluster all of a sudden showed that out of the 8 nodes, only 2 were up!

8 (2 up, 5 in)

Even more interesting is that it seems Ceph, on each OSD server, still thinks the missing disks are there! When I start Ceph on each OSD server with "systemctl start ceph-osd.target", /var/log/ceph gets filled with logs for disks that are not supposed to exist anymore. The contents of the logs show something like:

# cat /var/log/ceph/ceph-osd.7.log
2017-10-20 08:45:16.389432 7f8ee6e36d00 0 set uid:gid to 167:167 (ceph:ceph)
2017-10-20 08:45:16.389449 7f8ee6e36d00 0 ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process (unknown), pid 2591
2017-10-20 08:45:16.389639 7f8ee6e36d00 -1 ** ERROR: unable to open OSD superblock on /var/lib/ceph/osd/ceph-7: (2) No such file or directory
2017-10-20 08:45:36.639439 7fb389277d00 0 set uid:gid to 167:167 (ceph:ceph)

The actual Ceph cluster sees only 8 disks, as you can see here:

$ ceph osd tree
ID  CLASS WEIGHT  TYPE NAME                 STATUS REWEIGHT PRI-AFF
 -1       7.97388 root default
 -3       1.86469     host ceph-las1-a1-osd
  1   ssd 1.86469         osd.1             down         0 1.0
 -5       0.87320     host ceph-las1-a2-osd
  2   ssd 0.87320         osd.2             down         0 1.0
 -7       0.87320     host ceph-las1-a3-osd
  4   ssd 0.87320         osd.4             down   1.0    1.0
 -9       0.87320     host ceph-las1-a4-osd
  8   ssd 0.87320         osd.8             up     1.0    1.0
-11       0.87320     host ceph-las1-a5-osd
 12   ssd 0.87320         osd.12            down   1.0    1.0
-13       0.87320     host ceph-las1-a6-osd
 17   ssd 0.87320         osd.17            up     1.0    1.0
-15       0.87320     host ceph-las1-a7-osd
 21   ssd 0.87320         osd.21            down   1.0    1.0
-17       0.87000     host ceph-las1-a8-osd
 28   ssd 0.87000         osd.28            down         0 1.0

Linux, on the OSD servers, also seems to think the disks are in:

# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sde2  976M 183M  727M  21% /boot
/dev/sdd1   97M 5.4M   92M   6% /var/lib/ceph/osd/ceph-7
/dev/sdc1   97M 5.4M   92M   6% /var/lib/ceph/osd/ceph-6
/dev/sda1   97M 5.4M   92M   6% /var/lib/ceph/osd/ceph-4
/dev/sdb1   97M 5.4M   92M   6% /var/lib/ceph/osd/ceph-5
tmpfs      6.3G    0  6.3G   0% /run/user/0

It should show only one disk, not 4. I tried to issue the removal commands again, this time on the OSD server itself:

$ ceph osd out osd.X
osd.X does not exist.
$ ceph osd purge osd.X --yes-i-really-mean-it
osd.X does not exist

Yet, if I again issue "systemctl start ceph-osd.target", /var/log/ceph again shows logs for a disk that does not exist (to make sure, I deleted all logs prior).

So, it seems, somewhere, Ceph on the OSD servers still thinks there should be more disks? The Ceph cluster is unusable though. We've tried everything to bring it back again. But as Dr. Bones would say, it's dead, Jim.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
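The startup attempts for nonexistent OSDs usually come from host-side leftovers: "ceph osd purge" cleans up the cluster maps, but the systemd units and the mounted data directories on each OSD host survive it, and ceph-osd.target keeps restarting them. A host-side cleanup sketch for an already-purged osd.N (N is a placeholder):

--
systemctl stop ceph-osd@N          # stop any lingering daemon
systemctl disable ceph-osd@N       # keep ceph-osd.target from starting it again
umount /var/lib/ceph/osd/ceph-N    # release the stale mount
rm -rf /var/lib/ceph/osd/ceph-N    # remove the now-empty mount point
--

That should silence the phantom OSD logs; the "2 up" count is a separate problem worth checking against monitor and network connectivity after the reboot.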
Re: [ceph-users] Erasure code profile
Thank you for your reply. I am finding it confusing to understand the profile structure.

Consider a cluster of 8 OSD servers with 3 disks on each server, and a profile setting of k=5, m=3 and ruleset-failure-domain=host:

Encoding rate: r = k / n, where n = k + m, so r = 5 / 8 = 0.625
Storage required: 1 / r = 1 / 0.625 = 1.6 times the original data

Is this correct? And more importantly, will the profile work without failure? As far as I understand it can tolerate the failure of 3 OSDs and 1 host, am I right?

I can't find much information from this link:
http://docs.ceph.com/docs/master/rados/operations/erasure-code-profile/
Is there a better article that I can refer to?

Karun Josy

On Tue, Oct 24, 2017 at 1:23 AM, David Turner wrote:
> This can be changed to a failure domain of OSD in which case it could satisfy the criteria. The problem with a failure domain of OSD, is that all of your data could reside on a single host and you could lose access to your data after restarting a single host.
>
> On Mon, Oct 23, 2017 at 3:23 PM LOPEZ Jean-Charles wrote:
>
>> Hi,
>>
>> the default failure domain if not specified on the CLI at the moment you create your EC profile is set to HOST. So you need 14 OSDs spread across 14 different nodes by default. And you only have 8 different nodes.
>>
>> Regards
>> JC
>>
>> On 23 Oct 2017, at 21:13, Karun Josy wrote:
>>
>> Thank you for the reply.
>>
>> There are 8 OSD nodes with 23 OSDs in total. (However, they are not distributed equally on all nodes)
>>
>> So it satisfies that criteria, right?
>>
>> Karun Josy
>>
>> On Tue, Oct 24, 2017 at 12:30 AM, LOPEZ Jean-Charles wrote:
>>
>>> Hi,
>>>
>>> yes you need as many OSDs that k+m is equal to. In your example you need a minimum of 14 OSDs for each PG to become active+clean.
>>>
>>> Regards
>>> JC
>>>
>>> On 23 Oct 2017, at 20:29, Karun Josy wrote:
>>>
>>> Hi,
>>>
>>> While creating a pool with erasure code profile k=10, m=4, I get PG status as
>>> "200 creating+incomplete"
>>>
>>> While creating pool with profile k=5, m=3 it works fine.
>>>
>>> Cluster has 8 OSDs with total 23 disks.
>>>
>>> Is there any requirements for setting the first profile ?
>>>
>>> Karun
>>> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
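For anyone following along, creating and inspecting such a profile looks roughly like this; the profile and pool names are placeholders, and note that Luminous calls the option crush-failure-domain (older releases used ruleset-failure-domain):

--
ceph osd erasure-code-profile set ec53 k=5 m=3 crush-failure-domain=host
ceph osd erasure-code-profile get ec53
ceph osd pool create ecpool53 128 128 erasure ec53
--

With failure domain host and 8 hosts, each PG places one chunk per host, so the data remains readable as long as any k=5 of the 8 chunks survive, i.e. through the loss of up to m=3 whole hosts (and all the OSDs on them).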
Re: [ceph-users] Erasure code profile
Thank you for the reply.

There are 8 OSD nodes with 23 OSDs in total. (However, they are not distributed equally on all nodes)

So it satisfies that criteria, right?

Karun Josy

On Tue, Oct 24, 2017 at 12:30 AM, LOPEZ Jean-Charles wrote:
> Hi,
>
> yes you need as many OSDs that k+m is equal to. In your example you need a minimum of 14 OSDs for each PG to become active+clean.
>
> Regards
> JC
>
> On 23 Oct 2017, at 20:29, Karun Josy wrote:
>
> Hi,
>
> While creating a pool with erasure code profile k=10, m=4, I get PG status as
> "200 creating+incomplete"
>
> While creating pool with profile k=5, m=3 it works fine.
>
> Cluster has 8 OSDs with total 23 disks.
>
> Is there any requirements for setting the first profile ?
>
> Karun
> ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
[ceph-users] Erasure code profile
Hi,

While creating a pool with erasure code profile k=10, m=4, I get the PG status
"200 creating+incomplete"

While creating a pool with profile k=5, m=3, it works fine.

The cluster has 8 OSD nodes with 23 disks in total.

Are there any requirements for using the first profile?

Karun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
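When an EC pool comes up creating+incomplete, a quick way to confirm that CRUSH simply cannot place k+m chunks (rather than something subtler) is, roughly (<profile> is a placeholder):

--
ceph health detail | head                      # lists the incomplete PGs
ceph pg dump_stuck inactive | head             # shows which PGs never went active
ceph osd erasure-code-profile get <profile>    # confirm k, m and the failure domain
--

With a host failure domain, k+m must not exceed the number of hosts: k=10, m=4 needs 14 hosts, while k=5, m=3 fits within 8.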