Hi Christian,
Thank you for your help.
Ceph version is 12.2.2. So is this value bad? Do you have any suggestions?
So to reduce the max chunk, I assume I can choose something like
7 << 20, i.e. 7340032?
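For reference, a minimal sketch of applying such a value at runtime, assuming the option being discussed is osd_recovery_max_chunk (it can also be made persistent under [osd] in ceph.conf):
--
# 7340032 bytes = 7 MiB; osd.* injects the value into every OSD
ceph tell osd.* injectargs '--osd_recovery_max_chunk 7340032'
--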
Karun Josy
On Tue, Feb 6, 2018 at 1:15 PM, Christian Balzer <ch...@gol.c
Description: The maximum size of a recovered chunk of data to push.
Type: 64-bit Unsigned Integer
Default: 8 << 20
I am confused. Can anyone let me know what value I have to give
to reduce this parameter?
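For what it's worth, 8 << 20 is just a left bit-shift: 8 x 2^20 = 8388608 bytes (8 MiB). The option takes a plain byte count, so a 7 MiB limit would be 7 x 2^20 = 7340032.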
Karun Josy
Could it be this bug:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021676.html
In most of the OSDs, buffer_anon is high:
},
"buffer_anon": {
"items": 268443,
"bytes": 1421912265
Karun Josy
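For anyone following along, the buffer_anon figures above come from the OSD mempool dump; a sketch of pulling them on an OSD host (osd.0 is just an example id):
--
# run on the host where the OSD lives; substitute the real OSD id
ceph daemon osd.0 dump_mempools
--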
On Sun, Feb 4, 2018 at 7:03 AM, Ka
And we can see this in the error log:
Feb 2 16:41:28 ceph-las1-a4-osd kernel: bstore_kv_sync: page allocation
stalls for 14188ms, order:0,
mode:0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null)
Karun Josy
On Sun, Feb 4, 2018 at 6:19 AM, Karun Josy <karunjo...@gmail.com> wrote:
cannot turn this
node off as it will force some PGs into an incomplete state.
Any help would be really appreciated.
Karun Josy
Hi Jason,
>> Was the base RBD pool used only for data-pool associated images
Yes, it is only used for storing metadata of ecpool.
We use 2 pools for erasure coding:
ecpool - erasure coded data pool
vm - replicated pool to store metadata
Karun Josy
On Tue, Jan 30, 2018 at 8:00 PM,
gets updated correctly.
Karun Josy
On Tue, Jan 30, 2018 at 1:35 AM, John Spray <jsp...@redhat.com> wrote:
> On Mon, Jan 29, 2018 at 6:58 PM, Gregory Farnum <gfar...@redhat.com>
> wrote:
> > The lease timeout means this (peon) monitor hasn't heard from the leader
>
as version 12.2.1 and then updated
to 12.2.2.
Karun Josy
On Mon, Jan 29, 2018 at 7:01 PM, Karun Josy <karunjo...@gmail.com> wrote:
> Thank you for your response.
>
> We don't think there is an issue with the cluster being behind snap
> trimming. We just don't think snaptrim is occurr
not being reclaimed.
All clusters use the same hardware. Some have more disks and servers than
others. The only major difference is that this particular cluster, the one
with the problem, had the noscrub and nodeep-scrub flags set for many weeks.
Karun Josy
On Mon, Jan 29, 2018 at 6:27 PM, David Turner <drakonst
fast-diff map is not enabled for the RBD images.
Could it be a reason for trimming not happening?
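A rough sketch of enabling it on an existing image (the image name is a placeholder; fast-diff depends on object-map, which in turn needs exclusive-lock, and the map should be rebuilt afterwards):
--
# cvm/vm-disk-1 is a placeholder image spec
rbd feature enable cvm/vm-disk-1 exclusive-lock object-map fast-diff
rbd object-map rebuild cvm/vm-disk-1
--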
Karun Josy
On Sat, Jan 27, 2018 at 10:19 PM, Karun Josy <karunjo...@gmail.com> wrote:
> Hi David,
>
> Thank you for your reply! I really appreciate it.
>
> The images are in pool id
In the Luminous version, we have to use the osd set commands:
--
ceph osd set-backfillfull-ratio .89
ceph osd set-nearfull-ratio .84
ceph osd set-full-ratio .96
--
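A quick way to confirm the ratios took effect (in Luminous, ceph osd dump reports all three):
--
ceph osd dump | grep -i ratio
--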
Karun Josy
On Thu, Dec 21, 2017 at 4:29 PM, Konstantin Shalygin <k0...@k0ste.ru> wrote:
> Update your ceph.
, but there has been no OSD flapping or slow
requests!
Thanks for all your help!
Karun Josy
On Sun, Jan 28, 2018 at 9:25 PM, David Turner <drakonst...@gmail.com> wrote:
> Use a get with the second syntax to see the currently running config.
>
> On Sun, Jan 28, 2018, 3:41 AM Karu
The issue is still continuing. Has anyone else noticed it?
When this happens, the Ceph Dashboard GUI gets stuck and we have to restart
the manager daemon to make it work again.
Karun Josy
On Wed, Jan 17, 2018 at 6:16 AM, Karun Josy <karunjo...@gmail.com> wrote:
> Hello,
>
>
While using both, it shows (not observed, change may require restart).
So is it not set?
Karun Josy
On Mon, Jan 15, 2018 at 7:16 AM, shadow_lin <shadow_...@163.com> wrote:
> hi,
> you can try adjusting osd_scrub_chunk_min, osd_scrub_chunk_max and
> osd_scrub_sleep.
>
>
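A hedged sketch of what such tuning might look like at runtime (the values are only illustrative):
--
# slow scrubbing down: sleep between chunks and keep chunks small
ceph tell osd.* injectargs '--osd_scrub_sleep 0.1 --osd_scrub_chunk_min 1 --osd_scrub_chunk_max 5'
--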
in mailing lists, I can see you have dealt with similar issues with
snapshots.
So I think you can help me figure this mess out.
Karun Josy
On Sat, Jan 27, 2018 at 7:15 PM, David Turner <drakonst...@gmail.com> wrote:
> Prove* a positive
>
> On Sat, Jan 27, 2018, 8:45 AM David T
Are scrubbing and deep scrubbing necessary for the snaptrim operation to happen?
Karun Josy
On Fri, Jan 26, 2018 at 9:29 PM, Karun Josy <karunjo...@gmail.com> wrote:
> Thank you for your quick response!
>
> I used the command to fetch the snap_trimq from many pgs, however it seems
>
Hi,
We have set the noscrub and nodeep-scrub flags on a Ceph cluster.
When we delete snapshots, we are not seeing any change in used space.
I understand that Ceph OSDs delete data asynchronously, so deleting a
snapshot doesn’t free up the disk space immediately. But we are not seeing
any change
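As an aside, a sketch of one way to check whether a PG still has snapshots queued for trimming (1.0 is just an example PG id):
--
# a non-empty snap_trimq means trimming is still pending for that PG
ceph pg 1.0 query | grep snap_trimq
--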
Thank you!
Ceph version is 12.2.
Also, can you let me know the format to set osd_backfill_full_ratio?
Is it "ceph osd set -backfillfull-ratio .89"?
Karun Josy
On Thu, Jan 25, 2018 at 1:29 AM, Jean-Charles Lopez <jelo...@redhat.com>
wrote:
> Hi,
>
>
Hi,
I am trying to increase the full ratio of OSDs in a cluster.
While adding a new node, one of the new disks got backfilled to more than 95%
and the cluster froze. So I am trying to avoid that happening again.
I tried the pg set command but it is not working:
$ ceph pg set_nearfull_ratio 0.88
Error
Hi,
We added a new host to the cluster and it was rebalancing.
One PG became "inactive, peering" for a very long time, which created a lot
of slow requests and poor performance across the whole cluster.
When I queried that PG, it showed this:
"recovery_state": [
{
"name":
Hello,
In one of our cluster setups, frequent monitor elections are
happening.
In the logs of one of the monitors, there is a "lease_timeout" message before
that happens. Can anyone help me figure it out?
(When this happens, the Ceph Dashboard GUI gets stuck and we have to
restart the
Hello,
It appears that the cluster is having many slow requests while it is scrubbing
and deep scrubbing. Sometimes we can also see OSDs flapping.
So we have set the flags: noscrub, nodeep-scrub.
When we unset them, 5 PGs start to scrub.
Is there a way to limit it to one at a time?
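For what it's worth, a sketch of the usual knob (it is applied per OSD, so several PGs on different OSDs can still scrub at the same time):
--
# limit each OSD to one concurrent scrub (this is also the default)
ceph tell osd.* injectargs '--osd_max_scrubs 1'
--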
# ceph daemon
map failed: (1) Operation not permitted
Is it because that user has only read permission on the templates pool?
Karun Josy
It happens randomly.
Karun Josy
On Wed, Jan 3, 2018 at 7:07 AM, Jason Dillaman <jdill...@redhat.com> wrote:
> I tried to reproduce this for over an hour today using the specified
> versions w/o any success. Is this something that you can repeat
> on-demand or was this a one
increase it gradually? Or set pg_num to 512 in one step?
Karun Josy
On Tue, Jan 2, 2018 at 9:26 PM, Hans van den Bogert <hansbog...@gmail.com>
wrote:
> Please refer to standard documentation as much as possible,
>
> http://docs.ceph.com/docs/jewel/rados/operations/
> pla
Hi,
The initial PG count was not properly planned while setting up the cluster, so
now there are fewer than 50 PGs per OSD.
What are the best practices to increase the PG number of a pool?
We have replicated pools as well as EC pools.
Or is it better to create a new pool with higher PG numbers?
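A minimal sketch of the usual approach, assuming a hypothetical pool name and that pg_num is raised in modest steps, with pgp_num following once the cluster settles:
--
# mypool is a placeholder; repeat in steps rather than one big jump
ceph osd pool set mypool pg_num 256
ceph osd pool set mypool pgp_num 256
--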
Hi,
We added some more OSDs to the cluster and it was fixed.
Karun Josy
On Tue, Jan 2, 2018 at 6:21 AM, 한승진 <yongi...@gmail.com> wrote:
> Are all osds the same version?
> I recently experienced a similar situation.
>
> I upgraded all osds to exact same version and reset of
intensive than EC computing?
Karun Josy
On Wed, Dec 27, 2017 at 3:42 AM, David Turner <drakonst...@gmail.com> wrote:
> Please use the version of the docs for your installed version of ceph.
> Now the Jewel in your URL and the Luminous in mine. In Luminous you no
> longer need a cache
pool to store metadata
and ecpool as the data pool.
Is it possible to set up cache tiering since there is already a replicated
pool that is being used?
Karun Josy
Any help is really appreciated.
Karun Josy
On Sun, Dec 24, 2017 at 2:18 AM, Karun Josy <karunjo...@gmail.com> wrote:
> Hello,
>
> The image is not mapped.
>
> # ceph --version
> ceph version 12.2.1 luminous (stable)
> # uname -r
> 4.14.0-1.el7.elrepo.x86_64
>
Hello,
The image is not mapped.
# ceph --version
ceph version 12.2.1 luminous (stable)
# uname -r
4.14.0-1.el7.elrepo.x86_64
Karun Josy
On Sat, Dec 23, 2017 at 6:51 PM, Jason Dillaman <jdill...@redhat.com> wrote:
> What Ceph and what kernel version are you using? Are you
Hello,
I am unable to delete this abandoned image. rbd info shows a watcher IP.
The image is not mapped.
The image has no snapshots.
rbd status cvm/image --id clientuser
Watchers:
watcher=10.255.0.17:0/3495340192 client.390908
cookie=18446462598732841114
How can I evict or blacklist a watcher?
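A hedged sketch of blacklisting that watcher (the address is taken from the rbd status output above; the trailing 3600 is an example duration in seconds):
--
ceph osd blacklist add 10.255.0.17:0/3495340192 3600
ceph osd blacklist ls
--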
Thank you!
Karun Josy
On Thu, Dec 21, 2017 at 3:51 PM, Konstantin Shalygin <k0...@k0ste.ru> wrote:
> Is this the correct way to remove OSDs, or am I doing something wrong?
>>
> The generic way for maintenance (e.g. disk replacement) is to rebalance by
> changing the osd weight:
>
>
Hi,
This is how I remove an OSD from the cluster:
- Take it out
ceph osd out osdid
Wait for the balancing to finish
- Mark it down
ceph osd down osdid
- Purge it
ceph osd purge osdid --yes-i-really-mean-it
While purging, I can see there is another rebalancing occurring.
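A sketch of the crush-weight-based variant suggested in the reply above (osd.<id>/<id> are placeholders; draining via the crush weight first means the data only has to move once):
--
ceph osd crush reweight osd.<id> 0
# wait for rebalancing to finish, then:
ceph osd out <id>
systemctl stop ceph-osd@<id>
ceph osd purge <id> --yes-i-really-mean-it
--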
mon.mon-a2: injectargs:mon_osd_nearfull_ratio = '0.86' (not observed,
change may require restart)
mon.mon-a3: injectargs:mon_osd_nearfull_ratio = '0.86' (not observed,
change may require restart)
Karun Josy
On Tue, Dec 19, 2017 at 10:05 PM, Jean-Charles Lopez <jelo...@redhat.com>
wrote:
> OK
No, I haven't.
Interestingly, the POOL_NEARFULL flag is shown only when there is an OSD_NEARFULL
flag.
I have recently upgraded to Luminous 12.2.2; I hadn't seen this flag in
12.2.1.
Karun Josy
On Tue, Dec 19, 2017 at 9:27 PM, Jean-Charles Lopez <jelo...@redhat.com>
wrote:
> Hi
>
>
Hello,
In one of our clusters, health is showing these warnings:
-
OSD_NEARFULL 1 nearfull osd(s)
osd.22 is near full
POOL_NEARFULL 3 pool(s) nearfull
pool 'templates' is nearfull
pool 'cvm' is nearfull
pool 'ecpool' is nearfull
One OSD is above 85% used,
in the active+remapped state
It's a small cluster with an unequal number of OSDs; one of the OSD disks
failed and I had taken it out.
I have already purged it, so I cannot use the reweight option mentioned in
that link.
So are there any other workarounds?
Will adding more disks clear it?
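For reference, a sketch of the usual stop-gap until more capacity is added (dry-run first; 110 is an example threshold meaning only OSDs more than 10% above the mean utilization get reweighted):
--
ceph osd test-reweight-by-utilization 110
ceph osd reweight-by-utilization 110
--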
Karun Josy
I tried restarting all OSDs. Still no luck.
Will adding a new disk to any of the servers force a rebalance and fix it?
Karun Josy
On Sun, Dec 17, 2017 at 12:22 PM, Cary <dynamic.c...@gmail.com> wrote:
> Karun,
>
> Could you paste in the output from "ceph health detail"
=false
k=5
m=3
plugin=jerasure
technique=reed_sol_van
w=8
Karun Josy
On Sun, Dec 17, 2017 at 11:26 PM, David Turner <drakonst...@gmail.com>
wrote:
> I like to avoid adding disks from more than 1 failure domain at a time in
> case some of the new disks are bad. In your example of only
Hi,
We have a live cluster with 8 OSD nodes all having 5-6 disks each.
We would like to add a new host and expand the cluster.
We have 4 pools
- 3 replicated pools with replication factor 5 and 3
- 1 erasure coded pool with k=5, m=3
So my concern is, are there any precautions that are needed to
Any help would be appreciated!
Karun Josy
On Sat, Dec 16, 2017 at 11:04 PM, Karun Josy <karunjo...@gmail.com> wrote:
> Hi,
>
> Repair didn't fix the issue.
>
> In the pg dump details, I notice this NONE. It seems the PG is missing from one
> of the OSDs.
>
> [0,2,NONE,4,1
-object/
>
> I recommend not rebooting, or restarting while Ceph is repairing or
> recovering. If possible, wait until the cluster is in a healthy state
> first.
>
> Cary
> -Dynamic
>
> On Sat, Dec 16, 2017 at 2:05 PM, Karun Josy <karunjo...@gmail.com> wrote:
> > Hi Car
Hi Cary,
No, I didn't try to repair it.
I am comparatively new to Ceph. Is it okay to try to repair it?
Or should I take any precautions while doing it?
Karun Josy
On Sat, Dec 16, 2017 at 2:08 PM, Cary <dynamic.c...@gmail.com> wrote:
> Karun,
>
> Did you attempt a "ceph
Hello,
I added 1 disk to the cluster and after rebalancing, it shows 1 PG in a
remapped state. How can I correct it?
(I had to restart some OSDs during the rebalancing as there were some slow
requests.)
$ ceph pg dump | grep remapped
dumped all
3.4 981 00
rebalance.
And it worked!
To be honest, I am not exactly sure it's the correct way.
P.S.: I had upgraded to Luminous 12.2.2 yesterday.
Karun Josy
On Wed, Dec 13, 2017 at 4:31 PM, Nick Fisk <n...@fisk.me.uk> wrote:
> Hi Karun,
>
>
>
> I too am experiencing something very similar with a PG
]
2
1.1e activating+remapped [2,3,1,10,0] 2 [2,3,1,10,5] 2
2.29 activating+remapped [1,0,13] 1 [1,8,11] 1
1.6f activating+remapped [8,2,0,4,13] 8 [8,2,4,13,1] 8
1.74 activating+remapped [7,13,2,0,4] 7 [7,13,2,4,1] 7
Karun Josy
Hello,
We added a new disk to the cluster, and while it is rebalancing we are getting
error warnings.
=
Overall status: HEALTH_ERR
REQUEST_SLOW: 1824 slow requests are blocked > 32 sec
REQUEST_STUCK: 1022 stuck requests are blocked > 4096 sec
==
The load in the servers seems to
Hi Lars, Sean,
Thank you for your response.
The cluster health is ok now! :)
Karun Josy
On Thu, Dec 7, 2017 at 3:35 PM, Sean Redmond <sean.redmo...@gmail.com>
wrote:
> Can you share - ceph osd tree / crushmap and `ceph health detail` via
> pastebin?
>
> Is recovery stuck
Hello,
I am seeing a health error in our production cluster.
health: HEALTH_ERR
1105420/11038158 objects misplaced (10.015%)
Degraded data redundancy: 2046/11038158 objects degraded
(0.019%), 102 pgs unclean, 2 pgs degraded
Degraded data redundancy (low space):
1.0 894G 423G 470G 47.34 1.09 117
18 ssd 0.87320 1.0 894G 403G 490G 45.18 1.04 120
21 ssd 0.87320 1.0 894G 444G 450G 49.67 1.15 130
TOTAL 23490G 10170G 13320G 43.30
Karun Josy
On Tue, Dec 5, 2017 at 4:42 AM, Karun Josy <karu
Thank you for the detailed explanation!
I have got another doubt.
This is the total space available in the cluster:
TOTAL: 23490G
Used: 10170G
Avail: 13320G
But ecpool shows MAX AVAIL as just 3 TB.
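As a rough sketch of the arithmetic (assuming the usual k/(k+m) overhead and that MAX AVAIL is projected from the fullest OSDs rather than the average): with k=5, m=3 only 5 of every 8 written chunks are data, so 13320G of raw free space corresponds to at most about 13320 x 5 / 8 ≈ 8325G of payload, and the figure shrinks further toward the reported ~3T once the most-utilized OSDs backing the pool become the limiting factor.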
Karun Josy
On Tue, Dec 5, 2017 at 1:06 AM, David Turner <drakonst...@gmail.com> wrote:
>
nodes, with 3 disks each. We are planning to add 2 more on
each node.
If I understand correctly, I can add 3 disks at once, right, assuming
3 disks can fail at a time as per the EC profile?
Karun Josy
On Tue, Dec 5, 2017 at 12:06 AM, David Turner <drakonst...@gmail.com>
Hi,
Is it recommended to add OSD disks one by one, or can I add a couple of disks
at a time?
Current cluster size is about 4 TB.
Karun
Hi,
One OSD in the cluster is down. I tried to restart the service, but it's still
failing.
I can see the below error in the log file. Could this be a hardware issue?
-
-9> 2017-11-23 09:47:37.768969 7f368686a700 3 rocksdb:
Hi,
Just a not so significant doubt :)
We have a cluster with 1 admin server, 3 monitors and 8 OSD nodes.
The admin server is used to deploy the cluster.
What if the admin server permanently fails?
Will it affect the cluster?
Karun
Thanks!
Karun Josy
On Wed, Nov 22, 2017 at 5:44 AM, Jean-Charles Lopez <jelo...@redhat.com>
wrote:
> Hi,
>
> to check a current value use the following command on the machine where
> the OSD you want to check is running
>
> ceph daemon osd.{id} config show | grep {
Hello,
We added a couple of OSDs to the cluster and the recovery is taking a long time.
So I tried to increase the osd_max_backfills value dynamically, but it says
the change may need a restart.
$ ceph tell osd.* injectargs '--osd-max-backfills 5'
osd.0: osd_max_backfills = '5' osd_objectstore =
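A sketch of confirming whether the injected value actually took effect, via the admin socket on the OSD host (osd.0 is just an example id):
--
ceph daemon osd.0 config show | grep osd_max_backfills
--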
Any suggestions ?
Karun Josy
On Mon, Nov 13, 2017 at 10:06 PM, Karun Josy <karunjo...@gmail.com> wrote:
> Hi,
>
> Is there any way we can change or reuse pool IDs?
> I had created and deleted a lot of test pools. So the IDs kind of look like
> this now:
>
> --
29848 43 149240 0
00 66138389 2955G 1123900 539G
Karun Josy
On Tue, Nov 14, 2017 at 4:16 AM, Karun Josy <karunjo...@gmail.com> wrote:
> Hello,
>
> Recently, I deleted all the disks from an erasure pool 'ecpool'.
> The pool is empty. However the
34 90430M 0.62 4684G 22789
Karun Josy
Hi,
Is there any way we can change or reuse pool IDs?
I had created and deleted a lot of test pools. So the IDs kind of look like
this now:
---
$ ceph osd lspools
34 imagepool,37 cvmpool,40 testecpool,41 ecpool1,
--
Can I change them to 0, 1, 2, 3, etc.?
Karun
Hi,
Do you think there is a way for Ceph to disconnect an HV client from a
cluster?
We want to prevent the possibility of two HVs running the same VM.
When an HV crashes, we have to make sure that when the
VMs are started on a new HV, the disk is not open on the crashed HV.
I can see
Hello everyone! :)
I have an interesting problem. For a few weeks, we've been testing Luminous
in a cluster made up of 8 servers with about 20 SSD disks almost evenly
distributed. It is running erasure coding.
Yesterday, we decided to bring the cluster to a minimum of 8 servers and 1
disk
Thank you for the reply.
There are 8 OSD nodes with 23 OSDs in total. (However, they are not
distributed equally across all nodes.)
So it satisfies that criterion, right?
Karun Josy
On Tue, Oct 24, 2017 at 12:30 AM, LOPEZ Jean-Charles <jelo...@redhat.com>
wrote:
> Hi,
>
> yes y
Hi,
While creating a pool with the erasure code profile k=10, m=4, I get the PG status
"200 creating+incomplete".
While creating a pool with the profile k=5, m=3, it works fine.
The cluster has 8 OSD nodes with 23 disks in total.
Are there any requirements for setting the first profile?
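For context, a hedged sketch of why k=10, m=4 can stall at creating+incomplete here: each PG needs k+m = 14 distinct failure domains, and with the default crush-failure-domain=host an 8-host cluster cannot place 14 chunks. A workaround for test setups (at the cost of host-level fault tolerance) is an osd failure domain; the profile and pool names below are placeholders:
--
ceph osd erasure-code-profile set ec-10-4-osd k=10 m=4 crush-failure-domain=osd
ceph osd pool create ecpool104 256 256 erasure ec-10-4-osd
--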
Karun