The actual mount command doesn't hang; we just can't interact with any of
the directory's contents once mounted. I couldn't find anything unusual in
the logs.
Best,
Justin Lee
On Fri, Aug 2, 2024 at 10:38 AM Dhairya Parmar wrote:
> So the mount hung? Can you see anything
Hi Dhairya,
Thanks for the response! We tried removing it as you suggested with `rm
-rf`, but the command just hangs indefinitely with no output. We are also
unable to `ls lost+found`, or otherwise interact with the directory's
contents.
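A quick way to see whether the MDS is sitting on these requests, or has already flagged damage, is a minimal sketch like the following (the daemon name is a placeholder):

  ceph health detail               # look for MDS_SLOW_REQUEST / damage warnings
  ceph daemon mds.<mds-name> ops   # run on the MDS host; shows in-flight requests, including the stuck rm/ls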
Best,
Justin Lee
On Fri, Aug 2, 2024 at 8:24 AM Dh
After we updated our Ceph cluster from 17.2.7 to 18.2.0, the MDS kept being
marked as damaged and stuck in up:standby, with these errors in the log:
debug-12> 2024-07-14T21:22:19.962+ 7f020cf3a700 1
mds.0.cache.den(0x4 1000b3bcfea) loaded already corrupt dentry:
[dentry #0x1/lost+found/1000
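For "loaded already corrupt dentry" entries under lost+found, the usual starting point, assuming a rank can be brought back up/active, is the MDS damage table and a scrub with the repair flag. A hedged sketch, with a placeholder filesystem name:

  ceph tell mds.<fsname>:0 damage ls
  ceph tell mds.<fsname>:0 scrub start /lost+found recursive,repair,force
  ceph tell mds.<fsname>:0 scrub status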
Hello, I wrote a Lua script to log RGW request information such as the bucket
name, bucket owner, etc.
However, when I apply the Lua script using the command below, I do not
see any log lines starting with "Lua: INFO":
radosgw-admin script put --infile=/usr/tmp/testPreRequest.lua
--context=postrequest
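Two things worth checking are that the script was actually stored, and (an assumption on my part) that the RGW debug level is high enough for the Lua log lines to be emitted at all:

  radosgw-admin script get --context=postrequest   # should print the stored script back
  ceph config set client.rgw debug_rgw 20          # assumption: Lua log output only appears at a high debug level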
On Wed, 2023-05-17 at 17:23 +, Marc wrote:
> >
> >
> > In fact, when we start up the cluster, we don't have DNS available
> > to
> > resolve the IP addresses, and for a short while, all OSDs are
> > located
> > in a new host called "localhost.localdomain". At that point, I
> > fixed
> > it b
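For OSDs landing under a wrong host bucket when name resolution isn't ready, one common workaround (a sketch only, not necessarily what was done here) is to pin the CRUSH location explicitly instead of relying on the hostname, or to stop OSDs from updating their location at start:

  # in ceph.conf on each OSD host (the host name is an example):
  [osd]
  crush location = root=default host=ceph-node-01

  # or, cluster-wide:
  ceph config set osd osd_crush_update_on_start false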
we could fix it and get the OSDs to use cluster network to do
heartbeat checks. Any help would be highly appreciated. Thank you
very much.
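For reference, a minimal sketch of the kind of change we are after, and of verifying which addresses the OSDs registered afterwards (the subnet is only an example):

  ceph config set global cluster_network 192.168.100.0/24
  # restart the OSDs, then check the registered addresses:
  ceph osd metadata 0 | grep addr   # shows back_addr / hb_back_addr etc.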
Cheers, Hong
--
Hurng-Chun (Hong) Lee, PhD
ICT manager
Donders Institute for Brain, Cognition and Behaviour,
Centre for Cognitive Neuroimaging
Radboud Univ
Dear experts,
Sorry, I forgot to mention that the initial symptom is that those OSDs
will suffer: "wait_auth_rotating timed out" and "unable to obtain
rotating service keys; retrying"
I then increased rotating_keys_bootstrap_timeout, but it doesn't really
help.
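Since "wait_auth_rotating timed out" is often a symptom of clock skew between the daemons and the monitors rather than of the timeout value itself, a quick way to rule that out is a sketch like:

  ceph time-sync-status              # monitor-side view of clock skew
  ceph health detail | grep -i skew
  chronyc tracking                   # on the affected OSD hosts, if chrony is in use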
Best
ithout_osd_lock
The ceph version is Octopus: 15.2.17.
OSD storage backend: bluestore
OS: CentOS7 64bit.
Any idea?
Thanks
&
Best regards,
Felix Lee ~
--
Felix H.T Lee Academia Sinica Grid & Cloud.
Tel: +886-2-27898308
Office: Room P111, Institute of Phy
Much appreciated.
From: Adam King
Sent: 28 October 2022 19:25
To: Lee Carney
Cc: Wyll Ingersoll; ceph-users@ceph.io
Subject: [**SPAM**] [ceph-users] Re: cephadm node-exporter extra_container_args
for textfile_collector
We had actually considered adding an
> resolved later, it's probably not critical at the moment.
> If the slow requests resolve you can repair one PG at a time after
> inspecting the output of 'rados -p list-inconsistent-obj
> '.
>
>
> Zitat von Frank Lee :
>
> > Hi again,
> >
&
Hi again,
My Ceph cluster came up with this warning a while ago: 3 pgs not deep-scrubbed in time.
I googled and tried increasing osd_scrub_begin_hour and osd_scrub_end_hour, but
that does not seem to work.
There was a discussion on the Proxmox forum about a similar situation; that user ran
"ceph osd repair all" and got it fixed, but it doesn't seem to work a da
y etc
From: Wyll Ingersoll
Sent: 28 October 2022 15:19:17
To: Lee Carney; ceph-users@ceph.io
Subject: Re: cephadm node-exporter extra_container_args for textfile_collector
I ran into the same issue - wanted to add the textfile.directory to the
node_exporter using "extra_cont
Has anyone had success in using cephadm to add extra_container_args onto the
node-exporter config? For example changing the collector config.
I am trying and failing using the following:
1. Create ne.yml
service_type: node-exporter
service_name: node-exporter
placement:
  host_pattern: '*'
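For reference, a sketch of the spec format being attempted, applied with ceph orch. Whether the node-exporter service actually honours extra_container_args depends on the cephadm version (this thread suggests it did not at the time), and the bind-mount path is only an example:

  # ne.yml
  service_type: node-exporter
  service_name: node-exporter
  placement:
    host_pattern: '*'
  extra_container_args:
    - "-v"
    - "/var/lib/node_exporter/textfile:/var/lib/node_exporter/textfile:ro"

  # apply the spec:
  ceph orch apply -i ne.yml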
gives us good motivation to speed up the Ceph upgrade.
Again, thank you all for the great input
&
Best regards,
Felix Lee ~
On 5/17/22 19:41, Dan van der Ster wrote:
Hi Felix,
"rejoin" took awhile in the past because the MDS needs to reload all
inodes for all the open directorie
ere is any way for us to
estimate the rejoin time? So that we can decide whether to wait or take
proactive action if necessary.
Best regards,
Felix Lee ~
On 5/17/22 16:15, Jos Collin wrote:
I suggest you upgrade the cluster to the latest release [1], as
Nautilus has reached EOL.
o 20 for a
while as ceph-mds.ceph16.log-20220516.gz
Thanks
&
Best regards,
Felix Lee ~
On 5/16/22 14:45, Jos Collin wrote:
It's hard to suggest anything without the logs. Enable verbose logging with debug_mds=20.
What's the ceph version? Do you have the logs why the MDS crashed?
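A minimal sketch of turning that logging on and off via the centralized config (it can also be set per daemon):

  ceph config set mds debug_mds 20   # verbose MDS logging
  ceph config set mds debug_ms 1     # optional: messenger-level logging
  # revert to the default once the logs are captured:
  ceph config set mds debug_mds 1/5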
On 16/05/22 11:
oin time
and maybe improve it? Because we always need to give users an estimate of
the recovery time.
Thanks
&
Best regards,
Felix Lee ~
--
Felix H.T Lee Academia Sinica Grid & Cloud.
Tel: +886-2-27898308
Office: Room P111, Institute of Physics, 128 Academia
We had the exact same issue last week; in the end, unless the dataset can
fit in memory, it will never boot.
To be honest, this bug seems to be hitting quite a few people; in our case it
happened after a pg_num change on a pool.
In the end I had to manually export the PGs from the OSD and add them back
in
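For anyone searching later, the export/import step is done with ceph-objectstore-tool against stopped OSDs; a hedged sketch (the OSD ids, pgid and paths are examples only):

  # source OSD stopped:
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-51 \
      --pgid 4.1a --op export --file /mnt/backup/pg4.1a.export

  # destination OSD also stopped, restart it afterwards:
  ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-52 \
      --op import --file /mnt/backup/pg4.1a.export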
. Patrakov
wrote:
> Fri, 7 Jan 2022 at 06:21, Lee :
>
>> Hello,
>>
>> As per another post, I have been having a huge issue since a pg_num
>> increase took my cluster offline.
>>
>> I have got to the point where I have just 20 PGs down/unavailable due to
>>
PH to use the pg on the OSD to rebuild? When I query
the PG at the end, it complains about marking the offline OSD as offline?
I have looked online and cannot find a definitive guide on the process /
steps that should be taken.
Cheers
Lee
I tried with disk based swap on a SATA SSD.
I think that might be the last option. I have already exported all the down
PGs from the OSDs they are waiting for.
Kind Regards
Lee
On Thu, 6 Jan 2022 at 20:00, Alexander E. Patrakov
wrote:
> Fri, 7 Jan 2022 at 00:50, Alexander E.
"bytes": 4854818176
},
"osdmap": {
"items": 3792,
"bytes": 140872
},
"osdmap_mapping": {
"items": 0,
"bytes": 0
d memory target", and "mds
> cache memory limit". Osd processes have become noisy neighbors in the last
> few versions.
>
>
>
> On Wed, Jan 5, 2022 at 1:47 PM Lee wrote:
>
>> I'm not rushing,
>>
>> I have found the issue, I am getting OOM
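A sketch of the memory-related knobs usually involved here (the values are examples, not recommendations):

  ceph config set osd osd_memory_target 2147483648       # e.g. 2 GiB per OSD while recovering
  ceph config set mds mds_cache_memory_limit 1073741824  # e.g. 1 GiB of MDS cache
  ceph daemon osd.51 dump_mempools                        # see where an OSD's memory is going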
bb-ceph-enc-rm63-osd03-31 init.scope Stopped Ceph
object storage daemon osd.51.
I have just physically increased the RAM in one of the nodes, removed the
other OSDs physically for now, and managed to get one of the 3 down OSDs to
come up. Just stepping through each at the moment.
Regards
Lee
rm63-osd03-31 init.scope ceph-osd@51.service:
Scheduled restart job, restart counter is at 2.
The problem is that this has basically taken the production and metadata SSD
pools down fully and all 3 copies are offline. And I cannot find a way to
determine what is causing these to crash.
Kind Regards
Lee
remapped"
comment: "not enough up instances of this PG to go active"
With only 2 OSDs out, a PG of the EC8+3 pool enters "down+remapped"
state. So, it seems that the min_size of an erasure-coded K+M pool
should be set to K+1, which ensures that the data is intact even o
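As a concrete illustration of that rule for an 8+3 pool (the pool name is a placeholder): with min_size=9 a PG stays active with up to two shards missing, and goes inactive, rather than continuing with no surviving redundancy, once a third shard is lost:

  ceph osd pool set <ec-pool> min_size 9   # k+1 for an 8+3 profile
  ceph osd pool get <ec-pool> min_size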
two more failures in the system
> without losing data (or losing access to data, given that min_size=k,
> though I believe it's recommended to set min_size=k+1).
>
> However, that sequence of acting sets doesn't make a whole lot of
> sense to me for a single OSD failure (thou
Hello,
I would like to know the maximum number of node failures for an EC8+3
pool in a 12-node cluster with 3 OSDs in each node. The size and
min_size of the EC8+3 pool are configured as 11 and 8, and OSDs of each
PG are selected by host. When there is no node failure, the maximum
number of node f
--
Hurng-Chun (Hong) Lee, PhD
ICT manager
Donders Institute for Brain, Cognition and Behaviour,
Centre for Cognitive Neuroimaging
Radboud University Nijmegen
e-mail: h@donders.ru.nl
tel: +31(0) 243610
Hi,
On Thu, 2020-07-02 at 16:15 +0200, Janne Johansson wrote:
On Thu, 2 Jul 2020 at 14:42, Lee, H. (Hurng-Chun)
<h@donders.ru.nl> wrote:
Hi,
We use the official Ceph RPM repository (http://download.ceph.com/rpm-
nautilus/el7) fo
t the official repo no longer provides RPM packages for
older versions? Thanks!
Cheers, Hong
--
Hurng-Chun (Hong) Lee, PhD
ICT manager
Donders Institute for Brain, Cognition and Behaviour,
Centre for Cognitive Neuroimaging
Radboud University Nijmegen
e-mail: h@donders.ru.nl
tel:
the actual stored data is less.
Is my interpretation correct? If so, does it mean that we will be
wasting a lot of space when we have a lot of files smaller than the object
size of 4MB in the system? Thanks for the help!
Cheers, Hong
--
Hurng-Chun (Hong) Lee, PhD
ICT manager
Donders Institute for
Hi
Ceph: nautilus (14.2.2)
NFS-Ganesha v 2.8
ceph-ansible stable 4.0 << git checkout 28th Aug
CentOS 7
I am trying to do a fresh installation using Ceph Ansible and I am
getting the following error when running the playbook. I have not
enabled or configured the dashboard/grafana/prometheus yet.
fatal