[ceph-users] Re: ceph daemon mgr.# dump_osd_network: no valid command found

2020-12-06 Thread Eugen Block

Hi Frank,

I responded to a recent thread [1] about this; you should be able to  
run that command for an OSD daemon instead.
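
A minimal sketch of what that looks like (osd.0 is only an example id; the  
command has to be run on the host where that OSD's admin socket lives, and  
if I remember correctly the optional argument is a threshold in milliseconds):

ceph daemon osd.0 dump_osd_network
ceph daemon osd.0 dump_osd_network 100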


Regards,
Eugen

[1]  
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/A3NZ4DBMJT2KZQG5SMK6YXYZHIIBFBJW/



Quoting Frank Schilder :


Dear cephers,

I see "Long heartbeat ping times on back interface seen" in ceph  
status and ceph health detail says that I should "Use ceph daemon  
mgr.# dump_osd_network for more information". I tried, but it seems  
this command was removed during the upgrade from mimic 13.2.8 to 13.2.10:


[root@ceph-01 ~]# ceph daemon mgr.ceph-01 dump_osd_network
no valid command found; 10 closest matches:
log flush
log dump
git_version
get_command_descriptions
kick_stale_sessions
help
config unset 
config show
dump_mempools
dump_cache
admin_socket: invalid command

Has this been replaced by some other command?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph daemon mgr.# dump_osd_network: no valid command found

2020-12-06 Thread Eugen Block
I didn't look very closely, but I didn't find a tracker issue for this,  
so maybe we should create one. I thought the OP of the thread I  
responded to would have done that, but apparently not. Do you want to  
create it?



Quoting Frank Schilder :

Thanks! I guess the message in ceph health detail should be changed  
then. Is this already on the list?


Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: 06 December 2020 12:41:00
To: ceph-users@ceph.io
Subject: [ceph-users] Re: ceph daemon mgr.# dump_osd_network: no  
valid command found


Hi Frank,

I responded to a recent thread [1] about this, you should be able to
run that command for an OSD daemon.

Regards,
Eugen

[1]
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/A3NZ4DBMJT2KZQG5SMK6YXYZHIIBFBJW/


Quoting Frank Schilder :


Dear cephers,

I see "Long heartbeat ping times on back interface seen" in ceph
status and ceph health detail says that I should "Use ceph daemon
mgr.# dump_osd_network for more information". I tried, but it seems
this command was removed during the upgrade from mimic 13.2.8 to 13.2.10:

[root@ceph-01 ~]# ceph daemon mgr.ceph-01 dump_osd_network
no valid command found; 10 closest matches:
log flush
log dump
git_version
get_command_descriptions
kick_stale_sessions
help
config unset 
config show
dump_mempools
dump_cache
admin_socket: invalid command

Has this been replaced by some other command?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph daemon mgr.# dump_osd_network: no valid command found

2020-12-06 Thread Frank Schilder
Dear cephers,

I see "Long heartbeat ping times on back interface seen" in ceph status and 
ceph health detail says that I should "Use ceph daemon mgr.# dump_osd_network 
for more information". I tried, but it seems this command was removed during 
the upgrade from mimic 13.2.8 to 13.2.10:

[root@ceph-01 ~]# ceph daemon mgr.ceph-01 dump_osd_network
no valid command found; 10 closest matches:
log flush
log dump
git_version
get_command_descriptions
kick_stale_sessions
help
config unset 
config show
dump_mempools
dump_cache
admin_socket: invalid command

Has this been replaced by some other command?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-06 Thread Janek Bevendorff




(Only one of our test clusters saw this happen so far, during mimic
days, and this provoked us to move all MDSs to 64GB VMs, with mds
cache mem limit = 4GB, so there is a large amount of RAM available in
case it's needed.)


Ours are running on machines with 128GB RAM. I tried limits between 4 
and 40GB. But the higher the limit, the higher the fall after a crash.


We used to have three MDSs, now I am testing out one to see if that's 
more stable. At the moment, it runs fine, but we have also outsourced 
all the heavy lifting to S3.




What do the crashes look like with the TF training? Do you have a tracker?


At some point the MDS becomes laggy and is killed and not even the hot 
standby is able to resume. There is nothing special going on. You only 
notice that the FS is suddenly degraded and MDS daemons are playing 
Russian roulette until systemd pulls the plug due to too many daemon 
failures. At that point I have to fail the remaining ones, run systemctl 
reset-failed, delete the openfiles objects and restart the daemons.



How many client sessions does it take to crash an MDS?


Depends. Surprisingly, it can be as little as one big node with 1.5TB of 
RAM and a few hungry GPUs.




-- Dan



I guess we're pretty lucky with our CephFS because we have more than
1k clients and it is pretty solid (though the last upgrade had a
hiccup when decreasing down to a single active MDS).

-- Dan



On Fri, Dec 4, 2020 at 8:20 PM Janek Bevendorff
 wrote:

This is a very common issue. Deleting mdsX_openfiles.Y has become part of
my standard maintenance repertoire. As soon as you have a few more
clients and one of them starts opening and closing files in rapid
succession (or does other metadata-heavy things), it becomes very likely
that the MDS crashes and is unable to recover.
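
For reference, a sketch of that procedure (the pool name is a placeholder,
and there is one mdsX_openfiles.Y object per MDS rank X): stop all MDS
daemons, then

rados -p cephfs_metadata_pool ls | grep openfiles
rados -p cephfs_metadata_pool rm mds0_openfiles.0

and start a single MDS again.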

There have been numerous fixes in the past, which improved the overall
stability, but it is far from perfect. I am happy to see another patch
in that direction, but I believe more effort needs to be spent here. It
is way too easy to DoS the MDS from a single client. Our 78-node CephFS
beats our old NFS RAID server in terms of throughput, but latency and
stability are way behind.

Janek

On 04/12/2020 11:39, Dan van der Ster wrote:

Excellent!

For the record, this PR is the plan to fix this:
https://github.com/ceph/ceph/pull/36089
(nautilus, octopus PRs here: https://github.com/ceph/ceph/pull/37382
https://github.com/ceph/ceph/pull/37383)

Cheers, Dan

On Fri, Dec 4, 2020 at 11:35 AM Anton Aleksandrov  wrote:

Thank you very much! This solution helped:

Stop all MDS, then:
# rados -p cephfs_metadata_pool rm mds0_openfiles.0
then start one MDS.

We are back online. Amazing!!!  :)


On 04.12.2020 12:20, Dan van der Ster wrote:

Please also make sure the mds_beacon_grace is high on the mon's too.
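
A sketch of one way to do that, assuming the Mimic-or-newer central config
database (600 is just an example value):

ceph config set mon mds_beacon_grace 600
ceph config set mds mds_beacon_grace 600

The same value can also be injected at runtime with
ceph tell mon.* injectargs '--mds_beacon_grace=600'.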

It doesn't matter which MDS you select to be the running one.

Is the process getting killed and restarted?
If you're confident that the mds is getting OOM killed during rejoin
step, then you might find this useful:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-August/028964.html

Stop all MDS, then:
# rados -p cephfs_metadata_pool rm mds0_openfiles.0
then start one MDS.

-- Dan

On Fri, Dec 4, 2020 at 11:05 AM Anton Aleksandrov  wrote:

Yes, the MDS eats all memory plus swap, stays like this for a moment and then
frees the memory.

mds_beacon_grace was already set to 1800

Also, on the other one this message is seen: "Map has assigned me to become a
standby."

Does it matter which MDS we stop and which we leave running?

Anton


On 04.12.2020 11:53, Dan van der Ster wrote:

How many active MDS's did you have? (max_mds == 1, right?)

Stop the other two MDS's so you can focus on getting exactly one running.
Tail the log file and see what it is reporting.
Increase mds_beacon_grace to 600 so that the mon doesn't fail this MDS
while it is rejoining.

Is that single MDS running out of memory during the rejoin phase?

-- dan

On Fri, Dec 4, 2020 at 10:49 AM Anton Aleksandrov  wrote:

Hello community,

we are on ceph 13.2.8 - today something happened with one MDS and ceph
status reports that the filesystem is degraded. It won't mount either. I have
taken down the server with the MDS that was not working. There are 2 more MDS
servers, but they stay in "rejoin" state. Also, only 1 is shown in
"services", even though there are 2.

Both running MDS servers have these lines in their logs:

heartbeat_map is_healthy 'MDSRank' had timed out after 15
mds.beacon.mds2 Skipping beacon heartbeat to monitors (last acked
28.8979s ago); MDS internal heartbeat is not healthy!

On one of MDS nodes I enabled more detailed debug, so I am getting there
also:

mds.beacon.mds3 Sending beacon up:standby seq 178
mds.beacon.mds3 received beacon reply up:standby seq 178 rtt 0.00068

It makes no sense and there is too much stress in my head... Could anyone help, please?

Anton.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@c

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-06 Thread Frank Schilder
Just to update the case for others: Setting

ceph config set osd/class:ssd osd_recovery_sleep 0.001
ceph config set osd/class:hdd osd_recovery_sleep 0.05

had the desired effect. I'm running another massive rebalancing operation right 
now and these settings seem to help. It would be nice if one could use a pool 
name in a filter though (osd/pool:NAME). I have 2 different pools on the same 
SSDs and only objects from one of these pools require the lower sleep setting.
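
A quick way to double-check what an individual OSD actually uses (a sketch;
osd.0 is only an example, the first command has to run on that OSD's host,
and the second may need a release newer than Mimic):

ceph daemon osd.0 config get osd_recovery_sleep
ceph config get osd.0 osd_recovery_sleep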

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Joachim Kraftmayer 
Sent: 03 December 2020 16:49:51
To: 胡 玮文; Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Increase number of objects in flight during 
recovery

Hi Frank,

These are the values we used to reduce the recovery impact before Luminous:

#reduce recovery impact
osd max backfills
osd recovery max active
osd recovery max single start
osd recovery op priority
osd recovery threads
osd backfill scan max
osd backfill scan min

I do not know how many OSDs and PGs you have in your cluster, but the
backfill performance depends on the number of OSDs, PGs and objects per PG.

Regards, Joachim

___

Clyso GmbH

On 03.12.2020 at 12:35, 胡 玮文 wrote:
> Hi,
>
> There is an “OSD recovery priority” dialog box in the web dashboard. 
> The configuration options it will change include:
>
> osd_max_backfills
> osd_recovery_max_active
> osd_recovery_max_single_start
> osd_recovery_sleep
>
> Tuning these configs may help. “High” priority corresponds to 4, 4, 4 and 0, 
> respectively (see the sketch below). Some of these also have a _ssd/_hdd variant.
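
For reference, a sketch of what the “High” preset would look like when set by
hand, using the values quoted above (ceph config set assumes a Mimic or newer
cluster):

ceph config set osd osd_max_backfills 4
ceph config set osd osd_recovery_max_active 4
ceph config set osd osd_recovery_max_single_start 4
ceph config set osd osd_recovery_sleep 0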
>
>> On 3 December 2020 at 17:11, Frank Schilder  wrote:
>>
>> Hi all,
>>
>> I have the opposite problem as discussed in "slow down keys/s in recovery". 
>> I need to increase the number of objects in flight during rebalance. All 
>> remapped PGs are already in state backfilling, but it looks like no more 
>> than 8 objects/sec are transferred per PG at a time. The pool sits on 
>> high-performance SSDs and could easily handle a transfer of 100 or more 
>> objects/sec simultaneously. Is there any way to increase the number of 
>> transfers/sec or simultaneous transfers? Increasing the options 
>> osd_max_backfills and osd_recovery_max_active has no effect.
>>
>> Background: The pool in question (con-fs2-meta2) is the default data pool of 
>> a ceph fs, which stores exclusively the kind of meta data that goes into 
>> this pool. Storage consumption is reported as 0, but the number of objects 
>> is huge:
>>
>> NAME           ID  USED     %USED  MAX AVAIL    OBJECTS
>> con-fs2-meta1  12  216 MiB   0.02    933 GiB       1335
>> con-fs2-meta2  13      0 B   0       933 GiB  118389897
>> con-fs2-data   14  698 TiB  72.15    270 TiB  286826739
>>
>> Unfortunately, there were no recommendations on dimensioning PG numbers for 
>> this pool, so I used the same for con-fs2-meta1, and con-fs2-meta2. In 
>> hindsight, this was potentially a bad idea, the meta2 pool should have a 
>> much higher PG count or a much more aggressive recovery policy.
>>
>> I now need to rebalance PGs on meta2 and it is going way too slow compared 
>> with the performance of the SSDs it is located on. In a way, I would like to 
>> keep the PG count where it is, but increase the recovery rate for this pool 
>> by a factor of 10. Please let me know what options I have.
>>
>> Best regards,
>> =
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph daemon mgr.# dump_osd_network: no valid command found

2020-12-06 Thread Frank Schilder
Thanks! I guess the message in ceph health detail should be changed then. Is 
this already on the list?

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: 06 December 2020 12:41:00
To: ceph-users@ceph.io
Subject: [ceph-users] Re: ceph daemon mgr.# dump_osd_network: no valid command 
found

Hi Frank,

I responded to a recent thread [1] about this, you should be able to
run that command for an OSD daemon.

Regards,
Eugen

[1]
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/A3NZ4DBMJT2KZQG5SMK6YXYZHIIBFBJW/


Quoting Frank Schilder :

> Dear cephers,
>
> I see "Long heartbeat ping times on back interface seen" in ceph
> status and ceph health detail says that I should "Use ceph daemon
> mgr.# dump_osd_network for more information". I tried, but it seems
> this command was removed during the upgrade from mimic 13.2.8 to 13.2.10:
>
> [root@ceph-01 ~]# ceph daemon mgr.ceph-01 dump_osd_network
> no valid command found; 10 closest matches:
> log flush
> log dump
> git_version
> get_command_descriptions
> kick_stale_sessions
> help
> config unset 
> config show
> dump_mempools
> dump_cache
> admin_socket: invalid command
>
> Has this been replaced by some other command?
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph daemon mgr.# dump_osd_network: no valid command found

2020-12-06 Thread Frank Schilder
Can do. I hope I don't forget it :)

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Eugen Block 
Sent: 06 December 2020 13:55:13
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph daemon mgr.# dump_osd_network: no valid 
command found

I didn't look very closely but I didn't find a tracker issue for this
so maybe we should create one. I thought maybe OP from the thread I
responded to would have done that but apparently not. Do you want to
create it?


Quoting Frank Schilder :

> Thanks! I guess the message in ceph health detail should be changed
> then. Is this already on the list?
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Eugen Block 
> Sent: 06 December 2020 12:41:00
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: ceph daemon mgr.# dump_osd_network: no
> valid command found
>
> Hi Frank,
>
> I responded to a recent thread [1] about this, you should be able to
> run that command for an OSD daemon.
>
> Regards,
> Eugen
>
> [1]
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/A3NZ4DBMJT2KZQG5SMK6YXYZHIIBFBJW/
>
>
> Quoting Frank Schilder :
>
>> Dear cephers,
>>
>> I see "Long heartbeat ping times on back interface seen" in ceph
>> status and ceph health detail says that I should "Use ceph daemon
>> mgr.# dump_osd_network for more information". I tried, but it seems
>> this command was removed during the upgrade from mimic 13.2.8 to 13.2.10:
>>
>> [root@ceph-01 ~]# ceph daemon mgr.ceph-01 dump_osd_network
>> no valid command found; 10 closest matches:
>> log flush
>> log dump
>> git_version
>> get_command_descriptions
>> kick_stale_sessions
>> help
>> config unset 
>> config show
>> dump_mempools
>> dump_cache
>> admin_socket: invalid command
>>
>> Has this been replaced by some other command?
>>
>> Best regards,
>> =
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] guest fstrim not showing free space

2020-12-06 Thread Marc Roos



I have a 74GB VM with 34466MB of free space. But when I run 'fstrim /', 
'rbd du' still shows 60GB used.
When I fill the 34GB of free space with an image file, delete it and run 
fstrim again, 'rbd du' still shows 59GB used.

Is this normal? Or should I be able to get it to ~30GB used?








___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: guest fstrim not showing free space

2020-12-06 Thread Eugen Block

Have you also tried the ‚rbd sparsify‘ command? It worked for me.

https://docs.ceph.com/en/latest/man/8/rbd/
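
A rough sketch of the usage (pool and image names are placeholders):

rbd sparsify <pool>/<image>
rbd du <pool>/<image>

rbd du afterwards should report the reclaimed space.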


Quoting Marc Roos :


I have a 74GB VM with 34466MB of free space. But when I run 'fstrim /',
'rbd du' still shows 60GB used.
When I fill the 34GB of free space with an image file, delete it and run
fstrim again, 'rbd du' still shows 59GB used.

Is this normal? Or should I be able to get it to ~30GB used?








___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: guest fstrim not showing free space

2020-12-06 Thread Marc Roos

At first this worked [1], but after I removed the snapshot I am back at 
57GB [2]. I ran rbd sparsify again after the snapshot was removed, but 
it stayed the same.


[1]
NAME          PROVISIONED  USED
x...@xxx.bak  74 GiB       59 GiB
XXX           74 GiB       37 GiB
              74 GiB       96 GiB


[2]
NAME  PROVISIONED USED
XXX   74 GiB 57 GiB



-Original Message-
Cc: ceph-users
Subject: [ceph-users] Re: guest fstrim not showing free space

Have you also tried the ‚rbd sparsify‘ command? It worked for me.

https://docs.ceph.com/en/latest/man/8/rbd/


Quoting Marc Roos :

> I have a 74GB VM with 34466MB of free space. But when I run 'fstrim /', 
> 'rbd du' still shows 60GB used.
> When I fill the 34GB of free space with an image file, delete it and run 
> fstrim again, 'rbd du' still shows 59GB used.
>
> Is this normal? Or should I be able to get it to ~30GB used?
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Garbage Collection on Luminous

2020-12-06 Thread Priya Sehgal
Hi,
I have a production cluster that has been experiencing a lot of DELETEs for
many months. However, with the default GC configs I did not see the
cluster space utilization going down. Moreover, the gc list has more than 4
million objects. I tried increasing the GC configs on 4 rados gateways and
triggered manual GC using the command:

*sudo radosgw-admin gc process*


The changed GC configs are as follows:

rgw gc max objs = 100
rgw gc obj min wait = 3600
rgw gc processor max time = 900
rgw gc processor period = 3600
rgw gc max concurrent io = 20
rgw gc max trim chunk = 64


*Questions*

1. Since this is an active cluster serving both PUT and DELETE requests, I am
unable to determine whether manual GC is indeed helping. I did see an increase
in free space one day, but over the past few days I have not seen any further
increase. How do I determine whether GC is really running?


*sudo radosgw-admin gc list | grep oid | wc -l *

The list returned by this command is always increasing, and I do see very old
objects when I grep for oid.
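
For what it's worth, a sketch of how the expired and total queues could be
compared (this assumes --include-all is available on this Luminous build; if I
recall correctly, the plain listing only shows entries whose wait period has
already expired):

sudo radosgw-admin gc list | grep oid | wc -l
sudo radosgw-admin gc list --include-all | grep oid | wc -l

If the first count keeps growing, GC is not keeping up with the expired
entries.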


2. Does GC have some problems running automatically on Luminous? It is
enabled.

3. Are the above configs fine or can they be made more aggressive?

4. Is there a faster way to reclaim space that is pending GC without
impacting performance too much?


Thanks and Regards,

Priya

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io