[ceph-users] Re: backfill_toofull after adding new OSDs

2019-08-30 Thread Frank Schilder
r, 208 op/s rd, 306 op/s wr recovery: 298 MiB/s, 156 objects/s Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: EC Compression

2019-09-02 Thread Frank Schilder
Compression is defined per pool and is independent of the replication type. You can choose depending on IO pattern. I use compression only on data pools. Meta data is usually small random IO and not very suited for compression, which might increase latency. In particular, when meta data is on SS
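
For reference, a minimal sketch of the per-pool compression settings described above; the pool name "data-ec" and the algorithm choice are placeholders, not taken from the thread:

    ceph osd pool set data-ec compression_algorithm snappy
    ceph osd pool set data-ec compression_mode aggressive

Leaving the metadata pool untouched keeps compression away from the small random IO mentioned above.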

[ceph-users] MDS blocked ops; kernel: Workqueue: ceph-pg-invalid ceph_invalidate_work [ceph]

2019-09-03 Thread Frank Schilder
Hi, I encountered a problem with blocked MDS operations and a client becoming unresponsive. I dumped the MDS cache, ops, blocked ops and some further log information here: https://files.dtu.dk/u/peQSOY1kEja35BI5/2010-09-03-mds-blocked-ops?l A user of our HPC system was running a job that create

[ceph-users] Re: ceph fs crashes on simple fio test

2019-09-03 Thread Frank Schilder
he client. An ordinary user should not have so much power in his hands. This makes it trivial to destroy a ceph cluster. This very short fio test is probably sufficient to reproduce the issue on any test cluster. Should I open an issue? Best regards, = Frank Schilder AIT Risø

[ceph-users] Re: forcing an osd down

2019-09-03 Thread Frank Schilder
it is possible to set the noup flag on a specific OSD, which is much safer. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: ceph-users on behalf of solarflow99 Sent: 03 September 2019 19:40:59 To: Ceph
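
A hedged sketch of the per-OSD noup flag mentioned here, as available on Mimic/Nautilus-era releases; osd.7 is an example id only:

    ceph osd add-noup osd.7     # the OSD will not be marked up while the flag is set
    ceph osd rm-noup osd.7      # clear the flag again after maintenance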

[ceph-users] Re: Proposal to disable "Avoid Duplicates" on all ceph.io lists

2019-09-04 Thread Frank Schilder
on-line, but I get only first and important e-mails to my inbox. Messages added on-line or sent by e-mail are always sent to everyone subscribed to a thread. Does ceph.io offer a similar feature? Best regards, ===== Frank Schilder AIT Risø Campus Bygni

[ceph-users] Re: ceph fs crashes on simple fio test

2019-09-20 Thread Frank Schilder
level becomes warn. - The cluster crunches through the meta data ops for a minute or so and then settles. This is quite a long time considering a 5 secs burst. - OSDs did not go out, but this could be due to not running the test long enough. Best regards, = Frank Sc

[ceph-users] moving EC pool from HDD to SSD without downtime

2019-09-30 Thread Frank Schilder
es? If not, what are my options? Thanks! = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: moving EC pool from HDD to SSD without downtime

2019-10-01 Thread Frank Schilder
while storage is fully redundant and r/w accessible. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Objects degraded after adding disks

2019-10-01 Thread Frank Schilder
ed to keep all PGs active with 9 out of 10 OSDs up and in? Why do undersized PGs arise even though all OSDs are up? Why do degraded objects arise even though no OSD was removed? Thanks! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 _

[ceph-users] Re: Objects degraded after adding disks

2019-10-03 Thread Frank Schilder
your help, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Robert LeBlanc Sent: 01 October 2019 17:13:41 To: Frank Schilder Cc: ceph-users Subject: Re: [ceph-users] Objects degraded after adding disks On Tue, Oct 1, 2019 at 5:2

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Frank Schilder
Your meta data PGs *are* backfilling. It is shown by the "61 keys/s" figure in the recovery I/O line of the ceph status output. If this is too slow, increase osd_max_backfills and osd_recovery_max_active. Or just have some coffee ... Best regards, ===== Frank Schilder AIT R
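
A minimal example of the tuning suggested above; the values are illustrative, not recommendations from the thread:

    ceph config set osd osd_max_backfills 4
    ceph config set osd osd_recovery_max_active 8
    # or inject into running OSDs without persisting:
    ceph tell osd.* injectargs '--osd_max_backfills 4 --osd_recovery_max_active 8'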

[ceph-users] Re: Nautilus: PGs stuck remapped+backfilling

2019-10-11 Thread Frank Schilder
SSD this should be OK. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eugen Block Sent: 11 October 2019 10:24 To: Frank Schilder Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Nautilus: PGs stuck remapped

[ceph-users] Re: Recovering from a Failed Disk (replication 1)

2019-10-17 Thread Frank Schilder
doesn't help and you really need that last bit of data, you might need support from one of those companies that restore disk data with electron microscopy. I successfully transferred OSDs between disks using ddrescue. Best regards, = Frank Schilder AIT Risø Campus Bygning 109

[ceph-users] Change device class in EC profile

2019-10-18 Thread Frank Schilder
-profile set sr-ec-6-2-ssd crush-device-class=ssd --force If not, how can I change the device class of the profile? Many thanks and best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14
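
To inspect what the profile and the pool currently reference, a sketch assuming the profile name from the post and a hypothetical pool name:

    ceph osd erasure-code-profile get sr-ec-6-2-ssd
    ceph osd pool get sr-rbd-data-one crush_rule   # pool name is a placeholder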

[ceph-users] Re: Change device class in EC profile

2019-10-18 Thread Frank Schilder
from. Hence, the pool info will drag outdated information along. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Maks Kowalik Sent: 18 October 2019 14:01 To: Frank Schilder Subject: Re: [ceph-users] Change device

[ceph-users] Re: Fwd: large concurrent rbd operations block for over 15 mins!

2019-10-22 Thread Frank Schilder
per disk, each VM getting 50IOPs write performance. This is probably what you would like to see as well. If you use replicated data pool, this should be relatively easy. With EC data pool, this is a bit of a battle. Good luck, ===== Frank Schilder AIT Risø Campus Bygning 109, rum

[ceph-users] Re: Replace ceph osd in a container

2019-10-22 Thread Frank Schilder
this container I make all relevant hardware visible. You might also want to expose /var/run/ceph to be able to use admin sockets without hassle. This way, I separated admin operations from actual storage daemons and can modify and restart the admin container as I like. Best regards,

[ceph-users] Re: Fwd: large concurrent rbd operations block for over 15 mins!

2019-10-23 Thread Frank Schilder
r design philosophy is "get sufficient performance for small bucks" and EC does it for us. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Mark Nelson Sent: 22 October 2019 15:59:21 To: ceph-users@ceph.io Su

[ceph-users] Re: Change device class in EC profile

2019-10-24 Thread Frank Schilder
must not be changed ever. Device class and crush root should be safe to change as they are only relevant at pool creation. Sounds like what you write confirms this hypothesis. Best regards and thanks, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: Choosing suitable SSD for Ceph cluster

2019-10-24 Thread Frank Schilder
o look for DWPD>=1. In addition, as Martin writes, consider upgrading and deploy all new disks with bluestore. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Martin Verges Sent: 24 October 2019 21:2

[ceph-users] Re: mimic 13.2.6 too much broken connexions

2019-11-29 Thread Frank Schilder
and ca. 550 client nodes, accounting for about 1500 active ceph clients, 1400 cephfs and 170 RBD images. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Vincent Godin Sent: 27 November 2019 20:11:23 To: Anthony

[ceph-users] getfattr problem on ceph-fs

2019-12-10 Thread Frank Schilder
h-01 was kickstarted with Centos7.6 while gnosis was kickstarted with Centos7.7. Otherwise, both machines are deployed identically. getfattr is the same version on both. Kernel versions are ceph-01:5.0.2-1.el7.elrepo.x86_64 and gnosis:5.4.2-1.el7.elrepo.x86_64. Does anyone ha

[ceph-users] Re: getfattr problem on ceph-fs

2019-12-10 Thread Frank Schilder
Thanks for the fast answer! Is there any (other) way to get a complete list of extended attributes? Is there something documented - meaning what can I rely on in the future? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14
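
The question above is about enumerating the ceph virtual xattrs; as a fallback they can be queried individually by name, as in this sketch with placeholder paths:

    getfattr -n ceph.dir.rbytes /mnt/cephfs/some/dir
    getfattr -n ceph.dir.layout /mnt/cephfs/some/dir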

[ceph-users] Re: getfattr problem on ceph-fs

2019-12-11 Thread Frank Schilder
gards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: David Disseldorp Sent: 10 December 2019 15:55:07 To: Frank Schilder Cc: Yan, Zheng; ceph-users Subject: Re: [ceph-users] Re: getfattr problem on ceph-fs Hi, On Tue, 1

[ceph-users] Re: list CephFS snapshots

2019-12-17 Thread Frank Schilder
Have you tried "ceph daemon mds.NAME dump snaps" (available since mimic)? ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Lars Täuber Sent: 17 December 2019 12:32:34 To: Stephan Mueller Cc: ceph-users@ceph.io Subj
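
Usage sketch of the command quoted above; the daemon name is a placeholder and the command has to run on the host carrying the active MDS:

    ceph daemon mds.ceph-01 dump snaps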

[ceph-users] Re: list CephFS snapshots

2019-12-17 Thread Frank Schilder
I think you can do a find for the inode (-inum n). At last I hope you can. However, I vaguely remember that there was a thread where someone gave a really nice MDS command for finding the path to an inode in no time. Best regards, = Frank Schilder AIT Risø Campus Bygning 109
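
A sketch of the inode lookup suggested here, with placeholder mount point and inode number:

    find /mnt/cephfs -inum 1099511627776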

[ceph-users] Re: list CephFS snapshots

2019-12-18 Thread Frank Schilder
ries ".snap" and "" at the top? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Lars Täuber Sent: 18 December 2019 09:24:02 To: Frank Schilder Cc: Marc Roos; ceph-users Subject: Re: [ceph-users] R

[ceph-users] Re: cephfs : write error: Operation not permitted

2020-01-23 Thread Frank Schilder
, allow rw pool=cephfs-data" an easy way is: - ceph auth export - add the caps with an editor - ceph auth import I consider this a bug and thought it was fixed in newer versions already. Best regards, ===== Frank Schilder AIT Risø Campus
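
A hedged sketch of the export/edit/import sequence described above; client.fsuser is an example entity name:

    ceph auth export client.fsuser -o client.fsuser.keyring
    # edit the caps lines in client.fsuser.keyring, e.g. add "allow rw pool=cephfs-data" to the osd caps
    ceph auth import -i client.fsuser.keyring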

[ceph-users] Re: cephfs : write error: Operation not permitted

2020-01-23 Thread Frank Schilder
out of the box. ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: 23 January 2020 09:10:41 To: Yoann Moulin; ceph-users Subject: [ceph-users] Re: cephfs : write error: Operation not permitted Hi Yoann, for some reas

[ceph-users] Re: cephfs : write error: Operation not permitted

2020-01-23 Thread Frank Schilder
== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Yoann Moulin Sent: 23 January 2020 10:38:42 To: Frank Schilder; ceph-users Subject: Re: [ceph-users] Re: cephfs : write error: Operation not permitted Hi Frank, >> for some re

[ceph-users] Re: cephfs : write error: Operation not permitted

2020-01-24 Thread Frank Schilder
//docs.ceph.com/docs/mimic/rados/operations/user-management/?highlight=pool%20tags Thanks and best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: cephfs : write error: Operation not permitted

2020-01-27 Thread Frank Schilder
Thanks a lot! I will fix the pool meta data and clean up my keys. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Ilya Dryomov Sent: 25 January 2020 09:01 To: Frank Schilder Cc: Yoann Moulin; ceph-users Subject

[ceph-users] ceph fs dir-layouts and sub-directory mounts

2020-01-29 Thread Frank Schilder
ent key with access restricted to "/a" The client will not be able to see the dir layout attribute set at "/", as it's not mounted. Will the data of this client still go to the pool "P", that is, does "/a" inherit the dir layout transparently to the client
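
Independent of the inheritance question, the layout can also be pinned on the subdirectory itself, so a client mounting only "/a" does not depend on the attribute at "/"; a sketch with placeholder paths and the pool name "P" from the post:

    setfattr -n ceph.dir.layout.pool -v P /mnt/cephfs/a
    getfattr -n ceph.dir.layout.pool /mnt/cephfs/a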

[ceph-users] Upgrading mimic 13.2.2 to mimic 13.2.8

2020-01-31 Thread Frank Schilder
Dear all, is it possible to upgrade from 13.2.2 directly to 13.2.8 after setting "ceph osd set pglog_hardlimit" (mimic 13.2.5 release notes), or do I need to follow this path: 13.2.2 -> 5 -> 6 -> 8 ? Thanks! = Frank Schilder AIT Risø Campus B

[ceph-users] Re: Upgrading mimic 13.2.2 to mimic 13.2.8

2020-01-31 Thread Frank Schilder
-mean-it} : set Error EINVAL: invalid command Not sure if this means anything problematic. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: 31 January 2020 11:08:35 To: ceph-users Subject: [ce

[ceph-users] Re: CephFS - objects in default data pool

2020-01-31 Thread Frank Schilder
see any activity on this pool with pool stats and neither are there any objects. Is there any way to check if anything is on this pool and how much storage it uses? "Ceph df" is not helping and neither is "rados ls", which is a bit of an issue when it comes to sizing. Best reg

[ceph-users] Re: CephFS - objects in default data pool

2020-01-31 Thread Frank Schilder
relevant, other objects might just be transient and therefore never seen. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 31 January 2020 18:19:34 To: Gregory Farnum; CASS Philip Cc: ceph

[ceph-users] Re: ceph fs dir-layouts and sub-directory mounts

2020-02-03 Thread Frank Schilder
a replicated pool for the default data pool ... to, for example, If erasure-coded pools are planned for the file system, it is strongly recommended to use a replicated pool for the default data pool ... Best regards, ===== Frank Schilder AIT Risø Ca

[ceph-users] Re: ceph fs dir-layouts and sub-directory mounts

2020-02-03 Thread Frank Schilder
errata: con-fs2-meta2 is the default data pool. = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 03 February 2020 10:08 To: Patrick Donnelly; Konstantin Shalygin Cc: ceph-users Subject: Re: [ceph-users] ceph

[ceph-users] Re: ceph fs dir-layouts and sub-directory mounts

2020-02-03 Thread Frank Schilder
Thumbs up for that! Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Patrick Donnelly Sent: 03 February 2020 11:18 To: Frank Schilder Cc: Konstantin Shalygin; ceph-users Subject: Re: [ceph-users] ceph fs dir

[ceph-users] osd_memory_target ignored

2020-02-04 Thread Frank Schilder
. Is this in conflict with allocator=bitmap? If so, what is the way to tune cache sizes (say if tcmalloc is not used/how to check?)? Are bluestore_cache_* indeed obsolete as the above release notes suggest, or is this not true? Many thanks for your help. Best regards, = Frank

[ceph-users] Re: osd_memory_target ignored

2020-02-04 Thread Frank Schilder
osd_memory_target 8589934592 Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Stefan Kooman Sent: 04 February 2020 16:34:34 To: Frank Schilder Cc: ceph-users Subject: Re: [ceph-users] osd_memory_target ignored Hi, Quoting Frank

[ceph-users] Re: osd_memory_target ignored

2020-02-05 Thread Frank Schilder
regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Stefan Kooman Sent: 04 February 2020 21:14:28 To: Frank Schilder Cc: ceph-users Subject: Re: [ceph-users] osd_memory_target ignored Quoting Frank Schilder (fr...@dtu.dk

[ceph-users] Re: osd_memory_target ignored

2020-02-05 Thread Frank Schilder
different default memory targets set for different device classes. Unfortunately, there seem not to be different memory_target_[device class] default options. Is there a good way to set different defaults while avoiding bloating "ceph config dump" unnecessarily? Best regards, ===== Fra

[ceph-users] Re: osd_memory_target ignored

2020-02-06 Thread Frank Schilder
Dear Stefan, thanks for your help. I opened these: https://tracker.ceph.com/issues/44010 https://tracker.ceph.com/issues/44011 Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Stefan Kooman Sent: 05 February

[ceph-users] Re: osd_memory_target ignored

2020-02-07 Thread Frank Schilder
For everybody reading this: It is already possible to set configuration options based on device class using masks, for example, ceph config set osd/class:hdd osd_memory_target 2147483648 will set a memory target for HDDs only. = Frank Schilder AIT Risø Campus Bygning 109
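
The same mask syntax works for other device classes, and the result can be verified in the configuration database; the values are examples only:

    ceph config set osd/class:ssd osd_memory_target 8589934592
    ceph config dump | grep osd_memory_target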

[ceph-users] Re: EC Pools w/ RBD - IOPs

2020-02-14 Thread Frank Schilder
ice outage on any maintenance event (min_size>=k+1), or risk data loss for fresh writes to non-redundant storage (min_size=k). = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Vitaliy Filippov Sent: 14 February 2020 00:10

[ceph-users] Move on cephfs not O(1)?

2020-03-26 Thread Frank Schilder
questions are: 1) Was there a change from 13.2.2 to 13.2.8 explaining this? 2) Are there (rare) conditions under which an mv on cephfs becomes a cp+rm? 3) Am I seeing ghosts? Thanks for clues and best regards, ===== Frank Schilder AIT Risø Campus Bygning 10

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Frank Schilder
nks! Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Gregory Farnum Sent: 26 March 2020 15:20:51 To: Beocat KSU Cc: ceph-users; Frank Schilder Subject: Re: [ceph-users] Re: Move on cephfs not O(1)? I was wonderin

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-26 Thread Frank Schilder
, do the move and set quotas back again? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 26 March 2020 16:35:07 To: Gregory Farnum; Beocat KSU Cc: ceph-users Subject: [ceph-users] Re: Move on

[ceph-users] How to migrate ceph-xattribs?

2020-03-26 Thread Frank Schilder
*find* everything with special ceph attributes? Thanks and best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: How to migrate ceph-xattribs?

2020-03-26 Thread Frank Schilder
problem of finding all locations of snapshots in a cephfs. This used to be a huge pain. Now there is MDS functionality and clever rados querying for this. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Gregory

[ceph-users] Re: How to migrate ceph-xattribs?

2020-03-27 Thread Frank Schilder
o/message/3HWU4DITVDF4IXDC2NETWS5E3EA4PM6Q/ is about this. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Gregory Farnum Sent: 26 March 2020 18:36 To: Frank Schilder Cc: ceph-users Subject: Re: [ceph-users] How

[ceph-users] Re: Combining erasure coding and replication?

2020-03-27 Thread Frank Schilder
ing codes (Reed-Solomon codes). From what I remember, the computational complexity of these codes explodes at least exponentially with m. Out of curiosity, how does m>3 perform in practice? What's the CPU requirement per OSD? Best regards, ===== Frank Schilder AIT Risø Cam

[ceph-users] Re: Move on cephfs not O(1)?

2020-03-27 Thread Frank Schilder
Thanks a lot! Have a good weekend. = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Gregory Farnum Sent: 27 March 2020 21:18:37 To: Jeff Layton Cc: Frank Schilder; Zheng Yan; ceph-users; Luis Henriques Subject: Re: [ceph-users

[ceph-users] Bluestore compression parameters in ceph.conf not used in mimic 13.2.8?

2020-04-01 Thread Frank Schilder
e the same amount of space as the compressed one as both will require the same allocation size. Did something change here? Are compressed blobs now co-located in allocations? Thanks for your help, ===== Frank Schilder AIT Risø Campus Bygning 10
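
Whatever ceph.conf contains, the values an OSD actually runs with can be checked through its admin socket, and the options can alternatively be set in the monitor configuration database; osd.0 is a placeholder:

    ceph daemon osd.0 config show | grep bluestore_compression
    ceph config set osd bluestore_compression_mode aggressive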

[ceph-users] Re: Bluestore compression parameters in ceph.conf not used in mimic 13.2.8?

2020-04-01 Thread Frank Schilder
Dear Igor, thanks, done: https://tracker.ceph.com/issues/44878 . = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Igor Fedotov Sent: 01 April 2020 12:14:02 To: Frank Schilder; ceph-users Subject: Re: [ceph-users] Bluestore

[ceph-users] Poor Windows performance on ceph RBD.

2020-04-02 Thread Frank Schilder
, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: Poor Windows performance on ceph RBD.

2020-04-03 Thread Frank Schilder
Dear Olivier, thanks for your answer. We are using the virtio driver already. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Olivier AUDRY Sent: 02 April 2020 18:20:49 To: Frank Schilder; ceph-users Subject

[ceph-users] Check if upmap is supported by client?

2020-04-13 Thread Frank Schilder
"features": "00ff", "entity_id": "con-fs2-hpc", "hostname": "sn253.hpc.ait.dtu.dk", "kernel_version": "3.10.0-957.12.2.el7.x86_64", "root"

[ceph-users] Re: Check if upmap is supported by client?

2020-04-13 Thread Frank Schilder
If so, what happens if a jewel client without this feature bit set tries to connect? 3) I guess that in case that as soon as an up-map table is created, only clients with this bit set can connect. In case we run into problems, is there a way to roll back? Many thanks and best regards,
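
A sketch of how the connected clients' feature bits can be inspected, and the guard that has to be set before relying on upmap:

    ceph features                                      # lists releases/feature bits of connected clients
    ceph osd set-require-min-compat-client luminous    # refuses older clients once set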

[ceph-users] Re: How to migrate ceph-xattribs?

2020-04-20 Thread Frank Schilder
ibs are for the purpose they are used for, with the above recursive structure it becomes very easy to find what one is looking for without having to remember anything. Its completely self-documenting. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ___

[ceph-users] Is "." a legal characted in device class or not?

2020-04-25 Thread Frank Schilder
ss ls-osd" command. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14
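
For context, the listing commands referenced above; "ssd" stands in for whatever class name is being tested:

    ceph osd crush class ls
    ceph osd crush class ls-osd ssd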

[ceph-users] Data loss by adding 2OSD causing Long heartbeat ping times

2020-04-25 Thread Frank Schilder
wever, I cannot risk data integrity of a production cluster and, therefore, probably not run the original procedure again. Many thanks for your help and best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
ent state active+undersized+degraded, last acting [234,3,2147483647,158,180,63,2147483647,181] SLOW_OPS 9788 slow ops, oldest one blocked for 2953 sec, daemons [osd.0,osd.100,osd.101,osd.112,osd.118,osd.133,osd.136,osd.142,osd.144,osd.145]... have slow ops. ===== Frank Schilder

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Dan van der Ster Sent: 05 May 2020 16:25:31 To: Frank Schilder Cc: ceph-users Subject: Re: [ceph-users] Ceph meltdown, need help Hi Frank, Could you share any ceph-osd logs and

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
= Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 05 May 2020 16:41:59 To: Dan van der Ster Cc: ceph-users Subject: [ceph-users] Re: Ceph meltdown, need help Dear Dan, thank you for your fast response. Please find the

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
I tried that and get: 2020-05-05 17:23:17.008 7fbbe700 0 -- 192.168.32.64:0/2061991714 >> 192.168.32.68:6826/5216 conn(0x7fbbf01d6f80 :-1 s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 l=1).handle_connect_reply connect got BADAUTHORIZER Strange. = Frank Sc

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
/was really busy trying to serve the IO. It lost beacons/heartbeats in the process or they got too old. Is there a way to pause client I/O? = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Dan van der Ster Sent: 05 May 2020 17

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
. = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Alex Gorbachev Sent: 05 May 2020 17:31:17 To: Frank Schilder Cc: Dan van der Ster; ceph-users Subject: Re: [ceph-users] Re: Ceph meltdown, need help On Tue, May 5, 2020 at 11:27 AM Frank Schilder

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
about that? Thanks! ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Dan van der Ster Sent: 05 May 2020 17:35:33 To: Frank Schilder Cc: ceph-users Subject: Re: [ceph-users] Ceph meltdown, need help OK those requires look correct.

[ceph-users] Re: Ceph meltdown, need help

2020-05-05 Thread Frank Schilder
. Thanks for all your quick help! Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Dan van der Ster Sent: 05 May 2020 17:45 To: Frank Schilder Cc: ceph-users Subject: Re: [ceph-users] Ceph meltdown, need help ceph

[ceph-users] Re: Data loss by adding 2OSD causing Long heartbeat ping times

2020-05-06 Thread Frank Schilder
and this will continue in its own thread "Cluster outage due to client IO" to be opened soon. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: 25 April 2020 15:34:25 To: ceph-users Subj

[ceph-users] Re: Ceph meltdown, need help

2020-05-06 Thread Frank Schilder
ell. Note that the OSDs in question were in their own sub-tree and all shut down at the same time. In the past (mimic 13.2.2), such OSDs were marked down after some time. This time (mimic 13.2.8), they stayed up and in for 4 to 5 days until I restarted them in a different crush location. Th

[ceph-users] Re: What's the best practice for Erasure Coding

2020-05-06 Thread Frank Schilder
ld data. Its going to be the dump yard. This means we will eventually get really good performance for the small amount of warm/hot data once the cluster grows enough. Hope that answered your questions. Best regards, ===== Frank Schilder AIT Risø Ca

[ceph-users] Re: cephfs change/migrate default data pool

2020-05-07 Thread Frank Schilder
is becomes important, but this would need to be answered by a developer. If you can't afford the downtime and are happy with how it works, I wouldn't bother too much. Would be nice to hear a bit more from someone with technical insight on code level. Best regards, ===== Frank Schi

[ceph-users] Re: Data loss by adding 2OSD causing Long heartbeat ping times

2020-05-07 Thread Frank Schilder
ew conversation "Cluster outage due to client IO" to have a clean focused thread. I need a bit more time to collect information though. For now, our cluster is up and running healthy. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: Data loss by adding 2OSD causing Long heartbeat ping times

2020-05-08 Thread Frank Schilder
On all OSD nodes I'm using vm.min_free_kbytes = 4194304 (4GB). This was one of the first tunings on the cluster. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Anthony D'Atri Sent: 08 May 2020 10:17
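
A sketch of how such a setting is typically applied and persisted; the file name under /etc/sysctl.d is arbitrary:

    sysctl -w vm.min_free_kbytes=4194304
    echo 'vm.min_free_kbytes = 4194304' > /etc/sysctl.d/90-ceph-osd.conf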

[ceph-users] Re: Cluster network and public network

2020-05-11 Thread Frank Schilder
on.ceph-01 config show | grep heart ... "osd_heartbeat_addr": "-", ... Is it actually possible to reserve a dedicated (third) VLAN with high QOS to heartbeat traffic by providing a per-host IP address to this parameter? What does this parameter do? Best regards, ==

[ceph-users] Yet another meltdown starting

2020-05-11 Thread Frank Schilder
. Its just an enormous load. I'm trying to increase # ceph config set global mon_mgr_beacon_grace 90 but the command doesn't complete. I guess because all the MGRs are out. Is there any way to force the MONs *not* to mark MGRs as unresponsive? Best regards, ===== Frank Sc

[ceph-users] Re: Yet another meltdown starting

2020-05-11 Thread Frank Schilder
oring anything. Best regads and thanks for any pointers. = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 11 May 2020 14:52:05 To: ceph-users Subject: [ceph-users] Yet another meltdown starting Hi all, an

[ceph-users] Re: Yet another meltdown starting

2020-05-11 Thread Frank Schilder
eacons are processed in a different way than heartbeats and that there is a critical bottleneck somewhere. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Lenz Grimmer Sent: 11 May 2020 17:50:34 To: ceph-users@cep

[ceph-users] Re: Yet another meltdown starting

2020-05-11 Thread Frank Schilder
MGR deliver the numbers? Well, is says HEALTH_WARN, so I really hope this is just missing stats and not complete service outage. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 11 May 2020 19:43:24

[ceph-users] Re: Cluster network and public network

2020-05-12 Thread Frank Schilder
=6211 mtu 9000 p2p2: flags=6211 mtu 9000 If you already have 2 VLANs with different IDs, then this flip-over is trivial. I did it without service outage. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: mj Sent:

[ceph-users] Re: Cluster network and public network

2020-05-13 Thread Frank Schilder
mendation would be to keep both networks if they are on different VLAN IDs. Then, nothing special is required to do the transition and this is what I did to simplify the physical networking (two logical networks, identical physical networking). Best regards, = Frank Schilder A

[ceph-users] Re: What is a pgmap?

2020-05-14 Thread Frank Schilder
: 80.80 M objects, 195 TiB usage: 249 TiB used, 1.5 PiB / 1.8 PiB avail pgs: 2543 active+clean, 2 active+clean+scrubbing+deep io: client: 20 MiB/s rd, 21 MiB/s wr, 578 op/s rd, 1.08 kop/s wr Thanks for any info! = Frank Schilder AIT Risø Campus

[ceph-users] Re: What is a pgmap?

2020-05-14 Thread Frank Schilder
ected. Thanks and best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Frank Schilder Sent: 14 May 2020 12:37 To: Nghia Viet Tran; Bryan Henderson; Ceph users mailing list Subject: [ceph-users] Re: What is a pgmap? Hi,

[ceph-users] Re: Ceph meltdown, need help

2020-05-14 Thread Frank Schilder
will not continue this here, but rather prepare another thread "Cluster outage due to client IO" after checking network hardware. It looks as if two MON+MGR nodes are desperately trying to talk to each other but fail. And this after only 1.5 years of relationship :) Thanks for making

[ceph-users] Re: Reweighting OSD while down results in undersized+degraded PGs

2020-05-19 Thread Frank Schilder
clean, set the weight back to 0 and now the OSD will be vacated as expected. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Andras Pataki Sent: 18 May 2020 22:25:37 To: ceph-users Subject: [ceph-users] Reweighting
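
A minimal sketch of the reweight step discussed here; osd.12 is an example id, and as noted in the thread the OSD has to be up and able to peer for the data movement to actually happen:

    ceph osd crush reweight osd.12 0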

[ceph-users] Re: Reweighting OSD while down results in undersized+degraded PGs

2020-05-19 Thread Frank Schilder
OSDs). Only OSDs that can peer are able to respond to changes of the crush map. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Andras Pataki Sent: 19 May 2020 15:57:49 To: Frank Schilder; ceph-users Subject: Re

[ceph-users] total ceph outage again, need help

2020-05-20 Thread Frank Schilder
riority? Thanks and best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: total ceph outage again, need help

2020-05-20 Thread Frank Schilder
far. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Amit Ghadge Sent: 20 May 2020 09:44 To: Frank Schilder Subject: Re: [ceph-users] total ceph outage again, need help look like ceph-01 shows in starting, so I

[ceph-users] Re: Best way to change bucket hierarchy

2020-06-03 Thread Frank Schilder
You should see misplaced objects and remapped PGs, but no degraded objects or PGs. Do this only when the cluster is health_ok, otherwise things can get really complicated. Best regards, ===== Frank Schilder AIT Risø Campus Bygning 109, rum S14 From
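
A hedged example of moving buckets in the crush hierarchy, which is the kind of change discussed in this thread; rack1 and node1 are placeholder bucket names:

    ceph osd crush add-bucket rack1 rack
    ceph osd crush move rack1 root=default
    ceph osd crush move node1 rack=rack1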

[ceph-users] Re: Best way to change bucket hierarchy

2020-06-04 Thread Frank Schilder
It is a good idea to have a backup also just for reference and to compare before and after. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Kyriazis, George Sent: 04 June 2020 00:58:20 To: Frank Schilder Cc: ceph-

[ceph-users] Re: Best way to change bucket hierarchy

2020-06-04 Thread Frank Schilder
rds, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Wido den Hollander Sent: 04 June 2020 08:50:16 To: Frank Schilder; Kyriazis, George; ceph-users Subject: Re: [ceph-users] Re: Best way to change bucket hierarchy On 6/4/20 12

[ceph-users] log_channel(cluster) log [ERR] : Error -2 reading object

2020-06-04 Thread Frank Schilder
:::1000203ad7f.:head in one of our OSD logs. The disk is healthy according to smartctl. Should I worry about that? Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14

[ceph-users] Re: Best way to change bucket hierarchy

2020-06-04 Thread Frank Schilder
touch an EC profile, they are read-only any ways. The crush parameters are only used at pool creation and never looked at again. You can override these by editing the crush rule as explained above. Best regards and good luck, ===== Frank Schilder AIT Risø Ca
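
A sketch of the edit-the-crush-rule workflow referred to above, using the standard decompile/recompile round trip:

    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt (e.g. the rule's "take" step), then:
    crushtool -c crush.txt -o crush.new
    ceph osd setcrushmap -i crush.new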

[ceph-users] Re: Best way to change bucket hierarchy

2020-06-04 Thread Frank Schilder
xpect almost every PG to be affected. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Kyriazis, George Sent: 05 June 2020 00:28:43 To: Frank Schilder Cc: ceph-users Subject: Re: Best way to change bucket hierarchy

[ceph-users] Re: Best way to change bucket hierarchy

2020-06-05 Thread Frank Schilder
you are into the data movement. If its almost done, I wouldn't bother. If its another month, it might be worth trying. As far as I can see, your crush map is going to be a short text file, so it should be feasible to edit. Best regards, = Frank Schilder AIT Risø Campus Bygnin
