[ceph-users] Cephadm stacktrace on copying ceph.conf

2024-03-26 Thread Jesper Agerbo Krogh [JSKR]
denied 3/26/24 9:38:09 PM [INF] Updating dkcphhpcmgt028:/var/lib/ceph/5c384430-da91-11ed-af9c-c780a5227aff/config/ceph.conf It seems to be related to the permissions that the manager writes the files with and the process copying them around. $ sudo ceph -v [sudo] password for adminjskr: ceph version 17.2.6 (d7ff0d1

[ceph-users] Cannot get backfill speed up

2023-07-05 Thread Jesper Krogh
more resources on recovery than 328 MiB/s Thanks, . -- Jesper Krogh

[ceph-users] pg_num != pgp_num - and unable to change.

2023-07-05 Thread Jesper Krogh
pg_num and pgp_num took immediate effect? Jesper jskr@dkcphhpcmgt028:/$ sudo ceph version ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable) jskr@dkcphhpcmgt028:/$ sudo ceph health HEALTH_OK jskr@dkcphhpcmgt028:/$ sudo ceph status cluster: id: 5c384430-da91
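A minimal sketch of the commands in question - the pool name and PG count here are placeholders, not values from the thread. Note that on Quincy the mgr raises pgp_num towards pg_num gradually in the background, so the two can legitimately differ while that adjustment is still in progress:

$ sudo ceph osd pool set mypool pg_num 256
$ sudo ceph osd pool set mypool pgp_num 256
$ sudo ceph osd pool get mypool pg_num    # check the current values
$ sudo ceph osd pool get mypool pgp_num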

[ceph-users] Erasure coding and backfilling speed

2023-07-05 Thread jesper
Hi. I have a Ceph (NVMe) based cluster with 12 hosts and 40 OSDs. Currently it is backfilling PGs, but I cannot get it to run more than 20 backfills at the same time (6+2 profile). osd_max_backfills = 100 and osd_recovery_max_active_ssd = 50 (non-sane), but it still stops at 20 with
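A hedged sketch of the knobs mentioned above, set via ceph config with purely illustrative values. On releases where the mClock scheduler is active, recovery and backfill limits are largely governed by the mClock profile, which can impose a ceiling regardless of osd_max_backfills:

$ sudo ceph config set osd osd_max_backfills 10
$ sudo ceph config set osd osd_recovery_max_active_ssd 20
$ sudo ceph config set osd osd_mclock_profile high_recovery_ops   # only relevant with the mClock scheduler
$ sudo ceph config show osd.0 osd_max_backfills                   # verify the effective value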

[ceph-users] Re: force-create-pg not working

2022-09-20 Thread Jesper Lykkegaard Karlsen
p OSD - repair - mark-complete on the primary OSD. A scrub tells me that the "active+clean" state is for real. I also found out the more automated "force-create-pg" command only works on PGs that are in a down state. Best, Jesper ------ Jesper Lykkegaar

[ceph-users] force-create-pg not working

2022-09-20 Thread Jesper Lykkegaard Karlsen
y with something like this?: * set nobackfill and norecovery * delete the PG's shards one by one * unset nobackfill and norecovery Any idea on how to proceed from here is most welcome. Thanks, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing
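A hedged sketch of the procedure outlined above; the OSD and PG ids are placeholders, and the objectstore-tool step must run with the OSD stopped:

$ sudo ceph osd set nobackfill
$ sudo ceph osd set norecovery
$ sudo systemctl stop ceph-osd@<id>              # on each host holding a shard of the PG
$ sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op remove --force
$ sudo systemctl start ceph-osd@<id>
$ sudo ceph osd unset nobackfill
$ sudo ceph osd unset norecovery
$ sudo ceph osd force-create-pg <pgid> --yes-i-really-mean-it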

[ceph-users] Re: Remove corrupt PG

2022-09-01 Thread Jesper Lykkegaard Karlsen
? Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Universitetsbyen 81 8000 Aarhus C E-mail: je...@mbg.au.dk Tlf:+45 50906203 > On 1 Sep 2022, at 22.01, Jes

[ceph-users] Re: Remove corrupt PG

2022-09-01 Thread Jesper Lykkegaard Karlsen
it came back online. Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Universitetsbyen 81 8000 Aarhus C E-mail: je...@mbg.au.dk Tlf:+45 50906203 > On 31 Aug 2

[ceph-users] Remove corrupt PG

2022-08-31 Thread Jesper Lykkegaard Karlsen
objects in that PG (also with objectstore-tool), but this process is extremely slow. When looping over the >65,000 objects, each remove takes ~10 sec and is very compute intensive, which adds up to approximately 7.5 days. Is there a faster way to get around this? Best regards, Jes
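A hedged alternative to removing the objects one by one: with the OSD stopped, export the shard for safekeeping and then drop the whole PG in a single objectstore-tool operation (ids and paths are placeholders):

$ sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op export --file /root/<pgid>.export
$ sudo ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<id> --pgid <pgid> --op remove --force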

[ceph-users] Re: Potential bug in cephfs-data-scan?

2022-08-19 Thread Jesper Lykkegaard Karlsen
Actually, it might have worked better if the PG had stayed down while running cephfs-data-scan, as it could then only get the file structure from the metadata pool and not touch each file/link in the data pool? This would at least have properly given the list of files in (only) the affected PG? //Jesper

[ceph-users] Re: Potential bug in cephfs-data-scan?

2022-08-19 Thread Jesper Lykkegaard Karlsen
From: Patrick Donnelly Sent: 19 August 2022 16:16 To: Jesper Lykkegaard Karlsen Cc: ceph-users@ceph.io Subject: Re: [ceph-users] Potential bug in cephfs-data-scan? On Fri, Aug 19, 2022 at 5:02 AM Jesper Lykkegaard Karlsen wrote: > Hi,

[ceph-users] Potential bug in cephfs-data-scan?

2022-08-19 Thread Jesper Lykkegaard Karlsen
recently deprecated Octopus, I suspect that this bug is also present in Pacific and Quincy? It might be related to this bug? https://tracker.ceph.com/issues/46166 But symptoms are different. Or, maybe there is a way to disable the following of symlinks in "cephfs-data-scan pg_
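For reference, cephfs-data-scan has a pg_files subcommand that walks a directory tree and lists the files with objects in the given PGs - a hedged example with placeholder arguments:

$ cephfs-data-scan pg_files /mnt/cephfs/some/subtree 20.1f 20.3a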

[ceph-users] Re: replacing OSD nodes

2022-07-28 Thread Jesper Lykkegaard Karlsen
Cool thanks a lot! I will definitely put it in my toolbox. Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Universitetsbyen 81 8000 Aarhus C E-mail: je

[ceph-users] Re: replacing OSD nodes

2022-07-28 Thread Jesper Lykkegaard Karlsen
the same number of PGs are backfilling. Can large disk usage on mons slow down backfill and other operations? Is it dangerous? Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics

[ceph-users] Re: cannot set quota on ceph fs root

2022-07-28 Thread Jesper Lykkegaard Karlsen
Hi Frank, I guess there is always the possibility to set a quota at the pool level with "target_max_objects" and "target_max_bytes". The CephFS quotas set through attributes are only for sub-directories, as far as I recall. Best, Jesper ------ Jesper Lykkegaard Karlsen
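For context, CephFS directory quotas are set via extended attributes, while pool-level quotas are set with ceph osd pool set-quota - hedged examples with placeholder paths, pool names and sizes:

$ setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/somedir   # 100 GiB
$ setfattr -n ceph.quota.max_files -v 100000 /mnt/cephfs/somedir
$ getfattr -n ceph.quota.max_bytes /mnt/cephfs/somedir
$ sudo ceph osd pool set-quota cephfs_data max_bytes 107374182400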

[ceph-users] Re: PG does not become active

2022-07-28 Thread Jesper Lykkegaard Karlsen
Ah I see, I should have looked at the "raw" data instead ;-) Then I agree, this is very weird. Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Universitetsbyen 81 8000

[ceph-users] Re: PG does not become active

2022-07-28 Thread Jesper Lykkegaard Karlsen
as two OSDs down. Also, I believe that min_size should never be smaller than “coding” shards, which is 4 in this case. You can either make a new test setup with your three test OSD hosts using EC 2+1 or make e.g. 4+2, but with failure domain set to OSD. Best, Jesper
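A hedged sketch of the suggestion above - a 4+2 erasure-code profile with the failure domain set to OSD, and a pool created on it (names and PG counts are placeholders):

$ sudo ceph osd erasure-code-profile set ec42-osd k=4 m=2 crush-failure-domain=osd
$ sudo ceph osd pool create testpool 32 32 erasure ec42-osd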

[ceph-users] Re: replacing OSD nodes

2022-07-22 Thread Jesper Lykkegaard Karlsen
or not? Summer vacation? Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Universitetsbyen 81 8000 Aarhus C E-mail: je...@mbg.au.dk Tlf:+45 50906203

[ceph-users] Re: replacing OSD nodes

2022-07-20 Thread Jesper Lykkegaard Karlsen
Thanks for your answer, Janne. Yes, I am also running "ceph osd reweight" on the "nearfull" OSDs once they get too close for comfort. But I just thought a continuous prioritization of rebalancing PGs could make this process smoother, with less/no need for handheld opera
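Hedged examples of the reweighting mentioned above (OSD id and weights are placeholders); on recent releases the upmap balancer is the more hands-off alternative:

$ sudo ceph osd reweight 12 0.90              # temporary override weight, range 0.0-1.0
$ sudo ceph osd reweight-by-utilization 120   # reweight OSDs above 120% of mean utilization
$ sudo ceph balancer mode upmap
$ sudo ceph balancer on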

[ceph-users] replacing OSD nodes

2022-07-19 Thread Jesper Lykkegaard Karlsen
if the prioritization of backfill_wait PGs was re-run, perhaps every 24 hours, as the OSD availability landscape of course changes during backfill. I imagine this could especially stabilize recovery/rebalance on systems where space is a little tight. Best regards, Jesper -- Jespe

[ceph-users] Re: cephfs quota used

2021-12-17 Thread Jesper Lykkegaard Karlsen
dir.rbytes $i 2>/dev/null) | sed -r 's/([0-9])([a-zA-Z])/\1 \2/g; s/([a-zA-Z])([0-9])/\1 \2/g') $i" fi done The above can be run as: ceph_du_dir $DIR with multiple directories: ceph_du_dir $DIR1 $DIR2 $DIR3 ... Or even with a wildcard: ceph_du_dir $DIR/* Best, Jesper ----
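A self-contained reconstruction of the ceph_du_dir helper sketched above - hedged, since the full function is truncated in this archive view and details may differ from the original:

ceph_du_dir() {
  # print the recursive size (ceph.dir.rbytes) of each directory argument
  local i
  for i in "$@"; do
    if [ -d "$i" ]; then
      echo "$(numfmt --to=iec --suffix=B --padding=7 \
          $(getfattr --only-values -n ceph.dir.rbytes "$i" 2>/dev/null) \
          | sed -r 's/([0-9])([a-zA-Z])/\1 \2/g; s/([a-zA-Z])([0-9])/\1 \2/g') $i"
    fi
  done
}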

[ceph-users] Re: cephfs quota used

2021-12-16 Thread Jesper Lykkegaard Karlsen
Not to spam, but to make the output prettier, one can also separate the number from the byte-size prefix. numfmt --to=iec --suffix=B --padding=7 $(getfattr --only-values -n ceph.dir.rbytes $1 2>/dev/null) | sed -r 's/([0-9])([a-zA-Z])/\1 \2/g; s/([a-zA-Z])([0-9])/\1 \2/g' //Jes

[ceph-users] Re: cephfs quota used

2021-12-16 Thread Jesper Lykkegaard Karlsen
Brilliant, thanks Jean-François Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Gustav Wieds Vej 10 8000 Aarhus C E-mail: je...@mbg.au.dk Tlf:+45 50906203

[ceph-users] Re: cephfs quota used

2021-12-16 Thread Jesper Lykkegaard Karlsen
in "human-readble" It works like a charm and my god it is fast!. Tools like that could be very useful, if provided by the development team  Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular

[ceph-users] Re: cephfs quota used

2021-12-16 Thread Jesper Lykkegaard Karlsen
Whoops, wrong copy/paste: getfattr -n ceph.dir.rbytes $DIR works on all distributions I have tested. It is: getfattr -d -m 'ceph.*' $DIR that does not work on Rocky Linux 8 and Ubuntu 18.04, but works on CentOS 7. Best, Jesper -- Jesper Lykkegaard Karlsen Scientific

[ceph-users] Re: cephfs quota used

2021-12-16 Thread Jesper Lykkegaard Karlsen
Just tested: getfattr -n ceph.dir.rbytes $DIR works on CentOS 7, but not on Ubuntu 18.04 either. Weird? Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Gustav

[ceph-users] Re: cephfs quota used

2021-12-16 Thread Jesper Lykkegaard Karlsen
no output. Should that not list all attributes? This is on Rocky Linux, kernel 4.18.0-348.2.1.el8_5.x86_64 Best, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Gustav Wieds

[ceph-users] cephfs quota used

2021-12-16 Thread Jesper Lykkegaard Karlsen
, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus University Gustav Wieds Vej 10 8000 Aarhus C E-mail: je...@mbg.au.dk Tlf:+45 50906203

[ceph-users] Recover data from Cephfs snapshot

2021-03-12 Thread Jesper Lykkegaard Karlsen
some "ceph snapshot recover" command, that can move metadata pointers and make recovery much lighter, or is this just that way it is? Best reagards, Jesper ------ Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecul

[ceph-users] Cephfs metadata and MDS on same node

2021-03-09 Thread Jesper Lykkegaard Karlsen
? Anyway, I am just asking for your opinion on this: pros and cons, or even better, somebody who has actually tried this? Best regards, Jesper -- Jesper Lykkegaard Karlsen Scientific Computing Centre for Structural Biology Department of Molecular Biology and Genetics Aarhus

[ceph-users] Re: Change crush rule on pool

2020-09-12 Thread jesper
Can I do that - when the SSDs are already used in another crush rule - backing and kvm_ssd RBDs? Jesper Sent from myMail for iOS Saturday, 12 September 2020, 11.01 +0200 from anthony.da...@gmail.com : >If you have capacity to have both online at the same time, why not add the

[ceph-users] Re: Change crush rule on pool

2020-09-12 Thread jesper
> I would like to change the crush rule so data lands on ssd instead of hdd, > can this be done on the fly and migration will just happen or do I need to > do something to move data? I would actually like to relocate my object store to a new storage tier. Is the best way to: 1) create a new pool on

[ceph-users] Change crush rule on pool

2020-08-05 Thread jesper
Hi, I would like to change the crush rule so data lands on ssd instead of hdd. Can this be done on the fly, and will migration just happen, or do I need to do something to move data? Jesper Sent from myMail for iOS
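For reference, a hedged sketch of how this is typically done: create a device-class-specific replicated rule and point the pool at it, which triggers the data migration on the fly (rule and pool names are placeholders):

$ sudo ceph osd crush rule create-replicated replicated_ssd default host ssd
$ sudo ceph osd pool set mypool crush_rule replicated_ssd
$ sudo ceph -s    # watch the resulting remapping/backfill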

[ceph-users] Re: OSD weight on Luminous

2020-05-14 Thread jesper
Unless you have enabled some balancing, this is very normal (actually a pretty good normal). Jesper Sent from myMail for iOS Thursday, 14 May 2020, 09.35 +0200 from Florent B. : >Hi, > >I have something strange on a Ceph Luminous cluster. > >All OSDs have the same size,

[ceph-users] Ceph MDS - busy?

2020-04-30 Thread jesper
Hi. How do I find out if the MDS is "busy" - being the one limiting CephFS metadata throughput (12.2.8). $ time find . | wc -l 1918069 real 8m43.008s user 0m2.689s sys 0m7.818s - or about 0.27 ms per file (~3,667 files/s). In the light of "potentially batching" and a network latency of ~0.20ms to the MDS -
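Hedged examples of ways to inspect MDS load (the daemon name is a placeholder, and these must run where the MDS admin socket is available); they show counters and in-flight ops but do not by themselves prove the MDS is the bottleneck:

$ sudo ceph daemonperf mds.<name>                     # live per-second counters
$ sudo ceph daemon mds.<name> perf dump mds_server    # request counters and latencies
$ sudo ceph daemon mds.<name> dump_ops_in_flight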

[ceph-users] Re: HBase/HDFS on Ceph/CephFS

2020-04-27 Thread jesper
everything again. I may get equally good data locality with Ceph-based SSD as with local HDDs (which I currently have) Jesper

[ceph-users] HBase/HDFS on Ceph/CephFS

2020-04-24 Thread jesper
is a bit more than what looks feasible at the moment. Thanks for your reflections/input. Jesper

[ceph-users] Healthy objects trapped in incomplete pgs

2020-04-23 Thread Jesper Lykkegaard Karlsen
ible of those files to another location. * recreate PGs: ceph osd force-create-pg * restart recovery: ceph osd unset norecover * copy back in the recovered files Would that work, or do you have a better suggestion? Ch

[ceph-users] MDS_CACHE_OVERSIZED warning

2020-04-16 Thread jesper
; 34400070 inodes in use by clients, 3293 stray files Thanks - Jesper

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-04 Thread jesper
iver a fast write cache for smallish writes. Would setting the parameter to 1 MB be "insane"? Jesper

[ceph-users] Recommendation for decent write latency performance from HDDs

2020-04-04 Thread jesper
will be as slow as hitting dead rust - anything that cannot live with that needs to be entirely on SSD/NVMe. Other? Thanks for your input. Jesper

[ceph-users] Re: New 3 node Ceph cluster

2020-03-14 Thread jesper
Hi. Unless there are plans for going to petabyte scale with it, I really don't see the benefits of getting CephFS involved over just an RBD image with a VM running standard Samba on top. More performant and less complexity to handle - zero gains (by my book). Jesper > Hi, > > I am

[ceph-users] Re: Ceph Performance of Micron 5210 SATA?

2020-03-06 Thread jesper
But is random/sequential read performance still good, even during saturated write performance? If so, the trade-off could fit quite a few applications. Sent from myMail for iOS Friday, 6 March 2020, 14.06 +0100 from vitalif : >Hi, > >Current QLC drives are total shit in terms of

[ceph-users] Performance of old vs new hw?

2020-02-17 Thread jesper
Hi, we have some oldish servers with SSDs - all on 25 Gbit NICs. R815 AMD - 2.4 GHz+. Are there significant performance benefits in moving to new NVMe-based hardware with new CPUs? +20% IOPS? +50% IOPS? Jesper Sent from myMail for iOS

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-03 Thread jesper
fraction of our dataset - and 10 GB cache on all 113 HDDs, ~1 TB effective read cache - and then writes hitting the battery-backed write cache - this can overspill, and when hitting "cold" data, performance varies. But the read/write amplification of EC is still unmanageable in p

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-03 Thread jesper
fordable - or have I missed something that makes the math work? -- Jesper

[ceph-users] Re: Building a petabyte cluster from scratch

2019-12-03 Thread jesper
h battery-backed write cache - will allow the OSD to ack writes before hitting spinning rust. * More memory for OSD-level read caching. * 3x replication instead of EC ... (we have all of the above in a "similar" setup: ~1 PB, 10 OSD hosts). SSD tiering pool (haven't been there - but would like to test it out). -- Jesper

[ceph-users] Re: MDS blocked ops; kernel: Workqueue: ceph-pg-invalid ceph_invalidate_work [ceph]

2019-09-03 Thread jesper
3201013c940bf000 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3e1d0452edceebb903d23db53201013c940bf000 Was capable of deadlocking the kernel when memory pressure caused MDS to reclaim capabilities - smells similar. Jesper

[ceph-users] Re: Danish ceph users

2019-08-29 Thread jesper
yes Sent from myMail for iOS Thursday, 29 August 2019, 15.52 +0200 from fr...@dtu.dk : >I would be in. > >= >Frank Schilder >AIT Risø Campus >Bygning 109, rum S14 > > >From: Torben Hørup < tor...@t-hoerup.dk > >Sent: 29 August 2019

[ceph-users] Re: the ceph rbd read dd with fio performance diffrent so huge?

2019-08-27 Thread jesper
The concurrency is widely different - 1:30. Jesper Sent from myMail for iOS Tuesday, 27 August 2019, 16.25 +0200 from linghucongs...@163.com : >The performance difference between dd and fio is so huge? > >I have 25 OSDs with 8 TB HDDs. With dd I only get 410 KB/s read performance, but
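A hedged illustration of the 1:30 concurrency difference pointed out above - dd issues one synchronous read at a time, while fio keeps many in flight (device path and sizes are placeholders):

$ dd if=/dev/rbd0 of=/dev/null bs=4k count=100000 iflag=direct
$ fio --name=randread --filename=/dev/rbd0 --rw=randread --bs=4k --ioengine=libaio --direct=1 --iodepth=32 --runtime=60 --time_based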

[ceph-users] Re: krdb upmap compatibility

2019-08-26 Thread jesper
What will actually happen if an old client comes by - potential data damage, or just broken connections from the client? Jesper Sent from myMail for iOS Monday, 26 August 2019, 20.16 +0200 from Paul Emmerich : >4.13 or newer is enough for upmap > >-- >Paul Emmerich >