[ceph-users] Re: Data migration between clusters

2021-01-03 Thread Kalle Happonen
e. Cheers, Kalle - Original Message - > From: "Istvan Szabo, Agoda" > To: "Kalle Happonen" > Cc: "ceph-users" > Sent: Thursday, 24 December, 2020 04:43:44 > Subject: Re: [ceph-users] Re: Data migration between clusters > Hmmm, doesn’t see

[ceph-users] Re: Data migration between clusters

2020-12-22 Thread Kalle Happonen
Hi Istvan, I'm not sure it helps, but here's at least some pitfalls we faced when migrating radosgws between clusters. https://cloud.blog.csc.fi/2019/12/ceph-object-storage-migraine-i-mean.html Cheers, Kalle - Original Message - > From: "Szabo, Istvan (Agoda)" > To: "ceph-users" >

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-12-22 Thread Kalle Happonen
For anybody facing similar issues, we wrote a blog post about everything we faced, and how we worked through it. https://cloud.blog.csc.fi/2020/12/allas-november-2020-incident-details.html Cheers, Kalle - Original Message - > From: "Kalle Happonen" > To: "Dan

[ceph-users] Re: OSD reboot loop after running out of memory

2020-12-14 Thread Kalle Happonen
/ceph/ceph/pull/35584 Cheers, Kalle - Original Message - > From: huxia...@horebdata.cn > To: "Kalle Happonen" , "Stefan Wild" > > Cc: "ceph-users" > Sent: Monday, 14 December, 2020 10:27:57 > Subject: Re: [ceph-users] Re: OSD reboot

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-12-14 Thread Kalle Happonen
So we hope we found the (or a) trigger for the problem. Hopefully this reveals another thread to pull for others debugging the same issue (and for us when we hit it again). Cheers, Kalle - Original Message - > From: "Dan van der Ster" > To: "Kalle Happonen"

[ceph-users] Re: OSD reboot loop after running out of memory

2020-12-13 Thread Kalle Happonen
Hi Stefan, we had been seeing OSDs OOMing on 14.2.13, but on a larger scale. In our case we hit some bugs with pg_log memory growth and buffer_anon memory growth. Can you check what's taking up the memory on the OSD with the following command?
    ceph daemon osd.123 dump_mempools
Cheers, Kalle
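A minimal sketch for reading the dump_mempools output mentioned above: it ranks the memory pools by size so that a runaway osd_pglog or buffer_anon pool stands out. It assumes the JSON is laid out as mempool -> by_pool -> pool name -> items/bytes, which may differ between releases. The function name rank_mempools is hypothetical and the sample numbers are made up, not taken from any cluster.

```python
import json

def rank_mempools(dump_json: str):
    """Return (pool_name, bytes) pairs sorted by bytes, largest first."""
    data = json.loads(dump_json)
    # Assumed layout: {"mempool": {"by_pool": {name: {"items": n, "bytes": n}}}}
    by_pool = data["mempool"]["by_pool"]
    pairs = [(name, stats["bytes"]) for name, stats in by_pool.items()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Made-up sample in the assumed layout; NOT real cluster output.
SAMPLE = json.dumps({
    "mempool": {
        "by_pool": {
            "bluestore_cache_data": {"items": 100, "bytes": 4000000},
            "buffer_anon": {"items": 2000, "bytes": 90000000},
            "osd_pglog": {"items": 500000, "bytes": 700000000},
        }
    }
})

for name, size in rank_mempools(SAMPLE):
    print(f"{name:24s} {size / 2**20:10.1f} MiB")
```

To use it on a real OSD, save the command output to a file and pass the file contents to rank_mempools instead of SAMPLE.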

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-12-01 Thread Kalle Happonen
Quick update: restarting OSDs is not enough for us to compact the DB. So we
    stop the OSD
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-$osd compact
    start the OSD
It seems to fix the spillover, until it grows again. Cheers, Kalle - Original Message - > From: "Kalle
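The stop / compact / start sequence described above could be scripted roughly as follows. This is a sketch, not the procedure actually used in the thread. It assumes the OSDs are systemd-managed and not containerized, so that the unit is named ceph-osd@<id>; the message does not say how the OSDs were stopped and started. The helper names compaction_commands and compact_osd are hypothetical.

```python
import subprocess

def compaction_commands(osd_id: int):
    """Build the stop / compact / start command sequence for one OSD."""
    unit = f"ceph-osd@{osd_id}"  # assumption: systemd unit naming
    path = f"/var/lib/ceph/osd/ceph-{osd_id}"
    return [
        ["systemctl", "stop", unit],
        ["ceph-kvstore-tool", "bluestore-kv", path, "compact"],
        ["systemctl", "start", unit],
    ]

def compact_osd(osd_id: int, dry_run: bool = True):
    """Run the commands in order.

    With dry_run=True the commands are only printed. With dry_run=False a
    failing command raises CalledProcessError and the remaining commands are
    skipped, so a failed compaction leaves the OSD stopped for inspection.
    """
    for cmd in compaction_commands(osd_id):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)

compact_osd(123)  # dry run: prints the three commands, executes nothing
```

Keeping dry_run as the default is deliberate: the message itself notes that taking OSDs down for compaction is not a great fix in production, so running this one OSD at a time and checking cluster health in between seems prudent.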

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-12-01 Thread Kalle Happonen
it takes a while, it seems it may help. Of course this is not the greatest fix in production. Has anybody gleaned any new information on this issue? Things to tweak? Fixes on the horizon? Other mitigations? Cheers, Kalle - Original Message - > From: "Kalle Happonen" >

[ceph-users] Re: high memory usage in osd_pglog

2020-11-26 Thread Kalle Happonen
Hi Robert, This sounds very much like a big problem we had 2 weeks back. https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/EWPPEMPAJQT6GGYSHM7GIM3BZWS2PSUY/ Are you running EC? Which version are you running? It would fit our narrative if you use EC and recently updated to 14.2.11+

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-11-19 Thread Kalle Happonen
. Cheers, Kalle - Original Message - > From: "Kalle Happonen" > To: "Dan van der Ster" > Cc: "ceph-users" > Sent: Tuesday, 17 November, 2020 16:07:03 > Subject: [ceph-users] Re: osd_pglog memory hoarding - another case > Hi, > >

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-11-17 Thread Kalle Happonen
t with the cluster at this state, it's hard to be specific. Cheers, Kalle > -- dan > > > On Tue, Nov 17, 2020 at 11:58 AM Kalle Happonen wrote: >> >> Another idea, which I don't know if has any merit. >> >> If 8 MB is a realistic log size (or has this grown

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-11-17 Thread Kalle Happonen
issues with memory. Cheers, Kalle - Original Message - > From: "Kalle Happonen" > To: "Dan van der Ster" > Cc: "ceph-users" > Sent: Tuesday, 17 November, 2020 12:45:25 > Subject: [ceph-users] Re: osd_pglog memory hoarding - another case >

[ceph-users] Re: osd_pglog memory hoarding - another case

2020-11-17 Thread Kalle Happonen
"pgid": "26.4", "ondisk_log_size": 3185, "pgid": "33.4", "ondisk_log_size": 3311, "pgid": "33.8", "ondisk_log_size": 3278, I also have no idea what the average size of a pg log entry should be, in our case it

[ceph-users] osd_pglog memory hoarding - another case

2020-11-17 Thread Kalle Happonen
tly to figure out if there are good guesses why the pg_log size per OSD process exploded? Any technical (and moral) support is appreciated. Also, currently we're not sure if 14.2.13 triggered this, so this is also to put a data point out there for other debuggers. Cheers, Kalle Happo