[ceph-users] MDS stuck in rejoin

2022-05-30 Thread Dave Schulz

Hi Everyone,

I have a down system that has the MDS stuck in the rejoin state. When I 
run ceph-mds with -d and --debug_mds 10 I get this repeating:
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4  my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dir frag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4  mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dir frag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 my gid is 161986332
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 map says I am 
mds.0.2365745 state up:rejoin
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 msgr says i am 
[v2:172.23.0.44:6800/4094836140,v1:172.23.0.44:6801/4094836140]
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 handle_mds_map: 
handling map as rank 0
2022-05-31 00:33:03.557 7fac83972700  5 mds.beacon.trex-ceph4 received 
beacon reply up:rejoin seq 31 rtt 0.21701
2022-05-31 00:33:04.185 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming
2022-05-31 00:33:05.182 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming

2022-05-31 00:33:05.182 7fac7c6da700 10 mds.0.cache releasing free memory
2022-05-31 00:33:06.182 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming
2022-05-31 00:33:07.183 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming
2022-05-31 00:33:07.341 7fac7dedd700  5 mds.beacon.trex-ceph4 Sending 
beacon up:rejoin seq 32
2022-05-31 00:33:07.341 7fac83972700  5 mds.beacon.trex-ceph4 received 
beacon reply up:rejoin seq 32 rtt 0
2022-05-31 00:33:08.183 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming
2022-05-31 00:33:09.184 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming
2022-05-31 00:33:10.184 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming
2022-05-31 00:33:11.185 7fac7c6da700 10 mds.0.cache cache not ready for 
trimming
2022-05-31 00:33:11.341 7fac7dedd700  5 mds.beacon.trex-ceph4 Sending 
beacon up:rejoin seq 33
2022-05-31 00:33:11.397 7fac80ee3700  1 mds.trex-ceph4 Updating MDS map 
to version 2365758 from mon.0


and it just stays in that state seemingly forever. It also seems to be doing nothing CPU-wise. I don't even know where to look at this point.
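
For reference, the kind of state that can still be pulled from the MDS admin socket while it sits in rejoin looks roughly like this (a sketch; mds.trex-ceph4 is my daemon name, and I'm assuming the admin socket is still responsive):

    ceph daemon mds.trex-ceph4 status              # confirms rank 0 and the up:rejoin state
    ceph daemon mds.trex-ceph4 dump_ops_in_flight  # any requests the rank is stuck on
    ceph daemon mds.trex-ceph4 objecter_requests   # outstanding RADOS ops, e.g. waiting on a slow or inactive PG
    ceph fs status                                 # rank/state overview for the whole filesystem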


I see this in the mon log:

2022-05-31 00:36:27.359 7f39d0c6c700  1 mon.trex-ceph1@0(leader).osd 
e51026 _set_new_cache_sizes cache_size:1020054731 inc_alloc: 301989888 
full_alloc: 322961408 kv_alloc: 390070272


I'm falling asleep at the keyboard trying to get this to work. Any thoughts?

Thanks

-Dave

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Crashing MDS

2022-06-08 Thread Dave Schulz

Hi Everyone,

I have an MDS server that's crashing moments after it starts. The filesystem is set to max_mds=5 and mds.[1-4] are all up and active, but mds.0 keeps crashing. All I can see is the following in the /var/log/ceph/ceph-mds. logfile. Any thoughts?



    -2> 2022-06-08 10:02:59.408 7fc0a479d700  4 mds.0.server 
handle_client_request client_request(client.162790796:2313708455 create 
#0x500012ba51c/0192.jpg 2022-06-08 09:06:03.780237 RETRY=19 
caller_uid=10363898, caller_gid=10363898{10363898,}) v4
    -1> 2022-06-08 10:02:59.410 7fc0a479d700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc: 
In function 'void MDCache::add_inode(CInode*)' thread 7fc0a479d700 time 
2022-06-08 10:02:59.409711
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc: 
279: FAILED ceph_assert(!p)


 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) 
nautilus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x14a) [0x7fc0b2cbb875]

 2: (()+0x253a3d) [0x7fc0b2cbba3d]
 3: (()+0x20f84e) [0x56386170d84e]
 4: (Server::prepare_new_inode(boost::intrusive_ptr&, 
CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
 5: 
(Server::handle_client_openc(boost::intrusive_ptr&)+0xcf1) 
[0x5638616b0a91]
 6: 
(Server::dispatch_client_request(boost::intrusive_ptr&)+0xb5b) 
[0x5638616d78fb]
 7: (Server::handle_client_request(boost::intrusive_ptrconst> const&)+0x2f8) [0x5638616d7d78]
 8: (Server::dispatch(boost::intrusive_ptr 
const&)+0x122) [0x5638616e3722]
 9: (MDSRank::handle_deferrable_message(boost::intrusive_ptrconst> const&)+0x6dc) [0x5638616585ec]
 10: (MDSRank::_dispatch(boost::intrusive_ptr const&, 
bool)+0x7ea) [0x56386165aa4a]
 11: (MDSRank::retry_dispatch(boost::intrusive_ptr 
const&)+0x12) [0x56386165af72]

 12: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 13: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 14: (MDSRank::_dispatch(boost::intrusive_ptr const&, 
bool)+0x1d0) [0x56386165a430]
 15: (MDSRank::retry_dispatch(boost::intrusive_ptr 
const&)+0x12) [0x56386165af72]

 16: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 17: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 18: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
 19: (()+0x7ea5) [0x7fc0b0b7aea5]
 20: (clone()+0x6d) [0x7fc0af8288dd]

 0> 2022-06-08 10:02:59.413 7fc0a479d700 -1 *** Caught signal 
(Aborted) **

 in thread 7fc0a479d700 thread_name:mds_rank_progr

 ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0) 
nautilus (stable)

 1: (()+0xf630) [0x7fc0b0b82630]
 2: (gsignal()+0x37) [0x7fc0af760387]
 3: (abort()+0x148) [0x7fc0af761a78]
 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x199) [0x7fc0b2cbb8c4]

 5: (()+0x253a3d) [0x7fc0b2cbba3d]
 6: (()+0x20f84e) [0x56386170d84e]
 7: (Server::prepare_new_inode(boost::intrusive_ptr&, 
CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad) [0x5638616a0f1d]
 8: 
(Server::handle_client_openc(boost::intrusive_ptr&)+0xcf1) 
[0x5638616b0a91]
 9: 
(Server::dispatch_client_request(boost::intrusive_ptr&)+0xb5b) 
[0x5638616d78fb]
 10: (Server::handle_client_request(boost::intrusive_ptrconst> const&)+0x2f8) [0x5638616d7d78]
 11: (Server::dispatch(boost::intrusive_ptr 
const&)+0x122) [0x5638616e3722]
 12: (MDSRank::handle_deferrable_message(boost::intrusive_ptrconst> const&)+0x6dc) [0x5638616585ec]
 13: (MDSRank::_dispatch(boost::intrusive_ptr const&, 
bool)+0x7ea) [0x56386165aa4a]
 14: (MDSRank::retry_dispatch(boost::intrusive_ptr 
const&)+0x12) [0x56386165af72]

 15: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 16: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 17: (MDSRank::_dispatch(boost::intrusive_ptr const&, 
bool)+0x1d0) [0x56386165a430]
 18: (MDSRank::retry_dispatch(boost::intrusive_ptr 
const&)+0x12) [0x56386165af72]

 19: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
 20: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
 21: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
 22: (()+0x7ea5) [0x7fc0b0b7aea5]
 23: (clone()+0x6d) [0x7fc0af8288dd]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.
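
For completeness, the objdump that the NOTE refers to can be generated with something like the following (a sketch; it assumes the stock CentOS 7 path /usr/bin/ceph-mds and that a ceph-debuginfo package matching 14.2.9 is installable):

    yum install -y ceph-debuginfo                        # assumed package name for the matching debug symbols
    objdump -rdS /usr/bin/ceph-mds > ceph-mds.objdump    # the dump the crash NOTE asks for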




Thanks

-Dave

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Crashing MDS

2022-06-08 Thread Dave Schulz
I don't see any filesystems with exhausted inodes on the boot disks, if that's what you mean. We're using BlueStore, so I don't think this applies to the OSDs. Thanks for the suggestion.
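
For anyone following along, the check boils down to something like this (a sketch):

    df -i    # per-filesystem inode usage; IUse% at 100% would mean inode exhaustion
    df -h    # and the usual free-space check alongside it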


-Dave

On 2022-06-08 10:50 a.m., Can Özyurt wrote:




Hi Dave,

Just to make sure, have you checked whether the host has free inodes available?
On Wed, 8 Jun 2022 at 19:22, Dave Schulz  wrote:

Hi Everyone,

I have an MDS server that's crashing moments after it starts. The
filesystem is set to max_mds=5 and mds.[1-4] are all up and active
but
mds.0 keeps crashing.  all I can see is the following in the
/var/log/ceph/ceph-mds. logfile.  Any thoughts?


 -2> 2022-06-08 10:02:59.408 7fc0a479d700  4 mds.0.server
handle_client_request client_request(client.162790796:2313708455
create
#0x500012ba51c/0192.jpg 2022-06-08 09:06:03.780237 RETRY=19
caller_uid=10363898, caller_gid=10363898{10363898,}) v4
 -1> 2022-06-08 10:02:59.410 7fc0a479d700 -1

/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc:

In function 'void MDCache::add_inode(CInode*)' thread 7fc0a479d700
time
2022-06-08 10:02:59.409711

/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/gigantic/release/14.2.9/rpm/el7/BUILD/ceph-14.2.9/src/mds/MDCache.cc:

279: FAILED ceph_assert(!p)

  ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0)
nautilus (stable)
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x14a) [0x7fc0b2cbb875]
  2: (()+0x253a3d) [0x7fc0b2cbba3d]
  3: (()+0x20f84e) [0x56386170d84e]
  4: (Server::prepare_new_inode(boost::intrusive_ptr&,
CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad)
[0x5638616a0f1d]
  5:
(Server::handle_client_openc(boost::intrusive_ptr&)+0xcf1)

[0x5638616b0a91]
  6:

(Server::dispatch_client_request(boost::intrusive_ptr&)+0xb5b)

[0x5638616d78fb]
  7:
(Server::handle_client_request(boost::intrusive_ptr const&)+0x2f8) [0x5638616d7d78]
  8: (Server::dispatch(boost::intrusive_ptr
const&)+0x122) [0x5638616e3722]
  9: (MDSRank::handle_deferrable_message(boost::intrusive_ptr const&)+0x6dc) [0x5638616585ec]
  10: (MDSRank::_dispatch(boost::intrusive_ptr const&,
bool)+0x7ea) [0x56386165aa4a]
  11: (MDSRank::retry_dispatch(boost::intrusive_ptr
const&)+0x12) [0x56386165af72]
  12: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
  13: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
  14: (MDSRank::_dispatch(boost::intrusive_ptr const&,
bool)+0x1d0) [0x56386165a430]
  15: (MDSRank::retry_dispatch(boost::intrusive_ptr
const&)+0x12) [0x56386165af72]
  16: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
  17: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
  18: (MDSRank::ProgressThread::entry()+0x3d) [0x56386165a13d]
  19: (()+0x7ea5) [0x7fc0b0b7aea5]
  20: (clone()+0x6d) [0x7fc0af8288dd]

  0> 2022-06-08 10:02:59.413 7fc0a479d700 -1 *** Caught signal
(Aborted) **
  in thread 7fc0a479d700 thread_name:mds_rank_progr

  ceph version 14.2.9 (581f22da52345dba46ee232b73b990f06029a2a0)
nautilus (stable)
  1: (()+0xf630) [0x7fc0b0b82630]
  2: (gsignal()+0x37) [0x7fc0af760387]
  3: (abort()+0x148) [0x7fc0af761a78]
  4: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x199) [0x7fc0b2cbb8c4]
  5: (()+0x253a3d) [0x7fc0b2cbba3d]
  6: (()+0x20f84e) [0x56386170d84e]
  7: (Server::prepare_new_inode(boost::intrusive_ptr&,
CDir*, inodeno_t, unsigned int, file_layout_t*)+0x4ad)
[0x5638616a0f1d]
  8:
(Server::handle_client_openc(boost::intrusive_ptr&)+0xcf1)

[0x5638616b0a91]
  9:

(Server::dispatch_client_request(boost::intrusive_ptr&)+0xb5b)

[0x5638616d78fb]
  10:
(Server::handle_client_request(boost::intrusive_ptr const&)+0x2f8) [0x5638616d7d78]
  11: (Server::dispatch(boost::intrusive_ptr
const&)+0x122) [0x5638616e3722]
  12:
(MDSRank::handle_deferrable_message(boost::intrusive_ptr const&)+0x6dc) [0x5638616585ec]
  13: (MDSRank::_dispatch(boost::intrusive_ptr const&,
bool)+0x7ea) [0x56386165aa4a]
  14: (MDSRank::retry_dispatch(boost::intrusive_ptr
const&)+0x12) [0x56386165af72]
  15: (MDSContext::complete(int)+0x74) [0x5638618c9ce4]
  16: (MDSRank::_advance_queues()+0xa4) [0x563861659ac4]
  17: (MDSRank::_dispatch(boost::intrusive_ptr const&,
bool)+0x1d0) [0x56386165a430]
  18: (MDSRank::retry_dispatch(boost::intrusive_ptr
const&)+0x12) [0x56386165af72]
  19: (MDSContext::complete

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-29 Thread Dave Schulz

Hi Wyll,

Any chance you're using CephFS and have some really large files in the CephFS filesystem? Erasure coding? I recently encountered a similar problem, and as soon as the end user deleted the really large files our problem became much more manageable.


I had issues reweighting OSDs too, and in the end I changed the crush weights and had to chase them around every couple of days, reweighting the OSDs over 70% full to zero and then setting them back to 12 when they were mostly empty (12 TB spinning-rust buckets). Note that I'm really not recommending this course of action; it's just the only option that seemed to have any effect.
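
For the record, that chasing-around boiled down to commands along these lines (a sketch only; osd.12 is a placeholder, and again, not a recommendation):

    ceph osd df                          # spot the OSDs above ~70% utilization
    ceph osd crush reweight osd.12 0     # drain the nearly-full OSD (osd.12 is a placeholder)
    # ...wait for it to empty out below the full ratio...
    ceph osd crush reweight osd.12 12    # restore the weight once it is mostly empty (12 for a 12 TB drive)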


-Dave

On 2022-08-29 3:00 p.m., Wyll Ingersoll wrote:




Can anyone explain why OSDs (ceph pacific, bluestore osds) continue to grow well after 
they have exceeded the "full" level (95%) and is there any way to stop this?

"The full_ratio is 0.95 but we have several osds that continue to grow and are 
approaching 100% utilization.  They are reweighted to almost 0, but yet continue to 
grow.
Why is this happening?  I thought the cluster would stop writing to the osd when it 
was at above the full ratio."

thanks...


From: Wyll Ingersoll 
Sent: Monday, August 29, 2022 9:24 AM
To: Jarett ; ceph-users@ceph.io 
Subject: [ceph-users] Re: OSDs growing beyond full ratio


I would think so, but it isn't happening nearly fast enough.

It's literally been over 10 days with 40 new drives across 2 new servers and 
they barely have any PGs yet. A few, but not nearly enough to help with the 
imbalance.

From: Jarett 
Sent: Sunday, August 28, 2022 8:19 PM
To: Wyll Ingersoll ; ceph-users@ceph.io 

Subject: RE: [ceph-users] OSDs growing beyond full ratio


Isn’t rebalancing onto the empty OSDs default behavior?



From: Wyll Ingersoll
Sent: Sunday, August 28, 2022 10:31 AM
To: ceph-users@ceph.io
Subject: [ceph-users] OSDs growing beyond full ratio



We have a pacific cluster that is overly filled and is having major trouble 
recovering.  We are desperate for help in improving recovery speed.  We have 
modified all of the various recovery throttling parameters.



The full_ratio is 0.95 but we have several osds that continue to grow and are approaching 100% utilization. They are reweighted to almost 0, yet continue to grow.

Why is this happening? I thought the cluster would stop writing to the osd when it was above the full ratio.
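
For reference, the ratios actually in effect and the per-OSD fill levels can be checked with something like this (a sketch):

    ceph osd dump | grep ratio      # full_ratio / backfillfull_ratio / nearfull_ratio currently set
    ceph osd df tree                # per-OSD %USE, CRUSH weight and reweight in one view
    ceph osd set-full-ratio 0.97    # sometimes raised temporarily to regain headroom -- use with care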





We have added additional capacity to the cluster but the new OSDs are being used very, very slowly. The primary pool in the cluster is the RGW data pool, which is a 12+4 EC pool using "host" placement rules across 18 hosts; 2 new hosts with 20x10TB osds each were recently added, but they are only very slowly being filled up. I don't see how to force recovery on that particular pool. From what I understand, we cannot modify the EC parameters without destroying the pool, and we cannot offload that pool to any others because there is no other place to store the amount of data.





We have been running "ceph osd reweight-by-utilization"  periodically and it 
works for a while (a few hours) but then recovery and backfill IO numbers drop to 
negligible values.



The balancer module will not run because the current misplaced % is about 97%.



Would it be more effective to use osdmaptool and generate a bunch of upmap commands to manually move data around, or keep trying to get reweight-by-utilization to work?
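
If it matters, the offline upmap route would be roughly this (a sketch; file names are placeholders):

    ceph osd getmap -o osd.map                                # dump the current osdmap to a file
    osdmaptool osd.map --upmap upmap.sh --upmap-deviation 1   # have the tool emit pg-upmap-items commands
    # review upmap.sh, then apply it:
    bash upmap.sh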



Any suggestions, other than deleting data (which we cannot do at this point; the pools are not accessible) or adding more storage (we already did, and it is not being utilized very heavily yet for some reason)?









___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Dave Schulz

Hi Wyll,

The only way I could get my OSDs to start dropping their utilization during a similar "unable to access the fs" problem was to run "ceph osd crush reweight <osd-id> 0" on the full OSDs, then wait while they start to empty and drop below the full ratio. Note that this is different from "ceph osd reweight" (without the word "crush"). I know this goes against the documented best practices; I'm just relaying what worked for me recently. I'm running 14.2.22 and I think you're on Pacific, which is two major versions newer.


In case it's important: We also had HDD with SSD for DB/WAL.


@Ceph gurus: is a file in Ceph assigned to a specific PG? In my case it seems like a file that's close to the size of a single OSD gets moved from one OSD to the next, filling it up and domino-ing around the cluster, filling up OSDs as it goes.


Sincerely

-Dave


On 2022-08-30 8:04 a.m., Wyll Ingersoll wrote:





OSDs are bluestore on HDD with SSD for DB/WAL.  We already tuned the 
sleep_hdd to 0 and cranked up the max_backfills and recovery 
parameters to much higher values.




*From:* Josh Baergen 
*Sent:* Tuesday, August 30, 2022 9:46 AM
*To:* Wyll Ingersoll 
*Cc:* Dave Schulz ; ceph-users@ceph.io 


*Subject:* Re: [ceph-users] Re: OSDs growing beyond full ratio
Hey Wyll,

I haven't been following this thread very closely so my apologies if
this has already been covered: Are the OSDs on HDDs or SSDs (or
hybrid)? If HDDs, you may want to look at decreasing
osd_recovery_sleep_hdd and increasing osd_max_backfills. YMMV, but
I've seen osd_recovery_sleep_hdd=0.01 and osd_max_backfills=6 work OK
on Bluestore HDDs. This would help speed up the data movements.

If it's a hybrid setup, I'm sure you could apply similar tweaks. Sleep
is already 0 for SSDs but you may be able to increase max_backfills
for some gains.
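
In config terms that's something like (a sketch; adjust the values to taste):

    ceph config set osd osd_recovery_sleep_hdd 0.01
    ceph config set osd osd_max_backfills 6
    # or injected at runtime:
    ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0.01 --osd_max_backfills 6'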

Josh

On Tue, Aug 30, 2022 at 7:31 AM Wyll Ingersoll
 wrote:
>
>
> Yes, this cluster has both - a large cephfs FS (60TB) that is 
replicated (2-copy) and a really large RGW data pool that is EC 
(12+4).  We cannot currently delete any data from either of them 
because commands to access them are not responsive.  The cephfs will 
not mount and radosgw-admin just hangs.

>
> We have several OSDs that are >99% full and keep approaching 100, 
even after reweighting them to 0. There is no client activity in this 
cluster at this point (its dead), but lots of rebalance and repairing 
going on) so data is moving around.

>
> We are currently trying to use upmap commands to relocate PGs in to 
attempt to balance things better and get it moving again, but progress 
is glacially slow.

>
> 
> From: Dave Schulz 
> Sent: Monday, August 29, 2022 10:42 PM
> To: Wyll Ingersoll ; 
ceph-users@ceph.io 

> Subject: Re: [ceph-users] Re: OSDs growing beyond full ratio
>
> Hi Wyll,
>
> Any chance you're using CephFS and have some really large files in the
> CephFS filesystem?  Erasure coding? I recently encountered a similar
> problem and as soon as the end-user deleted the really large files our
> problem became much more managable.
>
> I had issues reweighting OSDs too and in the end I changed the crush
> weights and had to chase them around every couple of days reweighting
> the OSDs >70% to zero and then setting them back to 12 when they were
> mostly empty (12TB spinning rust buckets).  Note that I'm really not
> recommending this course of action it's just the only option that seemed
> to have any effect.
>
> -Dave
>
> On 2022-08-29 3:00 p.m., Wyll Ingersoll wrote:
> >
> >
> >
> > Can anyone explain why OSDs (ceph pacific, bluestore osds) 
continue to grow well after they have exceeded the "full" level (95%) 
and is there any way to stop this?

> >
> > "The full_ratio is 0.95 but we have several osds that continue to 
grow and are approaching 100% utilization.  They are reweighted to 
almost 0, but yet continue to grow.
> > Why is this happening?  I thought the cluster would stop writing 
to the osd when it was at above the full ratio."

> >
> > thanks...
> >
> > 
> > From: Wyll Ingersoll 
> > Sent: Monday, August 29, 2022 9:24 AM
> > To: Jarett ; ceph-users@ceph.io 


> > Subject: [ceph-users] Re: OSDs growing beyond full ratio
> >
> >
> > I would think so, but it isn't happening nearly fast enough.
> >
> > It's literally been over 10 days with 40 new drives across 2 new 
servers and they barely have any PGs yet. A few, but not nearly enough 
to help with the imbalance.

> > 
> > From: Jarett 

[ceph-users] Re: OSDs growing beyond full ratio

2022-08-30 Thread Dave Schulz

Hi Weiwen,

Thanks for the reference link.  That does indeed indicate the opposite.  
I'm not sure why our issues became much less when the big files were 
deleted.  I suppose it's just that there was more space available after 
deleting the big files.


-Dave

On 2022-08-30 11:56 a.m., 胡 玮文 wrote:






On 2022-08-30, at 23:20, Dave Schulz wrote:

Is a file in ceph assigned to a specific PG?  In my case it seems 
like a file that's close to the size of a single OSD gets moved from 
one OSD to the next filling it up and domino-ing around the cluster 
filling up OSDs.


I believe no. Each large file is split into multiple objects, 4MB each 
by default. These objects are evenly assigned to all PGs in the pool.


https://docs.ceph.com/en/quincy/cephfs/file-layouts/
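
A sketch of how to see this for a single file (paths, the pool name, and the hex inode are placeholders):

    getfattr -n ceph.file.layout /mnt/cephfs/bigfile   # stripe_unit / object_size, 4194304 (4 MB) by default
    ls -i /mnt/cephfs/bigfile                          # inode number; data objects are named <hex-inode>.<index>
    ceph osd map cephfs_data 10000000000.00000000      # map one such object to its PG and backing OSDs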


Weiwen Hu

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io