Re: "Out of memory during read" errors instead of key eviction
So I tested this a bit more and released it in 1.6.17; I think Bitnami should pick it up soonish. If not, I'll try to figure out Docker this weekend if you still need it. I'm not 100% sure it'll fix your use case, but it does fix some things I can test, and it didn't seem to introduce a regression. Would still be nice to validate, though.
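For anyone who wants to try it once an image shows up: a minimal validation run might look like the sketch below. It assumes the official Docker Hub memcached image picks up the 1.6.17 tag, and reuses rough settings inferred from the stats later in this thread (5 GB cache, 2 MB max item size); none of that is a confirmed repro recipe.

    docker run --rm -d -p 11211:11211 memcached:1.6.17 -m 5120 -I 2m
    # Re-run the workload that triggered "Out of memory during read",
    # then check whether items are being evicted instead of erroring:
    printf 'stats\nquit\n' | nc localhost 11211 | grep -E 'evictions|store_no_memory'

If the patch works as intended, evictions should climb under memory pressure while the OOM errors stop.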
Re: "Out of memory during read" errors instead of key eviction
You can't build Docker images or compile binaries? There's a docker-compose.yml in the repo already, if that helps.

If not, I can try, but I don't spend a lot of time with Docker directly.
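If compiling from source turns out to be an option after all, it's only a few steps. A sketch, assuming a release tarball and libevent dev headers already installed (the URL pattern and flags are the standard ones, not verified against 1.6.17 specifically):

    wget https://memcached.org/files/memcached-1.6.17.tar.gz
    tar xzf memcached-1.6.17.tar.gz && cd memcached-1.6.17
    ./configure && make && make test
    # Run with the experimental chunk-size setting discussed further down
    # the thread; note dormando's caveat there that the value appears to be
    # parsed as bytes, despite being documented as kilobytes:
    ./memcached -m 5120 -I 2m -o slab_chunk_max=32768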
Re: "Out of memory during read" errors instead of key eviction
I'd be happy to help validate the fix, but I can't do it until the weekend, and I don't have a ready way to build an updated image. Any chance you could create a Docker image with the fix that I could grab from somewhere?

On Friday, August 26, 2022 at 10:38:54 AM UTC-7 Dormando wrote:
> I have an opportunity to put this fix into a release today if anyone wants
> to help validate :)
>
> On Thu, 25 Aug 2022, dormando wrote:
> > Took another quick look...
> >
> > Think there's an easy patch that might work:
> > https://github.com/memcached/memcached/pull/924
> >
> > If you wouldn't mind helping validate? An external validator would help me
> > get it in time for the next release :)
> >
> > Thanks,
> > -Dormando
> >
> > On Wed, 24 Aug 2022, dormando wrote:
> > > Hey,
> > >
> > > Thanks for the info. Yes, this generally confirms the issue. I see some
> > > of your higher slab classes with "free_chunks 0", so if you're setting
> > > data that requires these chunks, it could error out. The "stats items"
> > > output confirms this, since there are no actual items in those lower
> > > slab classes.
> > >
> > > You're certainly right that a workaround of making your items < 512k
> > > would also work; but in general, if I have features it'd be nice if
> > > they worked well :) Please open an issue so we can improve things!
> > >
> > > I intended to lower the slab_chunk_max default from 512k to something
> > > much lower, as that actually raises memory efficiency a bit (less gap
> > > at the higher classes). That may help here. The system should also try
> > > ejecting items from the highest LRU... I need to double-check that it
> > > wasn't already intending to do that and failing.
> > >
> > > Might also be able to adjust the page mover, but I'm not sure. The page
> > > mover can probably be adjusted to attempt to keep one page in reserve,
> > > but I think the algorithm isn't expecting slabs with no items in them,
> > > so I'd have to audit that too.
> > >
> > > If you're up for experiments, it'd be interesting to know if setting
> > > "-o slab_chunk_max=32768" or 16k (probably not more than 64) makes
> > > things better or worse.
> > >
> > > Also, crud.. it's documented as kilobytes but that's not working
> > > somehow? aaahahah. I guess the big EXPERIMENTAL tag scared people off,
> > > since that never got reported.
> > >
> > > I'm guessing most people have a mix of small to large items, but you
> > > only have large items and a relatively low memory limit, so this is why
> > > you're seeing it so easily. I think most people setting large items
> > > have like 30G+ of memory, so you end up with more spread around.
> > >
> > > Thanks,
> > > -Dormando
> > >
> > > On Wed, 24 Aug 2022, Hayden wrote:
> > > > What you're saying makes sense, and I'm pretty sure it won't be too
> > > > hard to add some functionality to my writing code to break my large
> > > > items up into smaller parts that can each fit into a single chunk.
> > > > That has the added benefit that I won't have to bother increasing
> > > > the max item size.
> > > >
> > > > In the meantime, though, I reran my pipeline and captured the output
> > > > of stats, stats slabs, and stats items, both when evicting normally
> > > > and when getting spammed with the error.
> > > > First, the output when I'm in the error state:
> > > >
> > > > Output of stats
> > > > STAT pid 1
> > > > STAT uptime 11727
> > > > STAT time 1661406229
> > > > STAT version b'1.6.14'
> > > > STAT libevent b'2.1.8-stable'
> > > > STAT pointer_size 64
> > > > STAT rusage_user 2.93837
> > > > STAT rusage_system 6.339015
> > > > STAT max_connections 1024
> > > > STAT curr_connections 2
> > > > STAT total_connections 8230
> > > > STAT rejected_connections 0
> > > > STAT connection_structures 6
> > > > STAT response_obj_oom 0
> > > > STAT response_obj_count 1
> > > > STAT response_obj_bytes 65536
> > > > STAT read_buf_count 8
> > > > STAT read_buf_bytes 131072
> > > > STAT read_buf_bytes_free 49152
> > > > STAT read_buf_oom 0
> > > > STAT reserved_fds 20
> > > > STAT cmd_get 0
> > > > STAT cmd_set 12640
> > > > STAT cmd_flush 0
> > > > STAT cmd_touch 0
> > > > STAT cmd_meta 0
> > > > STAT get_hits 0
> > > > STAT get_misses 0
> > > > STAT get_expired 0
> > > > STAT get_flushed 0
> > > > STAT delete_misses 0
> > > > STAT delete_hits 0
> > > > STAT incr_misses 0
> > > > STAT incr_hits 0
> > > > STAT decr_misses 0
> > > > STAT decr_hits 0
> > > > STAT cas_misses 0
> > > > STAT cas_hits 0
> > > > STAT cas_badval 0
> > > > STAT touch_hits 0
> > > > STAT touch_misses 0
> > > > STAT store_too_large 0
> > > > STAT store_no_memory 0
> > > > STAT auth_cmds 0
> > > > STAT auth_errors 0
> > > > STAT bytes_read 21755739959
> > > > STAT bytes_written 330909
> > > > STAT limit_maxbytes 5368709120
> > > > STAT accepting_conns 1
> > > > STAT listen_disabled_num 0
> > > > STAT time_in_listen_disabled_us 0
> > > > STAT threads 4
> > > > STAT conn_yields 0
> > > > STAT hash_power_level 16
> > > > STAT
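The splitting approach Hayden describes in the quoted message might look roughly like the client-side sketch below. It uses pymemcache; the helper names, key scheme, and chunk size are illustrative assumptions, not anything from the thread.

    # Split large values so every stored part fits in one slab chunk.
    # CHUNK_SIZE and the ":count"/":<n>" key scheme are assumptions.
    from pymemcache.client.base import Client

    # Stay under the 512k slab_chunk_max, leaving headroom for key
    # and per-item overhead.
    CHUNK_SIZE = 512 * 1024 - 1024

    client = Client(("localhost", 11211))

    def set_chunked(key, value):
        # Store the part count, then each part under its own key.
        parts = [value[i:i + CHUNK_SIZE]
                 for i in range(0, len(value), CHUNK_SIZE)]
        client.set(f"{key}:count", str(len(parts)).encode())
        for n, part in enumerate(parts):
            client.set(f"{key}:{n}", part)

    def get_chunked(key):
        # Reassemble the value; a missing part means the whole key
        # has to be treated as a miss.
        count = client.get(f"{key}:count")
        if count is None:
            return None
        keys = [f"{key}:{n}" for n in range(int(count))]
        parts = client.get_many(keys)
        if len(parts) != len(keys):
            return None  # a part was evicted
        return b"".join(parts[k] for k in keys)

One design note: the parts of a chunked value are not evicted atomically, so the reader must check that every part came back (as the sketch does) rather than concatenating whatever was found.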