Re: SegFault in Crawler Part
You can't evict memory that's being used to load data from the network. So
if you have a low amount of memory and run a benchmark doing a bunch of
parallel writes, you're going to be sad.

On Tue, 1 Jun 2021, Qingchen Dang wrote:

> Thank you very much! Yes, your guess is correct: I forgot about the
> possibility of evicting a crawler item :(
>
> Furthermore, I have a problem similar to this post:
> https://github.com/memcached/memcached/issues/467
> I gave Memcached a very limited memory budget to test eviction, and it
> does cause a similar error. When I use memtier_benchmark, the errors look
> like:
>
>   [RUN #1] Preparing benchmark client...
>   [RUN #1] Launching threads now...
>   error: response parsing failed.
>   error: response parsing failed.
>   server 127.0.0.1:11211 handle error response: SERVER_ERROR out of memory storing object
>   error: response parsing failed.
>   server 127.0.0.1:11211 handle error response: SERVER_ERROR out of memory storing object
>   error: response parsing failed.
>   [RUN #1 17%, 0 secs] 1 threads: 87137 ops, 87213 (avg: 87213) ops/sec, 65.66MB/sec (avg: 65.66MB/sec)
>   [RUN #1 36%, 1 secs] 1 threads: 179012 ops, 91864 (avg: 89540) ops/sec, 69.87MB/sec (avg: 67.76MB/sec)
>   [RUN #1 56%, 2 secs] 1 threads: 279971 ops, 100947 (avg: 93343) ops/sec, 76.76MB/sec (avg: 70.76MB/sec)
>   [RUN #1 75%, 3 secs] 1 threads: 375715 ops, 95732 (avg: 93941) ops/sec, 72.87MB/sec (avg: 71.29MB/sec)
>   [RUN #1 92%, 4 secs] 1 threads: 462054 ops, 93910 (avg: 93935) ops/sec, 71.41MB/sec (avg: 71.31MB/sec)
>   [RUN #1 92%, 4 secs] 1 threads: 462054 ops, 0 (avg: 92431) ops/sec, 0.00KB/sec (avg: 70.17MB/sec)
>   [RUN #1 92%, 5 secs] 1 threads: 462054 ops, 0 (avg: 90975) ops/sec, 0.00KB/sec (avg: 69.06MB/sec)
>   [RUN #1 92%, 5 secs] 1 threads: 462054 ops, 0 (avg: 89564) ops/sec, 0.00KB/sec (avg: 67.99MB/sec)
>
> When I use memaslap, it looks like:
>
>   set proportion: set_prop=0.10
>   get proportion: get_prop=0.90
>   <12 SERVER_ERROR out of memory storing object
>   <10 SERVER_ERROR out of memory storing object
>   <12 SERVER_ERROR out of memory storing object
>   <7 SERVER_ERROR out of memory storing object
>
> The unmodified Memcached gives these errors less frequently than Memcached
> with my eviction framework (especially under memtier_benchmark), so I
> wonder why. I read your message in the issue linked above, but I am still
> confused about why the memory limit affects Memcached this way. Could you
> give a more detailed explanation? If I have to run with limited memory, is
> there a way to avoid this issue?
>
> Thank you very much for helping!
>
> Best,
> Qingchen
>
> On Tuesday, June 1, 2021 at 2:36:09 AM UTC-4 Dormando wrote:
>
> > try '-o no_lru_crawler' ? That definitely works.
> >
> > I don't know what you're doing since no code has been provided. The
> > locking around managing LRU tails is pretty strict, so make sure you
> > are actually using the locks correctly.
> >
> > The LRU crawler works by injecting a fake item into the LRU, then
> > using that to keep its position and walk. If I had to guess, I bet
> > you've "evicted" the LRU crawler, which then immediately dies when it
> > tries to continue crawling.
> >
> > On Mon, 31 May 2021, Qingchen Dang wrote:
> >
> > > Furthermore, I tried to disable the crawler with the
> > > '-o no_lru_crawler' command-line parameter, and it gives the same
> > > error. I wonder why it does not disable the LRU crawler as it is
> > > supposed to.
> > >
> > > On Monday, May 31, 2021 at 1:02:38 AM UTC-4 Qingchen Dang wrote:
> > >
> > > Hi,
> > >
> > > I am implementing a framework based on Memcached. There's a problem
> > > that confused me a lot. The framework basically changes the eviction
> > > policy: when it is called to evict an item, it might not evict the
> > > tail item of the COLD LRU; instead it looks for a "more suitable"
> > > item to evict and reinserts the tail items at the head of the COLD
> > > queue.
> > >
> > > It mostly works fine, but it sometimes causes a SegFault when
> > > reinsertion happens very frequently (as in almost every eviction).
> > > The SegFault is triggered in the crawler part. As attached, it seems
> > > that when the crawler loops through the item queue, it reaches an
> > > invalid memory address. The bug happens after around 5000~1000
> > > GET/SET (9:1) operations. I used memaslap for testing.
> > >
> > > Could anyone give me some suggestions about what causes such an
> > > error?
> > >
> > > Here are the gdb messages:
> > >
> > > Thread 8 "memcached" received signal SIGSEGV, Segmentation fault.
> > > [Switching to Thread 0x74d6c700 (LWP 36414)]
> > > do_item_crawl_q (it=it@entry=0x5579e7e0) at items.c:2015
Re: SegFault in Crawler Part
Thank you very much! Yes, your guess is correct: I forgot about the
possibility of evicting a crawler item :(

Furthermore, I have a problem similar to this post:
https://github.com/memcached/memcached/issues/467
I gave Memcached a very limited memory budget to test eviction, and it does
cause a similar error. When I use memtier_benchmark, the errors look like:

  [RUN #1] Preparing benchmark client...
  [RUN #1] Launching threads now...
  error: response parsing failed.
  error: response parsing failed.
  server 127.0.0.1:11211 handle error response: SERVER_ERROR out of memory storing object
  error: response parsing failed.
  server 127.0.0.1:11211 handle error response: SERVER_ERROR out of memory storing object
  error: response parsing failed.
  [RUN #1 17%, 0 secs] 1 threads: 87137 ops, 87213 (avg: 87213) ops/sec, 65.66MB/sec (avg: 65.66MB/sec)
  [RUN #1 36%, 1 secs] 1 threads: 179012 ops, 91864 (avg: 89540) ops/sec, 69.87MB/sec (avg: 67.76MB/sec)
  [RUN #1 56%, 2 secs] 1 threads: 279971 ops, 100947 (avg: 93343) ops/sec, 76.76MB/sec (avg: 70.76MB/sec)
  [RUN #1 75%, 3 secs] 1 threads: 375715 ops, 95732 (avg: 93941) ops/sec, 72.87MB/sec (avg: 71.29MB/sec)
  [RUN #1 92%, 4 secs] 1 threads: 462054 ops, 93910 (avg: 93935) ops/sec, 71.41MB/sec (avg: 71.31MB/sec)
  [RUN #1 92%, 4 secs] 1 threads: 462054 ops, 0 (avg: 92431) ops/sec, 0.00KB/sec (avg: 70.17MB/sec)
  [RUN #1 92%, 5 secs] 1 threads: 462054 ops, 0 (avg: 90975) ops/sec, 0.00KB/sec (avg: 69.06MB/sec)
  [RUN #1 92%, 5 secs] 1 threads: 462054 ops, 0 (avg: 89564) ops/sec, 0.00KB/sec (avg: 67.99MB/sec)

When I use memaslap, it looks like:

  set proportion: set_prop=0.10
  get proportion: get_prop=0.90
  <12 SERVER_ERROR out of memory storing object
  <10 SERVER_ERROR out of memory storing object
  <12 SERVER_ERROR out of memory storing object
  <7 SERVER_ERROR out of memory storing object

The unmodified Memcached gives these errors less frequently than Memcached
with my eviction framework (especially under memtier_benchmark), so I wonder
why. I read your message in the issue linked above, but I am still confused
about why the memory limit affects Memcached this way. Could you give a more
detailed explanation? If I have to run with limited memory, is there a way
to avoid this issue?

Thank you very much for helping!

Best,
Qingchen

On Tuesday, June 1, 2021 at 2:36:09 AM UTC-4 Dormando wrote:

> try '-o no_lru_crawler' ? That definitely works.
>
> I don't know what you're doing since no code has been provided. The
> locking around managing LRU tails is pretty strict, so make sure you are
> actually using the locks correctly.
>
> The LRU crawler works by injecting a fake item into the LRU, then using
> that to keep its position and walk. If I had to guess, I bet you've
> "evicted" the LRU crawler, which then immediately dies when it tries to
> continue crawling.
>
> On Mon, 31 May 2021, Qingchen Dang wrote:
>
> > Furthermore, I tried to disable the crawler with the
> > '-o no_lru_crawler' command-line parameter, and it gives the same
> > error. I wonder why it does not disable the LRU crawler as it is
> > supposed to.
> >
> > On Monday, May 31, 2021 at 1:02:38 AM UTC-4 Qingchen Dang wrote:
> >
> > Hi,
> >
> > I am implementing a framework based on Memcached. There's a problem
> > that confused me a lot. The framework basically changes the eviction
> > policy: when it is called to evict an item, it might not evict the tail
> > item of the COLD LRU; instead it looks for a "more suitable" item to
> > evict and reinserts the tail items at the head of the COLD queue.
> >
> > It mostly works fine, but it sometimes causes a SegFault when
> > reinsertion happens very frequently (as in almost every eviction). The
> > SegFault is triggered in the crawler part. As attached, it seems that
> > when the crawler loops through the item queue, it reaches an invalid
> > memory address. The bug happens after around 5000~1000 GET/SET (9:1)
> > operations. I used memaslap for testing.
> >
> > Could anyone give me some suggestions about what causes such an error?
> >
> > Here are the gdb messages:
> >
> > Thread 8 "memcached" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x74d6c700 (LWP 36414)]
> > do_item_crawl_q (it=it@entry=0x5579e7e0) at items.c:2015
> > 2015          it->prev->next = it->next;
> > (gdb) print it->prev
> > $5 = (struct _stritem *) 0x4f4d6355616d5471
> > (gdb) print it->prev->next
> > Cannot access memory at address 0x4f4d6355616d5479
> > (gdb) print it->next
> > $6 = (struct _stritem *) 0x7a59324376753351
> > (gdb) print it->next->prev
> > Cannot access memory at address 0x7a59324376753361
> > (gdb) print it->nkey
> > $7 = 0 '\000'
> > (gdb)
> >
> > Here is the part that triggers the error:
> >
> > 2012          assert(it->next != it);