Try '-o no_lru_crawler'? That definitely works. I can't tell what you're doing since no code has been provided. The locks around managing LRU tails are pretty strict, so make sure you are actually using them correctly.
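To make the locking point concrete, here is a rough sketch of a tail walk that picks a victim while holding the LRU lock for the whole walk, and that leaves the crawler's placeholder entry (described in the next paragraph) linked exactly where it is. Everything in it is a simplified stand-in invented for illustration: `node`, `lru`, `is_crawler`, `score`, `evict_one`, `max_score` and `max_tries` are hypothetical names, not memcached's actual structs or functions, so treat it as a shape to compare your code against rather than something to paste in.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT memcached's real structs. */
typedef struct node {
    struct node *prev;  /* neighbour toward the head */
    struct node *next;  /* neighbour toward the tail */
    bool is_crawler;    /* true for the crawler's placeholder entry */
    int score;          /* made-up "suitability" metric; low = evict */
} node;

typedef struct {
    node *head;
    node *tail;
    pthread_mutex_t lock;  /* must be held for every link/unlink */
} lru;

static void lru_init(lru *q) {
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
}

/* Caller must hold q->lock. */
static void unlink_locked(lru *q, node *n) {
    if (n->prev) n->prev->next = n->next; else q->head = n->next;
    if (n->next) n->next->prev = n->prev; else q->tail = n->prev;
    n->prev = n->next = NULL;
}

/* Caller must hold q->lock. */
static void push_head_locked(lru *q, node *n) {
    n->prev = NULL;
    n->next = q->head;
    if (q->head) q->head->prev = n; else q->tail = n;
    q->head = n;
}

/*
 * Walk from the tail toward the head. The first real item whose score is
 * <= max_score is unlinked and returned; real items passed over on the way
 * are reinserted at the head. The crawler placeholder is never unlinked,
 * moved, or returned. max_tries bounds the walk, since reinserted items
 * can come around again.
 */
static node *evict_one(lru *q, int max_score, int max_tries) {
    node *victim = NULL;
    pthread_mutex_lock(&q->lock);
    node *cur = q->tail;
    for (int tries = 0; cur != NULL && tries < max_tries; tries++) {
        node *toward_head = cur->prev;  /* save before relinking cur */
        if (cur->is_crawler) {
            cur = toward_head;          /* skip it, leave it in place */
            continue;
        }
        unlink_locked(q, cur);
        if (cur->score <= max_score) {
            victim = cur;
            break;
        }
        push_head_locked(q, cur);       /* not suitable: back to the head */
        cur = toward_head;
    }
    pthread_mutex_unlock(&q->lock);
    return victim;
}
```

With a queue ordered head to tail as a (score 5), the crawler placeholder, b (score 1), c (score 9), a call to `evict_one(&q, 3, 10)` moves c to the head, returns b, and leaves the placeholder as the new tail, still linked after a. If your walk can pick up the placeholder as a victim, or relinks neighbours without the lock held, the crawler's prev/next pointers end up pointing at garbage.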
The LRU crawler works by injecting a fake item into the LRU, then using that item to keep its position as it walks. If I had to guess, I bet you've "evicted" the LRU crawler's item, and the crawler then dies as soon as it tries to continue crawling.

On Mon, 31 May 2021, Qingchen Dang wrote:

> Furthermore, I tried to disable the crawler with the '- no_lru_crawler'
> command parameter, and it gives the same error. I wonder why it does not
> disable the LRU crawler as it is supposed to.
>
> On Monday, May 31, 2021 at 1:02:38 AM UTC-4 Qingchen Dang wrote:
> Hi,
> I am implementing a framework based on Memcached. There's a problem that
> has confused me a lot. The framework basically changes the eviction policy,
> so when it is called to evict an item, it might not evict the tail item of
> the COLD LRU. Instead it looks for a "more suitable" item to evict and
> reinserts the tail items at the head of the COLD queue.
>
> It mostly works fine, but sometimes it causes a segfault when reinsertion
> happens very frequently (like in almost every eviction). The segfault is
> triggered in the crawler part. As attached, it seems that when the crawler
> loops through the item queue, it reaches an invalid memory address. The bug
> happens after around 50000000~10000000 GET/SET (9:1) operations. I used
> Memaslap for testing.
>
> Could anyone give me some suggestions about what might cause this error?
>
> Here are the gdb messages:
>
> > Thread 8 "memcached" received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7ffff4d6c700 (LWP 36414)]
> > do_item_crawl_q (it=it@entry=0x55555579e7e0 <crawlers+12320>)
> >     at items.c:2015
> > 2015            it->prev->next = it->next;
> > (gdb) print it->prev
> > $5 = (struct _stritem *) 0x4f4d6355616d5471
> > (gdb) print it->prev->next
> > Cannot access memory at address 0x4f4d6355616d5479
> > (gdb) print it->next
> > $6 = (struct _stritem *) 0x7a59324376753351
> > (gdb) print it->next->prev
> > Cannot access memory at address 0x7a59324376753361
> > (gdb) print it->nkey
> > $7 = 0 '\000'
> > (gdb)
> >
> > Here is the part that triggers the error:
> >
> > 2012    assert(it->next != it);
> > 2013    if (it->next) {
> > 2014        assert(it->prev->next == it);
> > 2015        it->prev->next = it->next;
> > 2016        it->next->prev = it->prev;
> > 2017    } else {
> > 2018        /* Tail. Move this above? */
> > 2019        it->prev->next = 0;
> > 2020    }
> >
> > (I'm also confused about why the assert on line 2014 does not raise an
> > error.)
> >
> > Thank you very much for helping!
> >
> > Best,
> > Qingchen
> >
> > --
> >
> > ---
> You received this message because you are subscribed to the Google Groups
> "memcached" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to memcached+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/memcached/1398d377-06b8-4a43-8811-f299d044d055n%40googlegroups.com.
>

--
---
You received this message because you are subscribed to the Google Groups "memcached" group.
To unsubscribe from this group and stop receiving emails from it, send an email to memcached+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/memcached/1f184a63-c220-c949-91f9-9aeca3ff1d85%40rydia.net.