Try '-o no_lru_crawler'? That definitely works.

I don't know what you're doing since no code has been provided. The locks
around managing the LRU tails are pretty strict, so make sure you are
actually using them correctly.

The LRU crawler works by injecting a fake item into the LRU, then using
that item to keep its position as it walks. If I had to guess, I'd bet
you've "evicted" the LRU crawler's item, and the crawler then dies as soon
as it tries to continue crawling.
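
To illustrate the idea, here is a minimal sketch of a placeholder node walking a doubly linked list from tail to head. This is not memcached's actual code: the names `node`, `lru`, `lru_push_tail`, `lru_unlink`, `crawler_step` and `crawl_count` are all made up for the example, and there is no locking at all. The point is only that every step trusts the placeholder's own `prev`/`next` pointers, so if other code unlinks, frees, or overwrites the placeholder, the next step dereferences garbage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified item. memcached's real item struct has many
 * more fields; only the list links matter for this sketch. */
typedef struct node {
    struct node *prev;   /* neighbour toward the head */
    struct node *next;   /* neighbour toward the tail */
    int is_crawler;      /* 1 for the fake placeholder item */
} node;

typedef struct {
    node *head;
    node *tail;
} lru;

/* Append n at the tail of q. */
static void lru_push_tail(lru *q, node *n) {
    n->next = NULL;
    n->prev = q->tail;
    if (q->tail) q->tail->next = n; else q->head = n;
    q->tail = n;
}

/* Unlink n from q, fixing up head/tail when n is at either end. */
static void lru_unlink(lru *q, node *n) {
    if (n->prev) n->prev->next = n->next; else q->head = n->next;
    if (n->next) n->next->prev = n->prev; else q->tail = n->prev;
    n->prev = n->next = NULL;
}

/* Move the placeholder c one position toward the head. Returns the real
 * item it stepped over, or NULL once it has reached the head (in which
 * case c is removed from the list). */
static node *crawler_step(lru *q, node *c) {
    node *passed = c->prev;          /* valid only if c is still linked */
    lru_unlink(q, c);
    if (passed == NULL) return NULL; /* c was at the head: crawl is done */
    /* Re-link c immediately before the item it just passed. */
    c->next = passed;
    c->prev = passed->prev;
    if (passed->prev) passed->prev->next = c; else q->head = c;
    passed->prev = c;
    return passed;
}

/* Build a list of n_items real items plus one placeholder at the tail,
 * crawl to the head, and return how many real items were visited. */
static int crawl_count(int n_items) {
    node items[8];
    node crawler = { NULL, NULL, 1 };
    lru q = { NULL, NULL };
    int visited = 0;

    assert(n_items >= 0 && n_items <= 8);
    for (int i = 0; i < n_items; i++) {
        items[i].is_crawler = 0;
        lru_push_tail(&q, &items[i]);
    }
    lru_push_tail(&q, &crawler);

    while (crawler_step(&q, &crawler) != NULL) visited++;
    return visited;
}
```

In this sketch, an eviction path that treated the placeholder like a normal item and recycled its memory would leave `c->prev` and `c->next` pointing at arbitrary bytes before the next call to `crawler_step`. Custom eviction code walking the same list would presumably need to skip such placeholder nodes (here marked with the made-up `is_crawler` flag) rather than evict or move them.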

On Mon, 31 May 2021, Qingchen Dang wrote:

> Furthermore, I tried to disable the crawler with the '- no_lru_crawler'
> command parameter, and it gives the same error. I wonder why it does not
> disable the LRU crawler as it is supposed to.
>
> On Monday, May 31, 2021 at 1:02:38 AM UTC-4 Qingchen Dang wrote:
>       Hi,
> I am implementing a framework based on Memcached, and there's a problem
> that has confused me a lot. The framework basically changes the eviction
> policy: when it is called to evict an item, it might not evict the tail
> item of the COLD LRU. Instead it looks for a "more suitable" item to evict
> and reinserts the tail items at the head of the COLD queue.
>
> It mostly works fine, but sometimes it causes a segfault when reinsertion
> happens very frequently (like in almost every eviction). The segfault is
> triggered in the crawler part. As attached, it seems that when the crawler
> loops through the item queue, it reaches an invalid memory address. The bug
> happens after around 50000000~10000000 GET/SET (9:1) operations. I used
> Memaslap for testing.
>
> Could anyone suggest what might cause such an error?
>
> Here are the gdb messages:
>
> Thread 8 "memcached" received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x7ffff4d6c700 (LWP 36414)]
> do_item_crawl_q (it=it@entry=0x55555579e7e0 <crawlers+12320>)
>     at items.c:2015
> 2015             it->prev->next = it->next;
> (gdb) print it->prev
> $5 = (struct _stritem *) 0x4f4d6355616d5471
> (gdb) print it->prev->next
> Cannot access memory at address 0x4f4d6355616d5479
> (gdb) print it->next
> $6 = (struct _stritem *) 0x7a59324376753351
> (gdb) print it->next->prev
> Cannot access memory at address 0x7a59324376753361
> (gdb) print it->nkey
> $7 = 0 '\000'
> (gdb)
>
> Here is the part that triggers the error:
>
> 2012         assert(it->next != it);
> 2013         if (it->next) {
> 2014             assert(it->prev->next == it);
> 2015             it->prev->next = it->next;
> 2016             it->next->prev = it->prev;
> 2017         } else {
> 2018             /* Tail. Move this above? */
> 2019             it->prev->next = 0;
> 2020         }
>
> (I'm also confused about why the assert on line 2014 does not fail.)
>
> Thank you very much for helping!
>
> Best,
>
> Qingchen
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups 
> "memcached" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to memcached+unsubscr...@googlegroups.com.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/memcached/1398d377-06b8-4a43-8811-f299d044d055n%40googlegroups.com.
>
>

