On Nov 19, 2007 8:56 AM, Ingo Molnar <[EMAIL PROTECTED]> wrote: > > * Torsten Kaiser <[EMAIL PROTECTED]> wrote: > > > Trying the last NFSv4 patch (but that patch is only the cause, why I > > had lockdep enabled) I got this: > > [ 64.550203] > > [ 64.550205] ========================= > > [ 64.552213] [ BUG: held lock freed! ] > > [ 64.553633] ------------------------- > > [ 64.555055] kcryptd/1022 is freeing memory > > FFFF81011EBEFB00-FFFF81011EBEFB3F, with a lock still held there! > > so kcryptd frees a live, still in use bio? That could be a receipe for > data corruption. Does SLUB_DEBUG (or SLAB_DEBUG) catch anything?
I have SLUB_DEBUG=y, but not SLUB_DEBUG_ON. But apart from this message I did not the anything in the syslog. It seems to not be onetime event, as one the third boot it happend again. Stacktrace was identical. Sadly trying 3 boots with slub_debug=FZP and another one with only F did not trigger it. But I don't think kcryptd is freeing a bio at that point. The message said about the freed lock: (kcryptd){--..}, at: [<ffffffff80247dd9>] (gdb) list *0xffffffff80247dd9 0xffffffff80247dd9 is in run_workqueue (include/asm/bitops_64.h:69). 64 * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit() 65 * in order to ensure changes are visible on other processors. 66 */ 67 static inline void clear_bit(int nr, volatile void *addr) 68 { 69 __asm__ __volatile__( LOCK_PREFIX 70 "btrl %1,%0" 71 :ADDR 72 :"dIr" (nr)); 73 } increasing the addr a little bit shows: (gdb) list *0xffffffff80247ddf 0xffffffff80247ddf is in run_workqueue (kernel/workqueue.c:275). 270 list_del_init(cwq->worklist.next); 271 spin_unlock_irq(&cwq->lock); 272 273 BUG_ON(get_wq_data(work) != cwq); 274 work_clear_pending(work); 275 lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_); 276 lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_); 277 f(work); 278 lock_release(&lockdep_map, 1, _THIS_IP_); 279 lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_); Above this acquire/release sequence is the following comment: #ifdef CONFIG_LOCKDEP /* * It is permissible to free the struct work_struct * from inside the function that is called from it, * this we need to take into account for lockdep too. * To avoid bogus "held lock freed" warnings as well * as problems when looking into work->lockdep_map, * make a copy and use that here. */ struct lockdep_map lockdep_map = work->lockdep_map; #endif Did something trigger this anyway? Anything I could try, apart from more boots with slub_debug=F? Torsten - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/