Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for Android
On 05/15, Sultan Alsawaf wrote: > > On Wed, May 15, 2019 at 04:58:32PM +0200, Oleg Nesterov wrote: > > Could you explain in detail what exactly did you do and what do you see in > > dmesg? > > > > Just in case, lockdep complains only once, print_circular_bug() does > > debug_locks_off() > > so it it has already reported another false positive __lock_acquire() will > > simply > > return after that. > > > > Oleg. > > This is what I did: > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c > index 774ab79d3ec7..009e7d431a88 100644 > --- a/kernel/locking/lockdep.c > +++ b/kernel/locking/lockdep.c > @@ -3078,6 +3078,7 @@ static int __lock_acquire(struct lockdep_map *lock, > unsigned int subclass, > int class_idx; > u64 chain_key; > > + BUG_ON(!debug_locks || !prove_locking); > if (unlikely(!debug_locks)) > return 0; > > diff --git a/lib/debug_locks.c b/lib/debug_locks.c > index 124fdf238b3d..4003a18420fb 100644 > --- a/lib/debug_locks.c > +++ b/lib/debug_locks.c > @@ -37,6 +37,7 @@ EXPORT_SYMBOL_GPL(debug_locks_silent); > */ > int debug_locks_off(void) > { > + return 0; > if (debug_locks && __debug_locks_off()) { > if (!debug_locks_silent) { > console_verbose(); OK, this means that debug_locks_off() always returns 0, as if debug_locks was already cleared. Thus print_deadlock_bug() will do nothing, it does if (!debug_locks_off_graph_unlock() || debug_locks_silent) return 0; iow this means that even if lockdep finds a problem, the problem won't be reported. > [1.492128] BUG: key not in .data! > [1.492141] BUG: key not in .data! > [1.492152] BUG: key not in .data! > [1.492228] BUG: key not in .data! > [1.492238] BUG: key not in .data! > [1.492248] BUG: key not in .data! I guess this is lockdep_init_map() which does printk("BUG:") itself, but due to your change above it doesn't do WARN(1) and thus there is no call trace. Oleg. ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for Android
On 05/13, Sultan Alsawaf wrote: > > On Fri, May 10, 2019 at 05:10:25PM +0200, Oleg Nesterov wrote: > > I am starting to think I am ;) > > > > If you have task1 != task2 this code > > > > task_lock(task1); > > task_lock(task2); > > > > should trigger print_deadlock_bug(), task1->alloc_lock and > > task2->alloc_lock are > > the "same" lock from lockdep pov, held_lock's will have the same > > hlock_class(). > > Okay, I've stubbed out debug_locks_off(), and lockdep is now complaining > about a > bunch of false positives so it is _really_ enabled this time. Could you explain in detail what exactly did you do and what do you see in dmesg? Just in case, lockdep complains only once, print_circular_bug() does debug_locks_off() so it it has already reported another false positive __lock_acquire() will simply return after that. Oleg. ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for Android
On 05/09, Sultan Alsawaf wrote: > > On Thu, May 09, 2019 at 05:56:46PM +0200, Oleg Nesterov wrote: > > Impossible ;) I bet lockdep should report the deadlock as soon as > > find_victims() > > calls find_lock_task_mm() when you already have a locked victim. > > I hope you're not a betting man ;) I am starting to think I am ;) If you have task1 != task2 this code task_lock(task1); task_lock(task2); should trigger print_deadlock_bug(), task1->alloc_lock and task2->alloc_lock are the "same" lock from lockdep pov, held_lock's will have the same hlock_class(). > CONFIG_PROVE_LOCKING=y OK, > And a printk added in vtsk_is_duplicate() to print when a duplicate is > detected, in this case find_lock_task_mm() won't be called, and this is what saves us from the actual deadlock. > and my phone's memory cut in half to make simple_lmk do something, this is > what > I observed: > taimen:/ # dmesg | grep lockdep > [0.00] \x09RCU lockdep checking is enabled. this reports that CONFIG_PROVE_RCU is enabled ;) > taimen:/ # dmesg | grep simple_lmk > [ 23.211091] simple_lmk: Killing android.carrier with adj 906 to free 37420 > kiB > [ 23.211160] simple_lmk: Killing oadcastreceiver with adj 906 to free 36784 > kiB yes, looks like simple_lmk has at least 2 locked victims. And I have no idea why you do not see anything else in dmesg. May be debug_locks_off() was already called. But see above, "grep lockdep" won't work. Perhaps you can do "grep -e WARNING -e BUG -e locking". Oleg. ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for Android
On 05/07, Sultan Alsawaf wrote: > > On Tue, May 07, 2019 at 05:31:54PM +0200, Oleg Nesterov wrote: > > > Did you test this patch with lockdep enabled? > > > > If I read the patch correctly, lockdep should complain. vtsk_is_duplicate() > > ensures that we do not take the same ->alloc_lock twice or more, but lockdep > > can't know this. > > Yeah, lockdep is fine with this, at least on 4.4. Impossible ;) I bet lockdep should report the deadlock as soon as find_victims() calls find_lock_task_mm() when you already have a locked victim. Nevermind, I guess this code won't run with lockdep enabled... As for https://github.com/kerneltoast/android_kernel_google_wahoo/commit/afc8c9bf2dbde95941253c168d1adb64cfa2e3ad Well, mmdrop(mm); simple_lmk_mm_freed(mm); looks racy because mmdrop(mm) can free this mm_struct. Yes, simple_lmk_mm_freed() does not dereference this pointer, but the same memory can be re-allocated as another ->mm for the new task which can be found by find_victims(), and _in theory_ this all can happen in between, so the "victims[i].mm == mm" can be false positive. And this also means that simple_lmk_mm_freed() should clear victims[i].mm when it detects "victims[i].mm == mm", otherwise we have the same theoretical race, victims_to_kill is only cleared when the last victim goes away. Another nit... you can drop tasklist_lock right after the 1st "find_victims" loop. And it seems that you do not really need to walk the "victims" array twice after that, you can do everything in a single loop, but this is cosmetic. Oleg. ___ devel mailing list de...@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for Android
I am not going to comment the intent, but to be honest I am skeptical too. On 05/06, Sultan Alsawaf wrote: > > +static unsigned long find_victims(struct victim_info *varr, int *vindex, > + int vmaxlen, int min_adj, int max_adj) > +{ > + unsigned long pages_found = 0; > + int old_vindex = *vindex; > + struct task_struct *tsk; > + > + for_each_process(tsk) { > + struct task_struct *vtsk; > + unsigned long tasksize; > + short oom_score_adj; > + > + /* Make sure there's space left in the victim array */ > + if (*vindex == vmaxlen) > + break; > + > + /* Don't kill current, kthreads, init, or duplicates */ > + if (same_thread_group(tsk, current) || > + tsk->flags & PF_KTHREAD || > + is_global_init(tsk) || > + vtsk_is_duplicate(varr, *vindex, tsk)) > + continue; > + > + vtsk = find_lock_task_mm(tsk); Did you test this patch with lockdep enabled? If I read the patch correctly, lockdep should complain. vtsk_is_duplicate() ensures that we do not take the same ->alloc_lock twice or more, but lockdep can't know this. > +static void scan_and_kill(unsigned long pages_needed) > +{ > + static DECLARE_WAIT_QUEUE_HEAD(victim_waitq); > + struct victim_info victims[MAX_VICTIMS]; > + int i, nr_to_kill = 0, nr_victims = 0; > + unsigned long pages_found = 0; > + atomic_t victim_count; > + > + /* > + * Hold the tasklist lock so tasks don't disappear while scanning. This > + * is preferred to holding an RCU read lock so that the list of tasks > + * is guaranteed to be up to date. Keep preemption disabled until the > + * SIGKILLs are sent so the victim kill process isn't interrupted. > + */ > + read_lock(_lock); > + preempt_disable(); read_lock() disables preemption, every task_lock() too, so this looks unnecessary. > + for (i = 1; i < ARRAY_SIZE(adj_prio); i++) { > + pages_found += find_victims(victims, _victims, MAX_VICTIMS, > + adj_prio[i], adj_prio[i - 1]); > + if (pages_found >= pages_needed || nr_victims == MAX_VICTIMS) > + break; > + } > + > + /* > + * Calculate the number of tasks that need to be killed and quickly > + * release the references to those that'll live. > + */ > + for (i = 0, pages_found = 0; i < nr_victims; i++) { > + struct victim_info *victim = [i]; > + struct task_struct *vtsk = victim->tsk; > + > + /* The victims' mm lock is taken in find_victims; release it */ > + if (pages_found >= pages_needed) { > + task_unlock(vtsk); > + continue; > + } > + > + /* > + * Grab a reference to the victim so it doesn't disappear after > + * the tasklist lock is released. > + */ > + get_task_struct(vtsk); The comment doesn't look correct. the victim can't dissapear until task_unlock() below, it can't pass exit_mm(). > + pages_found += victim->size; > + nr_to_kill++; > + } > + read_unlock(_lock); > + > + /* Kill the victims */ > + victim_count = (atomic_t)ATOMIC_INIT(nr_to_kill); > + for (i = 0; i < nr_to_kill; i++) { > + struct victim_info *victim = [i]; > + struct task_struct *vtsk = victim->tsk; > + > + pr_info("Killing %s with adj %d to free %lu kiB\n", vtsk->comm, > + vtsk->signal->oom_score_adj, > + victim->size << (PAGE_SHIFT - 10)); > + > + /* Configure the victim's mm to notify us when it's freed */ > + vtsk->mm->slmk_waitq = _waitq; > + vtsk->mm->slmk_counter = _count; > + > + /* Accelerate the victim's death by forcing the kill signal */ > + do_send_sig_info(SIGKILL, SIG_INFO_TYPE, vtsk, true); this should be PIDTYPE_TGID > + > + /* Finally release the victim's mm lock */ > + task_unlock(vtsk); > + } > + preempt_enable_no_resched(); See above. And I don't understand how can _no_resched() really help... > + > + /* Try to speed up the death process now that we can schedule again */ > + for (i = 0; i < nr_to_kill; i++) { > + struct task_struct *vtsk = victims[i].tsk; > + > + /* Increase the victim's priority to make it die faster */ > + set_user_nice(vtsk, MIN_NICE); > + > + /* Allow the victim to run on any CPU */ > + set_cpus_allowed_ptr(vtsk, cpu_all_mask); > + > + /* Finally release the victim reference acquired earlier */ > + put_task_struct(vtsk); > + } > + > + /* Wait