On 08/23, Cyrill Gorcunov wrote:
>
> On Sat, Aug 23, 2014 at 01:53:02PM +0200, Oleg Nesterov wrote:
> > >
> > > It should protect from allocation/deletion/merging of another vma. IOW when
> > > I lookup for vma I need to be sure it exists and won't disappear at least
> > > while I validate it.
> >
> > plus you need mmap_sem (at least for reading) when you update mm_struct,
> > this is clear.
> >
> > My question was why the whole function should be called under mmap_sem?
> > It could take it only around find_vma() + check(RLIMIT_STACK) ?
>
> Strictly speaking yes, but don't forget we might need to update
> exe::file as well which requires lock to be taken.

For reading? I see prctl_set_mm_exe_file_locked() in this patch, probably
this function was added by another patch. But, if this function calls
set_mm_exe_file() (I guess it does?) then down_read() is not enough?
set_mm_exe_file() can race with itself.

And this still doesn't answer my question. As I said, I understand that we
need mmap_sem to update mm_struct, and this is what prctl_set_mm_map() does
at the end. And it also calls prctl_set_mm_exe_file_locked(), while
validate_prctl_map_locked() doesn't do this.

> So it is simpler
> to take the read-lock for the whole function.

Still can't understand why validate_prctl_map_locked() should be called
under this lock. OK, I won't insist.

> > In fact I do not think we need this vma_stack/RLIMIT_STACK check at all.
> > It buys nothing and looks strange. RLIMIT_STACK is mostly for
> > self-debugging, to catch the, say, unlimited recursion. An application
> > can trivially create a stack region of arbitrary size. I'd seriously
> > suggest to remove it.
>
> Look, allocate stack for self is not a problem (we do this for our parasite
> code which executes inside dumpee address space) but RLIMIT_STACK check is
> present in ipc shmem so I think we still need this check for the sake of
> consistency.

But for what? Ignoring the (I think buggy) check in do_shmat(),
->start_stack is simply unused, we only report it via /proc/. The same
for, say, mm->start_code.

It seems that only start_brk/end_data/brk need some validation. Perhaps
something else, I didn't try to verify. So why do we need these confusing
checks?

> > > > > +	if (prctl_map.auxv_size) {
> > > > > +		/* Last entry must be AT_NULL as specification requires */
> > > > > +		user_auxv[AT_VECTOR_SIZE - 2] = AT_NULL;
> > > > > +		user_auxv[AT_VECTOR_SIZE - 1] = AT_NULL;
> > > > > +
> > > > > +		task_lock(current);
> > > > > +		memcpy(mm->saved_auxv, user_auxv, sizeof(user_auxv));
> > > > > +		task_unlock(current);
> > > >
> > > > Again, could you explain this task_lock() ?
> > >
> > > It is used for serialization access to saved_auxv, ie when we fill it
> > > with new data the other reader (via procfs interface) should wait until
> > > we finish.
> >
> > But proc_pid_auxv() doesn't take this lock? And even if it did, this lock
> > can't help. task_lock() is per-thread, and multiple threads (including
> > CLONE_VM tasks, vfork() for example) can share the same ->mm.
> >
> > This certainly doesn't look right.
>
> It takes this lock

Where? Another patch I missed ? ;)

> but indeed this won't help much.

Yes, it can't help at all.

> Looks like I need
> to use cred_guard_mutex instead of task_lock here, no?

Please don't. First of all, it can't help because proc_pid_auxv() doesn't
hold this lock. It does mm_access() which drops this lock after return.
And to remind, we are going to remove mm_access/lock_trace from sys_read()
paths in proc.

Oleg.
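
The point about task_lock() being per-thread can be illustrated with a small
userspace sketch. This is not kernel code: pthread mutexes stand in for the
kernel locks, and struct fake_mm, struct fake_task and every helper below are
hypothetical names invented for the illustration. It only checks, on a single
thread, whether a second task's lock can still be acquired while the first
task's lock is held.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a shared address space (not struct mm_struct). */
struct fake_mm {
	pthread_mutex_t mm_lock;	/* one lock per shared mm */
	unsigned long saved_auxv[4];
};

/* Hypothetical stand-in for a task; several tasks may point at one mm. */
struct fake_task {
	pthread_mutex_t alloc_lock;	/* per-task lock, analogue of task_lock() */
	struct fake_mm *mm;
};

static void fake_mm_init(struct fake_mm *mm)
{
	pthread_mutex_init(&mm->mm_lock, NULL);
}

static void fake_task_init(struct fake_task *t, struct fake_mm *mm)
{
	pthread_mutex_init(&t->alloc_lock, NULL);
	t->mm = mm;
}

/*
 * Hold a's per-task lock and check whether b's per-task lock is still free.
 * Returns 1 if b is kept out, 0 if b could enter the critical section.
 */
static int per_task_lock_excludes(struct fake_task *a, struct fake_task *b)
{
	int rc;

	pthread_mutex_lock(&a->alloc_lock);
	rc = pthread_mutex_trylock(&b->alloc_lock);
	if (rc == 0)
		pthread_mutex_unlock(&b->alloc_lock);
	pthread_mutex_unlock(&a->alloc_lock);
	return rc == EBUSY;
}

/* Same check, but both tasks go through the lock that lives in the mm. */
static int per_mm_lock_excludes(struct fake_task *a, struct fake_task *b)
{
	int rc;

	pthread_mutex_lock(&a->mm->mm_lock);
	rc = pthread_mutex_trylock(&b->mm->mm_lock);
	if (rc == 0)
		pthread_mutex_unlock(&b->mm->mm_lock);
	pthread_mutex_unlock(&a->mm->mm_lock);
	return rc == EBUSY;
}
```

With two fake tasks initialised against the same fake_mm, the per-task check
reports no exclusion while the per-mm check does, which is the reason a
per-task lock cannot serialize writers of data that lives in the shared mm.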