On Wed, Jul 29, 2015 at 05:05:49PM +0200, Michal Hocko wrote:
> On Wed 29-07-15 09:14:54, Johannes Weiner wrote:
> > On Tue, Jul 14, 2015 at 05:18:23PM +0200, Michal Hocko wrote:
> [...]
> > > 3) fail mem_cgroup_can_attach if we are trying to migrate a task sharing
> > > mm_struct with a process outside of the tset. If I understand the
> > > tset properly this would require all the sharing tasks to be migrated
> > > together and we would never end up with task_css != &task->mm->css.
> > > __cgroup_procs_write doesn't seem to support multi pid move currently
> > > AFAICS, though. cgroup_migrate_add_src, however, seems to be intended
> > > for this purpose so this should be doable. Without that support we would
> > > basically disallow migrating these tasks - I wouldn't object if you ask
> > > me.
> >
> > I'd prefer not adding controller-specific failure modes for attaching,
>
> Does this mean that there is a plan to drop the return value from
> can_attach? I can see that both cpuset and cpu controllers currently
> allow to fail to attach. Are those going to change? I remember some
> discussions but no clear outcome of those.

Nothing but the realtime stuff needs to be able to fail migration due
to controller restraints. This should probably remain a fringe thing,
because it does make for a much more ambiguous interface. So I think
can_attach() will have to stay, but it should be avoided.

> > and this too would lead to very non-obvious behavior.
>
> Yeah, the user will not get an error source with the current API but
> this is an inherent restriction currently. Maybe we can add a knob with
> the error source?
>
> If there is a clear consensus that can_attach failures are clearly a no
> go then what about "silent" moving of the associated tasks? This would
> be similar to thread group except the group would be more generic term.
>
> > > Do you see other options? From the above three options the 3rd one
> > > sounds the most sane to me and the 1st quite easy to implement. Both will
> > > require some cgroup core work though. But maybe we would be good enough
> > > with 3rd option without supporting moving schizophrenic tasks and that
> > > would be reduced to memcg code.
> >
> > A modified form of 1) would be to track the mms referring to a memcg
> > but during offline search the process tree for a matching task.
>
> But we might have many of those and all of them living in different
> cgroups. So which one do we take? The first encountered, the one with
> the majority? I am not sure this is much better.
>
> I would really prefer if we could get rid of the schizophrenia if it is
> possible.

The first encountered. This is just our model for sharing memory
across groups. Page cache, writeback, address space--we have always
accounted based on who's touching it first. We might as well stick
with it for shared mms.
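
To make the "first encountered" rule concrete, here is a minimal
userspace sketch. It is not kernel code: struct model_task,
first_owner_cgroup() and the example table are hypothetical names made
up for illustration, with plain integers standing in for task->mm and
for the memcg. It only shows the policy: walk the tasks in order and
charge the shared mm to the group of the first task found using it.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these types exist in the kernel. */
struct model_task {
	int pid;
	int mm_id;	/* stands in for task->mm */
	int cgroup_id;	/* stands in for the task's memcg */
};

/*
 * Walk the tasks in order and return the cgroup of the first task that
 * uses mm_id, or -1 if no task uses it. Later tasks sharing the same mm
 * are ignored, which is the "first encountered" rule.
 */
static int first_owner_cgroup(const struct model_task *tasks, size_t n,
			      int mm_id)
{
	for (size_t i = 0; i < n; i++) {
		if (tasks[i].mm_id == mm_id)
			return tasks[i].cgroup_id;
	}
	return -1;
}

/* Example walk order: pids 101 and 102 share mm 7 across two cgroups. */
static const struct model_task example_tasks[] = {
	{ 100, 3, 1 },
	{ 101, 7, 2 },
	{ 102, 7, 5 },
};
```

With this table, a lookup for mm 7 picks cgroup 2 because pid 101 comes
first in the walk, even though pid 102 in cgroup 5 shares the same mm.
The result depends on walk order, which is the same property
first-touch accounting already has.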