Re: [Devel] [PATCH -mm v2 1/2] mem-hotplug: implement get/put_online_mems

2014-04-18 Thread Andrew Morton
On Mon, 7 Apr 2014 13:45:34 +0400 Vladimir Davydov wrote: > {un}lock_memory_hotplug, which is used to synchronize against memory > hotplug, is currently backed by a mutex, which makes it a bit of a > hammer - threads that only want to get a stable value of online nodes > mask won't be able to pr

Re: [Devel] [PATCH -mm v2.2] mm: get rid of __GFP_KMEMCG

2014-04-10 Thread Andrew Morton
On Thu, 3 Apr 2014 19:05:59 +0400 Vladimir Davydov wrote: > Currently to allocate a page that should be charged to kmemcg (e.g. > threadinfo), we pass __GFP_KMEMCG flag to the page allocator. The page > allocated is then to be freed by free_memcg_kmem_pages. Apart from > looking asymmetrical, th

Re: [Devel] [PATCH -mm v3 2/7] memcg, slab: cleanup memcg cache creation

2014-02-21 Thread Andrew Morton
On Thu, 20 Feb 2014 11:22:04 +0400 Vladimir Davydov wrote: > This patch cleanups the memcg cache creation path as follows: > - Move memcg cache name creation to a separate function to be called >from kmem_cache_create_memcg(). This allows us to get rid of the >mutex protecting the tempo

Re: [Devel] [PATCH 1/2] kobject: don't block for each kobject_uevent

2014-02-13 Thread Andrew Morton
On Sun, 9 Feb 2014 14:56:15 +0400 Vladimir Davydov wrote: > Currently kobject_uevent has somewhat unpredictable semantics. The point > is, since it may call a usermode helper and wait for it to execute > (UMH_WAIT_EXEC), it is impossible to say for sure what lock dependencies > it will introduce

Re: [Devel] [PATCH 1/2] kobject: don't block for each kobject_uevent

2014-02-11 Thread Andrew Morton
On Sun, 9 Feb 2014 14:56:15 +0400 Vladimir Davydov wrote: > Currently kobject_uevent has somewhat unpredictable semantics. The point > is, since it may call a usermode helper and wait for it to execute > (UMH_WAIT_EXEC), it is impossible to say for sure what lock dependencies > it will introduce

Re: [Devel] [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic

2014-02-05 Thread Andrew Morton
On Wed, 5 Feb 2014 11:16:49 +0400 Vladimir Davydov wrote: > > So why did I originally make DEFAULT_SEEKS=2? Because I figured that to > > recreate (say) an inode would require a seek to the inode data then a > > seek back. Is it legitimate to include the > > seek-back-to-what-you-were-doing-be

Re: [Devel] [PATCH 2/3] mm: vmscan: get rid of DEFAULT_SEEKS and document shrink_slab logic

2014-02-04 Thread Andrew Morton
On Fri, 17 Jan 2014 23:25:30 +0400 Vladimir Davydov wrote: > Each shrinker must define the number of seeks it takes to recreate a > shrinkable cache object. It is used to balance slab reclaim vs page > reclaim: assuming it costs one seek to replace an LRU page, we age equal > percentages of the

Re: [Devel] [PATCH v2 2/7] memcg, slab: cleanup memcg cache name creation

2014-02-03 Thread Andrew Morton
On Mon, 3 Feb 2014 19:54:37 +0400 Vladimir Davydov wrote: > The way memcg_create_kmem_cache() creates the name for a memcg cache > looks rather strange: it first formats the name in the static buffer > tmp_name protected by a mutex, then passes the pointer to the buffer to > kmem_cache_create_me

Re: [Devel] [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory

2014-01-15 Thread Andrew Morton
On Wed, 15 Jan 2014 19:55:11 +0400 Vladimir Davydov wrote: > > > > We could avoid the "scan 32 then scan just 1" issue with something like > > > > if (total_scan > batch_size) > > total_scan %= batch_size; > > > > before the loop. But I expect the effects of that will be unmeasu

Re: [Devel] [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory

2014-01-15 Thread Andrew Morton
On Wed, 15 Jan 2014 12:47:35 +0400 Vladimir Davydov wrote: > On 01/15/2014 02:14 AM, Andrew Morton wrote: > > On Tue, 14 Jan 2014 11:23:30 +0400 Vladimir Davydov > > wrote: > > > >> On 01/14/2014 03:05 AM, Andrew Morton wrote: > >>> That being said,

Re: [Devel] [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory

2014-01-14 Thread Andrew Morton
On Tue, 14 Jan 2014 11:23:30 +0400 Vladimir Davydov wrote: > On 01/14/2014 03:05 AM, Andrew Morton wrote: > > On Sat, 11 Jan 2014 16:36:31 +0400 Vladimir Davydov > > wrote: > > > >> When reclaiming kmem, we currently don't scan slabs that have les

Re: [Devel] [PATCH 4/5] mm: vmscan: move call to shrink_slab() to shrink_zones()

2014-01-13 Thread Andrew Morton
On Sat, 11 Jan 2014 16:36:34 +0400 Vladimir Davydov wrote: > This reduces the indentation level of do_try_to_free_pages() and removes > extra loop over all eligible zones counting the number of on-LRU pages. So this should cause no functional change, yes?

Re: [Devel] [PATCH 3/5] mm: vmscan: respect NUMA policy mask when shrinking slab on direct reclaim

2014-01-13 Thread Andrew Morton
On Sat, 11 Jan 2014 16:36:33 +0400 Vladimir Davydov wrote: > When direct reclaim is executed by a process bound to a set of NUMA > nodes, we should scan only those nodes when possible, but currently we > will scan kmem from all online nodes even if the kmem shrinker is NUMA > aware. That said, b

Re: [Devel] [PATCH 1/5] mm: vmscan: shrink all slab objects if tight on memory

2014-01-13 Thread Andrew Morton
On Sat, 11 Jan 2014 16:36:31 +0400 Vladimir Davydov wrote: > When reclaiming kmem, we currently don't scan slabs that have less than > batch_size objects (see shrink_slab_node()): > > while (total_scan >= batch_size) { > shrinkctl->nr_to_scan = batch_size; >

Re: [Devel] [PATCH v11 00/15] kmemcg shrinkers

2013-11-26 Thread Andrew Morton
On Tue, 26 Nov 2013 16:55:43 +0400 Vladimir Davydov wrote: > What do you think about splitting this set into two main series as follows: > > 1) Prepare vmscan to kmemcg-aware shrinkers; would include patches 1-7 > of this set. > 2) Make fs shrinkers memcg-aware; would include patches 9-11 of t

Re: [Devel] [PATCH] mm: strictlimit feature -v4

2013-08-21 Thread Andrew Morton
On Wed, 21 Aug 2013 17:56:32 +0400 Maxim Patlasov wrote: > The feature prevents mistrusted filesystems to grow a large number of dirty > pages before throttling. For such filesystems balance_dirty_pages always > check bdi counters against bdi limits. I.e. even if global "nr_dirty" is under > "fr

Re: [Devel] [PATCH] mqueue: sys_mq_open: do not call mnt_drop_write() if read-only

2013-03-19 Thread Andrew Morton
On Tue, 19 Mar 2013 13:31:18 +0400 Vladimir Davydov wrote: > mnt_drop_write() must be called only if mnt_want_write() succeeded, > otherwise the mnt_writers counter will diverge. > > ... > > --- a/ipc/mqueue.c > +++ b/ipc/mqueue.c > @@ -840,7 +840,8 @@ out_putfd: > fd = error; >
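
The pairing rule stated in this changelog can be sketched in userspace C. This is only an illustration under assumptions: `struct sketch_mount`, `sketch_want_write()`, `sketch_drop_write()` and `sketch_open()` are hypothetical stand-ins, not the kernel's mnt_want_write()/mnt_drop_write() or the actual ipc/mqueue.c code, and the plain integer only models the idea of a writers counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mount with a writers counter. */
struct sketch_mount {
	bool read_only;
	int writers;
};

/* Models "want write": fails on a read-only mount, else takes a reference. */
static int sketch_want_write(struct sketch_mount *m)
{
	if (m->read_only)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Models "drop write": only valid after a successful want-write. */
static void sketch_drop_write(struct sketch_mount *m)
{
	assert(m->writers > 0);
	m->writers--;
}

/*
 * Simplified open path: remember whether want-write succeeded and
 * drop the reference only in that case, so the counter stays balanced.
 * Returning the want-write result is a simplification for the sketch.
 */
static int sketch_open(struct sketch_mount *m)
{
	int ro = sketch_want_write(m);

	/* ... work that does not depend on write access would go here ... */

	if (!ro)
		sketch_drop_write(m);
	return ro;
}
```

Calling sketch_open() on a read-only sketch_mount leaves the counter at zero; dropping unconditionally would instead trip the assertion, which is the userspace analogue of the counter diverging.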

Re: [Devel] [RFC PATCH v8 0/5] IPC: checkpoint/restore in userspace enhancements

2012-12-20 Thread Andrew Morton
On Thu, 20 Dec 2012 08:06:32 +0400 Stanislav Kinsbursky wrote: > 19.12.2012 00:36, Andrew Morton wrote: > > On Wed, 24 Oct 2012 19:34:51 +0400 > > Stanislav Kinsbursky wrote: > > > >> This respin of the patch set was significantly reworked. Most part of new

Re: [Devel] [RFC PATCH v8 0/5] IPC: checkpoint/restore in userspace enhancements

2012-12-18 Thread Andrew Morton
On Wed, 24 Oct 2012 19:34:51 +0400 Stanislav Kinsbursky wrote: > This respin of the patch set was significantly reworked. Most part of new API > was replaced by sysctls (by one per messages, semaphores and shared memory), > allowing to preset desired id for next new IPC object. > > This patch se

[Devel] Re: [RFC PATCH v8 0/5] IPC: checkpoint/restore in userspace enhancements

2012-10-24 Thread Andrew Morton
On Wed, 24 Oct 2012 19:34:51 +0400 Stanislav Kinsbursky wrote: > This respin of the patch set was significantly reworked. Most part of new API > was replaced by sysctls (by one per messages, semaphores and shared memory), > allowing to preset desired id for next new IPC object. > > This patch se

[Devel] Re: [PATCH v8 4/5] ipc: message queue copy feature introduced

2012-10-24 Thread Andrew Morton
On Wed, 24 Oct 2012 19:35:20 +0400 Stanislav Kinsbursky wrote: > This patch is required for checkpoint/restore in userspace. > IOW, c/r requires some way to get all pending IPC messages without deleting > them from the queue (checkpoint can fail and in this case tasks will be > resumed, > so que

[Devel] Re: [PATCH v8 2/5] ipc: add sysctl to specify desired next object id

2012-10-24 Thread Andrew Morton
On Wed, 24 Oct 2012 19:35:09 +0400 Stanislav Kinsbursky wrote: > This patch adds 3 new variables and sysctls to tune them (by one "next_id" > variable for messages, semaphores and shared memory respectively). > This variable can be used to set desired id for next allocated IPC object. > By defaul

[Devel] Re: [PATCH v5] slab: Ignore internal flags in cache creation

2012-10-18 Thread Andrew Morton
On Wed, 17 Oct 2012 15:36:51 +0400 Glauber Costa wrote: > Some flags are used internally by the allocators for management > purposes. One example of that is the CFLGS_OFF_SLAB flag that slab uses > to mark that the metadata for that cache is stored outside of the slab. > > No cache should ever p

[Devel] Re: [PATCH v5 07/14] mm: Allocate kernel pages to the right memcg

2012-10-18 Thread Andrew Morton
On Thu, 18 Oct 2012 13:24:47 +0400 Glauber Costa wrote: > On 10/18/2012 02:12 AM, Andrew Morton wrote: > > On Tue, 16 Oct 2012 14:16:44 +0400 > > Glauber Costa wrote: > > > >> When a process tries to allocate a page with the __GFP_KMEMCG flag, the >

[Devel] Re: [PATCH v5 00/14] kmem controller for memcg.

2012-10-18 Thread Andrew Morton
On Thu, 18 Oct 2012 20:51:05 +0400 Glauber Costa wrote: > On 10/18/2012 02:11 AM, Andrew Morton wrote: > > On Tue, 16 Oct 2012 14:16:37 +0400 > > Glauber Costa wrote: > > > >> ... > >> > >> A general explanation of what this is all about

[Devel] Re: [PATCH v5 14/14] Add documentation about the kmem controller

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:51 +0400 Glauber Costa wrote: > +Kernel memory won't be accounted at all until limit on a group is set. This > +allows for existing setups to continue working without disruption. The limit > +cannot be set if the cgroup have children, or if there are already tasks in >

[Devel] Re: [PATCH v5 13/14] protect architectures where THREAD_SIZE >= PAGE_SIZE against fork bombs

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:50 +0400 Glauber Costa wrote: > @@ -146,7 +146,7 @@ void __weak arch_release_thread_info(struct thread_info > *ti) > static struct thread_info *alloc_thread_info_node(struct task_struct *tsk, > int node) > { > - stru

[Devel] Re: [PATCH v5 11/14] memcg: allow a memcg with kmem charges to be destructed.

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:48 +0400 Glauber Costa wrote: > Because the ultimate goal of the kmem tracking in memcg is to track slab > pages as well, It is? For a major patchset such as this, it's pretty important to discuss such long-term plans in the top-level discussion. Covering things such

[Devel] Re: [PATCH v5 07/14] mm: Allocate kernel pages to the right memcg

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:44 +0400 Glauber Costa wrote: > When a process tries to allocate a page with the __GFP_KMEMCG flag, the > page allocator will call the corresponding memcg functions to validate > the allocation. Tasks in the root memcg can always proceed. > > To avoid adding markers to

[Devel] Re: [PATCH v5 06/14] memcg: kmem controller infrastructure

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:43 +0400 Glauber Costa wrote: > This patch introduces infrastructure for tracking kernel memory pages to > a given memcg. This will happen whenever the caller includes the flag > __GFP_KMEMCG flag, and the task belong to a memcg other than the root. > > In memcontrol.h

[Devel] Re: [PATCH v5 04/14] kmem accounting basic infrastructure

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:41 +0400 Glauber Costa wrote: > This patch adds the basic infrastructure for the accounting of kernel > memory. To control that, the following files are created: > > * memory.kmem.usage_in_bytes > * memory.kmem.limit_in_bytes > * memory.kmem.failcnt gargh. "failcnt

[Devel] Re: [PATCH v5 01/14] memcg: Make it possible to use the stock for more than one page.

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:38 +0400 Glauber Costa wrote: > From: Suleiman Souhlal > > We currently have a percpu stock cache scheme that charges one page at a > time from memcg->res, the user counter. When the kernel memory > controller comes into play, we'll need to charge more than that. > >

[Devel] Re: [PATCH v5 00/14] kmem controller for memcg.

2012-10-17 Thread Andrew Morton
On Tue, 16 Oct 2012 14:16:37 +0400 Glauber Costa wrote: > ... > > A general explanation of what this is all about follows: > > The kernel memory limitation mechanism for memcg concerns itself with > disallowing potentially non-reclaimable allocations to happen in exaggerate > quantities by a par

[Devel] Re: [PATCH] proc: check vma->vm_file before dereferencing

2012-10-15 Thread Andrew Morton
On Tue, 16 Oct 2012 01:52:30 +0400 Cyrill Gorcunov wrote: > On Mon, Oct 15, 2012 at 02:40:48PM -0700, Andrew Morton wrote: > > On Mon, 15 Oct 2012 19:30:03 +0400 > > Stanislav Kinsbursky wrote: > > > > > It can be equal to NULL. > > >

[Devel] Re: [PATCH] proc: check vma->vm_file before dereferencing

2012-10-15 Thread Andrew Morton
On Mon, 15 Oct 2012 19:30:03 +0400 Stanislav Kinsbursky wrote: > It can be equal to NULL. > Please write better changelogs, so people do not have to ask questions such as: - Under what conditions does this bug trigger? - In which kernel version(s)? - Is it a post-3.6 regression? Thanks.

[Devel] Re: [RFC 2/4] memcg: make it suck faster

2012-09-25 Thread Andrew Morton
On Tue, 25 Sep 2012 12:52:51 +0400 Glauber Costa wrote: > It is an accepted fact that memcg sucks. But can it suck faster? Or in > a more fair statement, can it at least stop draining everyone's > performance when it is not in use? > > This experimental and slightly crude patch demonstrates tha

[Devel] Re: Fork bomb limitation in memcg WAS: Re: [PATCH 00/11] kmem controller for memcg: stripped down version

2012-06-28 Thread Andrew Morton
On Thu, 28 Jun 2012 13:01:23 +0400 Glauber Costa wrote: > > ... > OK, that all sounds convincing ;) Please summarise and capture this discussion in the [patch 0/n] changelog so we (or others) don't have to go through this all again. And let's remember this in the next patchset! > Last, but not

[Devel] Re: [PATCH 00/11] kmem controller for memcg: stripped down version

2012-06-26 Thread Andrew Morton
On Tue, 26 Jun 2012 11:17:49 +0400 Glauber Costa wrote: > On 06/26/2012 03:27 AM, Andrew Morton wrote: > > On Mon, 25 Jun 2012 18:15:17 +0400 > > Glauber Costa wrote: > > > >> What I am proposing with this series is a stripped down version of the > >> kme

[Devel] Re: [PATCH 06/11] memcg: kmem controller infrastructure

2012-06-26 Thread Andrew Morton
On Tue, 26 Jun 2012 22:14:51 +0400 Glauber Costa wrote: > On 06/26/2012 10:01 PM, Andrew Morton wrote: > > On Tue, 26 Jun 2012 19:01:15 +0400 Glauber Costa > > wrote: > > > >> On 06/26/2012 03:17 AM, Andrew Morton wrote: > >>>

[Devel] Re: [PATCH 06/11] memcg: kmem controller infrastructure

2012-06-26 Thread Andrew Morton
On Tue, 26 Jun 2012 19:01:15 +0400 Glauber Costa wrote: > On 06/26/2012 03:17 AM, Andrew Morton wrote: > >> + memcg_uncharge_kmem(memcg, size); > >> >+ mem_cgroup_put(memcg); > >> >+} > >> >+EXPORT_SYMBOL(__mem_cgroup_free_kmem_page);

[Devel] Re: [PATCH 09/11] memcg: propagate kmem limiting information to children

2012-06-25 Thread Andrew Morton
On Mon, 25 Jun 2012 22:24:44 -0700 (PDT) David Rientjes wrote: > > > +#define KMEM_ACCOUNTED_THIS 0 > > > +#define KMEM_ACCOUNTED_PARENT 1 > > > > And then document the fields here. > > > > In hex, please? Well, they're bit numbers, not masks. Decimal 0-31 is OK, or an enum.
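
The distinction drawn in this reply, bit numbers versus masks, can be sketched as follows. This is a userspace illustration under assumptions: `sketch_set_bit()` and `sketch_test_bit()` are hypothetical, non-atomic stand-ins for the kernel's set_bit()/test_bit(), and the enum is only one possible way to follow the "or an enum" suggestion.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit numbers (positions within a flags word), not masks. */
enum kmem_accounted_bit {
	KMEM_ACCOUNTED_THIS   = 0,
	KMEM_ACCOUNTED_PARENT = 1,
};

/* Hypothetical, non-atomic stand-in for set_bit(): takes a bit number. */
static void sketch_set_bit(unsigned int nr, unsigned long *word)
{
	assert(nr < 8 * sizeof(*word));
	*word |= 1UL << nr;
}

/* Hypothetical stand-in for test_bit(): also takes a bit number. */
static bool sketch_test_bit(unsigned int nr, const unsigned long *word)
{
	assert(nr < 8 * sizeof(*word));
	return (*word >> nr) & 1UL;
}
```

Because the helpers shift by the value they are given, writing the constants in hex would suggest masks and invite mistakes such as passing 0x2 where bit number 1 is meant.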

[Devel] Re: [PATCH 00/11] kmem controller for memcg: stripped down version

2012-06-25 Thread Andrew Morton
On Mon, 25 Jun 2012 18:15:17 +0400 Glauber Costa wrote: > What I am proposing with this series is a stripped down version of the > kmem controller for memcg that would allow us to merge significant parts > of the infrastructure, while leaving out, for now, the polemic bits about > the slab while

[Devel] Re: [PATCH 09/11] memcg: propagate kmem limiting information to children

2012-06-25 Thread Andrew Morton
On Mon, 25 Jun 2012 18:15:26 +0400 Glauber Costa wrote: > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -287,7 +287,11 @@ struct mem_cgroup { >* Should the accounting and control be hierarchical, per subtree? >*/ > bool use_hierarchy; > - bool kmem_accounted; > +

[Devel] Re: [PATCH 09/11] memcg: propagate kmem limiting information to children

2012-06-25 Thread Andrew Morton
On Tue, 26 Jun 2012 02:36:27 +0400 Glauber Costa wrote: > On 06/25/2012 10:29 PM, Tejun Heo wrote: > > Feeling like a nit pervert but.. > > > > On Mon, Jun 25, 2012 at 06:15:26PM +0400, Glauber Costa wrote: > >> @@ -287,7 +287,11 @@ struct mem_cgroup { > >> * Should the accounting and control

[Devel] Re: [PATCH 06/11] memcg: kmem controller infrastructure

2012-06-25 Thread Andrew Morton
On Mon, 25 Jun 2012 18:15:23 +0400 Glauber Costa wrote: > This patch introduces infrastructure for tracking kernel memory pages > to a given memcg. This will happen whenever the caller includes the > flag __GFP_KMEMCG flag, and the task belong to a memcg other than > the root. > > In memcontrol.

[Devel] Re: [PATCH v6 2/2] decrement static keys on real destroy time

2012-05-23 Thread Andrew Morton
On Wed, 23 May 2012 13:16:36 +0400 Glauber Costa wrote: > On 05/23/2012 02:46 AM, Andrew Morton wrote: > > Here, we're open-coding kinda-test_bit(). Why do that? These flags are > > modified with set_bit() and friends, so we should read them with the > > matching tes

[Devel] Re: [PATCH v6 2/2] decrement static keys on real destroy time

2012-05-22 Thread Andrew Morton
On Tue, 22 May 2012 15:46:10 -0700 Andrew Morton wrote: > > +static inline bool memcg_proto_active(struct cg_proto *cg_proto) > > +{ > > + return cg_proto->flags & (1 << MEMCG_SOCK_ACTIVE); > > +} > > + > > +static inline boo

[Devel] Re: [PATCH v6 2/2] decrement static keys on real destroy time

2012-05-22 Thread Andrew Morton
; + * > + * The activated bit is used to guarantee that no two writers > will > + * do the update in the same memcg. Without that, we can't > properly > + * shutdown the static key. > + */ This comment needl

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-17 Thread Andrew Morton
On Thu, 17 May 2012 13:52:13 +0400 Glauber Costa wrote: > Andrew is right. It seems we will need that mutex after all. Just this > is not a race, and neither something that should belong in the > static_branch interface. Well, a mutex is one way. Or you could do something like if (!t

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Andrew Morton
On Thu, 17 May 2012 07:06:52 +0400 Glauber Costa wrote: > ... > >> + else if (val != RESOURCE_MAX) { > >> + /* > >> + * ->activated needs to be written after the static_key update. > >> + * This is what guarantees that the socket activation function > >> +

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Andrew Morton
On Fri, 11 May 2012 17:11:17 -0300 Glauber Costa wrote: > We call the destroy function when a cgroup starts to be removed, > such as by a rmdir event. > > However, because of our reference counters, some objects are still > inflight. Right now, we are decrementing the static_keys at destroy() >

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Andrew Morton
On Fri, 11 May 2012 17:11:17 -0300 Glauber Costa wrote: > We call the destroy function when a cgroup starts to be removed, > such as by a rmdir event. > > However, because of our reference counters, some objects are still > inflight. Right now, we are decrementing the static_keys at destroy() >

[Devel] Re: [PATCH v5 2/2] decrement static keys on real destroy time

2012-05-16 Thread Andrew Morton
On Wed, 16 May 2012 11:03:47 +0400 Glauber Costa wrote: > On 05/14/2012 05:38 AM, Li Zefan wrote: > >> +static void disarm_static_keys(struct mem_cgroup *memcg) > > > >> +{ > >> +#ifdef CONFIG_INET > >> + if (memcg->tcp_mem.cg_proto.activated) > >> + static_key_slow_dec(&memcg_socket_l

[Devel] Re: [PATCH] remove BUG() in possible but rare condition

2012-04-11 Thread Andrew Morton
On Wed, 11 Apr 2012 17:51:57 -0300 Glauber Costa wrote: > On 04/11/2012 05:26 PM, Andrew Morton wrote: > >> > >> >failed: > >> > - BUG(); > >> > unlock_page(page); > >> > page_cache_release(page);

[Devel] Re: [PATCH] remove BUG() in possible but rare condition

2012-04-11 Thread Andrew Morton
, grow_dev_page() > can return NULL just fine in other circumstances, so I propose we just > remove it, then. > > Signed-off-by: Glauber Costa > CC: Linus Torvalds > CC: Andrew Morton > --- > fs/buffer.c |1 - > 1 files changed, 0 insertions(+), 1 deletions(-) >

[Devel] Re: [PATCH] memcg: Do not open code accesses to res_counter members

2012-04-05 Thread Andrew Morton
On Tue, 20 Mar 2012 20:53:44 +0400 Glauber Costa wrote: > We should use the acessor res_counter_read_u64 for that. > Although a purely cosmetic change is sometimes better of delayed, > to avoid conflicting with other people's work, we are starting to > have people touching this code as well, and

[Devel] Re: [PATCH 1/2] SYSCTL: root unregister routine introduced

2011-12-12 Thread Andrew Morton
On Mon, 12 Dec 2011 21:50:00 +0300 Stanislav Kinsbursky wrote: > This routine is required for SUNRPC sysctl's, which are going to be allocated, > processed and destroyed per network namespace context. > IOW, new sysctl root will be registered on network namespace creation and > thus have to unreg

[Devel] Re: [PATCH 9/9] userns: check user namespace for task->file uid equivalence checks

2011-02-23 Thread Andrew Morton
On Thu, 24 Feb 2011 03:24:16 + "Serge E. Hallyn" wrote: > Quoting Andrew Morton (a...@linux-foundation.org): > > On Thu, 17 Feb 2011 15:04:07 + > > "Serge E. Hallyn" wrote: > > > > There's a fairly well adhered to convention that

[Devel] Re: [PATCH] userns: ptrace: incorporate feedback from Eric

2011-02-23 Thread Andrew Morton
On Thu, 24 Feb 2011 00:49:01 + "Serge E. Hallyn" wrote: > same_or_ancestore_user_ns() was not an appropriate check to > constrain cap_issubset. Rather, cap_issubset() only is > meaningful when both capsets are in the same user_ns. I queued this as a fix against userns-allow-ptrace-from-non-

[Devel] Re: [PATCH 4/9] allow killing tasks in your own or child userns

2011-02-23 Thread Andrew Morton
On Thu, 24 Feb 2011 00:48:18 + "Serge E. Hallyn" wrote: > Quoting Andrew Morton (a...@linux-foundation.org): > > On Thu, 17 Feb 2011 15:03:25 + > > "Serge E. Hallyn" wrote: > > > > > /* > > > + * called with RCU read lock f

[Devel] Re: [PATCH 6/9] user namespaces: convert all capable checks in kernel/sys.c

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:42 + "Serge E. Hallyn" wrote: > @@ -1177,8 +1189,11 @@ SYSCALL_DEFINE2(sethostname, char __user *, name, int, > len) > int errno; > char tmp[__NEW_UTS_LEN]; > > - if (!ns_capable(current->nsproxy->uts_ns->user_ns, CAP_SYS_ADMIN)) > + if (!ns_ca

[Devel] Re: [PATCH 9/9] userns: check user namespace for task->file uid equivalence checks

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:04:07 + "Serge E. Hallyn" wrote: > Cheat for now and say all files belong to init_user_ns. Next > step will be to let superblocks belong to a user_ns, and derive > inode_userns(inode) from inode->i_sb->s_user_ns. Finally we'll > introduce more flexible arrangements. >

[Devel] Re: [PATCH 6/9] user namespaces: convert all capable checks in kernel/sys.c

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:42 + "Serge E. Hallyn" wrote: > This allows setuid/setgid in containers. It also fixes some > corner cases where kernel logic foregoes capability checks when > uids are equivalent. The latter will need to be done throughout > the whole kernel. > > > ... > > --- a/

[Devel] Re: [PATCH 7/9] add a user namespace owner of ipc ns

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:49 + "Serge E. Hallyn" wrote: > > ... > > --- a/include/linux/ipc_namespace.h > +++ b/include/linux/ipc_namespace.h > @@ -24,6 +24,7 @@ struct ipc_ids { > struct idr ipcs_idr; > }; > > +struct user_namespace; Move to top of file. > struct ipc_namespace {

[Devel] Re: [PATCH 5/9] Allow ptrace from non-init user namespaces

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:33 + "Serge E. Hallyn" wrote: > ptrace is allowed to tasks in the same user namespace according to > the usual rules (i.e. the same rules as for two tasks in the init > user namespace). ptrace is also allowed to a user namespace to > which the current task the has C

[Devel] Re: [PATCH 4/9] allow killing tasks in your own or child userns

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:25 + "Serge E. Hallyn" wrote: > /* > + * called with RCU read lock from check_kill_permission() > + */ > +static inline int kill_ok_by_cred(struct task_struct *t) > +{ > + const struct cred *cred = current_cred(); > + const struct cred *tcred = __task_cred(t

[Devel] Re: [PATCH 2/9] security: Make capabilities relative to the user namespace.

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:03:06 + "Serge E. Hallyn" wrote: > - Introduce ns_capable to test for a capability in a non-default > user namespace. > - Teach cap_capable to handle capabilities in a non-default > user namespace. > > The motivation is to get to the unprivileged creation of new >

[Devel] Re: [PATCH 1/9] Add a user_namespace as creator/owner of uts_namespace

2011-02-18 Thread Andrew Morton
On Thu, 17 Feb 2011 15:02:57 + "Serge E. Hallyn" wrote: > +/* > + * userns count is 1 for root user, 1 for init_uts_ns, > + * and 1 for... ? > + */ ?

[Devel] Re: userns: targeted capabilities v5

2011-02-17 Thread Andrew Morton
On Thu, 17 Feb 2011 15:02:24 + "Serge E. Hallyn" wrote: > Here is a repost of my previous user namespace patch, ported onto > last night's git head. > > It fixes several things I was doing wrong in the last (v4) > posting, in particular: > > 1. don't set uts_ns->user_ns to current's w

[Devel] Re: [PATCH 1/1, v7] cgroup/freezer: add per freezer duty ratio control

2011-02-14 Thread Andrew Morton
On Sun, 13 Feb 2011 19:23:10 -0800 Arjan van de Ven wrote: > On 2/13/2011 4:44 PM, KAMEZAWA Hiroyuki wrote: > > On Sat, 12 Feb 2011 15:29:07 -0800 > > Matt Helsley wrote: > > > >> On Fri, Feb 11, 2011 at 11:10:44AM -0800, jacob.jun@linux.intel.com > >> wrote: > >>> From: Jacob Pan > >>> > >

[Devel] Re: [PATCH v8 0/3] cgroups: implement moving a threadgroup's threads atomically with cgroup.procs

2011-02-09 Thread Andrew Morton
On Mon, 7 Feb 2011 20:35:42 -0500 Ben Blum wrote: > On Sun, Dec 26, 2010 at 07:09:19AM -0500, Ben Blum wrote: > > On Fri, Dec 24, 2010 at 03:22:26AM -0500, Ben Blum wrote: > > > On Wed, Aug 11, 2010 at 01:46:04AM -0400, Ben Blum wrote: > > > > On Fri, Jul 30, 2010 at 07:56:49PM -0400, Ben Blum wr

[Devel] Re: [PATCH, v4 2/2] cgroups: introduce timer slack subsystem

2011-02-08 Thread Andrew Morton
On Thu, 3 Feb 2011 16:34:14 +0200 "Kirill A. Shutsemov" wrote: > From: Kirill A. Shutemov > > Provides a way of tasks grouping by timer slack value. Introduces per > cgroup max and min timer slack value. When a task attaches to a cgroup, > its timer slack value adjusts (if needed) to fit min-m

[Devel] Re: [PATCH v2] cgroup/freezer: add per freezer duty ratio control

2011-02-04 Thread Andrew Morton
On Wed, 2 Feb 2011 16:42:20 -0800 jacob.jun@linux.intel.com wrote: > Freezer subsystem is used to manage batch jobs which can start > stop at the same time. However, sometime it is desirable to let > the kernel manage the freezer state automatically with a given > duty ratio. > For example, i

[Devel] Re: [PATCH v7 1/3] cgroups: read-write lock CLONE_THREAD forking per threadgroup

2011-02-04 Thread Andrew Morton
On Fri, 4 Feb 2011 16:25:15 -0500 Ben Blum wrote: > On Mon, Jan 24, 2011 at 01:05:29PM -0800, Andrew Morton wrote: > > On Sun, 26 Dec 2010 07:09:51 -0500 > > Ben Blum wrote: > > > > > Adds functionality to read/write lock CLONE_THREAD fork()ing > > >

[Devel] Re: [PATCH] cgroup : remove the ns_cgroup

2011-01-26 Thread Andrew Morton
On Thu, 27 Jan 2011 09:08:51 +0800 Li Zefan wrote: > Andrew Morton wrote: > > On Tue, 25 Jan 2011 10:39:48 +0100 > > Daniel Lezcano wrote: > > > >> This patch removes the ns_cgroup as suggested in the following thread: > > > > I had this patch queued

[Devel] Re: [PATCH] cgroup : remove the ns_cgroup

2011-01-26 Thread Andrew Morton
On Tue, 25 Jan 2011 10:39:48 +0100 Daniel Lezcano wrote: > This patch removes the ns_cgroup as suggested in the following thread: I had this patch queued up in September last year, but dropped it. Why did I do that?

[Devel] Re: [PATCH] cgroup : remove the ns_cgroup

2011-01-26 Thread Andrew Morton
On Tue, 25 Jan 2011 10:39:48 +0100 Daniel Lezcano wrote: > The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier > and leads to some problems: > > * cgroup creation is out-of-control > * cgroup name can conflict when pids are looping > * it is not possibl

[Devel] Re: [PATCH v7 1/3] cgroups: read-write lock CLONE_THREAD forking per threadgroup

2011-01-24 Thread Andrew Morton
On Sun, 26 Dec 2010 07:09:51 -0500 Ben Blum wrote: > Adds functionality to read/write lock CLONE_THREAD fork()ing per-threadgroup > > From: Ben Blum > > This patch adds an rwsem that lives in a threadgroup's signal_struct that's > taken for reading in the fork path, under CONFIG_CGROUPS. If an

[Devel] Re: [PATCH v5 3/3] cgroups: make procs file writable

2010-12-24 Thread Andrew Morton
On Fri, 24 Dec 2010 06:45:00 -0500 Ben Blum wrote: > > kmalloc() is allowed while holding a spinlock and NODEMASK_ALLOC() takes a > > gfp_flags argument for that reason. > > Ah, it's only with GFP_KERNEL and friends. So changing the uses in > cpuset_can_attach to GFP_ATOMIC would solve this con

[Devel] Re: [PATCH v5 3/3] cgroups: make procs file writable

2010-12-16 Thread Andrew Morton
On Wed, 15 Dec 2010 22:34:39 -0800 Paul Menage wrote: > Ping akpm? Patches have gone a bit stale, sorry. Refactoring in kernel/cgroup_freezer.c necessitates a refresh and retest please. > On Fri, Oct 8, 2010 at 2:57 PM, Paul Menage wrote: > > Hi Andrew, > > > > Do you see any road-blockers f

[Devel] Re: [PATCH] user_ns: Improve the user_ns on-the-slab packaging

2010-12-07 Thread Andrew Morton
On Tue, 07 Dec 2010 17:12:33 +0300 Pavel Emelyanov wrote: > @@ -126,3 +128,11 @@ gid_t user_ns_map_gid(struct user_namespace *to, const > struct cred *cred, gid_t > /* No useful relationship so no mapping */ > return overflowgid; > } > + > +static __init int user_namespaces_init(voi

[Devel] Re: [PATCH v4 05/11] writeback: create dirty_info structure

2010-11-17 Thread Andrew Morton
On Wed, 17 Nov 2010 16:49:24 -0800 Andrew Morton wrote: > against the http://userweb.kernel.org/~akpm/mmotm/ which I just > uploaded err, will upload Real Soon Now.

[Devel] Re: [PATCH v4 05/11] writeback: create dirty_info structure

2010-11-17 Thread Andrew Morton
On Fri, 29 Oct 2010 00:09:08 -0700 Greg Thelen wrote: > Bundle dirty limits and dirty memory usage metrics into a dirty_info > structure to simplify interfaces of routines that need all. Problems... These patches interact pretty badly with Fengguang's "IO-less dirty throttling v2" patches. I f

[Devel] Re: [PATCH v4 02/11] memcg: document cgroup dirty memory interfaces

2010-10-29 Thread Andrew Morton
On Fri, 29 Oct 2010 00:09:05 -0700 Greg Thelen wrote: > Document cgroup dirty memory interfaces and statistics. > > > ... > > +When use_hierarchy=0, each cgroup has dirty memory usage and limits. > +System-wide dirty limits are also consulted. Dirty memory consumption is > +checked against both

[Devel] Re: [PATCH v4 09/11] memcg: CPU hotplug lockdep warning fix

2010-10-29 Thread Andrew Morton
On Fri, 29 Oct 2010 00:09:12 -0700 Greg Thelen wrote: > From: Balbir Singh > > memcg has lockdep warnings (sleep inside rcu lock) > > > ... > > Acked-by: Greg Thelen You were on the patch delivery path, so this should be Signed-off-by:. I made that change to my copy. > Signed-off-by: Balbir

[Devel] Re: [PATCH v4 00/11] memcg: per cgroup dirty page accounting

2010-10-29 Thread Andrew Morton
On Fri, 29 Oct 2010 00:09:03 -0700 Greg Thelen wrote: This is cool stuff - it's been a long haul. One day we'll be nearly-finished and someone will write a book telling people how to use it all and lots of people will go "holy crap". I hope. > Limiting dirty memory is like fixing the max amoun

[Devel] Re: [PATCH 2/2] Kconfig : default all the namespaces to 'yes'

2010-10-14 Thread Andrew Morton
please confirm that the current patch is still good? From: Daniel Lezcano As the different namespaces depend on 'CONFIG_NAMESPACES', it is logical to enable all the namespaces when we enable NAMESPACES. Signed-off-by: Daniel Lezcano Cc: "Eric W. Biederman" Cc: David Miller

[Devel] Re: [PATCH v2] memcg: reduce lock time at move charge (Was Re: [PATCH 04/10] memcg: disable local interrupts in lock_page_cgroup()

2010-10-07 Thread Andrew Morton
On Fri, 8 Oct 2010 13:37:12 +0900 KAMEZAWA Hiroyuki wrote: > On Thu, 7 Oct 2010 16:14:54 -0700 > Andrew Morton wrote: > > > On Thu, 7 Oct 2010 17:04:05 +0900 > > KAMEZAWA Hiroyuki wrote: > > > > > Now, at task migration among cgroup, memory cgrou

[Devel] Re: [PATCH v2] memcg: reduce lock time at move charge (Was Re: [PATCH 04/10] memcg: disable local interrupts in lock_page_cgroup()

2010-10-07 Thread Andrew Morton
On Thu, 7 Oct 2010 17:04:05 +0900 KAMEZAWA Hiroyuki wrote: > Now, at task migration among cgroup, memory cgroup scans page table and moving > account if flags are properly set. > > The core code, mem_cgroup_move_charge_pte_range() does > > pte_offset_map_lock(); > for all ptes in a

[Devel] Re: [PATCH 1/1] cgroups: strcpy destination string overflow

2010-10-05 Thread Andrew Morton
On Tue, 5 Oct 2010 12:38:05 +0400 Evgeny Kuznetsov wrote: > From: Evgeny Kuznetsov > > Function "strcpy" is used without check for maximum allowed source > string length and could cause destination string overflow. > Check for string length is added before using "strcpy". > Function now is ret
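The length check described in the patch summary above can be sketched in plain user-space C. This is only an illustration of the general technique (verify that the source fits before calling strcpy), not the code from the patch itself: the helper name copy_checked(), its parameters, and the choice of -ENAMETOOLONG as the error value are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * copy_checked() is a hypothetical helper for illustration only; it is
 * not the function modified by the patch discussed in this thread.
 *
 * Copies src into dst only if src, including its terminating NUL byte,
 * fits into dst_size bytes. Returns 0 on success and -ENAMETOOLONG
 * (an assumed error code) if the string is too long, in which case
 * dst is left untouched.
 */
static int copy_checked(char *dst, size_t dst_size, const char *src)
{
	assert(dst != NULL && src != NULL);

	/* strlen() excludes the NUL, so a length equal to dst_size is too long. */
	if (strlen(src) >= dst_size)
		return -ENAMETOOLONG;

	strcpy(dst, src);	/* safe: the length was checked above */
	return 0;
}
```

Returning an error instead of silently truncating lets the caller decide how to handle an over-long string, which appears to be the direction the patch description takes when it says the function now returns a value.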

[Devel] Re: [PATCH 0/3][V2] remove the ns_cgroup

2010-09-28 Thread Andrew Morton
On Tue, 28 Sep 2010 15:50:17 +0200 Daniel Lezcano wrote: > On 09/27/2010 10:46 PM, Andrew Morton wrote: > > On Mon, 27 Sep 2010 15:36:58 -0500 > > "Serge E. Hallyn" wrote: > > > > > >>>> This patchset removes the ns_cgroup by adding a

[Devel] Re: [PATCH 0/3][V2] remove the ns_cgroup

2010-09-27 Thread Andrew Morton
On Mon, 27 Sep 2010 13:45:26 -0700 ebied...@xmission.com (Eric W. Biederman) wrote: > "Serge E. Hallyn" writes: > > > Quoting Andrew Morton (a...@linux-foundation.org): > >> On Mon, 27 Sep 2010 12:14:10 +0200 > >> Daniel Lezcano wrote: > >> >

[Devel] Re: [PATCH 0/3][V2] remove the ns_cgroup

2010-09-27 Thread Andrew Morton
On Mon, 27 Sep 2010 15:36:58 -0500 "Serge E. Hallyn" wrote: > > > This patchset removes the ns_cgroup by adding a new flag to the cgroup > > > and the cgroupfs mount option. It enables the copy of the parent cgroup > > > when a child cgroup is created. We can then safely remove the ns_cgroup as >

[Devel] Re: [RFC][PATCH 00/10] taskstats: Enhancements for precise accounting

2010-09-27 Thread Andrew Morton
On Mon, 27 Sep 2010 11:18:47 +0200 Michael Holzheu wrote: > Hello Andrew, > > On Fri, 2010-09-24 at 11:50 -0700, Andrew Morton wrote: > > > > This is a big change! If this is done right then we're heading in the > > > > direction of deprecating t

[Devel] Re: [PATCH 0/3][V2] remove the ns_cgroup

2010-09-27 Thread Andrew Morton
On Mon, 27 Sep 2010 12:14:10 +0200 Daniel Lezcano wrote: > The ns_cgroup is a control group interacting with the namespaces. > When a new namespace is created, a corresponding cgroup is > automatically created too. The cgroup name is the pid of the process > who did 'unshare' or the child of 'cl

[Devel] Re: [RFC][PATCH 00/10] taskstats: Enhancements for precise accounting

2010-09-24 Thread Andrew Morton
On Fri, 24 Sep 2010 11:10:15 +0200 Michael Holzheu wrote: > Hello Andrew, > > On Thu, 2010-09-23 at 13:11 -0700, Andrew Morton wrote: > > > GOALS OF THIS PATCH SET > > > --- > > > The intention of this patch set is to provide better sup

[Devel] Re: [RFC][PATCH 00/10] taskstats: Enhancements for precise accounting

2010-09-23 Thread Andrew Morton
On Thu, 23 Sep 2010 15:48:01 +0200 Michael Holzheu wrote: > Currently tools like "top" gather the task information by reading procfs > files. This has several disadvantages: > > * It is very CPU intensive, because a lot of system calls (readdir, open, > read, close) are necessary. > * No real

[Devel] Re: [PATCH] cgroups: fix API thinko

2010-08-25 Thread Andrew Morton
On Fri, 06 Aug 2010 10:38:24 -0600 Alex Williamson wrote: > On Fri, 2010-08-06 at 09:34 -0700, Sridhar Samudrala wrote: > > On 8/5/2010 3:59 PM, Michael S. Tsirkin wrote: > > > cgroup_attach_task_current_cg API that have upstream is backwards: we > > > really need an API to attach to the cgroups

[Devel] Re: [PATCH v4 0/2] cgroups: implement moving a threadgroup's threads atomically with cgroup.procs

2010-08-03 Thread Andrew Morton
On Fri, 30 Jul 2010 19:56:49 -0400 Ben Blum wrote: > This patch series implements a write function for the 'cgroup.procs' > per-cgroup file, which enables atomic movement of multithreaded > applications between cgroups. Writing the thread-ID of any thread in a > threadgroup to a cgroup's procs fi

[Devel] Re: [Bugme-new] [Bug 16417] New: Slow context switches with SMP and CONFIG_FAIR_GROUP_SCHED

2010-07-22 Thread Andrew Morton
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). sched suckage! Do we have a linear search in there? On Mon, 19 Jul 2010 14:38:09 GMT bugzilla-dae...@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=16417 > >
