On 08/14/2012 07:16 PM, Mel Gorman wrote:
On Thu, Aug 09, 2012 at 05:01:15PM +0400, Glauber Costa wrote:
When a process tries to allocate a page with the __GFP_KMEMCG flag, the
page allocator will call the corresponding memcg functions to validate
the allocation. Tasks in the root memcg can
On 08/14/2012 10:58 PM, Greg Thelen wrote:
On Mon, Aug 13 2012, Glauber Costa wrote:
+ WARN_ON(mem_cgroup_is_root(memcg));
+ size = (1 << order) << PAGE_SHIFT;
+ memcg_uncharge_kmem(memcg, size);
+ mem_cgroup_put(memcg);
Why do we need ref-counting here ? kmem res_counter cannot work as
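
The quoted hunk derives the uncharge size from the allocation order. A minimal sketch of that arithmetic follows, assuming the expression is (1 << order) << PAGE_SHIFT. The names SKETCH_PAGE_SHIFT and kmem_size_for_order() are hypothetical, and the 4 KiB page size is an assumption (the real PAGE_SHIFT is architecture-specific).

```c
#include <stddef.h>

/* Assumed page shift (4 KiB pages), for illustration only. */
#define SKETCH_PAGE_SHIFT 12

/* Number of bytes covered by an allocation of 2^order contiguous pages. */
static size_t kmem_size_for_order(unsigned int order)
{
	return ((size_t)1 << order) << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions an order-0 allocation accounts for one page and each increment of the order doubles the charged size.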
On Thu 09-08-12 17:01:15, Glauber Costa wrote:
[...]
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b956cec..da341dc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2532,6 +2532,7 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int
order,
struct page *page = NULL;
On Mon, Aug 13, 2012 at 12:03:38PM +0400, Glauber Costa wrote:
On 08/10/2012 09:33 PM, Kamezawa Hiroyuki wrote:
(2012/08/09 22:01), Glauber Costa wrote:
When a process tries to allocate a page with the __GFP_KMEMCG flag, the
page allocator will call the corresponding memcg functions to
On Mon, 13 Aug 2012 15:21:56 +0400
Stanislav Kinsbursky skinsbur...@parallels.com wrote:
When the child reaper exits, it can destroy the mount namespace it belongs to, and if
there are NFS mounts inside, then it will try to umount them. But at this
point current->nsproxy is set to NULL and all
On Thu, Aug 09, 2012 at 05:01:15PM +0400, Glauber Costa wrote:
When a process tries to allocate a page with the __GFP_KMEMCG flag, the
page allocator will call the corresponding memcg functions to validate
the allocation. Tasks in the root memcg can always proceed.
To avoid adding markers to
We always account to both user and kernel resource_counters. This
effectively means that an independent kernel limit is in place when the
limit is set to a lower value than the user memory. An equal or higher
value means that the user limit will always hit first, meaning that kmem
is
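
The accounting rule described in this snippet (kernel memory is charged to both counters, so a kernel limit only acts independently when it is below the user limit) can be sketched roughly as below. This is a simplified user-space model, not the patch's code: struct sketch_counter, struct sketch_memcg and the sketch_charge_*() helpers are hypothetical names, and hierarchy, reclaim and locking are ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a resource counter: usage never exceeds limit. */
struct sketch_counter {
	size_t usage;
	size_t limit;
};

/* Hypothetical memcg model: "res" tracks user + kernel, "kmem" kernel only. */
struct sketch_memcg {
	struct sketch_counter res;
	struct sketch_counter kmem;
};

static bool sketch_counter_charge(struct sketch_counter *c, size_t size)
{
	if (size > c->limit - c->usage)
		return false;	/* charge would exceed the limit */
	c->usage += size;
	return true;
}

/* User memory is charged to the combined counter only. */
static bool sketch_charge_user(struct sketch_memcg *m, size_t size)
{
	return sketch_counter_charge(&m->res, size);
}

/* Kernel memory is charged to both counters; undo kmem if res fails. */
static bool sketch_charge_kmem(struct sketch_memcg *m, size_t size)
{
	if (!sketch_counter_charge(&m->kmem, size))
		return false;
	if (!sketch_counter_charge(&m->res, size)) {
		m->kmem.usage -= size;
		return false;
	}
	return true;
}
```

In this model, a kmem limit at or above the res limit can never be the first one to refuse a charge, because every kernel charge also has to fit in res.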
+ * memcg_kmem_new_page: verify if a new kmem allocation is allowed.
+ * @gfp: the gfp allocation flags.
+ * @handle: a pointer to the memcg this was charged against.
+ * @order: allocation order.
+ *
+ * returns true if the memcg where the current task belongs can hold this
+ *
On 08/15/2012 01:42 PM, Glauber Costa wrote:
Also, as I
have mentioned in the other email in this thread. Why should we reclaim
just because of kernel allocation when we are not reclaiming any of it
because shrink_slab is ignored in the memcg reclaim.
Don't get too distracted by the fact
On Wed, 2012-08-15 at 13:33 +0400, Glauber Costa wrote:
This can
be quite confusing. I am still not sure whether we should mix the two
things together. If somebody wants to limit the kernel memory he has to
touch the other limit anyway. Do you have a strong reason to mix the
user and
On Wed 15-08-12 13:33:55, Glauber Costa wrote:
[...]
This can
be quite confusing. I am still not sure whether we should mix the two
things together. If somebody wants to limit the kernel memory he has to
touch the other limit anyway. Do you have a strong reason to mix the
user and
On 08/15/2012 04:39 PM, Michal Hocko wrote:
On Wed 15-08-12 13:33:55, Glauber Costa wrote:
[...]
This can
be quite confusing. I am still not sure whether we should mix the two
things together. If somebody wants to limit the kernel memory he has to
touch the other limit anyway. Do you have
On Wed 15-08-12 12:12:23, James Bottomley wrote:
On Wed, 2012-08-15 at 13:33 +0400, Glauber Costa wrote:
This can
be quite confusing. I am still not sure whether we should mix the two
things together. If somebody wants to limit the kernel memory he has to
touch the other limit
On Wed 15-08-12 16:53:40, Glauber Costa wrote:
[...]
This doesn't check for the hierarchy so kmem_accounted might not be in
sync with its parents. mem_cgroup_create (below) needs to copy
kmem_accounted down from the parent and the above needs to check if this
is a similar dance like
On 08/15/2012 05:02 PM, Michal Hocko wrote:
On Wed 15-08-12 16:53:40, Glauber Costa wrote:
[...]
This doesn't check for the hierarchy so kmem_accounted might not be in
sync with its parents. mem_cgroup_create (below) needs to copy
kmem_accounted down from the parent and the above needs to
On Wed 15-08-12 13:42:24, Glauber Costa wrote:
[...]
+
+ ret = 0;
+
+ if (!memcg)
+ return ret;
+
+ _memcg = memcg;
+ ret = __mem_cgroup_try_charge(NULL, gfp, delta / PAGE_SIZE,
+ _memcg, may_oom);
This is really dangerous because atomic allocation which
On Wed 15-08-12 17:04:31, Glauber Costa wrote:
On 08/15/2012 05:02 PM, Michal Hocko wrote:
On Wed 15-08-12 16:53:40, Glauber Costa wrote:
[...]
This doesn't check for the hierarchy so kmem_accounted might not be in
sync with its parents. mem_cgroup_create (below) needs to copy
On Wed, 2012-08-15 at 14:55 +0200, Michal Hocko wrote:
On Wed 15-08-12 12:12:23, James Bottomley wrote:
On Wed, 2012-08-15 at 13:33 +0400, Glauber Costa wrote:
This can
be quite confusing. I am still not sure whether we should mix the two
things together. If somebody wants to limit
On 08/15/2012 05:26 PM, Michal Hocko wrote:
On Wed 15-08-12 17:04:31, Glauber Costa wrote:
On 08/15/2012 05:02 PM, Michal Hocko wrote:
On Wed 15-08-12 16:53:40, Glauber Costa wrote:
[...]
This doesn't check for the hierarchy so kmem_accounted might not be in
sync with its parents.
As for the type, do you think using struct mem_cgroup would be less
confusing?
Yes and returning the mem_cgroup or NULL instead of bool.
Ok. struct mem_cgroup it is.
The placeholder is there, but it is later patched
to the final thing.
With that explained, if you want me to change it
On 08/15/2012 05:22 PM, Mel Gorman wrote:
I believe it
to be a better and less complicated approach than letting a page appear
and then charging it. Besides being consistent with the rest of memcg,
it won't create unnecessary disturbance in the page allocator
when the allocation is to
On 08/15/2012 05:09 PM, Michal Hocko wrote:
On Wed 15-08-12 13:42:24, Glauber Costa wrote:
[...]
+
+ ret = 0;
+
+ if (!memcg)
+ return ret;
+
+ _memcg = memcg;
+ ret = __mem_cgroup_try_charge(NULL, gfp, delta / PAGE_SIZE,
+ _memcg, may_oom);
This is really dangerous
On Wed 15-08-12 17:31:24, Glauber Costa wrote:
On 08/15/2012 05:26 PM, Michal Hocko wrote:
On Wed 15-08-12 17:04:31, Glauber Costa wrote:
On 08/15/2012 05:02 PM, Michal Hocko wrote:
On Wed 15-08-12 16:53:40, Glauber Costa wrote:
[...]
This doesn't check for the hierarchy so
OK, I missed an important point that kmem_accounted is not exported to
the userspace (I thought it would be done later in the series) which
is not the case so actually nobody gets confused by the inconsistency
because it is about RESOURCE_MAX which they see in both cases.
Sorry about the
On Wed 15-08-12 18:01:51, Glauber Costa wrote:
On 08/15/2012 05:09 PM, Michal Hocko wrote:
On Wed 15-08-12 13:42:24, Glauber Costa wrote:
[...]
+
+ret = 0;
+
+if (!memcg)
+return ret;
+
+_memcg = memcg;
+ret =
I see now, you seem to be right.
No I am not because it seems that I am really blind these days...
We were doing this in mem_cgroup_do_charge for ages:
if (!(gfp_mask & __GFP_WAIT))
return CHARGE_WOULDBLOCK;
/me goes to hide and get with further feedback with a
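
The check quoted from mem_cgroup_do_charge can be illustrated with a small user-space model: a charge that fits succeeds immediately, and one that does not fit is refused with a "would block" result when the caller's flags say it cannot sleep. Everything here is a sketch under assumptions: SKETCH_GFP_WAIT and its bit value, the enum, and sketch_do_charge() are hypothetical, and the "retry" result only stands in for the reclaim path the real function would take.

```c
#include <stddef.h>

/* Hypothetical flag bit standing in for __GFP_WAIT; the value is arbitrary. */
#define SKETCH_GFP_WAIT 0x10u

enum sketch_charge_result {
	SKETCH_CHARGE_OK,		/* charge fit under the limit */
	SKETCH_CHARGE_WOULDBLOCK,	/* over limit and caller cannot sleep */
	SKETCH_CHARGE_RETRY		/* over limit, caller may reclaim and retry */
};

/* Simplified model of the decision; assumes usage <= limit on entry. */
static enum sketch_charge_result
sketch_do_charge(unsigned int gfp_mask, size_t usage, size_t limit, size_t size)
{
	if (size <= limit - usage)
		return SKETCH_CHARGE_OK;

	/* Reclaim may sleep, so atomic callers must bail out here. */
	if (!(gfp_mask & SKETCH_GFP_WAIT))
		return SKETCH_CHARGE_WOULDBLOCK;

	return SKETCH_CHARGE_RETRY;
}
```

The point of the model is only the ordering: the can-sleep test is reached solely after the plain charge has failed, so atomic allocations that fit are never affected by it.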
On Wed, 15 Aug 2012, Michal Hocko wrote:
That is not what the kernel does, in general. We assume that if he wants
that memory and we can serve it, we should. Also, not all kernel memory
is unreclaimable. We can shrink the slabs, for instance. Ying Han
claims she has patches for that
On 08/15/2012 06:47 PM, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Michal Hocko wrote:
That is not what the kernel does, in general. We assume that if he wants
that memory and we can serve it, we should. Also, not all kernel memory
is unreclaimable. We can shrink the slabs, for instance.
On Wed, Aug 15 2012, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Michal Hocko wrote:
That is not what the kernel does, in general. We assume that if he wants
that memory and we can serve it, we should. Also, not all kernel memory
is unreclaimable. We can shrink the slabs, for instance.
On Wed, 15 Aug 2012, Glauber Costa wrote:
On 08/15/2012 06:47 PM, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Michal Hocko wrote:
That is not what the kernel does, in general. We assume that if he wants
that memory and we can serve it, we should. Also, not all kernel memory
is
On Wed, 15 Aug 2012, Greg Thelen wrote:
You can already shrink the reclaimable slabs (dentries / inodes) via
calls to the subsystem specific shrinkers. Did Ying Han do anything to
go beyond that?
cc: Ying
The Google shrinker patches enhance prune_dcache_sb() to limit dentry
pressure to
On 08/15/2012 07:34 PM, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Glauber Costa wrote:
On 08/15/2012 06:47 PM, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Michal Hocko wrote:
That is not what the kernel does, in general. We assume that if he wants
that memory and we can serve it, we
This patch set introduces a new socket operation and a new system call:
sys_fbind(), which allows binding a socket to an opened file.
The file to bind to can be created by sys_mknod(S_IFSOCK) and opened by
open(O_PATH).
This system call is especially required for UNIX sockets, which have a name
length
This patch moves the UNIX socket insert into a separate function, because this code
will be used for unix_fbind() too.
Signed-off-by: Stanislav Kinsbursky skinsbur...@parallels.com
---
net/unix/af_unix.c | 52 +---
1 files changed, 29 insertions(+),
This will simplify further changes for unix_fbind().
---
net/unix/af_unix.c | 12 +---
1 files changed, 5 insertions(+), 7 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 641f2e4..bc90ddb 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -880,10 +880,8
This operation is used to bind a socket to a specified file.
Signed-off-by: Stanislav Kinsbursky skinsbur...@parallels.com
---
include/linux/net.h |2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/include/linux/net.h b/include/linux/net.h
index e9ac2df..843cb75 100644
---
The path for unix_address is taken from the passed file.
The file inode has to be a socket.
Since no sunaddr is present, addr->name is constructed in place. This
obviously means that the path name can be truncated if it's longer than
UNIX_MAX_PATH.
Signed-off-by: Stanislav Kinsbursky skinsbur...@parallels.com
This syscall allows binding a socket to a specified file descriptor.
The descriptor can be obtained by a simple open with the O_PATH flag.
The socket node can be created by sys_mknod().
Signed-off-by: Stanislav Kinsbursky skinsbur...@parallels.com
---
arch/x86/syscalls/syscall_32.tbl |1 +
On 08/15/2012 09:22 AM, Stanislav Kinsbursky wrote:
This syscall allows binding a socket to a specified file descriptor.
The descriptor can be obtained by a simple open with the O_PATH flag.
The socket node can be created by sys_mknod().
Signed-off-by: Stanislav Kinsbursky skinsbur...@parallels.com
---
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/14/2012 10:58 PM, Greg Thelen wrote:
On Mon, Aug 13 2012, Glauber Costa wrote:
+WARN_ON(mem_cgroup_is_root(memcg));
+size = (1 << order) << PAGE_SHIFT;
+memcg_uncharge_kmem(memcg, size);
+mem_cgroup_put(memcg);
15.08.2012 20:30, H. Peter Anvin wrote:
On 08/15/2012 09:22 AM, Stanislav Kinsbursky wrote:
This syscall allows binding a socket to a specified file descriptor.
The descriptor can be obtained by a simple open with the O_PATH flag.
The socket node can be created by sys_mknod().
Signed-off-by: Stanislav Kinsbursky
On 08/15/2012 08:38 PM, Greg Thelen wrote:
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/14/2012 10:58 PM, Greg Thelen wrote:
On Mon, Aug 13 2012, Glauber Costa wrote:
+ WARN_ON(mem_cgroup_is_root(memcg));
+ size = (1 << order) << PAGE_SHIFT;
+ memcg_uncharge_kmem(memcg,
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/15/2012 08:38 PM, Greg Thelen wrote:
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/14/2012 10:58 PM, Greg Thelen wrote:
On Mon, Aug 13 2012, Glauber Costa wrote:
+ WARN_ON(mem_cgroup_is_root(memcg));
+ size = (1 << order)
On Wed, 15 Aug 2012, Glauber Costa wrote:
Remember we copy over the metadata and create copies of the caches
per-memcg. Therefore, a dentry belongs to a memcg if it was allocated
from the slab pertaining to that memcg.
The dentry could be used by other processes in the system though. F.e.
On 08/15/2012 09:52 AM, Ben Pfaff wrote:
Stanislav Kinsbursky skinsbur...@parallels.com writes:
This system call is especially required for UNIX sockets, which have a name
length limitation.
The worst of the name length limitations can be worked around by
opening the directory where the
On Wed, Aug 15, 2012 at 5:39 AM, Michal Hocko mho...@suse.cz wrote:
On Wed 15-08-12 13:33:55, Glauber Costa wrote:
[...]
This can
be quite confusing. I am still not sure whether we should mix the two
things together. If somebody wants to limit the kernel memory he has to
touch the other
On 08/15/2012 10:01 PM, Ying Han wrote:
On Wed, Aug 15, 2012 at 5:39 AM, Michal Hocko mho...@suse.cz wrote:
On Wed 15-08-12 13:33:55, Glauber Costa wrote:
[...]
This can
be quite confusing. I am still not sure whether we should mix the two
things together. If somebody wants to limit the
On Wed, Aug 15, 2012 at 8:11 AM, Glauber Costa glom...@parallels.com wrote:
On 08/15/2012 06:47 PM, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Michal Hocko wrote:
That is not what the kernel does, in general. We assume that if he wants
that memory and we can serve it, we should. Also, not
On Wed, Aug 15, 2012 at 8:34 AM, Christoph Lameter c...@linux.com wrote:
On Wed, 15 Aug 2012, Glauber Costa wrote:
On 08/15/2012 06:47 PM, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Michal Hocko wrote:
That is not what the kernel does, in general. We assume that if he wants
that
On 08/15/2012 10:25 PM, Christoph Lameter wrote:
On Wed, 15 Aug 2012, Ying Han wrote:
How can you figure out which objects belong to which memcg? The ownership
of dentries and inodes is a dubious concept already.
I figured it out based on the kernel slab accounting.
On 08/15/2012 09:12 PM, Greg Thelen wrote:
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/15/2012 08:38 PM, Greg Thelen wrote:
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/14/2012 10:58 PM, Greg Thelen wrote:
On Mon, Aug 13 2012, Glauber Costa wrote:
+
Stanislav Kinsbursky skinsbur...@parallels.com writes:
This patch set introduces a new socket operation and a new system call:
sys_fbind(), which allows binding a socket to an opened file.
The file to bind to can be created by sys_mknod(S_IFSOCK) and opened by
open(O_PATH).
This system call is
On Tue, Aug 14, 2012 at 9:21 AM, Michal Hocko mho...@suse.cz wrote:
On Thu 09-08-12 17:01:12, Glauber Costa wrote:
This patch adds the basic infrastructure for the accounting of the slab
caches. To control that, the following files are created:
* memory.kmem.usage_in_bytes
*
On 08/15/2012 12:49 PM, Eric W. Biederman wrote:
There is also the trick of getting a shorter directory name using
/proc/self/fd if you are threaded and can't change the directory.
The obvious choices at this point are
- Teach bind and connect and af_unix sockets to take longer AF_UNIX
H. Peter Anvin h...@zytor.com writes:
On 08/15/2012 12:49 PM, Eric W. Biederman wrote:
There is also the trick of getting a shorter directory name using
/proc/self/fd if you are threaded and can't change the directory.
The obvious choices at this point are
- Teach bind and connect and
Stanislav Kinsbursky skinsbur...@parallels.com writes:
This patch set introduces a new socket operation and a new system call:
sys_fbind(), which allows binding a socket to an opened file.
The file to bind to can be created by sys_mknod(S_IFSOCK) and opened by
open(O_PATH).
This system call is
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/15/2012 09:12 PM, Greg Thelen wrote:
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/15/2012 08:38 PM, Greg Thelen wrote:
On Wed, Aug 15 2012, Glauber Costa wrote:
On 08/14/2012 10:58 PM, Greg Thelen wrote:
On Mon, Aug 13 2012, Glauber