[Devel] [PATCH RHEL7 COMMIT] ve: Use ve_name() in devperms ioctl

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit 812ac8c85ec65cb323c68c07050a20c7b60d806b
Author: Kirill Tkhai ktk...@odin.com
Date:   Mon Jun 29 17:42:56 2015 +0400

ve: Use ve_name() in devperms ioctl

In PCS7 we use the UUID instead of the CTID for cgroup directory names,
so we should open the cgroup by that name.

Below is a small test program to check that the patch works:

#include <linux/vzcalluser.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <sys/stat.h>

int main(void)
{
	struct vzctl_setdevperms s;
	int fd = open("/dev/vzctl", O_RDWR);
	unsigned major = 182;
	unsigned minor = 281233;

	s.veid = 101;
	s.type = S_IFBLK | 030;
	/* pack major and minor into a single device number */
	s.dev = (minor & 0xff) | (major << 8) | ((minor & ~0xff) << 12);
	s.mask = S_IXUSR;

	if (fd < 0) {
		printf("open err\n");
		return -1;
	}

	if (ioctl(fd, VZCTL_SETDEVPERMS, &s))
		printf("ioctl\n");

	return 0;
}
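
The s.dev value in the test program packs the major and minor numbers into one 32-bit word. What follows is a minimal sketch of that packing and its inverse, assuming the layout matches the kernel's new_encode_dev() helper. The names encode_dev, decode_major and decode_minor are illustrative only and are not part of vzctl or the patch.

```c
#include <stdint.h>

/* Low 8 bits of minor go to bits 0-7, major to bits 8-19,
 * and the remaining minor bits are shifted up to bits 20-31. */
static uint32_t encode_dev(unsigned major, unsigned minor)
{
	return (minor & 0xff) | (major << 8) | ((minor & ~0xffu) << 12);
}

/* Recover the major number from bits 8-19. */
static unsigned decode_major(uint32_t dev)
{
	return (dev & 0xfff00) >> 8;
}

/* Recombine the low byte with the high minor bits. */
static unsigned decode_minor(uint32_t dev)
{
	return (dev & 0xff) | ((dev >> 12) & 0xfff00);
}
```

With the values from the test program, encode_dev(182, 281233) yields 0x44a0b691, and decoding that value returns the original pair.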

https://jira.sw.ru/browse/PSBM-34497

Signed-off-by: Kirill Tkhai ktk...@odin.com
Reviewed-by: Vladimir Davydov vdavy...@parallels.com
---
 kernel/ve/vecalls.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/ve/vecalls.c b/kernel/ve/vecalls.c
index e262c5e..be4fb1e 100644
--- a/kernel/ve/vecalls.c
+++ b/kernel/ve/vecalls.c
@@ -160,13 +160,14 @@ static int real_setdevperms(envid_t veid, unsigned type,
return -ESRCH;
 
down_read(&ve->op_sem);
-   err = -ESRCH;
 
-   cgroup = ve_cgroup_open(devices_root, 0, ve->veid);
-   err = PTR_ERR(cgroup);
-   if (IS_ERR(cgroup))
+   cgroup = cgroup_kernel_open(devices_root, 0, ve_name(ve));
+   if (IS_ERR_OR_NULL(cgroup)) {
+   err = PTR_ERR(cgroup) ? : -ESRCH;
goto out;
+   }
 
+   err = -EAGAIN;
if (ve->is_running)
err = devcgroup_set_perms_ve(cgroup, type, dev, mask);
 
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


[Devel] [PATCH RHEL7 COMMIT] ub: remove sock acct related resources from cgroup params

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit da2ff85105421a5a8cbecf1e38e57b6e6ccf80cc
Author: Vladimir Davydov vdavy...@parallels.com
Date:   Mon Jun 29 17:53:57 2015 +0400

ub: remove sock acct related resources from cgroup params

These resources are now accounted by memcg, so hide them, just like we
do for other such resources.

Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 kernel/bc/beancounter.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/kernel/bc/beancounter.c b/kernel/bc/beancounter.c
index 5f1affa..6b5ed78 100644
--- a/kernel/bc/beancounter.c
+++ b/kernel/bc/beancounter.c
@@ -483,6 +483,12 @@ static inline int bc_verify_held(struct user_beancounter *ub)
ub->ub_parms[UB_PHYSPAGES].held = 0;
ub->ub_parms[UB_SWAPPAGES].held = 0;
ub->ub_parms[UB_OOMGUARPAGES].held = 0;
+   ub->ub_parms[UB_NUMTCPSOCK].held = 0;
+   ub->ub_parms[UB_TCPSNDBUF].held = 0;
+   ub->ub_parms[UB_TCPRCVBUF].held = 0;
+   ub->ub_parms[UB_OTHERSOCKBUF].held = 0;
+   ub->ub_parms[UB_DGRAMRCVBUF].held = 0;
+   ub->ub_parms[UB_NUMOTHERSOCK].held = 0;
 
clean = 1;
for (i = 0; i < UB_RESOURCES; i++)
@@ -783,12 +789,20 @@ static __init int ub_cgroup_init(void)
continue;
 
/* accounted by memcg */
-   if (i == UB_PHYSPAGES ||
-   i == UB_SWAPPAGES ||
-   i == UB_OOMGUARPAGES ||
-   i == UB_KMEMSIZE ||
-   i == UB_DCACHESIZE)
+   switch (i) {
+   case UB_KMEMSIZE:
+   case UB_DCACHESIZE:
+   case UB_PHYSPAGES:
+   case UB_SWAPPAGES:
+   case UB_OOMGUARPAGES:
+   case UB_NUMTCPSOCK:
+   case UB_TCPSNDBUF:
+   case UB_TCPRCVBUF:
+   case UB_OTHERSOCKBUF:
+   case UB_DGRAMRCVBUF:
+   case UB_NUMOTHERSOCK:
continue;
+   }
 
cft = &cgroup_files[j * UB_CGROUP_NR_ATTRS];
snprintf(cft->name, MAX_CFTYPE_NAME, "%s.held", ub_rnames[i]);


[Devel] [PATCH RHEL7 COMMIT] Revert "ve/netfilter/ipt_CLUSTERIP: Pass ve net instead of init_net"

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit 8381478ff63d1e9efea78b1b8bf007a9757faa4f
Author: Vladimir Davydov vdavy...@parallels.com
Date:   Mon Jun 29 17:58:45 2015 +0400

Revert "ve/netfilter/ipt_CLUSTERIP: Pass ve net instead of init_net"

This reverts commit 05f27248a3f1e0154d22d77bbc7987b34e3d1936.

netfilter/ipt_CLUSTERIP has been made per net namespace upstream, so
we'd better drop this hack and pull the upstream code instead. The
following patches do that.

Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 79944da..a2e2b61 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -384,8 +384,7 @@ static int clusterip_tg_check(const struct xt_tgchk_param *par)
return -EINVAL;
}
 
-   dev = dev_get_by_name(get_exec_env()->ve_netns,
- e->ip.iniface);
+   dev = dev_get_by_name(&init_net, e->ip.iniface);
if (!dev) {
pr_info("no such interface %s\n",
e->ip.iniface);


[Devel] [PATCH RHEL7 COMMIT] ms/netfilter: ipt_CLUSTERIP: make clusterip_lock per net namespace

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit a893029e2574ac1e08f3dbcce04a7d9e5df25fe9
Author: Gao feng gaof...@cn.fujitsu.com
Date:   Mon Jun 29 17:59:28 2015 +0400

ms/netfilter: ipt_CLUSTERIP: make clusterip_lock per net namespace

This lock protects the per net namespace clusterip_configs list, so it
should be per net namespace too.

Signed-off-by: Gao feng gaof...@cn.fujitsu.com
Signed-off-by: Pablo Neira Ayuso pa...@netfilter.org
(cherry picked from commit f1e8077f490cff4253b197154bf2affaa0ca08e3)
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index bed4137..b96ad7e 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -58,9 +58,6 @@ struct clusterip_config {
struct rcu_head rcu;
 };
 
-/* clusterip_lock protects the clusterip_configs list */
-static DEFINE_SPINLOCK(clusterip_lock);
-
 #ifdef CONFIG_PROC_FS
 static const struct file_operations clusterip_proc_fops;
 #endif
@@ -69,6 +66,9 @@ static int clusterip_net_id __read_mostly;
 
 struct clusterip_net {
struct list_head configs;
+   /* lock protects the configs list */
+   spinlock_t lock;
+
 #ifdef CONFIG_PROC_FS
struct proc_dir_entry *procdir;
 #endif
@@ -99,10 +99,12 @@ clusterip_config_put(struct clusterip_config *c)
 static inline void
 clusterip_config_entry_put(struct clusterip_config *c)
 {
+   struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+
local_bh_disable();
-   if (atomic_dec_and_lock(&c->entries, &clusterip_lock)) {
+   if (atomic_dec_and_lock(&c->entries, &cn->lock)) {
list_del_rcu(&c->list);
-   spin_unlock(&clusterip_lock);
+   spin_unlock(&cn->lock);
local_bh_enable();
 
dev_mc_del(c->dev, c->clustermac);
@@ -198,9 +200,9 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
}
 #endif
 
-   spin_lock_bh(&clusterip_lock);
+   spin_lock_bh(&cn->lock);
list_add_rcu(&c->list, &cn->configs);
-   spin_unlock_bh(&clusterip_lock);
+   spin_unlock_bh(&cn->lock);
 
return c;
 }
@@ -713,6 +715,8 @@ static int clusterip_net_init(struct net *net)
 
INIT_LIST_HEAD(&cn->configs);
 
+   spin_lock_init(&cn->lock);
+
 #ifdef CONFIG_PROC_FS
cn->procdir = proc_mkdir("ipt_CLUSTERIP", net->proc_net);
if (!cn->procdir) {


[Devel] [PATCH RHEL7 COMMIT] ve/kernel: drop broken audit virtualization

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit 5d57c681071143a4fafe0d8589cb5f5073d7ef09
Author: Vladimir Davydov vdavy...@parallels.com
Date:   Mon Jun 29 17:51:45 2015 +0400

ve/kernel: drop broken audit virtualization

As noted by Cyrill, it is badly broken, which is why we disabled it in
the config - see commit d9362cde31ac ("audit: Disable audit subsystem in
config"). This patch removes the virtualization code too. Since audit
has been made per net namespace upstream, if we want it back we'd
better pull the upstream patches.

Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Acked-by: Cyrill Gorcunov gorcu...@openvz.org
---
 include/net/net_namespace.h |  1 -
 kernel/audit.c              | 51 +++++++++++----------------------------------------
 2 files changed, 11 insertions(+), 41 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 14eda00..20eb093 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -86,7 +86,6 @@ struct net {
/* core fib_rules */
struct list_head    rules_ops;
 
-   struct sock *_audit_sock;   /* audit socket */
 
struct net_device   *loopback_dev;  /* The loopback */
struct netns_core   core;
diff --git a/kernel/audit.c b/kernel/audit.c
index e081f08..2c6de57 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -51,7 +51,6 @@
 #include <linux/kthread.h>
 #include <linux/kernel.h>
 #include <linux/syscalls.h>
-#include <linux/ve.h>
 
 #include <linux/audit.h>
 
@@ -121,7 +120,8 @@ u32 audit_sig_sid = 0;
 */
 static atomic_t    audit_lost = ATOMIC_INIT(0);
 
-#define audit_sock (get_exec_env()->ve_netns->_audit_sock)
+/* The netlink socket. */
+static struct sock *audit_sock;
 
 /* Hash for inode-based rules */
 struct list_head audit_inode_hash[AUDIT_INODE_BUCKETS];
@@ -755,9 +755,6 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
char *ctx = NULL;
u32 len;
 
-   if (!ve_is_super(sock_net(skb->sk)->owner_ve))
-   return -ECONNREFUSED;
-
err = audit_netlink_ok(skb, msg_type);
if (err)
return err;
@@ -1017,50 +1014,24 @@ static void audit_receive(struct sk_buff  *skb)
mutex_unlock(&audit_cmd_mutex);
 }
 
-static int __net_init audit_net_init(struct net *net)
-{
-   struct sock *sk;
-   struct netlink_kernel_cfg cfg = {
-   .input = audit_receive,
-   };
-
-   sk = netlink_kernel_create(net, NETLINK_AUDIT, &cfg);
-   if (!sk) {
-   audit_panic("cannot initialize netlink socket");
-   return -ENODEV;
-   }
-
-   sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
-   net->_audit_sock = sk;
-
-   return 0;
-}
-
-static void __net_exit audit_net_exit(struct net *net)
-{
-   netlink_kernel_release(net->_audit_sock);
-   net->_audit_sock = NULL;
-}
-
-static struct pernet_operations audit_net_ops = {
-   .init = audit_net_init,
-   .exit = audit_net_exit,
-};
-
 /* Initialize audit support at boot time. */
 static int __init audit_init(void)
 {
-   int i, res;
+   int i;
+   struct netlink_kernel_cfg cfg = {
+   .input  = audit_receive,
+   };
 
if (audit_initialized == AUDIT_DISABLED)
return 0;
 
printk(KERN_INFO "audit: initializing netlink socket (%s)\n",
   audit_default ? "enabled" : "disabled");
-
-   res = register_pernet_subsys(&audit_net_ops);
-   if (res < 0)
-   return res;
+   audit_sock = netlink_kernel_create(&init_net, NETLINK_AUDIT, &cfg);
+   if (!audit_sock)
+   audit_panic("cannot initialize netlink socket");
+   else
+   audit_sock->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
 
skb_queue_head_init(&audit_skb_queue);
skb_queue_head_init(&audit_skb_hold_queue);


[Devel] [PATCH RHEL7 COMMIT] ms/netfilter: ipt_CLUSTERIP: make proc directory per net namespace

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit 201f3b2df3b9dfd2ecdc3c07f82c660094e7c362
Author: Gao feng gaof...@cn.fujitsu.com
Date:   Mon Jun 29 17:59:02 2015 +0400

ms/netfilter: ipt_CLUSTERIP: make proc directory per net namespace

Create the /proc/net/ipt_CLUSTERIP directory in each net namespace.
For now, entries may only be created under the ipt_CLUSTERIP directory
of the init net namespace.

Signed-off-by: Gao feng gaof...@cn.fujitsu.com
Signed-off-by: Pablo Neira Ayuso pa...@netfilter.org
(cherry picked from commit ce4ff76c15a877a62097807a35518fc808c1853c)
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 51 insertions(+), 19 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index a2e2b61..2ba91c4 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -28,6 +28,7 @@
 #include <linux/netfilter_ipv4/ipt_CLUSTERIP.h>
 #include <net/netfilter/nf_conntrack.h>
 #include <net/net_namespace.h>
+#include <net/netns/generic.h>
 #include <net/checksum.h>
 #include <net/ip.h>
 
@@ -64,9 +65,16 @@ static DEFINE_SPINLOCK(clusterip_lock);
 
 #ifdef CONFIG_PROC_FS
 static const struct file_operations clusterip_proc_fops;
-static struct proc_dir_entry *clusterip_procdir;
 #endif
 
+static int clusterip_net_id __read_mostly;
+
+struct clusterip_net {
+#ifdef CONFIG_PROC_FS
+   struct proc_dir_entry *procdir;
+#endif
+};
+
 static inline void
 clusterip_config_get(struct clusterip_config *c)
 {
@@ -158,6 +166,7 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
struct net_device *dev)
 {
struct clusterip_config *c;
+   struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
 
c = kzalloc(sizeof(*c), GFP_ATOMIC);
if (!c)
@@ -180,7 +189,7 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
/* create proc dir entry */
sprintf(buffer, "%pI4", &ip);
c->pde = proc_create_data(buffer, S_IWUSR|S_IRUSR,
- clusterip_procdir,
+ cn->procdir,
  &clusterip_proc_fops, c);
if (!c->pde) {
kfree(c);
@@ -698,48 +707,71 @@ static const struct file_operations clusterip_proc_fops = {
 
 #endif /* CONFIG_PROC_FS */
 
+static int clusterip_net_init(struct net *net)
+{
+#ifdef CONFIG_PROC_FS
+   struct clusterip_net *cn = net_generic(net, clusterip_net_id);
+
+   cn->procdir = proc_mkdir("ipt_CLUSTERIP", net->proc_net);
+   if (!cn->procdir) {
+   pr_err("Unable to proc dir entry\n");
+   return -ENOMEM;
+   }
+#endif /* CONFIG_PROC_FS */
+
+   return 0;
+}
+
+static void clusterip_net_exit(struct net *net)
+{
+#ifdef CONFIG_PROC_FS
+   struct clusterip_net *cn = net_generic(net, clusterip_net_id);
+   proc_remove(cn->procdir);
+#endif
+}
+
+static struct pernet_operations clusterip_net_ops = {
+   .init = clusterip_net_init,
+   .exit = clusterip_net_exit,
+   .id   = &clusterip_net_id,
+   .size = sizeof(struct clusterip_net),
+};
+
 static int __init clusterip_tg_init(void)
 {
int ret;
 
-   ret = xt_register_target(&clusterip_tg_reg);
+   ret = register_pernet_subsys(&clusterip_net_ops);
if (ret < 0)
return ret;
 
+   ret = xt_register_target(&clusterip_tg_reg);
+   if (ret < 0)
+   goto cleanup_subsys;
+
ret = nf_register_hook(&cip_arp_ops);
if (ret < 0)
goto cleanup_target;
 
-#ifdef CONFIG_PROC_FS
-   clusterip_procdir = proc_mkdir("ipt_CLUSTERIP", init_net.proc_net);
-   if (!clusterip_procdir) {
-   pr_err("Unable to proc dir entry\n");
-   ret = -ENOMEM;
-   goto cleanup_hook;
-   }
-#endif /* CONFIG_PROC_FS */
-
pr_info("ClusterIP Version %s loaded successfully\n",
CLUSTERIP_VERSION);
+
return 0;
 
-#ifdef CONFIG_PROC_FS
-cleanup_hook:
-   nf_unregister_hook(&cip_arp_ops);
-#endif /* CONFIG_PROC_FS */
 cleanup_target:
xt_unregister_target(&clusterip_tg_reg);
+cleanup_subsys:
+   unregister_pernet_subsys(&clusterip_net_ops);
return ret;
 }
 
 static void __exit clusterip_tg_exit(void)
 {
pr_info("ClusterIP Version %s unloading\n", CLUSTERIP_VERSION);
-#ifdef CONFIG_PROC_FS
-   proc_remove(clusterip_procdir);
-#endif
+
nf_unregister_hook(&cip_arp_ops);
xt_unregister_target(&clusterip_tg_reg);
+   unregister_pernet_subsys(&clusterip_net_ops);
 
/* Wait for completion of call_rcu_bh()'s (clusterip_config_rcu_free) */

[Devel] [PATCH RHEL7 COMMIT] memcg/ub: fix limit > RESOURCE_MAX case

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit e651dfc30632e241a0ce5758c9a4c5e9d4a6935b
Author: Vladimir Davydov vdavy...@parallels.com
Date:   Mon Jun 29 17:37:26 2015 +0400

memcg/ub: fix limit > RESOURCE_MAX case

A memcg limit can be greater than RESOURCE_MAX (LLONG_MAX), because it
is rounded up to PAGE_SIZE. As a result, we will show huge numbers in
meminfo for unlimited containers instead of the host's configuration.
Fix this by making all limit-vs-RESOURCE_MAX comparisons use
greater-or-equal sign.
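
To illustrate why an equality test misses the unlimited case, here is a minimal userspace sketch. It assumes 4 KiB pages and a plain round-up to the page boundary; SKETCH_PAGE_SIZE, page_align_up, unlimited_old and unlimited_new are hypothetical names chosen for this example, not kernel symbols.

```c
#include <limits.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL                 /* assumption: 4 KiB pages */
#define RESOURCE_MAX     ((uint64_t)LLONG_MAX)   /* as stated in the commit message */

/* Round a byte limit up to the next page boundary. */
static uint64_t page_align_up(uint64_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Old check: only an exact match counts as "unlimited". */
static int unlimited_old(uint64_t limit)
{
	return limit == RESOURCE_MAX;
}

/* New check: anything at or above RESOURCE_MAX counts as "unlimited". */
static int unlimited_new(uint64_t limit)
{
	return limit >= RESOURCE_MAX;
}
```

Rounding LLONG_MAX up to a page boundary gives a value one greater than RESOURCE_MAX, so unlimited_old() rejects it while unlimited_new() accepts it.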

Signed-off-by: Vladimir Davydov vdavy...@parallels.com
Reviewed-by: Kirill Tkhai ktk...@odin.com
---
 mm/memcontrol.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cb153ac..50eefe3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1648,7 +1648,7 @@ unsigned long mem_cgroup_total_pages(struct mem_cgroup *memcg, bool swap)
 
limit = swap ? res_counter_read_u64(&memcg->memsw, RES_LIMIT) :
res_counter_read_u64(&memcg->res, RES_LIMIT);
-   if (limit == RESOURCE_MAX)
+   if (limit >= RESOURCE_MAX)
return ULONG_MAX;
return min_t(unsigned long long, ULONG_MAX, limit >> PAGE_SHIFT);
 }
@@ -5388,7 +5388,7 @@ void mem_cgroup_sync_beancounter(struct mem_cgroup *memcg,
p->maxheld = res_counter_read_u64(&memcg->res, RES_MAX_USAGE) >> PAGE_SHIFT;
p->failcnt = atomic_long_read(&memcg->mem_failcnt);
lim = res_counter_read_u64(&memcg->res, RES_LIMIT);
-   lim = lim == RESOURCE_MAX ? UB_MAXVALUE :
+   lim = lim >= RESOURCE_MAX ? UB_MAXVALUE :
min_t(unsigned long long, lim >> PAGE_SHIFT, UB_MAXVALUE);
p->barrier = p->limit = lim;
 
@@ -5396,7 +5396,7 @@ void mem_cgroup_sync_beancounter(struct mem_cgroup *memcg,
k->maxheld = res_counter_read_u64(&memcg->kmem, RES_MAX_USAGE);
k->failcnt = res_counter_read_u64(&memcg->kmem, RES_FAILCNT);
lim = res_counter_read_u64(&memcg->kmem, RES_LIMIT);
-   lim = lim == RESOURCE_MAX ? UB_MAXVALUE :
+   lim = lim >= RESOURCE_MAX ? UB_MAXVALUE :
min_t(unsigned long long, lim, UB_MAXVALUE);
k->barrier = k->limit = lim;
 
@@ -5410,7 +5410,7 @@ void mem_cgroup_sync_beancounter(struct mem_cgroup *memcg,
maxheld = memcg->swap_max >> PAGE_SHIFT;
s->failcnt = atomic_long_read(&memcg->swap_failcnt);
lim = res_counter_read_u64(&memcg->memsw, RES_LIMIT);
-   lim = lim == RESOURCE_MAX ? UB_MAXVALUE :
+   lim = lim >= RESOURCE_MAX ? UB_MAXVALUE :
min_t(unsigned long long, lim >> PAGE_SHIFT, UB_MAXVALUE);
if (lim != UB_MAXVALUE)
lim -= p->limit;
@@ -5425,7 +5425,7 @@ void mem_cgroup_sync_beancounter(struct mem_cgroup *memcg,
o->maxheld = res_counter_read_u64(&memcg->memsw, RES_MAX_USAGE) >> PAGE_SHIFT;
o->failcnt = atomic_long_read(&memcg->oom_kill_cnt);
lim = memcg->oom_guarantee;
-   lim = lim == RESOURCE_MAX ? UB_MAXVALUE :
+   lim = lim >= RESOURCE_MAX ? UB_MAXVALUE :
min_t(unsigned long long, lim >> PAGE_SHIFT, UB_MAXVALUE);
o->barrier = o->limit = lim;
 }
@@ -5486,7 +5486,7 @@ int mem_cgroup_apply_beancounter(struct mem_cgroup *memcg,
 
if (mem != mem_old) {
/* first, reset memsw limit since it cannot be < mem limit */
-   if (memsw_old != RESOURCE_MAX) {
+   if (memsw_old < RESOURCE_MAX) {
memsw_old = RESOURCE_MAX;
ret = mem_cgroup_resize_memsw_limit(memcg, memsw_old);
if (ret)


[Devel] [PATCH RHEL7 COMMIT] ms/netfilter: ipt_CLUSTERIP: create proc entry under proper ipt_CLUSTERIP directory

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit 90a6058501a752eb80101eaa126fcc443b809cb5
Author: Gao feng gaof...@cn.fujitsu.com
Date:   Mon Jun 29 17:59:46 2015 +0400

ms/netfilter: ipt_CLUSTERIP: create proc entry under proper ipt_CLUSTERIP directory

Create proc entries under the ipt_CLUSTERIP directory of proper
net namespace.

Signed-off-by: Gao feng gaof...@cn.fujitsu.com
Signed-off-by: Pablo Neira Ayuso pa...@netfilter.org
(cherry picked from commit f58d7866018dedae7ec67e152402b8ede17ce39e)
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 8945df5..78b7767 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -168,7 +168,7 @@ clusterip_config_init(const struct ipt_clusterip_tgt_info *i, __be32 ip,
struct net_device *dev)
 {
struct clusterip_config *c;
-   struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+   struct clusterip_net *cn = net_generic(dev_net(dev), clusterip_net_id);
 
c = kzalloc(sizeof(*c), GFP_ATOMIC);
if (!c)


[Devel] [PATCH RHEL7 COMMIT] ms/netfilter: ipt_CLUSTERIP: add parameter net in clusterip_config_find_get

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit ba3e0d8b74e2a6ae4074cd567677e82955e01313
Author: Gao feng gaof...@cn.fujitsu.com
Date:   Mon Jun 29 17:59:40 2015 +0400

ms/netfilter: ipt_CLUSTERIP: add parameter net in clusterip_config_find_get

In order to find the clusterip_config in the given net namespace.
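
The effect of passing the namespace into the lookup can be sketched in plain userspace C. This is a simplified model under stated assumptions: struct ns_state stands in for the per-namespace private data, and config_add and config_find are hypothetical helpers that omit the RCU and reference counting of the real code.

```c
#include <stddef.h>
#include <stdint.h>

struct config {
	uint32_t clusterip;
	struct config *next;
};

/* Stand-in for the per-namespace private data. */
struct ns_state {
	struct config *configs;
};

/* Link a config into the list owned by one namespace. */
static void config_add(struct ns_state *ns, struct config *c)
{
	c->next = ns->configs;
	ns->configs = c;
}

/* Search only the list that belongs to the given namespace. */
static struct config *config_find(const struct ns_state *ns, uint32_t clusterip)
{
	struct config *c;

	for (c = ns->configs; c; c = c->next)
		if (c->clusterip == clusterip)
			return c;
	return NULL;
}
```

A config added to one namespace is found there and stays invisible to a lookup made with any other namespace, which is the isolation the extra parameter provides.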

Signed-off-by: Gao feng gaof...@cn.fujitsu.com
Signed-off-by: Pablo Neira Ayuso pa...@netfilter.org
(cherry picked from commit b5ef0f85bf76986e5076cd1e0820fa4e61325772)
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index b96ad7e..8945df5 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -122,10 +122,10 @@ clusterip_config_entry_put(struct clusterip_config *c)
 }
 
 static struct clusterip_config *
-__clusterip_config_find(__be32 clusterip)
+__clusterip_config_find(struct net *net, __be32 clusterip)
 {
struct clusterip_config *c;
-   struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+   struct clusterip_net *cn = net_generic(net, clusterip_net_id);
 
list_for_each_entry_rcu(c, &cn->configs, list) {
if (c->clusterip == clusterip)
@@ -136,12 +136,12 @@ __clusterip_config_find(__be32 clusterip)
 }
 
 static inline struct clusterip_config *
-clusterip_config_find_get(__be32 clusterip, int entry)
+clusterip_config_find_get(struct net *net, __be32 clusterip, int entry)
 {
struct clusterip_config *c;
 
rcu_read_lock_bh();
-   c = __clusterip_config_find(clusterip);
+   c = __clusterip_config_find(net, clusterip);
if (c) {
if (unlikely(!atomic_inc_not_zero(&c->refcount)))
c = NULL;
@@ -381,7 +381,7 @@ static int clusterip_tg_check(const struct xt_tgchk_param *par)
 
/* FIXME: further sanity checks */
 
-   config = clusterip_config_find_get(e->ip.dst.s_addr, 1);
+   config = clusterip_config_find_get(&init_net, e->ip.dst.s_addr, 1);
if (!config) {
if (!(cipinfo->flags & CLUSTERIP_FLAG_NEW)) {
pr_info("no config found for %pI4, need 'new'\n",
@@ -519,7 +519,7 @@ arp_mangle(const struct nf_hook_ops *ops,
 
/* if there is no clusterip configuration for the arp reply's
 * source ip, we don't want to mangle it */
-   c = clusterip_config_find_get(payload->src_ip, 0);
+   c = clusterip_config_find_get(&init_net, payload->src_ip, 0);
if (!c)
return NF_ACCEPT;
 


[Devel] [PATCH RHEL7 COMMIT] ms/netfilter: ipt_CLUSTERIP: use proper net namespace to operate CLUSTERIP

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.22
--
commit ce4729e92881835587a29db96d0501aa45b2aebb
Author: Gao feng gaof...@cn.fujitsu.com
Date:   Mon Jun 29 17:59:51 2015 +0400

ms/netfilter: ipt_CLUSTERIP: use proper net namespace to operate CLUSTERIP

We can now allow users in non-init net namespaces to operate
ipt_CLUSTERIP.

Signed-off-by: Gao feng gaof...@cn.fujitsu.com
Signed-off-by: Pablo Neira Ayuso pa...@netfilter.org
(cherry picked from commit d86946d2c5b4e519ffe435c2deeb2c9436ceb04f)
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 net/ipv4/netfilter/ipt_CLUSTERIP.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 78b7767..2510c02 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -99,7 +99,8 @@ clusterip_config_put(struct clusterip_config *c)
 static inline void
 clusterip_config_entry_put(struct clusterip_config *c)
 {
-   struct clusterip_net *cn = net_generic(&init_net, clusterip_net_id);
+   struct net *net = dev_net(c->dev);
+   struct clusterip_net *cn = net_generic(net, clusterip_net_id);
 
local_bh_disable();
if (atomic_dec_and_lock(&c->entries, &cn->lock)) {
@@ -381,7 +382,7 @@ static int clusterip_tg_check(const struct xt_tgchk_param *par)
 
/* FIXME: further sanity checks */
 
-   config = clusterip_config_find_get(&init_net, e->ip.dst.s_addr, 1);
+   config = clusterip_config_find_get(par->net, e->ip.dst.s_addr, 1);
if (!config) {
if (!(cipinfo->flags & CLUSTERIP_FLAG_NEW)) {
pr_info("no config found for %pI4, need 'new'\n",
@@ -395,7 +396,7 @@ static int clusterip_tg_check(const struct xt_tgchk_param *par)
return -EINVAL;
}
 
-   dev = dev_get_by_name(&init_net, e->ip.iniface);
+   dev = dev_get_by_name(par->net, e->ip.iniface);
if (!dev) {
pr_info("no such interface %s\n",
e->ip.iniface);
@@ -503,6 +504,7 @@ arp_mangle(const struct nf_hook_ops *ops,
struct arphdr *arp = arp_hdr(skb);
struct arp_payload *payload;
struct clusterip_config *c;
+   struct net *net = dev_net(in ? in : out);
 
/* we don't care about non-ethernet and non-ipv4 ARP */
if (arp->ar_hrd != htons(ARPHRD_ETHER) ||
@@ -519,7 +521,7 @@ arp_mangle(const struct nf_hook_ops *ops,
 
/* if there is no clusterip configuration for the arp reply's
 * source ip, we don't want to mangle it */
-   c = clusterip_config_find_get(&init_net, payload->src_ip, 0);
+   c = clusterip_config_find_get(net, payload->src_ip, 0);
if (!c)
return NF_ACCEPT;
 


[Devel] [PATCH RHEL7 COMMIT] net/packet: cleanup packet_sk_charge

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.21
--
commit 30a812f9edd80554dd11977529a789afd5f9bc18
Author: Vladimir Davydov vdavy...@parallels.com
Date:   Mon Jun 29 15:00:45 2015 +0400

net/packet: cleanup packet_sk_charge

 - Do not check mem_cgroup_sockets_enabled - it has nothing to do with
   tcp/udp buffers accounting, which enable this static key. A check if
   memcg_kmem_is_active is enough anyway.

 - Do not forget to put memcg if try_get_mem_cgroup_from_mm returned a
   kmem inactive memcg.

 - Use ACCESS_ONCE for reading sysctl_rmem_max, because it can change on
   the fly and we rely on it being constant.

 - Use memcg_charge_kmem instead of memcg_charge_kmem_nofail for
   charging sock packet buf, because we can dive into reclaim here.
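
The error handling described in the list above follows the usual goto-unwind pattern: each failure point jumps to a label that releases exactly what has been acquired so far. The following is a minimal userspace sketch of that control flow, not the kernel code. struct group, group_get, group_put, group_charge and sk_charge are hypothetical stand-ins for the memcg helpers, and the result is returned through an out parameter instead of ERR_PTR().

```c
#include <errno.h>
#include <stdlib.h>

struct group { int refs; int kmem_active; long charged; long limit; };
struct charge { struct group *grp; long amt; };

static struct group *group_get(struct group *g) { if (g) g->refs++; return g; }
static void group_put(struct group *g) { g->refs--; }

/* Charge that can fail, unlike a "nofail" variant. */
static int group_charge(struct group *g, long amt)
{
	if (g->charged + amt > g->limit)
		return -ENOMEM;
	g->charged += amt;
	return 0;
}

/* Returns 0 with *out set (or NULL when there is nothing to charge),
 * or a negative errno on failure. */
static int sk_charge(struct group *cur, long amt, struct charge **out)
{
	struct charge *c;
	int err = -ENOMEM;

	*out = NULL;
	c = malloc(sizeof(*c));
	if (!c)
		goto out;

	err = 0;
	c->grp = group_get(cur);
	if (!c->grp)
		goto out_free;
	if (!c->grp->kmem_active)
		goto out_put;	/* drop the reference taken above */

	c->amt = amt;
	err = group_charge(c->grp, c->amt);
	if (!err) {
		*out = c;
		goto out;
	}
out_put:
	group_put(c->grp);
out_free:
	free(c);
out:
	return err;
}
```

The out_put label is what keeps the reference count balanced when the group exists but has accounting disabled, which mirrors the second item in the list.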

Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 include/linux/memcontrol.h |  1 +
 mm/memcontrol.c            |  2 +-
 net/packet/af_packet.c     | 44 +++++++++++++++++++++++++++-----------------
 3 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e09ec92..eb7ae43 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -481,6 +481,7 @@ void __memcg_kmem_put_cache(struct kmem_cache *cachep);
 
 struct mem_cgroup *__mem_cgroup_from_kmem(void *ptr);
 
+int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size);
 void memcg_charge_kmem_nofail(struct mem_cgroup *memcg, u64 size);
 void memcg_uncharge_kmem(struct mem_cgroup *memcg, u64 size);
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 17552cf..cb153ac 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3158,7 +3158,7 @@ static int mem_cgroup_slabinfo_read(struct cgroup *cont, struct cftype *cft,
 }
 #endif
 
-static int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
+int memcg_charge_kmem(struct mem_cgroup *memcg, gfp_t gfp, u64 size)
 {
struct res_counter *fail_res;
struct mem_cgroup *_memcg;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index a2e4fad..af79d53 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2429,28 +2429,38 @@ struct packet_sk_charge {
 static struct cg_proto *packet_sk_charge(void)
 {
struct packet_sk_charge *psc;
-
-   if (!mem_cgroup_sockets_enabled)
-   return NULL;
+   int err = -ENOMEM;
 
psc = kmalloc(sizeof(*psc), GFP_KERNEL);
if (!psc)
-   return ERR_PTR(-ENOMEM);
+   goto out;
 
+   err = 0;
psc->memcg = try_get_mem_cgroup_from_mm(current->mm);
-   if (psc->memcg && memcg_kmem_is_active(psc->memcg)) {
-   /*
-* Forcedly charge the maximum amount of data this socket
-* may have. It's typically not huge and packet sockets are
-* rare guests in containers, so we don't disturb the memory
-* consumption much.
-*/
-   psc->amt = sysctl_rmem_max;
-   memcg_charge_kmem_nofail(psc->memcg, psc->amt);
-   } else {
-   kfree(psc);
-   psc = NULL;
-   }
+   if (!psc->memcg)
+   goto out_free_psc;
+   if (!memcg_kmem_is_active(psc->memcg))
+   goto out_put_cg;
+
+   /*
+* Forcedly charge the maximum amount of data this socket may have.
+* It's typically not huge and packet sockets are rare guests in
+* containers, so we don't disturb the memory consumption much.
+*/
+   psc->amt = ACCESS_ONCE(sysctl_rmem_max);
+
+   err = memcg_charge_kmem(psc->memcg, GFP_KERNEL, psc->amt);
+   if (!err)
+   goto out;
+
+out_put_cg:
+   css_put(mem_cgroup_css(psc->memcg));
+out_free_psc:
+   kfree(psc);
+   psc = NULL;
+out:
+   if (err)
+   return ERR_PTR(err);
 
/*
 * The sk->sk_cgrp is not used for packet sockets,


[Devel] [PATCH RHEL7 COMMIT] Revert "unix: Charge outgoing buffers into cg memory"

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.21
--
commit c2993da9b126396025f8485004fad02f9a5c9525
Author: Vladimir Davydov vdavy...@parallels.com
Date:   Mon Jun 29 15:00:19 2015 +0400

Revert "unix: Charge outgoing buffers into cg memory"

This reverts commit f22980954a2d765ca6ca03c11b2eac8f3fe1d105.

This commit is deadly broken - it frees pages allocated with
alloc_kmem_pages using put_page instead of free_kmem_pages. As a result,
kmem counter of the cgroup the page is charged to won't be uncharged and
therefore will be leaked. What is worse, if such a page then gets reused
for a thread info or slab page, there is a chance that the order of the
new page will be greater than it used to be, as a result the
mem_cgroup->kmem counter can be under-uncharged:

  WARNING: at kernel/res_counter.c:91 res_counter_uncharge_locked+0x2f/0x40()
  CPU: 0 PID: 19 Comm: rcuos/1 ve: 0 Tainted: GW   --   3.10.0-123.1.2.vz7.5.18 #1 5.18
  817e8e14 a31483c6 8804090c1c98 815ca9ea
  8804090c1cd0 8105e091 8800ceadf150 
  8800ceadf150  8800ceadf178 8804090c1ce0
  Call Trace:
  [815ca9ea] dump_stack+0x19/0x1b
  [8105e091] warn_slowpath_common+0x61/0x80
  [8105e1ba] warn_slowpath_null+0x1a/0x20
  [810edfdf] res_counter_uncharge_locked+0x2f/0x40
  [810ee1e5] res_counter_uncharge_until+0x55/0xb0
  [810ee253] res_counter_uncharge+0x13/0x20
  [811b2ba4] memcg_uncharge_kmem+0x34/0x80
  [811b2ebd] __memcg_kmem_uncharge_pages+0x5d/0x70
  [8114fe28] free_kmem_pages+0x68/0x80
  [8105aed2] free_task+0x32/0x60
  [8105af9b] __put_task_struct+0x9b/0x140
  [81062b6c] delayed_put_task_struct+0x3c/0x80
  [811034d9] rcu_nocb_kthread+0x229/0x370
  [810883a0] ? wake_up_bit+0x30/0x30
  [811032b0] ? rcu_start_gp+0x40/0x40
  [8108723f] kthread+0xcf/0xe0
  [81087170] ? create_kthread+0x60/0x60
  [815db0ac] ret_from_fork+0x7c/0xb0
  [81087170] ? create_kthread+0x60/0x60

This will probably eventually lead to the cgroup being freed when there
are still active objects in one or more of its kmem caches:

  BUG buffer_head(39:101) (Tainted: GW   --  ): Objects remaining in buffer_head(39:101) on kmem_cache_close()

  kernel BUG at mm/slab_common.c:493!

This patch therefore reverts the above mentioned commit. We will rework
it later.
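
To make the failure mode described above concrete, here is a minimal
userspace sketch of the accounting. It is not kernel code: kmem_counter,
model_page and every model_* function are hypothetical stand-ins, and the
assumption that a page keeps its recorded owner after a plain put_page-style
free is a simplification made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct kmem_counter {
	long usage;		/* base pages currently charged */
	int underflowed;	/* set when an uncharge exceeds usage */
};

struct model_page {
	struct kmem_counter *charged_to;	/* owner recorded at charge time */
	unsigned int order;			/* page spans 1 << order base pages */
};

/* Models alloc_kmem_pages(): charge the owner and remember it in the page. */
static void model_alloc_charged(struct model_page *p, struct kmem_counter *c,
				unsigned int order)
{
	p->charged_to = c;
	p->order = order;
	c->usage += 1L << order;
}

/*
 * Models put_page(): the page returns to the allocator, but nothing is
 * uncharged and the recorded owner stays behind (assumption of this model).
 */
static void model_free_plain(struct model_page *p)
{
	(void)p;
}

/* Models reuse of the same page, uncharged, with a different order. */
static void model_realloc_uncharged(struct model_page *p, unsigned int order)
{
	p->order = order;
}

/* Models free_kmem_pages(): uncharge by the page's current order. */
static void model_free_charged(struct model_page *p)
{
	struct kmem_counter *c = p->charged_to;
	long val = 1L << p->order;

	if (!c)
		return;
	if (c->usage < val) {	/* more is uncharged than is charged */
		c->underflowed = 1;
		val = c->usage;
	}
	c->usage -= val;
	p->charged_to = NULL;
}

/* Replays the sequence from the text; returns 1 if the counter underflowed. */
static int model_run_scenario(void)
{
	struct kmem_counter cg = { 0, 0 };
	struct model_page page = { NULL, 0 };

	model_alloc_charged(&page, &cg, 0);	/* charge one base page */
	model_free_plain(&page);		/* leak: usage stays at 1 */
	assert(cg.usage == 1);
	model_realloc_uncharged(&page, 2);	/* reused as an order-2 page */
	model_free_charged(&page);		/* tries to uncharge 4 pages */
	return cg.underflowed;
}
```

In this model the counter holds one page when the order-2 free tries to
uncharge four, which is the kind of mismatch the warning reports. Pairing
every charged allocation with a charged free keeps usage balanced.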

https://jira.sw.ru/browse/PSBM-34492

Signed-off-by: Vladimir Davydov vdavy...@parallels.com

Conflicts:
net/core/sock.c
---
 net/core/sock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 10b4362..03f4b23 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1780,7 +1780,7 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 
 			while (order) {
 				if (npages >= 1 << order) {
-					page = alloc_kmem_pages(sk->sk_allocation |
+					page = alloc_pages(sk->sk_allocation |
 							   __GFP_COMP | __GFP_NOWARN,
 							   order);
 					if (page)
@@ -1788,7 +1788,7 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 				}
 				order--;
 			}
-			page = alloc_kmem_pages(sk->sk_allocation, 0);
+			page = alloc_page(sk->sk_allocation);
 			if (!page)
 				goto failure;
 fill_page:


[Devel] [PATCH RHEL7 COMMIT] memcg: disarm memcg_socket_limit_enabled key per each active udp proto

2015-06-29 Thread Konstantin Khorenko
The commit is pushed to branch-rh7-3.10.0-123.1.2-ovz and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.21
--
commit 41d2a06dbf49a7b05d49c1922d06e26768a7bc86
Author: Vladimir Davydov vdavy...@parallels.com
Date:   Mon Jun 29 15:00:40 2015 +0400

memcg: disarm memcg_socket_limit_enabled key per each active udp proto

The memcg_socket_limit_enabled key is incremented not only for each
active tcp proto but also for each active udp proto (see
udp_memcontrol.c), so it must be decremented once per active proto when
a proto is destroyed.
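
The imbalance can be illustrated with a small userspace sketch. It is not
the kernel's static_key API: socket_limit_key, model_proto, model_memcg and
the model_* functions are hypothetical names, and a plain integer stands in
for the key's reference count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; not the kernel's static_key API. */
static int socket_limit_key;	/* models memcg_socket_limit_enabled */

struct model_proto {
	bool activated;
};

struct model_memcg {
	struct model_proto tcp;
	struct model_proto udp;
};

/* Activating a proto takes one reference on the shared key. */
static void model_activate(struct model_proto *p)
{
	if (!p->activated) {
		p->activated = true;
		socket_limit_key++;
	}
}

/* Behaviour before the patch: only the tcp reference is dropped. */
static void model_disarm_old(struct model_memcg *m)
{
	if (!m->tcp.activated)
		return;
	socket_limit_key--;
}

/* Behaviour after the patch: one reference is dropped per activated proto. */
static void model_disarm_new(struct model_memcg *m)
{
	if (m->tcp.activated)
		socket_limit_key--;
	if (m->udp.activated)
		socket_limit_key--;
	assert(socket_limit_key >= 0);
}
```

With both protos activated, model_disarm_old() leaves the count at 1, so
the key would stay enabled after the cgroup is gone, while
model_disarm_new() brings it back to 0.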

Fixes: bf083721b986e ("udp: Charge ingress buffers into cg memory")
Signed-off-by: Vladimir Davydov vdavy...@parallels.com
---
 mm/memcontrol.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e7c1cd3..17552cf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -584,9 +584,10 @@ EXPORT_SYMBOL(udp_proto_cgroup);
 
 static void disarm_sock_keys(struct mem_cgroup *memcg)
 {
-	if (!memcg_proto_activated(&memcg->tcp_mem.cg_proto))
-		return;
-	static_key_slow_dec(&memcg_socket_limit_enabled);
+	if (memcg_proto_activated(&memcg->tcp_mem.cg_proto))
+		static_key_slow_dec(&memcg_socket_limit_enabled);
+	if (memcg_proto_activated(&memcg->udp_mem.cg_proto))
+		static_key_slow_dec(&memcg_socket_limit_enabled);
 }
 #else
 static void disarm_sock_keys(struct mem_cgroup *memcg)


Re: [Devel] [PATCH RH7 0/4] ve: fix initialization and remove sysctl_fsync_enable

2015-06-29 Thread Konstantin Khorenko
Dima,

please review the changes to fsync_enable, ve_fsync_behavior and
odirect_enable, i.e. patches 1, 2 and 4.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 06/25/2015 03:56 PM, Pavel Tikhomirov wrote:
> Pavel Tikhomirov (4):
>    ve: remove sysctl_fsync_enable and use ve_fsync_behavior instead
>    ve: initialize fsync_enable also for non ve0 environment
>    ve: iptables: fix mask initialization and changing
>    ve: cgroup: initialize odirect_enable, features and _randomize_va_space
>
>   fs/sync.c           |  9
>   include/linux/fs.h  |  1 -
>   include/linux/ve.h  |  4
>   kernel/ve/Makefile  |  2 ++
>   kernel/ve/ve.c      | 66 -
>   kernel/ve/vecalls.c | 28 +--
>   mm/msync.c          |  2 +-
>   7 files changed, 53 insertions(+), 59 deletions(-)
>