Re: [Devel] [PATCH rh7 2/3] memcg: use CFTYPE_NOT_ON_ROOT for memory.low and memory.oom_guarantee

2015-05-26 Thread Kirill Tkhai
On Mon, 25/05/2015 at 17:05 +0300, Vladimir Davydov wrote:
> This is neater than checking if the root is passed to the write method
> and this is how it works upstream (for memory.low).
> 
> Signed-off-by: Vladimir Davydov 
> ---
>  mm/memcontrol.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 0ca9cdff8c83..144a2720b604 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5192,9 +5192,6 @@ static int mem_cgroup_low_write(struct cgroup *cont, struct cftype *cft,
>   unsigned long long val;
>   int ret;
>  
> - if (mem_cgroup_is_root(memcg))
> - return -EINVAL;
> -
>   ret = res_counter_memparse_write_strategy(buffer, &val);
>   if (ret)
>   return ret;
> @@ -5222,9 +5219,6 @@ static int mem_cgroup_oom_guarantee_write(struct cgroup *cont,

I can't find this function in memcontrol.c. Which series does this one go after?

>   unsigned long long val;
>   int ret;
>  
> - if (mem_cgroup_is_root(memcg))
> - return -EINVAL;
> -
>   ret = res_counter_memparse_write_strategy(buffer, &val);
>   if (ret)
>   return ret;
> @@ -6118,6 +6112,7 @@ static struct cftype mem_cgroup_files[] = {
>   },
>   {
>   .name = "low",
> + .flags = CFTYPE_NOT_ON_ROOT,
>   .write_string = mem_cgroup_low_write,
>   .read = mem_cgroup_low_read,
>   },
> @@ -6161,6 +6156,7 @@ static struct cftype mem_cgroup_files[] = {
>   },
>   {
>   .name = "oom_guarantee",
> + .flags = CFTYPE_NOT_ON_ROOT,
>   .write_string = mem_cgroup_oom_guarantee_write,
>   .read = mem_cgroup_oom_guarantee_read,
>   },


___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel
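
The idea behind the patch above can be illustrated outside the kernel with a minimal userspace sketch. The names `file_type`, `FILE_NOT_ON_ROOT` and `populate()` are hypothetical stand-ins for `struct cftype`, `CFTYPE_NOT_ON_ROOT` and the cgroup core's file creation code; the real kernel logic differs in detail. The point is only that the file table declares the restriction once, so no write handler needs its own root check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CFTYPE_NOT_ON_ROOT. */
#define FILE_NOT_ON_ROOT (1u << 0)

/* Hypothetical stand-in for struct cftype: a name plus flags. */
struct file_type {
	const char *name;
	unsigned int flags;
};

static const struct file_type files[] = {
	{ "usage_in_bytes", 0 },
	{ "low",            FILE_NOT_ON_ROOT },
	{ "oom_guarantee",  FILE_NOT_ON_ROOT },
};

/*
 * Count the files that would be created for a group. Files flagged
 * FILE_NOT_ON_ROOT are skipped for the root group, so their handlers
 * can never be reached with the root group at all.
 */
static size_t populate(bool is_root)
{
	size_t i, created = 0;

	for (i = 0; i < sizeof(files) / sizeof(files[0]); i++) {
		if (is_root && (files[i].flags & FILE_NOT_ON_ROOT))
			continue;
		created++;
	}
	return created;
}
```

In this sketch the root group gets one file and any other group gets all three, which mirrors why the explicit `mem_cgroup_is_root()` checks become unnecessary.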


Re: [Devel] [PATCH rh7 2/3] memcg: use CFTYPE_NOT_ON_ROOT for memory.low and memory.oom_guarantee

2015-05-26 Thread Vladimir Davydov
On Tue, May 26, 2015 at 01:36:35PM +0300, Kirill Tkhai wrote:
> On Mon, 25/05/2015 at 17:05 +0300, Vladimir Davydov wrote:
> > @@ -5222,9 +5219,6 @@ static int mem_cgroup_oom_guarantee_write(struct cgroup *cont,
> 
> I can't find this function in memcontrol.c. Which series does this one go
> after?

[PATCH rh7 0/3] memcg: implement oomguarpages


Re: [Devel] [PATCH rh7 2/3] memcg: use CFTYPE_NOT_ON_ROOT for memory.low and memory.oom_guarantee

2015-05-26 Thread Kirill Tkhai
On Tue, 26/05/2015 at 13:47 +0300, Vladimir Davydov wrote:
> On Tue, May 26, 2015 at 01:36:35PM +0300, Kirill Tkhai wrote:
> > On Mon, 25/05/2015 at 17:05 +0300, Vladimir Davydov wrote:
> > > @@ -5222,9 +5219,6 @@ static int mem_cgroup_oom_guarantee_write(struct cgroup *cont,
> > 
> > I can't find this function in memcontrol.c. Which series does this one go
> > after?
> 
> [PATCH rh7 0/3] memcg: implement oomguarpages

Ok, thanks.

For the whole series:

Reviewed-by: Kirill Tkhai 



[Devel] [PATCH rh7 1/2] net: Add rules for new {ip, ip6, x}table modules

2015-05-26 Thread Kirill Tkhai
Here are the modules that need extended permissions
(see module_payload_allowed() for details).

https://jira.sw.ru/browse/PSBM-33631

Signed-off-by: Kirill Tkhai 
---
 kernel/kmod.c |   13 +
 1 file changed, 13 insertions(+)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index b77bbc5..a213533 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -211,6 +211,7 @@ static struct {
{ "iptable_nat",VE_IP_NAT   },
{ "iptable_mangle", VE_IP_MANGLE},
{ "ip6table_filter",VE_IP_FILTER6   },
+   { "ip6table_nat",   VE_IP_NAT   },
{ "ip6table_mangle",VE_IP_MANGLE6   },
 
{ "xt_CONNMARK",VE_NF_CONNTRACK|VE_IP_CONNTRACK },
@@ -225,6 +226,8 @@ static struct {
{ "xt_state",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
{ "xt_socket",  VE_NF_CONNTRACK|VE_IP_CONNTRACK|
VE_IP_IPTABLES6 },
+   { "xt_connlabel",   VE_NF_CONNTRACK|VE_IP_CONNTRACK|
+   VE_IP_IPTABLES6 },
 
{ "ipt_CLUSTERIP",  VE_NF_CONNTRACK|VE_IP_CONNTRACK },
{ "ipt_CONNMARK",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
@@ -245,6 +248,9 @@ static struct {
VE_IP_NAT   },
{ "ipt_REDIRECT",   VE_NF_CONNTRACK|VE_IP_CONNTRACK|
VE_IP_NAT   },
+   { "ipt_connlabel",  VE_NF_CONNTRACK|VE_IP_CONNTRACK|
+   VE_IP_IPTABLES6 },
+   { "ipt_SYNPROXY",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
 
{ "ip6t_CONNMARK",  VE_NF_CONNTRACK|VE_IP_CONNTRACK },
{ "ip6t_CONNSECMARK",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
@@ -258,6 +264,13 @@ static struct {
{ "ip6t_state", VE_NF_CONNTRACK|VE_IP_CONNTRACK },
{ "ip6t_socket",VE_NF_CONNTRACK|VE_IP_CONNTRACK|
VE_IP_IPTABLES6 },
+   { "ip6t_MASQUERADE",VE_NF_CONNTRACK|VE_IP_CONNTRACK|
+   VE_IP_NAT|VE_IP_IPTABLES6   },
+   { "ip6t_connlabel", VE_NF_CONNTRACK|VE_IP_CONNTRACK|
+   VE_IP_IPTABLES6 },
+   { "ip6t_SYNPROXY",  VE_NF_CONNTRACK|VE_IP_CONNTRACK|
+   VE_IP_IPTABLES6 },
+
{ "nf-nat-ipv4",VE_NF_CONNTRACK|VE_IP_CONNTRACK|
VE_IP_NAT   },
{ "nf-nat", VE_NF_CONNTRACK|VE_IP_CONNTRACK|
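
The table above pairs each module name with the permission bits a container must hold before the module may be autoloaded. A minimal userspace sketch of that lookup follows. The `SKETCH_*` bit values, the shortened rule list and the helper names are assumptions made for illustration; the real `VE_*` masks and `module_payload_allowed()` live in the kernel tree and may behave differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit values; the real VE_* masks are defined in the kernel. */
#define SKETCH_NAT        (1ull << 0)
#define SKETCH_CONNTRACK  (1ull << 1)
#define SKETCH_IPTABLES6  (1ull << 2)

struct module_rule {
	const char *name;
	uint64_t required;	/* every bit here must be permitted */
};

/* A shortened, illustrative subset of the entries added by the patch. */
static const struct module_rule rules[] = {
	{ "ip6table_nat",    SKETCH_NAT },
	{ "ipt_SYNPROXY",    SKETCH_CONNTRACK },
	{ "ip6t_MASQUERADE", SKETCH_CONNTRACK | SKETCH_NAT | SKETCH_IPTABLES6 },
};

/* True only if every bit in 'required' is also set in 'permitted'. */
static bool mask_allows(uint64_t permitted, uint64_t required)
{
	return (permitted & required) == required;
}

/* Look the module up by name; unknown modules are refused. */
static bool module_allowed(const char *module, uint64_t permitted)
{
	size_t i;

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
		if (!strcmp(rules[i].name, module))
			return mask_allows(permitted, rules[i].required);
	}
	return false;
}
```

With this shape, a container that only holds the NAT bit could load `ip6table_nat` but not `ip6t_MASQUERADE`, which also requires the conntrack and IPv6 bits.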



[Devel] [PATCH rh7 2/2] net: Add rules for autoloading nf_tables

2015-05-26 Thread Kirill Tkhai
nf_tables is a new netfilter table. Add autoload permissions
like we have for {ip,ip6,x}tables.

https://jira.sw.ru/browse/PSBM-33631

Signed-off-by: Kirill Tkhai 
---
 kernel/kmod.c |   47 +++
 1 file changed, 47 insertions(+)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index a213533..04948ee 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -280,9 +280,52 @@ static struct {
{ "ip_conntrack",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
{ "nf_conntrack-10",VE_NF_CONNTRACK|VE_IP_CONNTRACK },
{ "nf_conntrack_ipv6",  VE_NF_CONNTRACK|VE_IP_CONNTRACK },
+
+   { "nft-set",VE_IP_IPTABLES  },
+   { "nft-afinfo-2",   VE_IP_IPTABLES  }, /* IPV4 */
+   { "nft-afinfo-3",   VE_IP_IPTABLES  }, /* ARP  */
+   { "nft-afinfo-10",  VE_IP_IPTABLES6 }, /* IPV6 */
+
+   { "nft-chain-2-nat",VE_IP_IPTABLES|VE_IP_NAT},
+   { "nft-chain-2-route",  VE_IP_IPTABLES  },
+
+   { "nft-chain-10-nat",   VE_IP_IPTABLES6|VE_IP_NAT   },
+   { "nft-chain-10-route", VE_IP_IPTABLES6 },
+
+   { "nft-expr-2-reject",  VE_IP_IPTABLES  },
+   { "nft-expr-10-reject", VE_IP_IPTABLES6 },
 };
 
 /*
+ *  Check if module named nft-expr-name is allowed.
+ *  We pass only tail name part to this function.
+ */
+static bool nft_expr_allowed(const char *name)
+{
+   u64 permitted = get_exec_env()->ipt_mask;
+
+   if (!name[0])
+   return false;
+
+   if (!strcmp(name, "ct"))
+   return mask_ipt_allow(permitted, VE_IP_CONNTRACK);
+
+   if (!strcmp(name, "nat"))
+   return mask_ipt_allow(permitted, VE_IP_NAT);
+
+   /*
+* We are interested in modules like nft-expr-xxx.
+* Expressions like nft-expr-xxx-yyy currently are
+* handled in ve0_am table. So expr does not contain a
+* minus sign.
+*/
+   if (!strchr(name, '-'))
+   return mask_ipt_allow(permitted, VE_IP_IPTABLES) |
+  mask_ipt_allow(permitted, VE_IP_IPTABLES6);
+   return false;
+}
+
+/*
  * module_payload_allowed - check if module functionality is allowed
  * to be used inside current virtual enviroment.
  *
@@ -323,6 +366,10 @@ bool module_payload_allowed(const char *module)
if (!strncmp("ebt_", module, 4))
return true;
 
+   /* The rest of nft- modules */
+   if (!strncmp("nft-expr-", module, 9))
+   return nft_expr_allowed(module + 9);
+
return false;
 }
 #endif /* CONFIG_VE_IPTABLES */
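
The prefix handling added by this patch can be tried out in userspace with a small sketch. It mirrors the structure of `nft_expr_allowed()` but is not the kernel code: the `SKETCH_*` bit values and the helper `has_all()` are assumptions, and the sketch uses a logical OR where the patch uses a bitwise one.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit values standing in for the kernel's VE_IP_* masks. */
#define SKETCH_CONNTRACK  (1ull << 0)
#define SKETCH_NAT        (1ull << 1)
#define SKETCH_IPTABLES   (1ull << 2)
#define SKETCH_IPTABLES6  (1ull << 3)

/* True only if every bit in 'required' is set in 'permitted'. */
static bool has_all(uint64_t permitted, uint64_t required)
{
	return (permitted & required) == required;
}

/* Decide whether a module named "nft-expr-<name>" may be loaded. */
static bool nft_module_allowed(const char *module, uint64_t permitted)
{
	static const char prefix[] = "nft-expr-";
	const char *name;

	if (strncmp(module, prefix, sizeof(prefix) - 1) != 0)
		return false;

	name = module + sizeof(prefix) - 1;	/* tail after the prefix */
	if (name[0] == '\0')
		return false;

	if (!strcmp(name, "ct"))
		return has_all(permitted, SKETCH_CONNTRACK);
	if (!strcmp(name, "nat"))
		return has_all(permitted, SKETCH_NAT);

	/* Names such as "2-reject" are left to the static table. */
	if (strchr(name, '-'))
		return false;

	return has_all(permitted, SKETCH_IPTABLES) ||
	       has_all(permitted, SKETCH_IPTABLES6);
}
```

Keeping the family-specific names (those containing a minus sign) in the table and handling only plain `nft-expr-xxx` names here avoids listing every generic expression module by hand.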



Re: [Devel] [PATCH rh7 1/2] net: Add rules for new {ip, ip6, x}table modules

2015-05-26 Thread Kirill Tkhai
Cyrill, please review the series.

On Tue, 26/05/2015 at 14:09 +0300, Kirill Tkhai wrote:
> Here are the modules that need extended permissions
> (see module_payload_allowed() for details).
> 
> https://jira.sw.ru/browse/PSBM-33631
> 
> Signed-off-by: Kirill Tkhai 
> ---
>  kernel/kmod.c |   13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/kernel/kmod.c b/kernel/kmod.c
> index b77bbc5..a213533 100644
> --- a/kernel/kmod.c
> +++ b/kernel/kmod.c
> @@ -211,6 +211,7 @@ static struct {
>   { "iptable_nat",VE_IP_NAT   },
>   { "iptable_mangle", VE_IP_MANGLE},
>   { "ip6table_filter",VE_IP_FILTER6   },
> + { "ip6table_nat",   VE_IP_NAT   },
>   { "ip6table_mangle",VE_IP_MANGLE6   },
>  
>   { "xt_CONNMARK",VE_NF_CONNTRACK|VE_IP_CONNTRACK },
> @@ -225,6 +226,8 @@ static struct {
>   { "xt_state",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
>   { "xt_socket",  VE_NF_CONNTRACK|VE_IP_CONNTRACK|
>   VE_IP_IPTABLES6 },
> + { "xt_connlabel",   VE_NF_CONNTRACK|VE_IP_CONNTRACK|
> + VE_IP_IPTABLES6 },
>  
>   { "ipt_CLUSTERIP",  VE_NF_CONNTRACK|VE_IP_CONNTRACK },
>   { "ipt_CONNMARK",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
> @@ -245,6 +248,9 @@ static struct {
>   VE_IP_NAT   },
>   { "ipt_REDIRECT",   VE_NF_CONNTRACK|VE_IP_CONNTRACK|
>   VE_IP_NAT   },
> + { "ipt_connlabel",  VE_NF_CONNTRACK|VE_IP_CONNTRACK|
> + VE_IP_IPTABLES6 },
> + { "ipt_SYNPROXY",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
>  
>   { "ip6t_CONNMARK",  VE_NF_CONNTRACK|VE_IP_CONNTRACK },
>   { "ip6t_CONNSECMARK",   VE_NF_CONNTRACK|VE_IP_CONNTRACK },
> @@ -258,6 +264,13 @@ static struct {
>   { "ip6t_state", VE_NF_CONNTRACK|VE_IP_CONNTRACK },
>   { "ip6t_socket",VE_NF_CONNTRACK|VE_IP_CONNTRACK|
>   VE_IP_IPTABLES6 },
> + { "ip6t_MASQUERADE",VE_NF_CONNTRACK|VE_IP_CONNTRACK|
> + VE_IP_NAT|VE_IP_IPTABLES6   },
> + { "ip6t_connlabel", VE_NF_CONNTRACK|VE_IP_CONNTRACK|
> + VE_IP_IPTABLES6 },
> + { "ip6t_SYNPROXY",  VE_NF_CONNTRACK|VE_IP_CONNTRACK|
> + VE_IP_IPTABLES6 },
> +
>   { "nf-nat-ipv4",VE_NF_CONNTRACK|VE_IP_CONNTRACK|
>   VE_IP_NAT   },
>   { "nf-nat", VE_NF_CONNTRACK|VE_IP_CONNTRACK|
> 




Re: [Devel] Running Debian 8 Jessie as OpenVZ Hostnode?

2015-05-26 Thread Konstantin Khorenko
On 05/23/2015 12:12 PM, Lope wrote:
> Hi,
> 
> I've not been successful trying to install OpenVZ on the current stable 
> version of Debian 8 Jessie.
> 
> 
> I was able to boot the OpenVZ stab108 kernel on Debian 8 Jessie with SysVinit.
> 
> However I could not start a container.
> First it gave an error about not finding a kernel module. So I modprobe'd it, 
> and then it gave a different error. It said something about the kernel being 
> really old and not supporting stuff (maybe it was ploop, can't remember). 
> Strange message because it was running an OpenVZ kernel.
> 
> For now I have to run OpenVZ on Debian OldStable 7.8 Wheezy, inside KVM.
> 
> Details here, perhaps we can continue the discussion on the forum?
> http://forum.openvz.org/index.php?t=msg&th=12907

Hi Lope,

Well, as we are currently working hard on Virtuozzo 7 (which has a RHEL7-based
(3.10-x) kernel), I do not think Debian 8 support on the Host for the current
OpenVZ, based on the RHEL6 (2.6.32-x) kernel, fits our focus.

But if you dig through all those issues with a Debian 8 Host + 2.6.32-x OpenVZ
kernel and send us appropriate patches, we'll be happy to review and apply them.


P.S. One more time, because it's important: the question above is about
Debian 8 on the Host Node with OpenVZ on top.
We encourage anybody who really needs such a configuration (or just wants to
explore it for fun) to investigate, fix and send patches.

A Container with Debian 8 inside is a completely different story - this _must_ 
work.
If it does not - please file bugs, we will handle them.


P.P.S. If you are unsure whether to start the investigation because of
possible dead ends, just start it. If you run into a problem you don't
understand, we can point you to what to check or fix next.

Hope that helps.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team


Re: [Devel] Any plans about overlayfs support for Docker?

2015-05-26 Thread Konstantin Khorenko
On 05/24/2015 02:24 PM, Pavel Odintsov wrote:
> Hello, folks!
> 
> I'm so inspired by Docker support inside containers! It works perfectly.
> 
> But I can't find any articles about OverlayFS support there.
> 
> Do you have any plans for OverlayFS?
> http://blog.cloud66.com/docker-with-overlayfs-first-impression/
> 
> It's a must-have feature for Docker, which is pretty useless without it :(

Hi Pavel,

Yes, currently Docker running inside an OpenVZ (and Virtuozzo) Container can
use only the "vfs" graph driver.
And yes, we know that the "vfs" driver consumes a lot of disk space and that
copying all the data on Docker CT creation makes this operation quite slow.

So what are we going to do about that?
In fact, the only alternative that is stable enough nowadays is "devicemapper",
but we cannot just let Docker use it inside an OpenVZ CT: if we allow the
OpenVZ CT owner to write to the image and later mount it, the owner can easily
write garbage and crash the kernel on a mount attempt.

To avoid this we are working on another simple graph driver, "proxy".
Basically it will redirect all requests from inside the OpenVZ Container to a
daemon running on the Host, which will prepare the disk for the Docker
Container from outside.

Once we have a prototype of it, we'll push it to Docker upstream.

Returning to the question about OverlayFS:
* it's too complex to become bug-free in the near future
* it appeared quite recently => porting it to 2.6.32-x may be a pain

=> we don't plan to add OverlayFS support to RHEL6-based (2.6.32-x) OpenVZ
kernels.


On the other hand, it's quite possible that OverlayFS will become good and
stable enough, and we'll eventually add it to the Virtuozzo 7 (RHEL7-based)
kernel one day.

Hope that helps.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team


Re: [Devel] [PATCH rh7 0/3] memcg: implement oomguarpages

2015-05-26 Thread Konstantin Khorenko
Kirill, please review.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 05/21/2015 12:50 PM, Vladimir Davydov wrote:
> This patch set adds a memory.oom_guarantee file to the memory cgroup, which
> allows protecting a memory cgroup from the OOM killer. It works as follows:
> the OOM killer first selects from processes in cgroups that are above their
> OOM guarantee, and only if there are none does it switch to scanning
> processes from all cgroups. This behavior is similar to UB_OOMGUARPAGES.
> 
> It also adds OOM kills counter to each memory cgroup and synchronizes
> beancounters' UB_OOMGUARPAGES resource with oom_guarantee/oom_kill_cnt
> obtained from mem_cgroup.
> 
> Related to https://jira.sw.ru/browse/PSBM-20089
> 
> Vladimir Davydov (3):
>   memcg: add oom_guarantee
>   memcg: count oom kills
>   memcg: sync UB_OOMGUARPAGES
> 
>  include/linux/memcontrol.h |   13 
>  include/linux/oom.h|2 +-
>  mm/memcontrol.c|  141 
> ++--
>  mm/oom_kill.c  |   15 -
>  4 files changed, 164 insertions(+), 7 deletions(-)
> 
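
The selection order described in the cover letter above can be sketched in plain C. This is only an illustration under stated assumptions: `struct group`, `struct task`, the `badness` field and both functions are hypothetical, and the actual patches hook into the kernel's OOM killer rather than scanning a flat array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a memory cgroup with an OOM guarantee. */
struct group {
	unsigned long usage;
	unsigned long oom_guarantee;
};

/* Hypothetical model of a process and its OOM score. */
struct task {
	const struct group *grp;
	unsigned long badness;
};

/*
 * Return the task with the highest badness. When only_over is set,
 * tasks whose group is at or below its guarantee are ignored.
 */
static const struct task *pick(const struct task *tasks, size_t n,
			       bool only_over)
{
	const struct task *victim = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct group *g = tasks[i].grp;

		if (only_over && g->usage <= g->oom_guarantee)
			continue;
		if (!victim || tasks[i].badness > victim->badness)
			victim = &tasks[i];
	}
	return victim;
}

/* First pass: groups above their guarantee. Second pass: everyone. */
static const struct task *select_victim(const struct task *tasks, size_t n)
{
	const struct task *victim = pick(tasks, n, true);

	return victim ? victim : pick(tasks, n, false);
}
```

In this model a task in a group above its guarantee is chosen even when a protected task has a higher score; the protected task only becomes eligible once no group exceeds its guarantee.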


[Devel] [PATCH RHEL7 COMMIT] bc/mm: zap page->{slab_ubs,kmem_ub}

2015-05-26 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.5
-->
commit 3e695c5023b8e1a6f2ac61a1e94ffcbd7b897fa3
Author: Vladimir Davydov 
Date:   Tue May 26 18:00:24 2015 +0400

bc/mm: zap page->{slab_ubs,kmem_ub}

This is a leftover from UBC kmem accounting, which is now handled by
memcg.

Signed-off-by: Vladimir Davydov 
---
 include/linux/mm_types.h | 12 
 1 file changed, 12 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5ddfa80..fd501ae 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -187,12 +187,6 @@ struct page {
 #ifdef LAST_CPUPID_NOT_IN_PAGE_FLAGS
int _last_cpupid;
 #endif
-   union {
-#ifdef CONFIG_BEANCOUNTERS
-   struct user_beancounter *kmem_ub;
-   struct user_beancounter **slub_ubs;
-#endif
-   };
 }
 /*
  * The struct page can be forced to be double word aligned so that atomic ops
@@ -212,12 +206,6 @@ struct page_frag {
__u16 offset;
__u16 size;
 #endif
-   union {
-#ifdef CONFIG_BEANCOUNTERS
-   struct user_beancounter *kmem_ub;
-   struct user_beancounter **slub_ubs;
-#endif
-   };
 };
 
 typedef unsigned long __nocast vm_flags_t;


[Devel] [PATCH RHEL7 COMMIT] mm/shmem: unexport shmem_file_operations

2015-05-26 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.5
-->
commit 25d45a76a884735ac2530bfe4d027ddfaddd1a9b
Author: Vladimir Davydov 
Date:   Tue May 26 18:00:31 2015 +0400

mm/shmem: unexport shmem_file_operations

It is exported in RH6, because it is required by CPT.
No need to export it in RH7.

Signed-off-by: Vladimir Davydov 
---
 mm/shmem.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index d09a230..d35bd1c 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2730,7 +2730,6 @@ static const struct file_operations shmem_file_operations = {
.fallocate  = shmem_fallocate,
 #endif
 };
-EXPORT_SYMBOL(shmem_file_operations);
 
 static const struct inode_operations shmem_inode_operations = {
.setattr= shmem_setattr,


[Devel] [PATCH RHEL7 COMMIT] ub: zap ub_reclaim_rate_limit

2015-05-26 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.5
-->
commit 0c76a2a4e0e69827c118d5859001167ee231b925
Author: Vladimir Davydov 
Date:   Tue May 26 18:46:19 2015 +0400

ub: zap ub_reclaim_rate_limit

This was used for throttling virtual swap in/out. We do not need it
anymore as we do not need virtual swap, because its functionality is now
implemented as a part of frontswap/tswap. If we want virtual swap in/out
throttling, we should add it to tswap.

Signed-off-by: Vladimir Davydov 
---
 include/bc/beancounter.h | 10 --
 kernel/bc/beancounter.c  | 29 -
 2 files changed, 39 deletions(-)

diff --git a/include/bc/beancounter.h b/include/bc/beancounter.h
index 8e384f8..ea05b11 100644
--- a/include/bc/beancounter.h
+++ b/include/bc/beancounter.h
@@ -137,11 +137,6 @@ struct user_beancounter {
atomic_long_t   wb_requests;
atomic_long_t   wb_sectors;
 
-   /* reclaim rate-limit */
-   spinlock_t  rl_lock;
-   unsignedrl_step;/* ns per page */
-   ktime_t rl_wall;/* wall time */
-
void*private_data2;
 
/* resources statistic and settings */
@@ -261,9 +256,6 @@ static inline void uncharge_beancounter(struct user_beancounter *ub,
int resource, unsigned long val) { }
 #define uncharge_beancounter_fast uncharge_beancounter
 
-static inline void ub_reclaim_rate_limit(struct user_beancounter *ub,
-int wait, unsigned count) { }
-
 #else /* CONFIG_BEANCOUNTERS */
 
 extern struct list_head ub_list_head;
@@ -510,8 +502,6 @@ int precharge_beancounter(struct user_beancounter *ub,
int resource, unsigned long val);
 void ub_precharge_snapshot(struct user_beancounter *ub, int *precharge);
 
-void ub_reclaim_rate_limit(struct user_beancounter *ub, int wait, unsigned count);
-
 #define UB_IOPRIO_MIN 0
 #define UB_IOPRIO_MAX 8
 
diff --git a/kernel/bc/beancounter.c b/kernel/bc/beancounter.c
index f17eb9e..f08402e 100644
--- a/kernel/bc/beancounter.c
+++ b/kernel/bc/beancounter.c
@@ -883,28 +883,6 @@ out:
 }
 EXPORT_SYMBOL(precharge_beancounter);
 
-void ub_reclaim_rate_limit(struct user_beancounter *ub, int wait, unsigned count)
-{
-   ktime_t wall;
-   u64 step;
-
-   if (!ub->rl_step)
-   return;
-
-   spin_lock(&ub->rl_lock);
-   step = (u64)ub->rl_step * count;
-   wall = ktime_add_ns(ktime_get(), step);
-   if (wall.tv64 < ub->rl_wall.tv64)
-   wall = ktime_add_ns(ub->rl_wall, step);
-   ub->rl_wall = wall;
-   spin_unlock(&ub->rl_lock);
-
-   if (wait && get_exec_ub() == ub && !test_thread_flag(TIF_MEMDIE)) {
-   set_current_state(TASK_KILLABLE);
-   schedule_hrtimeout(&wall, HRTIMER_MODE_ABS);
-   }
-}
-
 /*
  * Initialization
  *
@@ -924,8 +902,6 @@ static void init_beancounter_struct(struct user_beancounter *ub)
spin_lock_init(&ub->ub_lock);
INIT_LIST_HEAD(&ub->ub_tcp_sk_list);
INIT_LIST_HEAD(&ub->ub_other_sk_list);
-   spin_lock_init(&ub->rl_lock);
-   ub->rl_wall.tv64 = LLONG_MIN;
 }
 
 static void init_beancounter_nolimits(struct user_beancounter *ub)
@@ -944,9 +920,6 @@ static void init_beancounter_nolimits(struct user_beancounter *ub)
/* FIXME: set unlimited rate? */
ub->ub_ratelimit.burst = 4;
ub->ub_ratelimit.interval = 300*HZ;
-
-   if (ub != get_ub0())
-   ub->rl_step = NSEC_PER_SEC / 25600; /* 100 Mb/s */
 }
 
 static void init_beancounter_syslimits(struct user_beancounter *ub)
@@ -980,8 +953,6 @@ static void init_beancounter_syslimits(struct user_beancounter *ub)
 
ub->ub_ratelimit.burst = 4;
ub->ub_ratelimit.interval = 300*HZ;
-
-   ub->rl_step = NSEC_PER_SEC / 25600; /* 100 Mb/s */
 }
 
 static DEFINE_PER_CPU(struct ub_percpu_struct, ub0_percpu);
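
For readers who want to see what the removed throttle computed, here is a userspace sketch of the deadline arithmetic. It follows the shape of the deleted `ub_reclaim_rate_limit()` but uses plain 64-bit nanosecond integers instead of `ktime_t`, omits the locking and the sleep, and returns 0 when the limit is disabled; the struct and function names are hypothetical. Assuming 4 KiB pages, the default step of `NSEC_PER_SEC / 25600` ns per page corresponds to 25600 pages per second, i.e. about 100 MiB/s.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical userspace mirror of the rl_step/rl_wall fields. */
struct rate_limit {
	int64_t step_ns;	/* ns charged per page, 0 disables the limit */
	int64_t wall_ns;	/* deadline computed by the previous charge */
};

/*
 * Charge 'count' pages at time 'now_ns' and return the deadline the
 * caller would sleep until (0 in this sketch when the limit is off).
 */
static int64_t rate_limit_charge(struct rate_limit *rl, int64_t now_ns,
				 unsigned count)
{
	int64_t step, wall;

	if (!rl->step_ns)
		return 0;

	step = rl->step_ns * count;
	wall = now_ns + step;
	/* Still behind an earlier deadline: queue up after it instead. */
	if (wall < rl->wall_ns)
		wall = rl->wall_ns + step;
	rl->wall_ns = wall;
	return wall;
}
```

The integer division gives a step of 39062 ns per page, so charging 10 pages pushes the deadline roughly 0.39 ms into the future.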


Re: [Devel] [PATCH rh7 1/3] memcg: add function to get container's ram size

2015-05-26 Thread Konstantin Khorenko
Kirill, please review the patch set.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 05/21/2015 06:27 PM, Vladimir Davydov wrote:
> Sometimes we need to get the ram size of the container the current
> process belongs to and we cannot open the memory cgroup by name as we
> usually do (e.g. see ub_dirty_limits). This patch adds a function for
> this purpose, mem_cgroup_ram_pages.
> 
> In this function we implicitly assume that each container lives in its
> own top level memory cgroup. If it is changed (e.g. we move all
> containers to /machine.slice), then we must rework this function (as
> well as all the code in beancounters that gets stats from memcg). One
> way to do that is to allow userspace to assign a memory cgroup to a ve or
> beancounter cgroup and get the memory cgroup of a container from
> get_exec_env() or get_exec_ub().
> 
> Signed-off-by: Vladimir Davydov 
> ---
>  include/linux/memcontrol.h |6 ++
>  mm/memcontrol.c|   31 +++
>  2 files changed, 37 insertions(+)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 5393f5f3b7d5..1382e4939a21 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -127,6 +127,7 @@ extern void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
>   struct task_struct *p);
>  extern void mem_cgroup_replace_page_cache(struct page *oldpage,
>   struct page *newpage);
> +unsigned long mem_cgroup_ram_pages(void);
>  
>  #ifdef CONFIG_MEMCG_SWAP
>  extern int do_swap_account;
> @@ -400,6 +401,11 @@ static inline void mem_cgroup_replace_page_cache(struct page *oldpage,
>   struct page *newpage)
>  {
>  }
> +
> +static inline unsigned long mem_cgroup_ram_pages(void)
> +{
> + return ULONG_MAX;
> +}
>  #endif /* CONFIG_MEMCG */
>  
>  #if !defined(CONFIG_MEMCG) || !defined(CONFIG_DEBUG_VM)
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 6ef83fbd1a58..d8f9b5561222 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1620,6 +1620,37 @@ void mem_cgroup_note_oom_kill(struct mem_cgroup *root_memcg,
>   css_put(&memcg_to_put->css);
>  }
>  
> +/*
> + * XXX: This function returns the limit of the topmost memory cgroup the
> + * current process belongs to (ULONG_MAX if unlimited). It is used to get the
> + * number of memory pages available to a Virtuozzo container. Here we
> + * implicitly assume that each container lives in its own top level memory
> + * cgroup. If it is changed, this function must be reworked. E.g. we could
> + * assign a memory cgroup to each ve or beancounter cgroup and get the memory
> + * cgroup of a container from get_exec_env() or get_exec_ub().
> + */
> +unsigned long mem_cgroup_ram_pages(void)
> +{
> + struct mem_cgroup *memcg_to_put, *memcg, *parent;
> + unsigned long long limit = RESOURCE_MAX;
> +
> + memcg_to_put = memcg = try_get_mem_cgroup_from_mm(current->mm);
> + if (!memcg || memcg == root_mem_cgroup)
> + goto out;
> +
> + while ((parent = parent_mem_cgroup(memcg)) &&
> +parent != root_mem_cgroup)
> + memcg = parent;
> +
> + limit = res_counter_read_u64(&memcg->res, RES_LIMIT);
> +out:
> + if (memcg_to_put)
> + css_put(&memcg_to_put->css);
> + if (limit == RESOURCE_MAX)
> + return ULONG_MAX;
> + return min_t(unsigned long long, ULONG_MAX, limit >> PAGE_SHIFT);
> +}
> +
>  #define mem_cgroup_from_res_counter(counter, member) \
>   container_of(counter, struct mem_cgroup, member)
>  
> 
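
The walk performed by `mem_cgroup_ram_pages()` in the quoted patch can be modelled in userspace as follows. This is a sketch under assumptions: `struct memcg` here is a toy type with only a parent pointer and a byte limit, `SKETCH_LIMIT_MAX` stands in for the kernel's `RESOURCE_MAX` sentinel (its exact value is not taken from the source), and a 4 KiB page size is assumed. Reference counting is left out entirely.

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12		/* assumes 4 KiB pages */
#define SKETCH_LIMIT_MAX  ULLONG_MAX	/* stand-in for RESOURCE_MAX */

/* Toy model of a memory cgroup: parent link plus limit in bytes. */
struct memcg {
	struct memcg *parent;		/* NULL for the root group */
	unsigned long long limit;
};

/*
 * Return the limit, in pages, of the topmost non-root ancestor of
 * 'cg', or ULONG_MAX if there is no such group or it is unlimited.
 */
static unsigned long ram_pages(const struct memcg *cg)
{
	unsigned long long pages;

	if (!cg || !cg->parent)		/* no group, or the root itself */
		return ULONG_MAX;

	while (cg->parent->parent)	/* climb until the parent is root */
		cg = cg->parent;

	if (cg->limit == SKETCH_LIMIT_MAX)
		return ULONG_MAX;

	pages = cg->limit >> SKETCH_PAGE_SHIFT;
	return pages > ULONG_MAX ? ULONG_MAX : (unsigned long)pages;
}
```

A process in a nested group therefore reports the limit of its container's top-level group, which is exactly the assumption the commit message warns about: if containers stop living directly under the root, the walk picks the wrong group.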


[Devel] [PATCH RHEL7 COMMIT] bc/vmalloc: zap ub_vmalloc

2015-05-26 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.5
-->
commit a850b25e2410426bf4a98d0e5893553d20e17586
Author: Vladimir Davydov 
Date:   Tue May 26 18:52:18 2015 +0400

bc/vmalloc: zap ub_vmalloc

This is a leftover from RH6 where some vmalloc allocations are charged
to UBC. This code is now obsolete - all vmalloc allocations should be
charged to kmemcg (this is for future work), so this patch kills it.

[It reverts vmalloc-related pieces of commit 1da9426dc5c49]

Signed-off-by: Vladimir Davydov 
---
 arch/x86/kernel/ldt.c  |  2 +-
 fs/file.c  |  2 +-
 include/linux/vmalloc.h|  4 
 ipc/util.c |  2 +-
 kernel/fairsched.c |  2 +-
 mm/vmalloc.c   | 42 ++
 net/ipv4/netfilter/ip_tables.c |  2 +-
 net/netfilter/x_tables.c   |  2 +-
 8 files changed, 12 insertions(+), 46 deletions(-)

diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index 0896329..b654ee4 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -42,7 +42,7 @@ static int alloc_ldt(mm_context_t *pc, int mincount, int reload)
mincount = (mincount + (PAGE_SIZE / LDT_ENTRY_SIZE - 1)) &
(~(PAGE_SIZE / LDT_ENTRY_SIZE - 1));
if (mincount * LDT_ENTRY_SIZE > PAGE_SIZE)
-   newldt = ub_vmalloc(mincount * LDT_ENTRY_SIZE);
+   newldt = vmalloc(mincount * LDT_ENTRY_SIZE);
else
newldt = (void *)__get_free_page(GFP_KERNEL);
 
diff --git a/fs/file.c b/fs/file.c
index 7f5e91e..742 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -38,7 +38,7 @@ static void *alloc_fdmem(size_t size)
if (data != NULL)
return data;
}
-   return ub_vmalloc(size);
+   return vmalloc(size);
 }
 
 static void free_fdmem(void *ptr)
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index a97f319..dd0a2c8 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -66,17 +66,13 @@ static inline void vmalloc_init(void)
 
 extern void *vmalloc(unsigned long size);
 extern void *vzalloc(unsigned long size);
-extern void *ub_vmalloc(unsigned long size);
 extern void *vmalloc_user(unsigned long size);
 extern void *vmalloc_node(unsigned long size, int node);
 extern void *vzalloc_node(unsigned long size, int node);
-extern void *ub_vmalloc_node(unsigned long size, int node);
 extern void *vmalloc_exec(unsigned long size);
 extern void *vmalloc_32(unsigned long size);
 extern void *vmalloc_32_user(unsigned long size);
 extern void *__vmalloc(unsigned long size, gfp_t gfp_mask, pgprot_t prot);
-extern void *vmalloc_best(unsigned long size);
-extern void *ub_vmalloc_best(unsigned long size);
 extern void *__vmalloc_node_range(unsigned long size, unsigned long align,
unsigned long start, unsigned long end, gfp_t gfp_mask,
pgprot_t prot, int node, const void *caller);
diff --git a/ipc/util.c b/ipc/util.c
index 6539b0e..721a9e0 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -466,7 +466,7 @@ void *ipc_alloc(int size)
 {
void *out;
if(size > PAGE_SIZE)
-   out = ub_vmalloc(size);
+   out = vmalloc(size);
else
out = kmalloc(size, GFP_KERNEL);
return out;
diff --git a/kernel/fairsched.c b/kernel/fairsched.c
index 0d0fa5c..2fd39cd 100644
--- a/kernel/fairsched.c
+++ b/kernel/fairsched.c
@@ -456,7 +456,7 @@ static struct fairsched_dump *fairsched_do_dump(int compat)
 
nr_nodes = ve_is_super(get_exec_env()) ? nr_nodes + 16 : 1;
 
-   dump = ub_vmalloc(sizeof(*dump) + nr_nodes * sizeof(dump->nodes[0]));
+   dump = vmalloc(sizeof(*dump) + nr_nodes * sizeof(dump->nodes[0]));
if (dump == NULL)
goto out;
 
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ac32dca..7fbc92a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -32,15 +32,13 @@
 #include 
 #include 
 
-#include 
-
 struct vfree_deferred {
struct llist_head list;
struct work_struct wq;
 };
 static DEFINE_PER_CPU(struct vfree_deferred, vfree_deferred);
 
-static void __vunmap(const void *, int, int);
+static void __vunmap(const void *, int);
 
 static void free_work(struct work_struct *w)
 {
@@ -49,7 +47,7 @@ static void free_work(struct work_struct *w)
while (llnode) {
void *p = llnode;
llnode = llist_next(llnode);
-   __vunmap(p, 1, 0);
+   __vunmap(p, 1);
}
 }
 
@@ -1471,7 +1469,7 @@ struct vm_struct *remove_vm_area(const void *addr)
return NULL;
 }
 
-static void __vunmap(const void *addr, int deallocate_pages, int uncharge)
+static void __vunmap(const void *addr, int deallocate_pages)
 {
struct vm_struct *area;
 
@@ -1540,7 +1538,7 @@ void vfree(const

[Devel] [PATCH RHEL7 COMMIT] cpt: Unexport everything CPT-related

2015-05-26 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.5
-->
commit 8304c77c69430cd34af8c2a349c90feaa4b3ab08
Author: Vladimir Davydov 
Date:   Tue May 26 18:52:26 2015 +0400

cpt: Unexport everything CPT-related

No more CPT in kernel.

Hopefully, I haven't missed anything.

[This patch reverts parts of commit 1da9426dc5c49]

khorenko@: online migration has not been dropped,
it will be done via CRIU.

Signed-off-by: Vladimir Davydov 
---
 arch/x86/include/asm/tlbflush.h   |  1 -
 arch/x86/kernel/ldt.c |  1 -
 arch/x86/mm/tlb.c |  1 -
 arch/x86/vdso/vdso32-setup.c  |  3 +--
 fs/aio.c  |  7 ++
 fs/file.c |  5 +---
 fs/filesystems.c  |  1 -
 fs/locks.c|  1 -
 fs/namei.c|  3 ---
 fs/open.c |  2 --
 fs/pipe.c |  1 -
 fs/select.c   |  3 +--
 fs/splice.c   |  1 -
 fs/super.c|  3 ---
 include/linux/aio.h   |  4 
 include/linux/futex.h |  1 -
 include/linux/huge_mm.h   |  1 -
 include/linux/mm.h|  5 +---
 include/linux/poll.h  |  1 -
 include/linux/sched.h |  2 --
 include/net/addrconf.h|  6 -
 include/net/tcp.h |  5 
 ipc/util.c|  1 -
 kernel/exit.c |  6 +
 kernel/fork.c |  5 +---
 kernel/futex.c|  6 ++---
 kernel/hrtimer.c  |  1 -
 kernel/posix-cpu-timers.c |  1 -
 kernel/sched/core.c   |  1 -
 kernel/signal.c   |  7 ++
 kernel/user.c |  2 --
 mm/fremap.c   |  1 -
 mm/init-mm.c  |  1 -
 mm/memory.c   | 48 ---
 mm/mlock.c|  2 --
 mm/mmap.c |  4 +---
 mm/mprotect.c |  1 -
 mm/nommu.c|  1 -
 mm/rmap.c |  4 
 mm/swapfile.c |  4 
 net/ipv4/devinet.c|  9 +++-
 net/ipv4/tcp_ipv4.c   |  4 +---
 net/ipv6/addrconf.c   |  3 +--
 net/ipv6/mcast.c  |  1 -
 net/ipv6/tcp_ipv6.c   |  9 +++-
 security/integrity/ima/ima_main.c |  1 -
 46 files changed, 32 insertions(+), 149 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index b6914e0..50a7fc0 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -100,7 +100,6 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
if (vma->vm_mm == current->active_mm)
__flush_tlb_one(addr);
 }
-EXPORT_SYMBOL(flush_tlb_page);
 
 static inline void flush_tlb_range(struct vm_area_struct *vma,
   unsigned long start, unsigned long end)
diff --git a/arch/x86/kernel/ldt.c b/arch/x86/kernel/ldt.c
index b654ee4..79aa97d 100644
--- a/arch/x86/kernel/ldt.c
+++ b/arch/x86/kernel/ldt.c
@@ -120,7 +120,6 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
}
return retval;
 }
-EXPORT_SYMBOL(init_new_context);
 
 /*
  * No need to lock the MM as we are the last user
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 9aab73e..459590c 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -227,7 +227,6 @@ void flush_tlb_page(struct vm_area_struct *vma, unsigned long start)
 
preempt_enable();
 }
-EXPORT_SYMBOL(flush_tlb_page);
 
 static void do_flush_tlb_all(void *info)
 {
diff --git a/arch/x86/vdso/vdso32-setup.c b/arch/x86/vdso/vdso32-setup.c
index 9775d58..0faad64 100644
--- a/arch/x86/vdso/vdso32-setup.c
+++ b/arch/x86/vdso/vdso32-setup.c
@@ -193,8 +193,7 @@ static __init void relocate_vdso(Elf32_Ehdr *ehdr)
}
 }
 
-struct page *vdso32_pages[1];
-EXPORT_SYMBOL(vdso32_pages);
+static struct page *vdso32_pages[1];
 
 #ifdef CONFIG_X86_64
 
diff --git a/fs/aio.c b/fs/aio.c
index 04fc50e..715eb75 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -115,16 +115,13 @@ struct kioctx {
 };
 
 /*-- sysctl variables*/
-DEFINE_SPINLOCK(aio_nr_lock);
-EXPORT_SYMBOL(aio_nr_lock);
+static DEFINE_SPINLOCK(aio_nr_lock);
 unsigned long aio_nr;  /* current system wide number of aio requests */
-EXPORT_SYMBOL(aio_nr);
unsigned long aio_max_nr = 0x1; /* system wide maximum number of aio requests */
 /*end sysctl variables---*/
 
 static struct kmem_cache   *kiocb_cachep;
-struct kmem_cache  *kioctx_cachep;
-EXPORT_SYMBOL(kioctx_cachep);
+static struct kmem_cache   *kioctx_cachep;
 
 static struct vfsmount *aio_mnt;
 
diff 

Re: [Devel] [PATCH rh7 0/2] Implement UB_DCACHESIZE stats

2015-05-26 Thread Konstantin Khorenko
Kirill, please review the patch set.

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 05/22/2015 02:08 PM, Vladimir Davydov wrote:
> See individual patches for more details.
> 
> Related to https://jira.sw.ru/browse/PSBM-20089
> 
> Vladimir Davydov (2):
>   memcg: account dcache size
>   memcg: sync UB_DCACHESIZE
> 
>  include/linux/memcontrol.h |3 ---
>  kernel/bc/beancounter.c|3 ++-
>  kernel/bc/vm_pages.c   |1 -
>  mm/memcontrol.c|   54 +---
>  mm/slab.h  |8 ---
>  5 files changed, 58 insertions(+), 11 deletions(-)
> 
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


Re: [Devel] [PATCH rh7 1/2] net: Add rules for new {ip, ip6, x}table modules

2015-05-26 Thread Cyrill Gorcunov
On Tue, May 26, 2015 at 02:09:14PM +0300, Kirill Tkhai wrote:
> Here are the modules that need extended permissions
> (see module_payload_allowed() for details).
> 
> https://jira.sw.ru/browse/PSBM-33631
> 
> Signed-off-by: Kirill Tkhai 
Reviewed-by: Cyrill Gorcunov 


Re: [Devel] [PATCH rh7 2/2] net: Add rules for autoloading nf_tables

2015-05-26 Thread Cyrill Gorcunov
On Tue, May 26, 2015 at 02:09:25PM +0300, Kirill Tkhai wrote:
> nf_tables is a new netfilter table. Add autoload permissions
> like we have for {ip,ip6,x}tables.
> 
> https://jira.sw.ru/browse/PSBM-33631
> 
> Signed-off-by: Kirill Tkhai 
Reviewed-by: Cyrill Gorcunov 


[Devel] [PATCH RHEL7 COMMIT] bc/mm: zap pte_ptrs and same_ub macros

2015-05-26 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.5
-->
commit 0fbf24f5927de7e7c9a31e9213ad9975f3aa2adc
Author: Vladimir Davydov 
Date:   Tue May 26 18:59:12 2015 +0400

bc/mm: zap pte_ptrs and same_ub macros

They were added by commit 1da9426dc5c49, which ported stuff from RH6.
They are not used, so zap them.

Signed-off-by: Vladimir Davydov 
---
 mm/memory.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 1214542..2f09839 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -917,13 +917,6 @@ out_set_pte:
return 0;
 }
 
-#define pte_ptrs(a)(PTRS_PER_PTE - ((a >> PAGE_SHIFT)&(PTRS_PER_PTE - 1)))
-#ifdef CONFIG_BEANCOUNTERS
-#define same_ub(mm1, mm2)  ((mm1)->mm_ub == (mm2)->mm_ub)
-#else
-#define same_ub(mm1, mm2)  1
-#endif
-
 int copy_pte_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
   pmd_t *dst_pmd, pmd_t *src_pmd, struct vm_area_struct *vma,
   unsigned long addr, unsigned long end)


[Devel] [patch rh7 0/2] Disable mount cgroups from inside of VE and mangle cgroup root paths

2015-05-26 Thread Cyrill Gorcunov
Please take a look, thanks.


[Devel] [patch rh7 2/2] cgroup: Mangle cgroups root from inside of VE view

2015-05-26 Thread Cyrill Gorcunov
We bind-mount cgroups for a container, so if, say, a container
has CTID=200, then the @cgroups and @mountinfo output will
contain /200 as the root. This makes Docker look up the
corresponding directory inside /sys/fs/cgroup/,
which of course is not present because the cgroups have been
bind-mounted from the node (note that we can't bind-mount into
/ here because it confuses the container's
systemd instance, which then gets stuck on boot).

Thus we simply mangle the root here, so that when one accesses
@cgroups or @mountinfo the kernel shows '/' instead of $ctid,
which makes both Docker and systemd happy.

https://jira.sw.ru/browse/PSBM-33757

Signed-off-by: Cyrill Gorcunov 
CC: Vladimir Davydov 
CC: Konstantin Khorenko 
CC: Pavel Emelyanov 
CC: Andrey Vagin 
---
 kernel/cgroup.c |   29 +
 1 file changed, 29 insertions(+)

Index: linux-pcs7.git/kernel/cgroup.c
===
--- linux-pcs7.git.orig/kernel/cgroup.c
+++ linux-pcs7.git/kernel/cgroup.c
@@ -1386,10 +1386,24 @@ static int cgroup_remount(struct super_b
return ret;
 }
 
+#ifdef CONFIG_VE
+int cgroup_show_path(struct seq_file *m, struct dentry *dentry)
+{
+   if (!ve_is_super(get_exec_env()))
+   seq_puts(m, "/");
+   else
+   seq_dentry(m, dentry, " \t\n\\");
+   return 0;
+}
+#endif
+
 static const struct super_operations cgroup_ops = {
.statfs = simple_statfs,
.drop_inode = generic_delete_inode,
.show_options = cgroup_show_options,
+#ifdef CONFIG_VE
+   .show_path = cgroup_show_path,
+#endif
.remount_fs = cgroup_remount,
 };
 
@@ -1807,6 +1821,21 @@ int cgroup_path(const struct cgroup *cgr
return 0;
}
 
+#ifdef CONFIG_VE
+   /*
+* Containers cgroups are bind-mounted from node
+* so they are like '/' from inside, thus we have
+* to mangle cgroup path output.
+*/
+   if (!ve_is_super(get_exec_env())) {
+   if (cgrp->parent && !cgrp->parent->parent) {
+   if (strlcpy(buf, "/", buflen) >= buflen)
+   return -ENAMETOOLONG;
+   return 0;
+   }
+   }
+#endif
+
start = buf + buflen - 1;
*start = '\0';
 



[Devel] [patch rh7 1/2] cgroup: mount -- Disable mounting from inside of VE context

2015-05-26 Thread Cyrill Gorcunov
Even mounting known cgroups (i.e. ones which are already known to the VE
and have been mounted by vzctl or any other tool for the container's sake)
is not as harmless as it might look. In particular, this introduces an
additional performance hit. Since we use the bind-mount
strategy to grant cgroups to a VE, we no longer need to mount them from
inside the VE and can simply disable that.

Signed-off-by: Cyrill Gorcunov 
CC: Vladimir Davydov 
CC: Konstantin Khorenko 
CC: Pavel Emelyanov 
CC: Andrey Vagin 
---
 kernel/cgroup.c |   18 +-
 1 file changed, 5 insertions(+), 13 deletions(-)

Index: linux-pcs7.git/kernel/cgroup.c
===
--- linux-pcs7.git.orig/kernel/cgroup.c
+++ linux-pcs7.git/kernel/cgroup.c
@@ -1572,6 +1572,11 @@ static struct dentry *cgroup_mount(struc
struct cgroupfs_root *new_root;
struct inode *inode;
 
+#ifdef CONFIG_VE
+   if (!ve_is_super(get_exec_env()) && !(flags & MS_KERNMOUNT))
+   return ERR_PTR(-EACCES);
+#endif
+
/* First find the desired set of subsystems */
if (!(flags & MS_KERNMOUNT)) {
mutex_lock(&cgroup_mutex);
@@ -1615,19 +1620,6 @@ static struct dentry *cgroup_mount(struc
int i;
struct css_set *cg;
 
-#ifdef CONFIG_VE
-   /*
-* We don't allow to mount new roots from inside
-* of container (but have to allow mounting existing
-* cgroups, because the VE restore procedure is
-* implemented from inside of container environment).
-*/
-   if (!ve_is_super(get_exec_env())) {
-   ret = -EACCES;
-   goto drop_new_super;
-   }
-#endif
-
BUG_ON(sb->s_root != NULL);
 
ret = cgroup_get_rootdir(sb);
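
The check that this patch moves to the top of cgroup_mount() reduces to
a small truth table. Below is a minimal user-space sketch of it under
stated assumptions: `sketch_cgroup_mount_check()` and
`SKETCH_MS_KERNMOUNT` are hypothetical names, and the `ve_is_super`
parameter stands for the result of `ve_is_super(get_exec_env())` in the
kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MS_KERNMOUNT mount flag. */
#define SKETCH_MS_KERNMOUNT (1UL << 22)

/*
 * Decide whether a cgroup mount request may proceed.
 * ve_is_super: true when the caller runs on the node,
 *              false when it runs inside a container (VE).
 * Returns 0 if the mount is allowed, -EACCES otherwise.
 */
int sketch_cgroup_mount_check(bool ve_is_super, unsigned long flags)
{
    /* Only kernel-internal mounts are allowed from inside a VE. */
    if (!ve_is_super && !(flags & SKETCH_MS_KERNMOUNT))
        return -EACCES;
    return 0;
}
```

In this sketch a mount issued from inside a container without the
kernel-mount flag is rejected, whether or not the hierarchy already
exists; the removed hunk only rejected the creation of new roots.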



[Devel] [PATCH RHEL7 COMMIT] Revert "scripts: Delete generated binary files from kernel tree"

2015-05-26 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at 
https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-123.1.2.vz7.5.6
-->
commit 9c427a87aa6083367978866348e753f886f9cf50
Author: Konstantin Khorenko 
Date:   Tue May 26 20:09:14 2015 +0400

Revert "scripts: Delete generated binary files from kernel tree"

This reverts commit 1ba90c8b2c0023526ad16ac61a661c3136adb691.

We create a cumulative Virtuozzo diff to be put into the
src rpm, and in kernel.spec we apply this cumulative patch
using the "patch" utility.
However, "patch" does not understand binary diffs generated by git.

=> let those files just stay in git until Red Hat removes them from
the tar archive.

Signed-off-by: Konstantin Khorenko 
---
 scripts/basic/fixdep | Bin 0 -> 13875 bytes
 scripts/kconfig/conf | Bin 0 -> 114694 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)

diff --git a/scripts/basic/fixdep b/scripts/basic/fixdep
new file mode 100755
index 000..2d8a408
Binary files /dev/null and b/scripts/basic/fixdep differ
diff --git a/scripts/kconfig/conf b/scripts/kconfig/conf
new file mode 100755
index 000..2b2a841
Binary files /dev/null and b/scripts/kconfig/conf differ
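
A build script could scan the cumulative diff for binary changes before
handing it to "patch". The following is only a sketch of such a
pre-check and is not part of the Virtuozzo tooling:
`sketch_is_binary_diff_marker()` is a hypothetical helper that
recognises the two marker lines git emits for binary content.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if one line of a diff marks a binary change:
 *  - "GIT binary patch" (emitted by "git diff --binary"), or
 *  - "Binary files A and B differ" (emitted when no binary delta
 *    is included in the diff).
 * The line may or may not end with a newline.
 */
bool sketch_is_binary_diff_marker(const char *line)
{
    static const char git_marker[] = "GIT binary patch";
    static const char prefix[] = "Binary files ";
    static const char suffix[] = " differ";
    size_t len = strlen(line);

    /* Ignore a trailing newline and carriage return. */
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        len--;

    if (len == strlen(git_marker) && strncmp(line, git_marker, len) == 0)
        return true;

    if (len < strlen(prefix) + strlen(suffix))
        return false;
    return strncmp(line, prefix, strlen(prefix)) == 0 &&
           strncmp(line + len - strlen(suffix), suffix, strlen(suffix)) == 0;
}
```

Calling the helper on each line of the diff and stopping at the first
match is enough to tell whether the cumulative patch contains files that
"patch" cannot apply.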


Re: [Devel] Running Debian 8 Jessie as OpenVZ Hostnode?

2015-05-26 Thread Kir Kolyshkin



On 05/23/2015 02:12 AM, Lope wrote:

Hi,

I've not been successful trying to install OpenVZ on the current 
stable version of Debian 8 Jessie.

I was able to boot the OpenVZ stab108 kernel on Debian 8 Jessie with 
SysVinit.

However I could not start a container.
First it gave an error about not finding a kernel module. So I 
modprobe'd it, and then it gave a different error. It said something 
about the kernel being really old and not supporting stuff (maybe it 
was ploop, can't remember). Strange message because it was running an 
OpenVZ kernel.

Did you run the /etc/init.d/vz script? It is supposed to load all the modules.

What was the exact error message? Perhaps you have a very old ploop library?

For now I have to run OpenVZ on Debian OldStable 7.8 Wheezy, inside KVM.

Details here, perhaps we can continue the discussion on the forum?
http://forum.openvz.org/index.php?t=msg&th=12907



