Cedric Le Goater wrote:
> Hello Kirill !
>
> Kirill Korotaev wrote:
>> Pierre,
>>
>> my point is that after you've added interface "set IPCID", you'll need
>> more and more for checkpointing:
>> - "create/setup conntrack" (o
e.
Remember recent story with SLUB and /proc/slabinfo?
Hope I made my argument more clear this time.
Thanks,
Kirill
Pierre Peiffer wrote:
>
> Kirill Korotaev wrote:
>> Why user space can need this API? for checkpointing only?
>
> I would say "at least for checkpointing"
Why user space can need this API? for checkpointing only?
Then I would not consider it for inclusion until it is clear how to implement
checkpointing.
As for me personally - I'm against exporting such APIs, since they are not
needed in real-life user space applications and maintaining it forever
I dislike this patch:
it's not scalable/efficient to traverse all the tasks
while we know the pid namespace we care about.
Kirill
Eric W. Biederman wrote:
> This patch implements task_in_pid_ns and uses it to limit cap_set_all
> and sys_kill(-1,) to only those tasks in the current pid namespace.
Can you please send namespace related patches to containers@ ML first
before sending them to Linus/Andrew?
Thanks,
Kirill
Eric W. Biederman wrote:
> This is my trivial patch to swat innumerable little bugs
> with a single blow.
>
> After some intensive review (my apologies for not having
> gott
Virtualization of sysv msg queues is incomplete:
msg_hdrs and msg_bytes variables visible from userspace are global.
Let's make them per-namespace.
Signed-Off-By: Alexey Kuznetsov <[EMAIL PROTECTED]>
Signed-Off-By: Kirill Korotaev <[EMAIL PROTECTED]>
---
include/linux/ipc.h |
Eric W. Biederman wrote:
> Greg KH <[EMAIL PROTECTED]> writes:
>
>>> Also fun is that the dev file implementation needs to be able to
>>> report different major:minor numbers based on which mount of
>>> sysfs we are dealing with.
>>
>>Um, no, that's not going to happen. /dev/sda will _always_
write_cr3(c | 0x100);
}
-- cut
Kirill
Arjan van de Ven wrote:
> On Tue, 02 Oct 2007 18:08:32 +0400
> Kirill Korotaev <[EMAIL PROTECTED]> wrote:
>
>
>>Some gcc versions (I checked at least 4.1.1 from RHEL5 & 4
s volatile which helps.
i686 already has most of these functions marked as volatile already.
I faced this bug myself in i686 arch code when did code
rearrangement in 2.6.18.
Signed-Off-By: Kirill Korotaev <[EMAIL PROTECTED]>
Acked-By: Pavel Emelianov <[EMAIL PROTECTED]>
---
asm-i386/system.h
Andi Kleen wrote:
>>Not everyone likes frame buffer
>
>
> You don't need the frame buffer; cards typically have text mode
> fonts up to 80x50. The node numbers vary, but you can find out yours
> with vga=ask
>
>
>>but even with it any OOPs in
>>network code which happens in softirq, io schedul
Masoud Sharbiani wrote:
> On 7/25/07, Kirill Korotaev <[EMAIL PROTECTED]> wrote:
>
>>plz don't enable it by default... :/
>>any user can spam syslog with these messages and if syslog is run as root
>>can take the whole diskspace...
>
>
>
> Yeah,
plz don't enable it by default... :/
any user can spam syslog with these messages and if syslog is run as root
can take the whole diskspace...
Thanks,
Kirill
Masoud Asgharifard Sharbiani wrote:
> Hello,
> This patch makes the i386 behave the same way that x86_64 does when a
> segfault happens. A
Andrew Morton wrote:
> On Mon, 16 Jul 2007 16:24:12 +0400
> Pavel Emelianov <[EMAIL PROTECTED]> wrote:
>
>
>>When user locks an ipc shmem segment with SHM_LOCK ctl and the
>>segment is already locked the shmem_lock() function returns 0.
>>After this the subsequent code leaks the existing user st
Look, until you have any numbers in hands it's impossible to say
which one is faster.
Please measure N d_alloc()'s on i686 and some other archs w/o string operations
and compare whether your patch improves something or not.
Kirill
rae l wrote:
> On 7/13/07, Kirill Korotaev <
This doesn't look worth zeroing half of the struct
when it is initialized to non-zeros then.
Denis Cheng wrote:
>>From 4d87e14b67890f06885a76b5792ca034de2e9d06 Mon Sep 17 00:00:00 2001
> From: Denis Cheng <[EMAIL PROTECTED]>
> Date: Thu, 12 Jul 2007 11:53:58 +0800
> Subject: [PATCH] replace kmem_c
Jan Kara wrote:
> Hello,
>
> On Mon 02-07-07 18:16:09, Kirill Korotaev wrote:
>
>>it looks like the following fix:
>>http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=43c3e6f5abdf6acac9b90c86bf03f995bf7d3d92
>>
>>was lo
Jan,
it looks like the following fix:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=43c3e6f5abdf6acac9b90c86bf03f995bf7d3d92
was lost after resurrecting of the spliced checkpointing list in this patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.
Balbir Singh wrote:
> Kirill Korotaev wrote:
>
>>Paul Menage wrote:
>>
>>>On 6/22/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>>>
>>>
>>>>The problem with input in bytes is that the user will have to ensure
>>>>that the
Paul Menage wrote:
> On 6/22/07, Balbir Singh <[EMAIL PROTECTED]> wrote:
>
>>The problem with input in bytes is that the user will have to ensure
>>that the input is
>>a multiple of page size, which implies that she would need to use the
>>calculator every time.
>>
>
>
> Having input in bytes s
Ingo Molnar wrote:
> i'd still like to hear back from Kirill & co whether this framework is
> flexible enough for their work (OpenVZ, etc.) too.
My IMHO is that so far the proposed group scheduler doesn't look ready/suitable.
We need to have a working SMP version before it will be clear
whether
Srivatsa Vaddagiri wrote:
> On Fri, May 25, 2007 at 05:05:16PM +0400, Kirill Korotaev wrote:
>
>>>That way the scheduler would first pick a "virtual CPU" to schedule, and
>>>then pick a user from that virtual CPU, and then a task from the user.
>>
>>
Ingo Molnar wrote:
> * Srivatsa Vaddagiri <[EMAIL PROTECTED]> wrote:
>
>
>>Can you repeat your tests with this patch pls? With the patch applied,
>>I am now getting the same split between nice 0 and nice 10 task as
>>CFS-v13 provides (90:10 as reported by top )
>>
>> 5418 guest 20 0 2464
Andrew Morton wrote:
> On Thu, 17 May 2007 23:20:12 +0530
> Balbir Singh <[EMAIL PROTECTED]> wrote:
>
>
>>A meaningful container size does not hamper performance. I am in the process
>>of getting more results (with varying container sizes). Please let me know
>>what you think of the results? Woul
Acked-By: Kirill Korotaev <[EMAIL PROTECTED]>
Vasily Averin wrote:
> Andrew,
>
> I've fixed a number of issues in i2o layer:
> [patch i2o 1/6] i2o_cfg_passthru cleanup (memory leak and infinite loop)
> [patch i2o 2/6] wrong memory access in i2o_block_device_lock()
>
Balbir Singh wrote:
> This patch is inspired by the discussion at http://lkml.org/lkml/2007/4/11/187
> and implements per container statistics as suggested by Andrew Morton
> in http://lkml.org/lkml/2007/4/11/263. The patch is on top of 2.6.21-mm1
> with Paul's containers v9 patches (forward ported
Roland,
can you please help with it?
current utrace state is far from being stable,
RHEL5 and -mm kernels can be quite easily crashed with some of the exploits
we collected so far.
Alexey can help you with any information needed - call traces, test cases,
but without your help we can't fix it all
for additional LOCK-ed operation on
>> altering this field - l3->list_lock is already taken
>> where needed.
>>
>>Made naming more descriptive according to Dave.
>>
>>Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]>
>>Signed-off-by: Kirill Koro
Paul Menage wrote:
> On 4/3/07, Serge E. Hallyn <[EMAIL PROTECTED]> wrote:
>
>>But frankly I don't know where we stand right now wrt the containers
>>patches. Do most people want to go with Vatsa's latest version moving
>>containers into nsproxy? Has any other development been going on?
>>Paul,
Andrew Morton wrote:
[...skip]
> The problem is memory reclaim. A number of schemes which have been
> proposed require a per-container page reclaim mechanism - basically a
> separate scanner.
>
> This is a huge, huge, huge problem. The present scanner has been under
> development for over a
Nick,
>>Accounting becomes easy if we have a container pointer in struct page.
>> This can form base ground for building controllers since any memory
>>related controller would be interested in tracking pages. However we
>>still want to evaluate if we can build them without bloating the
>>struct
Herbert,
>>Just curious why current vserver code kills arbitrary
>>task in container then?
>
>
> because it obviously lacks the finesse of OpenVZ code :)
>
> seriously, handling the OOM kills inside a container
> has never been a real world issue, as once you are
> really out of memory (and OOM
Eric,
>>>And misses every resource sharing opportunity in sight.
>>
>>that was my point too.
>>
>>
>>>Except for
>>>filtering the which pages are eligible for reclaim an RSS limit should
>>>not need to change the existing reclaim logic, and with things like the
>>>memory zones we have had that kin
>>So what to do when virtual physical limit is hit?
>>OOM-kill current task?
>
>
> when the RSS limit is hit, but there _are_ enough
> pages left on the physical system, there is no
> good reason to swap out the page at all
>
> - there is no benefit in doing so (performance
>wise, that is)
Andrew Morton wrote:
- shared mappings of 'shared' files (binaries
and libraries) to allow for reduced memory
footprint when N identical guests are running
>>>
>>>So, it sounds like this can be phrased as a requirement like:
>>>
>>> "Guests must be able to share pages."
>>>
>> - doesn't store the accounted value but
>> limit - accounted (i.e. the free resource)
>> - uses atomic_add_return()
>> - when negative, an error is returned and
>> the resource amount is added back
>>
>>changes to the limit have to adjust the 'current'
>>value too, but that is again simple
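
The counter scheme quoted above (store limit - accounted, charge with an add-and-return, add the amount back on a negative result, shift the value when the limit changes) can be sketched roughly as below. This is only an illustration under assumptions: it is userspace C11 using <stdatomic.h> as a stand-in for the kernel's atomic_add_return(), and every name in it (struct free_counter, free_counter_charge() and so on) is hypothetical, not taken from Linux-VServer, OpenVZ or any posted patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type: holds the free amount, not the accounted amount. */
struct free_counter {
	atomic_long free;	/* limit - accounted, i.e. the free resource */
	long limit;		/* remembered only so the limit can be changed */
};

static void free_counter_init(struct free_counter *c, long limit)
{
	assert(limit >= 0);
	atomic_init(&c->free, limit);
	c->limit = limit;
}

/* Userspace stand-in for atomic_add_return(): add, return the new value. */
static long add_return(atomic_long *v, long delta)
{
	return atomic_fetch_add(v, delta) + delta;
}

/* Charge 'amount'. If the free value goes negative, add it back and fail. */
static bool free_counter_charge(struct free_counter *c, long amount)
{
	if (add_return(&c->free, -amount) < 0) {
		add_return(&c->free, amount);
		return false;
	}
	return true;
}

static void free_counter_uncharge(struct free_counter *c, long amount)
{
	add_return(&c->free, amount);
}

/*
 * Changing the limit shifts the free value by the same delta.
 * Assumption: limit changes are serialized by the caller.
 */
static void free_counter_set_limit(struct free_counter *c, long new_limit)
{
	add_return(&c->free, new_limit - c->limit);
	c->limit = new_limit;
}
```

One visible consequence of this layout: between a failed subtraction and the add-back, the free value is transiently too low, so a concurrent charge that would otherwise fit may be refused. Whether that is acceptable depends on the resource being limited.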
>>>well, Linux-VServer is "working", "secure", "flexible"
>>>_and_ non-intrusive ... it is quite natural that less
>>>won't work for me ... and regarding patches, there
>>>will be a 2.2 release soon, with all the patches ...
>
>
>>ok. please check your dcache and slab accounting then
>>(analyzed
> On Mon, 2007-03-12 at 19:23 +0300, Kirill Korotaev wrote:
>
>>For these you essentially need per-container page->_mapcount counter,
>>otherwise you can't detect whether rss group still has the page in question
>>being mapped
>>in its processes' addre
Eric W. Biederman wrote:
> Pavel Emelianov <[EMAIL PROTECTED]> writes:
>
>
>>Pages are charged to their first touchers which are
>>determined using pages' mapcount manipulations in
>>rmap calls.
>
>
> NAK pages should be charged to every rss group whose mm_struct they
> are mapped into.
For the
Eric W. Biederman wrote:
> Pavel Emelianov <[EMAIL PROTECTED]> writes:
>
>
>>Adds needed pointers to mm_struct and page struct,
>>places hooks to core code for mm_struct initialization
>>and hooks in container_init_early() to preinitialize
>>RSS accounting subsystem.
>
>
> An extra pointer in s
Eric,
> And misses every resource sharing opportunity in sight.
that was my point too.
> Except for
> filtering the which pages are eligible for reclaim an RSS limit should
> not need to change the existing reclaim logic, and with things like the
> memory zones we have had that kind of restricti
Herbert,
> sorry, I'm not in the lucky position that I get paid
> for sending patches to LKML, so I have to think twice
> before I invest time in coding up extra patches ...
>
> i.e. you will have to live with my comments for now
looks like you have no better arguments than that...
>>Looks lik
Andrew Morton wrote:
> On Tue, 06 Mar 2007 17:55:29 +0300
> Pavel Emelianov <[EMAIL PROTECTED]> wrote:
>
>
>>+struct rss_container {
>>+ struct res_counter res;
>>+ struct list_head page_list;
>>+ struct container_subsys_state css;
>>+};
>>+
>>+struct page_container {
>>+ struct p
>>There have been various projects attempting to provide resource
>>management support in Linux, including CKRM/Resource Groups and UBC.
>
>
> let me note here, once again, that you forgot Linux-VServer
> which does quite non-intrusive resource management ...
Herbert, do you care to send patches
> nobody actually cares about a precise accounting and
> calculating shares or partitions of whatever resource,
> all that matters is that you have a way to prevent a
> potential hostile environment from sucking up all your
> resources (or even a single one) resulting in a DoS
This is not true
Pavel Emelianov wrote:
> Balbir Singh wrote:
>
>>Pavel Emelianov wrote:
>>
>>>This patchset adds RSS, accounting and control and
>>>limiting the number of tasks and files within container.
>>>
>>>Based on top of Paul Menage's container subsystem v7
>>>
>>>RSS controller includes per-container RSS
Andrew,
>>>I'm wagering you'll break either the semantics, and/or the
>>>performance, of cpusets doing this.
>>
>>I like Paul's containers patch. It looks good and pretty well.
>>After some of the context issues are resolved it's fine.
>>Maybe it is even the best way of doing things.
>
>
> Have
Paul,
>>I suspect we can make cpusets also work
>>on top of this very easily.
>
>
> I'm skeptical, and kinda worried.
>
> ... can you show me the code that does this?
don't worry. we are not planning to commit any code breaking cpusets...
I will be the first one against it.
> Namespaces are no
Acked-By: Kirill Korotaev <[EMAIL PROTECTED]>
> From: Serge E. Hallyn <[EMAIL PROTECTED]>
> Subject: [PATCH] namespaces: update some function names
>
> The {get,exit}_task_namespaces do not grab references to the individual
> namespaces, only to the nsproxy. Reflect
> On 2/19/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
>
>>Alas, I fear this might have quite bad worst-case behaviour. One small
>>container which is under constant memory pressure will churn the
>>system-wide LRUs like mad, and will consume rather a lot of system time.
>>So it's a point at whic
Acked-By: Kirill Korotaev <[EMAIL PROTECTED]>
> /proc/$PID/fd has r-x-- permissions, so if process does setuid(), it
> will not be able to access /proc/*/fd/. This breaks fstatat() emulation
> in glibc.
>
> open("foo", O_RDONLY|O_DIRECT
Linus,
>>Also, that patch would break many 32-bit programs not compiled with large
>>offsets when run in compatibility mode on a 64-bit kernel. If they were to
>>do a stat on this inode, it would likely generate an EOVERFLOW error since
>>the pointer address would probably not fit in a 32 bit fiel
>>Jeff,
>>
>>taking into account the discussion about unawareness/uncertainty
>>of whether *unique* inode number is needed at all on pipe fds and such
>>do we need this at all?
>>
>>Thanks,
>>Kirill
>>
>
>
> Fair enough, perhaps we should just not worry about it, and assume that there
> might be
Jeff,
taking into account the discussion about unawareness/uncertainty
of whether *unique* inode number is needed at all on pipe fds and such
do we need this at all?
Thanks,
Kirill
> This patch adds 2 new libfs functions that allow for us to defer assignment of
> an i_ino value until such time th
Srivatsa,
> Current Linux CPU scheduler doesnt recognize process aggregates while
> allocating bandwidth. As a result of this, an user could simply spawn large
> number of processes and get more bandwidth than others.
>
> Here's a patch that provides fair allocation for all users in a system.
>
Jeff,
is 100% uniqueness really required for pipe inode numbers?
AFAIU, it is not that critical for pipefs (unlike smb, nfs etc.)
Thanks,
Kirill
> This converts pipefs to use the new scheme. Here we're calling iunique to get
> a unique i_ino value for the new inode, and then hashing it afterw
Acked-By: Kirill Korotaev <[EMAIL PROTECTED]>
hit it today as well :/
> When a 32-bit program that was not compiled with large file offsets does a
> stat and gets a st_ino value back that won't fit in the 32 bit field, glibc
> (correctly) generates an EOVERFLOW error. We can
please fix the comment as well.
oops number is very helpful in dealing with people's
reports. Very often the first Oops is required to
understand the real problem, so
further oopses can be ignored and the first one requested
if the problem is reproducible.
Kirill
> From: Pavel Emelianov <
Eric, really good job!
Patches: 1-13, 15-24, 26-32, 34-44, 46-49, 52-55, 57 (all except below)
Acked-By: Kirill Korotaev <[EMAIL PROTECTED]>
14/59 - minor (extra space)
25/59 - minor note
33/59 - not sorted sysctl IDs
45/59 - typo
50/59 - copyright/file note
51/59 - copyright/fil
1. I ask for not setting your authorship/copyright on the code which you just
copied from other places. Just doesn't look polite IMHO.
2. please don't name files like ipc/ipc_sysctl.c
ipc/sysctl.c sounds better IMHO.
3. any reason to introduce CONFIG_SYSVIPC_SYSCTL?
why not simply do
>
Eric, though I personally don't care much:
1. I ask for not setting your authorship/copyright on the code which you just
copied from other places. Just doesn't look polite IMHO.
2. I would propose to not introduce utsname_sysctl.c.
both files are too small and minor that I can't see much reaso
IDs not sorted in enum. see below.
> From: Eric W. Biederman <[EMAIL PROTECTED]> - unquoted
>
> We need to have the definition of all top level sysctl
> directories registers in sysctl.h so we don't conflict by
> accident and cause abi problems.
>
> Signed-off-by: Eric W. Biederman <[EMAIL P
another small minor note.
> From: Eric W. Biederman <[EMAIL PROTECTED]> - unquoted
>
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> ---
> arch/frv/kernel/pm.c | 50 +++---
> 1 files changed, 43 insertions(+), 7 deletions(-)
>
> diff --g
minor extra space in table below...
Kirill
> From: Eric W. Biederman <[EMAIL PROTECTED]> - unquoted
>
> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> ---
> fs/xfs/linux-2.6/xfs_sysctl.c | 258
> 1 files changed, 180 insertions(+), 78 deletions(
Herbert,
>>This patch implements the BeanCounter resource control abstraction
>>over generic process containers. It contains the beancounter core
>>code, plus the numfiles resource counter. It doesn't currently contain
>>any of the memory tracking code or the code for switching beancounter
>>conte
[IA64] bug in ldscript (mainstream)
Occasionally, in mainstream number of fsys entries is even.
In OpenVZ it is odd and we get misaligned kernel image,
which does not boot.
Signed-Off-By: Alexey Kuznetsov <[EMAIL PROTECTED]>
Signed-Off-By: Kirill Korotaev <[EMAIL PROTECTED]>
diff -
[IA64] virt_to_page() cannot be called with NULL (mainstream bug)
It does not return NULL when arg is NULL.
Signed-Off-By: Alexey Kuznetsov <[EMAIL PROTECTED]>
Signed-Off-By: Kirill Korotaev <[EMAIL PROTECTED]>
--- linus-2.6.git/include/asm-ia64/pgalloc.h.orig 2006-
I guess you forgot to add Andrew on CC.
Thanks,
Kirill
> OpenVZ team has discovered error inside generic_file_direct_write()
> If generic_file_direct_IO() has failed (ENOSPC condition) it may have instantiated
> a few blocks outside i_size. And fsck will complain about wrong i_size
> (ext2, ext3
OpenVZ has been using them for more than a month already ;-)
Kirill
> Ladies and Gentlemen!
>
> here is the first Linux-VServer version (testing)
> with support for the *spaces (uts, ipc and vfs)
> introduced in 2.6.19 ...
>
> http://vserver.13thfloor.at/Experimental/patch-2.6.19-vs2.1.x-t1.dif
AIL PROTECTED]>
Signed-Off-By: Kirill Korotaev <[EMAIL PROTECTED]>
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index a48ada9..1acb528 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1162,13 +1162,14 @@ static int ext3_prepare_failure(struct f
struct buffer_head *bh,
Andrew,
answers on your questions.
>>in journal=ordered or journal=data mode retry in ext3_prepare_write()
>>breaks the requirements of journaling of data with respect to metadata.
>>The fix is to call commit_write to commit allocated zero blocks before
>>retry.
>>
>
>
> How was this problem de
PROTECTED]>
Signed-Off-By: Kirill Korotaev <[EMAIL PROTECTED]>
Kirill
--- linux-2.6.8.1-t032/fs/dcookies.c.dget 2004-08-14 14:54:46.0 +0400
+++ linux-2.6.8.1-t032/fs/dcookies.c 2005-08-23 14:09:00.0 +0400
@@ -93,12 +93,10 @@ static struct dcookie_struct * alloc_dc
Hello,
we recently got the oops below in restore_fpu(), which makes us
believe that the correct masking of the hardcoded constant
0x1f80 with mxcsr_feature_mask in init_fpu() was lost.
Can someone check the attached patch?
general protection fault: [#1]
SMP
Modules linked in: e100
Can someone (Ingo?) recommend CPU scheduler tests which are usually
used to test CPU scheduler performance, context switch performance,
SMP/migration/balancing performance etc.?
Thanks in advance,
Kirill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a
Hello Rohit,
BTW, can you explain why making pages non-global is the cure? Is it
safe workaround for this bug?
There is a boundary condition that can have non-global pages containing
the CR3 load to also hit this issue on affected PIII. Though for this
to happen, mov to cr3 has to be the very last
ny test case). We would like to
root cause the failure here at Intel.
Appreciate your help,
Thanks,
-rohit
Kirill Korotaev <> wrote on Wednesday, January 19, 2005 8:08 AM:
Hello Linus,
Linus, Ingo, I've got one strange CPU bug leading to oopses, reboots
and so on. This bug can
version and any test case). We would like to root cause
the failure here at Intel.
Appreciate your help,
Thanks,
-rohit
Kirill Korotaev <> wrote on Wednesday, January 19, 2005 8:08 AM:
Hello Linus,
Linus, Ingo, I've got one strange CPU bug leading to oopses, reboots
and so on. This b
Hello Linus,
Linus, Ingo, I've got one strange CPU bug leading to oopses, reboots and
so on. This bug can be reproduced with a little bit modified 4gb split
and is probably related to CPU speculative execution. I'll post more
information about this bug later, but I would like to ask you for Inte
77 matches