Hi Lai,
Sorry for the delay, I've been away on vacation.
Lai Jiangshan wrote:
>
> My original purpose was to fix a bug as I described.
> This bug and the problem that offering big enough array for a huge
> cgroup are orthogonal!
>
You're right. So solving them separately seems fine.
It's def
On Thursday 2008-09-04 22:58, Alexey Dobriyan wrote:
>In conntrack_mt_v0() "ct->status" can be used even for untracked connection,
>is this right?
Yes.
>For example, is setting IPS_NAT_DONE_MASK and IPS_CONFIRMED_BIT on
>untracked conntrack really necessary?
Does it even happen? Something
On Thu, Sep 04, 2008 at 06:58:38PM +0200, Patrick McHardy wrote:
> [EMAIL PROTECTED] wrote:
>> static inline void
>> -nf_conntrack_event_cache(enum ip_conntrack_events event,
>> +nf_conntrack_event_cache(struct net *net, enum ip_conntrack_events event,
>> const struct sk_buff
On Thu, Sep 04, 2008 at 06:54:16PM +0200, Patrick McHardy wrote:
> [EMAIL PROTECTED] wrote:
>> Make untracked conntrack per-netns. Compare conntracks with relevant
>> untracked one.
>>
>> The following code you'll start laughing at this code:
>>
>> if (ct == ct->ct_net->ct.untracked)
>>
[EMAIL PROTECTED] wrote:
>
> When both modes are used simultaneously, we have following options:
>
> 1. Let container-startup deal with it i.e use above bind-mount approach
>    or, as Serge mentioned, have containers chroot and make ptmx->pts/ptmx
>    symlink or another option ?
>
> 2. Have t
H. Peter Anvin [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] wrote:
>> But that node will not be accessible if there is a newinstance mount
>> without the bind mount ? IOW
>> 1. mount -t devpts -o newinstance lxcpts /dev/pts
>> 2. mount -o bind /dev/pts/ptmx /dev/ptmx
>> If both #1 and #2
On Wed, 2008-09-03 at 17:59 +0400, Andrey Mirkin wrote:
> > The first issues I see with this direction are some EXPORT_SYMBOL() that
> > would be useless without a module.
>
> Checkpoint/restart functionality is implemented as a kernel module to provide
> more flexibility during development proce
Quoting Oren Laadan ([EMAIL PROTECTED]):
>
>
> Serge E. Hallyn wrote:
> > Quoting Oren Laadan ([EMAIL PROTECTED]):
> >>
> >> Serge E. Hallyn wrote:
> >>> Quoting Oren Laadan ([EMAIL PROTECTED]):
> Create trivial sys_checkpoint and sys_restore system calls. They will
> enable to checkpoi
Serge E. Hallyn wrote:
> Quoting Oren Laadan ([EMAIL PROTECTED]):
>>
>> Serge E. Hallyn wrote:
>>> Quoting Oren Laadan ([EMAIL PROTECTED]):
Create trivial sys_checkpoint and sys_restore system calls. They will
enable to checkpoint and restart an entire container, to and from a
chec
Quoting Oren Laadan ([EMAIL PROTECTED]):
>
>
> Serge E. Hallyn wrote:
> > Quoting Oren Laadan ([EMAIL PROTECTED]):
> >> Create trivial sys_checkpoint and sys_restore system calls. They will
> >> enable to checkpoint and restart an entire container, to and from a
> >> checkpoint image file descrip
On Thu, 2008-09-04 at 04:05 -0400, Oren Laadan wrote:
> +/**
> + * cr_scan_fds - scan file table and construct array of open fds
> + * @files: files_struct pointer
> + * @fdtable: (output) array of open fds
> + * @return: the number of open fds found
> + *
> + * Allocates the file descriptors array
On Thu, 2008-09-04 at 04:03 -0400, Oren Laadan wrote:
> +/* free a chain of page-arrays */
> +void cr_pgarr_free(struct cr_ctx *ctx)
> +{
> + struct cr_pgarr *pgarr, *pgnxt;
> +
> + for (pgarr = ctx->pgarr; pgarr; pgarr = pgnxt) {
> + _cr_pgarr_release(ctx, pgarr);
> +
On Thu, 2008-09-04 at 04:04 -0400, Oren Laadan wrote:
> +asmlinkage int sys_modify_ldt(int func, void __user *ptr, unsigned long
> bytecount);
This needs to go into a header.
> +int cr_read_mm_context(struct cr_ctx *ctx, struct mm_struct *mm, int parent)
> +{
> + struct cr_hdr_mm_context *hh
On Thu, 2008-09-04 at 04:05 -0400, Oren Laadan wrote:
> +=== Shared resources (objects)
> +
> +Many resources used by tasks may be shared by more than one task (e.g.
> +file descriptors, memory address space, etc), or even have multiple
> +references from other resources (e.g. a single inode that r
[EMAIL PROTECTED] wrote:
>
> But that node will not be accessible if there is a newinstance mount
> without the bind mount ? IOW
>
> 1. mount -t devpts -o newinstance lxcpts /dev/pts
> 2. mount -o bind /dev/pts/ptmx /dev/ptmx
>
> If both #1 and #2 or neither happen there is no proble
Serge E. Hallyn wrote:
> Quoting Oren Laadan ([EMAIL PROTECTED]):
>> Create trivial sys_checkpoint and sys_restore system calls. They will
>> enable to checkpoint and restart an entire container, to and from a
>> checkpoint image file descriptor.
>>
>> The syscalls take a file descriptor (for the
[EMAIL PROTECTED] wrote:
> Note, sysctl table is always duplicated, this is simpler, less special-cased,
> less mistakes (and did one mistake in first version of this patch).
This also doesn't explain what the patch is doing at all.
___
Containers mailin
H. Peter Anvin [EMAIL PROTECTED] wrote:
> Alan Cox wrote:
>>> We can't, really, because it will open the global ptmx. This is an
>>> unfortunate side effect of the backwards-compatibility code.
>>>
>>> This is also why I don't like the bind mount; the symlink option has the
>>> nice property t
[EMAIL PROTECTED] wrote:
> static inline void
> -nf_conntrack_event_cache(enum ip_conntrack_events event,
> +nf_conntrack_event_cache(struct net *net, enum ip_conntrack_events event,
> const struct sk_buff *skb)
> {
Passing the conntrack instead of the struct net and the s
[EMAIL PROTECTED] wrote:
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
Changelog please, I was wondering whether this was a resend
of the last one.
Alan Cox wrote:
>> We can't, really, because it will open the global ptmx. This is an
>> unfortunate side effect of the backwards-compatibility code.
>>
>> This is also why I don't like the bind mount; the symlink option has the
>> nice property that f*ckups are more obvious.
>
> It's asking
[EMAIL PROTECTED] wrote:
> Make untracked conntrack per-netns. Compare conntracks with relevant
> untracked one.
>
> The following code you'll start laughing at this code:
>
> if (ct == ct->ct_net->ct.untracked)
> ...
>
> let me remind you that ->ct_net is set in only one pla
[EMAIL PROTECTED] wrote:
> Make per-netns expectation hash and expectation count.
>
> Expectation always belongs to netns to which its master conntrack belongs.
> This is natural and allows to not bloat expectations.
>
> Proc files and leaf users in protocol modules are stubbed to init_net,
> th
[EMAIL PROTECTED] wrote:
> What is unconfirmed connection in one netns can very well be confirmed
> in another.
>
> @@ -10,5 +11,6 @@ struct netns_ct {
> unsigned int expect_count;
> struct hlist_head *expect_hash;
> int expect_vmalloc;
> +
On Thu, 2008-09-04 at 11:03 -0500, Serge E. Hallyn wrote:
> Dave, are you happy with the allocations here, or were you objecting
> to cr_hbuf_get() and cr_hbuf_put()?
I still don't think there's really enough justification as it stands,
but don't let me get in the way. If it ends up being an issu
> We can't, really, because it will open the global ptmx. This is an
> unfortunate side effect of the backwards-compatibility code.
>
> This is also why I don't like the bind mount; the symlink option has the
> nice property that f*ckups are more obvious.
It's asking for trouble with existing
[EMAIL PROTECTED] wrote:
> * make per-netns conntrack hash
>
> Other solution is to add ->ct_net pointer to tuplehashes and still have one
> hash; I tried that, it's ugly and requires more code deep down in protocol
> modules et al.
>
> * propagate netns pointer to where needed, e. g. to conn
[EMAIL PROTECTED] wrote:
> Sysctls and proc files are stubbed to init_net's one. This is temporary.
Applied, thanks.
___
Containers mailing list
[EMAIL PROTECTED]
https://lists.linux-foundation.org/mailman/listinfo/containers
___
[EMAIL PROTECTED] wrote:
> Conntrack (struct nf_conn) gets pointer to netns: ->ct_net -- netns in which
> it was created. It comes from netdevice.
>
> ->ct_net is write-once field.
>
> Every conntrack in system has ->ct_net initialized, no exceptions.
>
> ->ct_net doesn't pin netns: conntracks a
[EMAIL PROTECTED] wrote:
> ip_route_me_harder() is called on output codepaths:
> 1) IPVS: honestly, not sure, looks like it can be called during forwarding
> 2) IPv4 REJECT: refreshing comment re skb->dst is valid and assignment of
>    skb->dst right before call :^)
> 3) NAT: called in LOCAL_OUT ho
[EMAIL PROTECTED] wrote:
> One comment: #ifdefs around #include is necessary to overcome amazing compile
> breakages in NOTRACK-in-netns patch (see below).
I guess that's because of the net/netfilter/nf_conntrack.h inclusion.
We should fix that, it's spreading to too many places.
Anyways, applied.
[EMAIL PROTECTED] [EMAIL PROTECTED] wrote:
|
| TODO:
| - Do we need '-o ptmxuid' and '-o ptmxgid' options as well ?
| - Add a config option to enable multiple-mounts of devpts.
| - (Sometime in future) Remove even initial kernel mount of devpts
| - Any other good test su
[EMAIL PROTECTED] wrote:
>
> Ah, ok. Well, I will remove that para from the patch description.
>
> If the -o newinstance is NOT followed by the bind mount, ptys won't
> work and would be nice if we can print a useful message when opening
> /dev/ptmx.
>
We can't, really, because it will open th
[EMAIL PROTECTED] wrote:
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
Applied, thanks.
> @@ -108,7 +120,7 @@ ip6t_local_hook(unsigned int hook,
> /* flowlabel and prio (includes version, which shouldn't change either
> */
> flowlabel = *((u_int32_t *)ipv6_hdr(skb));
>
> -
Quoting Louis Rilling ([EMAIL PROTECTED]):
> On Thu, Sep 04, 2008 at 04:02:38AM -0400, Oren Laadan wrote:
> >
> > Add those interfaces, as well as helpers needed to easily manage the
> > file format. The code is roughly broken out as follows:
> >
> > checkpoint/sys.c - user/kernel data transfer, as
[EMAIL PROTECTED] wrote:
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
Applied, thanks.
Quoting Oren Laadan ([EMAIL PROTECTED]):
>
> Add those interfaces, as well as helpers needed to easily manage the
> file format. The code is roughly broken out as follows:
>
> checkpoint/sys.c - user/kernel data transfer, as well as setup of the
> checkpoint/restart context (a per-checkpoint data
[EMAIL PROTECTED] wrote:
> Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
Applied, thanks.
H. Peter Anvin [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] wrote:
>> 2. To effectively use the multi-instance mode, applications/libraries
>> should open "/dev/pts/ptmx" instead of "/dev/ptmx" but obviously
>> this would fail in the legacy mode.
>>
>
> NOT SO!
>
> /de
[EMAIL PROTECTED] wrote:
> Now that dev_net() exists, the usefulness of them is even less. Also they're
> a big problem in resolving circular header dependencies necessary for
> NOTRACK-in-netns patch. See below.
Applied, thanks.
On Thu, 2008-09-04 at 04:05 -0400, Oren Laadan wrote:
>
> diff --git a/include/linux/ckpt_hdr.h b/include/linux/ckpt_hdr.h
> index 322ade5..1ce1dbc 100644
> --- a/include/linux/ckpt_hdr.h
> +++ b/include/linux/ckpt_hdr.h
> @@ -17,7 +17,7 @@
> /*
>* To maintain compatibility between 32-bit an
Eelco Chaudron wrote:
> Hi All,
>
> I was looking at the network namespaces implementation for ARP, and I
> was wondering why the struct net abstraction was done in the core
> neighbour functions, and not at the struct neigh_table arp_tbl level
> (i.e. one arp_tbl per namespace)?
>
> One problem
Louis Rilling wrote:
> On Thu, Sep 04, 2008 at 04:05:50AM -0400, Oren Laadan wrote:
>> Dump the files_struct of a task with 'struct cr_hdr_files', followed by
>> all open file descriptors. Since FDs can be shared, they are assigned a
>> tag and registered in the object hash.
>>
>> For each open F
Quoting Oren Laadan ([EMAIL PROTECTED]):
>
> Create trivial sys_checkpoint and sys_restore system calls. They will
> enable to checkpoint and restart an entire container, to and from a
> checkpoint image file descriptor.
>
> The syscalls take a file descriptor (for the image file) and flags as
>
Louis Rilling wrote:
> On Thu, Sep 04, 2008 at 04:05:22AM -0400, Oren Laadan wrote:
>> Infrastructure to handle objects that may be shared and referenced by
>> multiple tasks or other objects, e.g. open files, memory address space
>> etc.
>>
>> The state of shared objects is saved once. On the fi
So, just like the I/O controller patches, we surely can't just throw
patch sets back and forth at each other. We're also sure to wear out
any potential reviewers, especially on LKML.
The differences you've described between this and Oren's patches are
pretty small, all things considered. Would y
Hi All,
I was looking at the network namespaces implementation for ARP, and I was
wondering why the struct net abstraction was done in the core neighbour
functions, and not at the struct neigh_table arp_tbl level (i.e. one arp_tbl
per namespace)?
One problem I could find with the current imple
On Thu, Sep 04, 2008 at 04:05:50AM -0400, Oren Laadan wrote:
>
> Dump the files_struct of a task with 'struct cr_hdr_files', followed by
> all open file descriptors. Since FDs can be shared, they are assigned a
> tag and registered in the object hash.
>
> For each open FD there is a 'struct cr_hdr_
On Thu, Sep 04, 2008 at 04:05:22AM -0400, Oren Laadan wrote:
>
> Infrastructure to handle objects that may be shared and referenced by
> multiple tasks or other objects, e.g. open files, memory address space
> etc.
>
> The state of shared objects is saved once. On the first encounter, the
> state i
On Thu, Sep 04, 2008 at 04:02:38AM -0400, Oren Laadan wrote:
>
> Add those interfaces, as well as helpers needed to easily manage the
> file format. The code is roughly broken out as follows:
>
> checkpoint/sys.c - user/kernel data transfer, as well as setup of the
> checkpoint/restart context (a p
Oren Laadan wrote:
> Create trivial sys_checkpoint and sys_restore system calls. They will
> enable to checkpoint and restart an entire container, to and from a
> checkpoint image file descriptor.
>
> The syscalls take a file descriptor (for the image file) and flags as
> arguments. For sys_checkp
Infrastructure to handle objects that may be shared and referenced by
multiple tasks or other objects, e.g. open files, memory address space
etc.
The state of shared objects is saved once. On the first encounter, the
state is dumped and the object is assigned a unique identifier and also
stored i
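
The "saved once" rule described in this patch can be illustrated with a small user-space sketch. It is not taken from the patch: the names obj_table and obj_lookup_or_add are hypothetical, and a linear table stands in for the kernel's object hash to keep the example short. The first time an object is seen it gets a new tag and the caller is told to dump its state; later encounters only return the existing tag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the object hash described above.
 * A linear table keeps the sketch short; it is not the patch's data structure. */
#define OBJ_MAX 64

struct obj_table {
	const void *ptr[OBJ_MAX];	/* address of each object seen so far */
	size_t count;			/* number of registered objects */
};

/*
 * Return the tag for obj. *first is set to 1 if the object was newly
 * registered (the caller should dump its state) or to 0 if it was seen
 * before (the caller only records the tag). Tags start at 1; a return
 * value of 0 means the table is full.
 */
static int obj_lookup_or_add(struct obj_table *t, const void *obj, int *first)
{
	assert(t && obj && first);

	for (size_t i = 0; i < t->count; i++) {
		if (t->ptr[i] == obj) {
			*first = 0;
			return (int)i + 1;
		}
	}

	if (t->count == OBJ_MAX) {
		*first = 0;
		return 0;
	}

	t->ptr[t->count] = obj;
	*first = 1;
	return (int)++t->count;
}
```

A checkpoint loop built on this would write the full object state only when *first is 1 and write just the tag otherwise.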
Andrey Mirkin wrote:
> This patchset introduces kernel based checkpointing/restart as it is
> implemented in OpenVZ project. This patchset has limited functionality and
> is able to checkpoint/restart only single process. Recently Oren Laadan
> sent another kernel based implementation of checkpo
Dump the files_struct of a task with 'struct cr_hdr_files', followed by
all open file descriptors. Since FDs can be shared, they are assigned a
tag and registered in the object hash.
For each open FD there is a 'struct cr_hdr_fd_ent' with the FD, its tag
and its close-on-exec property. If the FD
Restore open file descriptors: for each FD read 'struct cr_hdr_fd_ent'
and lookup tag in the hash table; if not found (first occurrence), read
in 'struct cr_hdr_fd_data', create a new FD and register in the hash.
Otherwise attach the file pointer from the hash as an FD.
This patch only handles bas
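
The restore-side lookup described above might be sketched as follows. This is a user-space illustration only: restore_ctx, fake_file and restore_fd_ent are hypothetical names, a plain array indexed by tag replaces the hash table, and "creating a file" is reduced to handing out a slot from a pool where the real code would read 'struct cr_hdr_fd_data'.

```c
#include <assert.h>
#include <stddef.h>

#define TAG_MAX 64

/* Hypothetical stand-in for a restored 'struct file'. */
struct fake_file {
	int id;
};

struct restore_ctx {
	struct fake_file *by_tag[TAG_MAX];	/* tag -> already restored file */
	struct fake_file pool[TAG_MAX];		/* backing storage for new files */
	size_t created;				/* number of files created so far */
};

/*
 * Resolve one FD entry: reuse the file already registered under tag, or
 * create and register one on the first occurrence of that tag.
 * Returns NULL if tag is out of range.
 */
static struct fake_file *restore_fd_ent(struct restore_ctx *ctx, int tag)
{
	assert(ctx);

	if (tag < 0 || tag >= TAG_MAX)
		return NULL;

	if (!ctx->by_tag[tag]) {
		/* First occurrence: the real code would read the file data here. */
		struct fake_file *f = &ctx->pool[ctx->created];

		f->id = (int)ctx->created++;
		ctx->by_tag[tag] = f;
	}

	/* Either branch ends with the same pointer being attached as an FD. */
	return ctx->by_tag[tag];
}
```

Two FD entries carrying the same tag end up pointing at the same file object, which is how sharing between descriptors would be preserved across restart in this sketch.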
For each VMA, there is a 'struct cr_vma'; if the VMA is file-mapped,
it will be followed by the file name. The cr_vma->npages will tell
how many pages were dumped for this VMA. Then it will be followed
by the actual data: first a dump of the addresses of all dumped
pages (npages entries) followe
Restoring the memory address space begins with nuking the existing one
of the current process, and then reading the VMA state and contents.
Call do_mmap_pgoffset() for each VMA and then read in the data.
Signed-off-by: Oren Laadan <[EMAIL PROTECTED]>
---
arch/x86/mm/restart.c | 56
Add those interfaces, as well as helpers needed to easily manage the
file format. The code is roughly broken out as follows:
checkpoint/sys.c - user/kernel data transfer, as well as setup of the
checkpoint/restart context (a per-checkpoint data structure for
housekeeping)
checkpoint/checkpoint.c
(Following Dave Hansen's refactoring of the original post)
Add logic to save and restore architecture specific state, including
thread-specific state, CPU registers and FPU state.
Currently only x86-32 is supported. Compiling on x86-64 will trigger
an explicit error.
Signed-off-by: Oren Laadan
Covers application checkpoint/restart, overall design, interfaces
and checkpoint image format.
Signed-off-by: Oren Laadan <[EMAIL PROTECTED]>
---
Documentation/checkpoint.txt | 182 ++
1 files changed, 182 insertions(+), 0 deletions(-)
create mode 100
Create trivial sys_checkpoint and sys_restore system calls. They will
enable to checkpoint and restart an entire container, to and from a
checkpoint image file descriptor.
The syscalls take a file descriptor (for the image file) and flags as
arguments. For sys_checkpoint the first argument identi
These patches implement checkpoint-restart [CR v3]. This version is
aimed at addressing feedback and eliminating bugs, after having added
save and restore of open files state (regular files and directories)
which makes it more usable.
Todo:
- Add support for x86-64 and improve ABI
- Refine or cha