On Mon, 2008-02-18 at 17:10 +0100, Miklos Szeredi wrote:
+ /*
+ * We don't have to hold all of the locks at the
+ * same time here because we know that we're the
+ * last reference to mnt and that no new writers
+ * can come in.
+ */
+
On Sat, 2008-02-16 at 07:31 +0100, Christoph Hellwig wrote:
once we put pieces in the first three patches would be useful as well,
to easily catch additions in the next cycle that might be adding
NULL-vfsmount calls to dentry_open.
So, we want
[PATCH 07/30] r/o bind mounts: stub
On Thu, 2008-02-21 at 04:38 -0800, Andrew Morton wrote:
<4>[0.071378] [do_name+279/440] do_name+0x117/0x1b8
<4>[0.071570] [write_buffer+34/49] write_buffer+0x22/0x31
<4>[0.071763] [flush_window+105/184] flush_window+0x69/0xb8
<4>[0.071996] [unpack_to_rootfs+1585/2238]
On Sat, 2008-02-23 at 10:18 +0800, Matt Mackall wrote:
Another
problem is that there is no way to get information about the page size a
specific mapping uses.
Is this true generically, or just with pagemap? It seems like we should
have a way to tell that a particular mapping is of large
On Mon, 2008-02-25 at 11:27 +0530, srinivasa wrote:
This patch prohibits user from probing preempt_schedule(). One way of
prohibiting the user from probing functions is by marking such
functions with __kprobes. But this method doesn't work for those functions,
which are already marked to
On Mon, 2008-02-25 at 15:07 +, Andy Whitcroft wrote:
shrink_page_list() would be expected to be passed pages pulled from
the active or inactive lists via isolate_lru_pages()? I would not have
expected to find the kernel text on the LRU and therefore not expect to
see it passed to
On Mon, 2008-02-25 at 13:09 +0100, Hans Rosenfeld wrote:
On Sat, Feb 23, 2008 at 10:31:01AM -0800, Dave Hansen wrote:
- 4 bits for the page size, with 0 meaning native page size (4k on x86,
8k on alpha, ...) and values 1-15 being specific to the architecture
(I used 1 for 2M, 2
On 07/06/2012 10:23 AM, Rafael J. Wysocki wrote:
OK, this looks good to me. Queuing up in the linux-next branch of the
linux-pm.git tree. If no problems with it are reported, I'll move it to the
pm-cpuidle branch in a couple of days.
I've got this running on the problem hardware. It seems
this is the first non-trivial use of the inc/drop_nlink()
functions, add some kernel docs for them.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/inode.c |7 +
lxc-dave/fs/libfs.c |1
lxc-dave/include/linux/fs.h | 58
3
Some filesystems forego the vfs and may_open() and create their
own 'struct file's.
This patch creates a couple of helper functions which can be
used by these filesystems, and will provide a unified place
which the r/o bind mount code may patch.
Signed-off-by: Dave Hansen [EMAIL PROTECTED
' operation.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namespace.c | 24 ++--
lxc-dave/fs/open.c |2 +-
2 files changed, 23 insertions(+), 3 deletions(-)
diff -puN fs/namespace.c~23-24-honor-r-w-changes-at-do-remount-time
fs/namespace.c
--- lxc/fs
elevate mnt writers for callers of vfs_mkdir()
Pretty self-explanatory. Fits in with the rest of the series.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namei.c|5 +
lxc-dave/fs/nfsd/nfs4recover.c |4
2 files changed, 9 insertions(+)
diff -puN
This area of code is currently #ifdef'd out, so add a comment
for the time when it is actually used.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namespace.c |4
1 file changed, 4 insertions(+)
diff -puN fs/namespace.c~11-24-mount-is-safe-add-comment fs/namespace.c
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/net/unix/af_unix.c | 16
1 file changed, 12 insertions(+), 4 deletions(-)
diff -puN
net/unix/af_unix.c~12-24-unix-find-other-elevate-write-count-for-touch-atime
net/unix/af_unix.c
---
lxc/net/unix/af_unix.c~12-24
This does create a little helper in the NFS code to
make an if() a little bit less ugly.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namei.c|4
lxc-dave/fs/nfsd/vfs.c | 23 +++
2 files changed, 23 insertions(+), 4 deletions(-)
diff -puN fs
Elevate the write count during the vfs_rmdir() call.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namei.c |5 +
1 file changed, 5 insertions(+)
diff -puN fs/namei.c~20-24-do-rmdir-elevate-write-count fs/namei.c
--- lxc/fs/namei.c~20-24-do-rmdir-elevate-write-count
---
lxc-dave/fs/gfs2/inode.c |1 +
1 file changed, 1 insertion(+)
diff -puN fs/gfs2/inode.c~gfs-check-nlink-count fs/gfs2/inode.c
--- lxc/fs/gfs2/inode.c~gfs-check-nlink-count 2007-02-09 14:26:59.0 -0800
+++ lxc-dave/fs/gfs2/inode.c 2007-02-09 14:26:59.0 -0800
@@
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namei.c |4
lxc-dave/ipc/mqueue.c |5 -
2 files changed, 8 insertions(+), 1 deletion(-)
diff -puN fs/namei.c~19-24-elevate-mnt-writers-for-vfs-unlink-callers fs/namei.c
--- lxc/fs/namei.c~19-24-elevate-mnt
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/utimes.c | 13 +
1 file changed, 9 insertions(+), 4 deletions(-)
diff -puN fs/utimes.c~16-24-elevate-write-count-for-do-utimes fs/utimes.c
--- lxc/fs/utimes.c~16-24-elevate-write-count-for-do-utimes 2007-02-09
14
This takes care of all of the direct callers of vfs_mknod().
Since a few of these cases also handle normal file creation
as well, this also covers some calls to vfs_create().
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namei.c | 12
lxc-dave/fs/nfsd
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/open.c | 16 +++-
1 file changed, 11 insertions(+), 5 deletions(-)
diff -puN fs/open.c~15-24-elevate-writer-count-for-do-sys-truncate fs/open.c
--- lxc/fs/open.c~15-24-elevate-writer-count-for-do-sys-truncate
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/inode.c | 20
1 file changed, 12 insertions(+), 8 deletions(-)
diff -puN fs/inode.c~17-24-elevate-write-count-for-do-sys-utime-and-touch-atime
fs/inode.c
--- lxc/fs/inode.c~17-24-elevate-write-count-for-do
created file,
while the vfsmount is ro. That is bad.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/file_table.c |5 -
lxc-dave/fs/namei.c | 22 ++
lxc-dave/ipc/mqueue.c|3 +++
3 files changed, 25 insertions(+), 5 deletions(-)
diff -puN fs
Some filesystems forego the use of normal vfs calls to create
struct files. Make sure that these users elevate the mnt writer
count. These probably don't have any real meaning because there
is no real backing store for these mounts, but it is here for
consistency.
Signed-off-by: Dave Hansen
Now that we have the sb writer count, and all of the
writers marked with mnt_want_write(), we don't need to
go looking at all of the individual open files.
Kill the open files walk, and use the sb writer count.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/file_table.c
code that will safely check the counts before
allowing r/w-r/o transitions to occur.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namespace.c| 53 +
lxc-dave/fs/super.c| 18 ++---
lxc-dave/include/linux/fs.h
This basically audits the callers of xattr_permission(), which
calls permission() and can perform writes to the filesystem.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/nfsd/nfs4proc.c |7 ++-
lxc-dave/fs/xattr.c | 14 ++
2 files changed, 20
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namei.c | 10 ++
1 file changed, 10 insertions(+)
diff -puN fs/namei.c~09-24-elevate-write-count-for-link-and-symlink-calls
fs/namei.c
--- lxc/fs/namei.c~09-24-elevate-write-count-for-link-and-symlink-calls
2007-02-09
Some ioctls need write access, but others don't. Make a helper
function to decide when write access is needed, and take it.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/ncpfs/ioctl.c | 55 +-
1 file changed, 54 insertions(+), 1
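The helper described in this patch can be sketched in userspace as a simple classifier over command numbers. This is only an illustration, not the ncpfs code: the `toy_` names and the command values are hypothetical, and treating unknown commands as read-only is an assumption made for the sketch (a real helper might prefer the conservative choice).

```c
#include <stdbool.h>

/* Hypothetical command numbers; these are not real ioctl definitions. */
enum toy_cmd {
	TOY_GET_INFO = 1,
	TOY_GET_VERSION,
	TOY_SET_FLAGS,
	TOY_SET_LABEL,
};

/*
 * Decide whether a command modifies the filesystem and therefore
 * needs write access to be taken before it runs.
 */
bool toy_ioctl_need_write(unsigned int cmd)
{
	switch (cmd) {
	case TOY_SET_FLAGS:
	case TOY_SET_LABEL:
		return true;
	case TOY_GET_INFO:
	case TOY_GET_VERSION:
	default:
		/* Assumption: anything not listed as a writer is read-only. */
		return false;
	}
}
```

A caller would consult the helper once, take write access only when it returns true, and release it after the command finishes.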
chown/chmod,etc... don't call permission in the same way
that the normal open for write calls do. They still
write to the filesystem, so bump the write count during
these operations.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/open.c | 37
On Fri, 2007-02-09 at 15:22 -0800, Andrew Morton wrote:
On Fri, 09 Feb 2007 14:53:44 -0800
Dave Hansen [EMAIL PROTECTED] wrote:
This is the core of the read-only bind mount patch set.
Who wants read-only bind mounts, and for what reason?
The original desire came out of the linux-vserver
writes are performed with a
want/drop pair. When that is complete, we can actually
introduce code that will safely check the counts before
allowing r/w-r/o transitions to occur.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
lxc-dave/fs/namespace.c| 53
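The want/drop pairing and the count check before a r/w-r/o transition can be sketched in userspace roughly as follows. This is a toy model under stated assumptions, not the kernel implementation: `struct toy_mount` and the `toy_` functions are hypothetical names, there is no locking or per-cpu counting, and the error codes are chosen only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mount; not the kernel's struct vfsmount. */
struct toy_mount {
	bool readonly;   /* true once the mount has gone r/o */
	int writers;     /* number of outstanding want/drop pairs */
};

/* Take write access: fails with -EROFS on a read-only mount. */
int toy_want_write(struct toy_mount *m)
{
	if (m->readonly)
		return -EROFS;
	m->writers++;
	return 0;
}

/* Release write access taken by a successful toy_want_write(). */
void toy_drop_write(struct toy_mount *m)
{
	assert(m->writers > 0);
	m->writers--;
}

/* Attempt the r/w -> r/o transition: refused while writers remain. */
int toy_make_readonly(struct toy_mount *m)
{
	if (m->writers > 0)
		return -EBUSY;
	m->readonly = true;
	return 0;
}
```

The point of the sketch is the invariant: every write is bracketed by a want and a drop, so the count alone tells the remount path whether the transition is safe.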
On Thu, 2007-11-01 at 07:56 -0700, Ulrich Drepper wrote:
Pavel Emelyanov wrote:
With this set we'll be able to mark pid namespaces as EXPERIMENTAL
or even BROKEN, so nobody will be able to create them. So can we, please,
keep things as they are for now - the appropriate fix will be ready
, then release them once the write for the filp has
been established.
Any caller who gets a 'struct file' back must consider that filp
instantiated and fput() it normally. The callers no longer
have to worry about ever manually releasing a mnt write count.
Signed-off-by: Dave Hansen [EMAIL PROTECTED
This kills off the almost empty do_filp_open(). The indenting
change in do_sys_open() is because we would have gone over our
80 characters otherwise.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/open.c | 39 ++-
1 file changed
Replace all callers with open_namei() directly, and move the
nameidata stack allocation into open_namei().
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/drivers/usb/gadget/file_storage.c |5 -
linux-2.6.git-dave/fs/exec.c |2
linux-2.6.git
open_namei() no longer touches namei's. Rename it
to something more appropriate: open_pathname().
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/drivers/usb/gadget/file_storage.c |4 ++--
linux-2.6.git-dave/fs/exec.c |2 +-
linux-2.6.git
Pretty self-explanatory. Fits in with the rest of the series.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namei.c|5 +
linux-2.6.git-dave/fs/nfsd
. It takes a
directory and makes a regular bind and a r/o bind mount of it.
It then performs some normal filesystem operations on the
three directories, including ones that are expected to fail,
like creating a file on the r/o mount.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
. When that is complete, we
can actually introduce code that will safely check the counts before allowing
r/w-r/o transitions to occur.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Cc: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Acked-by: Serge Hallyn [EMAIL
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namei.c |4
linux-2.6.git-dave/ipc/mqueue.c |5 -
2 files changed, 8 insertions(+), 1 deletion(-)
diff -puN fs
This basically audits the callers of xattr_permission(), which calls
permission() and can perform writes to the filesystem.
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/nfsd
the nameidata_to_filp() calls into namei.c, and this
gets the sys_open flags to a place where we can get
at them when we need them.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namei.c | 43 +-
linux-2.6.git-dave/fs/open.c | 22
Elevate the write count during the vfs_rmdir() call.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Acked-by: Serge Hallyn [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namei.c |5 +
1 file
Now includes fix for oops seen by akpm.
never let a libc developer write your kernel code - hch
nor, apparently, a kernel developer - akpm
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
Cc: Christoph
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/inode.c | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
diff -puN fs/inode.c~r-o-bind-mounts-elevate-write
fs/ncpfs/ioctl.c: In function 'ncp_ioctl_need_write':
fs/ncpfs/ioctl.c:852: error: label at end of compound statement
Cc: Dave Hansen [EMAIL PROTECTED]
Cc: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/ncpfs/ioctl.c | 57
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namei.c | 10 ++
1 file changed, 10 insertions(+)
diff -puN
fs/namei.c~r-o-bind-mounts-elevate-write-count-for-link
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/inode.c | 20
1 file changed, 12 insertions(+), 8 deletions(-)
diff -puN
fs/inode.c~r-o-bind-mounts
-by: Dave Hansen [EMAIL PROTECTED]
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/ext2/ioctl.c | 46 ++
linux-2.6.git-dave/fs/ext3/ioctl.c | 100 +++---
linux-2.6.git-dave/fs
this to fix a 'create, remount, fail r/w open()' race.
Some filesystems forego the use of normal vfs calls to create
struct files. Make sure that these users elevate the mnt
writer count because they will get __fput(), and we need
to make sure they're balanced.
Signed-off-by: Dave Hansen [EMAIL
This also uses the little helper in the NFS code to make an if() a little bit
less ugly. We introduced the helper at the beginning of the series.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/open.c | 14 --
1 file changed, 8 insertions(+), 6 deletions(-)
diff -puN fs/open.c~r-o-bind-mounts-elevate-writer
-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/open.c | 13 +++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff -puN fs/open.c~r-o-bind-mounts-make-access-use-mnt-check fs/open.c
--- linux-2.6.git/fs/open.c~r-o-bind-mounts
-off-by: Dave Hansen [EMAIL PROTECTED]
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/open.c | 39 ++-
1 file changed, 30 insertions(+), 9 deletions(-)
diff -puN fs/open.c~r-o-bind-mounts
logic outside of the switch and into a helper function
suggested by Christoph.
This also encapsulates a fix for mknod(S_IFREG) that Miklos
found.
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6
on percpu data when it only
accesses N or fewer mounts.)
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Cc: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namespace.c| 205 ---
linux-2.6.git-dave/include
two are probably unnecessary and duplicate existing checks in the
VFS. This won't make them better checks than before, but it will make them
detect r/o mounts.
Acked-by: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED
' operation.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
Cc: Christoph Hellwig [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namespace.c| 46 ++-
linux-2.6.git-dave/include/linux/mount.h |1
2 files changed, 40
With the r/o bind mount patches, we can have as many
spinlocks nested as there are CPUs on the system.
Lockdep freaks out after 8.
So, create a new lockdep class of locks for the
mnt_writer spinlocks, and initialize each of the
cpu locks to be in a different class.
It should shut up warnings
having any oopses or mnt_writer
count imbalances.
I'm quite convinced that this is a good thing because it
found bugs in the stuff I was working on as soon as I
wrote it.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/file_table.c| 21 +++--
linux
On Fri, 2007-11-02 at 01:04 -0700, Andrew Morton wrote:
That is the fix you were referring to? I was hoping you have a sketch
for a real solution. If nobody can think of a way to fix this PID
Looks like we misunderstood each other. Can you please elaborate on
what exactly is broken
On Sun, 2007-11-04 at 11:38 +0100, Ingo Molnar wrote:
I.e. keep the namespace functionality but use a modulo 1.000.000 base
for the PIDs so that it all looks nicer to the user. Minimal visibility
difference but maximum compatibility. (The resulting limits are
reasonable: 1 million tasks per
On Mon, 2007-11-05 at 15:40 +, Hugh Dickins wrote:
The second problem was a hang: all cpus in
handle_write_count_underflow
doing lock_and_coalesce_cpu_mnt_writer_counts: new -mm stuff from Dave
Hansen. At first I thought that was a locking problem in Dave's code,
but I now suspect it's
locking up. It will also
warn a lot earlier that something funky is going on.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/namespace.c| 31 ++-
linux-2.6.git-dave/include/linux/mount.h |1 +
2 files changed, 23 insertions(+), 9
On Wed, 2007-11-07 at 15:07 +0800, Zou Nan hai wrote:
Try to allocate sparse vmemmap block above 4G on x64 system.
On some single node x64 systems with a huge amount of physical memory, e.g.
64G, the memmap size may be very big.
Could we just change the default bootmem behavior to allocate
drivers/kvm/kvm_main.c: In function `kvm_flush_remote_tlbs':
drivers/kvm/kvm_main.c:220: error: implicit declaration of function
`smp_call_function_mask'
make[2]: *** [drivers/kvm/kvm_main.o] Error 1
make[1]: *** [drivers/kvm] Error 2
http://sr71.net/~dave/linux/config-kvm-up
Looks like that
On Fri, 2007-11-09 at 13:23 -0500, Erez Zadok wrote:
Setup: FC6 system with MM snapshot broken-out-2007-11-06-02-32 and
these two
patches added:
r-o-bind-mounts-track-number-of-mount-writer-fix-buggy-loop.patch
this in the next
patch.
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/file_table.c | 46 +---
linux-2.6.git-dave/include/linux/file.h |1
2 files changed, 32 insertions(+), 15 deletions(-)
diff -puN fs/file_table.c~create
Some new uses of get_empty_filp() have crept in, and are
not properly taking mnt_want_write()s. This fixes them
up.
We really need to kill get_empty_filp().
Signed-off-by: Dave Hansen [EMAIL PROTECTED]
---
linux-2.6.git-dave/fs/anon_inodes.c| 16 ++--
linux-2.6.git-dave/fs
On Fri, 2007-11-09 at 16:26 -0500, Trond Myklebust wrote:
#include <linux/sunrpc/svc.h>
#include <linux/nfsd/nfsd.h>
#include <linux/nfsd/cache.h>
+#include <linux/file.h>
#include <linux/mount.h>
#include <linux/workqueue.h>
#include <linux/smp_lock.h>
@@ -1303,7 +1304,7 @@ static inline
On 07/09/2012 03:25 AM, Yasuaki Ishimatsu wrote:
@@ -642,7 +642,7 @@ int __ref add_memory(int nid, u64 start,
}
/* create new memmap entry */
- firmware_map_add_hotplug(start, start + size, "System RAM");
+ firmware_map_add_hotplug(start, start + size - 1, "System RAM");
I
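The one-character change in the hunk above suggests the end address is meant to be inclusive, i.e. the last byte of the range rather than one past it. A minimal sketch of that convention, assuming an inclusive end (the `toy_range` type and its helpers are hypothetical and not part of the firmware map code):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range type with an inclusive end address. */
struct toy_range {
	uint64_t start;
	uint64_t end;   /* last byte covered, inclusive */
};

/* Build an inclusive range from a start address and a non-zero size. */
struct toy_range toy_range_from_size(uint64_t start, uint64_t size)
{
	assert(size > 0);
	struct toy_range r = { start, start + size - 1 };
	return r;
}

/* Number of bytes covered by an inclusive range. */
uint64_t toy_range_size(struct toy_range r)
{
	return r.end - r.start + 1;
}
```

With an exclusive end, adjacent ranges would appear to overlap by one byte when read as inclusive, which is the kind of off-by-one the patch addresses.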
On 07/11/2012 09:52 PM, Yasuaki Ishimatsu wrote:
Does the following patch include your comment? If O.K., I will separate
the patch from the series and send it for bug fix.
Looks sane to me. It does now mean that the calling conventions for
some of the other firmware_map*() functions are
From: m...@skynet.ie (Mel Gorman)
PAGE_OWNER tracks free pages by setting page->order to -1. However, it is
set during __free_pages() which is not the only free path as
__pagevec_free() and free_compound_page() do not go through __free_pages().
This leads to a situation where free pages are
a new function slow_virt_to_phys(), which
walks the kernel page tables on x86 and should do precisely
the same logical thing as __pa(), but actually work on a wider
range of memory. It should work on the normal linear mapping,
vmalloc(), kmap(), etc...
Signed-off-by: Dave Hansen d
The KVM code has some repeated bugs in it around use of __pa() on
per-cpu data. Those data are not in an area on which __pa() is
valid. However, they are also called early enough in boot that
__vmalloc_start_set is not set, and thus the CONFIG_DEBUG_VIRTUAL
debugging does not catch them.
This
for the page fault (it
was injected by the host), assumed that the kernel had taken
a _real_ page fault, and panic()'d.
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/arch/x86/kernel/kvm.c |9 +
linux-2.6.git-dave/arch/x86/kernel/kvmclock.c |4 ++--
2 files
On 12/07/2012 02:26 PM, Andrew Morton wrote:
I have cunningly divined the intention of your update and have queued
the below incremental. The change to
pagetypeinfo_showmixedcount_print() was a surprise. What's that there
for?
Do you mean to ask why it's being modified at all here in this
On 12/07/2012 02:44 PM, Andrew Morton wrote:
AFAICT that difference was undescribed. I can see that the new version
uses the stack-tracing infrastructure, but the change to
pagetypeinfo_showmixedcount_print() is a mystery.
Ahhh, I assume you're talking about this hunk:
@@ -976,10 +976,7 @@
On 12/07/2012 03:51 PM, Andrew Morton wrote:
+static ssize_t node_read_memrange(struct device *dev,
+struct device_attribute *attr, char *buf)
+{
+ int nid = dev->id;
+ unsigned long start_pfn = NODE_DATA(nid)->node_start_pfn;
+ unsigned long end_pfn =
Hi Mel,
I'm chasing an apparent memory leak introduced post-3.6. The
interesting thing is that it appears that the pages are in the
allocator, but not being accounted for:
http://www.spinics.net/lists/linux-mm/msg46187.html
https://bugzilla.kernel.org/show_bug.cgi?id=50181
I
I'm really evil, so I changed the loop in compact_capture_page() to
basically steal the highest-order page it can. This shouldn't _break_
anything, but it does ensure that we'll be splitting pages that we find
more often and recreating this *MUCH* faster:
- for (order = cc->order;
. The amount leaked very closely tracks the
imbalance I see in buddy pages vs. NR_FREE_PAGES. I have
confirmed that this patch fixes the imbalance
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
Acked-by: Mel Gorman mgor...@suse.de
---
linux-2.6.git-dave/mm/page_alloc.c |2 +-
1 file changed, 1
On 11/26/2012 03:23 AM, Mel Gorman wrote:
On Wed, Nov 21, 2012 at 02:21:51PM -0500, Dave Hansen wrote:
This needs to make it in before 3.7 is released.
This is also required. Dave, can you double check? The surprise is that
this does not blow up very obviously.
...
@@ -1422,7 +1422,7
Hi Tejun,
I was bisecting a boot problem on a 32-bit NUMA kernel and it bisected
down to commit 8db78cc4. It turns out that, with this patch,
pcpu_need_numa() changed its return value on my system from 1 to 0.
What that basically meant was that we stopped using the remapped lowmem
areas for
a new function slow_virt_to_phys(), which
walks the kernel page tables on x86 and should do precisely
the same logical thing as __pa(), but actually work on a wider
range of memory. It should work on the normal linear mapping,
vmalloc(), kmap(), etc...
Signed-off-by: Dave Hansen d
for the page fault (it
was injected by the host), assumed that the kernel had taken
a _real_ page fault, and panic()'d.
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/arch/x86/kernel/kvm.c |9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff -puN arch/x86
I think the Kernel Hacking menu has gotten a bit out of hand. It
is over 120 lines long on my system with everything enabled and
options are scattered around it haphazardly.
http://sr71.net/~dave/linux/kconfig-horror.png
Let's try to introduce some sanity.
I believe the risk of a
having an
arch/foo/Kconfig.debug-memory might be taking things a bit too far
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/lib/Kconfig.debug | 702 +--
1 file changed, 356 insertions(+), 346 deletions(-)
diff -puN arch/x86/Kconfig.debug
the actual menu option.
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/arch/blackfin/Kconfig |1 +
linux-2.6.git-dave/arch/blackfin/Kconfig.debug |7 ---
linux-2.6.git-dave/arch/frv/Kconfig|1 +
linux-2.6.git-dave/arch/frv/Kconfig.debug
These were in two different places, and taking up too much of my
valuable screen real-estate. Banish them to their own menu.
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/lib/Kconfig.debug | 160 +--
1 file changed, 82 insertions
, configfs, or /proc.
Also, Debug filesystem sounds like a debugging option _for_
filesystems code, not a filesystem for debugging. We also never call
it the debug filesystem. We always say debugfs, so reflect the
fact that we _call_ it debugfs in the menu text.
Signed-off-by: Dave Hansen d
. This menu should only be used for tests
that do not have a more appropriate home.
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/lib/Kconfig.debug | 151 ++-
1 file changed, 78 insertions(+), 73 deletions(-)
diff -puN lib/Kconfig.debug
There are quite a few of these, and we want to make sure that
there is one-stop-shopping for lock debugging.
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/lib/Kconfig.debug | 120 ++-
1 file changed, 62 insertions(+), 58 deletions
even though I'm actually moving the options on
either side of it.
Signed-off-by: Dave Hansen d...@linux.vnet.ibm.com
---
linux-2.6.git-dave/lib/Kconfig.debug | 156 +--
1 file changed, 80 insertions(+), 76 deletions(-)
diff -puN lib/Kconfig.debug~consolidate
On 01/01/2013 09:52 AM, Seth Jennings wrote:
On 12/31/2012 05:06 PM, Dan Magenheimer wrote:
A second related issue that concerns me is that, although you
are now, like zcache2, using an LRU queue for compressed pages
(aka zpages), there is no relationship between that queue and
physical
On 01/02/2013 09:26 AM, Dan Magenheimer wrote:
However if one compares the total percentage
of RAM used for zpages by zswap vs the total percentage of RAM
used by slab, I suspect that the zswap number will dominate,
perhaps because zswap is storing primarily data and slab is
storing primarily
On 01/02/2013 08:28 PM, Minchan Kim wrote:
VOLATILE implies that the pages in the range aren't in the working set any more,
so it's pointless to make them THP/KSM.
One of the points of this implementation is that it be able to preserve
memory contents when there is no pressure. If those contents
On 12/12/2012 05:18 PM, Davidlohr Bueso wrote:
On Fri, 2012-12-07 at 16:17 -0800, Dave Hansen wrote:
Seems like the better way to do this would be to expose the DIMMs
themselves in some way, and then map _those_ back to a node.
Good point, and from a DIMM perspective, I agree, and will look
On 12/12/2012 06:03 PM, Davidlohr Bueso wrote:
On Wed, 2012-12-12 at 17:48 -0800, Dave Hansen wrote:
But if we went and did it per-DIMM (showing which physical addresses and
NUMA nodes a DIMM maps to), wouldn't that be redundant with this
proposed interface?
If DIMMs overlap between nodes