date:20070725

Re: updatedb

2007-07-25 Thread Rene Herman


On 07/26/2007 08:39 AM, Bongani Hlope wrote:

> On Thursday 26 July 2007 05:59:53 Rene Herman wrote:
>
>> So what's happening? If you sit down with a copy of "top" in one terminal
>> and updatedb in another, what does it show?
>
> Just tested that, there's a steady increase in the usage of buff


Great. Now concentrate on the "swpd" column, as it's the only thing relevant 
here. The fact that an updatedb run fills/replaces caches is completely and 
utterly unsurprising and not something swap-prefetch helps with. The only 
thing it does is bring back stuff from _swap_.


Rene.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [GIT PATCH] ACPI patches for 2.6.23-rc1

2007-07-25 Thread Linus Torvalds

On Thu, 26 Jul 2007, Len Brown wrote:
> 
> Feel free to share what you know about the benefits vs. the costs
> of maintaining CONFIG_ACPI_SLEEP as a build option.

Why don't you just make CONFIG_ACPI_SLEEP dependent on SOFTWARE_SUSPEND 
and STR?

> If you feel that your system has been degraded
> because it now includes what used to be excluded under
> CONFIG_ACPI_SLEEP=n, please let me know how.

I feel that I get asked to include a feature that 
 (a) I have no interest in on that machine
 (b) I didn't need to include before.

What was the advantage? And what was it that caused something like this to 
be a post-rc1 thing. That makes me really unhappy. This is a *regression*.

Linus

[PATCH] sysfs - cleanup semaphore.h

2007-07-25 Thread Dave Young

Cleanup semaphore.h

Signed-off-by: Dave Young <[EMAIL PROTECTED]> 

---
fs/sysfs/bin.c |2 +-
fs/sysfs/dir.c |2 +-
fs/sysfs/group.c   |1 -
fs/sysfs/inode.c   |1 -
fs/sysfs/mount.c   |1 -
fs/sysfs/symlink.c |2 +-
6 files changed, 3 insertions(+), 6 deletions(-)

diff -upr linux/fs/sysfs/bin.c linux.new/fs/sysfs/bin.c
--- linux/fs/sysfs/bin.c2007-07-26 14:31:57.0 +
+++ linux.new/fs/sysfs/bin.c2007-07-26 14:35:22.0 +
@@ -14,9 +14,9 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
-#include 
 
 #include "sysfs.h"
 
diff -upr linux/fs/sysfs/dir.c linux.new/fs/sysfs/dir.c
--- linux/fs/sysfs/dir.c2007-07-26 14:31:57.0 +
+++ linux.new/fs/sysfs/dir.c2007-07-26 14:35:47.0 +
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "sysfs.h"
 
 DEFINE_MUTEX(sysfs_mutex);
diff -upr linux/fs/sysfs/group.c linux.new/fs/sysfs/group.c
--- linux/fs/sysfs/group.c  2007-07-26 14:31:57.0 +
+++ linux.new/fs/sysfs/group.c  2007-07-26 14:36:17.0 +
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "sysfs.h"
 
 
diff -upr linux/fs/sysfs/inode.c linux.new/fs/sysfs/inode.c
--- linux/fs/sysfs/inode.c  2007-07-26 14:31:57.0 +
+++ linux.new/fs/sysfs/inode.c  2007-07-26 14:36:38.0 +
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include 
 #include "sysfs.h"
 
 extern struct super_block * sysfs_sb;
diff -upr linux/fs/sysfs/mount.c linux.new/fs/sysfs/mount.c
--- linux/fs/sysfs/mount.c  2007-07-26 14:31:57.0 +
+++ linux.new/fs/sysfs/mount.c  2007-07-26 14:36:45.0 +
@@ -8,7 +8,6 @@
 #include 
 #include 
 #include 
-#include 
 
 #include "sysfs.h"
 
diff -upr linux/fs/sysfs/symlink.c linux.new/fs/sysfs/symlink.c
--- linux/fs/sysfs/symlink.c2007-07-26 14:31:57.0 +
+++ linux.new/fs/sysfs/symlink.c2007-07-26 14:37:12.0 +
@@ -7,7 +7,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "sysfs.h"
 

Re: -mm merge plans for 2.6.23

2007-07-25 Thread Andrew Morton

On Wed, 25 Jul 2007 23:33:24 -0700 "Ray Lee" <[EMAIL PROTECTED]> wrote:

> > So.  We can
> >
> > a) provide a way for userspace to reload pagecache and
> >
> > b) merge maps2 (once it's finished) (pokes mpm)
> >
> > and we're done?
> 
> Eh, dunno. Maybe?
> 
> We're assuming we come up with an API for userspace to get
> notifications of evictions (without polling, though poll() would be
> fine -- you know what I mean), and an API for re-victing those things
> on demand.

I was assuming that polling would work OK.  I expect it would.

> If you think that adding that API and maintaining it is
> simpler/better than including a variation on the above heuristic I
> offered, then yeah, I guess we are. It'll all have that vague
> userspace s2ram odor about it, but I'm sure it could be made to work.

Actually, I overdesigned the API, I suspect.  What we _could_ do is to
provide a way of allowing userspace to say "pretend process A touched page
B": adopt its mm and go touch the page.  We in fact already have that:
PTRACE_PEEKTEXT.

So I suspect this could all be done by polling maps2 and using PEEKTEXT. 
The tricky part would be working out when to poll, and when to reestablish.

A neater implementation than PEEKTEXT would be to make the maps2 files
writeable(!) so as a party trick you could tar 'em up and then, when you
want to reestablish firefox's previous working set, do an untar in
/proc/$(pidof firefox)/
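
The "pretend process A touched page B" idea can be sketched from userspace with ptrace(2). This is only an illustration under assumptions: touch_remote_page() and demo_touch_child() are hypothetical helper names, the caller must be permitted to trace the target, and the target is briefly stopped while attached. It says nothing about when to poll or when to reestablish, which is the tricky part named above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: read one word at addr in pid's address space.
 * To satisfy PTRACE_PEEKTEXT the kernel has to bring the page in on the
 * tracee's behalf. Returns 0 on success or a negative errno value. */
int touch_remote_page(pid_t pid, void *addr)
{
    int status;
    int rc = 0;

    assert(pid > 0);
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
        return -errno;

    /* PTRACE_ATTACH stops the tracee; wait until it has stopped. */
    if (waitpid(pid, &status, 0) == -1) {
        rc = -errno;
    } else if (!WIFSTOPPED(status)) {
        rc = -ESRCH;
    } else {
        /* The word read may legitimately be -1, so errno is the only
         * reliable error indicator for PEEK requests. */
        errno = 0;
        (void)ptrace(PTRACE_PEEKTEXT, pid, addr, NULL);
        if (errno != 0)
            rc = -errno;
    }

    ptrace(PTRACE_DETACH, pid, NULL, NULL);
    return rc;
}

static long sentinel = 42;

/* Demo: fork a child, touch the page holding `sentinel` in the child's
 * address space, then clean the child up. */
int demo_touch_child(void)
{
    pid_t child = fork();
    int rc;

    if (child == -1)
        return -errno;
    if (child == 0) {
        for (;;)
            pause();        /* the child just idles until it is killed */
    }
    rc = touch_remote_page(child, &sentinel);
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);
    return rc;
}
```

A polling daemon built on this would still need the maps2 data to decide which addresses are worth touching; the sketch only covers the "go touch the page" half.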


Re: What's does KPROBE_ENTRY mean?

2007-07-25 Thread jidong xiao

Thanks. So if I don't care about probes, and I don't actually need to
make use of kprobes, then I can use the functions defined through
KPROBE_ENTRY() the same way as those defined via ENTRY(), right?

Regards
Jason Xiao

On 7/26/07, Paul Mundt <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 26, 2007 at 01:43:10PM +0800, jidong xiao wrote:
> > Can anyone help with this?
> >
> > On 6/21/07, jidong xiao <[EMAIL PROTECTED]> wrote:
> > > I searched Linux kernel 2.6.10 and didn't find it, but it is there
> > > in 2.6.20. I am not familiar with assembly language, so can anybody
> > > kindly explain the difference between KPROBE_ENTRY and ENTRY? I can
> > > find both of them in some files, such as
> > > arch/x86_64/kernel/entry.S.
> > >
> KPROBE_ENTRY() is the assembly equivalent of __kprobes, it places the
> symbol in a special section (.kprobes.text) where probes can't be
> inserted. This is usually helpful in cases where inserting the probe may
> lead to recursion or other undesirable behaviour.
>
> See include/linux/linkage.h and include/linux/kprobes.h.
>
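
The section-placement mechanism described above can be illustrated in userspace. This is a minimal sketch, assuming GCC or Clang on an ELF/Linux target: the section name noprobe_text, the NOPROBE macro and in_noprobe_section() are hypothetical stand-ins (the kernel uses .kprobes.text and takes the section bounds from its linker script). The linker-provided __start_/__stop_ symbols exist only for section names that are valid C identifiers, which is why the name here has no dots.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __kprobes annotation: place the
 * function in a dedicated section instead of the default .text. */
#define NOPROBE __attribute__((section("noprobe_text"), noinline, used))

/* For a section whose name is a valid C identifier, the linker
 * synthesizes these symbols marking the section's bounds. */
extern const char __start_noprobe_text[];
extern const char __stop_noprobe_text[];

NOPROBE int fragile_path(int x)
{
    return x + 1;
}

int ordinary_path(int x)
{
    return x + 2;
}

/* Model of the check a probe-registration path could make: refuse any
 * address that falls inside the protected section. */
int in_noprobe_section(uintptr_t addr)
{
    return addr >= (uintptr_t)__start_noprobe_text &&
           addr <  (uintptr_t)__stop_noprobe_text;
}

/* Small self-check showing the expected classification. */
void demo_section_check(void)
{
    assert(in_noprobe_section((uintptr_t)fragile_path));
    assert(!in_noprobe_section((uintptr_t)ordinary_path));
}
```

Note that fragile_path() is called like any other function; only its placement differs. That matches the question in this thread: if no probes are involved, a symbol declared through KPROBE_ENTRY() is used the same way as one declared through ENTRY().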

Re: updatedb

2007-07-25 Thread Bongani Hlope

On Thursday 26 July 2007 05:59:53 Rene Herman wrote:
>
> Problem spot no. 1.
>
> RAM intensive? If I run updatedb here, it never grows itself beyond 2M.
> Yes, two. I'm certainly willing to accept that me and my systems are
> possibly not the reference but assuming I'm _very_ special hasn't done much
> for me either in the past.
>
> The thing updatedb does do, or at least has the potential to do, is fill
> memory with cached inodes/dentries but Linux does not swap to make room for
> caches. So why will updatedb "often cause things to be swapped out"?
>
> [ snip ]
>
> > Swap prefetch, on the other hand, would have kicked in shortly after
> > updatedb finished, leaving the applications in swap for a speedy
> > recovery when the person comes back to their computer.
>
> Problem spot no. 2.
>
> If updatedb filled all of RAM with inodes/dentries, that RAM is now used
> (ie, not free) and swap-prefetch wouldn't have anywhere to prefetch into so
> would _not_ have kicked in.
>
> So what's happening? If you sit down with a copy of "top" in one terminal
> and updatedb in another, what does it show?
>
> Rene.

Just tested that, there's a steady increase in the usage of buff


procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  1      0 1279412 201160 234720    0    0   193    29  558  657  5  1 89  5
 0  1      0 1276624 203436 234872    0    0  2276     0 1638 2456  4  2 48 48
 1  1      0 1273372 206292 235012    0    0  2852     0 1773 2755  3  3 48 46
 2  1      0 1270128 208376 235360    0    0  2084     0 1545 2168  5  2 47 46

8<

 0  1      0 1228004 237288 237836    0    0  2192     0 1669 2941  6  3 47 44
 1  1      0 1223424 239228 238020    0    0  1932   272 1580 2881  9  4 44 44
 1  1      0 1219692 241600 238208    0    0  2372     0 1719 2881 10  4 45 43
 0  1      0 1217296 243372 238312    0    0  1772     0 1526 2320  4  2 49 46

8<

 0  1      0 1166852 277912 240840    0    0  2244     0 1699 3037  7  2 48 43
 0  1      0 1164016 279528 241016    0    0  1608   824 1512 2364  7  2 47 44
 1  1      0 1161256 281860 241264    0    0  2332     0 1709 2769  7  2 49 43
 1  1      0 1155632 284792 241452    0    0  2932     0 1835 3084  8  4 46 42

8<

 0  1      0 1104568 324788 243616    0    0  3500     4 1879 3054  5  4 46 46
 1  1      0 1099596 328524 243768    0    0  3736     0 1990 3257  7  4 48 43
 1  1      0 1093976 332516 244060    0    0  3984   572 2013 3348  6  3 48 43
 0  1      0 1090320 335396 244340    0    0  2880     0 1760 2925  5  3 47 46

8<

 1  1      0 1025212 384380 248224    0    0  2940     0 1763 2864  6  3 46 46
 0  1      0 1022196 386444 248328    0    0  2064     8 1527 2543  5  2 45 47
 0  1      0 1018620 389476 248404    0    0  3032     0 1798 2988  6  3 47 45
 0  1      0 1014800 392364 248552    0    0  2888     0 1738 2821  5  2 48 45

8<

 0  1      0 425200 839828 273392    0    0  1744     0 1441 2248  9  2 44 46
 0  1      0 423360 841220 273544    0    0  1384   368 1374 2144  3  1 48 48
 0  1      0 421288 842868 273576    0    0  1648     0 1400 2141  4  2 46 48
 0  1      0 418252 845172 273676    0    0  2300     0 1570 2492  3  1 49 48
 0  0      0 417300 846100 273776    0    0   928     0 1232 1837  3  2 72 24

 0  0      0 416724 846100 273776    0    0     0     0 1025 1579  5  1 94  0
 0  0      0 417012 846100 273776    0    0     0     0 1002 1474  3  1 97  0
 1  0      0 417220 846100 273776    0    0     0     0 1026 1414  2  0 98  0

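So free memory fell to about 32 percent of its starting value (1279412 kB
down to 417220 kB), and roughly three quarters of that drop went to the
buffers (201160 kB up to 846100 kB).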

5 minutes later it's still not freed

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 2  0      0 409500 846652 277320    0    0   286    31  585  766  6  1 83 10
 1  0      0 409328 846652 277320    0    0     0     0 1003 1442  3  1 97  0

/proc/slabinfo
ext3_inode_cache  176198 176200    816    5    1 : tunables   54   27    8 : slabdata  35240  35240      0
dentry            233054 233054    208   19    1 : tunables  120   60    8 : slabdata  12266  12266      0
buffer_head       228303 228327    104   37    1 : tunables  120   60    8 : slabdata   6171   6171      0

run OpenOffice

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 1  0      0 403664 847056 277460    0    0   235    26  577  766  6  1 85  8
 0  0      0 403656 847056 277460    0    0     0     0 1003 1385  5  0 96  0
 0  0      0 403888 847056 277460    0    0     0     0 1237 1968  3  1 96  0

8<

 0  0      0 400708 847088 277620    0    0     0     0 1058 1259  4  0 95  0
 0  0      0 400584 847088 277620    0    0     0     0 1246 1647  7  1 93  0
 1  1      0 389796 847164 284100    0    0  6528   116 1215 2663 10  4 71 14

8<

 0  0  0 307000 847464 361

Re: [PATCH][resend] sysfs/file.c - use mutex instead of semaphore

2007-07-25 Thread Dave Young

>On 7/26/07, Greg KH <[EMAIL PROTECTED]> wrote:
> On Thu, Jul 26, 2007 at 11:03:54AM +, Dave Young wrote:
> > Use mutex instead of semaphore in sysfs/file.c : sys_buffer.
>
> Thanks, it's in my queue, but I'm at a conference this week, so I'll get
> to it on monday, sorry for the delay.
>
Hi, thank you. I wasn't sure why it was being ignored; I thought maybe
it should be split into two patches.

Andrew has added this one to the -mm tree. I'm happy to hear from you. I
will send the separate header cleanup patch again. Gmail converts tabs
to spaces, so I have to send another post.

Re: [PATCH 13/16] Switch to operating with pid_numbers instead of pids

2007-07-25 Thread Pavel Emelyanov


[EMAIL PROTECTED] wrote:

Pavel Emelianov [EMAIL PROTECTED] wrote:
| [EMAIL PROTECTED] wrote:
| >Pavel Emelianov [EMAIL PROTECTED] wrote:
| >| Make alloc_pid() initialize pid_numbers and hash them
| >| into the hashtable, not the struct pid itself.
| >| 
| >| Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]>
| >| 
| >| ---
| >| 
| >|  pid.c |   47 +--

| >|  1 files changed, 33 insertions(+), 14 deletions(-)
| >| 
| >| --- ./kernel/pid.c.ve12	2007-07-05 11:06:41.0 +0400

| >| +++ ./kernel/pid.c  2007-07-05 11:08:23.0 +0400
| >| @@ -28,8 +28,10 @@
| >|  #include 
| >|  #include 
| >|  #include 
| >| +#include 
| >| 
| >| -#define pid_hashfn(nr) hash_long((unsigned long)nr, pidhash_shift)

| >| +#define pid_hashfn(nr, ns) \
| >| +   hash_long((unsigned long)nr + (unsigned long)ns, pidhash_shift)
| >|  static struct hlist_head *pid_hash;
| >|  static int pidhash_shift;
| >|  struct pid init_struct_pid = INIT_STRUCT_PID;
| >| @@ -194,7 +198,7 @@ fastcall void put_pid(struct pid *pid)
| >| if (!pid)
| >| return;
| >| 
| >| -	ns = pid->numbers[0].ns;

| >| +   ns = pid->numbers[pid->level].ns;
| >| if ((atomic_read(&pid->count) == 1) ||
| >|  atomic_dec_and_test(&pid->count))
| >| kmem_cache_free(ns->pid_cachep, pid);
| >| @@ -210,13 +214,17 @@ static void delayed_put_pid(struct rcu_h
| >|  fastcall void free_pid(struct pid *pid)
| >|  {
| >| /* We can be called with write_lock_irq(&tasklist_lock) held */
| >| +   int i;
| >| unsigned long flags;
| >| 
| >|  	spin_lock_irqsave(&pidmap_lock, flags);

| >| -   hlist_del_rcu(&pid->pid_chain);
| >| +   for (i = 0; i <= pid->level; i++)
| >| +   hlist_del_rcu(&pid->numbers[i].pid_chain);
| >| spin_unlock_irqrestore(&pidmap_lock, flags);
| >| 
| >| -	free_pidmap(&init_pid_ns, pid->nr);

| >| +   for (i = 0; i <= pid->level; i++)
| >| +   free_pidmap(pid->numbers[i].ns, pid->numbers[i].nr);
| >| +
| >| call_rcu(&pid->rcu, delayed_put_pid);
| >|  }
| >| 
| >| @@ -224,30 +232,43 @@ struct pid *alloc_pid(struct pid_namespa

| >|  {
| >| struct pid *pid;
| >| enum pid_type type;
| >| -   int nr = -1;
| >| +   struct pid_namespace *ns;
| >| +   int i, nr;
| >| 
| >| -	pid = kmem_cache_alloc(init_pid_ns.pid_cachep, GFP_KERNEL);

| >| +   pid = kmem_cache_alloc(pid_ns->pid_cachep, GFP_KERNEL);
| >| if (!pid)
| >| goto out;
| >| 
| >| -	nr = alloc_pidmap(current->nsproxy->pid_ns);

| >| -   if (nr < 0)
| >| -   goto out_free;
| >| +   ns = pid_ns;
| >| +   for (i = pid_ns->level; i >= 0; i--) {
| >| +   nr = alloc_pidmap(ns);
| >| +   if (nr < 0)
| >| +   goto out_free;
| >
| >If pid_ns->level is say 3 and alloc_pidmap() succeeds when i=0,1
| 
| It cannot :) If level is 3, then we'll allocate for 3, 2, 1, 0 sequence.

| The loop is descending, not ascending...

Aah descending - that's right. But I still think there is a problem.

Here is your code that I am referring to:

        pid = kmem_cache_alloc(pid_ns->pid_cachep, GFP_KERNEL);
        if (!pid)
                goto out;

        ns = pid_ns;
        for (i = pid_ns->level; i >= 0; i--) {
                nr = alloc_pidmap(ns);
                if (nr < 0)
                        goto out_free;

                pid->numbers[i].nr = nr;
                pid->numbers[i].ns = ns;
                ns = ns->parent;
        }

        pid->level = pid_ns->level;



out:
        return pid;
out_free:
        for (i++; i <= pid->level; i++)
                free_pidmap(pid->numbers[i].ns, pid->numbers[i].nr);

        kmem_cache_free(pid_ns->pid_cachep, pid);
        pid = NULL;
        goto out;



Let's say initially pid_ns->level = 3 and alloc_pidmap() succeeds for
i=3 and i=2 but fails for i=1 and we execute "goto out_free".

But pid->level is uninitialized at this point, right?

Even if it were set to zero (using kmem_cache_zalloc()), we would not
free the two pidmap entries we allocated for i=3 and i=2.


Yes. I found this after a detailed look at the code and (I hope) fixed it.


Suka



Thanks,
Pavel
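
The failure path discussed in this message can be modelled in userspace. The thread does not show the fix that was actually applied, so the sketch below is only one possible way to unwind: it bounds the cleanup loop by pid_ns->level, which is known on entry, instead of pid->level, which has not been assigned when the error path runs. Every model_ function and both struct types are hypothetical stand-ins for the kernel structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_MAX_LEVEL 8

/* Hypothetical stand-in for struct pid_namespace. */
struct ns_model {
    int level;
    struct ns_model *parent;
    int next_nr;    /* next number to hand out */
    int live;       /* numbers currently allocated in this namespace */
    int fail;       /* when set, allocation in this namespace fails */
};

struct upid_model { int nr; struct ns_model *ns; };

/* Hypothetical stand-in for struct pid. */
struct pid_model {
    int level;
    struct upid_model numbers[MODEL_MAX_LEVEL + 1];
};

static int model_alloc_pidmap(struct ns_model *ns)
{
    if (ns->fail)
        return -1;
    ns->live++;
    return ns->next_nr++;
}

static void model_free_pidmap(struct ns_model *ns, int nr)
{
    (void)nr;
    ns->live--;
}

struct pid_model *model_alloc_pid(struct ns_model *pid_ns)
{
    struct pid_model *pid;
    struct ns_model *ns = pid_ns;
    int i, nr;

    assert(pid_ns->level >= 0 && pid_ns->level <= MODEL_MAX_LEVEL);
    pid = malloc(sizeof(*pid));
    if (!pid)
        return NULL;

    for (i = pid_ns->level; i >= 0; i--) {
        nr = model_alloc_pidmap(ns);
        if (nr < 0)
            goto out_free;
        pid->numbers[i].nr = nr;
        pid->numbers[i].ns = ns;
        ns = ns->parent;
    }
    pid->level = pid_ns->level;
    return pid;

out_free:
    /* Levels i+1 .. pid_ns->level were filled in before the failure.
     * pid->level is still unset here, so it must not bound the loop. */
    for (i++; i <= pid_ns->level; i++)
        model_free_pidmap(pid->numbers[i].ns, pid->numbers[i].nr);
    free(pid);
    return NULL;
}
```

With a four-level chain where level 1 refuses to allocate, the entries taken at levels 3 and 2 are released again and every namespace's live count returns to zero.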


Re: -mm merge plans for 2.6.23

2007-07-25 Thread Ray Lee

On 7/25/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Wed, 25 Jul 2007 09:09:01 -0700
> "Ray Lee" <[EMAIL PROTECTED]> wrote:
>
> > No, there's a third case which I find the most annoying. I have
> > multiple working sets, the sum of which won't fit into RAM. When I
> > finish one, the kernel had time to preemptively swap back in the
> > other, and yet it didn't. So, I sit around, twiddling my thumbs,
> > waiting for my music player to come back to life, or thunderbird,
> > or...
>
> Yes, I'm thinking that's a good problem statement and it isn't something
> which the kernel even vaguely attempts to address, apart from normal
> demand paging.
>
> We could perhaps improve things with larger and smarter fault readaround,
> perhaps guided by refault-rate measurement. But that's still demand-paged
> rather than being proactive/predictive/whatever.
>
> None of this is swap-specific though: exactly the same problem would need
> to be solved for mmapped files and even plain old pagecache.

Could be what I'm noticing, but it's important to note that as
others have shown improvement with Con's swap prefetch, it's easily
arguable that targeting just swap is good enough for a first
approximation.

> In fact I'd restate the problem as "system is in steady state A, then there
> is a workload shift causing transition to state B, then the system goes
> idle. We now wish to reinstate state A in anticipation of a resumption of
> the original workload".

Yes, that's a fair transformation / generalization. It's always nice
talking to someone with more clarity than one's self.

> swap-prefetch solves a part of that.
>
> A complete solution for anon and file-backed memory could be implemented
> (ta-da) in userspace using the kernel inspection tools in -mm's maps2-*
> patches. We would need to add a means by which userspace can repopulate
> swapcache,

Okay, let's run with that for argument's sake.

> but that doesn't sound too hard (especially when you haven't
> thought about it).

I've always thought your sense of humor was underappreciated.

> And userspace can right now work out which pages from which files are in
> pagecache so this application can handle pagecache, swap and file-backed
> memory. (file-backed memory might not even need special treatment, given
> that it's pagecache anyway).
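
The point that userspace can already find out which pages of a file are in pagecache can be sketched with mincore(2). This is an illustration only: resident_pages() is a hypothetical helper name, it maps the whole file read-only without touching it, and it covers the inspection side only, not the repopulation side that would still have to be added.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count the pages of the file at `path` that
 * mincore(2) reports as resident. Stores the file's total page count in
 * *total (if non-NULL). Returns the resident count, or -1 on error. */
long resident_pages(const char *path, long *total)
{
    long page = sysconf(_SC_PAGESIZE);
    long resident = -1;
    struct stat st;
    int fd;

    assert(path != NULL);
    if (page <= 0)
        return -1;
    fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        size_t len = (size_t)st.st_size;
        size_t npages = (len + (size_t)page - 1) / (size_t)page;
        /* Mapping alone does not fault pages in, so it does not disturb
         * the residency we are about to measure. */
        void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

        if (map != MAP_FAILED) {
            unsigned char *vec = malloc(npages);

            if (vec != NULL && mincore(map, len, vec) == 0) {
                size_t i;

                resident = 0;
                for (i = 0; i < npages; i++)
                    resident += vec[i] & 1;   /* bit 0: page is resident */
            }
            free(vec);
            munmap(map, len);
            if (total != NULL)
                *total = (long)npages;
        }
    }
    close(fd);
    return resident;
}
```

A daemon of the kind discussed here would record such residency vectors while the system is in its usual state and compare them again once the transient workload has ended.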

So in your proposed scheme, would userspace be polling, er, , well, /proc//something_or_another?

A userspace daemon that wakes up regularly to poll a bunch of proc
files fills me with glee. Wait, is that glee? I think, no... wait...
horror, yes, horror is what I'm feeling.

I'm wrong, right? I love being wrong about this kind of stuff.

> And userspace can do a much better implementation of this
> how-to-handle-large-load-shifts problem, because it is really quite
> complex. The system needs to be monitored to determine what is the "usual"
> state (ie: the thing we wish to reestablish when the transient workload
> subsides). The system then needs to be monitored to determine when the
> exceptional workload has started, and when it has subsided, and userspace
> then needs to decide when to start reestablishing the old working set, at
> what rate, when to abort doing that, etc.

Oy. I mean this in the most respectful way possible, but you're too
smart for your own good.

I mean, sure, it's possible one could have multiply-chained transient
workloads each of which has its optimum workingset, of which
there's little overlap with the previous. Mainframes made their names
on such loads. Workingset A starts, generates data, finishes and
invokes workingset B, of which the only thing they share in common is
said data. B finishes and invokes C, etc.

So, yeah, that's way too complex to stuff into the kernel. Even if it
were possible to do so, I cringe at the thought. And I can't believe
that would be a common enough pattern nowadays to justify any
heuristics on anyone's part. It's certainly complex enough that I'd
like to punt that scenario out of the conversation entirely -- I think
it has the potential to give a false impression as to how involved of
a process we're talking about here.

Let's go back to your restatement:

I'll take an 80% solution for that one problem, and happily declare
that the kernel's job is done. In particular, when a resource hog
exits (or whatever heuristics prefetch is currently hooking in to),
the kernel (or userspace, if that interface could be made sane) could
exercise a completely workload agnostic refetch of the last n things
evicted, where n is determined by what's suddenly become free (or
whatever Con came up with).

Just, y'know, MRU style.

> All this would end up needing runtime configurability and tweakability and
> customisability. All standard fare for userspace stuff - much easier than
> patching the kernel.
