date:20071207

Re: [PATCH 09/20] drivers/s390/: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-07 Thread Heiko Carstens

On Thu, Dec 06, 2007 at 11:19:41PM +0800, Denis Cheng wrote:
> A single list_head variable initialized with LIST_HEAD_INIT can almost
> always be replaced with a LIST_HEAD declaration; this shrinks the code
> and looks better.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>
> ---
>  drivers/s390/block/dcssblk.c  |2 +-
>  drivers/s390/char/raw3270.c   |4 ++--
>  drivers/s390/char/tape_core.c |2 +-
>  drivers/s390/net/netiucv.c|3 +--
>  drivers/s390/net/smsgiucv.c   |2 +-
>  5 files changed, 6 insertions(+), 7 deletions(-)

Thanks, applied. I added the possible change in arch/s390/mm/extmem.c
to your patch.
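
A minimal userspace sketch of the conversion this series makes, for readers unfamiliar with the two macros. The macro definitions are re-declared here so the snippet compiles outside the kernel; the variable names and the list_empty() helper are illustrative and not taken from the patched files.

```c
#include <stddef.h>

/* Userspace copy of the kernel list head, for illustration only. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Old style: explicit declaration plus initializer. */
static struct list_head old_style_list = LIST_HEAD_INIT(old_style_list);

/* New style: one macro declares and initializes the head. */
static LIST_HEAD(new_style_list);

/* An empty list is a head whose next pointer refers back to itself. */
int list_empty(const struct list_head *head)
{
	return head->next == head;
}
```

Both declarations produce the same object; the second form simply avoids writing the variable name twice.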
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Add LZO compression support to cryptoapi

2007-12-07 Thread Herbert Xu

On Wed, Dec 05, 2007 at 12:24:16PM +0100, Zoltan Sogor wrote:
>
> I've modified the patch as you suggested and added another patch which adds
> a common compression test function (modifies deflate test case to use the
> common function).

Both applied to cryptodev-2.6.  Thanks a lot Zoltan!
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH 17/20] net/xfrm/xfrm_state.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-07 Thread David Miller

From: Denis Cheng <[EMAIL PROTECTED]>
Date: Fri,  7 Dec 2007 00:09:43 +0800

> A single list_head variable initialized with LIST_HEAD_INIT can almost
> always be replaced with a LIST_HEAD declaration; this shrinks the code
> and looks better.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>

Applied.

Re: [PATCH 16/20] net/x25/: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-07 Thread David Miller

From: Denis Cheng <[EMAIL PROTECTED]>
Date: Fri,  7 Dec 2007 00:07:19 +0800

> A single list_head variable initialized with LIST_HEAD_INIT can almost
> always be replaced with a LIST_HEAD declaration; this shrinks the code
> and looks better.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>

Applied.

Re: [PATCH 14/20] net/ipv4/cipso_ipv4.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-07 Thread David Miller

From: Denis Cheng <[EMAIL PROTECTED]>
Date: Fri,  7 Dec 2007 00:04:36 +0800

> A single list_head variable initialized with LIST_HEAD_INIT can almost
> always be replaced with a LIST_HEAD declaration; this shrinks the code
> and looks better.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>

Applied.

Re: Kernel 2.6.23.9 + mdadm 2.6.2-2 + Auto rebuild RAID1?

2007-12-07 Thread Jan Engelhardt


On Dec 7 2007 07:30, Nix wrote:
>On 6 Dec 2007, Jan Engelhardt verbalised:
>> On Dec 5 2007 19:29, Nix wrote:
>>>> On Dec 1 2007 06:19, Justin Piszcz wrote:
>>>>
>>>>> RAID1, 0.90.03 superblocks (in order to be compatible with LILO, if
>>>>> you use 1.x superblocks with LILO you can't boot)
>>>>
>>>> Says who? (Don't use LILO ;-)
>>>
>>>Well, your kernels must be on a 0.90-superblocked RAID-0 or RAID-1
>>>device. It can't handle booting off 1.x superblocks nor RAID-[56]
>>>(not that I could really hope for the latter).
>>
>> If the superblock is at the end (which is the case for 0.90 and 1.0),
>> then the offsets for a specific block on /dev/mdX match the ones for
>> /dev/sda, so it should be "easy" to use lilo on 1.0 too, no?
>
>Sure, but you may have to hack /sbin/lilo to convince it to create the
>superblock there at all. It's likely to recognise that this is an md
>device without a v0.90 superblock and refuse to continue. (But I haven't
>tested it.)
>
In that case, see above - move to a different bootloader.

Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > # bad: [6f37ac793d6ba7b35d338f791974166f67fdd9ba] Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
> > git-bisect bad 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> > # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap INLINE and name timeval_cmp better
> > git-bisect good 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3

> I'm struggling to see how any of those could have broken block device 
> mounting on alpha.  Are you sure you bisected right?

the bisection log looks healthy so far - with nicely alternating 
good/bad bisection points. Barring the possibility that the bug is 
non-deterministic, i'd guess the bisection points are OK, at least 
judging from their statistical properties.

but ... i went over the diffs too, and i fail to see how they could 
affect the bootup path of an Alpha box, which i suspect has no 
networking dependency up to the failure point.

Ingo

Re: sockets affected by IPsec always block (2.6.23)

2007-12-07 Thread Stefan Rompf

On Friday, 7 December 2007 04:20, David Miller wrote:

> If IPSEC takes a long time to resolve, and we don't block, the
> connect() can hard fail (we will just keep dropping the outgoing SYN
> packet send attempts, eventually hitting the retry limit) in cases
> where if we did block it would not fail (because we wouldn't send
> the first SYN until IPSEC resolved).

David - I'm aware of this; the discussion is about which behaviour is OK. Let's go
back to a real-life example. I've already established that the squid web proxy
has a poll()-based main loop doing nonblocking connects, maybe with multiple
threads.

Situation: One user wants to access a web page that needs IPSEC. The SA takes 
30 seconds to come up.

a) Non-blocking connect is respected: SYN packets during the first 30 seconds 
will be dropped as you said. Connection can be completed on the next SYN 
retry (timeout in linux: 3 minutes). During this time, the 500 other users 
can continue to browse using the proxy.

b) Non-blocking connect is ignored during IPSEC resolving, as you advocate:
Connection for the one user can be completed immediately after IPSEC comes up.
That's the pro. However, until then, the other 500 proxy users CANNOT ACCESS
THE WEB because squid's threads are stuck in connect()s on sockets they
configured not to block. If the IPSEC SA never resolves due to some network
outage, squid will sleep forever, or until an admin reconfigures it not to
connect to the address in question and restarts it.

Don't you realize how broken this behaviour is? Can you give me ONE example of
an application that works better with b), and why this outweighs the problems
it creates for everybody else?

Even the DNS example you posted in
<[EMAIL PROTECTED]> is wrong, because the second
server will never be queried if the kernel puts the process into a coma while the
IPSEC SA to the first server cannot be resolved.

Stefan
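
The pattern that case a) relies on can be sketched in userspace as follows. This is a minimal illustration, assuming IPv4 and POSIX sockets; set_nonblocking(), start_connect() and finish_connect() are hypothetical helper names, not squid code.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Start a TCP connect without waiting for it to finish.
 * Returns the socket, or -1 on immediate failure. EINPROGRESS is the
 * expected "still connecting" result and is not treated as an error.
 */
int start_connect(const char *ipv4, unsigned short port)
{
	struct sockaddr_in sa;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;
	memset(&sa, 0, sizeof(sa));
	sa.sin_family = AF_INET;
	sa.sin_port = htons(port);
	if (set_nonblocking(fd) < 0 ||
	    inet_pton(AF_INET, ipv4, &sa.sin_addr) != 1 ||
	    (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 &&
	     errno != EINPROGRESS)) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Wait up to timeout_ms for the connect to complete.
 * Returns 1 if connected, 0 if still pending, -1 on failure.
 */
int finish_connect(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int err = 0;
	socklen_t len = sizeof(err);
	int n = poll(&pfd, 1, timeout_ms);

	if (n <= 0)
		return n;	/* 0: still pending, -1: poll() failed */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0 || err != 0)
		return -1;
	return 1;
}
```

An event loop would call start_connect() for each new upstream, add the descriptor to its poll() set, and keep serving other clients in the meantime. Under behaviour b), the connect() call inside start_connect() is where the thread would sleep despite O_NONBLOCK.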

[GIT PULL] LED bugfix

2007-12-07 Thread Richard Purdie

Linus,

Could you please pull from:

git://git.o-hand.com/linux-rpurdie-leds for-linus

This is an LED trigger locking fix for 2.6.24. This fixes the issues
discussed in bug 9264, the change has been tested in -mm.

Thanks, Richard

 drivers/leds/led-class.c|6 ++---
 drivers/leds/led-triggers.c |   49 ++--
 include/linux/leds.h|3 +-
 3 files changed, 30 insertions(+), 28 deletions(-)

Richard Purdie (1):
  leds: Fix led trigger locking bugs




Re: [patch] ext2: xip check fix

2007-12-07 Thread Jared Hulbert

> > I think so.  The filemap_xip.c functionality doesn't work for Flash
> > memory yet.  Flash memory doesn't have struct pages to back it up,
> > which this stuff depends on.
>
> Struct page is not the major issue. The primary problem is writing to
> the media (and I am not a flash expert at all, just relaying here):
> For some period of time, the flash memory is not usable and thus we
> need to make sure we can nuke the page table entries that we have in
> userland page tables. For that, we need a callback from the device so
> that it can ask to get its references back. Oh, and a put_xip_page
> counterpart to get_xip_page, so that the driver knows when it's safe
> to erase.

Well... That's the biggest/hardest problem, yes.  But not the first.
First we've got to tackle the easy read-only case, which doesn't require
any of that unpleasantness, yet which is used in a bunch of out-of-tree
hacks.

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Andrew Morton

On Fri, 7 Dec 2007 09:45:59 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * Stefano Brivio <[EMAIL PROTECTED]> wrote:
> 
> > This patch fixes a regression introduced by:
> > 
> > commit bb29ab26863c022743143f27956cc0ca362f258c
> > Author: Ingo Molnar <[EMAIL PROTECTED]>
> > Date:   Mon Jul 9 18:51:59 2007 +0200
> > 
> > This caused the jiffies counter to leap back and forth on cpufreq 
> > changes on my x86 box. I'd say that we can't always assume that TSC 
> > does "small errors" only, when marked unstable. On cpufreq changes 
> > these errors can be huge.
> 
> ah, printk_clock() still uses sched_clock(), not jiffies. So it's not 
> the jiffies counter that goes back and forth, it's sched_clock() - so 
> this is a printk timestamps anomaly, not related to jiffies. I thought 
> we have fixed this bug in the printk code already: sched_clock() is a 
> 'raw' interface that should not be used directly - the proper interface 
> is cpu_clock(cpu). Does the patch below help?
> 
>   Ingo
> 
> --->
> Subject: sched: fix CONFIG_PRINT_TIME's reliance on sched_clock()
> From: Ingo Molnar <[EMAIL PROTECTED]>
> 
> Stefano Brivio reported weird printk timestamp behavior during
> CPU frequency changes:
> 
>   http://bugzilla.kernel.org/show_bug.cgi?id=9475
> 
> fix CONFIG_PRINT_TIME's reliance on sched_clock() and use cpu_clock()
> instead.
> 
> Reported-and-bisected-by: Stefano Brivio <[EMAIL PROTECTED]>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> ---
>  kernel/printk.c |2 +-
>  kernel/sched.c  |7 ++-
>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> Index: linux/kernel/printk.c
> ===
> --- linux.orig/kernel/printk.c
> +++ linux/kernel/printk.c
> @@ -680,7 +680,7 @@ asmlinkage int vprintk(const char *fmt, 
>   loglev_char = default_message_loglevel
>   + '0';
>   }
> - t = printk_clock();
> + t = cpu_clock(printk_cpu);
>   nanosec_rem = do_div(t, 1000000000);
>   tlen = sprintf(tbuf,
>   "<%c>[%5lu.%06lu] ",

A bit risky - it's quite an expansion of code which no longer can call printk.

You might want to take that WARN_ON out of __update_rq_clock() ;)
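
For reference, the timestamp formatting in the quoted vprintk() hunk can be sketched in userspace like this. It is an illustration only: format_printk_time() is a made-up name, plain 64-bit division stands in for do_div(), and the divisor is assumed to be one billion (nanoseconds per second).

```c
#include <stdio.h>

/*
 * Format a nanosecond timestamp in the "<level>[sssss.uuuuuu] " style
 * of the quoted hunk: whole seconds, then the remainder in microseconds.
 * Returns the number of characters snprintf() would have written.
 */
int format_printk_time(char *buf, size_t len, unsigned long long t_ns,
		       char loglevel)
{
	/* Split into seconds and the leftover nanoseconds. */
	unsigned long nanosec_rem = (unsigned long)(t_ns % 1000000000ULL);
	unsigned long sec = (unsigned long)(t_ns / 1000000000ULL);

	return snprintf(buf, len, "<%c>[%5lu.%06lu] ",
			loglevel, sec, nanosec_rem / 1000);
}
```

Whatever clock feeds t_ns (printk_clock() before the patch, cpu_clock() after it), a backwards jump in that clock shows up directly as a backwards jump in the printed prefix.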


Re: git guidance

2007-12-07 Thread Al Boldi

Andreas Ericsson wrote:
> So, to get to the bottom of this, which of the following workflows is it
> you want git to support?
>
> ### WORKFLOW A ###
> edit, edit, edit
> edit, edit, edit
> edit, edit, edit
> Oops I made a mistake and need to hop back to "current - 12".
> edit, edit, edit
> edit, edit, edit
> publish everything, similar to just tarring up your workdir and sending
> out ### END WORKFLOW A ###
>
> ### WORKFLOW B ###
> edit, edit, edit
> ok this looks good, I want to save a checkpoint here
> edit, edit, edit
> looks good again. next checkpoint
> edit, edit, edit
> oh crap, back to checkpoint 2
> edit, edit, edit
> ooh, that's better. save a checkpoint and publish those checkpoints
> ### END WORKFLOW B ###

### WORKFLOW C ###
for every save on a gitfs mounted dir, do an implied checkpoint, commit, or 
publish (should be adjustable), on its privately created on-the-fly 
repository.
### END WORKFLOW C ###

For example:

  echo "// last comment on this file" >> /gitfs.mounted/file

should do an implied checkpoint, and make these checkpoints immediately 
visible under some checkpoint branch of the gitfs mounted dir.

Note, this way the developer gets version control without even noticing, and
it works completely transparently with any kind of application.

Thanks!

--
Al


RE: ptrace API extensions for BTS

2007-12-07 Thread Metzger, Markus T

>From: Andi Kleen [mailto:[EMAIL PROTECTED] 
>Sent: Friday, 7 December 2007 12:18

>> I would like to settle the discussion and find an interface that
>> everybody can agree to, so I can implement that interface and we can
>> move forward with the patch.
>
>The most efficient interface would be zero copy with tracer user process
>supplying memory that is pinned (get_user_pages()) subject to the
>mlock rlimit. Then kernel telling the CPU to directly log into
>that.

That would require users to understand all kinds of BTS formats
and to detect the hardware they are running on in order to interpret
the data.

So far, there are two different formats. But one of them is wasting
an entire word of memory per record. I could imagine that this would
change some day.

Other architectures would likely use an entirely different format.
Users who want to support several architectures would benefit from
a common format for this from-to branch information.


>> Regarding 1, we currently provide scheduling timestamps, which are arch
>
>That's actually broken because you don't log the CPU number.
>sched_clock() without the CPU number associated is meaningless 
>on systems without synchronized, pstate invariant TSC 
>[that is older Intel systems or some larger current systems]

I see.

The intention was not to provide exact timestamps, but rather a
relative order of BTS chunks that would allow debuggers to
show which parts were (actually, "might have been" is the best we
can say) executed in parallel, and which parts were definitely 
executed sequentially.

Without a global time, though, this becomes rather meaningless.

Is there some other metric that would allow me to order BTS 
chunks for different threads?


>> Additional architectures may want to (re)use and extend the x86 bts
>> record, or they may want to invent their own format. In the former case,
>
>I think that's actually not a good goal. If the code is so complicated
>that it makes sense sharing then you did something wrong :)

Agreed;-)

Users would benefit if they wanted to support multiple architectures.
They would need to invent such a more general interface; or duplicate 
code, which is never a good thing.


regards,
markus.

Re: programs vanish with 2.6.22+

2007-12-07 Thread Guennadi Liakhovetski

On Fri, 7 Dec 2007, Markus wrote:

> Hi again!
> 
> The memtest ran 14 passes (~10h) without an error.
> 
> I now have a 2.6.24-rc4 with some debug-options turned on, waiting for
> something to happen... can I just leave it until a window disappears
> or do I need to manually enable something or run some user-space app?!

It depends - different options behave differently. Most simple ones are
just compile-time, so you don't have to enable them. Look in "help" for
the respective debug options.

Thanks
Guennadi
---
Guennadi Liakhovetski

Re: ptrace API extensions for BTS

2007-12-07 Thread Andi Kleen

On Friday 07 December 2007 10:11:04 Metzger, Markus T wrote:
> Roland, Andi,
>  
> I would like to discuss the ptrace user interface for the BTS extension.
> In previous emails,
> Andi suggested a stream-like interface, but is also OK with an
> array-like interface (as far as I understood).
> Roland is dubious about the ptrace API additions.
> 
> I would like to settle the discussion and find an interface that
> everybody can agree to, so I can implement that interface and we can
> move forward with the patch.

The most efficient interface would be zero copy with tracer user process
supplying memory that is pinned (get_user_pages()) subject to the
mlock rlimit. Then kernel telling the CPU to directly log into
that.

Kernel buffers would be only needed for the per CPU kernel 
logging.

Then the only information that would need to be passed with
system calls would be wakeup, tail position and perhaps a wrapping
counter.

> Regarding 1, we currently provide scheduling timestamps, which are arch

That's actually broken because you don't log the CPU number.
sched_clock() without the CPU number associated is meaningless 
on systems without synchronized, pstate invariant TSC 
[that is older Intel systems or some larger current systems]

And even if you log the CPU number it is unclear how user space
would make sense of that. It can't generally, even the kernel
can't. Perhaps better to just not supply any time stamps for this.

Even on systems that don't have the unsynchronized TSC problem above,
it can be tricky to convert the TSC into real time. Right now
we don't report the TSC frequency, for one. Usually it tends
to be at the highest p-state, but finding that out is also
difficult and unreliable (rounding errors) and might not
always be true in the future. Anyway, this could be solved
by reporting that separately in /proc/cpuinfo, but given all
the other problems I have my doubts it is really worth it. I would
suggest dropping the time stamp.

> Additional architectures may want to (re)use and extend the x86 bts
> record, or they may want to invent their own format. In the former case,

I think that's actually not a good goal. If the code is so complicated
that it makes sense sharing then you did something wrong :)

-Andi
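
The bookkeeping suggested above (a tail position plus a wrap counter) could look roughly like the following userspace sketch. Everything here is hypothetical: the record layout, the names, and the tiny ring size are chosen for illustration and do not reflect the hardware BTS format or any proposed ptrace interface.

```c
#include <stddef.h>
#include <stdint.h>

#define BTS_RING_SLOTS 4	/* tiny on purpose, so wrapping is easy to see */

/* Hypothetical from-to branch record; not the hardware BTS layout. */
struct branch_rec {
	uint64_t from, to;
};

/* Writer state a tracer would need: where the writer is, how often it wrapped. */
struct bts_ring {
	struct branch_rec slot[BTS_RING_SLOTS];
	size_t tail;	/* next slot the writer will fill */
	uint64_t wraps;	/* times the writer went past the end */
};

/* Writer side: store one record, overwriting the oldest once full. */
void ring_log(struct bts_ring *r, uint64_t from, uint64_t to)
{
	r->slot[r->tail].from = from;
	r->slot[r->tail].to = to;
	if (++r->tail == BTS_RING_SLOTS) {
		r->tail = 0;
		r->wraps++;
	}
}

/* Total records ever written, derived from the two counters alone. */
uint64_t ring_total(const struct bts_ring *r)
{
	return r->wraps * BTS_RING_SLOTS + r->tail;
}

/* Records overwritten before a reader that had consumed `seen` got to them. */
uint64_t ring_lost(const struct bts_ring *r, uint64_t seen)
{
	uint64_t unread = ring_total(r) - seen;

	return unread > BTS_RING_SLOTS ? unread - BTS_RING_SLOTS : 0;
}
```

With the buffer itself shared, the reader only needs tail and wraps to work out how much is new and whether anything was overwritten.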

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Andi Kleen

Thomas Gleixner <[EMAIL PROTECTED]> writes:
>
> Hmrpf. sched_clock() is used for the time stamp of the printks. We
> need to find some better solution other than killing off the tsc
> access completely.

Doing it properly requires pretty much most of my old sched-clock ff patch.
Complicated and not pretty, but ..
Unfortunately that version still had some jumps on cpufreq, but they
are fixable there.

-Andi

Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Andrew Morton

On Thu, 6 Dec 2007 23:07:08 -0600 (CST) [EMAIL PROTECTED] (Bob Tracy) wrote:

> Andrew Morton wrote:
> > commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> > Merge: 2f1f53b... d90bf5a...
> > Author: Linus Torvalds <[EMAIL PROTECTED]>
> > Date:   Wed Nov 14 18:51:48 2007 -0800
> > 
> > Merge branch 'master' of 
> > master.kernel.org:/pub/scm/linux/kernel/git/davem/n
> > 
> > * 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
> >   [NET]: rt_check_expire() can take a long time, add a cond_resched()
> >   [ISDN] sc: Really, really fix warning
> >   [ISDN] sc: Fix sndpkt to have the correct number of arguments
> >   [TCP] FRTO: Clear frto_highmark only after process_frto that uses it
> >   [NET]: Remove notifier block from chain when 
> > register_netdevice_notifier f
> >   [FS_ENET]: Fix module build.
> >   [TCP]: Make sure write_queue_from does not begin with NULL ptr
> >   [TCP]: Fix size calculation in sk_stream_alloc_pskb
> >   [S2IO]: Fixed memory leak when MSI-X vector allocation fails
> >   [BONDING]: Fix resource use after free
> >   [SYSCTL]: Fix warning for token-ring from sysctl checker
> >   [NET] random : secure_tcp_sequence_number should not assume 
> > CONFIG_KTIME_S
> >   [IWLWIFI]: Not correctly dealing with hotunplug.
> >   [TCP] FRTO: Plug potential LOST-bit leak
> >   [TCP] FRTO: Limit snd_cwnd if TCP was application limited
> >   [E1000]: Fix schedule while atomic when called from mii-tool.
> >   [NETX]: Fix build failure added by 2.6.24 statistics cleanup.
> >   [EP93xx_ETH]: Build fix after 2.6.24 NAPI changes.
> >   [PKT_SCHED]: Check subqueue status before calling hard_start_xmit
> > 
> > I'm struggling to see how any of those could have broken block device
> > mounting on alpha.  Are you sure you bisected right?
> 
> Based on what's in that commit, it *does* appear something went wrong
> with bisection.  If the implicated commit is the next one in time
> sequence relative to
> 
> # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
> INLINE and name timeval_cmp better
> 
> then the test of whether I bisected correctly is as simple as applying
> the commit and seeing if things break, because I'm running on the
> kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
> now.  Let me give that a try and I'll report back.  Worst case, I'll
> have to start over and write off the past four days...

Gad.  I trust the second time will be faster.

git-bisect _is_ very error prone.  I find one of the problems is that each
step is so far apart in time that you forget what you were doing.  Did I
remember to test that iteration?  Did I install the right kernel?  etc.

> Sorry about this...

Not appropriate ;)   Thanks for helping out.

Re: [PATCH 2.6.24-rc3] Fix /proc/net breakage

2007-12-07 Thread Andrew Morton

On Fri, 07 Dec 2007 04:51:37 + David Woodhouse <[EMAIL PROTECTED]> wrote:

> On Mon, 2007-11-26 at 15:17 -0700, Eric W. Biederman wrote:
> > Well I clearly goofed when I added the initial network namespace support
> > for /proc/net.  Currently things work but there are odd details visible
> > to user space, even when we have a single network namespace.
> > 
> > Since we do not cache proc_dir_entry dentries at the moment we can
> > just modify ->lookup to return a different directory inode depending
> > on the network namespace of the process looking at /proc/net, replacing
> > the current technique of using a magic and fragile follow_link method.
> > 
> > To accomplish that this patch:
> > - introduces a shadow_proc method to allow different dentries to
> >   be returned from proc_lookup.
> > - Removes the old /proc/net follow_link magic
> > - Fixes a weakness in our not caching of proc generic dentries.
> > 
> > As shadow_proc uses a task struct to decide which dentry to return we
> > can go back later and fix the proc generic caching without modifying any
> > code that uses the shadow_proc method.
> > 
> > Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
> > ---
> >  fs/proc/generic.c   |   12 ++-
> >  fs/proc/proc_net.c  |   86 
> > +++
> >  include/linux/proc_fs.h |3 ++
> >  3 files changed, 19 insertions(+), 82 deletions(-)
> 
> (commit 2b1e300a9dfc3196ccddf6f1d74b91b7af55e416)
> 
> This seems to have broken the use of /proc/bus/usb as a mountpoint. It
> always appears empty now, whatever's supposed to be mounted there.
> 

Yes.  Denis and Eric are tossing around competing patches but afaik nobody
is happy with any of them.  Guys, could we get this sorted soonish please?

broken suspend (sched related) [Was: 2.6.24-rc4-mm1]

2007-12-07 Thread Jiri Slaby

On 12/05/2007 06:17 AM, Andrew Morton wrote:
>   
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/

>  git-sched.patch

breaks suspend here since -rc3-mm2. More precisely, this one:
softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks

2.6.24-rc4-mm1 minus this one works just fine. Otherwise the disks stop, graphics
stops, and then it hangs without powering down.

Core 2 Duo, SMP kernel, voluntary preempt, 250 HZ, SLUB, 64 bit.

Ideas?

Re: [patch] x86: scale cyc_2_nsec according to CPU frequency

2007-12-07 Thread Guillaume Chazarain

On Fri, 7 Dec 2007 14:55:25 +0100,
Ingo Molnar <[EMAIL PROTECTED]> wrote:

> Firstly, we dont need the 'offset' anymore because cpu_clock() maintains 
> offsets itself.

Yes, but a lower quality one. __update_rq_clock tries to compensate
large jumping clocks with a jiffy resolution, while my offset arranges
for a very smooth frequency transition.

I agree with keeping a single offset, but I liked the fact that with my
patch on frequency change, the clock had no jump at all.

> + *  ns += offset to avoid sched_clock jumps with cpufreq

I guess this needs to go away if I don't make my point :-(

> + printk("CPU#%d: changed cyc2ns scale from %ld to %ld\n",
> + cpu, prev_scale, *scale);

Pointing it out just to be sure it does not end in the final version ;-)

Thanks for cleaning up my mess ;-)

-- 
Guillaume
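
To illustrate the offset idea in isolation, here is a minimal userspace sketch. It assumes a TSC that ticks at the current core frequency and uses the 2^10 fixed-point scale from the quoted code; struct cyc2ns and the function names are invented for the example and are not taken from either patch.

```c
#include <stdint.h>

#define CYC2NS_SCALE_FACTOR 10	/* 2^10, as in the quoted code */

/* Hypothetical conversion state: fixed-point scale plus an offset. */
struct cyc2ns {
	uint64_t scale;	/* (10^6 << 10) / cpu_khz */
	int64_t offset;	/* ns added so the clock stays continuous */
};

/* Fixed-point nanoseconds-per-cycle factor for a given frequency. */
uint64_t scale_for_khz(uint64_t cpu_khz)
{
	return (UINT64_C(1000000) << CYC2NS_SCALE_FACTOR) / cpu_khz;
}

/* ns = cycles * scale / 2^10 + offset */
uint64_t cycles_to_ns(const struct cyc2ns *c, uint64_t cycles)
{
	int64_t ns = (int64_t)((cycles * c->scale) >> CYC2NS_SCALE_FACTOR);

	return (uint64_t)(ns + c->offset);
}

/*
 * On a frequency change at TSC value `now`, pick the offset that makes
 * the reading just after the change equal the reading just before it.
 */
void change_frequency(struct cyc2ns *c, uint64_t now, uint64_t new_khz)
{
	uint64_t before = cycles_to_ns(c, now);

	c->scale = scale_for_khz(new_khz);
	c->offset = 0;
	c->offset = (int64_t)before - (int64_t)cycles_to_ns(c, now);
}
```

Without the offset, halving the frequency would double the scale and the reported time at the same cycle count would jump forward; with it, the clock continues from the same value and only its slope changes.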

Re: RFC: outb 0x80 in inb_p, outb_p harmful on some modern AMD64 with MCP51 laptops

2007-12-07 Thread David P. Reed


Andi Kleen wrote:

>> Changing the delay instruction sequence from the outb to short jumps
>> might be the safe thing.
>
> I don't think that makes sense to do on anything modern. The trouble
> is that the jumps will effectively execute near "infinitely fast" on any
> modern CPU compared to the bus. But the delay really needs to be something
> that is about IO port speed.

This all presumes that you need any delay at all.  From back in the 
early days (when I was writing DOS and BIOS code on 80286 class 
machines) the /only/ reason this was a problem was using really slow 
acting, non-buffered chips compared to the processor clock (8259?).  If 
you think about it, if there is a sequence such as outb->device, 
inb<-device, the only reason for a delay would be that the device failed 
to process the out command, /and/ the device had no "done" flag.  The 
other "slow" problem would be an out->device, out->device at a rate 
higher than the device could handle because it had a one-level buffer 
that ignored input that came too fast after the previous, but didn't 
stall the bus to protect the device.  Modern machines just are not 
designed that way - a few of the early PC compatibles were.


My machine in question, for example, needs no waiting within CMOS_READs
at all.  And I doubt any other chip/device needs waiting that isn't
already provided by the bus.  The I/O to port 80 is very, very odd in
this context.  Actually, modern machines have potentially more serious
problems with I/O ops to non-existent addresses, which may cause real
bus weirdness.


So that's why I suggested the short-jump answer - it fixes the problem 
on the ancient machines, but doesn't do anything on the modern ones, 
where there should be no problem.


One patch that makes immediate sense is to use the "virtualization" 
hooks for the CMOS_READ/WRITE ops that is there in the 32-bit code to 
allow substitution of a workable sequence for the RTC, which is where I 
experience the problem on my machine.  This doesn't fix any lurking 
issues with the _p APIs, since they are not virtualized.  I'd suggest 
the safest possible route that would fix my machine would be either an 
early_quirk, a boot parameter, or both that would then control the 
virtualization hook logic.


That patch would fix my machine's current issues, and would not harm any 
machines that need the 0x80 delay.


But I know it leaves a lurking issue for another day - for all the other 
inb_p and outb_p code in the kernel drivers.  A grep suggests that they 
are used only in somewhat less modern drivers - perhaps for legacy 
machines.  I don't think any such drivers are used on any of my machines.




Re: [patch] x86: scale cyc_2_nsec according to CPU frequency

2007-12-07 Thread Ingo Molnar


* Guillaume Chazarain <[EMAIL PROTECTED]> wrote:

> On Fri, 7 Dec 2007 14:55:25 +0100,
> Ingo Molnar <[EMAIL PROTECTED]> wrote:
> 
> > Firstly, we dont need the 'offset' anymore because cpu_clock() 
> > maintains offsets itself.
> 
> Yes, but a lower quality one. __update_rq_clock tries to compensate 
> large jumping clocks with a jiffy resolution, while my offset arranges 
> for a very smooth frequency transition.

yes, but that would be easy to fix up via calling 
sched_clock_idle_wakeup_event(0) when doing a frequency transition, 
without burdening the normal sched_clock() codepath with the offset. See 
the attached latest version.
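
For readers following the arithmetic, a small user-space sketch of the scaling may help.  The scale formula, (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR) / cpu_khz, and the shift of 10 are the ones used in the patch; the function names are made up here, and offset_at_switch() only illustrates the offset idea under discussion, it is not code from either version of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define CYC2NS_SCALE_FACTOR 10          /* 2^10, as in the patch */
#define NSEC_PER_MSEC 1000000ULL

/* scale = (10^6 << 10) / cpu_khz */
static uint64_t cyc2ns_scale_for(uint64_t cpu_khz)
{
    assert(cpu_khz != 0);
    return (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR) / cpu_khz;
}

/* ns = cycles * scale / 2^10 */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t scale)
{
    return (cycles * scale) >> CYC2NS_SCALE_FACTOR;
}

/*
 * Illustration of the offset approach: pick the offset so that the
 * nanosecond value is continuous across a change of scale at tsc_now.
 */
static int64_t offset_at_switch(uint64_t tsc_now, uint64_t old_scale,
                                int64_t old_offset, uint64_t new_scale)
{
    return (int64_t)cycles_to_ns(tsc_now, old_scale) + old_offset
         - (int64_t)cycles_to_ns(tsc_now, new_scale);
}
```

At 2 GHz (cpu_khz = 2000000) the scale is 512, so a TSC value of 4 * 10^9 reads as 2 * 10^9 ns.  Switching to a 1 GHz scale of 1024 without any correction would make the same TSC value read as 4 * 10^9 ns, which is the jump that either an offset or a resynchronisation at the transition has to absorb.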

Ingo

--->
Subject: x86: scale cyc_2_nsec according to CPU frequency
From: "Guillaume Chazarain" <[EMAIL PROTECTED]>

scale the sched_clock() cyc_2_nsec scaling factor according to
CPU frequency changes.

[ [EMAIL PROTECTED]: simplified it and fixed it for SMP. ]

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
---
 arch/x86/kernel/tsc_32.c |   45 +++
 arch/x86/kernel/tsc_64.c |   59 +++
 include/asm-x86/timer.h  |   23 ++
 3 files changed, 106 insertions(+), 21 deletions(-)

Index: linux-x86.q/arch/x86/kernel/tsc_32.c
===
--- linux-x86.q.orig/arch/x86/kernel/tsc_32.c
+++ linux-x86.q/arch/x86/kernel/tsc_32.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -78,15 +79,35 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
  *  cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  *  ([EMAIL PROTECTED])
  *
+ *  ns += offset to avoid sched_clock jumps with cpufreq
+ *
  * [EMAIL PROTECTED] "math is hard, lets go shopping!"
  */
-unsigned long cyc2ns_scale __read_mostly;
 
-#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
+DEFINE_PER_CPU(unsigned long, cyc2ns);
 
-static inline void set_cyc2ns_scale(unsigned long cpu_khz)
+static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 {
-   cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   unsigned long flags, prev_scale, *scale;
+   unsigned long long tsc_now, ns_now;
+
+   local_irq_save(flags);
+   sched_clock_idle_sleep_event();
+
+   scale = &per_cpu(cyc2ns, cpu);
+
+   rdtscll(tsc_now);
+   ns_now = __cycles_2_ns(tsc_now);
+
+   prev_scale = *scale;
+   if (cpu_khz)
+   *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
+
+   /*
+* Start smoothly with the new frequency:
+*/
+   sched_clock_idle_wakeup_event(0);
+   local_irq_restore(flags);
 }
 
 /*
@@ -239,7 +260,9 @@ time_cpufreq_notifier(struct notifier_bl
ref_freq, freq->new);
if (!(freq->flags & CPUFREQ_CONST_LOOPS)) {
tsc_khz = cpu_khz;
-   set_cyc2ns_scale(cpu_khz);
+   preempt_disable();
+   set_cyc2ns_scale(cpu_khz, smp_processor_id());
+   preempt_enable();
/*
 * TSC based sched_clock turns
 * to junk w/ cpufreq
@@ -367,6 +390,8 @@ static inline void check_geode_tsc_relia
 
 void __init tsc_init(void)
 {
+   int cpu;
+
if (!cpu_has_tsc || tsc_disable)
goto out_no_tsc;
 
@@ -380,7 +405,15 @@ void __init tsc_init(void)
(unsigned long)cpu_khz / 1000,
(unsigned long)cpu_khz % 1000);
 
-   set_cyc2ns_scale(cpu_khz);
+   /*
+* Secondary CPUs do not run through tsc_init(), so set up
+* all the scale factors for all CPUs, assuming the same
+* speed as the bootup CPU. (cpufreq notifiers will fix this
+* up if their speed diverges)
+*/
+   for_each_possible_cpu(cpu)
+   set_cyc2ns_scale(cpu_khz, cpu);
+
use_tsc_delay();
 
/* Check and install the TSC clocksource */
Index: linux-x86.q/arch/x86/kernel/tsc_64.c
===
--- linux-x86.q.orig/arch/x86/kernel/tsc_64.c
+++ linux-x86.q/arch/x86/kernel/tsc_64.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 
 static int notsc __initdata = 0;
 
@@ -18,16 +19,50 @@ EXPORT_SYMBOL(cpu_khz);
 unsigned int tsc_khz;
 EXPORT_SYMBOL(tsc_khz);
 
-static unsigned int cyc2ns_scale __read_mostly;
+/* Accelerators for sched_clock()
+ * convert from cycles(64bits) => nanoseconds (64bits)
+ *  basic equation:
+ * ns = cycles / (freq / ns_per_sec)
+ * ns = cycles * (ns_per_sec / freq)
+ * ns = cycles * (10^9 / (cpu_khz * 10^3))
+ * ns = cycles * (10^6 / cpu_khz)
+ *
+ *

Re: RFC: outb 0x80 in inb_p, outb_p harmful on some modern AMD64 with MCP51 laptops

2007-12-07 Thread Andi Kleen

> My machine in question, for example, needs no waiting within CMOS_READs 
> at all.   And I doubt any other chip/device needs waiting that isn't 

I don't know about CMOS, but there were definitely some not too ancient
systems (let's say not more than 10 years old) that required I/O delays in
the floppy driver and the 8253/8259. But on those the jumps are already
far too fast.

-Andi

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> third update. the cpufreq callbacks are not quite OK yet.

fourth update - the cpufreq callbacks are back. This is a version that 
is supposed to fix all known aspects of TSC and frequency-change 
weirdnesses.

Ingo

Index: linux/arch/arm/kernel/time.c
===
--- linux.orig/arch/arm/kernel/time.c
+++ linux/arch/arm/kernel/time.c
@@ -79,17 +79,6 @@ static unsigned long dummy_gettimeoffset
 }
 #endif
 
-/*
- * An implementation of printk_clock() independent from
- * sched_clock().  This avoids non-bootable kernels when
- * printk_clock is enabled.
- */
-unsigned long long printk_clock(void)
-{
-   return (unsigned long long)(jiffies - INITIAL_JIFFIES) *
-   (1000000000 / HZ);
-}
-
 static unsigned long next_rtc_update;
 
 /*
Index: linux/arch/ia64/kernel/time.c
===
--- linux.orig/arch/ia64/kernel/time.c
+++ linux/arch/ia64/kernel/time.c
@@ -344,33 +344,6 @@ udelay (unsigned long usecs)
 }
 EXPORT_SYMBOL(udelay);
 
-static unsigned long long ia64_itc_printk_clock(void)
-{
-   if (ia64_get_kr(IA64_KR_PER_CPU_DATA))
-   return sched_clock();
-   return 0;
-}
-
-static unsigned long long ia64_default_printk_clock(void)
-{
-   return (unsigned long long)(jiffies_64 - INITIAL_JIFFIES) *
-   (1000000000/HZ);
-}
-
-unsigned long long (*ia64_printk_clock)(void) = &ia64_default_printk_clock;
-
-unsigned long long printk_clock(void)
-{
-   return ia64_printk_clock();
-}
-
-void __init
-ia64_setup_printk_clock(void)
-{
-   if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT))
-   ia64_printk_clock = ia64_itc_printk_clock;
-}
-
 /* IA64 doesn't cache the timezone */
 void update_vsyscall_tz(void)
 {
Index: linux/arch/x86/kernel/process_32.c
===
--- linux.orig/arch/x86/kernel/process_32.c
+++ linux/arch/x86/kernel/process_32.c
@@ -113,10 +113,19 @@ void default_idle(void)
smp_mb();
 
local_irq_disable();
-   if (!need_resched())
+   if (!need_resched()) {
+   ktime_t t0, t1;
+   u64 t0n, t1n;
+
+   t0 = ktime_get();
+   t0n = ktime_to_ns(t0);
safe_halt();/* enables interrupts racelessly */
-   else
-   local_irq_enable();
+   local_irq_disable();
+   t1 = ktime_get();
+   t1n = ktime_to_ns(t1);
+   sched_clock_idle_wakeup_event(t1n - t0n);
+   }
+   local_irq_enable();
current_thread_info()->status |= TS_POLLING;
} else {
/* loop is done by the caller */
Index: linux/arch/x86/kernel/tsc_32.c
===
--- linux.orig/arch/x86/kernel/tsc_32.c
+++ linux/arch/x86/kernel/tsc_32.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -78,15 +79,35 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
  *  cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  *  ([EMAIL PROTECTED])
  *
+ *  ns += offset to avoid sched_clock jumps with cpufreq
+ *
  * [EMAIL PROTECTED] "math is hard, lets go shopping!"
  */
-unsigned long cyc2ns_scale __read_mostly;
 
-#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
+DEFINE_PER_CPU(unsigned long, cyc2ns);
 
-static inline void set_cyc2ns_scale(unsigned long cpu_khz)
+static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 {
-   cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   unsigned long flags, prev_scale, *scale;
+   unsigned long long tsc_now, ns_now;
+
+   local_irq_save(flags);
+   sched_clock_idle_sleep_event();
+
+   scale = &per_cpu(cyc2ns, cpu);
+
+   rdtscll(tsc_now);
+   ns_now = __cycles_2_ns(tsc_now);
+
+   prev_scale = *scale;
+   if (cpu_khz)
+   *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
+
+   /*
+* Start smoothly with the new frequency:
+*/
+   sched_clock_idle_wakeup_event(0);
+   local_irq_restore(flags);
 }
 
 /*
@@ -239,7 +260,9 @@ time_cpufreq_notifier(struct notifier_bl
ref_freq, freq->new);
if (!(freq->flags & CPUFREQ_CONST_LOOPS)) {
tsc_khz = cpu_khz;
-   set_cyc2ns_scale(cpu_khz);
+   preempt_disable();
+   set_cyc2ns_scale(cpu_khz, smp_processor_id());
+   preempt_enable();
/*
 * TSC based sched_clock tur

Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy

I wrote:
> "git diff 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 
> 6f37ac793d6ba7b35d338f791974166f67fdd9ba"
> produced a relatively short patch (18,437 bytes).  The list of involved
> files:
> 
> (omitted)
>
> Current state of the source tree is the 6f37ac... version, so I'll start
> backing out the above diffs in related groups and continue until I've got
> a working kernel.  For lack of an obvious target, I'll start with the
> seemingly innocuous change to sysctl_check.c.  I'll report back when I've
> got something.

That was quick :-).  Backing out the sysctl_check.c diff gives me a
working kernel.  Beats the [EMAIL PROTECTED] out of me how/why, though.

Michael Cree: could you try backing out the diff below from your
2.6.24-rc3 tree and see if things are now working for you?

Here's "uname -a", just to confirm (maybe) I'm running on what I say
works:

Linux smirkin 2.6.24-rc2-g6f37ac79-dirty #2 Fri Dec 7 08:03:12 CST 2007 alpha

Here's the diff I backed out (patch -R).  It's short...

diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
index 5a2f2b2..4abc6d2 100644
--- a/kernel/sysctl_check.c
+++ b/kernel/sysctl_check.c
@@ -738,7 +738,7 @@ static struct trans_ctl_table trans_net_table[] = {
{ NET_ROSE, "rose", trans_net_rose_table },
{ NET_IPV6, "ipv6", trans_net_ipv6_table },
{ NET_X25,  "x25",  trans_net_x25_table },
-   { NET_TR,   "tr",   trans_net_tr_table },
+   { NET_TR,   "token-ring",   trans_net_tr_table },
{ NET_DECNET,   "decnet",   trans_net_decnet_table },
/*  NET_ECONET not used */
{ NET_SCTP, "sctp", trans_net_sctp_table },

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War


[GIT PULL] avr32 fixes for 2.6.24

2007-12-07 Thread Haavard Skinnemoen

Linus,

Please pull from

  ssh://master.kernel.org/pub/scm/linux/kernel/git/hskinnemoen/avr32-2.6.git for-linus

to receive the following updates.

Yes, lots of complicated stuff has been changed here. There is
currently a bug in the debug trap handling code which may cause a soft
lockup while debugging userspace applications, and it took some major
surgery to get it fixed. The lockdep stuff isn't really a part of the
fix, but since it touches the low-level exception handling code, I
think it is more risky to remove it than leaving it in.

So I hope you won't be too upset by these changes. I wouldn't have
pushed it if I didn't think the bug it fixes is very serious, and I've
spent quite a few days testing that nothing broke. A customer has
verified the fix too, and the LTP test cases that fail after this
patch, failed before too.

Haavard Skinnemoen (9):
  [AVR32] Add TIF_RESTORE_SIGMASK to the work masks
  [AVR32] Fix invalid status register bit definitions in asm/ptrace.h
  [AVR32] Kconfig: Use def_bool instead of bool + default
  [AVR32] Implement stacktrace support
  [AVR32] Implement irqflags trace and lockdep support
  [AVR32] Clean up OCD register usage
  [AVR32] Follow the rules when dealing with the OCD system
  [AVR32] Fix copy_to_user_page() breakage
  [AVR32] Fix wrong pt_regs in critical exception handler

 arch/avr32/Kconfig   |   65 ++---
 arch/avr32/kernel/Makefile   |1 +
 arch/avr32/kernel/asm-offsets.c  |2 +
 arch/avr32/kernel/entry-avr32b.S |  285 ---
 arch/avr32/kernel/kprobes.c  |   14 +-
 arch/avr32/kernel/process.c  |9 +-
 arch/avr32/kernel/ptrace.c   |  273 ++
 arch/avr32/kernel/stacktrace.c   |   53 
 arch/avr32/kernel/traps.c|2 +-
 arch/avr32/kernel/vmlinux.lds.S  |2 +-
 arch/avr32/mm/cache.c|   20 +-
 include/asm-avr32/cacheflush.h   |   19 +-
 include/asm-avr32/ocd.h  |  592 +-
 include/asm-avr32/processor.h|3 +
 include/asm-avr32/ptrace.h   |6 +-
 include/asm-avr32/sysreg.h   |2 +
 include/asm-avr32/system.h   |4 +-
 include/asm-avr32/thread_info.h  |   25 ++-
 18 files changed, 1006 insertions(+), 371 deletions(-)
 create mode 100644 arch/avr32/kernel/stacktrace.c

diff --git a/arch/avr32/Kconfig b/arch/avr32/Kconfig
index 4f402c9..b77abce 100644
--- a/arch/avr32/Kconfig
+++ b/arch/avr32/Kconfig
@@ -6,8 +6,7 @@
 mainmenu "Linux Kernel Configuration"
 
 config AVR32
-   bool
-   default y
+   def_bool y
# With EMBEDDED=n, we get lots of stuff automatically selected
# that we usually don't need on AVR32.
select EMBEDDED
@@ -20,51 +19,49 @@ config AVR32
  http://avr32linux.org/.
 
 config GENERIC_GPIO
-   bool
-   default y
+   def_bool y
 
 config GENERIC_HARDIRQS
-   bool
-   default y
+   def_bool y
+
+config STACKTRACE_SUPPORT
+   def_bool y
+
+config LOCKDEP_SUPPORT
+   def_bool y
+
+config TRACE_IRQFLAGS_SUPPORT
+   def_bool y
 
 config HARDIRQS_SW_RESEND
-   bool
-   default y
+   def_bool y
 
 config GENERIC_IRQ_PROBE
-   bool
-   default y
+   def_bool y
 
 config RWSEM_GENERIC_SPINLOCK
-   bool
-   default y
+   def_bool y
 
 config GENERIC_TIME
-   bool
-   default y
+   def_bool y
 
 config RWSEM_XCHGADD_ALGORITHM
-   bool
+   def_bool n
 
 config ARCH_HAS_ILOG2_U32
-   bool
-   default n
+   def_bool n
 
 config ARCH_HAS_ILOG2_U64
-   bool
-   default n
+   def_bool n
 
 config GENERIC_HWEIGHT
-   bool
-   default y
+   def_bool y
 
 config GENERIC_CALIBRATE_DELAY
-   bool
-   default y
+   def_bool y
 
 config GENERIC_BUG
-   bool
-   default y
+   def_bool y
depends on BUG
 
 source "init/Kconfig"
@@ -139,28 +136,22 @@ config PHYS_OFFSET
 source "kernel/Kconfig.preempt"
 
 config HAVE_ARCH_BOOTMEM_NODE
-   bool
-   default n
+   def_bool n
 
 config ARCH_HAVE_MEMORY_PRESENT
-   bool
-   default n
+   def_bool n
 
 config NEED_NODE_MEMMAP_SIZE
-   bool
-   default n
+   def_bool n
 
 config ARCH_FLATMEM_ENABLE
-   bool
-   default y
+   def_bool y
 
 config ARCH_DISCONTIGMEM_ENABLE
-   bool
-   default n
+   def_bool n
 
 config ARCH_SPARSEMEM_ENABLE
-   bool
-   default n
+   def_bool n
 
 source "mm/Kconfig"
 
diff --git a/arch/avr32/kernel/Makefile b/arch/avr32/kernel/Makefile
index 989fcd1..2d6d48f 100644
--- a/arch/avr32/kernel/Makefile
+++ b/arch/avr32/kernel/Makefile
@@ -11,3 +11,4 @@ obj-y += signal.o sys_avr32.o 
process.o time.o
 obj-y  += init_task.o switch_to.o cpu.o
 obj-$(CONFIG_MODULES)  += module.o avr32_ksyms.o
 obj-$(CONFIG_KPROBES)  += kprobes.o
+obj-$(CONFIG_STACKTRACE)   += stacktrace.o
diff --

[PATCH -mm 2/6] powerpc: convert iommu to use the IOMMU helper

2007-12-07 Thread FUJITA Tomonori

This patch converts PPC's IOMMU to use the IOMMU helper functions. The
IOMMU doesn't allocate a memory area spanning LLD's segment boundary
anymore.

iseries_hv_alloc and iseries_hv_map don't have a proper device
struct, so a 4GB boundary is used for them.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 arch/powerpc/Kconfig   |3 +
 arch/powerpc/kernel/dma_64.c   |6 +-
 arch/powerpc/kernel/iommu.c|   65 
 arch/powerpc/platforms/iseries/iommu.c |4 +-
 include/asm-powerpc/iommu.h|   10 ++--
 5 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 98aef7f..1a6cf07 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -227,6 +227,9 @@ config IOMMU_VMERGE
 
  Most drivers don't have this problem; it is safe to say Y here.
 
+config IOMMU_HELPER
+   def_bool PPC64
+
 config HOTPLUG_CPU
bool "Support for enabling/disabling CPUs"
depends on SMP && HOTPLUG && EXPERIMENTAL && (PPC_PSERIES || PPC_PMAC)
diff --git a/arch/powerpc/kernel/dma_64.c b/arch/powerpc/kernel/dma_64.c
index 1806d96..6fcb7cb 100644
--- a/arch/powerpc/kernel/dma_64.c
+++ b/arch/powerpc/kernel/dma_64.c
@@ -31,8 +31,8 @@ static inline unsigned long device_to_mask(struct device *dev)
 static void *dma_iommu_alloc_coherent(struct device *dev, size_t size,
  dma_addr_t *dma_handle, gfp_t flag)
 {
-   return iommu_alloc_coherent(dev->archdata.dma_data, size, dma_handle,
-   device_to_mask(dev), flag,
+   return iommu_alloc_coherent(dev, dev->archdata.dma_data, size,
+   dma_handle, device_to_mask(dev), flag,
dev->archdata.numa_node);
 }
 
@@ -52,7 +52,7 @@ static dma_addr_t dma_iommu_map_single(struct device *dev, 
void *vaddr,
   size_t size,
   enum dma_data_direction direction)
 {
-   return iommu_map_single(dev->archdata.dma_data, vaddr, size,
+   return iommu_map_single(dev, dev->archdata.dma_data, vaddr, size,
device_to_mask(dev), direction);
 }
 
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 7a5d247..6abf4c3 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -81,17 +82,19 @@ static int __init setup_iommu(char *str)
 __setup("protect4gb=", setup_protect4gb);
 __setup("iommu=", setup_iommu);
 
-static unsigned long iommu_range_alloc(struct iommu_table *tbl,
+static unsigned long iommu_range_alloc(struct device *dev,
+  struct iommu_table *tbl,
unsigned long npages,
unsigned long *handle,
unsigned long mask,
unsigned int align_order)
 { 
-   unsigned long n, end, i, start;
+   unsigned long n, end, start;
unsigned long limit;
int largealloc = npages > 15;
int pass = 0;
unsigned long align_mask;
+   unsigned long boundary_size;
 
align_mask = 0xffffffffffffffffl >> (64 - align_order);
 
@@ -136,14 +139,17 @@ static unsigned long iommu_range_alloc(struct iommu_table 
*tbl,
start &= mask;
}
 
-   n = find_next_zero_bit(tbl->it_map, limit, start);
-
-   /* Align allocation */
-   n = (n + align_mask) & ~align_mask;
-
-   end = n + npages;
+   if (dev)
+   boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
+ 1 << IOMMU_PAGE_SHIFT);
+   else
+   boundary_size = ALIGN(1UL << 32, 1 << IOMMU_PAGE_SHIFT);
+   /* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
 
-   if (unlikely(end >= limit)) {
+   n = iommu_area_alloc(tbl->it_map, limit, start, npages,
+tbl->it_offset, boundary_size >> IOMMU_PAGE_SHIFT,
+align_mask);
+   if (n == -1) {
if (likely(pass < 2)) {
/* First failure, just rescan the half of the table.
 * Second failure, rescan the other half of the table.
@@ -158,14 +164,7 @@ static unsigned long iommu_range_alloc(struct iommu_table 
*tbl,
}
}
 
-   for (i = n; i < end; i++)
-   if (test_bit(i, tbl->it_map)) {
-   start = i+1;
-   goto again;
-   }
-
-   for (i = n; i < end; i++)
-   __set_bit(i, tbl->it_map);
+   end = n + npages;
 
/* Bump the hint to a new block for small allocs. */
if (largealloc) {
@@ -184,16 +183,17 @@ static un

[PATCH -mm 3/6] powerpc: remove DMA 4GB boundary protection

2007-12-07 Thread FUJITA Tomonori

Previously, during initialization of the IOMMU tables, the last entry
at each 4GB boundary was marked as used since there are many adapters
which cannot handle DMAing across any 4GB boundary.

The IOMMU doesn't allocate a memory area spanning LLD's segment
boundary anymore. The segment boundary of devices is set to 4GB by
default. So we can remove 4GB boundary protection now.
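
As a rough user-space illustration of why the reserved entries become unnecessary: the boundary is converted from a byte mask into a number of IOMMU pages, and any candidate range that would straddle a multiple of it is rejected.  The span check below follows is_span_boundary() from patch 1/6; the 4 KiB page shift and the helper names are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SIM 12   /* assumed 4 KiB IOMMU pages */

/* Round x up to a multiple of a, where a is a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (x + a - 1) & ~(a - 1);
}

/* Segment boundary mask (e.g. 0xffffffff for 4GB) to boundary size in pages. */
static uint64_t boundary_pages(uint64_t seg_boundary_mask)
{
    return align_up(seg_boundary_mask + 1, 1ULL << PAGE_SHIFT_SIM)
           >> PAGE_SHIFT_SIM;
}

/* Non-zero if [index, index + nr) would cross a boundary multiple. */
static int spans_boundary(uint64_t index, uint64_t nr, uint64_t shift,
                          uint64_t boundary)
{
    uint64_t off = (shift + index) & (boundary - 1);

    return off + nr > boundary;
}
```

With the default 4GB mask the boundary is 2^20 pages, so a four-page mapping starting two pages below that index is refused while one starting four pages below it is accepted.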

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 arch/powerpc/kernel/iommu.c |   21 +
 1 files changed, 1 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index 6abf4c3..bdb194c 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -448,9 +448,6 @@ void iommu_unmap_sg(struct iommu_table *tbl, struct 
scatterlist *sglist,
 struct iommu_table *iommu_init_table(struct iommu_table *tbl, int nid)
 {
unsigned long sz;
-   unsigned long start_index, end_index;
-   unsigned long entries_per_4g;
-   unsigned long index;
static int welcomed = 0;
struct page *page;
 
@@ -472,6 +469,7 @@ struct iommu_table *iommu_init_table(struct iommu_table 
*tbl, int nid)
 
 #ifdef CONFIG_CRASH_DUMP
if (ppc_md.tce_get) {
+   unsigned long index;
unsigned long tceval;
unsigned long tcecount = 0;
 
@@ -502,23 +500,6 @@ struct iommu_table *iommu_init_table(struct iommu_table 
*tbl, int nid)
ppc_md.tce_free(tbl, tbl->it_offset, tbl->it_size);
 #endif
 
-   /*
-* DMA cannot cross 4 GB boundary.  Mark last entry of each 4
-* GB chunk as reserved.
-*/
-   if (protect4gb) {
-   entries_per_4g = 0x100000000l >> IOMMU_PAGE_SHIFT;
-
-   /* Mark the last bit before a 4GB boundary as used */
-   start_index = tbl->it_offset | (entries_per_4g - 1);
-   start_index -= tbl->it_offset;
-
-   end_index = tbl->it_size;
-
-   for (index = start_index; index < end_index - 1; index += entries_per_4g)
-   __set_bit(index, tbl->it_map);
-   }
-
if (!welcomed) {
printk(KERN_INFO "IOMMU table initialized, virtual merging %s\n",
   novmerge ? "disabled" : "enabled");
-- 
1.5.3.4


[PATCH -mm 0/6] fix iommu segment boundary problems (powerpc and x86)

2007-12-07 Thread FUJITA Tomonori

This patchset is a sequel to my patchset to fix iommu segment boundary
problems:

http://www.mail-archive.com/[EMAIL PROTECTED]/msg11919.html

This adds new IOMMU helper functions for the free area
management. These functions take care of LLD's segment boundary limit
for IOMMUs. They are useful for IOMMUs that use a bitmap for the free
area management.

The helper functions are very low level. They just find a free area in
bitmap appropriate for low level drivers. The IOMMUs continue to use
their hardware specific techniques easily with the low level helper
functions.
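
A minimal user-space sketch of that kind of search, assuming a tiny 64-entry table with one byte per entry instead of a real bitmap.  It is not the helper from patch 1/6, and area_alloc(), area_free() and MAP_BITS are hypothetical names: it just finds the first run of free entries that does not cross a boundary multiple and marks it used.

```c
#include <assert.h>
#include <stddef.h>

#define MAP_BITS 64   /* hypothetical tiny table */

/* One byte per entry for readability; a real IOMMU uses a bitmap. */
static unsigned char map[MAP_BITS];

/*
 * Find nr consecutive free entries at or after start that do not cross
 * a multiple of boundary (a power of two), mark them used, and return
 * the first index.  Returns -1 if no such run exists.
 */
static long area_alloc(unsigned long start, unsigned long nr,
                       unsigned long boundary)
{
    unsigned long i, n;

    assert(nr > 0 && boundary > 0 && (boundary & (boundary - 1)) == 0);
    for (i = start; i + nr <= MAP_BITS; i++) {
        if ((i & (boundary - 1)) + nr > boundary)
            continue;               /* run would straddle a boundary */
        for (n = 0; n < nr && !map[i + n]; n++)
            ;
        if (n == nr) {
            for (n = 0; n < nr; n++)
                map[i + n] = 1;
            return (long)i;
        }
    }
    return -1;
}

/* Release nr entries starting at start. */
static void area_free(unsigned long start, unsigned long nr)
{
    while (nr--)
        map[start + nr] = 0;
}
```

With a boundary of 16 entries, two successive 10-entry requests land at 0 and 16: entries 10 to 15 are free but a 10-entry run starting there would cross the boundary, so they are left for a later, smaller request.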

This patchset converts three IOMMUs: POWERPC, X86 calgary, and X86 gart,
but I tested only the POWERPC patch. The rest are only compile tested
since I don't have the hardware.

This is against 2.6.24-rc4-mm1.

[PATCH -mm 5/6] x86: convert gart IOMMU to use the IOMMU helper

2007-12-07 Thread FUJITA Tomonori

This patch converts gart IOMMU to use the IOMMU helper functions. The
IOMMU doesn't allocate a memory area spanning LLD's segment boundary
anymore.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig  |2 +-
 arch/x86/kernel/pci-gart_64.c |   41 +
 2 files changed, 26 insertions(+), 17 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index df22fe7..34519c2 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -434,7 +434,7 @@ config CALGARY_IOMMU_ENABLED_BY_DEFAULT
  If unsure, say Y.
 
 config IOMMU_HELPER
-   def_bool CALGARY_IOMMU
+   def_bool (CALGARY_IOMMU || GART_IOMMU)
 
 # need this always selected by IOMMU for the VIA workaround
 config SWIOTLB
diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c
index b8595d6..d0b9033 100644
--- a/arch/x86/kernel/pci-gart_64.c
+++ b/arch/x86/kernel/pci-gart_64.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -82,17 +83,24 @@ AGPEXTERN __u32 *agp_gatt_table;
 static unsigned long next_bit;  /* protected by iommu_bitmap_lock */
 static int need_flush; /* global flush state. set for each gart wrap */
 
-static unsigned long alloc_iommu(int size)
+static unsigned long alloc_iommu(struct device *dev, int size)
 {
unsigned long offset, flags;
+   unsigned long boundary_size;
+   unsigned long base_index;
+
+   base_index = ALIGN(iommu_bus_base & dma_get_seg_boundary(dev),
+  PAGE_SIZE) >> PAGE_SHIFT;
+   boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
+ PAGE_SIZE) >> PAGE_SHIFT;
 
spin_lock_irqsave(&iommu_bitmap_lock, flags);
-   offset = find_next_zero_string(iommu_gart_bitmap, next_bit,
-   iommu_pages, size);
+   offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, next_bit,
+ size, base_index, boundary_size, 0);
if (offset == -1) {
need_flush = 1;
-   offset = find_next_zero_string(iommu_gart_bitmap, 0,
-   iommu_pages, size);
+   offset = iommu_area_alloc(iommu_gart_bitmap, iommu_pages, 0,
+ size, base_index, boundary_size, 0);
}
if (offset != -1) {
set_bit_string(iommu_gart_bitmap, offset, size);
@@ -114,7 +122,7 @@ static void free_iommu(unsigned long offset, int size)
unsigned long flags;
 
spin_lock_irqsave(&iommu_bitmap_lock, flags);
-   __clear_bit_string(iommu_gart_bitmap, offset, size);
+   iommu_area_free(iommu_gart_bitmap, offset, size);
spin_unlock_irqrestore(&iommu_bitmap_lock, flags);
 }
 
@@ -235,7 +243,7 @@ static dma_addr_t dma_map_area(struct device *dev, 
dma_addr_t phys_mem,
size_t size, int dir)
 {
unsigned long npages = to_pages(phys_mem, size);
-   unsigned long iommu_page = alloc_iommu(npages);
+   unsigned long iommu_page = alloc_iommu(dev, npages);
int i;
 
if (iommu_page == -1) {
@@ -355,10 +363,11 @@ static int dma_map_sg_nonforce(struct device *dev, struct 
scatterlist *sg,
 }
 
 /* Map multiple scatterlist entries continuous into the first. */
-static int __dma_map_cont(struct scatterlist *start, int nelems,
- struct scatterlist *sout, unsigned long pages)
+static int __dma_map_cont(struct device *dev, struct scatterlist *start,
+ int nelems, struct scatterlist *sout,
+ unsigned long pages)
 {
-   unsigned long iommu_start = alloc_iommu(pages);
+   unsigned long iommu_start = alloc_iommu(dev, pages);
unsigned long iommu_page = iommu_start;
struct scatterlist *s;
int i;
@@ -394,8 +403,8 @@ static int __dma_map_cont(struct scatterlist *start, int 
nelems,
 }
 
 static inline int
-dma_map_cont(struct scatterlist *start, int nelems, struct scatterlist *sout,
-unsigned long pages, int need)
+dma_map_cont(struct device *dev, struct scatterlist *start, int nelems,
+struct scatterlist *sout, unsigned long pages, int need)
 {
if (!need) {
BUG_ON(nelems != 1);
@@ -403,7 +412,7 @@ dma_map_cont(struct scatterlist *start, int nelems, struct 
scatterlist *sout,
sout->dma_length = start->length;
return 0;
}
-   return __dma_map_cont(start, nelems, sout, pages);
+   return __dma_map_cont(dev, start, nelems, sout, pages);
 }

 /*
@@ -452,8 +461,8 @@ static int gart_map_sg(struct device *dev, struct 
scatterlist *sg, int nents,
if (!iommu_merge || !nextneed || !need || s->offset ||
(s->length + seg_size > max_seg_size) ||
(ps->offset + ps->length) % PAGE_

[PATCH -mm 1/6] add IOMMU helper functions for the free area management

2007-12-07 Thread FUJITA Tomonori

This adds IOMMU helper functions for the free area management. These
functions take care of LLD's segment boundary limit for IOMMUs. They
would be useful for IOMMUs that use a bitmap for the free area
management.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 include/linux/iommu-helper.h |7 
 lib/Makefile |1 +
 lib/iommu-helper.c   |   76 ++
 3 files changed, 84 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/iommu-helper.h
 create mode 100644 lib/iommu-helper.c

diff --git a/include/linux/iommu-helper.h b/include/linux/iommu-helper.h
new file mode 100644
index 000..4dd4c04
--- /dev/null
+++ b/include/linux/iommu-helper.h
@@ -0,0 +1,7 @@
+extern unsigned long iommu_area_alloc(unsigned long *map, unsigned long size,
+ unsigned long start, unsigned int nr,
+ unsigned long shift,
+ unsigned long boundary_size,
+ unsigned long align_mask);
+extern void iommu_area_free(unsigned long *map, unsigned long start,
+   unsigned int nr);
diff --git a/lib/Makefile b/lib/Makefile
index b862b90..17fb758 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -65,6 +65,7 @@ obj-$(CONFIG_SMP) += pcounter.o
 obj-$(CONFIG_AUDIT_GENERIC) += audit.o
 
 obj-$(CONFIG_SWIOTLB) += swiotlb.o
+obj-$(CONFIG_IOMMU_HELPER) += iommu-helper.o
 obj-$(CONFIG_FAULT_INJECTION) += fault-inject.o
 
 lib-$(CONFIG_GENERIC_BUG) += bug.o
diff --git a/lib/iommu-helper.c b/lib/iommu-helper.c
new file mode 100644
index 000..e7d8544
--- /dev/null
+++ b/lib/iommu-helper.c
@@ -0,0 +1,76 @@
+/*
+ * IOMMU helper functions for the free area management
+ */
+
+#include 
+#include 
+
+static unsigned long find_next_zero_area(unsigned long *map,
+unsigned long size,
+unsigned long start,
+unsigned int nr)
+{
+   unsigned long index, end, i;
+again:
+   index = find_next_zero_bit(map, size, start);
+   end = index + nr;
+   if (end > size)
+   return -1;
+   for (i = index + 1; i < end; i++) {
+   if (test_bit(i, map)) {
+   start = i+1;
+   goto again;
+   }
+   }
+   return index;
+}
+
+static inline void set_bit_area(unsigned long *map, unsigned long i,
+   int len)
+{
+   unsigned long end = i + len;
+   while (i < end) {
+   __set_bit(i, map);
+   i++;
+   }
+}
+
+static inline int is_span_boundary(unsigned int index, unsigned int nr,
+  unsigned long shift,
+  unsigned long boundary_size)
+{
+   shift = (shift + index) & (boundary_size - 1);
+   return shift + nr > boundary_size;
+}
+
+unsigned long iommu_area_alloc(unsigned long *map, unsigned long size,
+  unsigned long start, unsigned int nr,
+  unsigned long shift, unsigned long boundary_size,
+  unsigned long align_mask)
+{
+   unsigned long index;
+again:
+   index = find_next_zero_area(map, size, start, nr);
+   if (index != -1) {
+   index = (index + align_mask) & ~align_mask;
+   if (is_span_boundary(index, nr, shift, boundary_size)) {
+   /* we could do more effectively */
+   start = index + 1;
+   goto again;
+   }
+   set_bit_area(map, index, nr);
+   }
+   return index;
+}
+EXPORT_SYMBOL(iommu_area_alloc);
+
+void iommu_area_free(unsigned long *map, unsigned long start, unsigned int nr)
+{
+   unsigned long end = start + nr;
+
+   while (start < end) {
+   __clear_bit(start, map);
+   start++;
+   }
+}
+EXPORT_SYMBOL(iommu_area_free);
-- 
1.5.3.4


[PATCH -mm 6/6] kill __clear_bit_string and find_next_zero_string

2007-12-07 Thread FUJITA Tomonori

This kills the now unused __clear_bit_string and find_next_zero_string
(they were used only by the gart and calgary IOMMUs).

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 arch/x86/lib/Makefile_64|2 +-
 arch/x86/lib/bitstr_64.c|   28 
 include/asm-x86/bitops_64.h |   16 
 3 files changed, 1 insertions(+), 45 deletions(-)
 delete mode 100644 arch/x86/lib/bitstr_64.c

diff --git a/arch/x86/lib/Makefile_64 b/arch/x86/lib/Makefile_64
index bbabad3..1b72bda 100644
--- a/arch/x86/lib/Makefile_64
+++ b/arch/x86/lib/Makefile_64
@@ -9,5 +9,5 @@ obj-$(CONFIG_SMP)   += msr-on-cpu.o
 
 lib-y := csum-partial_64.o csum-copy_64.o csum-wrappers_64.o delay_64.o \
usercopy_64.o getuser_64.o putuser_64.o  \
-   thunk_64.o clear_page_64.o copy_page_64.o bitstr_64.o bitops_64.o
+   thunk_64.o clear_page_64.o copy_page_64.o bitops_64.o
 lib-y += memcpy_64.o memmove_64.o memset_64.o copy_user_64.o rwlock_64.o 
copy_user_nocache_64.o
diff --git a/arch/x86/lib/bitstr_64.c b/arch/x86/lib/bitstr_64.c
deleted file mode 100644
index 7445caf..000
--- a/arch/x86/lib/bitstr_64.c
+++ /dev/null
@@ -1,28 +0,0 @@
-#include 
-#include 
-
-/* Find string of zero bits in a bitmap */ 
-unsigned long 
-find_next_zero_string(unsigned long *bitmap, long start, long nbits, int len)
-{ 
-   unsigned long n, end, i;
-
- again:
-   n = find_next_zero_bit(bitmap, nbits, start);
-   if (n == -1) 
-   return -1;
-   
-   /* could test bitsliced, but it's hardly worth it */
-   end = n+len;
-   if (end > nbits)
-   return -1; 
-   for (i = n+1; i < end; i++) { 
-   if (test_bit(i, bitmap)) {  
-   start = i+1; 
-   goto again; 
-   } 
-   }
-   return n;
-}
-
-EXPORT_SYMBOL(find_next_zero_string);
diff --git a/include/asm-x86/bitops_64.h b/include/asm-x86/bitops_64.h
index 48adbf5..aaf1519 100644
--- a/include/asm-x86/bitops_64.h
+++ b/include/asm-x86/bitops_64.h
@@ -37,12 +37,6 @@ static inline long __scanbit(unsigned long val, unsigned long max)
   ((off)+(__scanbit(~(((*(unsigned long *)addr)) >> (off)),(size)-(off : \
find_next_zero_bit(addr,size,off)))
 
-/* 
- * Find string of zero bits in a bitmap. -1 when not found.
- */ 
-extern unsigned long 
-find_next_zero_string(unsigned long *bitmap, long start, long nbits, int len);
-
 static inline void set_bit_string(unsigned long *bitmap, unsigned long i, 
  int len) 
 { 
@@ -53,16 +47,6 @@ static inline void set_bit_string(unsigned long *bitmap, unsigned long i,
}
 } 
 
-static inline void __clear_bit_string(unsigned long *bitmap, unsigned long i, 
-   int len) 
-{ 
-   unsigned long end = i + len; 
-   while (i < end) {
-   __clear_bit(i, bitmap); 
-   i++;
-   }
-} 
-
 /**
  * ffz - find first zero in word.
  * @word: The word to search
-- 
1.5.3.4


[PATCH -mm 4/6] x86: convert calgary IOMMU to use the IOMMU helper

2007-12-07 Thread FUJITA Tomonori

This patch converts calgary IOMMU to use the IOMMU helper
functions. The IOMMU doesn't allocate a memory area spanning LLD's
segment boundary anymore.

Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig |3 +++
 arch/x86/kernel/pci-calgary_64.c |   34 --
 2 files changed, 23 insertions(+), 14 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 48d09cb..df22fe7 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -433,6 +433,9 @@ config CALGARY_IOMMU_ENABLED_BY_DEFAULT
  Calgary anyway, pass 'iommu=calgary' on the kernel command line.
  If unsure, say Y.
 
+config IOMMU_HELPER
+   def_bool CALGARY_IOMMU
+
 # need this always selected by IOMMU for the VIA workaround
 config SWIOTLB
bool
diff --git a/arch/x86/kernel/pci-calgary_64.c b/arch/x86/kernel/pci-calgary_64.c
index 21f34db..f5b47ba 100644
--- a/arch/x86/kernel/pci-calgary_64.c
+++ b/arch/x86/kernel/pci-calgary_64.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -260,22 +261,28 @@ static void iommu_range_reserve(struct iommu_table *tbl,
spin_unlock_irqrestore(&tbl->it_lock, flags);
 }
 
-static unsigned long iommu_range_alloc(struct iommu_table *tbl,
-   unsigned int npages)
+static unsigned long iommu_range_alloc(struct device *dev,
+  struct iommu_table *tbl,
+  unsigned int npages)
 {
unsigned long flags;
unsigned long offset;
+   unsigned long boundary_size;
+
+   boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
+ PAGE_SIZE) >> PAGE_SHIFT;
 
BUG_ON(npages == 0);
 
spin_lock_irqsave(&tbl->it_lock, flags);
 
-   offset = find_next_zero_string(tbl->it_map, tbl->it_hint,
-  tbl->it_size, npages);
+   offset = iommu_area_alloc(tbl->it_map, tbl->it_size, tbl->it_hint,
+ npages, 0, boundary_size, 0);
if (offset == ~0UL) {
tbl->chip_ops->tce_cache_blast(tbl);
-   offset = find_next_zero_string(tbl->it_map, 0,
-  tbl->it_size, npages);
+
+   offset = iommu_area_alloc(tbl->it_map, tbl->it_size, 0,
+ npages, 0, boundary_size, 0);
if (offset == ~0UL) {
printk(KERN_WARNING "Calgary: IOMMU full.\n");
spin_unlock_irqrestore(&tbl->it_lock, flags);
@@ -286,7 +293,6 @@ static unsigned long iommu_range_alloc(struct iommu_table *tbl,
}
}
 
-   set_bit_string(tbl->it_map, offset, npages);
tbl->it_hint = offset + npages;
BUG_ON(tbl->it_hint > tbl->it_size);
 
@@ -295,13 +301,13 @@ static unsigned long iommu_range_alloc(struct iommu_table *tbl,
return offset;
 }
 
-static dma_addr_t iommu_alloc(struct iommu_table *tbl, void *vaddr,
-   unsigned int npages, int direction)
+static dma_addr_t iommu_alloc(struct device *dev, struct iommu_table *tbl,
+ void *vaddr, unsigned int npages, int direction)
 {
unsigned long entry;
dma_addr_t ret = bad_dma_address;
 
-   entry = iommu_range_alloc(tbl, npages);
+   entry = iommu_range_alloc(dev, tbl, npages);
 
if (unlikely(entry == bad_dma_address))
goto error;
@@ -354,7 +360,7 @@ static void iommu_free(struct iommu_table *tbl, dma_addr_t dma_addr,
   badbit, tbl, dma_addr, entry, npages);
}
 
-   __clear_bit_string(tbl->it_map, entry, npages);
+   iommu_area_free(tbl->it_map, entry, npages);
 
spin_unlock_irqrestore(&tbl->it_lock, flags);
 }
@@ -438,7 +444,7 @@ static int calgary_map_sg(struct device *dev, struct scatterlist *sg,
vaddr = (unsigned long) sg_virt(s);
npages = num_dma_pages(vaddr, s->length);
 
-   entry = iommu_range_alloc(tbl, npages);
+   entry = iommu_range_alloc(dev, tbl, npages);
if (entry == bad_dma_address) {
/* makes sure unmap knows to stop */
s->dma_length = 0;
@@ -476,7 +482,7 @@ static dma_addr_t calgary_map_single(struct device *dev, void *vaddr,
npages = num_dma_pages(uaddr, size);
 
if (translation_enabled(tbl))
-   dma_handle = iommu_alloc(tbl, vaddr, npages, direction);
+   dma_handle = iommu_alloc(dev, tbl, vaddr, npages, direction);
else
dma_handle = virt_to_bus(vaddr);
 
@@ -516,7 +522,7 @@ static void* calgary_alloc_coherent(struct device *dev, size_t size,
 
if (translation_enabled(tbl)) {
/* set up tces to cover the allocated range */
-   mapping = iommu_alloc(tbl, ret, npages, DMA_BIDIRECTIONAL);
+
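
The boundary_size expression in iommu_range_alloc() above converts the device's segment boundary mask into a count of IOMMU pages. A minimal sketch of that arithmetic follows, assuming 4 KiB pages; the SKETCH_* macros and boundary_pages() are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Mirrors ALIGN(mask + 1, PAGE_SIZE) >> PAGE_SHIFT from the patch. */
static uint64_t boundary_pages(uint64_t seg_boundary_mask)
{
	uint64_t bytes;

	/* mask + 1 must not wrap around */
	assert(seg_boundary_mask != UINT64_MAX);

	bytes = seg_boundary_mask + 1;
	/* round up to a whole number of pages */
	bytes = (bytes + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
	return bytes >> SKETCH_PAGE_SHIFT;
}
```

Under that assumption a 32-bit boundary mask of 0xffffffff gives 2^20 pages, and any mask smaller than one page still rounds up to a single page.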

Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

2007-12-07 Thread Neil Horman

On Fri, Dec 07, 2007 at 09:39:44AM -0500, Vivek Goyal wrote:
> On Thu, Dec 06, 2007 at 07:10:23PM -0500, Neil Horman wrote:
> > On Thu, Dec 06, 2007 at 05:11:43PM -0500, Vivek Goyal wrote:
> > > On Thu, Dec 06, 2007 at 04:39:51PM -0500, Neil Horman wrote:
> > > > On Fri, Nov 30, 2007 at 09:51:31AM -0500, Neil Horman wrote:
> > > > > On Fri, Nov 30, 2007 at 09:42:50AM -0500, Vivek Goyal wrote:
> > > > 
> > > > > 
> > > > > Thats what I'm doing at the moment.  I'm working on a RHEL5 patch at 
> > > > > the moment
> > > > > (since thats whats on the production system thats failing), and will 
> > > > > forward
> > > > > port it once its working
> > > > > 
> > > > > And not to split hairs, but techically thats not our _only_ choice.  
> > > > > We could
> > > > > force kdump boots on cpu0 as well ;)
> > > > > 
> > > > > Thanks
> > > > > Neil
> > > > > 
> > > > > > Thanks
> > > > > > Vivek
> > > > > 
> > > > 
> > > > 
> > > > Sorry to have been quiet on this issue for a few days. Interesting news 
> > > > to
> > > > report, though.  So I was working on a patch to do early apic enabling 
> > > > on
> > > > x86_64, and had something working for the old 2.6.18 kernel that we were
> > > > origionally testing on.  Unfortunately while it worked on 2.6.18 it 
> > > > failed
> > > > miserably on 2.6.24-rc3-mm2, causing check_timer to consistently report 
> > > > that the
> > > > timer interrupt wasn't getting received (even though we could 
> > > > successfully run
> > > > calibrate_delay).  Vivek and I were digging into this, when I ran 
> > > > accross the
> > > > description of the hypertransport configuration register in the opteron
> > > > specification.  It contains a bit that, suprise, configures the ht bus 
> > > > to either
> > > > unicast interrupts delivered accross the ht bus to a single cpu, or to 
> > > > broadcast
> > > > it to all cpus.  Since it seemed more likely that the 8259 in the nvidia
> > > > southbridge was transporting legacy mode interrupts over the ht bus than
> > > > directly to cpu0 via an actual wire, I wrote the attached patch to add 
> > > > a quirk
> > > > for nvidia chipsets, which scanned for hypertransport controllers, and 
> > > > ensured
> > > > that that broadcast bit was set.  Test results indicate that this 
> > > > solves the
> > > > problem, and kdump kernels boot just fine on the affected system.
> > > > 
> > > 
> > > Hi Neil,
> > > 
> > > Should we disable this broadcasting feature once we are through? Otherwise
> > > in normal systems it might mean extra traffic on hypertransport. There
> > > is no need for every interrupt to be broadcasted in normal systems?
> > > 
> > > Thanks
> > > Vivek
> > 
> > No, I don't think thats necessecary.  Once the apics are enabled, interrupts
> > shouldn't travel accross the hypertransport bus anyway, opting instead to 
> > use
> > the dedicated apic bus (at least thats my understanding).
> 
> I think all interrupt message travel on hypertransport. Even after APICS
> have been enabled.
> 
> Look at the following document.
> 
> http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24674.pdf
> 
> Have a look at figure 1, figure 2 and section 3.4.2.2 and 3.4.2.3
> 
> That's a different thing that once IOAPIC has formed the vectored message,
> Hypertransport might not touch the destination field.
>  
Ok, that might be the case then.

> Having said that, I am wondering what will happen if a system continues
> to operate the timer through IOAPIC in ExtInt mode. Will hypertransport
> keep on broadcasting that interrupt to every cpu? And every cpu will 
> process that interrupt.
> 
I don't think so.  IIRC once the other cpus are started they all disable the
timer interrupt, except for one cpu, opting instead to get the timer tick via
IPI.  So while they all might see the interrupt packet on the ht bus, only one
cpu will process it.

> Hence, I feel it is safe to restore the broadcast bit back to BIOS value once
> we are through calibrate_delay().
> 
I disagree.  Looking at what Yinghai said, the default setting for the broadcast
bit isn't actually to unicast the interrupt, it's just to set the broadcast mask
to 0xF, or to 0xFF.  Its use is actually to allow cpus with an extended 8 bit
apic id to see interrupts.  So it's not so much to direct interrupts to cpu0, but
rather to the first 16 cpus rather than to all 255 available cpus.  From what
I've seen in my testing, systems that 'work' already have this bit set by bios,
and my quirk patch above does nothing to them.  Disabling this bit after
calibrate_delay is going to introduce more uncertainty in systems that have been
proven to work.  We should leave well enough alone, and just enable the bit if
it's off and we see that we are using extended apic ids via bit 18 of the same
register, as Yinghai pointed out.  By enabling the quirk that way, all we are
really doing is bringing into alignment two bits that should arguably be
set/cleared in unison anyway.

Regards
Neil

> Thanks
> Vivek

Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Ingo Molnar


* Bob Tracy <[EMAIL PROTECTED]> wrote:

> > Current state of the source tree is the 6f37ac... version, so I'll 
> > start backing out the above diffs in related groups and continue 
> > until I've got a working kernel.  For lack of an obvious target, 
> > I'll start with the seemingly innocuous change to sysctl_check.c.  
> > I'll report back when I've got something.
> 
> That was quick :-).  Backing out the sysctl_check.c diff gives me a 
> working kernel.  Beats the [EMAIL PROTECTED] out of me how/why, though.
> 
> Michael Cree: could you try backing out the diff below from your 
> 2.6.24-rc3 tree and see if things are now working for you?
> 
> Here's "uname -a", just to confirm (maybe) I'm running on what I say 
> works:
> 
> Linux smirkin 2.6.24-rc2-g6f37ac79-dirty #2 Fri Dec 7 08:03:12 CST 2007 alpha
> 
> Here's the diff I backed out (patch -R).  It's short...
> 
> diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
> index 5a2f2b2..4abc6d2 100644
> --- a/kernel/sysctl_check.c
> +++ b/kernel/sysctl_check.c
> @@ -738,7 +738,7 @@ static struct trans_ctl_table trans_net_table[] = {
>   { NET_ROSE, "rose", trans_net_rose_table },
>   { NET_IPV6, "ipv6", trans_net_ipv6_table },
>   { NET_X25,  "x25",  trans_net_x25_table },
> - { NET_TR,   "tr",   trans_net_tr_table },
> + { NET_TR,   "token-ring",   trans_net_tr_table },
>   { NET_DECNET,   "decnet",   trans_net_decnet_table },
>   /*  NET_ECONET not used */
>   { NET_SCTP, "sctp", trans_net_sctp_table },

reverting this makes the kernel image shorter by 8 bytes - so perhaps 
some alignment issue somewhere? Or something gets overflowed? Does any of 
this actually get used by your bootup?

Ingo
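
The 8-byte figure is at least consistent with the two string literals in the diff: each occupies its length plus a terminating NUL. A small sketch of that arithmetic, with literal_size() as a hypothetical helper; whether this alone explains the image size change depends on section padding, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes a string literal occupies, including its terminating NUL. */
static size_t literal_size(const char *s)
{
	assert(s != NULL);
	return strlen(s) + 1;
}
```

literal_size("token-ring") is 11 and literal_size("tr") is 3, a difference of 8.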

Re: 2.6.23.8: OOM killer kills wrong jobs

2007-12-07 Thread Alan Cox

On Fri, 07 Dec 2007 10:25:23 +0100
Martin MOKREJŠ <[EMAIL PROTECTED]> wrote:

> Hi,
>   first of all, sorry for not being up to date with how the OOM killer
> works. I think there used to be a kernel config option to disable
> OOM killer and instead kill the process which actually asks for the
> memory and supposedly caused the memory lack. That is what I would
> like to have on my system. I a have a 1GB RAM laptop and use t-coffee
> software from 
> http://www.tcoffee.org/Projects_home_page/t_coffee_home_page.html
> to do some science. ;)

The OOM killer triggers where there is no way to fulfill a page request.
Something has to go and there is no real notion of "right" or "wrong"
process at that point.

You can either set no overcommit in which case you'll get failed malloc
and similar rather than allow overcommit, or you can set the OOM priority
of tasks yourself so that your specific app of choice always dies first.

Alan
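
Both knobs Alan mentions are plain /proc files on kernels of this era, so a program can set them by writing short strings. The sketch below assumes /proc/sys/vm/overcommit_memory (value 2 for strict accounting) and /proc/<pid>/oom_adj (higher values make a task a more likely victim); newer kernels prefer oom_score_adj. The helper names are hypothetical, and writing these files normally needs root.

```c
#include <assert.h>
#include <stdio.h>

/* Write a short string to a /proc file; returns 0 on success, -1 on failure. */
static int write_proc_value(const char *path, const char *value)
{
	FILE *f;
	int ok;

	assert(path != NULL && value != NULL);

	f = fopen(path, "w");
	if (f == NULL)
		return -1;
	ok = (fputs(value, f) >= 0);
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/* Build "/proc/<pid>/oom_adj" in buf; returns snprintf()'s result. */
static int oom_adj_path(char *buf, size_t len, long pid)
{
	return snprintf(buf, len, "/proc/%ld/oom_adj", pid);
}
```

For example, write_proc_value("/proc/sys/vm/overcommit_memory", "2") would turn off overcommit, and writing "15" to the path built by oom_adj_path() would ask the kernel to pick that task first.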

Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

2007-12-07 Thread Vivek Goyal

On Thu, Dec 06, 2007 at 07:10:23PM -0500, Neil Horman wrote:
> On Thu, Dec 06, 2007 at 05:11:43PM -0500, Vivek Goyal wrote:
> > On Thu, Dec 06, 2007 at 04:39:51PM -0500, Neil Horman wrote:
> > > On Fri, Nov 30, 2007 at 09:51:31AM -0500, Neil Horman wrote:
> > > > On Fri, Nov 30, 2007 at 09:42:50AM -0500, Vivek Goyal wrote:
> > > 
> > > > 
> > > > Thats what I'm doing at the moment.  I'm working on a RHEL5 patch at 
> > > > the moment
> > > > (since thats whats on the production system thats failing), and will 
> > > > forward
> > > > port it once its working
> > > > 
> > > > And not to split hairs, but techically thats not our _only_ choice.  We 
> > > > could
> > > > force kdump boots on cpu0 as well ;)
> > > > 
> > > > Thanks
> > > > Neil
> > > > 
> > > > > Thanks
> > > > > Vivek
> > > > 
> > > 
> > > 
> > > Sorry to have been quiet on this issue for a few days. Interesting news to
> > > report, though.  So I was working on a patch to do early apic enabling on
> > > x86_64, and had something working for the old 2.6.18 kernel that we were
> > > origionally testing on.  Unfortunately while it worked on 2.6.18 it failed
> > > miserably on 2.6.24-rc3-mm2, causing check_timer to consistently report 
> > > that the
> > > timer interrupt wasn't getting received (even though we could 
> > > successfully run
> > > calibrate_delay).  Vivek and I were digging into this, when I ran accross 
> > > the
> > > description of the hypertransport configuration register in the opteron
> > > specification.  It contains a bit that, suprise, configures the ht bus to 
> > > either
> > > unicast interrupts delivered accross the ht bus to a single cpu, or to 
> > > broadcast
> > > it to all cpus.  Since it seemed more likely that the 8259 in the nvidia
> > > southbridge was transporting legacy mode interrupts over the ht bus than
> > > directly to cpu0 via an actual wire, I wrote the attached patch to add a 
> > > quirk
> > > for nvidia chipsets, which scanned for hypertransport controllers, and 
> > > ensured
> > > that that broadcast bit was set.  Test results indicate that this solves 
> > > the
> > > problem, and kdump kernels boot just fine on the affected system.
> > > 
> > 
> > Hi Neil,
> > 
> > Should we disable this broadcasting feature once we are through? Otherwise
> > in normal systems it might mean extra traffic on hypertransport. There
> > is no need for every interrupt to be broadcasted in normal systems?
> > 
> > Thanks
> > Vivek
> 
> No, I don't think thats necessecary.  Once the apics are enabled, interrupts
> shouldn't travel accross the hypertransport bus anyway, opting instead to use
> the dedicated apic bus (at least thats my understanding).

I think all interrupt messages travel on hypertransport, even after the APICs
have been enabled.

Look at the following document.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24674.pdf

Have a look at figure 1, figure 2 and section 3.4.2.2 and 3.4.2.3

It is a separate matter that once the IOAPIC has formed the vectored message,
Hypertransport might not touch the destination field.
 
Having said that, I am wondering what will happen if a system continues
to operate the timer through IOAPIC in ExtInt mode. Will hypertransport
keep on broadcasting that interrupt to every cpu, and will every cpu 
process that interrupt?

Hence, I feel it is safe to restore the broadcast bit back to BIOS value once
we are through calibrate_delay().

Thanks
Vivek

re: 2.6.23.8: OOM killer kills wrong jobs

2007-12-07 Thread Dan Kegel

Martin Mokrejš wrote:
> first of all, sorry for not being up to date with how the OOM killer
> works. I think there used to be a kernel config option to disable
> OOM killer and instead kill the process which actually asks for the
> memory and supposedly caused the memory lack. That is what I would
> like to have on my system. I a have a 1GB RAM laptop

You probably just need to add more swap space on your system.

Any time the OOM killer fires, something's wrong with the
system, and it's more productive to deal with that than to
wish for a more accurate OOM killer; see http://lwn.net/Articles/111408/

When I was working at a company that used embedded Linux,
I eventually figured this out, and patched the kernel to panic on OOM
conditions; that gave users the right incentive to avoid
configuring jobs that caused the system to run out of memory.
- Dan

Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

2007-12-07 Thread Neil Horman

On Fri, Dec 07, 2007 at 01:22:04AM -0800, Yinghai Lu wrote:
> On Dec 7, 2007 12:50 AM, Yinghai Lu <[EMAIL PROTECTED]> wrote:
> >
> > On Dec 6, 2007 4:33 PM, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
> ...
> > >
> > > My feel is that if it is for legacy interrupts only it should not be a 
> > > problem.
> > > Let's investigate and see if we can unconditionally enable this quirk
> > > for all opteron systems.
> >
> > i checked that bit
> >
> > http://www.openbios.org/viewvc/trunk/LinuxBIOSv2/src/northbridge/amd/amdk8/coherent_ht.c?revision=2596&view=markup
> >
> > static void enable_apic_ext_id(u8 node)
> > {
> > #if ENABLE_APIC_EXT_ID==1
> > #warning "FIXME Is the right place to enable apic ext id here?"
> >
> >   u32 val;
> >
> > val = pci_read_config32(NODE_HT(node), 0x68);
> > val |= (HTTC_APIC_EXT_SPUR | HTTC_APIC_EXT_ID | 
> > HTTC_APIC_EXT_BRD_CST);
> > pci_write_config32(NODE_HT(node), 0x68, val);
> > #endif
> > }
> >
> > that bit only be should be set when apic id is lifted and cpu apid is
> > using 8 bits and that mean broadcast is 0xff instead 0x0f.
> > for example 8 socket dual core system or 4 socket quad core
> > system,that you should make BSP start from 0x04, so cpus apic id will
> > be [0x04, 0x13)
> >
> >
> > So if you want to enable that in early_quirk, you need to
> > make sure apic id is using 8 bits by check if the bit 16 (HTTC_APIC_ID) is 
> > set.
> 
> it should be bit 18 (HTTC_APIC_EXT_ID)
> 
> 
> YH

This seems reasonable, I can reroll the patch for this.  As I think about it I'm
also going to update the patch to make this check occur for any pci class 0600
device from vendor AMD, since it's possible that more than just nvidia chipsets
can be affected.

I'll repost as soon as I've tested, thanks!
Neil

-- 
/***
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 [EMAIL PROTECTED]
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***/

Re: [PATCH 4/4 -mm] kexec based hibernation -v7 : kimgcore

2007-12-07 Thread huang ying

On Dec 7, 2007 8:33 PM, Rafael J. Wysocki <[EMAIL PROTECTED]> wrote:
> On Friday, 7 of December 2007, Huang, Ying wrote:
> > This patch adds a file in proc file system to access the loaded
> > kexec_image, which may contains the memory image of kexeced
> > system. This can be used by kexec based hibernation to create a file
> > image of hibernating kernel, so that a kernel booting process is not
> > needed for each hibernating.
>
> Hm, I'm not sure what you mean.
>
> Can you explain a bit, please?

The normal kexec based hibernation procedure is as follows:

1. kexec_load the kernel image and initramfs
2. jump to hibernating kernel
3. the normal boot process of kexeced kernel
4. jump back to hibernated kernel
5. execute ACPI methods
6. jump to hibernating kernel
7. write memory image of hibernated kernel
8. go to ACPI S4 state

With kimgcore:

A. Prepare a memory image of hibernation kernel:

A.1 kexec_load the kernel image and initramfs
A.2 jump to hibernating kernel
A.3 the normal boot process of kexeced kernel
A.4 jump back to hibernated kernel
A.5 save the memory image of hibernating kernel via kimgcore

The normal hibernation process is then as follows:

1. kexec load the kimgcore of the hibernating kernel
2. jump to the hibernating kernel
3. execute ACPI methods
4. jump to hibernating kernel
5. write memory image of hibernated kernel
6. go to ACPI S4 state

So the boot process of the hibernating kernel is needed only once, unless
the hardware configuration is changed.

Best Regards,
Huang Ying

Re: question about sata-error on boot.

2007-12-07 Thread Hemmann, Volker Armin

Hi,

On Mittwoch, 7. November 2007, Andrew Morton wrote:
> > On Fri, 2 Nov 2007 19:34:20 +0100 "Hemmann, Volker Armin"
> > <[EMAIL PROTECTED]> wrote: Hi,
>
> (cc linux-ide)
>
> > for some time (and I can't say for how long, but the board is less than a
> > month old) I get this error on boot:
> >
> > [   42.116273] ahci :00:0a.0: version 2.2
> > [   42.116482] ACPI: PCI Interrupt Link [LSA0] enabled at IRQ 23
> > [   42.116653] ACPI: PCI Interrupt :00:0a.0[A] -> Link [LSA0] -> GSI
> > 23 (level, low) -> IRQ 23
> > [   43.119478] ahci :00:0a.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps
> > 0xf impl IDE mode
> > [   43.119778] ahci :00:0a.0: flags: 64bit led clo pmp pio
> > [   43.119943] PCI: Setting latency timer of device :00:0a.0 to 64
> > [   43.120149] scsi0 : ahci
> > [   43.120365] scsi1 : ahci
> > [   43.120556] scsi2 : ahci
> > [   43.120741] scsi3 : ahci
> > [   43.120927] ata1: SATA max UDMA/133 cmd 0xc2014100 ctl
> > 0x bmdma 0x irq 315
> > [   43.121227] ata2: SATA max UDMA/133 cmd 0xc2014180 ctl
> > 0x bmdma 0x irq 315
> > [   43.121526] ata3: SATA max UDMA/133 cmd 0xc2014200 ctl
> > 0x bmdma 0x irq 315
> > [   43.121826] ata4: SATA max UDMA/133 cmd 0xc2014280 ctl
> > 0x bmdma 0x irq 315
> > [   43.934296] ata1: softreset failed (1st FIS failed)
> > [   43.934461] ata1: reset failed (errno=-5), retrying in 10 secs
> > [   53.885194] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> > [   53.885890] ata1.00: ATA-7: WDC WD1600JS-00MHB1, 10.02E01, max UDMA/133
> > [   53.886056] ata1.00: 312581808 sectors, multi 16: LBA48
> > [   53.886804] ata1.00: configured for UDMA/133
> > [   54.201147] ata2: SATA link down (SStatus 0 SControl 300)
> > [   54.517101] ata3: SATA link down (SStatus 0 SControl 300)
> > [   54.833055] ata4: SATA link down (SStatus 0 SControl 300)

This is gone with 2.6.22.13 and 2.6.23.9:

[   33.277039] scsi0 : ahci
[   33.277262] scsi1 : ahci
[   33.277454] scsi2 : ahci
[   33.277645] scsi3 : ahci
[   33.277826] ata1: SATA max UDMA/133 cmd 0xc2020100 ctl 0x bmdma 0x irq 315
[   33.278120] ata2: SATA max UDMA/133 cmd 0xc2020180 ctl 0x bmdma 0x irq 315
[   33.278414] ata3: SATA max UDMA/133 cmd 0xc2020200 ctl 0x bmdma 0x irq 315
[   33.278708] ata4: SATA max UDMA/133 cmd 0xc2020280 ctl 0x bmdma 0x irq 315
[   33.751855] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   33.752659] ata1.00: ATA-7: WDC WD1600JS-00MHB1, 10.02E01, max UDMA/133
[   33.752821] ata1.00: 312581808 sectors, multi 16: LBA48
[   33.753574] ata1.00: configured for UDMA/133
[   34.067809] ata2: SATA link down (SStatus 0 SControl 300)
[   34.383762] ata3: SATA link down (SStatus 0 SControl 300)
[   34.699717] ata4: SATA link down (SStatus 0 SControl 300)
[   34.700029] scsi 0:0:0:0: Direct-Access ATA  WDC WD1600JS-00M 10.0 PQ: 0 ANSI: 5
[   34.700377] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[   34.700544] sd 0:0:0:0: [sda] Write Protect is off
[   34.700703] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   34.700712] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   34.701026] sd 0:0:0:0: [sda] 312581808 512-byte hardware sectors (160042 MB)
[   34.701191] sd 0:0:0:0: [sda] Write Protect is off
[   34.701350] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[   34.701358] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   34.701651]  sda: sda1 sda2 sda3 sda4 < sda5 sda6 >

Re: 2.6.24-rc4-mm1

2007-12-07 Thread Ilpo Järvinen

On Wed, 5 Dec 2007, David Miller wrote:

> From: Reuben Farrelly <[EMAIL PROTECTED]>
> Date: Thu, 06 Dec 2007 17:59:37 +1100
> 
> > On 5/12/2007 4:17 PM, Andrew Morton wrote:
> > > - Lots of device IDs have been removed from the e1000 driver and moved 
> > > over
> > >   to e1000e.  So if your e1000 stops working, you forgot to set 
> > > CONFIG_E1000E.
> > 
> > This non fatal oops which I have just noticed may be related to this change 
> > then 
> > - certainly looks networking related.
> > 
> > WARNING: at net/ipv4/tcp_input.c:2518 tcp_fastretrans_alert()
> > Pid: 0, comm: swapper Not tainted 2.6.24-rc4-mm1 #1
> > 
> > Call Trace:
> > [] tcp_fastretrans_alert+0x229/0xe63
> >   [] tcp_ack+0xa3f/0x127d
> >   [] tcp_rcv_established+0x55f/0x7f8
> >   [] tcp_v4_do_rcv+0xdb/0x3a7
> >   [] :nf_conntrack:nf_ct_deliver_cached_events+0x75/0x99
> 
> No, it's from TCP assertions and changes added by Ilpo to the
> net-2.6.25 tree recently.

Yeah, this is (very likely) due to the new SACK processing (in net-2.6.25). 
I'll look at what could go wrong with the fack_count calculations, as they are 
most likely the reason (I found one out-of-place retransmission segment 
earlier in one of my test cases, which already indicated that there's 
something incorrect with them, but I didn't have time to debug it yet).

Thanks for the report. Some info about how easily you can reproduce it & 
a couple of sentences about the test case might be useful later on when 
evaluating the fix.

-- 
 i.

Possible EXT2 race

2007-12-07 Thread linux-os (Dick Johnson)



On linux-2.6.22.1, executing the following script
while the mailer is writing to /var/spool/mail/linux-os.


#!/bin/bash
while true ;
do
>/var/spool/mail/linux-os;
sleep 1;
done

...will cause the following errors to occur.

Dec  7 04:05:55 chaos kernel: sd 0:0:1:0: [sdb] Sense Key : No Sense [deferred] 
Dec  7 04:05:55 chaos kernel: Info fld=0x1980240
Dec  7 04:05:55 chaos kernel: sd 0:0:1:0: [sdb] Add. Sense: Peripheral device write fault
Dec  7 04:08:13 chaos kernel: attempt to access beyond end of device
Dec  7 04:08:13 chaos kernel: sdb1: rw=0, want=29687515944, limit=33736437
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_xattr_delete_inode: inode 656387: block -584027804 read error
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_blocks: Freeing blocks not in datazone - block = 3710940964, count = 1
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_blocks: Freeing blocks not in datazone - block = 4294967295, count = 1
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_blocks: Freeing blocks not in datazone - block = 3710940980, count = 1
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_blocks: Freeing blocks not in datazone - block = 3710940980, count = 1
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_blocks: bit already cleared for block 1
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_blocks: Freeing blocks not in datazone - block = 3710941012, count = 1
Dec  7 04:08:13 chaos kernel: attempt to access beyond end of device
Dec  7 04:08:13 chaos kernel: sdb1: rw=0, want=29687528104, limit=33736437
Dec  7 04:08:13 chaos kernel: EXT2-fs error (device sdb1): ext2_free_branches: Read failure, inode=656399, block=-584026284
Dec  7 04:08:13 chaos kernel: attempt to access beyond end of device
Dec  7 04:08:13 chaos kernel: sdb1: rw=0, want=29687529288, limit=33736437
Dec  7 04:08:15 chaos kernel: EXT2-fs error (device sdb1): ext2_xattr_delete_inode: inode 656400: block -584026136 read error
Dec  7 04:08:18 chaos kernel: EXT2-fs error (device sdb1): ext2_xattr_delete_inode: inode 656403: bad block 30188


Caution is advised when testing because this destroyed a filesystem,
making it unfixable by `fsck`.
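
A note on reading the log above: the negative block numbers and the "want=" sector counts appear to describe the same addresses. Treating each printed block as an unsigned 32-bit value and assuming 8 sectors of 512 bytes per block (a 4 KiB block size, which is inferred from the numbers and not stated in the report) reproduces the "want=" values exactly. A small sketch, with end_sector() as a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

/* End sector of a block, given the block number as printed in signed form. */
static uint64_t end_sector(int32_t printed_block, uint32_t sectors_per_block)
{
	/* reinterpret the signed print-out as the unsigned on-disk value */
	uint64_t block = (uint32_t)printed_block;

	assert(sectors_per_block > 0);
	return (block + 1) * sectors_per_block;
}
```

For instance, block -584027804 is 3710939492 unsigned, and (3710939492 + 1) * 8 = 29687515944, the first "want=" value. All of these lie far beyond limit=33736437, which suggests the block pointers themselves held garbage.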


Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.29 BogoMips).
My book : http://www.AbominableFirebug.com/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: everything in wait_for_completion, what is my system doing?

2007-12-07 Thread Bernd Schubert

Hello Andrew,

thanks for your help!

On Friday 07 December 2007 02:09:11 Andrew Morton wrote:
> On Wed, 5 Dec 2007 21:44:54 +0100
>
> Bernd Schubert <[EMAIL PROTECTED]> wrote:
> > after scsi-recovery a system here went into some kind lock-up, everything
> > seems to be in wait_for_completion(). Please see the attached
> > blocked_states.txt and all_states.txt files.
> > This is 2.6.22.12, I can easily find out the line numbers if required.
> >
> > Any help is highly appreciated.
>
> Please cc linux-scsi on scsi-related reports.

Sorry, these traces confused me a bit. I had absolutely no idea about a 
possible reason.

>
> > [blocked_states.txt  text/plain (20.5KB)]
> > [generate break]
> > [ 1818.566436] SysRq : Show Blocked State
> > [ 1818.570260]
> > [ 1818.570261]  free sibling
> > [ 1818.579253]   task PCstack   pid father child younger older
> > [ 1818.586987] events/7  D 0155dd642280 026  2 (L-TLB)
> > [ 1818.593747]  81012b529ac0 0046  810128280d18
> > [ 1818.601321]  8100ba2376f8 81012b689630 81012aff76b0 00078023e215
> > [ 1818.608870]  00010003ca14  810001065400 000780430c13
> > [ 1818.616222] Call Trace:
> > [ 1818.618925]  [] io_schedule+0x28/0x36
> > [ 1818.624207]  [] get_request_wait+0x104/0x158
> > [ 1818.630112]  [] blk_get_request+0x36/0x6b
> > [ 1818.635755]  [] scsi_execute+0x51/0x129
> > [ 1818.641240]  [] :scsi_transport_spi:spi_execute+0x87/0xf8
> > [ 1818.648271]  [] :scsi_transport_spi:spi_dv_device_echo_buffer+0x181/0x27d
> > [ 1818.656739]  [] :scsi_transport_spi:spi_dv_retrain+0x4e/0x240
> > [ 1818.664139]  [] :scsi_transport_spi:spi_dv_device+0x615/0x69c
> > [ 1818.671542]  [] :mptspi:mptspi_dv_device+0xb3/0x14b
> > [ 1818.678042]  [] :mptspi:mptspi_dv_renegotiate_work+0xcb/0xef
> > [ 1818.685348]  [] run_workqueue+0x8e/0x120
> > [ 1818.690905]  [] worker_thread+0x106/0x117
> > [ 1818.696540]  [] kthread+0x4b/0x82
> > [ 1818.701474]  [] child_rip+0xa/0x12
> > [ 1818.706495]
> > [ 1818.708022] unionfs-fuse- D 01a76ef63463 0  1119  1 (NOTLB)
> > [ 1818.714764]  810129765988 0082  80337e22
> > [ 1818.722329]  8101297658c8 81012b652f20 810129eec810 0006
> > [ 1818.729895]  00010005204e  81000105c400 000680337c3e
> > [ 1818.737249] Call Trace:
> > [ 1818.739953]  [] schedule_timeout+0x8a/0xb6
> > [ 1818.745673]  [] io_schedule_timeout+0x28/0x36
> > [ 1818.751664]  [] congestion_wait+0x9d/0xc2
> > [ 1818.757300]  [] balance_dirty_pages_ratelimited_nr+0x196/0x22f
> > [ 1818.764781]  [] generic_file_buffered_write+0x52a/0x60d
> > [ 1818.771641]  [] __generic_file_aio_write_nolock+0x45a/0x491
> > [ 1818.778852]  [] generic_file_aio_write+0x61/0xc1
> > [ 1818.785101]  [] nfs_file_write+0x138/0x1b7
> > [ 1818.790822]  [] do_sync_write+0xcc/0x112
> > [ 1818.796372]  [] vfs_write+0xc3/0x165
> > [ 1818.801575]  [] sys_pwrite64+0x68/0x96
> > [ 1818.806959]  [] system_call+0x7e/0x83
> > [ 1818.812250]  [<2b4eeec3ea73>]
> >
> > [snippage]
>
> Possibly your device driver had conniptions and stopped generating
> completion interrupts.
>
> Which driver is in use?

This time it is easily visible from the traces (mptspi_dv_device) ;) So it's 
the mpt driver; we are using LSI22320 cards (I CC'ed Eric).

>
> I don't suppose it is repeatable.

That's a clear "yes and no". We have hit exactly this state two or three times 
during an exhausting hardware stress test over the last weeks (with real and 
with simulated errors), but it's not easily reproducible. Furthermore, the 
hardware will go into production soon and I won't have the chance to simulate 
further errors.
However, we can easily get a similar state just on a raid6-rebuild, though 
only with high-end hardware. 
(You will probably never run into it with normal disks; we are doing 
software-raid over a bunch of hardware raid systems.)

In the raid6-rebuild case the system is not completely locked up, just mostly. 
Somehow raid6-rebuild is still working, we can see this by the io usage 
status of the hardware-raids, but the system is completely blocked otherwise. 
Only pings and sysrq's are working.


Thanks,
Bernd


-- 
Bernd Schubert
Q-Leap Networks GmbH

[PATCH] kbuild: implement modules.order, take #2

2007-12-07 Thread Tejun Heo

When multiple built-in modules (especially drivers) provide the same
capability, they're prioritized by link order specified by the order
listed in Makefile.  This implicit ordering is lost for loadable
modules.

When driver modules are loaded by udev, whichever comes first in the
modules.alias file is selected.  However, the order in this file is
nondeterministic (it depends on the filesystem listing order of the
installed modules).  This causes confusion.

The solution has two parts.  This patch updates kbuild such that it
generates and installs modules.order, which contains the names of
modules ordered according to Makefile.  The second part is an update to
depmod such that it generates output files according to this file.

Note that both obj-y and obj-m subdirs can contain modules, and the
ordering information between those two is lost from the beginning.
Currently obj-y subdirs are put before obj-m subdirs.

Sam Ravnborg cleaned up Makefile modifications and suggested using awk
to remove duplicate lines from modules.order instead of using separate
C program.
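
For readers unfamiliar with the awk idiom used here: '!x[$0]++' prints a
line only the first time it is seen, so the first occurrence wins and the
relative order of the remaining lines is preserved.  A minimal C sketch of
the same idea follows, assuming a small in-memory array of module paths.
dedup_lines() and its quadratic scan are illustrative only and are not part
of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: drop repeated strings in place, keeping the first
 * occurrence of each and preserving relative order.  Returns the new
 * number of entries.  O(n^2), which is fine for a sketch.
 */
size_t dedup_lines(const char **lines, size_t n)
{
    size_t out = 0;     /* lines[0..out) holds the unique entries so far */

    assert(lines != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        size_t j;

        for (j = 0; j < out; j++)
            if (strcmp(lines[i], lines[j]) == 0)
                break;
        if (j == out)   /* not seen before: keep it */
            lines[out++] = lines[i];
    }
    return out;
}
```

Keeping the first occurrence matters here because the position of the
first listing is what encodes the Makefile order.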

Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
Cc: Sam Ravnborg <[EMAIL PROTECTED]>
Cc: Bill Nottingham <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>
Cc: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Cc: Kay Sievers <[EMAIL PROTECTED]>
---
 Makefile   |8 +++-
 scripts/Makefile.build |   17 -
 scripts/Makefile.lib   |6 ++
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 92dc3cb..1542dd2 100644
--- a/Makefile
+++ b/Makefile
@@ -1020,9 +1020,14 @@ ifdef CONFIG_MODULES
 all: modules
 
 #  Build modules
+#
+#  A module can be listed more than once in obj-m resulting in
+#  duplicate lines in modules.order files.  Those are removed
+#  using awk while concatenating to the final file.
 
 PHONY += modules
 modules: $(vmlinux-dirs) $(if $(KBUILD_BUILTIN),vmlinux)
+   $(Q)$(AWK) '!x[$$0]++' $(vmlinux-dirs:%=$(objtree)/%/modules.order) > $(objtree)/modules.order
@echo '  Building modules, stage 2.';
$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modpost
 
@@ -1050,6 +1055,7 @@ _modinst_:
rm -f $(MODLIB)/build ; \
ln -s $(objtree) $(MODLIB)/build ; \
fi
+   @cp -f $(objtree)/modules.order $(MODLIB)/
$(Q)$(MAKE) -f $(srctree)/scripts/Makefile.modinst
 
 # This depmod is only for convenience to give the initial
@@ -1109,7 +1115,7 @@ clean: archclean $(clean-dirs)
@find . $(RCS_FIND_IGNORE) \
\( -name '*.[oas]' -o -name '*.ko' -o -name '.*.cmd' \
-o -name '.*.d' -o -name '.*.tmp' -o -name '*.mod.c' \
-   -o -name '*.symtypes' \) \
+   -o -name '*.symtypes' -o -name 'modules.order' \) \
-type f -print | xargs rm -f
 
 # mrproper - Delete all generated files, including .config
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index de9836e..875cbdb 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -83,10 +83,12 @@ ifneq ($(strip $(obj-y) $(obj-m) $(obj-n) $(obj-) $(lib-target)),)
 builtin-target := $(obj)/built-in.o
 endif
 
+modorder-target := $(obj)/modules.order
+
 # We keep a list of all modules in $(MODVERDIR)
 
 __build: $(if $(KBUILD_BUILTIN),$(builtin-target) $(lib-target) $(extra-y)) \
-$(if $(KBUILD_MODULES),$(obj-m)) \
+$(if $(KBUILD_MODULES),$(obj-m) $(modorder-target)) \
 $(subdir-ym) $(always)
@:
 
@@ -276,6 +278,19 @@ targets += $(builtin-target)
 endif # builtin-target
 
 #
+# Rule to create modules.order file
+#
+# Create commands to either record .ko file or cat modules.order from
+# a subdirectory
+modorder-cmds =\
+   $(foreach m, $(modorder),   \
+   $(if $(filter %/modules.order, $m), \
+   cat $m;, echo kernel/$m;)) 
+
+$(modorder-target): $(subdir-ym) FORCE
+   $(Q)(cat /dev/null; $(modorder-cmds)) > $@
+
+#
 # Rule to compile a set of .o files into one .a file
 #
 ifdef lib-target
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 3c5e88b..8e44023 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -25,6 +25,11 @@ lib-y := $(filter-out $(obj-y), $(sort $(lib-y) $(lib-m)))
 # o if we encounter foo/ in $(obj-m), remove it from $(obj-m) 
 #   and add the directory to the list of dirs to descend into: $(subdir-m)
 
+# Determine modorder.
+# Unfortunately, we don't have information about ordering between -y
+# and -m subdirs.  Just put -y's first.
+modorder   := $(patsubst %/,%/modules.order, $(filter %/, $(obj-y)) $(obj-m:.o=.ko))
+
 __subdir-y := $(patsubst %/,%,$(filter %/, $(obj-y)))
 subdir-y   += $(__subdir-y)
 __subdir-m := $(patsubst %/,%,$(filter %/, $(obj-m)))
@@ -64,6 +69,7 @@ real-objs-m := $(foreach m, $(obj-m), $(if $(strip 
$($(m:.o=-objs)) $($(m:.o=-y)
 extra-y:= $(addprefix $(obj)/,$(

Re: [RFC][POWERPC] Provide a way to protect 4k subpages when using 64k pages

2007-12-07 Thread Arnd Bergmann

On Friday 07 December 2007, Paul Mackerras wrote:
> I have re-purposed the ioperm system call for this.  The old ioperm
> system call never did anything (except return an ENOSYS error) and in
> fact never could have actually been useful for anything on the PowerPC
> architecture, so nothing ever used it.

Couldn't there be a program that relies on ioperm to return -ENOSYS on
powerpc in order to fall back on some other method of I/O access?

The risk of actually breaking something is certainly low, but I think
you can never be sure here, so why not use a new syscall number?

Arnd <><

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Guillaume Chazarain

On Dec 7, 2007 12:18 PM, Guillaume Chazarain <[EMAIL PROTECTED]> wrote:
> Any pointer to it?

Never mind, I found it ... in this same thread :-(

-- 
Guillaume

Re: NFSv2/3 broken exporting/mounting (permission denied) in 2.6.24-rc4

2007-12-07 Thread J. Bruce Fields

On Fri, Dec 07, 2007 at 11:54:38AM +0100, Mikael Pettersson wrote:
> On Thu, 6 Dec 2007 21:20:41 -0500, Erez Zadok wrote:
> > I get a "permission denied" when trying to mount a localhost nfsv2/3
> > exported volume, on v2.6.24-rc4-124-gf194d13.  It works w/ nfsv4 mounting.
> > It worked fine in 2.6.24-rc3.  Here's a sequence of ops I tried:
> > 
> > # mount -t ext2 /dev/hdb1 /n/lower/b0
> > # exportfs -o no_root_squash,rw localhost:/n/lower/b0
> > # mount -t nfs -o nfsvers=3 localhost:/n/lower/b0 /mnt
> 
> I'm seeing something similar too. NFSv3 export of an ext3 partition
> to another machine in my lan fails (client gets permission denied)
> when the server runs 2.6.24-rc4. It worked fine in 2.6.24-rc3.
> 
> There's no NFSv4 of any kind on either client or server.

And you're not varying the client at all, you're only changing the
kernel version on the server?

There are literally no commits between v2.6.24-rc3 and v2.6.24-rc4 which
touch fs/nfsd/.  What filesystem are you exporting?  Are the nfs-utils
versions the same in both cases?

Also, could you get a network trace showing the failure?

--b.

Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

2007-12-07 Thread Neil Horman

On Fri, Dec 07, 2007 at 10:16:23AM -0500, Vivek Goyal wrote:
> On Fri, Dec 07, 2007 at 09:53:15AM -0500, Neil Horman wrote:
> > On Fri, Dec 07, 2007 at 09:39:44AM -0500, Vivek Goyal wrote:
> > > On Thu, Dec 06, 2007 at 07:10:23PM -0500, Neil Horman wrote:
> > > > On Thu, Dec 06, 2007 at 05:11:43PM -0500, Vivek Goyal wrote:
> > > > > On Thu, Dec 06, 2007 at 04:39:51PM -0500, Neil Horman wrote:
> > > > > > On Fri, Nov 30, 2007 at 09:51:31AM -0500, Neil Horman wrote:
> > > > > > > On Fri, Nov 30, 2007 at 09:42:50AM -0500, Vivek Goyal wrote:
> > > > > > 
> > > > > > > 
> > > > > > > That's what I'm doing at the moment.  I'm working on a RHEL5 patch
> > > > > > > (since that's what's on the production system that's failing), and
> > > > > > > will forward-port it once it's working.
> > > > > > > 
> > > > > > > And not to split hairs, but technically that's not our _only_
> > > > > > > choice.  We could force kdump boots on cpu0 as well ;)
> > > > > > > 
> > > > > > > Thanks
> > > > > > > Neil
> > > > > > > 
> > > > > > > > Thanks
> > > > > > > > Vivek
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > > Sorry to have been quiet on this issue for a few days. Interesting news to
> > > > > > report, though.  So I was working on a patch to do early apic enabling on
> > > > > > x86_64, and had something working for the old 2.6.18 kernel that we were
> > > > > > originally testing on.  Unfortunately while it worked on 2.6.18 it failed
> > > > > > miserably on 2.6.24-rc3-mm2, causing check_timer to consistently report that the
> > > > > > timer interrupt wasn't getting received (even though we could successfully run
> > > > > > calibrate_delay).  Vivek and I were digging into this, when I ran across the
> > > > > > description of the hypertransport configuration register in the opteron
> > > > > > specification.  It contains a bit that, surprise, configures the ht bus to either
> > > > > > unicast interrupts delivered across the ht bus to a single cpu, or to broadcast
> > > > > > them to all cpus.  Since it seemed more likely that the 8259 in the nvidia
> > > > > > southbridge was transporting legacy mode interrupts over the ht bus than
> > > > > > directly to cpu0 via an actual wire, I wrote the attached patch to add a quirk
> > > > > > for nvidia chipsets, which scanned for hypertransport controllers, and ensured
> > > > > > that that broadcast bit was set.  Test results indicate that this solves the
> > > > > > problem, and kdump kernels boot just fine on the affected system.
> > > > > > 
> > > > > 
> > > > > Hi Neil,
> > > > > 
> > > > > Should we disable this broadcasting feature once we are through? Otherwise
> > > > > in normal systems it might mean extra traffic on hypertransport. There
> > > > > is no need for every interrupt to be broadcast in normal systems?
> > > > > 
> > > > > Thanks
> > > > > Vivek
> > > > 
> > > > No, I don't think that's necessary.  Once the apics are enabled, interrupts
> > > > shouldn't travel across the hypertransport bus anyway, opting instead to use
> > > > the dedicated apic bus (at least that's my understanding).
> > > 
> > > I think all interrupt messages travel on hypertransport, even after the APICs
> > > have been enabled.
> > > 
> > > Look at the following document.
> > > 
> > > http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24674.pdf
> > > 
> > > Have a look at figure 1, figure 2 and section 3.4.2.2 and 3.4.2.3
> > > 
> > > That's a different thing that once IOAPIC has formed the vectored message,
> > > Hypertransport might not touch the destination field.
> > >  
> > Ok, that might be the case then.
> > 
> > > Having said that, I am wondering what will happen if a system continues
> > > to operate the timer through IOAPIC in ExtInt mode. Will hypertransport
> > > keep on broadcasting that interrupt to every cpu? And every cpu will 
> > > process that interrupt.
> > > 
> > I don't think so.  IIRC once the other cpus are started they all disable the
> > timer interrupt, except for one cpu, opting instead to get the timer tick via
> > ipi. So while they all might see the interrupt packet on the ht bus, only one
> > cpu will process it.
> > 
> 
> Does LAPIC allow to disable a specific vector and not accept interrupts? I
> don't think so. If a timer interrupt is broadcasted to every cpu I think
> everybody will accept it (like broadcast IPI). That's why intelligence
> is built into IOAPIC and direct interrupts to a cpu or group of cpu.
> 
See disable_APIC_timer().  It seems to set the mask bit in the APIC_LVTT entry.
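
As a rough illustration of what "setting the mask bit" amounts to: each
local APIC LVT entry carries a mask bit (bit 16), and a masked entry does
not deliver its interrupt.  The sketch below is plain userspace C working
on a value, not on real APIC registers.  The helper names are made up; in
the kernel the read-modify-write goes through apic_read() and apic_write()
on APIC_LVTT.

```c
#include <stdint.h>

/* Mask bit of a local APIC LVT entry (bit 16). */
#define LVT_MASKED (UINT32_C(1) << 16)

/* Return the LVT value with delivery masked. */
uint32_t lvt_mask(uint32_t lvt)
{
    return lvt | LVT_MASKED;
}

/* Return the LVT value with delivery unmasked. */
uint32_t lvt_unmask(uint32_t lvt)
{
    return lvt & ~LVT_MASKED;
}

/* Non-zero if the entry is currently masked. */
int lvt_is_masked(uint32_t lvt)
{
    return (lvt & LVT_MASKED) != 0;
}
```

Only the mask bit changes; the vector and mode bits in the rest of the
entry are left as they were.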

> I am just trying to understand the functionality better. Can somebody help me
> understand how do we make sure that same timer interrupt is not processed by

RE: ptrace API extensions for BTS

2007-12-07 Thread Metzger, Markus T

>From: Andi Kleen [mailto:[EMAIL PROTECTED] 
>Sent: Freitag, 7. Dezember 2007 14:04

>With Out-of-order CPUs exact global metrics are pretty difficult.
>At which point of the instruction execution would you measure? 

All I want to do is order the execution chunks of different 
threads. Taking two snapshots somewhere near the beginning and 
the end of context switching should be good enough.

There's all the scheduler code in between (or at least the context
switch code). I don't think I need to worry about the exact point
during instruction execution.

I don't think it makes sense to try to correlate instructions
from different threads. It would be a wonderful feature to show
a synchronous trace across multiple threads. But that would require
you to measure time for each instruction. I don't think that's
feasible without reducing performance to single stepping;-)


>Anyways if RDTSC doesn't work the only global alternatives are 
>much slower
>(like southbridge timers) or very inaccurate (jiffies) 

Would jiffies be a metric that works across cpu's?
At the granularity that I want to measure, I guess that accuracy
is not important at all.


>I would just drop it since it'll likely always be somewhat misleading.

I guess I will (have to) drop it if it cannot be used for what I
intended.


thanks and regards,
markus.

Re: [Patch] net/xfrm/xfrm_policy.c: Some small improvements

2007-12-07 Thread Richard Knutsson

David Miller wrote:
> From: Richard Knutsson <[EMAIL PROTECTED]>
> Date: Thu, 06 Dec 2007 15:37:46 +0100
>
> > David Miller wrote:
> > > But this time I'll just let you know up front that I
> > > don't see much value in this patch.  It is not a clear
> > > improvement to replace int's with bool's in my mind and
> > > the other changes are just whitespace changes.
> >
> > Is it not an improvement to distinguish booleans from actual values? Do you
> > use integers for ASCII characters too? It can also avoid some potential
> > bugs like the 'if (i == TRUE)'...
> > What is wrong with 'size_t' (since it is unsigned, compared to (some)
> > 'int')?
>
> When you say "int found;" is there any doubt in your mind that
> this integer is going to hold a 1 or a 0 depending upon whether
> we "found" something?
>
> That's the problem I have with these kinds of patches, they do
> not increase clarity, it's just pure mindless edits.

But isn't it a good thing if the compiler knows too? Besides, names are
sometimes not as clear as that one.

> In new code, fine, use booleans if you want.
>
> I would even accept that it helps to change to boolean for
> arguments to functions that are global in scope.
>
> But not for function local variables in cases like this.

Oh, I see your point now. I took it to be yet another 'booleans are not a
C idiom' objection.

Sorry about the noise
Richard Knutsson
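
For reference, the 'if (i == TRUE)' pitfall mentioned in this thread can be
shown in a few lines.  This is a standalone sketch, not code from the
patch.  The TRUE macro and both helper names are assumptions made for the
example.

```c
#include <stdbool.h>

#define TRUE 1

/* With an int flag, any non-zero value is "true", but only 1 equals TRUE. */
int int_flag_equals_true(int flag)
{
    return flag == TRUE;
}

/* Conversion to bool normalizes every non-zero value to 1 first. */
int bool_flag_equals_true(int value)
{
    bool flag = value;      /* e.g. 2 becomes true, i.e. 1 */

    return flag == TRUE;
}
```

Passing 2 to the first helper yields 0 even though 2 is "true" in a
condition, while the second helper yields 1.  That is the class of bug the
bool type rules out; it says nothing about whether converting existing
local variables is worth the churn.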


Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > Stefano, could you try this ontop of a recent-ish Linus tree - does 
> > this resolve all issues? (without introducing new ones ;-)
> 
> updated version attached below.

third update. the cpufreq callbacks are not quite OK yet.

Ingo

Index: linux/arch/arm/kernel/time.c
===
--- linux.orig/arch/arm/kernel/time.c
+++ linux/arch/arm/kernel/time.c
@@ -79,17 +79,6 @@ static unsigned long dummy_gettimeoffset
 }
 #endif
 
-/*
- * An implementation of printk_clock() independent from
- * sched_clock().  This avoids non-bootable kernels when
- * printk_clock is enabled.
- */
-unsigned long long printk_clock(void)
-{
-   return (unsigned long long)(jiffies - INITIAL_JIFFIES) *
-   (1000000000 / HZ);
-}
-
 static unsigned long next_rtc_update;
 
 /*
Index: linux/arch/ia64/kernel/time.c
===
--- linux.orig/arch/ia64/kernel/time.c
+++ linux/arch/ia64/kernel/time.c
@@ -344,33 +344,6 @@ udelay (unsigned long usecs)
 }
 EXPORT_SYMBOL(udelay);
 
-static unsigned long long ia64_itc_printk_clock(void)
-{
-   if (ia64_get_kr(IA64_KR_PER_CPU_DATA))
-   return sched_clock();
-   return 0;
-}
-
-static unsigned long long ia64_default_printk_clock(void)
-{
-   return (unsigned long long)(jiffies_64 - INITIAL_JIFFIES) *
-   (1000000000/HZ);
-}
-
-unsigned long long (*ia64_printk_clock)(void) = &ia64_default_printk_clock;
-
-unsigned long long printk_clock(void)
-{
-   return ia64_printk_clock();
-}
-
-void __init
-ia64_setup_printk_clock(void)
-{
-   if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT))
-   ia64_printk_clock = ia64_itc_printk_clock;
-}
-
 /* IA64 doesn't cache the timezone */
 void update_vsyscall_tz(void)
 {
Index: linux/arch/x86/kernel/process_32.c
===
--- linux.orig/arch/x86/kernel/process_32.c
+++ linux/arch/x86/kernel/process_32.c
@@ -113,10 +113,19 @@ void default_idle(void)
smp_mb();
 
local_irq_disable();
-   if (!need_resched())
+   if (!need_resched()) {
+   ktime_t t0, t1;
+   u64 t0n, t1n;
+
+   t0 = ktime_get();
+   t0n = ktime_to_ns(t0);
safe_halt();/* enables interrupts racelessly */
-   else
-   local_irq_enable();
+   local_irq_disable();
+   t1 = ktime_get();
+   t1n = ktime_to_ns(t1);
+   sched_clock_idle_wakeup_event(t1n - t0n);
+   }
+   local_irq_enable();
current_thread_info()->status |= TS_POLLING;
} else {
/* loop is done by the caller */
Index: linux/arch/x86/lib/delay_32.c
===
--- linux.orig/arch/x86/lib/delay_32.c
+++ linux/arch/x86/lib/delay_32.c
@@ -38,17 +38,21 @@ static void delay_loop(unsigned long loo
:"0" (loops));
 }
 
-/* TSC based delay: */
+/* cpu_clock() [TSC] based delay: */
 static void delay_tsc(unsigned long loops)
 {
-   unsigned long bclock, now;
+   unsigned long long start, stop, now;
+   int this_cpu;
+
+   preempt_disable();
+
+   this_cpu = smp_processor_id();
+   start = now = cpu_clock(this_cpu);
+   stop = start + loops;
+
+   while ((long long)(stop - now) > 0)
+   now = cpu_clock(this_cpu);
 
-   preempt_disable();  /* TSC's are per-cpu */
-   rdtscl(bclock);
-   do {
-   rep_nop();
-   rdtscl(now);
-   } while ((now-bclock) < loops);
preempt_enable();
 }
 
Index: linux/arch/x86/lib/delay_64.c
===
--- linux.orig/arch/x86/lib/delay_64.c
+++ linux/arch/x86/lib/delay_64.c
@@ -26,19 +26,28 @@ int read_current_timer(unsigned long *ti
return 0;
 }
 
-void __delay(unsigned long loops)
+/* cpu_clock() [TSC] based delay: */
+static void delay_tsc(unsigned long loops)
 {
-   unsigned bclock, now;
+   unsigned long long start, stop, now;
+   int this_cpu;
+
+   preempt_disable();
+
+   this_cpu = smp_processor_id();
+   start = now = cpu_clock(this_cpu);
+   stop = start + loops;
+
+   while ((long long)(stop - now) > 0)
+   now = cpu_clock(this_cpu);
 
-   preempt_disable();  /* TSC's are pre-cpu */
-   rdtscl(bclock);
-   do {
-   rep_nop(); 
-   rdtscl(now);
-   }
-   while ((now-bclock) < loops);
preempt_enable();
 }
+
+void __delay(unsigned long loops)
+{
+   delay_tsc(loops);
+}
 EXPORT_SYMBOL(__delay);
 
 inline void __const_udelay(unsi

Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Ingo Molnar

* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > then the test of whether I bisected correctly is as simple as 
> > applying the commit and seeing if things break, because I'm running 
> > on the kernel corresponding to 
> > 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right now.  Let me give 
> > that a try and I'll report back.  Worst case, I'll have to start 
> > over and write off the past four days...
> 
> Gad.  I trust the second time will be faster.
> 
> git-bisect _is_ very error prone.  I find one of the problems is that 
> each step is so far apart in time that you forget what you were doing.  
> Did I remember to test that iteration?  Did I install the right 
> kernel?  etc.

i have a fully automated bootup-hang bisection script. It is based on 
"git-bisect run". I run the script, it builds and boots kernels fully 
automatically, and when the bootup fails (the script notices that via 
the serial log, which it continuously watches - or via a timeout, if the 
system does not come up within 10 minutes it's a "bad" kernel), the 
script raises my attention via a beep and i power cycle the test box. 
(yeah, i should make use of a managed power outlet to 100% automate it) 

So i don't have to make a single manual decision anytime during the bisection. 
But the scripts are very much tied to my ad-hoc test environment so it 
would not be of much general use.
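
For anyone wanting to build something similar: "git bisect run" only looks
at the exit status of the test command.  0 means good, 125 means "cannot
test this commit, skip it", and any other value from 1 to 127 means bad.
Below is a minimal sketch of the verdict step.  It is not the script
described above; bisect_verdict(), its parameters, the boot marker string
and the choice to skip commits that fail to build are all assumptions made
for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Exit codes understood by "git bisect run". */
enum { BISECT_GOOD = 0, BISECT_BAD = 1, BISECT_SKIP = 125 };

/*
 * Hypothetical verdict function.
 *   built      - non-zero if the kernel built at all
 *   serial_log - captured serial console output (may be NULL)
 *   timed_out  - non-zero if the box did not come up within the limit
 *   marker     - string that only a fully booted system prints (made up)
 */
int bisect_verdict(int built, const char *serial_log, int timed_out,
                   const char *marker)
{
    if (!built)
        return BISECT_SKIP;     /* cannot test this commit */
    if (timed_out || serial_log == NULL)
        return BISECT_BAD;      /* hang or no output at all */
    return strstr(serial_log, marker) ? BISECT_GOOD : BISECT_BAD;
}
```

A wrapper program would build and boot the kernel, collect the log, and
return this value from main() so that git can pick the next commit on its
own.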

Ingo

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Nick Piggin <[EMAIL PROTECTED]> wrote:

> My patch should fix the worst cpufreq sched_clock jumping issue I 
> think.

but it degrades the precision of sched_clock() and has other problems as 
well. cpu_clock() is the right interface to use for such things.

Ingo

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Nick Piggin

On Friday 07 December 2007 19:45, Ingo Molnar wrote:
> * Stefano Brivio <[EMAIL PROTECTED]> wrote:
> > This patch fixes a regression introduced by:
> >
> > commit bb29ab26863c022743143f27956cc0ca362f258c
> > Author: Ingo Molnar <[EMAIL PROTECTED]>
> > Date:   Mon Jul 9 18:51:59 2007 +0200
> >
> > This caused the jiffies counter to leap back and forth on cpufreq
> > changes on my x86 box. I'd say that we can't always assume that TSC
> > does "small errors" only, when marked unstable. On cpufreq changes
> > these errors can be huge.
>
> ah, printk_clock() still uses sched_clock(), not jiffies. So it's not
> the jiffies counter that goes back and forth, it's sched_clock() - so
> this is a printk timestamps anomaly, not related to jiffies. I thought
> we have fixed this bug in the printk code already: sched_clock() is a
> 'raw' interface that should not be used directly - the proper interface
> is cpu_clock(cpu).

It's a single CPU box, so sched_clock() jumping would still be
problematic, no?

My patch should fix the worst cpufreq sched_clock jumping issue
I think.

[PATCH] depmod: sort output according to modules.order, take #2

2007-12-07 Thread Tejun Heo

Kbuild now generates and installs modules.order along with modules.
This patch updates depmod such that it sorts module list according to
the file before generating output files.  Modules which aren't on
modules.order are put after modules which are ordered by
modules.order.

This makes modprobe prioritize modules according to the kernel
Makefiles, just as built-in modules are link-ordered by them.

This patch is against module-init-tools 3.3-pre1.

Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
Cc: Sam Ravnborg <[EMAIL PROTECTED]>
Cc: Bill Nottingham <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>
Cc: Greg Kroah-Hartman <[EMAIL PROTECTED]>
Cc: Kay Sievers <[EMAIL PROTECTED]>
---
Comment added and path comparison logic slightly modified such that the
dirname part of mod->pathname is ignored instead of prepending
dirname to lines read from modules.order.  Behavior-wise it's
identical to the previous version.

Thanks.
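
The reordering rule described above (modules listed in modules.order come
first, in that order, and everything else follows in its original relative
order) can be sketched on a plain array.  This is a simplified stand-in for
illustration and not the linked-list code in the patch.  sort_by_order()
and its parameters are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Illustrative only: reorder names[0..n) so that entries listed in
 * order[0..m) come first, in the order given there, followed by the
 * remaining entries in their original relative order.
 */
void sort_by_order(const char **names, size_t n,
                   const char *const *order, size_t m)
{
    size_t done = 0;    /* names[0..done) are already placed */

    assert(names != NULL || n == 0);
    assert(order != NULL || m == 0);

    for (size_t i = 0; i < m && done < n; i++) {
        for (size_t j = done; j < n; j++) {
            if (strcmp(order[i], names[j]) != 0)
                continue;
            /*
             * Move names[j] down to position 'done', shifting the
             * skipped entries up by one so they keep their order.
             */
            const char *hit = names[j];
            memmove(&names[done + 1], &names[done],
                    (j - done) * sizeof(names[0]));
            names[done++] = hit;
            break;
        }
    }
}
```

Entries in the order list that match no module are simply ignored, which
mirrors how lines for modules that are not installed have no effect.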

 depmod.c |   49 +
 1 file changed, 49 insertions(+)

diff --git a/depmod.c b/depmod.c
index ea7ad05..c3ae5a2 100644
--- a/depmod.c
+++ b/depmod.c
@@ -585,6 +585,54 @@ static struct module *grab_basedir(const char *dirname)
return list;
 }
 
+static void sort_modules(const char *dirname, struct module **listp)
+{
+   struct module *list = *listp, *tlist = NULL, **tpos = &tlist;
+   FILE *modorder;
+   int dir_len = strlen(dirname) + 1;
+   char file_name[dir_len + strlen("modules.order") + 1];
+   char line[10240];
+
+   sprintf(file_name, "%s/%s", dirname, "modules.order");
+
+   modorder = fopen(file_name, "r");
+   if (!modorder) {
+   /* Older kernels don't generate modules.order.  Just
+  return if the file doesn't exist. */
+   if (errno == ENOENT)
+   return;
+   fatal("Could not open '%s': %s\n", file_name, strerror(errno));
+   }
+
+   sprintf(line, "%s/", dirname);
+
+   /* move modules listed in modorder file to tlist in order */
+   while (fgets(line, sizeof(line), modorder)) {
+   struct module **pos, *mod;
+   int len = strlen(line);
+
+   if (line[len - 1] == '\n')
+   line[len - 1] = '\0';
+
+   for (pos = &list; (mod = *pos); pos = &(*pos)->next) {
+   if (strcmp(line, mod->pathname + dir_len) == 0) {
+   *pos = mod->next;
+   mod->next = NULL;
+   *tpos = mod;
+   tpos = &mod->next;
+   break;
+   }
+   }
+   }
+
+   /* append the rest */
+   *tpos = list;
+
+   fclose(modorder);
+
+   *listp = tlist;
+}
+
 static void parse_modules(struct module *list)
 {
struct module *i;
@@ -857,6 +905,7 @@ int main(int argc, char *argv[])
} else {
list = grab_basedir(dirname);
}
+   sort_modules(dirname, &list);
parse_modules(list);
 
for (i = 0; i < sizeof(depfiles)/sizeof(depfiles[0]); i++) {
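
The list handling in sort_modules() above can be tried outside depmod with a small stand-alone sketch. This is not the depmod code itself: struct module is cut down to two fields, the order comes from an in-memory array instead of modules.order, and the name reorder_modules() is made up for illustration. It assumes every pathname is at least dir_len characters long.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for depmod's struct module. */
struct module {
	struct module *next;
	const char *pathname;
};

/*
 * Move every module named in order[] to the front of the list, in the
 * order given.  Modules that are not listed keep their relative order
 * and end up at the tail.  dir_len is the number of leading characters
 * of each pathname to skip before comparing (dirname plus the slash).
 */
static void reorder_modules(struct module **listp, const char *const *order,
			    size_t n_order, size_t dir_len)
{
	struct module *list, *tlist = NULL, **tpos = &tlist;
	size_t i;

	assert(listp != NULL);
	list = *listp;

	for (i = 0; i < n_order; i++) {
		struct module **pos, *mod;

		for (pos = &list; (mod = *pos) != NULL; pos = &mod->next) {
			if (strcmp(order[i], mod->pathname + dir_len) == 0) {
				*pos = mod->next;	/* unlink from the old list */
				mod->next = NULL;
				*tpos = mod;		/* append to the sorted list */
				tpos = &mod->next;
				break;
			}
		}
	}

	*tpos = list;	/* append whatever was not listed */
	*listp = tlist;
}
```

Walking the list through a pointer to the previous link (pos) is what lets an entry be unlinked without a special case for the list head.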

[PATCH] Add support for the S-35390A RTC chip.

2007-12-07 Thread Byron Bradley

This adds basic get/set time support for the Seiko Instruments
S-35390A. This chip communicates using I2C and is used on the
QNAP TS-109/TS-209 NAS devices.

Signed-off-by: Byron Bradley <[EMAIL PROTECTED]>
Tested-by: Tim Ellis <[EMAIL PROTECTED]>
---
 drivers/rtc/Kconfig   |9 ++
 drivers/rtc/Makefile  |1 +
 drivers/rtc/rtc-s35390a.c |  302 +
 3 files changed, 312 insertions(+), 0 deletions(-)
 create mode 100644 drivers/rtc/rtc-s35390a.c

diff --git a/drivers/rtc/Kconfig b/drivers/rtc/Kconfig
index 1e6715e..6c0fdf9 100644
--- a/drivers/rtc/Kconfig
+++ b/drivers/rtc/Kconfig
@@ -246,6 +246,15 @@ config RTC_DRV_TWL92330
  platforms.  The support is integrated with the rest of
  the Menelaus driver; it's not separate module.
 
+config RTC_DRV_S35390A
+   tristate "Seiko Instruments S-35390A"
+   help
+ If you say yes here you will get support for the Seiko
+ Instruments S-35390A.
+
+ This driver can also be built as a module. If so the module
+ will be called rtc-s35390a.
+
 endif # I2C
 
 comment "SPI RTC drivers"
diff --git a/drivers/rtc/Makefile b/drivers/rtc/Makefile
index 465db4d..8d6218f 100644
--- a/drivers/rtc/Makefile
+++ b/drivers/rtc/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_RTC_DRV_PL031)   += rtc-pl031.o
 obj-$(CONFIG_RTC_DRV_RS5C313)  += rtc-rs5c313.o
 obj-$(CONFIG_RTC_DRV_RS5C348)  += rtc-rs5c348.o
 obj-$(CONFIG_RTC_DRV_RS5C372)  += rtc-rs5c372.o
+obj-$(CONFIG_RTC_DRV_S35390A)  += rtc-s35390a.o
 obj-$(CONFIG_RTC_DRV_S3C)  += rtc-s3c.o
 obj-$(CONFIG_RTC_DRV_SA1100)   += rtc-sa1100.o
 obj-$(CONFIG_RTC_DRV_SH)   += rtc-sh.o
diff --git a/drivers/rtc/rtc-s35390a.c b/drivers/rtc/rtc-s35390a.c
new file mode 100644
index 0000000..29a95b6
--- /dev/null
+++ b/drivers/rtc/rtc-s35390a.c
@@ -0,0 +1,302 @@
+/*
+ * Seiko Instruments S-35390A RTC Driver
+ *
+ * Copyright (c) 2007 Byron Bradley
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+#define S35390A_CMD_STATUS1  0
+#define S35390A_CMD_STATUS2  1
+#define S35390A_CMD_TIME1  2
+
+#define S35390A_BYTE_YEAR  0
+#define S35390A_BYTE_MONTH 1
+#define S35390A_BYTE_DAY   2
+#define S35390A_BYTE_WDAY  3
+#define S35390A_BYTE_HOURS 4
+#define S35390A_BYTE_MINS  5
+#define S35390A_BYTE_SECS  6
+
+#define S35390A_FLAG_POC   0x01
+#define S35390A_FLAG_BLD   0x02
+#define S35390A_FLAG_24H   0x40
+#define S35390A_FLAG_RESET 0x80
+#define S35390A_FLAG_TEST  0x01
+
+struct s35390a {
+   struct i2c_client *client;
+   struct rtc_device *rtc;
+   int twentyfourhour;
+};
+
+static int s35390a_set_reg(struct s35390a *s35390a, int reg, char *buf, int len)
+{
+   struct i2c_client *client = s35390a->client;
+   struct i2c_msg msg[] = {
+   { client->addr | reg, 0, len, buf },
+   };
+
+   /* Only write to the writable bits in the status1 register */
+   if (reg == S35390A_CMD_STATUS1)
+   buf[0] &= 0xf;
+
+   if ((i2c_transfer(client->adapter, msg, 1)) != 1)
+   return -EIO;
+
+   return 0;
+}
+
+static int s35390a_get_reg(struct s35390a *s35390a, int reg, char *buf, int len)
+{
+   struct i2c_client *client = s35390a->client;
+   struct i2c_msg msg[] = {
+   { client->addr | reg, I2C_M_RD, len, buf },
+   };
+
+   if ((i2c_transfer(client->adapter, msg, 1)) != 1)
+   return -EIO;
+
+   return 0;
+}
+
+static int s35390a_reset(struct s35390a *s35390a)
+{
+   char buf[1];
+
+   if (s35390a_get_reg(s35390a, S35390A_CMD_STATUS1, buf, sizeof(buf)) < 0)
+   return -EIO;
+
+   if (!(buf[0] & (S35390A_FLAG_POC | S35390A_FLAG_BLD)))
+   return 0;
+
+   buf[0] |= S35390A_FLAG_RESET;
+   return s35390a_set_reg(s35390a, S35390A_CMD_STATUS1, buf, sizeof(buf));
+}
+
+static int s35390a_disable_test_mode(struct s35390a *s35390a)
+{
+   char buf[1];
+
+   if (s35390a_get_reg(s35390a, S35390A_CMD_STATUS2, buf, sizeof(buf)) < 0)
+   return -EIO;
+
+   if (!(buf[0] & S35390A_FLAG_TEST))
+   return 0;
+
+   buf[0] &= ~S35390A_FLAG_TEST;
+   return s35390a_set_reg(s35390a, S35390A_CMD_STATUS2, buf, sizeof(buf));
+}
+
+static char s35390a_hr2reg(struct s35390a *s35390a, int hour)
+{
+   if (s35390a->twentyfourhour)
+   return BIN2BCD(hour);
+
+   if (hour < 12)
+   return BIN2BCD(hour);
+
+   return 0x40 | BIN2BCD(hour - 12);
+}
+
+static int s35390a_reg2hr(struct s35390a *s35390a, char reg)
+{
+   unsigned hour;
+
+   if (s35390a->twentyfourhour)
+   return BCD2BIN(reg & 0x3f);
+
+   hour = BCD2
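
The hour encoding used by s35390a_hr2reg() above can be exercised in user space with the sketch below. The BCD helpers are local stand-ins for the kernel's BIN2BCD/BCD2BIN macros. The patch is cut off before the 12-hour branch of s35390a_reg2hr(), so the reg2hr() shown here is only an assumed inverse of hr2reg(), not the driver's code.

```c
#include <assert.h>

#define FLAG_PM 0x40	/* same value the patch ORs in for hours >= 12 */

/* Local stand-ins for the kernel's BIN2BCD()/BCD2BIN() macros. */
static unsigned char bin2bcd(unsigned int val)
{
	return (unsigned char)(((val / 10) << 4) | (val % 10));
}

static unsigned int bcd2bin(unsigned char val)
{
	return (val >> 4) * 10u + (val & 0x0fu);
}

/* Mirrors s35390a_hr2reg(): plain BCD in 24h mode, PM flag in 12h mode. */
static unsigned char hr2reg(int twentyfourhour, int hour)
{
	assert(hour >= 0 && hour < 24);

	if (twentyfourhour || hour < 12)
		return bin2bcd((unsigned int)hour);

	return (unsigned char)(FLAG_PM | bin2bcd((unsigned int)hour - 12));
}

/* Assumed inverse of hr2reg(); the driver's own version is not shown. */
static int reg2hr(int twentyfourhour, unsigned char reg)
{
	int hour = (int)bcd2bin(reg & 0x3f);

	if (!twentyfourhour && (reg & FLAG_PM))
		hour += 12;

	return hour;
}
```

With this encoding, 23:00 is stored as 0x23 in 24-hour mode and as 0x51 (PM flag plus BCD 11) in 12-hour mode.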

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Guillaume Chazarain <[EMAIL PROTECTED]> wrote:

> I'll clean it up and resend it later. As I don't have the necessary 
> knowledge to do the tsc_{32,64}.c unification, should I copy paste 
> common functions into tsc_32.c and tsc_64.c to ease later unification 
> or should I start a common .c file?

note that there are a couple of existing patches in this area. One is 
the fix below. There's also older frequency-scaling TSC patches - i'll 
try to dig them out.

Ingo

>
Subject: x86: idle wakeup event in the HLT loop
From: Ingo Molnar <[EMAIL PROTECTED]>

do a proper idle-wakeup event on HLT as well - some CPUs stop the TSC
in HLT too, not just when going through the ACPI methods.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/x86/kernel/process_32.c |   15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

Index: linux/arch/x86/kernel/process_32.c
===
--- linux.orig/arch/x86/kernel/process_32.c
+++ linux/arch/x86/kernel/process_32.c
@@ -113,10 +113,19 @@ void default_idle(void)
smp_mb();
 
local_irq_disable();
-   if (!need_resched())
+   if (!need_resched()) {
+   ktime_t t0, t1;
+   u64 t0n, t1n;
+
+   t0 = ktime_get();
+   t0n = ktime_to_ns(t0);
safe_halt();/* enables interrupts racelessly */
-   else
-   local_irq_enable();
+   local_irq_disable();
+   t1 = ktime_get();
+   t1n = ktime_to_ns(t1);
+   sched_clock_idle_wakeup_event(t1n - t0n);
+   }
+   local_irq_enable();
current_thread_info()->status |= TS_POLLING;
} else {
/* loop is done by the caller */

Re: [PATCH x86/mm] x86 vDSO: canonicalize sysenter .eh_frame

2007-12-07 Thread Ingo Molnar


* Roland McGrath <[EMAIL PROTECTED]> wrote:

> Some assembler versions automagically optimize .eh_frame contents, 
> changing their size.  The CFI in sysenter.S was not using optimal 
> formatting, so it would be changed by newer/smarter assemblers. This 
> ran afoul of the wired constant for padding out the other vDSO images 
> to match its size.  This changes the original hand-coded source to use 
> the optimal format encoding for its operations.  That leaves nothing 
> more for a fancy assembler to do, so the sizes will match the wired-in 
> expected size regardless of the assembler version.
> 
> Signed-off-by: Roland McGrath <[EMAIL PROTECTED]> ---

thanks, applied.

Ingo

Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

2007-12-07 Thread Yinghai Lu

On Dec 7, 2007 12:50 AM, Yinghai Lu <[EMAIL PROTECTED]> wrote:
>
> On Dec 6, 2007 4:33 PM, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
...
> >
> > My feel is that if it is for legacy interrupts only it should not be a 
> > problem.
> > Let's investigate and see if we can unconditionally enable this quirk
> > for all opteron systems.
>
> i checked that bit
>
> http://www.openbios.org/viewvc/trunk/LinuxBIOSv2/src/northbridge/amd/amdk8/coherent_ht.c?revision=2596&view=markup
>
> static void enable_apic_ext_id(u8 node)
> {
> #if ENABLE_APIC_EXT_ID==1
> #warning "FIXME Is the right place to enable apic ext id here?"
>
>   u32 val;
>
> val = pci_read_config32(NODE_HT(node), 0x68);
> val |= (HTTC_APIC_EXT_SPUR | HTTC_APIC_EXT_ID | HTTC_APIC_EXT_BRD_CST);
> pci_write_config32(NODE_HT(node), 0x68, val);
> #endif
> }
>
> that bit should only be set when the apic id is lifted and the cpu apic id
> is using 8 bits, which means broadcast is 0xff instead of 0x0f.
> for example, on an 8 socket dual core system or a 4 socket quad core
> system, you should make the BSP start from 0x04, so the cpus' apic ids
> will be [0x04, 0x13)
>
>
> So if you want to enable that in early_quirk, you need to
> make sure apic id is using 8 bits by checking whether bit 16 (HTTC_APIC_ID)
> is set.

it should be bit 18 (HTTC_APIC_EXT_ID)


YH
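
The check suggested above might look roughly like the sketch below. Only the bit number 18 (HTTC_APIC_EXT_ID) and the register offset 0x68 come from this thread; the macro and function names are invented for illustration, and the register value is passed in as a plain integer because a user-space example cannot read PCI config space.

```c
#include <stdint.h>

/* Bit position named in the message above; the macro name is made up. */
#define HTTC_APIC_EXT_ID_BIT	18

/*
 * httc is the 32-bit value read from config offset 0x68 of the
 * HyperTransport node.  Returns non-zero when the extended (8-bit)
 * APIC ID mode is already enabled, which is the precondition given
 * above for touching the broadcast setting in an early quirk.
 */
static int apic_ext_id_enabled(uint32_t httc)
{
	return (int)((httc >> HTTC_APIC_EXT_ID_BIT) & 1u);
}
```

A quirk following this advice would call such a helper first and leave the register alone when it returns 0.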


Re: [PATCH 15/20] net/lapb/lapb_iface.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-07 Thread David Miller

From: Denis Cheng <[EMAIL PROTECTED]>
Date: Fri,  7 Dec 2007 00:07:18 +0800

> single list_head variable initialized with LIST_HEAD_INIT could almost
> always can be replaced with LIST_HEAD declaration, this shrinks the code
> and looks better.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>

Applied.

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Stefano Brivio <[EMAIL PROTECTED]> wrote:

> This patch fixes a regression introduced by:
> 
> commit bb29ab26863c022743143f27956cc0ca362f258c
> Author: Ingo Molnar <[EMAIL PROTECTED]>
> Date:   Mon Jul 9 18:51:59 2007 +0200
> 
> This caused the jiffies counter to leap back and forth on cpufreq 
> changes on my x86 box. I'd say that we can't always assume that TSC 
> does "small errors" only, when marked unstable. On cpufreq changes 
> these errors can be huge.

ah, printk_clock() still uses sched_clock(), not jiffies. So it's not 
the jiffies counter that goes back and forth, it's sched_clock() - so 
this is a printk timestamps anomaly, not related to jiffies. I thought 
we have fixed this bug in the printk code already: sched_clock() is a 
'raw' interface that should not be used directly - the proper interface 
is cpu_clock(cpu). Does the patch below help?

Ingo

--->
Subject: sched: fix CONFIG_PRINT_TIME's reliance on sched_clock()
From: Ingo Molnar <[EMAIL PROTECTED]>

Stefano Brivio reported weird printk timestamp behavior during
CPU frequency changes:

  http://bugzilla.kernel.org/show_bug.cgi?id=9475

fix CONFIG_PRINT_TIME's reliance on sched_clock() and use cpu_clock()
instead.

Reported-and-bisected-by: Stefano Brivio <[EMAIL PROTECTED]>
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/printk.c |2 +-
 kernel/sched.c  |7 ++-
 2 files changed, 7 insertions(+), 2 deletions(-)

Index: linux/kernel/printk.c
===
--- linux.orig/kernel/printk.c
+++ linux/kernel/printk.c
@@ -680,7 +680,7 @@ asmlinkage int vprintk(const char *fmt, 
loglev_char = default_message_loglevel
+ '0';
}
-   t = printk_clock();
+   t = cpu_clock(printk_cpu);
nanosec_rem = do_div(t, 1000000000);
tlen = sprintf(tbuf,
"<%c>[%5lu.%06lu] ",
Index: linux/kernel/sched.c
===
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -599,7 +599,12 @@ unsigned long long cpu_clock(int cpu)
 
local_irq_save(flags);
rq = cpu_rq(cpu);
-   update_rq_clock(rq);
+   /*
+* Only call sched_clock() if the scheduler has already been
+* initialized (some code might call cpu_clock() very early):
+*/
+   if (rq->idle)
+   update_rq_clock(rq);
now = rq->clock;
local_irq_restore(flags);
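
The timestamp formatting touched by the vprintk() hunk above can be sketched in user space as follows. This assumes the divisor in the do_div() call is 1000000000 (nanoseconds per second) and leaves out the "<%c>" log-level prefix; format_timestamp() is a made-up name, and plain division stands in for the kernel's do_div().

```c
#include <assert.h>
#include <stdio.h>

/*
 * Split a nanosecond count into whole seconds and microseconds and
 * print it in the "[%5lu.%06lu] " layout used by printk timestamps.
 * Returns the number of characters snprintf() would have written.
 */
static int format_timestamp(char *buf, size_t size, unsigned long long ns)
{
	unsigned long long sec;
	unsigned long usec;

	assert(buf != NULL && size > 0);

	sec = ns / 1000000000ULL;
	usec = (unsigned long)((ns % 1000000000ULL) / 1000ULL);

	return snprintf(buf, size, "[%5llu.%06lu] ", sec, usec);
}
```

Whichever clock feeds this (printk_clock() before the patch, cpu_clock() after it), a value that jumps backwards shows up directly as a timestamp that jumps backwards.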
 

Re: RFC: outb 0x80 in inb_p, outb_p harmful on some modern AMD64 with MCP51 laptops

2007-12-07 Thread Rene Herman


On 07-12-07 16:43, Rene Herman wrote:


On 07-12-07 15:54, Andi Kleen wrote:

My machine in question, for example, needs no waiting within 
CMOS_READs at all.   And I doubt any other chip/device needs waiting 
that isn't 


I don't know about CMOS, but there were definitely some not too ancient
systems (let's say not more than 10 years) who required IO delays in the
floppy driver and the 8253/8259. But on those the jumps are already
far too fast.


Also see Alan's replies in the thread I posted a link to:

http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-09/5700.html

Also 8254 (PIT) at least it seems.


By the way, David, it would be interesting if you could test 0xed. If your 
problem is some piece of hardware getting upset at LPC bus aborts it's not 
going to matter and we'd know an outb delay is just not an option on your 
system at least. You said you could quickly reproduce the problem with port 
0x80?


Rene.

Re: Scheduler behaviour

2007-12-07 Thread Holger Wolf


Arjan van de Ven wrote:

On Wed, 05 Dec 2007 21:15:30 +0100
Holger Wolf <[EMAIL PROTECTED]> wrote:

  

We discovered performance degradation with dbench when using kernel
2.6.23 compared to kernel 2.6.22.

In our case we booted a Linux in a IBM System z9 LPAR with 256MB of
ram with 4 CPU's. This system uses a striped LV with 16 disks on a
Storage Server connected via 8 4GBit links.
A dbench was started on that system performing I/O operations on the
striped LV. dbench runs were performed with 1 to 62 processes.
Measurements with a 2.6.22 kernel were compared to measurements with
a 2.6.23 kernel. We saw a throughput degradation from 7.2 to 23.4



this is good news!
dbench rewards unfair behavior... so higher dbench usually means a
worse kernel ;)


  

tests with 2.6.22 including CFS  show the same results.
This means the pressure on page cache is much higher when all processes 
run in parallel.
We see this behavior as well with iozone when writing on many disks with 
many threads and just 256 MB memory.


This means the scheduler schedules as it should - fair.

regards Holger


Re: broken suspend (sched related) [Was: 2.6.24-rc4-mm1]

2007-12-07 Thread Ingo Molnar


* Jiri Slaby <[EMAIL PROTECTED]> wrote:

> On 12/05/2007 06:17 AM, Andrew Morton wrote:
> >   
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc4/2.6.24-rc4-mm1/
> 
> >  git-sched.patch
> 
> breaks suspend here since -rc3-mm2. More precisely, this one:
> softlockup: automatically detect hung TASK_UNINTERRUPTIBLE tasks
> 
> 2.6.24-rc4-mm1 minus this one works just fine. Otherwise disks stop, graphics
> stops and then it hangs not powering down.
> 
> Core 2 Duo, SMP kernel, voluntary preempt, 250 HZ, SLUB, 64 bit.
> 
> Ideas?

thanks for tracking it down. Does the patch below help?

Ingo

---
 kernel/softlockup.c |8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Index: linux/kernel/softlockup.c
===
--- linux.orig/kernel/softlockup.c
+++ linux/kernel/softlockup.c
@@ -101,7 +101,11 @@ void softlockup_tick(void)
 
now = get_timestamp(this_cpu);
 
-   /* Warn about unreasonable delays: */
+   /* Wake up the high-prio watchdog task every second: */
+   if (now > (touch_timestamp + 1))
+   wake_up_process(per_cpu(watchdog_task, this_cpu));
+
+   /* Warn about unreasonable 10+ seconds delays: */
if (now <= (touch_timestamp + softlockup_thresh))
return;
 
@@ -214,7 +218,7 @@ static int watchdog(void *__bind_cpu)
 */
while (!kthread_should_stop()) {
touch_softlockup_watchdog();
-   msleep_interruptible(1);
+   schedule();
 
/*
 * Only do the hung-tasks check on one CPU:
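
The two timestamp comparisons in the softlockup_tick() hunk above can be summarised with the sketch below. It models only those two checks; the enum and function names are invented, and the threshold of 10 seconds used in the usage note is taken from the "10+ seconds" comment in the patch rather than from the actual value of softlockup_thresh.

```c
#include <assert.h>

/* Bit flags describing what a tick would do; names are hypothetical. */
enum tick_action {
	TICK_NOTHING	= 0,
	TICK_WAKE	= 1,	/* wake the per-cpu watchdog task */
	TICK_WARN	= 2	/* report a soft lockup */
};

/*
 * now and touch are timestamps in seconds, thresh is the lockup
 * threshold in seconds.  Mirrors the order of the checks in the hunk:
 * wake the watchdog after more than one second, warn only once the
 * threshold has been exceeded.
 */
static unsigned int softlockup_tick_action(unsigned long now,
					   unsigned long touch,
					   unsigned long thresh)
{
	unsigned int action = TICK_NOTHING;

	assert(thresh >= 1);

	if (now > touch + 1)
		action |= TICK_WAKE;

	if (now <= touch + thresh)
		return action;

	return action | TICK_WARN;
}
```

With a threshold of 10, a timestamp that is 2 seconds old only wakes the watchdog, while one that is 11 seconds old also triggers the warning.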

Re: [PATCH] SCSI: make pcmcia directory use obj-y|m instead of subdir-y|m

2007-12-07 Thread Sam Ravnborg

On Fri, Dec 07, 2007 at 10:36:23PM +0900, Tejun Heo wrote:
> subdir-y|m isn't supposed to contain modules or built-in components.
> Change subdir-$(CONFIG_PCMCIA) to obj-$(CONFIG_PCMCIA).
> 
> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
> Cc: Sam Ravnborg <[EMAIL PROTECTED]>
> Cc: James Bottomley <[EMAIL PROTECTED]>
Acked-by: Sam Ravnborg <[EMAIL PROTECTED]>

> ---
>  drivers/scsi/Makefile |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
> index 2e6129f..72c8d2e 100644
> --- a/drivers/scsi/Makefile
> +++ b/drivers/scsi/Makefile
> @@ -18,7 +18,7 @@ CFLAGS_aha152x.o =   -DAHA152X_STAT -DAUTOCONF
>  CFLAGS_gdth.o= # -DDEBUG_GDTH=2 -D__SERIAL__ -D__COM2__ -DGDTH_STATISTICS
>  CFLAGS_seagate.o =   -DARBITRATE -DPARITY -DSEAGATE_USE_ASM
>  
> -subdir-$(CONFIG_PCMCIA)  += pcmcia
> +obj-$(CONFIG_PCMCIA) += pcmcia/
>  
>  obj-$(CONFIG_SCSI)   += scsi_mod.o
>  obj-$(CONFIG_SCSI_TGT)   += scsi_tgt.o

Re: [patch-RFC 00/26] LTTng Kernel Trace Thread Flag

2007-12-07 Thread Mathieu Desnoyers

* Frank Ch. Eigler ([EMAIL PROTECTED]) wrote:
> Mathieu Desnoyers <[EMAIL PROTECTED]> writes:
> 
> > This is an RFC for addition of a new thread flag, TIF_KERNEL_TRACE, to each
> > architecture to activate system-wide system call tracing.
> > [...]
> 
> Instead of creating a new flag, could you overload TIF_SYSCALL_TRACE,
> putting the marker into syscall_trace(), and letting !PT_TRACED cause
> a skip over the ptrace notification logic?
> 
> - FChE

I don't see any PT_TRACED flag in current kernel HEAD ?

Hrm, let's see. If we share TIF_SYSCALL_TRACE with ptrace, we would then
have to figure out how to get this working :

- kernel tracing activated
- ptracing some random processes
- kernel tracing deactivated
- stop ptracing those processes

It means that we would have to keep some state information about the
ptrace status of each process. This is currently kept by
TIF_SYSCALL_TRACE, but since we would be overloading it, it would be
lost when we deactivate kernel tracing. Adding a supplementary field to
the thread_info structure is out of the question here: we have to keep it
as small as possible. So where do you propose to keep this information
other than... another thread flag ?

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

[PATCH] SCSI: make pcmcia directory use obj-y|m instead of subdir-y|m

2007-12-07 Thread Tejun Heo

subdir-y|m isn't supposed to contain modules or built-in components.
Change subdir-$(CONFIG_PCMCIA) to obj-$(CONFIG_PCMCIA).

Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
Cc: Sam Ravnborg <[EMAIL PROTECTED]>
Cc: James Bottomley <[EMAIL PROTECTED]>
---
 drivers/scsi/Makefile |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/Makefile b/drivers/scsi/Makefile
index 2e6129f..72c8d2e 100644
--- a/drivers/scsi/Makefile
+++ b/drivers/scsi/Makefile
@@ -18,7 +18,7 @@ CFLAGS_aha152x.o =   -DAHA152X_STAT -DAUTOCONF
 CFLAGS_gdth.o= # -DDEBUG_GDTH=2 -D__SERIAL__ -D__COM2__ -DGDTH_STATISTICS
 CFLAGS_seagate.o =   -DARBITRATE -DPARITY -DSEAGATE_USE_ASM
 
-subdir-$(CONFIG_PCMCIA)+= pcmcia
+obj-$(CONFIG_PCMCIA)   += pcmcia/
 
 obj-$(CONFIG_SCSI) += scsi_mod.o
 obj-$(CONFIG_SCSI_TGT) += scsi_tgt.o

Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

2007-12-07 Thread Vivek Goyal

On Fri, Dec 07, 2007 at 09:53:15AM -0500, Neil Horman wrote:
> On Fri, Dec 07, 2007 at 09:39:44AM -0500, Vivek Goyal wrote:
> > On Thu, Dec 06, 2007 at 07:10:23PM -0500, Neil Horman wrote:
> > > On Thu, Dec 06, 2007 at 05:11:43PM -0500, Vivek Goyal wrote:
> > > > On Thu, Dec 06, 2007 at 04:39:51PM -0500, Neil Horman wrote:
> > > > > On Fri, Nov 30, 2007 at 09:51:31AM -0500, Neil Horman wrote:
> > > > > > On Fri, Nov 30, 2007 at 09:42:50AM -0500, Vivek Goyal wrote:
> > > > > 
> > > > > > 
> > > > > > That's what I'm doing at the moment.  I'm working on a RHEL5
> > > > > > patch (since that's what's on the production system that's
> > > > > > failing), and will forward port it once it's working.
> > > > > > 
> > > > > > And not to split hairs, but technically that's not our _only_
> > > > > > choice.  We could force kdump boots on cpu0 as well ;)
> > > > > > 
> > > > > > Thanks
> > > > > > Neil
> > > > > > 
> > > > > > > Thanks
> > > > > > > Vivek
> > > > > > 
> > > > > 
> > > > > 
> > > > > Sorry to have been quiet on this issue for a few days.
> > > > > Interesting news to report, though.  So I was working on a patch
> > > > > to do early apic enabling on x86_64, and had something working for
> > > > > the old 2.6.18 kernel that we were originally testing on.
> > > > > Unfortunately, while it worked on 2.6.18 it failed miserably on
> > > > > 2.6.24-rc3-mm2, causing check_timer to consistently report that
> > > > > the timer interrupt wasn't getting received (even though we could
> > > > > successfully run calibrate_delay).  Vivek and I were digging into
> > > > > this when I ran across the description of the hypertransport
> > > > > configuration register in the opteron specification.  It contains
> > > > > a bit that, surprise, configures the ht bus to either unicast
> > > > > interrupts delivered across the ht bus to a single cpu, or to
> > > > > broadcast them to all cpus.  Since it seemed more likely that the
> > > > > 8259 in the nvidia southbridge was transporting legacy mode
> > > > > interrupts over the ht bus than directly to cpu0 via an actual
> > > > > wire, I wrote the attached patch to add a quirk for nvidia
> > > > > chipsets, which scanned for hypertransport controllers and
> > > > > ensured that the broadcast bit was set.  Test results indicate
> > > > > that this solves the problem, and kdump kernels boot just fine on
> > > > > the affected system.
> > > > > 
> > > > 
> > > > Hi Neil,
> > > > 
> > > > Should we disable this broadcasting feature once we are through?
> > > > Otherwise in normal systems it might mean extra traffic on
> > > > hypertransport. There is no need for every interrupt to be
> > > > broadcast in normal systems?
> > > > 
> > > > Thanks
> > > > Vivek
> > > 
> > > No, I don't think that's necessary.  Once the apics are enabled,
> > > interrupts shouldn't travel across the hypertransport bus anyway,
> > > opting instead to use the dedicated apic bus (at least that's my
> > > understanding).
> > 
> > I think all interrupt message travel on hypertransport. Even after APICS
> > have been enabled.
> > 
> > Look at the following document.
> > 
> > http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24674.pdf
> > 
> > Have a look at figure 1, figure 2 and section 3.4.2.2 and 3.4.2.3
> > 
> > That's a different thing that once IOAPIC has formed the vectored message,
> > Hypertransport might not touch the destination field.
> >  
> Ok, that might be the case then.
> 
> > Having said that, I am wondering what will happen if a system continues
> > to operate the timer through IOAPIC in ExtInt mode. Will hypertransport
> > keep on broadcasting that interrupt to every cpu? And every cpu will 
> > process that interrupt.
> > 
> I don't think so.  IIRC once the other cpus are started they all disable the
> timer interrupt, except for one cpu, opting instead to get the timer tick via
> ipi, So while they all might see the interrupt packet on the ht bus, only one
> cpu will process it.
> 

Does the LAPIC allow disabling a specific vector so that it does not accept
interrupts? I don't think so. If a timer interrupt is broadcast to every cpu,
I think everybody will accept it (like a broadcast IPI). That's why the
intelligence is built into the IOAPIC, which directs interrupts to a cpu or
group of cpus.

I am just trying to understand the functionality better. Can somebody help me
understand how we make sure that the same timer interrupt is not processed by
all cpus (assuming hypertransport is broadcasting it)?

> > Hence, I feel it is safe to restore the broadcast bit back to BIOS
> > value once we are through calibrate_delay().
> > 
> I disagree.  Looking at what Yinghai said, the default setting for the
> broadcast bit isn't actually to unicast the interrupt, it's just to set
> the broadcast mask to 0xF, o

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Guillaume Chazarain

On Dec 7, 2007 12:13 PM, Nick Piggin <[EMAIL PROTECTED]> wrote:
> My patch should fix the worst cpufreq sched_clock jumping issue
> I think.

Any pointer to it?

Thanks.

-- 
Guillaume

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Nick Piggin <[EMAIL PROTECTED]> wrote:

> > ah, printk_clock() still uses sched_clock(), not jiffies. So it's 
> > not the jiffies counter that goes back and forth, it's sched_clock() 
> > - so this is a printk timestamps anomaly, not related to jiffies. I 
> > thought we have fixed this bug in the printk code already: 
> > sched_clock() is a 'raw' interface that should not be used directly 
> > - the proper interface is cpu_clock(cpu).
> 
> It's a single CPU box, so sched_clock() jumping would still be 
> problematic, no?

sched_clock() is an internal API - the non-jumping API to be used by 
printk is cpu_clock().

Ingo

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


ok, here's a rollup of 11 patches that relate to this. I hoped we could 
wait with this for 2.6.25, but it seems more urgent as per Stefano's 
testing, as udelay() and drivers are affected as well.

Stefano, could you try this on top of a recent-ish Linus tree - does this 
resolve all issues? (without introducing new ones ;-)

Ingo

Index: linux/arch/arm/kernel/time.c
===
--- linux.orig/arch/arm/kernel/time.c
+++ linux/arch/arm/kernel/time.c
@@ -79,17 +79,6 @@ static unsigned long dummy_gettimeoffset
 }
 #endif
 
-/*
- * An implementation of printk_clock() independent from
- * sched_clock().  This avoids non-bootable kernels when
- * printk_clock is enabled.
- */
-unsigned long long printk_clock(void)
-{
-   return (unsigned long long)(jiffies - INITIAL_JIFFIES) *
-   (1000000000 / HZ);
-}
-
 static unsigned long next_rtc_update;
 
 /*
Index: linux/arch/ia64/kernel/time.c
===
--- linux.orig/arch/ia64/kernel/time.c
+++ linux/arch/ia64/kernel/time.c
@@ -344,33 +344,6 @@ udelay (unsigned long usecs)
 }
 EXPORT_SYMBOL(udelay);
 
-static unsigned long long ia64_itc_printk_clock(void)
-{
-   if (ia64_get_kr(IA64_KR_PER_CPU_DATA))
-   return sched_clock();
-   return 0;
-}
-
-static unsigned long long ia64_default_printk_clock(void)
-{
-   return (unsigned long long)(jiffies_64 - INITIAL_JIFFIES) *
-   (1000000000/HZ);
-}
-
-unsigned long long (*ia64_printk_clock)(void) = &ia64_default_printk_clock;
-
-unsigned long long printk_clock(void)
-{
-   return ia64_printk_clock();
-}
-
-void __init
-ia64_setup_printk_clock(void)
-{
-   if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT))
-   ia64_printk_clock = ia64_itc_printk_clock;
-}
-
 /* IA64 doesn't cache the timezone */
 void update_vsyscall_tz(void)
 {
Index: linux/arch/x86/kernel/process_32.c
===
--- linux.orig/arch/x86/kernel/process_32.c
+++ linux/arch/x86/kernel/process_32.c
@@ -113,10 +113,19 @@ void default_idle(void)
smp_mb();
 
local_irq_disable();
-   if (!need_resched())
+   if (!need_resched()) {
+   ktime_t t0, t1;
+   u64 t0n, t1n;
+
+   t0 = ktime_get();
+   t0n = ktime_to_ns(t0);
safe_halt();/* enables interrupts racelessly */
-   else
-   local_irq_enable();
+   local_irq_disable();
+   t1 = ktime_get();
+   t1n = ktime_to_ns(t1);
+   sched_clock_idle_wakeup_event(t1n - t0n);
+   }
+   local_irq_enable();
current_thread_info()->status |= TS_POLLING;
} else {
/* loop is done by the caller */
Index: linux/arch/x86/kernel/tsc_32.c
===
--- linux.orig/arch/x86/kernel/tsc_32.c
+++ linux/arch/x86/kernel/tsc_32.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -78,15 +79,32 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
  *  cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  *  ([EMAIL PROTECTED])
  *
+ *  ns += offset to avoid sched_clock jumps with cpufreq
+ *
  * [EMAIL PROTECTED] "math is hard, lets go shopping!"
  */
-unsigned long cyc2ns_scale __read_mostly;
 
 #define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
 
-static inline void set_cyc2ns_scale(unsigned long cpu_khz)
+DEFINE_PER_CPU(struct cyc2ns_params, cyc2ns) __read_mostly;
+
+static void set_cyc2ns_scale(unsigned long cpu_khz)
 {
-   cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   struct cyc2ns_params *params;
+   unsigned long flags;
+   unsigned long long tsc_now, ns_now;
+
+   rdtscll(tsc_now);
+   params = &get_cpu_var(cyc2ns);
+
+   local_irq_save(flags);
+   ns_now = __cycles_2_ns(params, tsc_now);
+
+   params->scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   params->offset += ns_now - __cycles_2_ns(params, tsc_now);
+   local_irq_restore(flags);
+
+   put_cpu_var(cyc2ns);
 }
 
 /*
Index: linux/arch/x86/kernel/tsc_64.c
===
--- linux.orig/arch/x86/kernel/tsc_64.c
+++ linux/arch/x86/kernel/tsc_64.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 
 static int notsc __initdata = 0;
 
@@ -18,16 +19,25 @@ EXPORT_SYMBOL(cpu_khz);
 unsigned int tsc_khz;
 EXPORT_SYMBOL(tsc_khz);
 
-static unsigned int cyc2ns_scale __read_mostly;
+DEFINE_PER_CPU(struct cyc2ns_params, cyc2ns) __read_mostly;
 
-static inline void set_cyc2ns_scale(unsigned long khz)
+static void set_cyc2ns_scale(unsigned long cp

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Ingo Molnar <[EMAIL PROTECTED]> wrote:

> > > - t = printk_clock();
> > > + t = cpu_clock(printk_cpu);
> > >   nanosec_rem = do_div(t, 1000000000);
> > >   tlen = sprintf(tbuf,
> > >   "<%c>[%5lu.%06lu] ",
> > 
> > A bit risky - it's quite an expansion of code which no longer can call 
> > printk.
> > 
> > You might want to take that WARN_ON out of __update_rq_clock() ;)
> 
> hm, dont we already detect printk recursions and turn them into a 
> silent return instead of a hang/crash?

ugh, we don't. So i guess the (tested) patch below is highly needed. (If 
such incidents become frequent then we could save the stackdump of the 
recursion via save_stack_trace() too - but i wanted to keep the initial 
code simple.)

Ingo

>
Subject: printk: make printk more robust by not allowing recursion
From: Ingo Molnar <[EMAIL PROTECTED]>

make printk more robust by allowing recursion only if there's a crash
going on. Also add recursion detection.

I've tested it with an artificially injected printk recursion - instead
of a lockup or spontaneous reboot or other crash, the output was a well
controlled:

[   41.057335] SysRq : <2>BUG: recent printk recursion!
[   41.057335] loglevel0-8 reBoot Crashdump show-all-locks(D) tErm Full kIll 
saK showMem Nice powerOff showPc show-all-timers(Q) unRaw Sync showTasks 
Unmount shoW-blocked-tasks

also do all this printk logic with irqs disabled.
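
The recursion handling described above can be tried outside the kernel. What
follows is only a user-space sketch of the idea, not the patch itself: the
names guarded_log(), log_owner_cpu, recursion_seen, crash_in_progress and
recursing_hook() are hypothetical stand-ins for vprintk(), printk_cpu,
printk_recursion_bug, oops_in_progress and "code called from inside printk",
and locking and irq handling are left out entirely.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NO_CPU (-1)

/* Hypothetical stand-ins for the kernel state touched by vprintk(). */
static int log_owner_cpu = NO_CPU;  /* plays the role of printk_cpu */
static int recursion_seen;          /* plays the role of printk_recursion_bug */
static int crash_in_progress;       /* plays the role of oops_in_progress */
static char log_line[256];          /* last line "written to the log" */

static const char recursion_msg[] = "BUG: recent printk recursion!\n";

/*
 * Returns the number of characters logged, or 0 if the call was dropped
 * because it recursed on the CPU that already owns the log.
 */
static int guarded_log(int this_cpu, const char *msg, void (*hook)(int cpu))
{
	size_t len = 0;

	if (log_owner_cpu == this_cpu && !crash_in_progress) {
		recursion_seen = 1;     /* remember it, report it later */
		return 0;
	}

	log_owner_cpu = this_cpu;       /* "take the lock" */

	if (recursion_seen) {
		recursion_seen = 0;
		strcpy(log_line, recursion_msg);
		len = strlen(recursion_msg);    /* strlen, so the NUL is not counted */
	}
	/* Append the message, leaving room for what is already in the buffer. */
	snprintf(log_line + len, sizeof(log_line) - len, "%s", msg);
	len = strlen(log_line);

	if (hook)
		hook(this_cpu);         /* code that may itself try to log */

	log_owner_cpu = NO_CPU;         /* "drop the lock" */
	return (int)len;
}

/* A hook that misbehaves by logging from inside the logging path. */
static void recursing_hook(int cpu)
{
	int dropped = guarded_log(cpu, "recurse!\n", NULL);

	assert(dropped == 0);
}
```

Calling guarded_log(0, "hello\n", recursing_hook) logs "hello" and silently
drops the nested call; the next ordinary call on any CPU gets the BUG line
prepended. The sketch counts the prefix with strlen() and shrinks the
remaining buffer size accordingly, which is a choice made here rather than
something taken from the diff below.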

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 kernel/printk.c |   52 ++--
 1 file changed, 42 insertions(+), 10 deletions(-)

Index: linux/kernel/printk.c
===
--- linux.orig/kernel/printk.c
+++ linux/kernel/printk.c
@@ -623,30 +623,57 @@ asmlinkage int printk(const char *fmt, .
 /* cpu currently holding logbuf_lock */
 static volatile unsigned int printk_cpu = UINT_MAX;
 
+const char printk_recursion_bug_msg [] =
+   KERN_CRIT "BUG: recent printk recursion!\n";
+static int printk_recursion_bug;
+
 asmlinkage int vprintk(const char *fmt, va_list args)
 {
+   static int log_level_unknown = 1;
+   static char printk_buf[1024];
+
unsigned long flags;
-   int printed_len;
+   int printed_len = 0;
+   int this_cpu;
char *p;
-   static char printk_buf[1024];
-   static int log_level_unknown = 1;
 
boot_delay_msec();
 
preempt_disable();
-   if (unlikely(oops_in_progress) && printk_cpu == smp_processor_id())
-   /* If a crash is occurring during printk() on this CPU,
-* make sure we can't deadlock */
-   zap_locks();
-
/* This stops the holder of console_sem just where we want him */
raw_local_irq_save(flags);
+   this_cpu = smp_processor_id();
+
+   /*
+* Ouch, printk recursed into itself!
+*/
+   if (unlikely(printk_cpu == this_cpu)) {
+   /*
+* If a crash is occurring during printk() on this CPU,
+* then try to get the crash message out but make sure
+* we can't deadlock. Otherwise just return to avoid the
+* recursion and return - but flag the recursion so that
+* it can be printed at the next appropriate moment:
+*/
+   if (!oops_in_progress) {
+   printk_recursion_bug = 1;
+   goto out_restore_irqs;
+   }
+   zap_locks();
+   }
+
lockdep_off();
spin_lock(&logbuf_lock);
-   printk_cpu = smp_processor_id();
+   printk_cpu = this_cpu;
 
+   if (printk_recursion_bug) {
+   printk_recursion_bug = 0;
+   strcpy(printk_buf, printk_recursion_bug_msg);
+   printed_len = sizeof(printk_recursion_bug_msg);
+   }
/* Emit the output into the temporary buffer */
-   printed_len = vscnprintf(printk_buf, sizeof(printk_buf), fmt, args);
+   printed_len += vscnprintf(printk_buf + printed_len,
+ sizeof(printk_buf), fmt, args);
 
/*
 * Copy the output into log_buf.  If the caller didn't provide
@@ -675,6 +702,10 @@ asmlinkage int vprintk(const char *fmt, 
loglev_char = default_message_loglevel
+ '0';
}
+   if (panic_timeout) {
+   panic_timeout = 0;
+   printk("recurse!\n");
+   }
t = cpu_clock(printk_cpu);
nanosec_rem = do_div(t, 1000000000);
tlen = sprintf(tbuf,
@@ -739,6

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Andrew Morton <[EMAIL PROTECTED]> wrote:

> > -   t = printk_clock();
> > +   t = cpu_clock(printk_cpu);
> > nanosec_rem = do_div(t, 1000000000);
> > tlen = sprintf(tbuf,
> > "<%c>[%5lu.%06lu] ",
> 
> A bit risky - it's quite an expansion of code which no longer can call 
> printk.
> 
> You might want to take that WARN_ON out of __update_rq_clock() ;)

hm, don't we already detect printk recursions and turn them into a silent 
return instead of a hang/crash?

Ingo

Re: [PATCH 5/6] syslets: add generic syslets infrastructure

2007-12-07 Thread Evgeniy Polyakov

Hi Zach.

On Thu, Dec 06, 2007 at 03:20:18PM -0800, Zach Brown ([EMAIL PROTECTED]) wrote:
> +/*
> + * XXX todo:
> + *  - do we need all this '*cur = current' nonsense?
> + *  - try to prevent userspace from submitting too much.. lazy user ptr read?
> + *  - explain how to deal with waiting threads with stale data in current
> + *  - how does userspace tell that a syslet completion was lost?
> + *   provide an -errno argument to the userspace return function?
> + */
> +
> +/*
> + * These structs are stored on the kernel stack of tasks which are waiting to
> + * return to userspace.  They are linked into their parent's list of syslet
> + * children stored in 'syslet_tasks' in the parent's task_struct.
> + */
> +struct syslet_task_entry {
> + struct task_struct *task;
> + struct list_head item;
> +};
> +
> +/*
> + * syslet_ring doesn't have any kernel-side storage.  Userspace allocates them
> + * in their address space and initializes their fields and then passes them to
> + * the kernel.
> + *
> + * These hashes provide the kernel-side storage for the wait queues which
> + * sys_syslet_ring_wait() uses and the mutex which completion uses to serialize
> + * the (possible blocking) ordered writes of the completion and kernel head
> + * index into the ring.
> + *
> + * We chose the bucket that supports a given ring by hashing a u32 that
> + * userspace sets in the ring.
> + */
> +#define SYSLET_HASH_BITS (CONFIG_BASE_SMALL ? 4 : 8)
> +#define SYSLET_HASH_NR (1 << SYSLET_HASH_BITS)
> +#define SYSLET_HASH_MASK (SYSLET_HASH_NR - 1)
> +static wait_queue_head_t syslet_waitqs[SYSLET_HASH_NR];
> +static struct mutex syslet_muts[SYSLET_HASH_NR];

Why do you care about hash table scalability instead of using trees?
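
For reference, the quoted scheme boils down to a fixed array of buckets
indexed by a hash of the u32 key that userspace puts in the ring. A minimal
sketch, assuming a multiplicative hash (the constant and the helper names
mix_u32() and syslet_bucket() are made up here; the quoted hunk does not show
how the patch actually computes the index):

```c
#include <assert.h>
#include <stdint.h>

#define SYSLET_HASH_BITS 8                      /* the !CONFIG_BASE_SMALL case */
#define SYSLET_HASH_NR   (1u << SYSLET_HASH_BITS)
#define SYSLET_HASH_MASK (SYSLET_HASH_NR - 1)

/* Multiplicative hash; the constant is an assumption of this sketch. */
static uint32_t mix_u32(uint32_t key)
{
	return key * 0x9E3779B1u;
}

/* Pick the wait queue / mutex bucket that serves a given ring key. */
static unsigned int syslet_bucket(uint32_t ring_key)
{
	/* Use the top bits, which are the best mixed ones for this hash. */
	unsigned int bucket = mix_u32(ring_key) >> (32 - SYSLET_HASH_BITS);

	assert(bucket <= SYSLET_HASH_MASK);
	return bucket;
}
```

The lookup is constant time and needs no allocation or per-ring kernel
object, at the price that unrelated rings can land in the same bucket and
then share a wait queue and a mutex.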

-- 
Evgeniy Polyakov

Re: New Address Family: Inter Process Networking (IPN)

2007-12-07 Thread Andi Kleen

> Stop making excuses, with minor adjustments we have the facilities to
> meet your needs.  There is no need for yet-another-protocol to do what

I suspect they would be better off just using IP multicast. But the localhost 
latency penalty vs Unix that Chris was talking about probably needs to be 
investigated.

-Andi

Re: [PATCH] IB/ehca: Serialize HCA-related hCalls on POWER5

2007-12-07 Thread Arnd Bergmann

On Thursday 06 December 2007, Roland Dreier wrote:
>  > Regarding the performance problem, have you checked whether converting all
>  > your spin_lock_irqsave to spin_lock/spin_lock_irq improves your performance
>  > on the older machines? Maybe it's already fast enough that way.
> 
> It does seem that the only places that the hcall_lock is taken also
> use msleep, so they must always be in process context.  So you can
> safely just use spin_lock(), right?

I think it needs some more inspection. The msleep in there is only called
for hcalls that return H_IS_LONG_BUSY(). In theory, you can call
ehca_plpar_hcall_norets() from inside an interrupt handler if the
hcall in question never returns long busy.

Arnd <><

Re: Peculiar out-of-sync boot log lines

2007-12-07 Thread Sergei Shtylyov


Hello.

Bartlomiej Zolnierkiewicz wrote:


[PATCH] ide: DMA reporting and validity checking fixes (take 2)



* ide_xfer_verbose() fixups:
  - beautify returned mode names
  - fix PIO5 reporting
  - make it return 'const char *'



* Change printk() level from KERN_DEBUG to KERN_INFO in ide_find_dma_mode().



* Add ide_id_dma_bug() helper based on ide_dma_verbose() to check for invalid
  DMA info in identify block.



* Use ide_id_dma_bug() in ide_tune_dma() and ide_driveid_update().



  As a result DMA won't be tuned or will be disabled after tuning if device
  reports inconsistent info about enabled DMA mode (ide_dma_verbose() does the
  same checks while the IDE device is probed by ide-{cd,disk} device driver).



* Since (id->capability & 1) && id->tDMA is a valid configuration handle
  it correctly in ide_id_dma_bug().


   Huh? You don't check (id->capability & 1) there...


* Remove no longer needed ide_dma_verbose().



This patch should fix the following problem with out-of-sync IDE messages
reported by Nick Warne:



   hdd: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache<7>hdd:
   skipping word 93 validity check
, UDMA(66)



and later debugged by Mark Lord to be caused by:



ide_dma_verbose()
printk( ... "2048kB Cache");
eighty_ninty_three()
printk(KERN_DEBUG "%s: skipping word 93 validity check\n");
ide_dma_verbose()
printk(", UDMA(66)"



Please note that as a result ide-{cd,disk} device drivers won't report the
DMA speed used but this is intended since now DMA mode being used is always
reported by IDE core code.



v2:
* fixes suggested by Randy:
  - use KERN_CONT for printk()-s in ide-{cd,disk}.c
  - don't remove argument name from ide_xfer_verbose() declaration



Cc: Nick Warne <[EMAIL PROTECTED]>
Cc: Mark Lord <[EMAIL PROTECTED]>
Cc: Randy Dunlap <[EMAIL PROTECTED]>
Signed-off-by: Bartlomiej Zolnierkiewicz <[EMAIL PROTECTED]>


[...]


Index: b/drivers/ide/ide-dma.c
===
--- a/drivers/ide/ide-dma.c
+++ b/drivers/ide/ide-dma.c
@@ -806,58 +809,26 @@ static int ide_dma_check(ide_drive_t *dr
return vdma ? 0 : -1;
 }
 
-void ide_dma_verbose(ide_drive_t *drive)

+int ide_id_dma_bug(ide_drive_t *drive)
 {
-   struct hd_driveid *id   = drive->id;
-   ide_hwif_t *hwif= HWIF(drive);
+   struct hd_driveid *id = drive->id;
 
 	if (id->field_valid & 4) {

if ((id->dma_ultra >> 8) && (id->dma_mword >> 8))

[...]

+   goto err_out;
} else if (id->field_valid & 2) {
if ((id->dma_mword >> 8) && (id->dma_1word >> 8))
-   goto bug_dma_off;
-   printk(", DMA");
+   goto err_out;
} else if (id->field_valid & 1) {


   Hm, bit 0 only guarantees that current translation


-   goto bug_dma_off;
+   if (id->tDMA == 0)


   Despite the name, this is not a transfer period but the SW DMA mode number,
so why is mode 0 bad?



+   goto err_out;
}
-   return;
-bug_dma_off:
-   printk(", BUG DMA OFF");
-   hwif->dma_off_quietly(drive);
-   return;
+   return 0;
+err_out:
+   printk(KERN_ERR "%s: bad DMA info in identify block\n", drive->name);
+   return 1;
 }
 
Index: b/drivers/ide/ide-lib.c

===
--- a/drivers/ide/ide-lib.c
+++ b/drivers/ide/ide-lib.c
@@ -29,41 +29,44 @@
  * Add common non I/O op stuff here. Make sure it has proper
  * kernel-doc function headers or your patch will be rejected
  */
- 
+

+static const char *udma_str[] =
+{ "UDMA/16", "UDMA/25",  "UDMA/33",  "UDMA/44",
+  "UDMA/66", "UDMA/100", "UDMA/133", "UDMA7" };
+static const char *mwdma_str[] =
+   { "MWDMA0", "MWDMA1", "MWDMA2" };
+static const char *swdma_str[] =
+   { "SWDMA0", "SWDMA1", "SWDMA2" };
+static const char *pio_str[] =
+   { "PIO0", "PIO1", "PIO2", "PIO3", "PIO4", "PIO5" };
 
 /**

  * ide_xfer_verbose-   return IDE mode names
- * @xfer_rate: rate to name
+ * @mode: transfer mode
  *
  * Returns a constant string giving the name of the mode
  * requested.
  */
 
-char *ide_xfer_verbose (u8 xfer_rate)

+const char *ide_xfer_verbose(u8 mode)
 {

[...]

+   const char *s;
+   u8 i = mode & 0xf;
+
+   if (mode >= XFER_UDMA_0 && mode <= XFER_UDMA_7)
+   s = udma_str[i];
+   else if (mode >= XFER_MW_DMA_0 && mode <= XFER_MW_DMA_2)
+   s = mwdma_str[i];
+   else if (mode >= XFER_SW_DMA_0 && mode <= XFER_SW_DMA_2)
+   s = swdma_str[i];
+   else if (mode >= XFER_PIO_0 && mode <= XFER_PIO_5)
+   s = pio_str[i & 0x7];
+   else if (mode == XFER_PIO_SLOW)
+   s = "XFER SLOW";


   Not "PIO SLOW"?


+   else
+   s = "XFER ERROR";
+
+
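
The table lookup in the new ide_xfer_verbose() can be exercised stand-alone.
This is a sketch only: xfer_mode_name() is a hypothetical name, and the
XFER_* values are the ATA transfer mode codes as I remember them from the
kernel headers of that time, so treat them as assumptions.

```c
#include <stdint.h>

/* Transfer mode codes (assumed values, see above). */
enum {
	XFER_PIO_SLOW = 0x00,
	XFER_PIO_0    = 0x08, XFER_PIO_5    = 0x0d,
	XFER_SW_DMA_0 = 0x10, XFER_SW_DMA_2 = 0x12,
	XFER_MW_DMA_0 = 0x20, XFER_MW_DMA_2 = 0x22,
	XFER_UDMA_0   = 0x40, XFER_UDMA_7   = 0x47,
};

/* Map a transfer mode code to a printable name. */
static const char *xfer_mode_name(uint8_t mode)
{
	static const char *const udma[] = {
		"UDMA/16", "UDMA/25",  "UDMA/33",  "UDMA/44",
		"UDMA/66", "UDMA/100", "UDMA/133", "UDMA7" };
	static const char *const mwdma[] = { "MWDMA0", "MWDMA1", "MWDMA2" };
	static const char *const swdma[] = { "SWDMA0", "SWDMA1", "SWDMA2" };
	static const char *const pio[] = {
		"PIO0", "PIO1", "PIO2", "PIO3", "PIO4", "PIO5" };
	unsigned int i = mode & 0xf;    /* low nibble selects the mode number */

	if (mode >= XFER_UDMA_0 && mode <= XFER_UDMA_7)
		return udma[i];
	if (mode >= XFER_MW_DMA_0 && mode <= XFER_MW_DMA_2)
		return mwdma[i];
	if (mode >= XFER_SW_DMA_0 && mode <= XFER_SW_DMA_2)
		return swdma[i];
	if (mode >= XFER_PIO_0 && mode <= XFER_PIO_5)
		return pio[i & 0x7];    /* PIO codes start at 0x08 */
	if (mode == XFER_PIO_SLOW)
		return "XFER SLOW";     /* string as in the patch under review */
	return "XFER ERROR";
}
```

The range checks come before every array access, so an out-of-range code
such as 0x30 falls through to "XFER ERROR" instead of indexing past a table.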

Re: RFC: outb 0x80 in inb_p, outb_p harmful on some modern AMD64 with MCP51 laptops

2007-12-07 Thread Andi Kleen

> You don't need to. Port 0x80 historically is about 8uS so just udelay(8)
> and make sure the initial default delay is conservative enough before the

How would you make it conservative enough to handle, let's say, a 6GHz CPU
that can execute multiple jumps per cycle?

-Andi

Re: git guidance

2007-12-07 Thread Andreas Ericsson


Al Boldi wrote:

Johannes Schindelin wrote:

Hi,


Hi


On Fri, 7 Dec 2007, Al Boldi wrote:

You need to re-read the thread.

I don't know why you write that, and then say thanks.  Clearly, what you
wrote originally, and what Andreas pointed out, were quite obvious
indicators that git already does what you suggest.

You _do_ work "transparently" (whatever you understand by that overused
term) in the working directory, unimpeded by git.


If you go back in the thread, you may find a link to a gitfs client that 
somebody kindly posted.  This client pretty much defines the transparency 
I'm talking about.  The only problem is that it's read-only.


To make it really useful, it has to support versioning locally, disconnected 
from the server repository.  One way to implement this, could be by 
committing every update unconditionally to an on-the-fly created git 
repository private to the gitfs client.




Earlier you said that you need to be able to tell git when you want to make
a commit, which means pretty much any old filesystem could serve as gitfs.
Now you're saying you want every single update to be committed, which would
make it mimic an editor's undo functionality. I still don't get what it is
you really want.

With this transparently created private scratch repository it should then be 
possible for the same gitfs to re-expose the locally created commits, all 
without any direct user-intervention.


Later, this same scratch repository could then be managed by the normal 
git-management tools/commands to ultimately update the backend git 
repositories.




That's exactly what's happening today. I imagine whoever wrote the gitfs
thing did so to facilitate testing, or as some form of intellectual
masturbation.


So, to get to the bottom of this, which of the following workflows is it you
want git to support?

### WORKFLOW A ###
edit, edit, edit
edit, edit, edit
edit, edit, edit
Oops I made a mistake and need to hop back to "current - 12".
edit, edit, edit
edit, edit, edit
publish everything, similar to just tarring up your workdir and sending out
### END WORKFLOW A ###

### WORKFLOW B ###
edit, edit, edit
ok this looks good, I want to save a checkpoint here
edit, edit, edit
looks good again. next checkpoint
edit, edit, edit
oh crap, back to checkpoint 2
edit, edit, edit
ooh, that's better. save a checkpoint and publish those checkpoints
### END WORKFLOW B ###

If you could just answer that question and stop writing "transparent" or
any synonym thereof six times in each email, we can possibly help you.

As it stands now though, nobody is very interested because you haven't
explained how you want this "transparency" of yours to work in an everyday
scenario.

--
Andreas Ericsson   [EMAIL PROTECTED]
OP5 AB www.op5.se
Tel: +46 8-230225  Fax: +46 8-230231

Re: [patch] ext2: xip check fix

2007-12-07 Thread Carsten Otte


Jared Hulbert wrote:

> I think so.  The filemap_xip.c functionality doesn't work for Flash
> memory yet.  Flash memory doesn't have struct pages to back it up with
> which this stuff depends on.

Struct page is not the major issue. The primary problem is writing to 
the media (and I am not a flash expert at all, just relaying here): 
For some period of time, the flash memory is not usable and thus we 
need to make sure we can nuke the page table entries that we have in 
userland page tables. For that, we need a callback from the device so 
that it can ask to get its references back. Oh, and a put_xip_page 
counterpart to get_xip_page, so that the driver knows when it's safe 
to erase.


cheers,
Carsten

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Guillaume Chazarain

"Guillaume Chazarain" <[EMAIL PROTECTED]> wrote:

> On Dec 7, 2007 6:51 AM, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> > Hmrpf. sched_clock() is used for the time stamp of the printks. We
> > need to find some better solution other than killing off the tsc
> > access completely.
> 
> Something like http://lkml.org/lkml/2007/3/16/291 that would need some 
> refresh?

And here is a refreshed one just for testing with 2.6-git. The 64 bit
part is a shamelessly untested copy/paste as I cannot test it.
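
The core of the change is the offset adjustment in set_cyc2ns_scale(): when
the scale changes, the offset is corrected so that the nanosecond value for
the current cycle count stays the same. A user-space sketch of just that
arithmetic, assuming fixed-width types and passing the cycle count in as a
parameter instead of reading the TSC (cycles_to_ns() and set_scale() are
illustrative names, and per-cpu data and irq protection are omitted):

```c
#include <assert.h>
#include <stdint.h>

#define CYC2NS_SCALE_FACTOR 10          /* 2^10, as in the patch */
#define NSEC_PER_MSEC 1000000ULL

struct cyc2ns_params {
	uint64_t scale;
	uint64_t offset;
};

static uint64_t cycles_to_ns(const struct cyc2ns_params *p, uint64_t cyc)
{
	return ((cyc * p->scale) >> CYC2NS_SCALE_FACTOR) + p->offset;
}

/* Switch to a new CPU frequency at cycle count tsc_now without a jump. */
static void set_scale(struct cyc2ns_params *p, uint64_t cpu_khz,
		      uint64_t tsc_now)
{
	uint64_t ns_before = cycles_to_ns(p, tsc_now);

	p->scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR) / cpu_khz;
	/* Unsigned wraparound keeps this correct when the delta is negative. */
	p->offset += ns_before - cycles_to_ns(p, tsc_now);

	/* The clock reads the same value right before and right after. */
	assert(cycles_to_ns(p, tsc_now) == ns_before);
}
```

For example, starting at 2 GHz and dropping to 1 GHz after 2,000,000,000
cycles leaves the reading at one second at the switch point, and another
1,000,000,000 cycles later it reads two seconds.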

diff --git a/arch/x86/kernel/tsc_32.c b/arch/x86/kernel/tsc_32.c
index 9ebc0da..d561b2f 100644
--- a/arch/x86/kernel/tsc_32.c
+++ b/arch/x86/kernel/tsc_32.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -78,15 +79,32 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
  *  cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  *  ([EMAIL PROTECTED])
  *
+ *  ns += offset to avoid sched_clock jumps with cpufreq
+ *
  * [EMAIL PROTECTED] "math is hard, lets go shopping!"
  */
-unsigned long cyc2ns_scale __read_mostly;
 
 #define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
 
-static inline void set_cyc2ns_scale(unsigned long cpu_khz)
+DEFINE_PER_CPU(struct cyc2ns_params, cyc2ns) __read_mostly;
+
+static void set_cyc2ns_scale(unsigned long cpu_khz)
 {
-   cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   struct cyc2ns_params *params;
+   unsigned long flags;
+   unsigned long long tsc_now, ns_now;
+
+   rdtscll(tsc_now);
+   params = &get_cpu_var(cyc2ns);
+
+   local_irq_save(flags);
+   ns_now = __cycles_2_ns(params, tsc_now);
+
+   params->scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   params->offset += ns_now - __cycles_2_ns(params, tsc_now);
+   local_irq_restore(flags);
+
+   put_cpu_var(cyc2ns);
 }
 
 /*
diff --git a/arch/x86/kernel/tsc_64.c b/arch/x86/kernel/tsc_64.c
index 9c70af4..93e7a06 100644
--- a/arch/x86/kernel/tsc_64.c
+++ b/arch/x86/kernel/tsc_64.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 
 static int notsc __initdata = 0;
 
@@ -18,16 +19,25 @@ EXPORT_SYMBOL(cpu_khz);
 unsigned int tsc_khz;
 EXPORT_SYMBOL(tsc_khz);
 
-static unsigned int cyc2ns_scale __read_mostly;
+DEFINE_PER_CPU(struct cyc2ns_params, cyc2ns) __read_mostly;
 
-static inline void set_cyc2ns_scale(unsigned long khz)
+static void set_cyc2ns_scale(unsigned long cpu_khz)
 {
-   cyc2ns_scale = (NSEC_PER_MSEC << NS_SCALE) / khz;
-}
+   struct cyc2ns_params *params;
+   unsigned long flags;
+   unsigned long long tsc_now, ns_now;
 
-static unsigned long long cycles_2_ns(unsigned long long cyc)
-{
-   return (cyc * cyc2ns_scale) >> NS_SCALE;
+   rdtscll(tsc_now);
+   params = &get_cpu_var(cyc2ns);
+
+   local_irq_save(flags);
+   ns_now = __cycles_2_ns(params, tsc_now);
+
+   params->scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   params->offset += ns_now - __cycles_2_ns(params, tsc_now);
+   local_irq_restore(flags);
+
+   put_cpu_var(cyc2ns);
 }
 
 unsigned long long sched_clock(void)
diff --git a/include/asm-x86/timer.h b/include/asm-x86/timer.h
index 0db7e99..ff4f2a3 100644
--- a/include/asm-x86/timer.h
+++ b/include/asm-x86/timer.h
@@ -2,6 +2,7 @@
 #define _ASMi386_TIMER_H
 #include 
 #include 
+#include 
 
 #define TICK_SIZE (tick_nsec / 1000)
 
@@ -16,7 +17,7 @@ extern int recalibrate_cpu_khz(void);
 #define calculate_cpu_khz() native_calculate_cpu_khz()
 #endif
 
-/* Accellerators for sched_clock()
+/* Accelerators for sched_clock()
  * convert from cycles(64bits) => nanoseconds (64bits)
  *  basic equation:
  * ns = cycles / (freq / ns_per_sec)
@@ -31,20 +32,44 @@ extern int recalibrate_cpu_khz(void);
  * And since SC is a constant power of two, we can convert the div
  *  into a shift.
  *
- *  We can use khz divisor instead of mhz to keep a better percision, since
+ *  We can use khz divisor instead of mhz to keep a better precision, since
  *  cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  *  ([EMAIL PROTECTED])
  *
+ *  ns += offset to avoid sched_clock jumps with cpufreq
+ *
  * [EMAIL PROTECTED] "math is hard, lets go shopping!"
  */
-extern unsigned long cyc2ns_scale __read_mostly;
+
+struct cyc2ns_params {
+   unsigned long scale;
+   unsigned long long offset;
+};
+
+DECLARE_PER_CPU(struct cyc2ns_params, cyc2ns) __read_mostly;
 
 #define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
 
-static inline unsigned long long cycles_2_ns(unsigned long long cyc)
+static inline unsigned long long __cycles_2_ns(struct cyc2ns_params *params,
+  unsigned long long cyc)
 {
-   return (cyc * cyc2ns_scale) >> CYC2NS_SCALE_FACTOR;
+   return ((cyc * params->scale) >> CYC2NS_SCALE_FACTOR) + params->offset;
 }
 
+static inline unsigned long long cycles_2_ns(unsigned long long cyc)
+{
+   struct cyc2ns_para

Re: [PATCH 13/20] net/core/dev.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-07 Thread David Miller

From: Denis Cheng <[EMAIL PROTECTED]>
Date: Fri,  7 Dec 2007 00:01:26 +0800

> single list_head variable initialized with LIST_HEAD_INIT could almost
> always can be replaced with LIST_HEAD declaration, this shrinks the code
> and looks better.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>

Applied.

ptrace API extensions for BTS

2007-12-07 Thread Metzger, Markus T

Roland, Andi,
 
I would like to discuss the ptrace user interface for the BTS extension.
In previous emails, Andi suggested a stream-like interface, but is also OK
with an array-like interface (as far as I understood).
Roland is dubious about the ptrace API additions.

I would like to settle the discussion and find an interface that
everybody can agree to, so I can implement that interface and we can
move forward with the patch.

Here's the link to the original patch:
http://lkml.org/lkml/2007/12/5/234.


Here are the facts:
- we need to provide access to an array (cyclic buffer) of BTS records
- the array can be quite big
- the most interesting part is the tail
- a BTS record can either describe a branch (from, to address)
  or a scheduling event (task arrives/departs at timestamp)


Let's look at the entire array, first. I see the following alternatives:
1. get the entire array in one command
   + simple interface, like GETREGS
   - a lot of (redundant) copying
2. array-like commands (get size, read element at index)
   + allows precise reads; minimizes copying
3. stream-like commands (read, maybe seek) [read from back to front]
   + favors most expected use cases
   - makes other uses much harder (e.g read from front to back)
   - harder to get the semantics right and intuitive
 (when to reset read pointer? e.g. when stepping between two reads)

Alternatives 1 and 3 require a reordering to turn the cyclic buffer into
a sequential array or stream.
Alternative 2 would benefit from that, as well.
When we reorder the array, the best order would be from back to front,
so users can start reading the most interesting part first, and stop
when they read enough.

I would recommend alternative 2. Number 1 may result in too much
copying, and number 3 is better done in user space; the kernel API
should be more flexible and not favor a single use case.
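
To make alternative 2 concrete, here is a sketch of how the two array-like
commands could map onto the cyclic buffer, with index 0 being the newest
record. Everything in it is hypothetical: struct trace_ring, trace_size()
and trace_read() are illustrative names, not the interface from the patch,
and records are reduced to plain unsigned longs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical trace buffer; names are illustrative only. */
struct trace_ring {
	const unsigned long *slots;     /* backing storage */
	size_t capacity;                /* number of slots */
	size_t next;                    /* slot the next record is written to */
	size_t count;                   /* valid records, at most capacity */
};

/* "get size" command */
static size_t trace_size(const struct trace_ring *r)
{
	return r->count;
}

/* "read element at index" command: index 0 is the newest record. */
static unsigned long trace_read(const struct trace_ring *r, size_t index)
{
	size_t slot;

	assert(index < r->count);
	/* Walk backwards from the write position, wrapping at the start. */
	slot = (r->next + r->capacity - 1 - index) % r->capacity;
	return r->slots[slot];
}
```

With this mapping no reordering copy is needed; a reader that only wants the
tail asks for indices 0, 1, 2, ... and stops when it has seen enough, and a
stream-like reader can be layered on top in user space.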


Let's look at the array size, next.
1. pre-defined array size
   + most simple, no extra command
   - one size will not fit all users
2. user-defined array size
   + most flexible for the user
   (need to set a system limit to restrain greedy users)

I would recommend alternative 2. A good citizen will only ask for the
space he needs.
In the ideal case, the system limit would be variable (as Andi
suggested). 


Let's look at the array contents. Currently, we have 3 different record
types.
1. self-describing union
   + most extensible
   + allows single bts array
   - may waste (user-space) memory
2. separate fixed-type arrays
   + get command defines interpretation
   - need additional effort to describe relative order between array
     elements
   - extension requires new set of access commands

I would recommend alternative 1. It is most flexible and most easily
extensible. And it is easier to use.
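
A sketch of what such a self-describing record could look like, and of a
reader that skips record types it does not understand. The layout, the
qualifier values and the names bts_record, bts_qualifier and count_branches()
are assumptions made for illustration; they are not taken from the patch.

```c
#include <stdint.h>

/* Hypothetical qualifier values for the three current record types. */
enum bts_qualifier {
	BTS_INVALID = 0,
	BTS_BRANCH,
	BTS_TASK_ARRIVES,
	BTS_TASK_DEPARTS,
};

struct bts_record {
	uint32_t qualifier;             /* an enum bts_qualifier, or newer */
	union {
		struct {
			uint64_t from;
			uint64_t to;
		} branch;               /* BTS_BRANCH */
		uint64_t timestamp;     /* scheduling events */
	} variant;
};

/* Count branch records, ignoring record types this reader does not know. */
static unsigned int count_branches(const struct bts_record *recs,
				   unsigned int n)
{
	unsigned int i, branches = 0;

	for (i = 0; i < n; i++) {
		switch (recs[i].qualifier) {
		case BTS_BRANCH:
			branches++;
			break;
		case BTS_TASK_ARRIVES:
		case BTS_TASK_DEPARTS:
			break;          /* known, but not a branch */
		default:
			break;          /* unknown record type: skip it */
		}
	}
	return branches;
}
```

The wasted memory mentioned above is visible here: a scheduling record uses
8 bytes of payload but still occupies the 16 bytes of the branch variant.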


What extensions do we expect in the future?
1. more architectures
2. additional data

Regarding 2, a union would easily allow us to add additional data; at
the cost of a few wasted bytes, if the data is not evenly sized. A user
may look at the qualifier and either ignore records he does not
understand, or bail out.

Regarding 1, we currently provide scheduling timestamps, which are arch
independent, and from-to branch information, which should be available
on all architectures for a similar feature. I could think of basic block
from-to information as an alternative representation on some
architectures. I could also imagine that other architectures provide
additional information (like the predicted bit on Netburst that was
dropped for later architectures). Both could be modelled using
additional record types.

Additional architectures may want to (re)use and extend the x86 bts
record, or they may want to invent their own format. In the former case,
we may move the bts union and the bts commands to the generic ptrace
header, and provide a default implementation for architectures that do
not support it (basically pretend that the array is empty or return an
error). In the latter case, they may copy parts of the x86 header.

I would postpone the decision until there are more arch's that wish to
support this feature.


Thank you for reading until here.

regards,
markus.
-
Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen Germany
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
Registergericht: Muenchen HRB 47456 Ust.-IdNr.
VAT Registration No.: DE129385895
Citibank Frankfurt (BLZ 502 109 00) 600119052


Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

2007-12-07 Thread Yinghai Lu

On Dec 6, 2007 4:33 PM, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
> Vivek Goyal <[EMAIL PROTECTED]> writes:
>
>
> > On Thu, Dec 06, 2007 at 04:39:51PM -0500, Neil Horman wrote:
> >> On Fri, Nov 30, 2007 at 09:51:31AM -0500, Neil Horman wrote:
> >> > On Fri, Nov 30, 2007 at 09:42:50AM -0500, Vivek Goyal wrote:
> >> 
> >> >
> >> > That's what I'm doing at the moment.  I'm working on a RHEL5 patch at
> >> > the moment (since that's what's on the production system that's
> >> > failing), and will forward port it once it's working.
> >> >
> >> > And not to split hairs, but technically that's not our _only_ choice.
> >> > We could force kdump boots on cpu0 as well ;)
> >> >
> >> > Thanks
> >> > Neil
> >> >
> >> > > Thanks
> >> > > Vivek
> >> >
> >>
> >>
> >> Sorry to have been quiet on this issue for a few days. Interesting news to
> >> report, though.  So I was working on a patch to do early apic enabling on
> >> x86_64, and had something working for the old 2.6.18 kernel that we were
> >> originally testing on.  Unfortunately while it worked on 2.6.18 it failed
> >> miserably on 2.6.24-rc3-mm2, causing check_timer to consistently report
> >> that the timer interrupt wasn't getting received (even though we could
> >> successfully run calibrate_delay).  Vivek and I were digging into this,
> >> when I ran across the description of the hypertransport configuration
> >> register in the opteron specification.  It contains a bit that, surprise,
> >> configures the ht bus to either unicast interrupts delivered across the ht
> >> bus to a single cpu, or to broadcast them to all cpus.  Since it seemed
> >> more likely that the 8259 in the nvidia southbridge was transporting
> >> legacy mode interrupts over the ht bus than directly to cpu0 via an actual
> >> wire, I wrote the attached patch to add a quirk for nvidia chipsets, which
> >> scanned for hypertransport controllers, and ensured that that broadcast
> >> bit was set.  Test results indicate that this solves the problem, and
> >> kdump kernels boot just fine on the affected system.
> >>
> >
> > Hi Neil,
> >
> > Should we disable this broadcasting feature once we are through? Otherwise
> > in normal systems it might mean extra traffic on hypertransport. There
> > is no need for every interrupt to be broadcasted in normal systems?
>
> My feeling is that if it is for legacy interrupts only, it should not be a
> problem.  Let's investigate and see if we can unconditionally enable this
> quirk for all opteron systems.

I checked that bit:

http://www.openbios.org/viewvc/trunk/LinuxBIOSv2/src/northbridge/amd/amdk8/coherent_ht.c?revision=2596&view=markup

static void enable_apic_ext_id(u8 node)
{
#if ENABLE_APIC_EXT_ID==1
#warning "FIXME Is the right place to enable apic ext id here?"

	u32 val;

	val = pci_read_config32(NODE_HT(node), 0x68);
	val |= (HTTC_APIC_EXT_SPUR | HTTC_APIC_EXT_ID | HTTC_APIC_EXT_BRD_CST);
	pci_write_config32(NODE_HT(node), 0x68, val);
#endif
}

That bit should only be set when the APIC ID is lifted and the CPU APIC
ID uses 8 bits, which means the broadcast ID is 0xff instead of 0x0f.
For example, on an 8-socket dual-core system or a 4-socket quad-core
system you should make the BSP start from 0x04, so the CPUs' APIC IDs
will be [0x04, 0x13].


So if you want to enable that in early_quirk, you need to make sure the
APIC ID is using 8 bits by checking whether bit 16 (HTTC_APIC_ID) is set.
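
The check described above can be sketched as a pure function on the
register value. This is only an illustration: the helper name and both
mask parameters are hypothetical, and the masks are passed in by the
caller because the real bit positions have to come from the chipset
documentation rather than from this sketch.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the thread or from any kernel tree.
 * Returns the register value with the broadcast flag set, but only when
 * the extended-APIC-ID flag is already set. The caller supplies both
 * masks; this sketch does not assume any particular bit position.
 */
uint32_t ht_set_broadcast_if_ext_id(uint32_t val, uint32_t ext_id_mask,
                                    uint32_t brd_cst_mask)
{
    if (val & ext_id_mask)
        val |= brd_cst_mask;
    return val;
}
```

A quirk built this way would read the config register, pass the value
through the helper, and write it back only if the result differs.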

Most BIOSes already do that. You may want to ask Supermicro to fix their
broken BIOS instead.

YH
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 20/20] net/iucv/iucv.c: use LIST_HEAD instead of LIST_HEAD_INIT

2007-12-07 Thread David Miller

From: Denis Cheng <[EMAIL PROTECTED]>
Date: Fri,  7 Dec 2007 00:13:25 +0800

> these three list_head are all local variables, but can also use LIST_HEAD.
> 
> Signed-off-by: Denis Cheng <[EMAIL PROTECTED]>

Applied.

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Ingo Molnar


* Guillaume Chazarain <[EMAIL PROTECTED]> wrote:

> > Something like http://lkml.org/lkml/2007/3/16/291 that would need 
> > some refresh?
> 
> And here is a refreshed one just for testing with 2.6-git. The 64 bit 
> part is a shamelessly untested copy/paste as I cannot test it.

yeah, we can do something like this in 2.6.25 - this will improve the 
quality of sched_clock(). The other patch i sent should solve the 
problem for 2.6.24 - printk should not be using raw sched_clock() calls. 
(as the name says it's for the scheduler's internal use.) I've also 
queued up the patch below - it removes the now unnecessary printk clock 
code.

Ingo

->
Subject: sched: remove printk_clock()
From: Ingo Molnar <[EMAIL PROTECTED]>

printk_clock() is obsolete - it has been replaced with cpu_clock().

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
---
 arch/arm/kernel/time.c  |   11 ---
 arch/ia64/kernel/time.c |   27 ---
 kernel/printk.c |5 -
 3 files changed, 43 deletions(-)

Index: linux/arch/arm/kernel/time.c
===
--- linux.orig/arch/arm/kernel/time.c
+++ linux/arch/arm/kernel/time.c
@@ -79,17 +79,6 @@ static unsigned long dummy_gettimeoffset
 }
 #endif
 
-/*
- * An implementation of printk_clock() independent from
- * sched_clock().  This avoids non-bootable kernels when
- * printk_clock is enabled.
- */
-unsigned long long printk_clock(void)
-{
-   return (unsigned long long)(jiffies - INITIAL_JIFFIES) *
-   (1000000000 / HZ);
-}
-
 static unsigned long next_rtc_update;
 
 /*
Index: linux/arch/ia64/kernel/time.c
===
--- linux.orig/arch/ia64/kernel/time.c
+++ linux/arch/ia64/kernel/time.c
@@ -344,33 +344,6 @@ udelay (unsigned long usecs)
 }
 EXPORT_SYMBOL(udelay);
 
-static unsigned long long ia64_itc_printk_clock(void)
-{
-   if (ia64_get_kr(IA64_KR_PER_CPU_DATA))
-   return sched_clock();
-   return 0;
-}
-
-static unsigned long long ia64_default_printk_clock(void)
-{
-   return (unsigned long long)(jiffies_64 - INITIAL_JIFFIES) *
-   (1000000000/HZ);
-}
-
-unsigned long long (*ia64_printk_clock)(void) = &ia64_default_printk_clock;
-
-unsigned long long printk_clock(void)
-{
-   return ia64_printk_clock();
-}
-
-void __init
-ia64_setup_printk_clock(void)
-{
-   if (!(sal_platform_features & IA64_SAL_PLATFORM_FEATURE_ITC_DRIFT))
-   ia64_printk_clock = ia64_itc_printk_clock;
-}
-
 /* IA64 doesn't cache the timezone */
 void update_vsyscall_tz(void)
 {
Index: linux/kernel/printk.c
===
--- linux.orig/kernel/printk.c
+++ linux/kernel/printk.c
@@ -573,11 +573,6 @@ static int __init printk_time_setup(char
 
 __setup("time", printk_time_setup);
 
-__attribute__((weak)) unsigned long long printk_clock(void)
-{
-   return sched_clock();
-}
-
 /* Check if we have any console registered that can be called early in boot. */
 static int have_callable_console(void)
 {

[PATCH BUGFIX] hid: the `bit' in hidinput_mapping_quirks() is an out parameter

2007-12-07 Thread Fengguang Wu

Fix a panic, by changing 
hidinput_mapping_quirks(,, unsigned long *bit,)
to 
hidinput_mapping_quirks(,, unsigned long **bit,)

The `bit' in this function is an out parameter.
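
To illustrate the difference, here is a minimal userspace sketch. The
struct and function names are hypothetical stand-ins, not the HID code
itself: with a plain `unsigned long *` the callee only changes its own
copy of the pointer, while with `unsigned long **` the caller's pointer
is updated.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct input_dev: just one bitmap. */
struct demo_input {
    unsigned long keybit[4];
};

/* Broken shape: assigning to `bit` only changes the callee's local copy. */
void demo_map_by_value(struct demo_input *input, unsigned long *bit)
{
    bit = input->keybit;    /* lost when the function returns */
    (void)bit;
}

/* Fixed shape: the caller passes &bit, so the assignment is visible. */
void demo_map_by_pointer(struct demo_input *input, unsigned long **bit)
{
    *bit = input->keybit;
}
```

After the by-value call the caller's pointer is unchanged, which is
presumably how the caller ended up using a stale pointer.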

Cc: Jiri Kosina <[EMAIL PROTECTED]>
Signed-off-by: Fengguang Wu <[EMAIL PROTECTED]>
---
 drivers/hid/hid-input-quirks.c |   36 +++
 drivers/hid/hid-input.c|2 -
 include/linux/hid.h|2 -
 3 files changed, 20 insertions(+), 20 deletions(-)

--- linux-2.6.24-rc4-mm1.orig/include/linux/hid.h
+++ linux-2.6.24-rc4-mm1/include/linux/hid.h
@@ -526,7 +526,7 @@ extern void hidinput_disconnect(struct h
 int hid_set_field(struct hid_field *, unsigned, __s32);
 int hid_input_report(struct hid_device *, int type, u8 *, int, int);
 int hidinput_find_field(struct hid_device *hid, unsigned int type, unsigned int code, struct hid_field **field);
-int hidinput_mapping_quirks(struct hid_usage *, struct input_dev *, unsigned long *, int *);
+int hidinput_mapping_quirks(struct hid_usage *, struct input_dev *, unsigned long **, int *);
 void hidinput_event_quirks(struct hid_device *, struct hid_field *, struct hid_usage *, __s32);
 int hidinput_apple_event(struct hid_device *, struct input_dev *, struct hid_usage *, __s32);
 void hid_input_field(struct hid_device *hid, struct hid_field *field, __u8 *data, int interrupt);
--- linux-2.6.24-rc4-mm1.orig/drivers/hid/hid-input.c
+++ linux-2.6.24-rc4-mm1/drivers/hid/hid-input.c
@@ -382,7 +382,7 @@ static void hidinput_configure_usage(str
}
 
/* handle input mappings for quirky devices */
-   ret = hidinput_mapping_quirks(usage, input, bit, &max);
+   ret = hidinput_mapping_quirks(usage, input, &bit, &max);
if (ret)
goto mapped;
 
--- linux-2.6.24-rc4-mm1.orig/drivers/hid/hid-input-quirks.c
+++ linux-2.6.24-rc4-mm1/drivers/hid/hid-input-quirks.c
@@ -16,16 +16,16 @@
 #include 
 #include 
 
-#define map_abs(c)  do { usage->code = c; usage->type = EV_ABS; bit = input->absbit; *max = ABS_MAX; } while (0)
-#define map_rel(c)  do { usage->code = c; usage->type = EV_REL; bit = input->relbit; *max = REL_MAX; } while (0)
-#define map_key(c)  do { usage->code = c; usage->type = EV_KEY; bit = input->keybit; *max = KEY_MAX; } while (0)
-#define map_led(c)  do { usage->code = c; usage->type = EV_LED; bit = input->ledbit; *max = LED_MAX; } while (0)
+#define map_abs(c)  do { usage->code = c; usage->type = EV_ABS; *bit = input->absbit; *max = ABS_MAX; } while (0)
+#define map_rel(c)  do { usage->code = c; usage->type = EV_REL; *bit = input->relbit; *max = REL_MAX; } while (0)
+#define map_key(c)  do { usage->code = c; usage->type = EV_KEY; *bit = input->keybit; *max = KEY_MAX; } while (0)
+#define map_led(c)  do { usage->code = c; usage->type = EV_LED; *bit = input->ledbit; *max = LED_MAX; } while (0)
 
-#define map_abs_clear(c)do { map_abs(c); clear_bit(c, bit); } while (0)
-#define map_key_clear(c)do { map_key(c); clear_bit(c, bit); } while (0)
+#define map_abs_clear(c)do { map_abs(c); clear_bit(c, *bit); } while (0)
+#define map_key_clear(c)do { map_key(c); clear_bit(c, *bit); } while (0)
 
 static int quirk_belkin_wkbd(struct hid_usage *usage, struct input_dev *input,
- unsigned long *bit, int *max)
+ unsigned long **bit, int *max)
 {
if ((usage->hid & HID_USAGE_PAGE) != HID_UP_CONSUMER)
return 0;
@@ -41,7 +41,7 @@ static int quirk_belkin_wkbd(struct hid_
 }
 
 static int quirk_cherry_cymotion(struct hid_usage *usage, struct input_dev *input,
- unsigned long *bit, int *max)
+ unsigned long **bit, int *max)
 {
if ((usage->hid & HID_USAGE_PAGE) != HID_UP_CONSUMER)
return 0;
@@ -57,7 +57,7 @@ static int quirk_cherry_cymotion(struct 
 }
 
 static int quirk_logitech_ultrax_remote(struct hid_usage *usage, struct input_dev *input,
- unsigned long *bit, int *max)
+ unsigned long **bit, int *max)
 {
if ((usage->hid & HID_USAGE_PAGE) != HID_UP_LOGIVENDOR)
return 0;
@@ -90,7 +90,7 @@ static int quirk_logitech_ultrax_remote(
 }
 
 static int quirk_chicony_tactical_pad(struct hid_usage *usage, struct input_dev *input,
- unsigned long *bit, int *max)
+ unsigned long **bit, int *max)
 {
if ((usage->hid & HID_USAGE_PAGE) != HID_UP_MSVENDOR)
return 0;
@@ -115,7 +115,7 @@ static int quirk_chicony_tactical_pad(st
 }
 
 static int quirk_microsoft_ergonomy_kb(struct hid_usage *usage, struct input_dev *input,
- unsigned long *bit, int *max)
+ unsigned long **bit, int *max)
 {
if ((usage->hid & HID_USAGE_PAGE) != HID_UP_MSVENDOR)
return 0;
@@ -138,7 +138

[patch] x86: scale cyc_2_nsec according to CPU frequency

2007-12-07 Thread Ingo Molnar


* Guillaume Chazarain <[EMAIL PROTECTED]> wrote:

> > > Hmrpf. sched_clock() is used for the time stamp of the printks. We 
> > > need to find some better solution other than killing off the tsc 
> > > access completely.
> > 
> > Something like http://lkml.org/lkml/2007/3/16/291 that would need 
> > some refresh?
> 
> And here is a refreshed one just for testing with 2.6-git. The 64 bit 
> part is a shamelessly untested copy/paste as I cannot test it.

Guillaume, i've updated your patch with a handful of changes - see the 
result below.

Firstly, we dont need the 'offset' anymore because cpu_clock() maintains 
offsets itself. This simplifies the math and speeds up the sched_clock() 
common case.

Secondly, with PER_CPU variables we need to update them for all possible 
CPUs - otherwise they might end up with a zero scaling factor which is 
not good. (not all CPUs are cpufreq capable)

Thirdly, we can do a bit smarter and faster by using the fact that 
local_irq_disable() is preempt-safe - so we can use per_cpu() instead of 
get_cpu_var().
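
As a rough userspace sketch of the arithmetic involved (not the kernel
code: the `demo_` names and the fixed CPU count are hypothetical, and a
plain array stands in for the per-CPU variable), each CPU keeps its own
fixed-point scale factor, and cycles are converted with a multiply and a
shift by 2^10:

```c
#include <stdint.h>

#define DEMO_NR_CPUS        4           /* hypothetical CPU count */
#define DEMO_SCALE_SHIFT    10          /* 2^10, as in CYC2NS_SCALE_FACTOR */
#define DEMO_NSEC_PER_MSEC  1000000ULL

static uint64_t demo_cyc2ns[DEMO_NR_CPUS];  /* one scale factor per CPU */

/* Recompute the scale for one CPU; a zero frequency leaves the old value. */
void demo_set_cyc2ns_scale(uint64_t cpu_khz, int cpu)
{
    if (cpu_khz)
        demo_cyc2ns[cpu] = (DEMO_NSEC_PER_MSEC << DEMO_SCALE_SHIFT) / cpu_khz;
}

/* ns = cycles * scale / 2^10, using that CPU's current scale. */
uint64_t demo_cycles_2_ns(uint64_t cyc, int cpu)
{
    return (cyc * demo_cyc2ns[cpu]) >> DEMO_SCALE_SHIFT;
}
```

A CPU whose scale was never set keeps a factor of zero and converts
every cycle count to 0 ns, which is the failure the second point above
guards against by initializing all possible CPUs.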

Ingo

->
Subject: x86: scale cyc_2_nsec according to CPU frequency
From: "Guillaume Chazarain" <[EMAIL PROTECTED]>

scale the sched_clock() cyc_2_nsec scaling factor according to
CPU frequency changes.

[ [EMAIL PROTECTED]: simplified it and fixed it for SMP. ]

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>
---
 arch/x86/kernel/tsc_32.c |   41 +++-
 arch/x86/kernel/tsc_64.c |   59 +++
 include/asm-x86/timer.h  |   23 ++
 3 files changed, 102 insertions(+), 21 deletions(-)

Index: linux-x86.q/arch/x86/kernel/tsc_32.c
===
--- linux-x86.q.orig/arch/x86/kernel/tsc_32.c
+++ linux-x86.q/arch/x86/kernel/tsc_32.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -78,15 +79,31 @@ EXPORT_SYMBOL_GPL(check_tsc_unstable);
  *  cyc2ns_scale is limited to 10^6 * 2^10, which fits in 32 bits.
  *  ([EMAIL PROTECTED])
  *
+ *  ns += offset to avoid sched_clock jumps with cpufreq
+ *
  * [EMAIL PROTECTED] "math is hard, lets go shopping!"
  */
-unsigned long cyc2ns_scale __read_mostly;
 
-#define CYC2NS_SCALE_FACTOR 10 /* 2^10, carefully chosen */
+DEFINE_PER_CPU(unsigned long, cyc2ns);
 
-static inline void set_cyc2ns_scale(unsigned long cpu_khz)
+static void set_cyc2ns_scale(unsigned long cpu_khz, int cpu)
 {
-   cyc2ns_scale = (1000000 << CYC2NS_SCALE_FACTOR)/cpu_khz;
+   unsigned long flags, prev_scale, *scale;
+   unsigned long long tsc_now, ns_now;
+
+   local_irq_save(flags);
+   scale = &per_cpu(cyc2ns, cpu);
+
+   rdtscll(tsc_now);
+   ns_now = __cycles_2_ns(tsc_now);
+
+   prev_scale = *scale;
+   if (cpu_khz)
+   *scale = (NSEC_PER_MSEC << CYC2NS_SCALE_FACTOR)/cpu_khz;
+
+   printk("CPU#%d: changed cyc2ns scale from %ld to %ld\n",
+   cpu, prev_scale, *scale);
+   local_irq_restore(flags);
 }
 
 /*
@@ -239,7 +256,9 @@ time_cpufreq_notifier(struct notifier_bl
ref_freq, freq->new);
if (!(freq->flags & CPUFREQ_CONST_LOOPS)) {
tsc_khz = cpu_khz;
-   set_cyc2ns_scale(cpu_khz);
+   preempt_disable();
+   set_cyc2ns_scale(cpu_khz, smp_processor_id());
+   preempt_enable();
/*
 * TSC based sched_clock turns
 * to junk w/ cpufreq
@@ -367,6 +386,8 @@ static inline void check_geode_tsc_relia
 
 void __init tsc_init(void)
 {
+   int cpu;
+
if (!cpu_has_tsc || tsc_disable)
goto out_no_tsc;
 
@@ -380,7 +401,15 @@ void __init tsc_init(void)
(unsigned long)cpu_khz / 1000,
(unsigned long)cpu_khz % 1000);
 
-   set_cyc2ns_scale(cpu_khz);
+   /*
+* Secondary CPUs do not run through tsc_init(), so set up
+* all the scale factors for all CPUs, assuming the same
+* speed as the bootup CPU. (cpufreq notifiers will fix this
+* up if their speed diverges)
+*/
+   for_each_possible_cpu(cpu)
+   set_cyc2ns_scale(cpu_khz, cpu);
+
use_tsc_delay();
 
/* Check and install the TSC clocksource */
Index: linux-x86.q/arch/x86/kernel/tsc_64.c
===
--- linux-x86.q.orig/arch/x86/kernel/tsc_64.c
+++ linux-x86.q/arch/x86/kernel/tsc_64.c
@@ -10,6 +10,7 @@
 
 #include 
 #include 
+#include 
 
 static int notsc __initdata = 0;
 
@@ -18,16 +19,50 @@ EXPORT_SYMBOL(cpu_khz);
 unsigned int tsc_khz;
 EXPORT_SYMBOL(tsc_khz);
 
-static unsigned

Re: ptrace API extensions for BTS

2007-12-07 Thread Andi Kleen

On Friday 07 December 2007 13:01:28 Metzger, Markus T wrote:
> >From: Andi Kleen [mailto:[EMAIL PROTECTED] 
> >Sent: Freitag, 7. Dezember 2007 12:18
> 
> >> I would like to settle the discussion and find an interface that
> >> everybody can agree to, so I can implement that interface and we can
> >> move forward with the patch.
> >
> >The most efficient interface would be zero copy with tracer 
> >user process
> >supplying memory that is pinned (get_user_pages()) subject to the
> >mlock rlimit. Then kernel telling the CPU to directly log into
> >that.
> 
> That would require users to understand all kinds of BTS formats
> and to detect the hardware they are running on in order to interpret
> the data.

That's true. I guess it could be abstracted in a library, but doing
it all in kernel is indeed nicer.

Ok in theory you could go fancy and put the library into the vDSO
which runs in ring 3. Then it would be tied to the kernel again.

> So far, there are two different formats. But one of them is wasting
> an entire word of memory per record. I could imagine that this would
> change some day.
> 
> Other architectures would likely use an entirely different format.
> Users who want to support several architectures would benefit from
> a common format for this from-to branch information.

I guess some other users would prefer higher performance, but yes
there are probably both types. I don't know what is more important.

> Is there some other metric that would allow me to order BTS 
> chunks for different threads?

With Out-of-order CPUs exact global metrics are pretty difficult.
At which point of the instruction execution would you measure? 

Anyways if RDTSC doesn't work the only global alternatives are much slower
(like southbridge timers) or very inaccurate (jiffies) 

I would just drop it since it'll likely always be somewhat misleading.

-Andi

Re: 2.6.24-rc3-git4 NFS crossmnt regression

2007-12-07 Thread Andrew Morton

On Thu, 6 Dec 2007 23:45:58 -0500 Shane <[EMAIL PROTECTED]> wrote:

> Hi,
> 
> The NFS crossmnt/nohide feature has been working beautifully
> in 2.6.23. NFS in general has been really good in 2.6.23. Thanks!
> 
> However, starting in  2.6.24-rc3-git4, I immediately get 'NFS Stale
> file handle' messages for any accesses to the NFS crossmnt'ed
> volumes. Regular NFS mounts are fine but the crossmnt'ed
> subdirs return only that error message.
> 
> 2.6.24-rc3-git1 is last known good kernel. The problem also exists
> with the latest snap 2.6.24-rc4-git4. NFS server is 2.6.23-rc9 and
> is unchanged.

hm, there have been no nfs changes since 2.6.24-rc4.

> It is easily reproducible here, hopefully for the person who
> knows how to debug it too :)
> 

I guess a full set of the commands which you typed to reproduce this would
help.

Rafael, please add to the post-2.6.23 regression list?  (If there's any
room left).

Thanks.

Re: programs vanish with 2.6.22+

2007-12-07 Thread Markus

Hi again!

The memtest ran 14 passes (~10h) without an error.

I now have a 2.6.24-rc4 with some debug options turned on, waiting for
something to happen... can I just leave it until a window disappears,
or do I need to manually enable something or run some user-space app?!


Markus

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread stefano . brivio


Quoting Nick Piggin <[EMAIL PROTECTED]>:

> On Friday 07 December 2007 19:45, Ingo Molnar wrote:
> >
> > ah, printk_clock() still uses sched_clock(), not jiffies. So it's not
> > the jiffies counter that goes back and forth, it's sched_clock() - so
> > this is a printk timestamps anomaly, not related to jiffies. I thought
> > we have fixed this bug in the printk code already: sched_clock() is a
> > 'raw' interface that should not be used directly - the proper interface
> > is cpu_clock(cpu).
>
> It's a single CPU box, so sched_clock() jumping would still be
> problematic, no?


I guess so. Definitely, it didn't look like a printk issue. Drivers  
don't read logs, usually. But they got confused anyway (it seems that  
udelay's get scaled or fail or somesuch - I can't test it right now,  
will provide more feedback in a few hours).



--
Ciao
Stefano




Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Andrew Morton

On Fri, 7 Dec 2007 11:40:13 +0100 Ingo Molnar <[EMAIL PROTECTED]> wrote:

> 
> * Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > > - t = printk_clock();
> > > + t = cpu_clock(printk_cpu);
> > >   nanosec_rem = do_div(t, 1000000000);
> > >   tlen = sprintf(tbuf,
> > >   "<%c>[%5lu.%06lu] ",
> > 
> > A bit risky - it's quite an expansion of code which no longer can call 
> > printk.
> > 
> > You might want to take that WARN_ON out of __update_rq_clock() ;)
> 
> hm, dont we already detect printk recursions and turn them into a silent 
> return instead of a hang/crash?
> 

We'll pop the locks and will proceed to do the nested printk.  So
__update_rq_clock() will need rather a lot of stack ;)
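
For reference, the timestamp formatting in the quoted hunk can be
sketched in userspace as below. This assumes the divisor is 10^9
(nanoseconds per second); the function name is hypothetical, plain
division stands in for do_div(), and the "<%c>" log-level prefix is left
out.

```c
#include <stdio.h>
#include <stdint.h>

/* Format a nanosecond count as "[sssss.uuuuuu] " (seconds.microseconds). */
int demo_format_stamp(char *buf, size_t len, uint64_t ns)
{
    unsigned long secs = (unsigned long)(ns / 1000000000ULL);
    unsigned long rem  = (unsigned long)(ns % 1000000000ULL);

    /* The remainder is in nanoseconds; print it as microseconds. */
    return snprintf(buf, len, "[%5lu.%06lu] ", secs, rem / 1000);
}
```

Whatever clock feeds this only has to return nanoseconds, which is why
swapping printk_clock() for cpu_clock() leaves the formatting untouched.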

Re: [PATCH 2.6.24-rc3] Fix /proc/net breakage

2007-12-07 Thread Denis V. Lunev

Andrew Morton wrote:
> On Fri, 07 Dec 2007 04:51:37 + David Woodhouse <[EMAIL PROTECTED]> wrote:
> 
>> On Mon, 2007-11-26 at 15:17 -0700, Eric W. Biederman wrote:
>>> Well I clearly goofed when I added the initial network namespace support
>>> for /proc/net.  Currently things work but there are odd details visible
>>> to user space, even when we have a single network namespace.
>>>
>>> Since we do not cache proc_dir_entry dentries at the moment we can
>>> just modify ->lookup to return a different directory inode depending
>>> on the network namespace of the process looking at /proc/net, replacing
>>> the current technique of using a magic and fragile follow_link method.
>>>
>>> To accomplish that this patch:
>>> - introduces a shadow_proc method to allow different dentries to
>>>   be returned from proc_lookup.
>>> - Removes the old /proc/net follow_link magic
>>> - Fixes a weakness in our not caching of proc generic dentries.
>>>
>>> As shadow_proc uses a task struct to decided which dentry to return we
>>> can go back later and fix the proc generic caching without modifying any 
>>> code that
>>> uses the shadow_proc method.
>>>
>>> Signed-off-by: Eric W. Biederman <[EMAIL PROTECTED]>
>>> ---
>>>  fs/proc/generic.c   |   12 ++-
>>>  fs/proc/proc_net.c  |   86 
>>> +++
>>>  include/linux/proc_fs.h |3 ++
>>>  3 files changed, 19 insertions(+), 82 deletions(-)
>> (commit 2b1e300a9dfc3196ccddf6f1d74b91b7af55e416)
>>
>> This seems to have broken the use of /proc/bus/usb as a mountpoint. It
>> always appears empty now, whatever's supposed to be mounted there.
>>
> 
> Yes.  Denis and Eric are tossing around competing patches but afaik nobody
> is happy with any of them.  Guys, could we get this sorted soonish please?
> 

Andrew, I became too relaxed after receiving
"Tested-by: Giacomo Catenazzi <[EMAIL PROTECTED]>"

Eric, I believe that reverting to the original behavior is better than
your new one, as
- you introduce a depth search by calling have_submounts(dentry)
during revalidation for all(!) /proc dentries
- your shadowing behavior will be broken if you mount something in
the depth of the shadowed tree (this can be done as a DoS attempt)

As a last-minute call, maybe it would be better to pin the network
namespace, like the pid namespace, during mount to avoid this crap
altogether?

Regards,
Den

Re: RFC: outb 0x80 in inb_p, outb_p harmful on some modern AMD64 with MCP51 laptops

2007-12-07 Thread Andi Kleen

Rene Herman <[EMAIL PROTECTED]> writes:
>
> If there are no sensible fixes, an 0x80/0xed choice could I assume be
> hung of DMI or something (if that _is_ parsed soon enough).

Another possibility would be to key this off DMI year (or existence
of DMI year since old systems don't have it). I guess it would
be reasonable to not do any delays on anything modern.

On x86-64 it could be presumably always disabled too, although
I was always too chicken to do that.

-Andi

Re: 2.6.24-rc4-mm1: VDSOSYM build error

2007-12-07 Thread Ingo Molnar


* Andrew Morton <[EMAIL PROTECTED]> wrote:

> On Thu, 6 Dec 2007 18:28:25 -0500
> "Miles Lane" <[EMAIL PROTECTED]> wrote:
> 
> > How can I find Roland's patches, so I can try backing them out?
> > I looked in the broken out patches and only saw one related
> > to VDSO.  Backing it out did not help.  I tried searching for
> > messages to LKML sent by "roland" but mostly got a bunch of
> > folks sending spam.
> 
> They're all clumped into git-x86.patch.  Hard.

in theory the git merges could be generated as a flat series of patch 
files:

 x86.git.foo-fixes.patch
 x86.git.bar-updates.patch
 x86.git.foo-fixes-feh.patch
 ...

which could also include the commit log. "git-log -p" might be a 
suitable generator. For example, x86.git can be processed per commit, 
via this script:

  for N in `git-rev-list --reverse --no-merges --remove-empty master..mm`; do 
git-log -p $N
  done

the following git-export-quilt script (just wrote it, might be buggy, so 
careful - and it blows away the patches/ directory wherever you run it) 
will generate a series file into patches/series that can be applied via 
quilt:

  rm -rf patches
  mkdir patches
  for N in `git-rev-list --reverse --no-merges --remove-empty master..mm`; do
   git-log -p -1 $N > .tmp
   export SUBJECT=`head -5 .tmp | tail -1`

   # generate filename out of subject line:
   FILE=x86.git-"`echo $SUBJECT | cut -c10- |
tr '[:punct:] \t' '-' | tr -s - | tr '[:upper:]' '[:lower:]'`"

   # generate unique name:
   while [ -f patches/$FILE.patch ]; do FILE="$FILE"_; done

   echo $FILE.patch
   mv .tmp patches/$FILE.patch
   echo $FILE.patch >> patches/series
  done

  ls -l patches/series

i ran this script over x86.git and it produced a patch series with 247 
patches that quilt was able to push correctly. (in theory this concept 
should work for other git trees too - but i have not tried it)

this would increase the series size quite substantially though - but it 
would make cherry-picking and patch based bisection a lot easier.

Ingo

Re: [PATCH] scheduler: fix x86 regression in native_sched_clock

2007-12-07 Thread Guillaume Chazarain

Le Fri, 7 Dec 2007 09:51:21 +0100,
Ingo Molnar <[EMAIL PROTECTED]> a écrit :

> yeah, we can do something like this in 2.6.25 - this will improve the 
> quality of sched_clock().

Thanks a lot for your interest!

I'll clean it up and resend it later. As I don't have the necessary
knowledge to do the tsc_{32,64}.c unification, should I copy paste
common functions into tsc_32.c and tsc_64.c to ease later unification
or should I start a common .c file?

Thanks again for showing interest.

-- 
Guillaume

Re: [BUG] 2.6.23-rc3 can't see sd partitions on Alpha

2007-12-07 Thread Bob Tracy

Andrew Morton wrote:
> On Thu, 6 Dec 2007 23:07:08 -0600 (CST) [EMAIL PROTECTED] (Bob Tracy) wrote:
> > Andrew Morton wrote:
> > > commit 6f37ac793d6ba7b35d338f791974166f67fdd9ba
> > > Merge: 2f1f53b... d90bf5a...
> > > Author: Linus Torvalds <[EMAIL PROTECTED]>
> > > Date:   Wed Nov 14 18:51:48 2007 -0800
> > > 
> > > Merge branch 'master' of 
> > > master.kernel.org:/pub/scm/linux/kernel/git/davem/n
> > > 
> > > * 'master' of 
> > > master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
> > >   (omitted for brevity)
> > > 
> > > I'm struggling to see how any of those could have broken block device
> > > mounting on alpha.  Are you sure you bisected right?
> > 
> > Based on what's in that commit, it *does* appear something went wrong
> > with bisection.  If the implicated commit is the next one in time
> > sequence relative to
> > 
> > # good: [2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3] CRISv10 fasttimer: Scrap 
> > INLINE and name timeval_cmp better
> > 
> > then the test of whether I bisected correctly is as simple as applying
> > the commit and seeing if things break, because I'm running on the
> > kernel corresponding to 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 right
> > now.  Let me give that a try and I'll report back.  Worst case, I'll
> > have to start over and write off the past four days...
> 
> Gad.  I trust the second time will be faster.
> 
> git-bisect _is_ very error prone.  I find one of the problems is that each
> step is so far apart in time that you forget what you were doing.  Did I
> remember to test that iteration?  Did I install the right kernel?  etc.
> 
> > Sorry about this...
> 
> Not appropriate ;)   Thanks for helping out.

Thanks for the kind words...  The above-mentioned test verified that the
bisection was/is correct: 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 works,
and 6f37ac793d6ba7b35d338f791974166f67fdd9ba doesn't.  Now I've got to
figure out why.

"git diff 2f1f53bdc6531696934f6ee7bbdfa2ab4f4f62a3 6f37ac793d6ba7b35d338f791974166f67fdd9ba"
produced a relatively short patch (18,437 bytes).  The list of involved
files:

diff --git a/drivers/char/random.c b/drivers/char/random.c
diff --git a/drivers/isdn/sc/card.h b/drivers/isdn/sc/card.h
diff --git a/drivers/isdn/sc/packet.c b/drivers/isdn/sc/packet.c
diff --git a/drivers/isdn/sc/shmem.c b/drivers/isdn/sc/shmem.c
diff --git a/drivers/net/arm/ep93xx_eth.c b/drivers/net/arm/ep93xx_eth.c
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
diff --git a/drivers/net/fs_enet/Kconfig b/drivers/net/fs_enet/Kconfig
diff --git a/drivers/net/fs_enet/Makefile b/drivers/net/fs_enet/Makefile
diff --git a/drivers/net/netx-eth.c b/drivers/net/netx-eth.c
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
diff --git a/include/net/sock.h b/include/net/sock.h
diff --git a/kernel/sysctl_check.c b/kernel/sysctl_check.c
diff --git a/net/core/dev.c b/net/core/dev.c
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c

Current state of the source tree is the 6f37ac... version, so I'll start
backing out the above diffs in related groups and continue until I've got
a working kernel.  For lack of an obvious target, I'll start with the
seemingly innocuous change to sysctl_check.c.  I'll report back when I've
got something.

-- 

Bob Tracy  |  "They couldn't hit an elephant at this dist- "
[EMAIL PROTECTED]   |   - Last words of Union General John Sedgwick,
   |  Battle of Spotsylvania Court House, U.S. Civil War

