date:20070326

Re: [PATCH] riscom8: fix use of deprecated functions

2007-03-26 Thread Alexey Dobriyan

On Sun, Mar 25, 2007 at 10:36:02PM +0200, Joerg Roedel wrote:
> This patch replaces the deprecated functions in drivers/char/riscom8.c
> and fixes the compile warnings they produced.

That's not the point of the exercise. Make it SMP-safe instead.

> --- a/drivers/char/riscom8.c
> +++ b/drivers/char/riscom8.c
> @@ -226,13 +226,13 @@ static void __init rc_init_CD180(struct riscom_board const * bp)
>  {
>   unsigned long flags;
>
> - save_flags(flags); cli();
> + local_irq_save(flags);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH] ipcns: fix !CONFIG_IPC_NS behavior

2007-03-26 Thread Serge E. Hallyn

fyi, the dummy copy_ipcs() needed to move because we need the CLONE_NEWIPC
definition, but #including sched.h breaks...

Andrew, I'll send a separate version for mm since the return type
changed, as with utsname.

thanks,
-serge

From: "Serge E. Hallyn" <[EMAIL PROTECTED]>
Subject: [PATCH] ipcns: fix !CONFIG_IPC_NS behavior

When CONFIG_IPC_NS=n, clone(CLONE_NEWIPC) claims success but does not
actually clone a new IPC namespace.

Fix this to return -EINVAL so the caller knows the request was denied.

Signed-off-by: Serge E. Hallyn <[EMAIL PROTECTED]>

---

 include/linux/ipc.h |7 +--
 ipc/util.c  |7 +++
 2 files changed, 8 insertions(+), 6 deletions(-)

9c4d6f67b6bf1a0e509ae2e52ffb721dc9bb2fe1
diff --git a/include/linux/ipc.h b/include/linux/ipc.h
index 636094c..5c05c88 100644
--- a/include/linux/ipc.h
+++ b/include/linux/ipc.h
@@ -96,15 +96,10 @@ extern struct ipc_namespace init_ipc_ns;
 #define INIT_IPC_NS(ns)
 #endif
 
+extern int copy_ipcs(unsigned long flags, struct task_struct *tsk);
 #ifdef CONFIG_IPC_NS
 extern void free_ipc_ns(struct kref *kref);
-extern int copy_ipcs(unsigned long flags, struct task_struct *tsk);
 extern int unshare_ipcs(unsigned long flags, struct ipc_namespace **ns);
-#else
-static inline int copy_ipcs(unsigned long flags, struct task_struct *tsk)
-{
-   return 0;
-}
 #endif
 
 static inline struct ipc_namespace *get_ipc_ns(struct ipc_namespace *ns)
diff --git a/ipc/util.c b/ipc/util.c
index 08a6479..0b65238 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -144,6 +144,13 @@ void free_ipc_ns(struct kref *kref)
shm_exit_ns(ns);
kfree(ns);
 }
+#else
+int copy_ipcs(unsigned long flags, struct task_struct *tsk)
+{
+   if (flags & CLONE_NEWIPC)
+   return -EINVAL;
+   return 0;
+}
 #endif
 
 /**
-- 
1.1.6

Re: [patch] hrtimers debug patch

2007-03-26 Thread Michal Piotrowski


On 26/03/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> * Michal Piotrowski <[EMAIL PROTECTED]> wrote:
>
> > Stardust is down, console log and config attached.
>
> thanks! I have stared at hrtimer.c a few more hours and the good news is
> that i found a narrow SMP race. The bad news is that i dont think it
> could explain your bug symptoms: the worst-case effect of the race
> should be an incorrect timeout on the current CPU - not a KTIME_MAX
> thing like your logs show.
>
> But maybe i didnt think through the effects of the bug well enough, and
> your box has a HT CPU, with HT CPUs being pretty good at triggering
> narrow SMP races - so maybe we are lucky? Fix attached below. Patch is
> build and boot-tested.

Thanks. Verification may take some time.

> Ingo



Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

Re: [3/6] 2.6.21-rc4: known regressions

2007-03-26 Thread Jeff Chua

On 3/27/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> > It's related. I tested without CONFIG_HPET_TIMER, and now my X60 can
> > suspend and resume from RAM (s2ram). Even better, it works
> > with/without CONFIG_NO_HZ.
>
> Does the patch below fix the HPET_TIMER=y case ?

Thomas, I tried, but it didn't help. Upon resume from ram, "date"
still didn't advance.

Thanks,
Jeff.

Re: [3/5] 2.6.21-rc4: known regressions (v2)

2007-03-26 Thread Marcus Better

Adrian Bunk wrote:
> > > Subject: ThinkPad R60: suspend to disk broken

> Does setting CONFIG_PCI_MSI=n make any difference?

Yes, it does. The hanging resume problem went away. 

(The display corruption and the instant resume were not affected.)

Marcus


Re: -rc5: e1000 resume weirdness

2007-03-26 Thread Ingo Molnar

* Jesse Brandeburg <[EMAIL PROTECTED]> wrote:

> was there a "NETDEV WATCHDOG" message that follows this?  If not it is 
> a harmless debug print.  Note the time_stamp and jiffies difference, 
> very large, consistent with a resume.  I think we need to disable the 
> internal e1000 tx hang code that causes this debug print when we are 
> suspending.  I'll work with auke to generate a short patch.

there was no "NETDEV WATCHDOG" message. But still there was a ~30 
seconds delay until i got the first few packets through the interface - 
while normally it's available almost instantly after resume. But ... 
this condition seems sporadic, i havent seen it on subsequent 
suspend+resume attempts.

Ingo

SB600 and SATA disk in 2.6.21-rc5

2007-03-26 Thread Matías Alejandro Torres


Ubuntu 6.10 with kernel version 2.6.21-rc5, compiled with ahci support.

I have an MSI K9AGM motherboard that ships with four SB600 SATA ports, but
while the kernel is booting it shows some errors and the SATA disk is not
detected.


Here is what dmesg throws:

dmesg:


[1.332000] ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
[1.332000] ahci :00:12.0: version 2.1
[1.332000] ACPI: PCI Interrupt :00:12.0[A] -> GSI 22 (level, low) -> IRQ 16
[2.336000] ahci :00:12.0: AHCI 0001.0100 32 slots 4 ports 3 Gbps 0xf impl SATA mode
[2.336000] ahci :00:12.0: flags: 64bit ncq ilck pm led clo pmp pio slum part
[2.336000] ata1: SATA max UDMA/133 cmd 0xf8824d00 ctl 0x bmdma 0x irq 222
[2.336000] ata2: SATA max UDMA/133 cmd 0xf8824d80 ctl 0x bmdma 0x irq 222
[2.336000] ata3: SATA max UDMA/133 cmd 0xf8824e00 ctl 0x bmdma 0x irq 222
[2.336000] ata4: SATA max UDMA/133 cmd 0xf8824e80 ctl 0x bmdma 0x irq 222

[2.336000] scsi0 : ahci
[2.648000] ata1: SATA link down (SStatus 0 SControl 300)
[2.648000] scsi1 : ahci
[3.132000] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   33.132000] ata2.00: qc timeout (cmd 0xec)
[   33.132000] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)
[   34.112000] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   64.112000] ata2.00: qc timeout (cmd 0xec)
[   64.112000] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)
[   64.112000] ata2.00: limiting speed to UDMA7:PIO5
[   65.092000] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   95.092000] ata2.00: qc timeout (cmd 0xec)
[   95.092000] ata2.00: failed to IDENTIFY (I/O error, err_mask=0x104)
[   96.072000] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[   96.072000] scsi2 : ahci
[   96.384000] ata3: SATA link down (SStatus 0 SControl 300)
[   96.384000] scsi3 : ahci
[   96.696000] ata4: SATA link down (SStatus 0 SControl 300)



##
lspci -vvv:

00:12.0 SATA controller: ATI Technologies Inc SB600 Non-Raid-5 SATA (prog-if 01 [AHCI 1.0])
  Subsystem: Micro-Star International Co., Ltd. Unknown device 7242
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B-
  Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- SERR-
  Latency: 64, Cache Line Size: 64 bytes
  Interrupt: pin A routed to IRQ 222
  Region 0: I/O ports at e800 [size=8]
  Region 1: I/O ports at e400 [size=4]
  Region 2: I/O ports at e000 [size=8]
  Region 3: I/O ports at dc00 [size=4]
  Region 4: I/O ports at d800 [size=16]
  Region 5: Memory at febffc00 (32-bit, non-prefetchable) [size=1K]
  Capabilities: [60] Power Management version 2
  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
  Status: D0 PME-Enable- DSel=0 DScale=0 PME-
  Capabilities: [50] Message Signalled Interrupts: 64bit+ Queue=0/2 Enable+
  Address: fee0300c  Data: 41b9
  Capabilities: [70] #12 [0010]


##
lshw

*-pci:0
description: Host bridge
product: RS480 Host Bridge
vendor: ATI Technologies Inc
physical id: 100
bus info: [EMAIL PROTECTED]:00.0
version: 10
width: 32 bits
clock: 66MHz
  *-pci:0
   description: PCI bridge
   product: RS480 PCI-X Root Port
   vendor: ATI Technologies Inc
   physical id: 2
   bus info: [EMAIL PROTECTED]:02.0
   version: 00
   width: 32 bits
   clock: 33MHz
   capabilities: pci normal_decode bus_master cap_list
   configuration: driver=pcieport-driver
  [...]
  *-storage
   description: SATA controller
   product: SB600 Non-Raid-5 SATA
   vendor: ATI Technologies Inc
   physical id: 12
   bus info: [EMAIL PROTECTED]:12.0
   version: 00
   width: 32 bits
   clock: 66MHz
   capabilities: storage ahci_1.0 bus_master cap_list
   configuration: driver=ahci
   resources: ioport:e800-e807 ioport:e400-e403 ioport:e000-e007 ioport:dc00-dc03 ioport:d800-d80f iomemory:febffc00-febf irq:222

  *-usb:0
   [...]



I attach the output of the lshw, lspci -vvv, and dmesg commands (all three
of them executed as root).


Matias
[0.00] Linux version 2.6.21-rc5 ([EMAIL PROTECTED]) (gcc version 4.1.2 
20060928 (prerelease) (Ubuntu 4.1.1-13ubuntu5)) #1 SMP Sun Mar 25 23:25:22 ART 
2007
[0.00] BIOS-provided physical RAM map:
[0.00] sanitize start
[0.00] sanitize end
[0.00] copy_e820_map() start:  size: 0009fc00 
end: 0009fc00 type: 1
[0.00] copy_e820_map() type is E820_RAM
[0.00] copy_e820_map() start: 0009fc00 size: 0400 
end: 000a type: 2
[0.00]

[RFC] Designing and Implementation of Directory Inode Reservation

2007-03-26 Thread coly

Hi, list,

I am working on the directory inode reservation feature now. Here is a
detailed description of my understanding of the design and of the current
implementation.

Please give me your comments on this idea. Thanks in advance for your
help.

Best regards.

Coly Li


-
Designing and Implementation of Directory Inode Reservation

version 0.1

Coly Li

  This text explains the idea of directory inode reservation and its
current design and implementation. Andreas Dilger and Daniel Phillips
developed this idea when ext3 htree was first written; now it is time to
implement it.

1. Issues with current inode allocation
  Currently ext3 and ext4dev allocate inodes in linear order within each
block group. Linear allocation may cause bad performance when stat or
unlink is run recursively over a huge number of files under a directory.
The reasons are:
  * Inodes are allocated in linear order, while the dentries of files are
accessed in hashed order in directory files. The difference in ordering
may cause a single inode block in the inode table to be submitted multiple
times. For example, in the hashed order of a directory file, the inode of
the first accessed file is in the second inode block, the inode of the
second accessed file is in the first inode block, the inode of the third
accessed file is in the second inode block, and so on. This causes each
inode block to be dirtied and submitted to the journal or the native
filesystem multiple times, especially in data=writeback mode.
  * Inodes of files in different sub-directories may be allocated in one
inode block. This also causes the inode block to be dirtied and submitted
multiple times, and it does not help even if the inodes of the same
directory are in hashed order.
  The issue appears when creating a huge number of files under a
directory, and gets even worse when creating a huge number of files under
multiple directories alternately within one block group.
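
The ordering mismatch described above can be illustrated with a small
userspace sketch. It is not ext3 code: the figure of 32 inodes per block
and the strided permutation standing in for "hashed order" are both
assumptions made for illustration, and all function names are
hypothetical. It counts how often a walk over the inodes moves to a
different inode block than the one it touched last.

```c
#include <stddef.h>

/* Assumed for illustration: 4096-byte blocks holding 128-byte inodes. */
#define INODES_PER_BLOCK 32u

/* Count how many times a walk over `order` lands on a different inode
 * block than the previous access (the first access counts as one). */
size_t count_block_visits(const unsigned *order, size_t n)
{
    size_t visits = 0;
    unsigned prev_block = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned block = order[i] / INODES_PER_BLOCK;

        if (i == 0 || block != prev_block)
            visits++;
        prev_block = block;
    }
    return visits;
}

/* Access order equal to allocation order: inode 0, 1, 2, ... */
void fill_linear(unsigned *order, size_t n)
{
    for (size_t i = 0; i < n; i++)
        order[i] = (unsigned)i;
}

/* Stand-in for hashed directory order: a strided permutation.
 * `stride` must be coprime to `n` so every inode appears once. */
void fill_scattered(unsigned *order, size_t n, unsigned stride)
{
    for (size_t i = 0; i < n; i++)
        order[i] = (unsigned)((i * stride) % n);
}
```

With 1024 inodes, the linear walk touches each of the 32 inode blocks
once, while a walk with stride 397 changes block on every single access.
A real hash order is less regular than a fixed stride, but it scatters
accesses across blocks in the same way.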

2. Improve performance by inode reservation for sub-directories
  Inode reservation for sub-directories means that, when a sub-directory
is created, a number of contiguous inodes in the inode table are reserved
for it. When new files are created under the sub-directory, their inodes
can be allocated from the reserved region. Once the reserved region is
full, just find another, larger reserved region in the inode tables.
  * First goal: allocate the inodes of new files in the same directory
from the reserved inode region.
  * Second goal: allocate the inodes of new files in the same directory in
hashed (or hash-like) order from the reserved inode region.
  The first goal keeps inodes from different sub-directories from being
mixed in one inode block. The second goal tries to make the inode
allocation order follow the hashed order of dentries in the directory
file. Both reduce the number of times an inode block is dirtied and
submitted.
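
A minimal userspace sketch of the first goal follows. It is not the
proposed ext3/ext4 implementation: the struct and function names are
hypothetical, the inode table is modelled as a simple bump allocator, and
doubling the region size when a reservation fills up is an assumption
(the text only says "larger").

```c
/* Hypothetical model of an inode table that hands out contiguous runs. */
struct ino_table {
    unsigned next_free;   /* first inode not yet reserved */
    unsigned total;       /* number of inodes in the table */
};

/* Per-directory reservation: inodes [start, start + size). */
struct dir_resv {
    unsigned start;
    unsigned size;
    unsigned used;
};

/* Reserve `size` contiguous inodes for a directory.
 * Returns 0 on success, -1 if the table has no room left. */
int resv_reserve(struct ino_table *t, struct dir_resv *r, unsigned size)
{
    if (size == 0 || size > t->total - t->next_free)
        return -1;
    r->start = t->next_free;
    r->size = size;
    r->used = 0;
    t->next_free += size;
    return 0;
}

/* Allocate one inode for a new file in the directory. When the current
 * region is full, move to a new region twice as large (assumption).
 * Returns the inode number, or -1 if no region can be reserved. */
long resv_alloc(struct ino_table *t, struct dir_resv *r)
{
    if (r->used == r->size) {
        if (resv_reserve(t, r, r->size * 2) != 0)
            return -1;
    }
    return (long)(r->start + r->used++);
}
```

When files are created alternately in two directories, each directory
still receives consecutive inode numbers from its own region, so inodes
of different sub-directories do not end up interleaved in one inode
block. The second goal (hash-like order inside a region) is not modelled
here.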

3. Benchmarks for ideal performance improvement
  A benchmark was run under ideal conditions, and the improvements are
impressive (copy operations go between different hard disks, and all the
files are 0 bytes). The operations are:
  * cd hdiskA/sub; for i in `seq 1 50`;do touch `/usr/bin/keygen |
head -c 8`;done;done
  * reboot
  * time cp -r hdiskA/sub hdiskB/ordered1
  * cp -r hdiskB/ordered1 hdiskA/ordered1
  * reboot
  * time cp -r hdiskA/ordered1 hdiskB/ordered2
  * reboot
  * time rm -rf hdiskA/ordered1
  * time rm -rf hdiskA/sub
  Here are the results for different journaling modes:
  a) data=writeback mode
   "cp -r hdiskA/sub hdiskB/ordered1" | "cp -r hdiskA/ordered1 hdiskB/ordered2"
   real    7m17.616s                  | real    1m8.764s
   user    0m1.456s                   | user    0m1.568s
   sys     0m27.586s                  | sys     0m26.050s
   "rm -rf hdiskA/sub"                | "rm -rf hdiskA/ordered1"
   real    9m49.902s                  | real    0m37.493s
   user    0m0.220s                   | user    0m0.076s
   sys     0m14.377s                  | sys     0m11.089s
  b) data=ordered
   "cp -r hdiskA/sub hdiskB/ordered1" | "cp -r hdiskA/ordered1 hdiskB/ordered2"
   real    7m57.016s                  | real    7m46.037s
   user    0m1.632s                   | user    0m1.604s
   sys     0m25.558s                  | sys     0m24.902s
   "rm -rf hdiskA/sub"                | "rm -rf hdiskA/ordered1"
   real    10m17.966s                 | real    6m32.278s
   user    0m0.236s                   | user    0m0.176s
   sys     0m14.453s                  | sys     0m12.093s
  c) data=journal
   "cp -r hdiskA/sub hdiskB/ordered1" | "cp -r hdiskA/ordered1 hdiskB/ordered2"
   real    6m54.151s                  | real    7m7.696s
   user    0m1.696s                   | user    0m1.416s
   sys     0m22.705s                  | sys     0m23.541s
   "rm -rf hdiskA/sub"                | "rm -rf hdiskA/ordered1"
   real    10m41.150s                 | real    7m43.703s
   user    0m0.188s                   |

Re: [PATCH 12/15] ide: make ide_hwif_t.ide_dma_host_on void

2007-03-26 Thread Sergei Shtylyov


Hello.

Bartlomiej Zolnierkiewicz wrote:

> [PATCH] ide: make ide_hwif_t.ide_dma_host_on void
>
> * since ide_hwif_t.ide_dma_host_on is called either when drive->using_dma == 1
>   or when return value is discarded make it void, also drop "ide_" prefix
> * make __ide_dma_host_on() void and drop "__" prefix

   BTW, it would also make sense to make hwif->ide_dma_timeout() and
hwif->ide_dma_lostirq void too (and possibly drop the ide_ prefix). Their
results are *explicitly* ignored.


MBR, Sergei

Re: [PATCH -mm] drivers/kvm/svm.c remove unused function

2007-03-26 Thread Avi Kivity


Michal Piotrowski wrote:
> Remove unused function
>
> CC  drivers/kvm/svm.o
> drivers/kvm/svm.c:207: warning: ‘inject_db’ defined but not used

Applied, thanks.


--
error compiling committee.c: too many arguments to function


Re: [PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-03-26 Thread Bill Davidsen


Tomoki Sekiyama wrote:

Hi,
Thanks for your reply.

  

3) Use "dirty_ratio" as the blocking ratio. And add
  "start_writeback_ratio", and start writeback at
  start_writeback_ratio(default:90) * dirty_ratio / 100 [%].
  In this way, specifying blocking ratio can be done in the same way
  as current kernel, but high/low watermark algorithm is enabled.
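
The arithmetic in proposal (3) can be written out as a short sketch. The
function name is hypothetical and the use of integer division is an
assumption; only the formula itself comes from the text above.

```c
/* Percentage of memory at which writeback would start under proposal (3):
 *   start_writeback_ratio * dirty_ratio / 100
 * Integer division is assumed, so fractions of a percent are dropped. */
unsigned writeback_start_percent(unsigned dirty_ratio,
                                 unsigned start_writeback_ratio)
{
    return start_writeback_ratio * dirty_ratio / 100;
}
```

For example, with dirty_ratio at 40 and start_writeback_ratio at the
proposed default of 90, writeback would begin at 36% of memory, while
writers would still block at 40%.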
  

I like 3 better, it should make tuning behavior more precise.



Then, what do you think of the following idea?

(4) add `dirty_start_writeback_ratio' as percentage of memory,
at which a generator of dirty pages itself starts writeback
(that is, non-blocking ratio).

In this way, `dirty_ratio' is used as the blocking ratio, so we don't
need to modify sysctl.conf etc. I think it's easier for system
administrators to understand, because the interface is similar to
`dirty_background_ratio' and `dirty_ratio'.

If this is OK, I'll repost the patch.
  
It sounds good to me, just be sure behavior is sane for both
blocking less than start_writeback and vice versa.
  

You can make an argument for absolute values for writeback,
if my disk will only write 70MB/s I may only want 203 sec of
pending writes, regardless of available memory.



To realize tuning with absolute values, I consider that we need to
modify handling of `dirty_background_ratio,' `dirty_ratio' and so on as
well as `dirty_start_writeback_ratio.' I think this should be done in
another patch if this feature is required.

Regards,
--
Tomoki Sekiyama
Hitachi, Ltd., Systems Development Laboratory


  



--
bill davidsen <[EMAIL PROTECTED]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979


Re: [patch] hrtimers debug patch

2007-03-26 Thread Ingo Molnar


* Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> Stardust is down, console log and config attached.

thanks! I have stared at hrtimer.c a few more hours and the good news is 
that i found a narrow SMP race. The bad news is that i dont think it 
could explain your bug symptoms: the worst-case effect of the race 
should be an incorrect timeout on the current CPU - not a KTIME_MAX 
thing like your logs show.

But maybe i didnt think through the effects of the bug well enough, and 
your box has a HT CPU, with HT CPUs being pretty good at triggering 
narrow SMP races - so maybe we are lucky? Fix attached below. Patch is 
build and boot-tested.

Ingo

-->
Subject: [patch] hrtimers: fix reprogramming SMP race
From: Ingo Molnar <[EMAIL PROTECTED]>

hrtimer_start() incorrectly set the 'reprogram' flag to 
enqueue_hrtimer(), which should only be 1 if the hrtimer is queued to 
the current CPU.

doing otherwise could result in a reprogramming of the current CPU's 
clockevents device, with a timer that is not queued to it - resulting in 
a bogus next expiry value.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
Needs-to-be-tested-by: Michal Piotrowski <[EMAIL PROTECTED]>
---
 kernel/hrtimer.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux/kernel/hrtimer.c
===
--- linux.orig/kernel/hrtimer.c
+++ linux/kernel/hrtimer.c
@@ -844,7 +844,12 @@ hrtimer_start(struct hrtimer *timer, kti
 
timer_stats_hrtimer_set_start_info(timer);
 
-   enqueue_hrtimer(timer, new_base, base == new_base);
+   /*
+* Only allow reprogramming if the new base is on this CPU.
+* (it might still be on another CPU if the timer was pending)
+*/
+   enqueue_hrtimer(timer, new_base,
+   new_base->cpu_base == &__get_cpu_var(hrtimer_bases));
 
unlock_hrtimer_base(timer, &flags);
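
The changed condition can be modelled in userspace as follows. This is a
sketch, not kernel code: the two structs are cut-down stand-ins for the
hrtimer base structures, and the function names are hypothetical. It
compares the old test (base == new_base) with the new test (the new base
belongs to the CPU running hrtimer_start()).

```c
#include <stdbool.h>

/* Cut-down stand-ins for the per-CPU base and the clock base. */
struct cpu_base   { int cpu; };
struct clock_base { struct cpu_base *cpu_base; };

/* Old condition: reprogram whenever the timer did not change base. */
bool reprogram_old(const struct clock_base *base,
                   const struct clock_base *new_base)
{
    return base == new_base;
}

/* New condition: reprogram only if the timer is queued on this CPU. */
bool reprogram_new(const struct clock_base *new_base,
                   const struct cpu_base *this_cpu)
{
    return new_base->cpu_base == this_cpu;
}
```

Take the case the patch comment mentions: the timer stays on another
CPU's base while the caller runs on CPU 0. The old test returns true
because the base did not change, so CPU 0's clockevents device would be
reprogrammed for a timer it does not own. The new test returns false.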
 

Re: [QUICKLIST 1/5] Quicklists for page table pages V4

2007-03-26 Thread Christoph Lameter

On Fri, 23 Mar 2007, Andrew Morton wrote:

> On Fri, 23 Mar 2007 10:54:12 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote:
> 
> > Here are the results of aim9 tests on x86_64. There are some minor
> > performance improvements and some fluctuations.
> 
> There are a lot of numbers there - what do they tell us?

That there are performance improvements because of quicklists.

> So what has changed here?  From a quick look it appears that x86_64 is
> using get_zeroed_page() for ptes, puds and pmds and is using a custom
> quicklist for pgds.

x86_64 is only using a list in order to track pgds. There is no 
quicklist without this patchset.

> After your patches, x86_64 is using a common quicklist allocator for puds,
> pmds and pgds and continues to use get_zeroed_page() for ptes.

x86_64 should be using quicklists for all ptes after this patch. I did not 
convert pte_free() since it is only used for freeing ptes during races 
(see __pte_alloc). Since pte_free gets passed a page struct it would require 
virt_to_page before being put onto the freelist. Not worth doing.

Hmmm... Then how does x86_64 free the ptes? Seems that we do 
free_page_and_swap_cache() in tlb_remove_pages. Yup so ptes are not 
handled which limits the speed improvements that we see.

> My question is pretty simple: how do we justify the retention of this
> custom allocator?

I would expect this functionality (never thought about it as an allocator) 
to extract common code from many arches that use one or the other form of 
preserving zeroed pages for page table pages. I saw lots of arches doing 
the same with some getting into trouble with the page structs. Having a 
common code base that does not have this issue would clean up the kernel 
and deal with the slab issue.

> Because simply removing it is the preferable way of fixing the SLUB
> problem.

That would reduce performance. I did not think that a common feature 
that is used throughout many arches would need rejustification.

[patch 2.6.21-rc5 3/3] layered parport code uses parport->dev

2007-03-26 Thread David Brownell

Update some of the layered parport_driver code to use parport->dev:

- i2c-parport (parent of i2c_adapter)
- spi_butterfly (parent of spi_master, allowing cruft removal)
- lp (creating class_device)
- ppdev (parent of parportN device)
- tipar (creating class_device)

There are still drivers that should be updated, like some of the input
drivers; but they won't be any worse off than they are today.

Signed-off-by: David Brownell <[EMAIL PROTECTED]>
---
 drivers/char/lp.c|2 +-
 drivers/char/ppdev.c |2 +-
 drivers/char/tipar.c |2 +-
 drivers/i2c/busses/i2c-parport.c |1 +
 drivers/spi/spi_butterfly.c  |   21 -
 5 files changed, 8 insertions(+), 20 deletions(-)

Index: g26/drivers/spi/spi_butterfly.c
===
--- g26.orig/drivers/spi/spi_butterfly.c 2007-03-07 10:10:33.0 -0800
+++ g26/drivers/spi/spi_butterfly.c 2007-03-07 11:17:26.0 -0800
@@ -20,7 +20,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include 
@@ -237,24 +237,16 @@ static void butterfly_attach(struct parp
int status;
struct butterfly*pp;
struct spi_master   *master;
-   struct platform_device  *pdev;
+   struct device   *dev = p->physport->dev;
 
-   if (butterfly)
+   if (butterfly || !dev)
return;
 
/* REVISIT:  this just _assumes_ a butterfly is there ... no probe,
 * and no way to be selective about what it binds to.
 */
 
-   /* FIXME where should master->cdev.dev come from?
-* e.g. /sys/bus/pnp0/00:0b, some PCI thing, etc
-* setting up a platform device like this is an ugly kluge...
-*/
-   pdev = platform_device_register_simple("butterfly", -1, NULL, 0);
-   if (IS_ERR(pdev))
-   return;
-
-   master = spi_alloc_master(&pdev->dev, sizeof *pp);
+   master = spi_alloc_master(dev, sizeof *pp);
if (!master) {
status = -ENOMEM;
goto done;
@@ -366,14 +358,12 @@ clean1:
 clean0:
(void) spi_master_put(pp->bitbang.master);
 done:
-   platform_device_unregister(pdev);
pr_debug("%s: butterfly probe, fail %d\n", p->name, status);
 }
 
 static void butterfly_detach(struct parport *p)
 {
struct butterfly*pp;
-   struct platform_device  *pdev;
int status;
 
/* FIXME this global is ugly ... but, how to quickly get from
@@ -386,7 +376,6 @@ static void butterfly_detach(struct parp
butterfly = NULL;
 
/* stop() unregisters child devices too */
-   pdev = to_platform_device(pp->bitbang.master->cdev.dev);
status = spi_bitbang_stop(&pp->bitbang);
 
/* turn off VCC */
@@ -397,8 +386,6 @@ static void butterfly_detach(struct parp
parport_unregister_device(pp->pd);
 
(void) spi_master_put(pp->bitbang.master);
-
-   platform_device_unregister(pdev);
 }
 
 static struct parport_driver butterfly_driver = {
Index: g26/drivers/i2c/busses/i2c-parport.c
===
--- g26.orig/drivers/i2c/busses/i2c-parport.c   2007-03-07 10:10:31.0 -0800
+++ g26/drivers/i2c/busses/i2c-parport.c 2007-03-07 11:14:35.0 -0800
@@ -175,6 +175,7 @@ static void i2c_parport_attach (struct p
adapter->algo_data.getscl = NULL;
adapter->algo_data.data = port;
adapter->adapter.algo_data = &adapter->algo_data;
+   adapter->adapter.dev.parent = port->physport->dev;
 
if (parport_claim_or_block(adapter->pdev) < 0) {
printk(KERN_ERR "i2c-parport: Could not claim parallel port\n");
Index: g26/drivers/char/ppdev.c
===
--- g26.orig/drivers/char/ppdev.c   2007-03-07 10:10:31.0 -0800
+++ g26/drivers/char/ppdev.c2007-03-07 11:14:35.0 -0800
@@ -752,7 +752,7 @@ static const struct file_operations pp_f
 
 static void pp_attach(struct parport *port)
 {
-   device_create(ppdev_class, NULL, MKDEV(PP_MAJOR, port->number),
+   device_create(ppdev_class, port->dev, MKDEV(PP_MAJOR, port->number),
"parport%d", port->number);
 }
 
Index: g26/drivers/char/lp.c
===
--- g26.orig/drivers/char/lp.c  2007-03-07 10:10:31.0 -0800
+++ g26/drivers/char/lp.c   2007-03-07 11:14:35.0 -0800
@@ -803,7 +803,7 @@ static int lp_register(int nr, struct pa
if (reset)
lp_reset(nr);
 
-   class_device_create(lp_class, NULL, MKDEV(LP_MAJOR, nr), NULL,
+   class_device_create(lp_class, NULL, MKDEV(LP_MAJOR, nr), port->dev,
"lp%d", nr);
 
printk(KERN_INFO "lp%d: using %s (%s).\n", nr,

[patch 2.6.21-rc5] arch/x86_64/kernel/early-quirks.c compiler warning

2007-03-26 Thread David Brownell

Fix "unused variable" compiler warning on non-SMP x86_64 configs.

Signed-off-by: David Brownell <[EMAIL PROTECTED]>

--- a/arch/x86_64/kernel/early-quirks.c
+++ b/arch/x86_64/kernel/early-quirks.c
@@ -73,9 +73,9 @@ static void __init ati_bugs(void)
 
 static void intel_bugs(void)
 {
+#ifdef CONFIG_SMP
u16 device = read_pci_config_16(0, 0, 0, PCI_DEVICE_ID);
 
-#ifdef CONFIG_SMP
if (device == PCI_DEVICE_ID_INTEL_E7320_MCH ||
device == PCI_DEVICE_ID_INTEL_E7520_MCH ||
device == PCI_DEVICE_ID_INTEL_E7525_MCH)

[patch 2.6.21-rc5 2/3] legacy PC parports support parport->dev

2007-03-26 Thread David Brownell

From: Jean Delvare <[EMAIL PROTECTED]>

Give legacy parallel ports a platform device in the device tree.

This is a quick and dirty implementation; it doesn't actually convert
the legacy parport code to the device driver model (by splitting out
probing from device creation).  But at least parallel port device
drivers will finally have a device to work with.

Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
Signed-off-by: David Brownell <[EMAIL PROTECTED]>
---
 drivers/parport/parport_pc.c |   39 +++
 1 file changed, 39 insertions(+)

--- linux-2.6.21-pre.orig/drivers/parport/parport_pc.c  2007-02-19 12:03:44.0 +0100
+++ linux-2.6.21-pre/drivers/parport/parport_pc.c   2007-02-19 18:15:41.0 +0100
@@ -53,6 +53,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -2156,6 +2157,17 @@ struct parport *parport_pc_probe_port (u
struct resource *base_res;
struct resource *ECR_res = NULL;
struct resource *EPP_res = NULL;
+   struct platform_device *pdev = NULL;
+
+   if (!dev) {
+   /* We need a physical device to attach to, but none was
+* provided. Create our own. */
+   pdev = platform_device_register_simple("parport_pc",
+  base, NULL, 0);
+   if (IS_ERR(pdev))
+   return NULL;
+   dev = &pdev->dev;
+   }
 
ops = kmalloc(sizeof (struct parport_operations), GFP_KERNEL);
if (!ops)
@@ -2359,6 +2371,8 @@ out3:
 out2:
kfree (ops);
 out1:
+   if (pdev)
+   platform_device_unregister(pdev);
return NULL;
 }
 
@@ -3106,6 +3120,21 @@ static struct pnp_driver parport_pc_pnp_
 };
 
 
+static int __devinit parport_pc_platform_probe(struct platform_device *pdev)
+{
+   /* Always succeed, the actual probing is done in
+* parport_pc_probe_port(). */
+   return 0;
+}
+
+static struct platform_driver parport_pc_platform_driver = {
+   .driver = {
+   .owner  = THIS_MODULE,
+   .name   = "parport_pc",
+   },
+   .probe  = parport_pc_platform_probe,
+};
+
 /* This is called by parport_pc_find_nonpci_ports (in asm/parport.h) */
 static int __devinit __attribute__((unused))
 parport_pc_find_isa_ports (int autoirq, int autodma)
@@ -3381,9 +3410,15 @@ __setup("parport_init_mode=",parport_ini
 
 static int __init parport_pc_init(void)
 {
+   int err;
+
if (parse_parport_params())
return -EINVAL;
 
+   err = platform_driver_register(&parport_pc_platform_driver);
+   if (err)
+   return err;
+
if (io[0]) {
int i;
/* Only probe the ports we were given. */
@@ -3408,6 +3443,7 @@ static void __exit parport_pc_exit(void)
pci_unregister_driver (&parport_pc_pci_driver);
if (pnp_registered_parport)
pnp_unregister_driver (&parport_pc_pnp_driver);
+   platform_driver_unregister(&parport_pc_platform_driver);
 
spin_lock(&ports_lock);
while (!list_empty(&ports_list)) {
@@ -3416,6 +3452,9 @@ static void __exit parport_pc_exit(void)
priv = list_entry(ports_list.next,
  struct parport_pc_private, list);
port = priv->port;
+   if (port->dev && port->dev->bus == &platform_bus_type)
+   platform_device_unregister(
+   to_platform_device(port->dev));
spin_unlock(&ports_lock);
parport_pc_unregister_port(port);
spin_lock(&ports_lock);


[patch 2.6.21-rc5 1/3] parport->dev driver model support

2007-03-26 Thread David Brownell

Currently a parport_driver can't get a handle on the device node for the
underlying parport (PNPACPI, PCI, etc).  That prevents correct placement
of sysfs child nodes, which can affect things like power management.

This patch adds a field to "struct parport" pointing to that device node,
and updates non-legacy port drivers to initialize that device pointer.
That field replaces the analogous PCI-only support in parport_pc.

Signed-off-by: David Brownell <[EMAIL PROTECTED]>
---
NOTE: this depends on an earlier patch to make pnp devices set up DMA;
parport is the primary user of the legacy i8237 dma infrastructure.

 drivers/parport/parport_cs.c |2 +-
 drivers/parport/parport_mfc3.c   |1 +
 drivers/parport/parport_pc.c |   31 +--
 drivers/parport/parport_serial.c |2 +-
 drivers/parport/parport_sunbpp.c |1 +
 drivers/parport/share.c  |5 +
 include/linux/parport.h  |8 ++--
 include/linux/parport_pc.h   |3 +--
 8 files changed, 33 insertions(+), 20 deletions(-)

Index: g26/include/linux/parport.h
===
--- g26.orig/include/linux/parport.h2007-02-24 01:19:37.0 -0800
+++ g26/include/linux/parport.h 2007-02-24 01:24:19.0 -0800
@@ -279,6 +279,10 @@ struct parport {
int dma;
int muxport;/* which muxport (if any) this is */
int portnum;/* which physical parallel port (not mux) */
+   struct device *dev; /* Physical device associated with IO/DMA.
+* This may unfortunately be null if the
+* port has a legacy driver.
+*/
 
struct parport *physport;
/* If this is a non-default mux
@@ -289,7 +293,7 @@ struct parport {
   following structure members are
   meaningless: devices, cad, muxsel,
   waithead, waittail, flags, pdir,
-  ieee1284, *_lock.
+  dev, ieee1284, *_lock.
 
   It this is a default mux parport, or
   there is no mux involved, this points to
@@ -302,7 +306,7 @@ struct parport {
 
struct pardevice *waithead;
struct pardevice *waittail;
-   
+
struct list_head list;
unsigned int flags;
 
Index: g26/include/linux/parport_pc.h
===
--- g26.orig/include/linux/parport_pc.h 2007-02-24 01:19:37.0 -0800
+++ g26/include/linux/parport_pc.h  2007-02-24 01:24:19.0 -0800
@@ -38,7 +38,6 @@ struct parport_pc_private {
/* buffer suitable for DMA, if DMA enabled */
char *dma_buf;
dma_addr_t dma_handle;
-   struct pci_dev *dev;
struct list_head list;
struct parport *port;
 };
@@ -232,7 +231,7 @@ extern int parport_pc_claim_resources(st
 extern struct parport *parport_pc_probe_port (unsigned long base,
  unsigned long base_hi,
  int irq, int dma,
- struct pci_dev *dev);
+ struct device *dev);
 extern void parport_pc_unregister_port (struct parport *p);
 
 #endif
Index: g26/drivers/parport/parport_pc.c
===
--- g26.orig/drivers/parport/parport_pc.c   2007-02-24 01:19:37.0 -0800
+++ g26/drivers/parport/parport_pc.c2007-02-28 12:40:30.0 -0800
@@ -620,6 +620,7 @@ static size_t parport_pc_fifo_write_bloc
unsigned long dmaflag;
size_t left = length;
const struct parport_pc_private *priv = port->physport->private_data;
+   struct device *dev = port->physport->dev;
dma_addr_t dma_addr, dma_handle;
size_t maxlen = 0x10000; /* max 64k per DMA transfer */
unsigned long start = (unsigned long) buf;
@@ -631,8 +632,8 @@ dump_parport_state ("enter fifo_write_bl
if ((start ^ end) & ~0xffffUL)
maxlen = 0x10000 - (start & 0xffff);
 
-   dma_addr = dma_handle = pci_map_single(priv->dev, (void *)buf, length,
-  PCI_DMA_TODEVICE);
+   dma_addr = dma_handle = dma_map_single(dev, (void *)buf, length,
+  DMA_TO_DEVICE);
 } else {
/* above 16 MB we use a bounce buffer as ISA-DMA is not possible */
maxlen   = PAGE_SIZE;  /* sizeof(priv->dma_buf) */
@@ -728,9 +729,9 @@ dump_parport_state ("enter fifo_write_bl
 
/* Turn off DMA mode */
frob_econtrol (port, 1<<3, 0);
-   
+

Re: [3/6] 2.6.21-rc4: known regressions

2007-03-26 Thread Thomas Gleixner

On Mon, 2007-03-26 at 13:37 +0800, Jeff Chua wrote:
> On 3/26/07, Adrian Bunk <[EMAIL PROTECTED]> wrote:
> 
> > > Resume from RAM (s2ram) still broke (tried with or without
> > > CONFIG_NO_HZ). Suspend to RAM seems ok, but upon resume, the screen
> > > will only display "inu" and only after pressing the power button will
> > > the system return to console. But "date" still doesn't advance.
> >
> > This might be related to the following regression:
> >
> > Subject: first disk access after resume takes several minutes
> >  ('date' does not advance after resume from RAM, CONFIG_NO_HZ=n)
> > References : http://lkml.org/lkml/2007/3/8/117
> >  http://lkml.org/lkml/2007/3/25/20
> > Submitter  : Michael S. Tsirkin <[EMAIL PROTECTED]>
> > Handled-By : Thomas Gleixner <[EMAIL PROTECTED]>
> >  Ingo Molnar <[EMAIL PROTECTED]>
> > Status : problem is being debugged
> 
> Adrian,
> 
> It's related. I tested without CONFIG_HPET_TIMER, and now my X60 can
> suspend and resume from RAM (s2ram). Even better, it works
> with/without CONFIG_NO_HZ.

Does the patch below fix the HPET_TIMER=y case ?

tglx

diff --git a/arch/i386/kernel/hpet.c b/arch/i386/kernel/hpet.c
index f3ab61e..76afea6 100644
--- a/arch/i386/kernel/hpet.c
+++ b/arch/i386/kernel/hpet.c
@@ -197,7 +197,7 @@ static int hpet_next_event(unsigned long delta,
cnt += delta;
hpet_writel(cnt, HPET_T0_CMP);
 
-   return ((long)(hpet_readl(HPET_COUNTER) - cnt ) > 0);
+   return ((long)(hpet_readl(HPET_COUNTER) - cnt ) > 0) ? -ETIME : 0;
 }
 
 /*
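
The one-line change above turns a bare truth value into an error code: 0 when
the comparator was programmed in time, -ETIME when the counter had already
passed it. A minimal userspace sketch of that check follows, assuming a 32-bit
free-running counter. The names fake_hpet_counter, fake_hpet_cmp and
sketch_next_event are hypothetical stand-ins, not kernel symbols, and the
elapsed_during_write parameter only simulates ticks that pass while the
comparator is written.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the free-running 32-bit main counter. */
static uint32_t fake_hpet_counter;

/* Hypothetical stand-in for the timer 0 comparator register. */
static uint32_t fake_hpet_cmp;

/*
 * Program the comparator `delta` ticks ahead of the current counter value,
 * then report whether the deadline was already missed.
 */
static int sketch_next_event(uint32_t delta, uint32_t elapsed_during_write)
{
	uint32_t cnt = fake_hpet_counter + delta;

	assert(delta < UINT32_C(0x80000000)); /* keep the signed test meaningful */

	fake_hpet_cmp = cnt;
	fake_hpet_counter += elapsed_during_write;

	/* The signed difference stays correct across 32-bit wraparound. */
	return ((int32_t)(fake_hpet_counter - cnt) > 0) ? -ETIME : 0;
}
```

The signed subtraction is what lets the comparison survive a counter wrap; a
plain "counter > cnt" test would misfire near the 32-bit boundary.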




Re: debug rsdl 0.33

2007-03-26 Thread Con Kolivas

On Tuesday 27 March 2007 01:28, Andy Whitcroft wrote:
> Andy Whitcroft wrote:
> Subsequent to that Con suggested testing a refactored RSDL patch.  That
> patch seemed to work on the machine at hand, so tests have been
> submitted for all the affected machines.
>
> http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc4-mm1-rsdl-0.34-test.patch
>
> ...
>
> Ok, the preliminary results are in and we seem to have good boots in the
> three machines I was hitting early boot oops.  So I think we can say
> that the new stack is a lot better than the old.
>
> Con, have a Tested-by:
> :/
>
> -apw

Well thank you very much indeed. I'm pleased that the code I decided to rip
out of the next update also took whatever bug was there with it. Fortunately
it also does not depend on the buggy "sched: accurate user accounting" patch
that I gave up on, so here is an incremental from the current -mm queue to
this code, without the "accurate user accounting" component, for anyone who's
trying to track just what I'm planning on moving forward with.

http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc4-mm1/sched-rsdl-sd-0.35-test.patch

Summary:
 3 files changed, 86 insertions(+), 249 deletions(-)

It also makes lists-add_list_splice_tail.patch unnecessary

-- 
-ck

Re: [PATCH] remove pci_dac_dma_... APIs

2007-03-26 Thread Randy Dunlap

On Mon, 26 Mar 2007 15:12:31 +0100 Jan Beulich wrote:

> Based on replies to a respective query, remove the pci_dac_dma_...() APIs
> (except for pci_dac_dma_supported() on Alpha, where this function is used
> in non-DAC PCI DMA code).
> 
> Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
> Cc: Andi Kleen <[EMAIL PROTECTED]>
> Cc: Jesse Barnes <[EMAIL PROTECTED]>
> Cc: Christoph Hellwig <[EMAIL PROTECTED]>
> Cc: David Miller <[EMAIL PROTECTED]>

meta-comment:  while diffstats are not required (by
Documentation/SubmittingPatches), they sure can be helpful
to reviewers.


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

Re: race condition in dm-crypt?

2007-03-26 Thread Alasdair G Kergon

On Fri, Mar 23, 2007 at 06:42:14PM +0100, markus reichelt wrote:
> * "Jan C. Nordholz" <[EMAIL PROTECTED]> wrote:
> > I'm seeing this for quite a while now (since 2.6.16 at least), but
> > without any obvious indicator to what might be causing it... where
> > should I continue debugging this?
> I bet folks at [EMAIL PROTECTED] would love to hear about this.
 
As mentioned in another thread, please try these patches if they aren't already
in your kernel:

  
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/2.6.19/dm-io-fix-bi_max_vecs.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-merge-max_hw_sector.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-disable-barriers.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-call-to-clone_init.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-avoid-cloned-bio-ref-after-free.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-fix-remove-first_clone.patch
  http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/dm-crypt-use-smaller-bvecs-in-clones.patch

Alasdair
-- 
[EMAIL PROTECTED]

Re: -rc5: e1000 resume weirdness

2007-03-26 Thread Kok, Auke


Jesse Brandeburg wrote:

> On 3/26/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > hm, on a T60, after suspend/resume, i get an e1000 timeout:
> >
> > e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> > e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
> >   Tx Queue <0>
> >   TDH
> >   TDT
> >   next_to_use
> >   next_to_clean<82>
> > buffer_info[next_to_clean]
> >   time_stamp
> >   next_to_watch<82>
> >   jiffies
> >   next_to_watch.status <1>
> >
> > it works fine after that reset. The e1000 driver didnt do this before
> > after resume the network was always available immediately. So this
> > appears to be a relatively new regression (post-rc3 or so). high-res
> > timers was disabled.
>
> was there a "NETDEV WATCHDOG" message that follows this?  If not it is
> a harmless debug print.  Note the time_stamp and jiffies difference,
> very large, consistent with a resume.  I think we need to disable the
> internal e1000 tx hang code that causes this debug print when we are
> suspending.  I'll work with auke to generate a short patch.


hmm, yeah, it appears that the patch I sent just a second ago isn't applicable 
in this case, since the irq handler is obviously enabled (the Link Up message 
proves that).


thanks to Jesse for being awake :)


Auke

Re: -rc5: e1000 resume weirdness

2007-03-26 Thread Jesse Brandeburg


On 3/26/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:


> hm, on a T60, after suspend/resume, i get an e1000 timeout:
>
> e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>   Tx Queue <0>
>   TDH
>   TDT
>   next_to_use
>   next_to_clean<82>
> buffer_info[next_to_clean]
>   time_stamp
>   next_to_watch<82>
>   jiffies
>   next_to_watch.status <1>
>
> it works fine after that reset. The e1000 driver didnt do this before
> after resume the network was always available immediately. So this
> appears to be a relatively new regression (post-rc3 or so). high-res
> timers was disabled.


was there a "NETDEV WATCHDOG" message that follows this?  If not it is
a harmless debug print.  Note the time_stamp and jiffies difference,
very large, consistent with a resume.  I think we need to disable the
internal e1000 tx hang code that causes this debug print when we are
suspending.  I'll work with auke to generate a short patch.
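
The remark about the time_stamp and jiffies difference can be illustrated with
a small sketch. It assumes, without the driver source at hand, that the hang
check compares the time stamp of the oldest pending buffer against the current
tick count plus a timeout. sketch_time_after is modelled on the kernel's
time_after() macro; sketch_tx_looks_hung and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wrap-safe "a is later than b", modelled on the kernel's time_after(). */
static bool sketch_time_after(uint32_t a, uint32_t b)
{
	return (int32_t)(b - a) < 0;
}

/*
 * Hypothetical hang heuristic: a buffer stamped at `stamp` is suspicious
 * once `now` has moved more than `timeout` ticks past it.
 */
static bool sketch_tx_looks_hung(uint32_t now, uint32_t stamp,
				 uint32_t timeout)
{
	assert(timeout < UINT32_C(0x80000000));
	return sketch_time_after(now, stamp + timeout);
}
```

Under that assumption a buffer queued just before suspend carries an old stamp,
so the first check after resume sees a very large gap and reports a hang even
though nothing is stuck.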

Re: -rc5: e1000 resume weirdness

2007-03-26 Thread Kok, Auke

Ingo Molnar wrote:
> hm, on a T60, after suspend/resume, i get an e1000 timeout:
> 
> e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow 
> Control: RX/TX
> e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
>   Tx Queue <0>
>   TDH  
>   TDT  
>   next_to_use  
>   next_to_clean<82>
> buffer_info[next_to_clean]
>   time_stamp   
>   next_to_watch<82>
>   jiffies  
>   next_to_watch.status <1>
> 
> it works fine after that reset. The e1000 driver didnt do this before 
> after resume the network was always available immediately. So this 
> appears to be a relatively new regression (post-rc3 or so). high-res 
> timers was disabled.
> 
>   Ingo

TDT == TDH -> this is a 'bogus' tx hang, indicating that one or more parts
of the TX path are not properly enabled.

Most likely, I suspect that we haven't enabled something because the ordering
of irq free/alloc was messed up and nobody cared before, but with all the
pci_save_state fixes going in we hit a bump.

The reset kicks everything back into order, so it is surely something silly
like this.

The attached patch fixes that and has been sitting in my queue for a few days.
Can you see if that works?

Auke


---
e1000: Free interrupts symmetrically with resume

From: Auke Kok <[EMAIL PROTECTED]>

Free interrupts symmetrically with resume allocation to prevent
pci save/restore state from possibly failing or warning.

Signed-off-by: Auke Kok <[EMAIL PROTECTED]>
---

 drivers/net/e1000/e1000_main.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 55ef148..93d41f0 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -5190,6 +5190,7 @@ e1000_suspend(struct pci_dev *pdev, pm_message_t state)
if (netif_running(netdev)) {
WARN_ON(test_bit(__E1000_RESETTING, &adapter->flags));
e1000_down(adapter);
+   e1000_free_irq(adapter);
}
 
 #ifdef CONFIG_PM
@@ -5257,9 +5258,6 @@ e1000_suspend(struct pci_dev *pdev, pm_message_t state)
if (adapter->hw.phy.type == e1000_phy_igp_3)
e1000_igp3_phy_powerdown_workaround_ich8lan(&adapter->hw);
 
-   if (netif_running(netdev))
-   e1000_free_irq(adapter);
-
/* Release control of h/w to f/w.  If f/w is AMT enabled, this
 * would have already happened in close and is redundant. */
e1000_release_hw_control(adapter);

Re: [2.6.22 patch] more scheduled OSS driver removal

2007-03-26 Thread Lee Revell

On 3/26/07, Richard Knutsson <[EMAIL PROTECTED]> wrote:

> > I guess he's referring to the well known "Master volume only controls
> > front output" problem.  This really does need to be resolved, as many
> > other ALSA drivers are affected.
>
> Isn't this quite a basic feature?! Is there somewhere to monitor the
> progress on this?

AC97 spec defines "Master" as controlling front outputs only.

But yes, there are proposals to improve the situation.  Search the
alsa-devel archives.

Lee

Re: [PATCH] fat/vfat: optionally ignore system timezone offset when reading/writing timestamps

2007-03-26 Thread OGAWA Hirofumi

Hiroyuki Machida <[EMAIL PROTECTED]> writes:

> I'm not familiar with recent fat code, but the code itself looks good for
> just turning time adjustment on/off. On the other hand, I feel we need more
> consideration of use cases/requirements. I feel that turning off
> time adjustment is just an ad-hoc solution to issues like the one Paul-san
> brought up.

Thank you. I see. So we need a "timezone" option to specify the adjusted
time?  If so, I feel we can add it as "timezone=utc"; then it can
be improved later...
-- 
OGAWA Hirofumi <[EMAIL PROTECTED]>

Re: debug rsdl 0.33

2007-03-26 Thread Andy Whitcroft

Andy Whitcroft wrote:
> Con Kolivas wrote:
> 
>> This is about the only place I can see the run_list is looked at unlocked.
>> Can you see if this simple patch helps? The debug patch is unnecessary now.
> 
> Tests queued with this patch.  Will let you know.

That patch had no effect on the problem.

...

Since then we have performed some more debugging on the issue and it
appears that the first stanza in next_dynamic_task is tripping,
triggering a "major_priority_rotation" and the resulting runq bitmap
indicating there is nothing to run.  Discussions with Con seem to
indicate that this is not possible :/.

Subsequent to that Con suggested testing a refactored RSDL patch.  That
patch seemed to work on the machine at hand, so tests have been
submitted for all the affected machines.

http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc4-mm1-rsdl-0.34-test.patch

...

Ok, the preliminary results are in and we seem to have good boots in the
three machines I was hitting early boot oops.  So I think we can say
that the new stack is a lot better than the old.

Con, have a Tested-by:

:/

-apw

[PATCH] Add const to pointer qualifiers for __chk_user_ptr and __chk_io_ptr.

2007-03-26 Thread Russ Cox


Change prototypes for  __chk_user_ptr and __chk_io_ptr
to take const void* instead of void*, so that code can pass
const void* to them.  (Right now sparse does not warn
about passing const void* to void* functions, but that
is a separate bug that I believe Josh is working on,
and once sparse does check this, the changed prototypes
will be necessary.)

Signed-off-by: Russ Cox <[EMAIL PROTECTED]>
Signed-off-by: Josh Triplett <[EMAIL PROTECTED]>

---

include/linux/compiler.h |4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

e5174dfa73190036ae1086110292594a3ffb3752
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index aca6698..3b6949b 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -15,8 +15,8 @@
# define __acquire(x)   __context__(x,1)
# define __release(x)   __context__(x,-1)
# define __cond_lock(x,c)   ((c) ? ({ __acquire(x); 1; }) : 0)
-extern void __chk_user_ptr(void __user *);
-extern void __chk_io_ptr(void __iomem *);
+extern void __chk_user_ptr(const void __user *);
+extern void __chk_io_ptr(const void __iomem *);
#else
# define __user
# define __kernel
--
1.1.3
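
A small userspace sketch of why the const-qualified prototype matters: a
checker declared with a plain void * parameter cannot be handed a
const-qualified pointer without discarding the qualifier, which a strict
checker should flag. The names sketch_chk_ptr and sketch_strnlen_checked are
hypothetical and only mimic the shape of the kernel helpers; the address-space
annotations (__user, __iomem) are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a pointer checker; it only counts its calls. */
static int checks_seen;

/* Taking const void * lets both const and non-const callers use it. */
static void sketch_chk_ptr(const void *p)
{
	(void)p;
	checks_seen++;
}

/* A reader that only ever holds a const pointer to its input. */
static size_t sketch_strnlen_checked(const char *s, size_t max)
{
	size_t n = 0;

	assert(s != NULL);
	sketch_chk_ptr(s); /* no qualifier is discarded here */
	while (n < max && s[n] != '\0')
		n++;
	return n;
}
```

Had sketch_chk_ptr been declared with void *, the call inside
sketch_strnlen_checked would drop the const qualifier, which is the situation
the patch description anticipates once sparse checks for it.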

[PATCH -mm] Fix: timeout not passed anymore to futex_lock_pi

2007-03-26 Thread Pierre Peiffer


This is a fix for a bug introduced by the patch
make-futex_wait-use-an-hrtimer-for-timeout.patch : the timeout value
is not passed anymore to futex_lock_pi.

Signed-off-by: Pierre Peiffer <[EMAIL PROTECTED]>

---
 kernel/futex.c|8 ++--
 kernel/futex_compat.c |4 +++-
 2 files changed, 9 insertions(+), 3 deletions(-)

Index: b/kernel/futex.c
===
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2383,8 +2383,10 @@ sys_futex64(u64 __user *uaddr, int op, u
return -EFAULT;
if (!timespec_valid(&ts))
return -EINVAL;
+
+   t = timespec_to_ktime(ts);
if (op == FUTEX_WAIT)
-   t = ktime_add(ktime_get(), timespec_to_ktime(ts));
+   t = ktime_add(ktime_get(), t);
tp = &t;
}
/*
@@ -2413,8 +2415,10 @@ asmlinkage long sys_futex(u32 __user *ua
return -EFAULT;
if (!timespec_valid(&ts))
return -EINVAL;
+
+   t = timespec_to_ktime(ts);
if (op == FUTEX_WAIT)
-   t = ktime_add(ktime_get(), timespec_to_ktime(ts));
+   t = ktime_add(ktime_get(), t);
tp = &t;
}
/*
Index: b/kernel/futex_compat.c
===
--- a/kernel/futex_compat.c
+++ b/kernel/futex_compat.c
@@ -150,8 +150,10 @@ asmlinkage long compat_sys_futex(u32 __u
return -EFAULT;
if (!timespec_valid(&ts))
return -EINVAL;
+
+   t = timespec_to_ktime(ts);
if (op == FUTEX_WAIT)
-   t = ktime_add(ktime_get(), timespec_to_ktime(ts));
+   t = ktime_add(ktime_get(), t);
tp = &t;
}
if (op == FUTEX_REQUEUE || op == FUTEX_CMP_REQUEUE
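
The shape of the fix can be sketched in plain C: the user-supplied timeout is
always converted, and only the wait operation adds the current time to turn it
into an absolute deadline. Before the fix, the conversion happened only inside
the wait branch, so other operations received an unset value. All names below
(sketch_op, sketch_timespec, sketch_timeout_ns) are hypothetical, and
nanosecond integers stand in for ktime_t.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_op { SKETCH_WAIT, SKETCH_LOCK_PI };

struct sketch_timespec {
	int64_t sec;
	int64_t nsec;
};

/* Flatten a seconds/nanoseconds pair into nanoseconds. */
static int64_t sketch_to_ns(struct sketch_timespec ts)
{
	assert(ts.nsec >= 0 && ts.nsec < INT64_C(1000000000));
	return ts.sec * INT64_C(1000000000) + ts.nsec;
}

/* Compute the timeout value that is handed on to the operation. */
static int64_t sketch_timeout_ns(enum sketch_op op, struct sketch_timespec ts,
				 int64_t now_ns)
{
	int64_t t = sketch_to_ns(ts); /* always convert, for every op */

	if (op == SKETCH_WAIT)
		t += now_ns; /* relative timeout becomes an absolute deadline */
	return t;
}
```

With this ordering the PI lock path gets the caller's value unchanged, while
the wait path still gets "now plus timeout".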


-- 
Pierre Peiffer

[PATCH] Fix "Section mismatch" compile warning

2007-03-26 Thread Bernhard Walle

Fix "Section mismatch" warnings in arch/x86_64/kernel/time.c

Signed-off-by: Bernhard Walle <[EMAIL PROTECTED]>

---
 arch/x86_64/kernel/time.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Index: linux-2.6.21-rc4-mm1/arch/x86_64/kernel/time.c
===
--- linux-2.6.21-rc4-mm1.orig/arch/x86_64/kernel/time.c
+++ linux-2.6.21-rc4-mm1/arch/x86_64/kernel/time.c
@@ -282,7 +282,7 @@ static unsigned int __init pit_calibrate
 #define PIT_MODE 0x43
 #define PIT_CH0  0x40
 
-static void __init __pit_init(int val, u8 mode)
+static void __pit_init(int val, u8 mode)
 {
unsigned long flags;
 
@@ -298,12 +298,12 @@ void __init pit_init(void)
__pit_init(LATCH, 0x34); /* binary, mode 2, LSB/MSB, ch 0 */
 }
 
-void __init pit_stop_interrupt(void)
+void pit_stop_interrupt(void)
 {
__pit_init(0, 0x30); /* mode 0 */
 }
 
-void __init stop_timer_interrupt(void)
+void stop_timer_interrupt(void)
 {
char *name;
if (hpet_address) {

Re: [patch] hrtimers debug patch

2007-03-26 Thread Michal Piotrowski


On 26/03/07, Michal Piotrowski <[EMAIL PROTECTED]> wrote:

> On 26/03/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:
> > On Mon, 2007-03-26 at 16:20 +0200, Michal Piotrowski wrote:
> > > > >
> > > > > I've got a crash dump, I'll try to figure out what is causing it ;)
> > > > >
> > > >
> > > > That might be useful
> > > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/
> > >
> > > Can you please upload a disassembly of hrtimer_interrupt() ?
>
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/dis_hrtimer_interrupt



I just noticed, that 'dis hrtimer_interrupt > dis_hrtimer_interrupt'
doesn't save the whole output.

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/hrtimer_interrupt

Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

[PATCH] fs/buffer.c: Loop rewrite within grow_buffers for finding sizebits

2007-03-26 Thread John Anthony Kazos Jr.


From: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

This patch alters the (do...while) construct to a simple (while) and saves 
one increment operation. It's entirely possible that gcc optimizes away 
the first iteration anyway, but in case it doesn't (and also because it's 
easier to read this way), I think this is better.


Signed-off-by: John Anthony Kazos Jr. <[EMAIL PROTECTED]>

---

--- linux-2.6.20.4/fs/buffer.c.orig 2007-03-26 09:42:15.0 -0400
+++ linux-2.6.20.4/fs/buffer.c  2007-03-26 10:10:25.0 -0400
@@ -1045,10 +1045,10 @@ grow_buffers(struct block_device *bdev,
pgoff_t index;
int sizebits;

-   sizebits = -1;
-   do {
+   sizebits = 0;
+   while ((size << sizebits) < PAGE_SIZE) {
sizebits++;
-   } while ((size << sizebits) < PAGE_SIZE);
+   }

index = block >> sizebits;
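
The two loop forms in the patch can be compared side by side in userspace. The
sketch below assumes a 4096-byte page (SKETCH_PAGE_SIZE is a hypothetical
constant, not the kernel's PAGE_SIZE) and wraps each loop in a hypothetical
helper so the results can be checked against each other.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this sketch */

/* Original form: start at -1 and increment before the first test. */
static int sizebits_do_while(unsigned int size)
{
	int sizebits = -1;

	assert(size > 0);
	do {
		sizebits++;
	} while ((size << sizebits) < SKETCH_PAGE_SIZE);
	return sizebits;
}

/* Rewritten form: start at 0 and test before incrementing. */
static int sizebits_while(unsigned int size)
{
	int sizebits = 0;

	assert(size > 0);
	while ((size << sizebits) < SKETCH_PAGE_SIZE)
		sizebits++;
	return sizebits;
}
```

Both helpers return the number of left shifts needed for the block size to
reach the page size, so the rewrite changes only how the loop reads, not the
value computed.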


Re: [patch] hrtimers debug patch

2007-03-26 Thread Michal Piotrowski


On 26/03/07, Thomas Gleixner <[EMAIL PROTECTED]> wrote:

> On Mon, 2007-03-26 at 16:20 +0200, Michal Piotrowski wrote:
> > > >
> > > > I've got a crash dump, I'll try to figure out what is causing it ;)
> > > >
> > >
> > > That might be useful
> > > http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/
>
> Can you please upload a disassembly of hrtimer_interrupt() ?

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/dis_hrtimer_interrupt

http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/gdb.debug

> I stared at hrtimer_interrupt() for quite a long time and I don't see
> how it can leave without reprogramming the event, when there are timers
> pending. And there are enough timers pending.
>
> tglx





Regards,
Michal

--
Michal K. K. Piotrowski
LTG - Linux Testers Group (PL)
(http://www.stardust.webpages.pl/ltg/)
LTG - Linux Testers Group (EN)
(http://www.stardust.webpages.pl/linux_testers_group_en/)

Re: [patch] hrtimers debug patch

2007-03-26 Thread Thomas Gleixner

On Mon, 2007-03-26 at 16:20 +0200, Michal Piotrowski wrote:
> >
> > I've got a crash dump, I'll try to figure out what is causing it ;)
> >
> 
> That might be useful
> http://www.stardust.webpages.pl/files/tbf/bitis-gabonica/2.6.21-rc5/

Can you please upload a disassembly of hrtimer_interrupt() ?

I stared at hrtimer_interrupt() for quite a long time and I don't see
how it can leave without reprogramming the event, when there are timers
pending. And there are enough timers pending.

tglx


Re: [3/5] 2.6.21-rc4: known regressions (v2)

2007-03-26 Thread Adrian Bunk

On Mon, Mar 26, 2007 at 12:00:22PM +0200, Marcus Better wrote:
> Adrian Bunk wrote:
> > Subject: ThinkPad R60: suspend to disk broken
> > References : http://lkml.org/lkml/2007/3/23/74
> > Submitter  : Marcus Better <[EMAIL PROTECTED]>
> > Status : submitter tries to bisect
> 
> I just tried -rc5. Now suspend to disk seems to work. I think the XFS 
> workqueue patch fixed this.
> 
> It can also suspend to RAM, but resume is worse. The first time around it 
> resumed but corrupted the vesafb console (greenish blinking character cells), 
> something that used to work before. But the system responded to input, so I 
> suspended to RAM again. This time the resume failed, it hung after 
> printing "Linux!" in yellow at the top of the screen. (Seems to be some 
> artifact, I have seen it before even with working suspend.)
>...

Does setting CONFIG_PCI_MSI=n make any difference?

> Marcus

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed


[PATCH] Add support for ITE887x serial chipsets

2007-03-26 Thread Niels de Vos

Hi,

The ITE Super I/O 887x chipsets are currently not completely supported.
Only parport_pc is able to activate the (optional) parallel port. This
patch adds support for the serial ports.

Signed-off-by: Niels de Vos <[EMAIL PROTECTED]>

--- linux-2.6.20.3/drivers/serial/8250_pci.c.orig   2007-03-13 19:27:08.0 +0100
+++ linux-2.6.20.3/drivers/serial/8250_pci.c        2007-03-26 15:42:17.0 +0200
@@ -581,6 +581,174 @@ static int pci_netmos_init(struct pci_de
return num_serial;
 }
 
+/*
+ * ITE support by Niels de Vos <[EMAIL PROTECTED]>
+ */
+
+static int __devinit pci_ite887x_init(struct pci_dev *dev)
+{
+   /* inta_addr are the configuration addresses of the ITE */
+   short inta_addr[] = { 0x2a0, 0x2c0, 0x220, 0x240, 0x1E0, 0x200,
+   0x280, 0 };
+   int ret, i, type;
+   struct resource *iobase;
+
+   /* search for the base-ioport */
+   i = 0;
+   while (inta_addr[i] && dev->dev.driver_data != NULL) {
+   dev->dev.driver_data = request_region(inta_addr[i], 32,
+   "ite887x");
+   if (dev->dev.driver_data != NULL) {
+   /* write POSIO0R - speed | size | ioport */
+   pci_write_config_dword(dev, 0x60,
+   0xe500 | inta_addr[i]);
+   /* write INTCBAR - ioport */
+   pci_write_config_dword(dev, 0x78, inta_addr[i]);
+   ret = inb(inta_addr[i]);
+   if (ret != 0xff) {
+   /* ioport connected */
+   break;
+   }
+   release_resource(dev->dev.driver_data);
+   dev->dev.driver_data = NULL;
+   }
+   i++;
+   }
+
+   if (! inta_addr[i]) {
+   printk(KERN_ERR "ite887x: could not find iobase\n");
+   return -ENODEV;
+   }
+
+   iobase = dev->dev.driver_data;
+
+   /* start of undocumented type checking (see parport_pc.c) */
+   type = inb(iobase->start + 0x18);
+   type &= 0x0f;
+
+   switch (type) {
+   case 0x2:
+   printk(KERN_DEBUG "8250_pci: ITE8871 found (1P)\n");
+   break;
+   case 0xa:
+   printk(KERN_DEBUG "8250_pci: ITE8875 found (1P)\n");
+   break;
+   case 0xe:
+   printk(KERN_INFO "8250_pci: ITE8872 found (2S1P)\n");
+   return 2;
+   case 0x6:
+   printk(KERN_INFO "8250_pci: ITE8873 found (1S)\n");
+   return 1;
+   case 0x8:
+   printk(KERN_INFO "8250_pci: ITE8874 found (2S)\n");
+   return 2;
+   default:
+   moan_device("unknown ITE887x", dev);
+   }
+
+   /* the device has no UARTs if we get here */
+   release_resource(iobase);
+   dev->dev.driver_data = NULL;
+   return -ENODEV;
+}
+
+/*
+ * activate the UART in the MISCR
+ */
+static int
+pci_ite887x_setup(struct serial_private *priv, struct pciserial_board *board,
+ struct uart_port *port, int idx)
+{
+   int ret;
+   struct pci_dev *dev = priv->dev;
+   u32 miscr, uartbar, ioport;
+   /* iobase is the private driver_data */
+   struct resource *iobase;
+   /* registers */
+   unsigned short MISCR = 0x9c, UARTBAR = 0x7c;
+   unsigned short PS1BAR = 0x14, POSIO1 = 0x64;
+
+   /* enable IO_Space bit */
+   u32 POSIO_ENABLE = 1 << 31;
+   /* Decoding speed (1 = slow, 2 = medium, 3 = fast) */
+   u32 POSIO_SPEED = 3 << 29;  
+
+   iobase = (struct resource*) dev->dev.driver_data;
+   if (iobase == NULL) {
+   printk(KERN_ERR "ite887x: iobase is not available\n");
+   return -ENODEV;
+   }
+
+   /* read the I/O port from the device */
+   ret = pci_read_config_dword(dev, PS1BAR + (0x4 * idx), &ioport);
+   if (ret) {
+   printk(KERN_ERR "ite887x: read error PS%dBAR\n", idx + 1);
+   ret = -ENODEV;
+   goto release_inta;
+   }
+
+   ioport &= 0xFF00;   /* the actual I/O space base address */
+   ret = pci_write_config_dword(dev, POSIO1 + (0x4 * idx),
+   POSIO_ENABLE | POSIO_SPEED | ioport);
+   if (ret) {
+   printk(KERN_ERR "ite887x: write error PSIO%d+\n", idx + 1);
+   ret = -ENODEV;
+   goto release_inta;
+   }
+
+   /* write the ioport to the UARTBAR */
+   ret = pci_read_config_dword(dev, UARTBAR, &uartbar);
+   if (ret) {
+   printk(KERN_ERR "ite887x: read error UARTBAR\n");
+   ret = -ENODEV;
+   goto release_inta;
+   }
+   uartbar &= ~(15 << (4 * idx));  /* clear half the reg */
+   uartbar |= ioport << (16 *

Re: [4/5] 2.6.21-rc4: known regressions (v2)

2007-03-26 Thread Michael S. Tsirkin

Update: I tested 2.6.21-rc5 with the following settings
# CONFIG_NO_HZ is not set
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_HPET is not set

1. Without additional kernel options

  After the system comes out of suspend to RAM, I observed the following
  behaviour (I used s2ram from console):
  1. The first disk access takes much longer than with 2.6.20
  2. System clock does not advance (date always reports the same time)
  3. After an attempt to switch to X, X starts drawing some windows and
     then hangs
  All 3 issues are new and did not occur under 2.6.20, so this is a regression.

2. Setting clocksource=acpi_pm
After resume from RAM
  1. The first disk access takes much longer than with 2.6.20
  2. System clock seems to advance properly
  3. After an attempt to switch to X, X works correctly

So it seems that clocksource=acpi_pm can be used as a work-around.
What does this tell us?

I'm also not happy with how clocksource=acpi_pm performs -
the system seems to be jerky, stalling for fractions of
seconds now and then.

TODO: test without CONFIG_HPET_TIMER.

-- 
MST

Re: [patch 1/2] hugetlb: add resv argument to hugetlb_file_setup

2007-03-26 Thread Adam Litke

On Fri, 2007-03-23 at 15:42 -0700, Ken Chen wrote:
> rename hugetlb_zero_setup() to hugetlb_file_setup() to better match
> function name convention like shmem implementation.  Also add an
> argument to the function to indicate whether file setup should reserve
> hugetlb page upfront or not.
> 
> Signed-off-by: Ken Chen <[EMAIL PROTECTED]>

This patch doesn't really look bad at all, but...

I am worried that what might seem nice and clean right now will slowly
get worse.  This implements an interface on top of another interface
(char device on top of a filesystem).  What is the next hugetlbfs
function that will need a boolean switch to handle a character device
special case?

Am I just worrying too much here?  Although my pagetable_operations
patches aren't the most popular right now, they do have at least one
advantage IMO: they enable side-by-side implementation of the interfaces
as opposed to stacking them.  Keeping them separate removes the need for
if ((vm_flags & VM_HUGETLB) && (is_hugetlbfs_chardev())) checking. 
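
The contrast between stacking and side-by-side interfaces can be sketched
in plain user-space C. This is only an illustration, not code from either
patch set: every name below (stacked_setup, huge_ops_sketch, fs_ops,
chardev_ops) is hypothetical, the 2 MB huge page size is assumed, and which
interface reserves pages up front is also an assumption.

```c
#define HPAGE_SIZE_SKETCH (2UL << 20)   /* assumed 2 MB huge page size */

/*
 * Stacked style: one shared setup routine, plus one boolean argument
 * for each special case a second interface needs.
 * Returns the number of huge pages reserved up front.
 */
unsigned long stacked_setup(unsigned long size, int resv)
{
    unsigned long pages = size / HPAGE_SIZE_SKETCH;

    return resv ? pages : 0;
}

/* Side-by-side style: each interface supplies its own operations. */
struct huge_ops_sketch {
    const char *name;
    unsigned long (*setup)(unsigned long size);
};

/* Assumption: the filesystem interface reserves everything up front. */
static unsigned long fs_setup(unsigned long size)
{
    return size / HPAGE_SIZE_SKETCH;
}

/* Assumption: the character device reserves nothing up front. */
static unsigned long chardev_setup(unsigned long size)
{
    (void)size;
    return 0;
}

const struct huge_ops_sketch fs_ops = { "hugetlbfs", fs_setup };
const struct huge_ops_sketch chardev_ops = { "chardev", chardev_setup };
```

In the second style a caller picks fs_ops or chardev_ops once and then
calls ops->setup(size); no shared function has to grow another flag when
the next special case turns up.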

-- 
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center


[PATCH] x86: tighten kernel image page access rights

2007-03-26 Thread Jan Beulich

On x86-64, kernel memory freed after init can be entirely unmapped instead
of just getting 'poisoned' by overwriting with a debug pattern.

On i386 and x86-64 (under CONFIG_DEBUG_RODATA), kernel text and bug table
can also be write-protected. On x86-64, in addition to that, also make sure
that both mappings (kernel image and 1:1 mapping) get updated here.

(Not sure what the symbol 'stext' is good for; can it be removed?)

Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>

--- linux-2.6.21-rc5/arch/i386/kernel/vmlinux.lds.S 2007-03-26 15:20:02.0 +0200
+++ 2.6.21-rc5-x86-init-page-attribs/arch/i386/kernel/vmlinux.lds.S 2007-03-21 12:29:00.0 +0100
@@ -61,8 +61,6 @@ SECTIONS
__stop___ex_table = .;
   }
 
-  RODATA
-
   BUG_TABLE
 
   . = ALIGN(4);
@@ -72,6 +70,8 @@ SECTIONS
__tracedata_end = .;
   }
 
+  RODATA
+
   /* writeable */
   . = ALIGN(4096);
   .data : AT(ADDR(.data) - LOAD_OFFSET) {  /* Data */
--- linux-2.6.21-rc5/arch/i386/mm/init.c 2007-03-26 15:20:02.0 +0200
+++ 2.6.21-rc5-x86-init-page-attribs/arch/i386/mm/init.c 2007-03-21 12:29:00.0 +0100
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -751,13 +752,25 @@ static int noinline do_test_wp_bit(void)
 
 void mark_rodata_ro(void)
 {
-   unsigned long addr = (unsigned long)__start_rodata;
+   unsigned long start = PFN_ALIGN(_text);
+   unsigned long size = PFN_ALIGN(_etext) - start;
 
-   for (; addr < (unsigned long)__end_rodata; addr += PAGE_SIZE)
-   change_page_attr(virt_to_page(addr), 1, PAGE_KERNEL_RO);
-
-   printk("Write protecting the kernel read-only data: %uk\n",
-   (__end_rodata - __start_rodata) >> 10);
+#ifdef CONFIG_HOTPLUG_CPU
+   /* It must still be possible to apply SMP alternatives. */
+   if (num_possible_cpus() <= 1)
+#endif
+   {
+   change_page_attr(virt_to_page(start),
+size >> PAGE_SHIFT, PAGE_KERNEL_RX);
+   printk("Write protecting the kernel text: %luk\n", size >> 10);
+   }
+
+   start += size;
+   size = (unsigned long)__end_rodata - start;
+   change_page_attr(virt_to_page(start),
+size >> PAGE_SHIFT, PAGE_KERNEL_RO);
+   printk("Write protecting the kernel read-only data: %luk\n",
+  size >> 10);
 
/*
 * change_page_attr() requires a global_flush_tlb() call after it.
@@ -780,7 +793,7 @@ void free_init_pages(char *what, unsigne
free_page(addr);
totalram_pages++;
}
-   printk(KERN_INFO "Freeing %s: %ldk freed\n", what, (end - begin) >> 10);
+   printk(KERN_INFO "Freeing %s: %luk freed\n", what, (end - begin) >> 10);
 }
 
 void free_initmem(void)
--- linux-2.6.21-rc5/arch/x86_64/kernel/head.S  2007-03-26 15:20:12.0 +0200
+++ 2.6.21-rc5-x86-init-page-attribs/arch/x86_64/kernel/head.S  2007-03-21 12:29:00.0 +0100
@@ -258,7 +258,6 @@ ljumpvector:
.word   __KERNEL_CS
 
 ENTRY(stext)
-ENTRY(_stext)
 
$page = 0
 #define NEXT_PAGE(name) \
--- linux-2.6.21-rc5/arch/x86_64/kernel/vmlinux.lds.S   2007-03-26 15:20:12.0 +0200
+++ 2.6.21-rc5-x86-init-page-attribs/arch/x86_64/kernel/vmlinux.lds.S   2007-03-21 12:29:00.0 +0100
@@ -29,6 +29,7 @@ SECTIONS
   .text :  AT(ADDR(.text) - LOAD_OFFSET) {
/* First the code that has to be first for bootstrapping */
*(.bootstrap.text)
+   _stext = .;
/* Then all the functions that are "hot" in profiles, to group them
onto the same hugetlb entry */
#include "functionlist"
@@ -50,10 +51,10 @@ SECTIONS
   __ex_table : AT(ADDR(__ex_table) - LOAD_OFFSET) { *(__ex_table) }
   __stop___ex_table = .;
 
-  RODATA
-
   BUG_TABLE
 
+  RODATA
+
   . = ALIGN(PAGE_SIZE);/* Align data segment to page size boundary */
/* Data */
   .data : AT(ADDR(.data) - LOAD_OFFSET) {
--- linux-2.6.21-rc5/arch/x86_64/mm/init.c  2007-03-26 15:20:12.0 +0200
+++ 2.6.21-rc5-x86-init-page-attribs/arch/x86_64/mm/init.c  2007-03-21 12:29:00.0 +0100
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -597,21 +598,23 @@ void free_init_pages(char *what, unsigne
if (begin >= end)
return;
 
-   printk(KERN_INFO "Freeing %s: %ldk freed\n", what, (end - begin) >> 10);
+   printk(KERN_INFO "Freeing %s: %luk freed\n", what, (end - begin) >> 10);
for (addr = begin; addr < end; addr += PAGE_SIZE) {
ClearPageReserved(virt_to_page(addr));
init_page_count(virt_to_page(addr));
memset((void *)(addr & ~(PAGE_SIZE-1)),
POISON_FREE_INITMEM, PAGE_SIZE);
+   if (addr >= __START_KERNEL_map)
+   change_page_attr_addr(addr, 1, __pgprot(0));

[PATCH] remove pci_dac_dma_... APIs

2007-03-26 Thread Jan Beulich

Based on replies to a respective query, remove the pci_dac_dma_...() APIs
(except for pci_dac_dma_supported() on Alpha, where this function is used
in non-DAC PCI DMA code).

Signed-off-by: Jan Beulich <[EMAIL PROTECTED]>
Cc: Andi Kleen <[EMAIL PROTECTED]>
Cc: Jesse Barnes <[EMAIL PROTECTED]>
Cc: Christoph Hellwig <[EMAIL PROTECTED]>
Cc: David Miller <[EMAIL PROTECTED]>

--- linux-2.6.21-rc5/Documentation/DMA-mapping.txt  2007-02-04 19:44:54.0 +0100
+++ 2.6.21-rc5-no-pci-dac-dma/Documentation/DMA-mapping.txt 2007-03-23 17:46:37.0 +0100
@@ -664,109 +664,6 @@ It is that simple.
 Well, not for some odd devices.  See the next section for information
 about that.
 
-   DAC Addressing for Address Space Hungry Devices
-
-There exists a class of devices which do not mesh well with the PCI
-DMA mapping API.  By definition these "mappings" are a finite
-resource.  The number of total available mappings per bus is platform
-specific, but there will always be a reasonable amount.
-
-What is "reasonable"?  Reasonable means that networking and block I/O
-devices need not worry about using too many mappings.
-
-As an example of a problematic device, consider compute cluster cards.
-They can potentially need to access gigabytes of memory at once via
-DMA.  Dynamic mappings are unsuitable for this kind of access pattern.
-
-To this end we've provided a small API by which a device driver
-may use DAC cycles to directly address all of physical memory.
-Not all platforms support this, but most do.  It is easy to determine
-whether the platform will work properly at probe time.
-
-First, understand that there may be a SEVERE performance penalty for
-using these interfaces on some platforms.  Therefore, you MUST only
-use these interfaces if it is absolutely required.  %99 of devices can
-use the normal APIs without any problems.
-
-Note that for streaming type mappings you must either use these
-interfaces, or the dynamic mapping interfaces above.  You may not mix
-usage of both for the same device.  Such an act is illegal and is
-guaranteed to put a banana in your tailpipe.
-
-However, consistent mappings may in fact be used in conjunction with
-these interfaces.  Remember that, as defined, consistent mappings are
-always going to be SAC addressable.
-
-The first thing your driver needs to do is query the PCI platform
-layer if it is capable of handling your devices DAC addressing
-capabilities:
-
-   int pci_dac_dma_supported(struct pci_dev *hwdev, u64 mask);
-
-You may not use the following interfaces if this routine fails.
-
-Next, DMA addresses using this API are kept track of using the
-dma64_addr_t type.  It is guaranteed to be big enough to hold any
-DAC address the platform layer will give to you from the following
-routines.  If you have consistent mappings as well, you still
-use plain dma_addr_t to keep track of those.
-
-All mappings obtained here will be direct.  The mappings are not
-translated, and this is the purpose of this dialect of the DMA API.
-
-All routines work with page/offset pairs.  This is the _ONLY_ way to 
-portably refer to any piece of memory.  If you have a cpu pointer
-(which may be validly DMA'd too) you may easily obtain the page
-and offset using something like this:
-
-   struct page *page = virt_to_page(ptr);
-   unsigned long offset = offset_in_page(ptr);
-
-Here are the interfaces:
-
-   dma64_addr_t pci_dac_page_to_dma(struct pci_dev *pdev,
-struct page *page,
-unsigned long offset,
-int direction);
-
-The DAC address for the tuple PAGE/OFFSET are returned.  The direction
-argument is the same as for pci_{map,unmap}_single().  The same rules
-for cpu/device access apply here as for the streaming mapping
-interfaces.  To reiterate:
-
-   The cpu may touch the buffer before pci_dac_page_to_dma.
-   The device may touch the buffer after pci_dac_page_to_dma
-   is made, but the cpu may NOT.
-
-When the DMA transfer is complete, invoke:
-
-   void pci_dac_dma_sync_single_for_cpu(struct pci_dev *pdev,
-dma64_addr_t dma_addr,
-size_t len, int direction);
-
-This must be done before the CPU looks at the buffer again.
-This interface behaves identically to pci_dma_sync_{single,sg}_for_cpu().
-
-And likewise, if you wish to let the device get back at the buffer after
-the cpu has read/written it, invoke:
-
-   void pci_dac_dma_sync_single_for_device(struct pci_dev *pdev,
-   dma64_addr_t dma_addr,
-   size_t len, int direction);
-
-before letting the device access the DMA area again.
-
-If you need to get back to the PAGE/OFFSET tuple from a dma64_addr_t
-the following interfaces are provided:
-
-   struct page

Re: [3/5] 2.6.21-rc4: known regressions (v2)

2007-03-26 Thread Marcus Better

Pavel Machek wrote:

>> > Subject: ThinkPad R60: suspend to disk broken
>> > References : http://lkml.org/lkml/2007/3/23/74

>> input, so I suspended to RAM again. This time the resume failed, it hung
>> after printing "Linux!" in yellow at the top of the screen.

> Yellow Linux! is my debugging trick.

Cute :-)

Here is my bisect log so far, with 98 revisions left. Note that all kernels 
have the XFS workqueue patch applied.

~$ git bisect log
git-bisect start
# bad: [6fb04ccf5c5e054c4107090bed6e866489f1089f] Linux 2.6.21-rc5
git-bisect bad 6fb04ccf5c5e054c4107090bed6e866489f1089f
# good: [c8f71b01a50597e298dc3214a2f2be7b8d31170c] Linux 2.6.21-rc1
git-bisect good c8f71b01a50597e298dc3214a2f2be7b8d31170c
# good: [ad5f1196792653dadf09c07a5fa917092b469c1c] ecryptfs: check xattr operation support fix
git-bisect good ad5f1196792653dadf09c07a5fa917092b469c1c
# good: [271368b69b9e8042063d6c713423e84503bbdaa0] Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
git-bisect good 271368b69b9e8042063d6c713423e84503bbdaa0
# bad: [f5b42c3324494ea3f9bf795e2a7e4d3cbb06c607] KVM: Fix guest sysenter on vmx
git-bisect bad f5b42c3324494ea3f9bf795e2a7e4d3cbb06c607

The bad kernels exhibit the hang on second resume from RAM.

The "good" ones all have the artifact with corrupted display. Moreover, they 
resume immediately from every suspend to RAM _after_ a suspend-to-disk, but not 
before it. This is when suspending with "echo mem > /sys/power/state".

> Try vga=0 ... text console seems to work for you.

Ok, will try.

Marcus



Question: half-duplex and full-duplex serial driver

2007-03-26 Thread Mockern

Hi,

Could you please help me: how can I make my serial driver work in both
half-duplex and full-duplex mode?

Thank you

Re: [PATCH] i386: add command line option "local_apic_timer_c2_ok"

2007-03-26 Thread Thomas Gleixner

On Mon, 2007-03-26 at 12:31 +, Pavel Machek wrote:
> > +   lapic_timer_c2_ok   [IA-32,APIC] trust the local apic timer in
> > +   C2 power state.
> > +
> 
> Could you add comment saying that this is always ok on non-broken
> systems? That way perhaps it can be added to linux-firmware-test-cd,
> etc.

Yep, post .21. 

I am still racking my brain over how to autodetect that in a safe way,
which would make it really useful for the firmware tester.

tglx



Re: [PATCH] PCMCIA: Allow PCMCIA SCSI drivers to be built into the kernel.

2007-03-26 Thread Stefan Richter

On 3/25/2007 7:59 PM, Robert P. J. Day wrote:
> Remove the Kconfig requirement that the PCMCIA SCSI drivers be built
> only as modules, and allow them to be built into the kernel.
> 
> Signed-off-by: Robert P. J. Day <[EMAIL PROTECTED]>
> 
> ---
> 
> i imagine there's a historical reason for these drivers being forced
> to be built only as modules.

(I don't know. Maybe hot-ejection or re-insertion is broken?)

> and i'm not sure whether i should have
> CCed the SCSI folks, the PCMCIA folks, or both.  or whatever.

I'd say, if in doubt send to both of these, instead of LKML.

> compile-tested on x86 with "make allyesconfig".
> 
> diff --git a/drivers/scsi/pcmcia/Kconfig b/drivers/scsi/pcmcia/Kconfig
> index eac8e17..7dd787f 100644
> --- a/drivers/scsi/pcmcia/Kconfig
> +++ b/drivers/scsi/pcmcia/Kconfig
> @@ -3,11 +3,11 @@
>  #
> 
>  menu "PCMCIA SCSI adapter support"
> - depends on SCSI!=n && PCMCIA!=n && MODULES
> + depends on SCSI!=n && PCMCIA!=n
> 
>  config PCMCIA_AHA152X
>   tristate "Adaptec AHA152X PCMCIA support"
> - depends on m && !64BIT
> + depends on !64BIT
>   select SCSI_SPI_ATTRS
>   help
> Say Y here if you intend to attach this type of PCMCIA SCSI host
> @@ -18,7 +18,6 @@ config PCMCIA_AHA152X
> 
>  config PCMCIA_FDOMAIN
>   tristate "Future Domain PCMCIA support"
> - depends on m
>   help
> Say Y here if you intend to attach this type of PCMCIA SCSI host
> adapter to your computer.
> @@ -28,7 +27,7 @@ config PCMCIA_FDOMAIN
> 
>  config PCMCIA_NINJA_SCSI
>   tristate "NinjaSCSI-3 / NinjaSCSI-32Bi (16bit) PCMCIA support"
> - depends on m && !64BIT
> + depends on !64BIT
>   help
> If you intend to attach this type of PCMCIA SCSI host adapter to
> your computer, say Y here and read
> @@ -62,7 +61,6 @@ config PCMCIA_NINJA_SCSI
> 
>  config PCMCIA_QLOGIC
>   tristate "Qlogic PCMCIA support"
> - depends on m
>   help
> Say Y here if you intend to attach this type of PCMCIA SCSI host
> adapter to your computer.
> @@ -72,7 +70,6 @@ config PCMCIA_QLOGIC
> 
>  config PCMCIA_SYM53C500
>   tristate "Symbios 53c500 PCMCIA support"
> - depends on m
>   help
> Say Y here if you have a New Media Bus Toaster or other PCMCIA
> SCSI adapter based on the Symbios 53c500 controller.


-- 
Stefan Richter
-=-=-=== --== ==-=-
http://arcgraph.de/sr/

Re: [patch 1/3] fix illogical behavior in balance_dirty_pages()

2007-03-26 Thread Peter Zijlstra

On Mon, 2007-03-26 at 02:08 -0800, Andrew Morton wrote:
> On Mon, 26 Mar 2007 11:32:47 +0200 Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> 
> > Stopping writers which have idle queues is completely unproductive,
> > and that is basically what the current algorithm does.
> 
> This is because the kernel permits all of its allotment of dirty+writeback
> pages to be dirty+writeback against a single device.
> 
> A good way of solving the one-device-starves-another-one problem is to
> dynamically adjust the per-device dirty+writeback levels so that (for
> example) if two devices are being written to, each gets 50% of the
> allotment.

This is exactly what happens with my patch if both devices write at the
same speed. (Or at least, that is what is supposed to happen ;-)
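
The proportional split discussed above can be sketched as a small
user-space helper. This is not the code from the patch; it only
illustrates the idea, under the assumption that each device's share of the
global dirty+writeback allotment is proportional to its recent writeout
rate. The function name and its parameters are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the share of total_limit (in pages) that
 * device idx may hold dirty or under writeback, in proportion to its
 * recent writeout rate.  If no device has written anything yet, the
 * allotment is split evenly.
 */
unsigned long device_dirty_limit(unsigned long total_limit,
                                 const unsigned long *write_rate,
                                 size_t ndev, size_t idx)
{
    unsigned long long sum = 0;
    size_t i;

    if (ndev == 0 || idx >= ndev)
        return 0;

    for (i = 0; i < ndev; i++)
        sum += write_rate[i];

    if (sum == 0)
        return total_limit / ndev;

    /* Widen before multiplying so the product cannot overflow. */
    return (unsigned long)((unsigned long long)total_limit *
                           write_rate[idx] / sum);
}
```

With two devices writing at the same rate, each gets half of the
allotment; a device that writes three times faster than the other gets
three quarters of it.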


Re: some PCMCIA SCSI drivers can be built only as modules

2007-03-26 Thread Stefan Richter

Robert P. J. Day wrote at LKML:
> drivers/scsi/pcmcia/Kconfig:
> ...
> config PCMCIA_AHA152X
> tristate "Adaptec AHA152X PCMCIA support"
> depends on m && !64BIT
> select SCSI_SPI_ATTRS
> help
>   Say Y here if you intend to attach this type of PCMCIA SCSI host
>   adapter to your computer.
> 
>   To compile this driver as a module, choose M here: the
>   module will be called aha152x_cs.
> ...
> 
>   it would seem to make no sense that the "depends on" clause for this
> option includes "m", forcing this (and all other four entries in that
> Kconfig file, by the way) to be built as modules, while the help text
> for all five entries suggests you can select "y".
> 
>   as jack nicholson would say, something's gotta give.

LSML a.k.a. linux-scsi may know what's gotta give.
-- 
Stefan Richter
-=-=-=== --== ==-=-
http://arcgraph.de/sr/

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread David Howells

Pekka J Enberg <[EMAIL PROTECTED]> wrote:

> We don't touch private mappings at all as they're a snapshot to the inode 
> _before_ it was revoked. So private mappings don't really matter at all: you
> don't see any new data after it has been revoked nor do you flush anything 
> to the disk.

Okay, so that's not a problem.

> Well, assuming we would use revoke for things like SAK, this doesn't 
> really work out too well because all a malicious process has to do is create
> a shared mapping and they've effectively blocked the whole thing.

In NOMMU-mode, there's probably[*] nothing stopping a malicious process
running completely amok and changing stuff directly - even the kernel isn't
guaranteed to be safe - so I wouldn't worry about such a case.

[*] The FRV, for example, does have some limited protection capability - but
it is really limited and not really useful in this case.

> It's antisocial for sure but the only way to guarantee revoke() succeeds on
> a NOMMU setup.  Oh well, lets disable it for now and see if anyone even
> wants revoke() for NOMMU.

Agreed.

David

Re: [PATCH] move die notifier handling to common code

2007-03-26 Thread Christoph Hellwig

On Mon, Mar 26, 2007 at 11:32:09AM +1000, Paul Mackerras wrote:
> Seems OK, although I think kprobes should not be using notify_die.
> The set of events that kprobes is interested in has no intersection at
> all with the set that any other consumer of the notify_die events is
> interested in, on any architecture.
> 
> Furthermore, the multiplexing of the kprobes events through notify_die
> really serves no useful purpose.  It just means that
> kprobe_exceptions_notify has to demultiplex the events with a switch
> statement.  There is no significant common code for all events in
> kprobe_exceptions_notify, just a simple check whether the event
> happened in user mode.
> 
> However, all that is in arch code so can be changed per-arch if
> desired.

I tend to agree.  Unfortunately powers higher than me like these
horrible notifier schemes.  Then again at least the die path is not
performance critical unlike the page fault path where I still need a
comment for you on getting rid of the notifier.

Re: [PATCH 1/4] coredump: add an interface to control the core dump routine

2007-03-26 Thread Kawai, Hidehiro

Hi Pavel, 
Thank you for your reply.
I'm sorry for my late reply.

I discussed with my colleagues why you called my procfs interface
"ugly", and I noticed that I may have misunderstood what you said.
Is it "ugly" because two interfaces, i.e. the preexisting ulimit
(get/setrlimit) and my proc entry, would exist to control core file size?
If so, I'm sorry for taking up your precious time by continuing the
discussion without understanding your point.

If my assumption is right, I don't think it is so ugly, because other
parameters already control core dumping, such as dumpable
(controlled by prctl(2)) and suid_dumpable (controlled via the
fs.suid_dumpable sysctl).
What do you think about that?

Anyway, here are the answers to what you pointed out.

Pavel Machek wrote:

>>This patch adds an interface to set/reset a flag which determines
>>anonymous shared memory segments should be dumped or not when a core
>>file is generated.
>>
>>/proc/<pid>/coredump_omit_anonymous_shared file is provided to access
>>the flag. You can change the flag status for a particular process by
>>writing to or reading from the file.
> 
> Yes. So, you used a very different interface from ulimit,
> which means locking is hard.

I'd like to allow users to change the flag from another process.  So I
have to do locking even if it is hard.  You said before that this
flexibility was not an advantage, but in some cases it is needed.

Consider the case where a process forks many children and they all
share a huge shared memory segment.  Sometimes an end user wants to set
the coredump_omit_anon_shared flag for all of those processes except
one particular child process.  With the ulimit (setrlimit) interface,
the user can't do such a setup without modifying the application
program, and a normal end user will not be able to modify the program.

> Plus, what you are doing can be done in userspace using google
> coredumper.

I think that the needs of userland core dumper users and in-kernel
core dumper users differ.  The pros and cons also differ.

Some people (such as system admins, distro vendors, etc.) need a
highly reliable core dumper because they don't want to experience
the same failures again and they don't want core dumping itself to
cause another failure.  A userland core dumper is useful because
it is relatively easy to customize, but its reliability depends
heavily on the application programs.

If the stack for signal handlers is not set up carefully, if the
data used by the userland core dumper has been destroyed, if the
coredump_omit_anon_shared flag has been overwritten by bad data,
or if the addresses of functions have been destroyed, the userland
core dumper may fail to dump.  So an in-kernel solution is required
by enterprise users.

>>+ if (down_write_trylock(&coredump_settings_sem)) {
>>+ set_coredump_omit_anon_shared(mm, (val != 0));
>>+ up_write(&coredump_settings_sem);
>>+ } else
>>+ ret = -EBUSY;
> 
> Now this is an ugly interface. "If user tries to write to /proc file
> while it is locked, return him spurious error."

I'm considering using the previous argument passing approach (which saves
the setting value in a local variable and then passes it to the core dump
routines) or another approach which introduces a per-process flag to
indicate that a core dump is ongoing.  Neither of these approaches
produces spurious errors.
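
A minimal user-space sketch of the argument passing approach, to show why
it needs neither a semaphore nor an -EBUSY return.  It is not the patch
itself: mm_sketch, set_omit_anon_shared() and dump_segments() are
hypothetical names, and C11 atomics stand in for whatever primitive the
kernel code would actually use.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-process state holding the flag. */
struct mm_sketch {
    atomic_bool omit_anon_shared;
};

/* Writer side (the /proc write): never blocks and never fails. */
void set_omit_anon_shared(struct mm_sketch *mm, bool val)
{
    atomic_store(&mm->omit_anon_shared, val);
}

/* The decision uses only the value passed in, not the live flag. */
static bool should_dump_segment(bool is_anon_shared, bool omit)
{
    return !(is_anon_shared && omit);
}

/*
 * Dump side: read the flag once into a local variable, then pass that
 * snapshot down.  A concurrent write cannot change the decision in the
 * middle of one dump.  Returns the number of segments dumped.
 */
int dump_segments(struct mm_sketch *mm, const bool *is_anon_shared, int n)
{
    const bool omit = atomic_load(&mm->omit_anon_shared);
    int dumped = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (should_dump_segment(is_anon_shared[i], omit))
            dumped++;
    }
    return dumped;
}
```

A write that arrives while a dump is in progress simply takes effect for
the next dump, so the writer never sees a spurious error.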

>>@@ -75,6 +77,8 @@ extern int suid_dumpable;
>> #define SUID_DUMP_USER   1   /* Dump as user of process */
>> #define SUID_DUMP_ROOT   2   /* Dump as root */
>> 
>>+extern struct rw_semaphore coredump_settings_sem;
>>+
>> /* Stack area protections */
>> #define EXSTACK_DEFAULT   0  /* Whatever the arch defaults to */
>> #define EXSTACK_DISABLE_X 1  /* Disable executable stacks */
> 
> Yep, very nice, if you used interface suited for the task (ulimit),
> you'd not have to invent new locking like this.

Using the above-stated approach, this semaphore becomes unnecessary.

Thanks,
-- 
Hidehiro Kawai
Hitachi, Ltd., Systems Development Laboratory


Re: [PATCH] clockevents: remove bad designed sysfs support for now

2007-03-26 Thread Pavel Machek

On Mon 2007-03-26 11:21:08, Thomas Gleixner wrote:
> The current sysfs support of clockevents does not obey the "only one
> value per file" rule.
> 
> The real fix is not 2.6.21 material. Therefore remove the sysfs support
> for now.
> 
> Signed-off-by: Thomas Gleixner <[EMAIL PROTECTED]>

Thanks!

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [2.6.22 patch] more scheduled OSS driver removal

2007-03-26 Thread Jan Engelhardt

On Mar 26 2007 00:16, Lee Revell wrote:
>
> I guess he's referring to the well known "Master volume only controls
> front output" problem.  This really does need to be resolved, as many
> other ALSA drivers are affected.

I don't see that as a bug. Mine is a TerraTec DMX XFire 1024 (snd-cs46xx).

Master maps to front, Headphone to rear. PCM is the overall volume of 
digital audio (i.e. everything I can count on my fingers). Of course PCM
is not "everything"; it does not include Line Out, for example. But it
does not on Windows either (where there is also no "everything" control
in sndvol32), so it is not a bug for me.

Jan
-- 

Re: 2.6.21-rc1 and 2.6.21-rc2 kwin dies silently

2007-03-26 Thread boris

On Thu, Mar 22, 2007 at 07:49:16PM +, Sid Boyce wrote:

> Kernel built and installed, so I shall have something to report in the
> next couple of days.

I see this kwin bug too
on a Thinkpad (x86_64) as well as on a Pmac (powerpc64).
It is only triggered when the screen is locked AND! set
to standby mode (display/backlight is turned off - acpi).

boris


Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Pekka J Enberg

Hi,

Pekka J Enberg <[EMAIL PROTECTED]> wrote:
> > revoke_mapping() is mostly same as munmap(2) except that it preserves the 
> > vma but makes it VM_REVOKED. This means that if the process tries to 
> > access the region it will SIGBUS and if it tries to remap the range it 
> > will get EINVAL.

On Mon, 26 Mar 2007, David Howells wrote:
> Yeah, that's not enforceable in NOMMU-mode situations.  I presume it differs
> from munmap() also in that it can effectively be forced by one process upon
> another.

Yes.

On Mon, 26 Mar 2007, David Howells wrote:
> In MMU-mode, how does this work with private mappings that have some private
> copies of the pages that make up the mapping?  Are those still available to a
> process that is using them?  Are they revoked when swapped out?  Or are they
> forcibly evicted?

We don't touch private mappings at all as they're a snapshot to the inode 
_before_ it was revoked. So private mappings don't really matter at all: you
don't see any new data after it has been revoked nor do you flush anything 
to the disk.

Pekka J Enberg <[EMAIL PROTECTED]> wrote:
> >   - If there are shared mappings, always return -ENOENT for revoke(2).
 
On Mon, 26 Mar 2007, David Howells wrote:
> That sounds feasible.  How about -ETXTBSY instead?

Well, assuming we would use revoke for things like SAK, this doesn't 
really work out too well because all a malicious process has to do is create
a shared mapping and they've effectively blocked the whole thing.

Pekka J Enberg <[EMAIL PROTECTED]> wrote:
> >   - If there are shared mappings, immediately raise SIGBUS for those 
> > processes that are accessing it.

On Mon, 26 Mar 2007, David Howells wrote:
> Hmmm... maybe.  That sounds a bit antisocial though, but is also 
> workable.

It's antisocial for sure but the only way to guarantee revoke() succeeds 
on a NOMMU setup. Oh well, let's disable it for now and see if anyone even
wants revoke() for NOMMU.

Pekka J Enberg <[EMAIL PROTECTED]> wrote:
> Does the SIGBUS raised have its own si_code, btw?  Perhaps BUS_REVOKED?

That's a good idea. I'll add one.

Pekka

Re: 2.6.21-rc4: known regressions with patches (v2)

2007-03-26 Thread Bob Tracy

Adrian Bunk wrote:
> Subject: boot hangs during IDE detection  (clocksource)
> References : http://lkml.org/lkml/2007/3/19/465
> Submitter  : Bob Tracy <[EMAIL PROTECTED]>
> Caused-By  : John Stultz <[EMAIL PROTECTED]>
>  commit 6bb74df481223731af6c7e0ff3adb31f6442cfcd
> Handled-By : John Stultz <[EMAIL PROTECTED]>
> Patch  : http://lkml.org/lkml/2007/3/22/287
> Status : workaround-patch available

The subject problem is fixed *without* the workaround patch in
2.6.21-rc5.  The acpi_pm clocksource patch for the case where the
PIIX4 bug is present should probably be included anyway.

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---

Re: [PATCH 09/20 take 4] UBI: wear-leveling unit

2007-03-26 Thread Frank Haverkamp

Hi,

I wonder if a generic wear-leveling infrastructure makes sense. Artem is
showing us here his example of how he is attacking the problem for UBI.

The wear-leveling described here is only one approach out of many
possible. A different one, I think, is used where e.g. filesystems do
their own approach, because they have deeper knowledge on how the blocks
are used.
I think this is something special and out of the scope of what I try to
point out here.

Artem's approach for UBI, as far as I can see, currently uses these hints:
erase counts per block, a sequence number, and a full view of all existing
blocks. When a block is requested, the user can give one of the following
hints: LONGTERM, UNKNOWN, or SHORTTERM. Internally it has trees containing
free blocks, used blocks, and blocks to scrub.

Let me try to illustrate what I think a generic approach might look
like:

/* Field, parameter and return types below are placeholders; the sketch
   leaves them open. */
struct wlevel_block {
	unsigned int phys_block_num;
	unsigned int erase_count;
	unsigned long long sequence_number;
	... /* which other hints are reasonable?
	       maybe other devices -> other hints? */
};

struct wlevel {
	int (*erase_block)(unsigned int phys_num, void *priv_data); /* callback */
	/* more callbacks needed? */
	...
	void *priv_data;
};

int wlevel_init(struct wlevel *wl);
/* add block info, e.g. during scan */
int wlevel_add(struct wlevel *wl, struct wlevel_block *bi);
/* free/erase a block */
int wlevel_put(struct wlevel *wl, struct wlevel_block *block);
int wlevel_scrub(struct wlevel *wl, struct wlevel_block *block);

enum characteristics {
	WL_DATA_LONGTERM,
	WL_DATA_UNKNOWN,
	WL_DATA_SHORTTERM,
};

struct wlevel_block *wlevel_get(struct wlevel *wl, enum characteristics ch);
...
void wlevel_destroy(struct wlevel *wl);

I hope that throwing this topic into the discussion does not lead to
more itzi-bitsi-ness and obsolete interfaces in the code (which were
just removed), but it might be worth a discussion on the side to
explore whether other people are trying to solve problems similar to the
one we are tackling here for FLASH, and whether a common approach to this
problem makes sense or not - and, if so, what it could look like.

Regards,

Frank


Re: [3/5] 2.6.21-rc4: known regressions (v2)

2007-03-26 Thread Pavel Machek

Hi!

> > Subject: ThinkPad R60: suspend to disk broken
> > References : http://lkml.org/lkml/2007/3/23/74
> > Submitter  : Marcus Better <[EMAIL PROTECTED]>
> > Status : submitter tries to bisect
> 
> I just tried -rc5. Now suspend to disk seems to work. I think the XFS 
> workqueue patch fixed this.
> 
> It can also suspend to RAM, but resume is worse. The first time around it 
> resumed but corrupted the vesafb console (greenish blinking character cells), 
> something that used to work before. But the system responded to input, so I 
> suspended to RAM again. This time the resume failed, it hung after 
> printing "Linux!" in yellow at the top of the screen. (Seems to be some 
> artifact, I have seen it before even with working suspend.)

Yellow Linux! is my debugging trick. It should be there, but it should
also disappear quickly.

Try vga=0 ... text console seems to work for you.
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [PATCH] i386: add command line option "local_apic_timer_c2_ok"

2007-03-26 Thread Pavel Machek

Hi!

> It turned out that it is almost impossible to trust ACPI, BIOS & Co.
> regarding the C states. This was the reason to switch the local apic
> timer off in C2 state already. OTOH there are sane and well behaving
> systems, which get punished by that decision.
> 
> Allow the user to confirm that the local apic timer is trustworthy in C2
> state. This keeps the default behaviour on the safe side.
...
> @@ -780,6 +780,9 @@ and is between 256 and 4096 characters. It is defined in the file
>   lapic   [IA-32,APIC] Enable the local APIC even if BIOS
>   disabled it.
>  
> + lapic_timer_c2_ok   [IA-32,APIC] trust the local apic timer in
> + C2 power state.
> +

Could you add a comment saying that this is always ok on non-broken
systems? That way perhaps it can be added to linux-firmware-test-cd,
etc.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Linux 2.6.21-rc5

2007-03-26 Thread Thomas Gleixner

On Mon, 2007-03-26 at 07:25 -0500, Bob Tracy wrote:
> Thomas Gleixner wrote:
> > This fix from John Stultz is still missing:
> > 
> > http://lkml.org/lkml/2007/3/22/287
> > 
> > It's in Andrews queue already and waits to be sent to you.
> 
> In summary, that fix is a workaround to allow the acpi_pm clocksource
> to be selected instead of the pit clocksource, thereby allowing my
> Dell laptop with the PIIX4 bug to boot.  Other apic, clocksource, etc.
> patches that were included in -rc5 fixed the problem that caused the
> boot process to hang when the pit clocksource was selected, as I
> suspected would be the case :-).

Ah. Ok

> Per John's message in the above URL, while the fix is no longer needed
> for allowing the laptop to boot, it's probably still "a good thing" to
> allow a better clocksource to be selected.

Yes. Even the read-three-times pmtimer is faster and more reliable than the
PIT.
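
The "read three times" idea can be sketched as follows: sample the counter
three times in a row and retry until the samples are in a consistent order,
so that a single bogus read is thrown away. This is a userspace illustration
only. The function-pointer interface, the replay helper and all names here
are hypothetical, and the 24-bit mask is an assumption about the counter
width; it is not a copy of the acpi_pm driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PM_TIMER_MASK 0xFFFFFFu	/* assumed 24-bit counter */

typedef uint32_t (*pm_read_fn)(void *ctx);

/*
 * Read the timer three times and accept the middle value only when the
 * three samples are in (cyclic) increasing order; otherwise retry.
 */
static uint32_t pm_read_verified(pm_read_fn raw_read, void *ctx)
{
	uint32_t v1, v2, v3;

	assert(raw_read != NULL);
	do {
		v1 = raw_read(ctx) & PM_TIMER_MASK;
		v2 = raw_read(ctx) & PM_TIMER_MASK;
		v3 = raw_read(ctx) & PM_TIMER_MASK;
	} while ((v1 > v2 && v1 < v3) ||
		 (v2 > v3 && v2 < v1) ||
		 (v3 > v1 && v3 < v2));
	return v2;
}

/* Test double: replays a fixed list of samples instead of touching hardware. */
struct pm_replay {
	const uint32_t *samples;
	size_t n;
	size_t pos;
};

static uint32_t pm_replay_read(void *ctx)
{
	struct pm_replay *r = ctx;

	assert(r != NULL && r->n > 0);
	return r->samples[r->pos++ % r->n];
}
```

With the samples 100, 50, 101, 102, 103, 104 the first triple is rejected
because of the glitched middle read, and the second triple yields 103.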

tglx




Re: Linux 2.6.21-rc5

2007-03-26 Thread Bob Tracy

Thomas Gleixner wrote:
> This fix from John Stultz is still missing:
> 
> http://lkml.org/lkml/2007/3/22/287
> 
> It's in Andrews queue already and waits to be sent to you.

In summary, that fix is a workaround to allow the acpi_pm clocksource
to be selected instead of the pit clocksource, thereby allowing my
Dell laptop with the PIIX4 bug to boot.  Other apic, clocksource, etc.
patches that were included in -rc5 fixed the problem that caused the
boot process to hang when the pit clocksource was selected, as I
suspected would be the case :-).

Per John's message in the above URL, while the fix is no longer needed
for allowing the laptop to boot, it's probably still "a good thing" to
allow a better clocksource to be selected.

-- 
---
Bob Tracy   WTO + WIPO = DMCA? http://www.anti-dmca.org
[EMAIL PROTECTED]
---

Re: revoke: no revoke for nommu

2007-03-26 Thread David Howells

Pekka J Enberg <[EMAIL PROTECTED]> wrote:

> There's just no sane way to revoke shared memory mappings for NOMMU so let's
> disable the thing completely when CONFIG_MMU=n.

I think that's reasonable for now - we can always add support as far as
possible later.

David

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread David Howells

Pekka J Enberg <[EMAIL PROTECTED]> wrote:

> > I don't know, what does it do?  Remember, once a NOMMU process thinks it
> > has the right to access a mapping, there's no way of stopping it doing so
> > short of killing the process.
> 
> revoke_mapping() is mostly the same as munmap(2) except that it preserves the 
> vma but makes it VM_REVOKED. This means that if the process tries to 
> access the region it will SIGBUS and if it tries to remap the range it 
> will get EINVAL.

Yeah, that's not enforceable in NOMMU-mode situations.  I presume it differs
from munmap() also in that it can effectively be forced by one process upon
another.

In MMU-mode, how does this work with private mappings that have some private
copies of the pages that make up the mapping?  Are those still available to a
process that is using them?  Are they revoked when swapped out?  Or are they
forcibly evicted?

> What we're trying to do here is, we want to make sure no other tasks can 
> access the inode once it has been revoked.

Okay.

> So there's no way to raise SIGBUS if the range is being accessed. The 
> alternatives are:
> 
>   - No support for revoke(2) on NOMMU.

That's a bit over the top, I think.  It sounds like revoke() is perfectly fine
- as long as there aren't any mappings on the target inode (or at least shared
mappings - dunno about private mappings).

>   - If there are shared mappings, always return -ENOENT for revoke(2).

That sounds feasible.  How about -ETXTBSY instead?

>   - If there are shared mappings, immediately raise SIGBUS for those 
> processes that are accessing it.

Hmmm... maybe.  That sounds a bit antisocial though, but is also workable.
Does the SIGBUS raised have its own si_code, btw?  Perhaps BUS_REVOKED?

> Making the shared mappings private is not an option because there's no way 
> for the process to know that its mapping is being pulled out from under it, 
> which will result in bugs. Hmm.

Agreed.

David

revoke: no revoke for nommu

2007-03-26 Thread Pekka J Enberg

From: Pekka Enberg <[EMAIL PROTECTED]>

There's just no sane way to revoke shared memory mappings for NOMMU so let's
disable the thing completely when CONFIG_MMU=n.

Cc: Bryan Wu <[EMAIL PROTECTED]> 
Cc: David Howells <[EMAIL PROTECTED]>
Cc: Alan Cox <[EMAIL PROTECTED]>
Signed-off-by: Pekka Enberg <[EMAIL PROTECTED]>
---
 fs/Makefile |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: uml-2.6/fs/Makefile
===
--- uml-2.6.orig/fs/Makefile 2007-03-26 15:08:42.0 +0300
+++ uml-2.6/fs/Makefile 2007-03-26 15:09:03.0 +0300
@@ -11,7 +11,7 @@ obj-y :=  open.o read_write.o file_table.
attr.o bad_inode.o file.o filesystems.o namespace.o aio.o \
seq_file.o xattr.o libfs.o fs-writeback.o \
pnode.o drop_caches.o splice.o sync.o utimes.o \
-   stack.o revoke.o revoked_inode.o
+   stack.o
 
 ifeq ($(CONFIG_BLOCK),y)
 obj-y +=   buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o
@@ -19,6 +19,7 @@ else
 obj-y +=   no-block.o
 endif
 
+obj-$(CONFIG_MMU)  += revoke.o revoked_inode.o
 obj-$(CONFIG_INOTIFY)  += inotify.o
 obj-$(CONFIG_INOTIFY_USER) += inotify_user.o
 obj-$(CONFIG_EPOLL)+= eventpoll.o

Re: [PATCH 1/2] MSR: Add support for safe variants

2007-03-26 Thread Jean Delvare

Hi Mikael,

On Mon, 26 Mar 2007 13:57:29 +0200 (MEST), Mikael Pettersson wrote:
> On Mon, 26 Mar 2007 13:29:37 +0200, Jean Delvare wrote:
> > * * * * * Updated patch * * * * *
> > 
> > From: Rudolf Marek <[EMAIL PROTECTED]>
> > 
> > Add safe (exception handled) variants of rdmsr_on_cpu and wrmsr_on_cpu.
> > You should use these when the target MSR may not actually exist, as
> > doing so could trigger an exception which the regular functions do not
> > handle. The safe variants are slower, though.
> > 
> > The upcoming coretemp hardware monitoring driver will need this.
> 
> Maybe I'm in the minority here, but I for one strongly believe
> that any attempt to access an MSR "which might not be there" is
> inherently wrong. It implies that your HW detection is incomplete,
> which in combination with MSR accesses means that you may end up
> accessing MSRs that aren't at all what you think they should be.

Hopefully CPU manufacturers are not that stupid and don't implement
MSRs using the same number and doing different things in CPU models
which are otherwise similar enough for one driver to attempt to handle
them both. But of course it's probably only a matter of time before I am
proven wrong...

I agree with you that accessing an MSR which might not be there should
be avoided where possible and only used as a last resort. But until
technical documentation is perfectly correct for all CPUs out there,
there will always be cases where we need to do that.
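
The patch under discussion adds in-kernel variants, but the same "report an
error instead of faulting" behaviour can be illustrated from userspace. The
sketch below assumes a Linux system with the msr driver loaded, which exposes
/dev/cpu/N/msr and fails the read (typically with EIO) when the register is
not implemented. msr_read_checked() is a hypothetical helper name, not an
existing API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Read one MSR on the given CPU through the msr character device.
 * Returns 0 and stores the value in *val on success, or a negative
 * errno value if the device is missing, access is denied, or the
 * register does not exist.
 */
static int msr_read_checked(unsigned int cpu, uint32_t reg, uint64_t *val)
{
	char path[64];
	ssize_t n;
	int fd;
	int err = 0;

	assert(val != NULL);

	snprintf(path, sizeof(path), "/dev/cpu/%u/msr", cpu);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* The file offset selects the MSR number; each read is 8 bytes. */
	n = pread(fd, val, sizeof(*val), (off_t)reg);
	if (n < 0)
		err = -errno;
	else if (n != (ssize_t)sizeof(*val))
		err = -EIO;

	close(fd);
	return err;
}
```

A caller can probe a poorly documented register by checking the return value
and falling back gracefully, which is the last-resort usage described above.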

> Who supplies these imprecise MSR definitions anyway?
> Intel manuals? ACPI?

Intel. Rudolf will know the details better.

-- 
Jean Delvare

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Alan Cox

> With NOMMU as it stands, private mappings are private copies of the data, and
> have no impact on the page cache and get no updates from it.  It's as if you
> took a private writable mapping, touched every page and then mprotect()'d it.
> This isn't necessarily ideal, but we're limited by the lack of an MMU.

Given the MMUless kernel has no security model, the easiest option is probably to
simply not support revoke() of mmap areas on NOMMU (or maybe not to
support revoke at all).

Alan

Re: [PATCH 1/2] MSR: Add support for safe variants

2007-03-26 Thread Mikael Pettersson

On Mon, 26 Mar 2007 13:29:37 +0200, Jean Delvare wrote:
> * * * * * Updated patch * * * * *
> 
> From: Rudolf Marek <[EMAIL PROTECTED]>
> 
> Add safe (exception handled) variants of rdmsr_on_cpu and wrmsr_on_cpu.
> You should use these when the target MSR may not actually exist, as
> doing so could trigger an exception which the regular functions do not
> handle. The safe variants are slower, though.
> 
> The upcoming coretemp hardware monitoring driver will need this.

Maybe I'm in the minority here, but I for one strongly believe
that any attempt to access an MSR "which might not be there" is
inherently wrong. It implies that your HW detection is incomplete,
which in combination with MSR accesses means that you may end up
accessing MSRs that aren't at all what you think they should be.

Who supplies these imprecise MSR definitions anyway?
Intel manuals? ACPI?

/Mikael

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread David Howells

Andrew Morton <[EMAIL PROTECTED]> wrote:

> I'll touch up the changelog for nommu-hide-vm_mm-in-nommu-mode.patch and then
> I'll temporarily drop it so the blackfin guys can test their work, I guess.

Thanks.

As I said, I'm also not sure that revocation of VMAs is supportable on NOMMU,
so the thing to do may be to hide it entirely if CONFIG_MMU=n.

David

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Pekka J Enberg

On Mon, 26 Mar 2007, David Howells wrote:
> I don't know, what does it do?  Remember, once a NOMMU process thinks it has
> the right to access a mapping, there's no way of stopping it doing so short of
> killing the process.

revoke_mapping() is mostly the same as munmap(2) except that it preserves the 
vma but makes it VM_REVOKED. This means that if the process tries to 
access the region it will SIGBUS and if it tries to remap the range it 
will get EINVAL.

What we're trying to do here is, we want to make sure no other tasks can 
access the inode once it has been revoked.

On Mon, 26 Mar 2007, David Howells wrote:
> With NOMMU as it stands, private mappings are private copies of the data, and
> have no impact on the page cache and get no updates from it.  It's as if you
> took a private writable mapping, touched every page and then mprotect()'d it.
> This isn't necessarily ideal, but we're limited by the lack of an MMU.

So there's no way to raise SIGBUS if the range is being accessed. The 
alternatives are:

  - No support for revoke(2) on NOMMU.
  - If there are shared mappings, always return -ENOENT for revoke(2).
  - If there are shared mappings, immediately raise SIGBUS for those 
processes that are accessing it.

Making the shared mappings private is not an option because there's no way 
for the process to know that its mapping is being pulled out from under it, 
which will result in bugs. Hmm.

Pekka

Re: [PATCH 1/2] MSR: Add support for safe variants

2007-03-26 Thread Andrew Morton

On Mon, 26 Mar 2007 13:29:37 +0200 Jean Delvare <[EMAIL PROTECTED]> wrote:

> The patch from Rudolf Marek which I am posting here builds on top of
> what is already in Linus' tree. Taking it in your tree should not cause
> any problem.

OK, thanks - I'll add this then I'll un-revert the patch which uses it.

Re: [PATCH 0/21] MSI rework

2007-03-26 Thread Eric W. Biederman

Michael Ellerman <[EMAIL PROTECTED]> writes:

> OK. For starters, do you want to review the first eleven as I've sent
> them already, that saves spamming everyone again.
>
> If you're OK with those eleven, then I'll send the remaining 10 or so
> later in the week, broken up into (sort-of) functional groups.

Sounds like a decent plan.  At a quick glance there is a reasonable chance
I'm going to run into issues with patch 9 or 10.  Let me get as far as I can
with the acks and then we can see where we go with the next round of patches.

Eric

Re: [PATCH -mm] Blackfin arch: cleanup cache header file

2007-03-26 Thread Paul Mundt

On Mon, Mar 26, 2007 at 06:11:42PM +0800, Wu, Bryan wrote:
> +#define L1_CACHE_SHIFT   5
> +#define L1_CACHE_BYTES   (1 << L1_CACHE_SHIFT)
>  
> -/* For speed we do need to align these ...MaTed---*/
> -/*  But include/linux/cache.h does this for us if we DO not define ...MaTed---*/
> -#define __cacheline_aligned  /* maybe no need this   Tony */
> +/*
> + * Don't make __cacheline_aligned and
> + * cacheline_aligned defined in include/linux/cache.h
> + */
> +#define __cacheline_aligned
>  #define cacheline_aligned
>  
You still don't need this. Ancient versions of gcc had problems with the
attribute, but it's not even possible to build the kernel with those
anymore. Please remove these and try again. You can simply alias
SMP_CACHE_BYTES to L1_CACHE_BYTES if you've left this in due to the
resulting build failure.
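
The aliasing suggested above can be sketched like this. It is a standalone
illustration rather than the Blackfin header: sketch_cacheline_aligned and
struct hot_counter are made-up names, and the aligned attribute assumes GCC
or a compatible compiler.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

#define L1_CACHE_SHIFT	5
#define L1_CACHE_BYTES	(1 << L1_CACHE_SHIFT)

/* Alias instead of blanking out the alignment macros. */
#define SMP_CACHE_BYTES	L1_CACHE_BYTES

/* Local stand-in for a cacheline alignment annotation. */
#define sketch_cacheline_aligned \
	__attribute__((__aligned__(SMP_CACHE_BYTES)))

/* Example of a structure that should own a whole cache line. */
struct hot_counter {
	unsigned long value;
} sketch_cacheline_aligned;

/* Checked at compile time: alignment and padding follow the line size. */
static_assert(alignof(struct hot_counter) == L1_CACHE_BYTES,
	      "hot_counter must be cache-line aligned");
static_assert(sizeof(struct hot_counter) % L1_CACHE_BYTES == 0,
	      "hot_counter must fill whole cache lines");
```

With the alias in place, the alignment attribute actually takes effect
instead of expanding to nothing.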

m68knommu seems to be another user that never got cleaned up, perhaps
it's a good time to kill that off too..

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Andrew Morton

On Mon, 26 Mar 2007 12:25:18 +0100 David Howells <[EMAIL PROTECTED]> wrote:

> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > Offending patch is
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc4/2.6.21-rc4-mm1/broken-out/nommu-hide-vm_mm-in-nommu-mode.patch,
> > which seems rather dumb.  Or at least, its changelog does a good job of
> > making it look dumb.
> 
> vm_mm is always NULL under NOMMU as it currently stands.  As far as I know,
> this has been true since the NOMMU mm stuff was first included (I'm not sure
> the VMAs of the first NOMMU mm code *had* a vm_mm).  Hugh (I think it was)
> suggested that since this was always NULL, then it should be excised from the
> struct in NOMMU-mode.
> 
> The reason is that, at the moment, VMAs are a global *shared* resource in
> NOMMU-mode.  Each process has a list of global VMAs that it subscribes to, but
> that's it.  This (a) slightly reduces the amount of metadata allocated
> (possibly), and (b) makes sharing of executables and libraries much easier.

whoa.  You live and learn.  Logical, I guess.

I agree that in that case, we just don't want vm_mm to exist in NOMMU builds -
it's better to fail at compile time.

I'll touch up the changelog for nommu-hide-vm_mm-in-nommu-mode.patch and then
I'll temporarily drop it so the blackfin guys can test their work, I guess.

Re: [PATCH] Fix race between attach_task and cpuset_exit

2007-03-26 Thread Srivatsa Vaddagiri

On Sun, Mar 25, 2007 at 12:50:25PM -0700, Paul Jackson wrote:
> Is there perhaps another race here? 

Yes, there is!

Modified patch below. Compile/boot tested on an x86_64 box.


Currently cpuset_exit() changes the exiting task's ->cpuset pointer w/o
taking task_lock(). This can lead to ugly races between attach_task and
cpuset_exit. Details of the races are described at
http://lkml.org/lkml/2007/3/24/132.

Patch below closes those races. It is against 2.6.21-rc4 and has
undergone a simple compile/boot test on an x86_64 box.

Signed-off-by : Srivatsa Vaddagiri <[EMAIL PROTECTED]>


---


diff -puN kernel/cpuset.c~cpuset_race_fix kernel/cpuset.c
--- linux-2.6.21-rc4/kernel/cpuset.c~cpuset_race_fix 2007-03-25 21:08:27.0 +0530
+++ linux-2.6.21-rc4-vatsa/kernel/cpuset.c  2007-03-26 16:48:24.0 +0530
@@ -1182,6 +1182,7 @@ static int attach_task(struct cpuset *cs
pid_t pid;
struct task_struct *tsk;
struct cpuset *oldcs;
+   struct cpuset *oldcs_to_be_released = NULL;
cpumask_t cpus;
nodemask_t from, to;
struct mm_struct *mm;
@@ -1237,6 +1238,8 @@ static int attach_task(struct cpuset *cs
}
atomic_inc(&cs->count);
rcu_assign_pointer(tsk->cpuset, cs);
+   if (atomic_dec_and_test(&oldcs->count))
+   oldcs_to_be_released = oldcs;
task_unlock(tsk);
 
guarantee_online_cpus(cs, &cpus);
@@ -1257,8 +1260,8 @@ static int attach_task(struct cpuset *cs
 
put_task_struct(tsk);
synchronize_rcu();
-   if (atomic_dec_and_test(&oldcs->count))
-   check_for_release(oldcs, ppathbuf);
+   if (oldcs_to_be_released)
+   check_for_release(oldcs_to_be_released, ppathbuf);
return 0;
 }
 
@@ -2200,10 +2203,6 @@ void cpuset_fork(struct task_struct *chi
  * it is holding that mutex while calling check_for_release(),
  * which calls kmalloc(), so can't be called holding callback_mutex().
  *
- * We don't need to task_lock() this reference to tsk->cpuset,
- * because tsk is already marked PF_EXITING, so attach_task() won't
- * mess with it, or task is a failed fork, never visible to attach_task.
- *
  * the_top_cpuset_hack:
  *
  *Set the exiting tasks cpuset to the root cpuset (top_cpuset).
@@ -2241,20 +2240,23 @@ void cpuset_fork(struct task_struct *chi
 void cpuset_exit(struct task_struct *tsk)
 {
struct cpuset *cs;
+   struct cpuset *oldcs_to_be_released = NULL;
 
+   task_lock(tsk);
cs = tsk->cpuset;
tsk->cpuset = &top_cpuset;  /* the_top_cpuset_hack - see above */
+   if (atomic_dec_and_test(&cs->count))
+   oldcs_to_be_released = cs;
+   task_unlock(tsk);
 
if (notify_on_release(cs)) {
char *pathbuf = NULL;
 
mutex_lock(&manage_mutex);
-   if (atomic_dec_and_test(&cs->count))
-   check_for_release(cs, &pathbuf);
+   if (oldcs_to_be_released)
+   check_for_release(oldcs_to_be_released, &pathbuf);
mutex_unlock(&manage_mutex);
cpuset_release_agent(pathbuf);
-   } else {
-   atomic_dec(&cs->count);
}
 }
 
_
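
The locking pattern in the patch above can be sketched outside the kernel
roughly as follows. This is not the cpuset code: struct group, struct task
and task_switch_group() are hypothetical stand-ins, and a C11 atomic_flag
spinlock stands in for task_lock(). The only point shown is that the pointer
swap and the reference drop happen under the same lock, while the expensive
release work is left to the caller after the lock is dropped.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct group {
	atomic_int count;	/* number of tasks attached */
};

struct task {
	atomic_flag lock;	/* stand-in for task_lock() */
	struct group *grp;
};

static void task_lock(struct task *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock,
						 memory_order_acquire))
		;		/* spin until the flag is clear */
}

static void task_unlock(struct task *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/*
 * Move the task to new_grp. Returns the old group if this call dropped
 * its last reference, so the caller can do release work without holding
 * the task lock; returns NULL otherwise.
 */
static struct group *task_switch_group(struct task *t, struct group *new_grp)
{
	struct group *old;
	struct group *to_release = NULL;

	assert(t != NULL && new_grp != NULL);

	task_lock(t);
	old = t->grp;
	assert(old != NULL);
	atomic_fetch_add(&new_grp->count, 1);
	t->grp = new_grp;
	/* atomic_fetch_sub() returns the previous value: 1 means now zero. */
	if (atomic_fetch_sub(&old->count, 1) == 1)
		to_release = old;
	task_unlock(t);

	return to_release;
}
```

Because both the attach path and the exit path would go through the same
locked section, neither can observe a half-updated pointer or drop a
reference the other one is still counting on.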


-- 
Regards,
vatsa

Re: forced umount?

2007-03-26 Thread Pozsar Balazs

On Sun, Mar 18, 2007 at 08:16:19PM +0100, Arjan van de Ven wrote:
> On Fri, 2007-03-16 at 23:06 -0500, Mike Snitzer wrote:
> > I'm interested in understanding the state of Linux with regard to
> > _really_ forcing a filesystem to unmount.
> > 
> > There is a (stale) project at OSDL that has various implementations:
> > http://developer.osdl.org/dev/fumount/
> 
> 
> the problem with the people who say they want forced umount is.. that
> most of the time they either want
> 1) get rid of the namespace entry
> or
> 2) want to stop any and all IO to a certain device/partition 
> 
> 1) is already supported with lazy umount (umount -l)
> for 2), it's not forced umount that they want, it's really an IO
> disconnect (which scsi supports btw in 2.6 kernels).


Could you please tell me more about this IO disconnect?
How to trigger it etc, any pointers welcome.


thanks,
-- 
pozsy

Re: [Patch 3/7] integrity: EVM as an integrity service provider

2007-03-26 Thread Mimi Zohar

On Sun, 2007-03-25 at 21:28 -0800, Andrew Morton wrote:
> On Sun, 25 Mar 2007 23:13:02 -0400 Mimi Zohar <[EMAIL PROTECTED]> wrote:
> 
> > On Sun, 2007-03-25 at 00:16 -0800, Andrew Morton wrote:
> > > On Fri, 23 Mar 2007 12:09:36 -0400 Mimi Zohar <[EMAIL PROTECTED]> wrote:
> > > 
> > > > +++ linux-2.6.21-rc4-mm1/security/evm/Kconfig
> > > > @@ -0,0 +1,17 @@
> > > > +config INTEGRITY_EVM
> > > > +   boolean "EVM support"
> > > > +   depends on INTEGRITY && KEYS
> > > > +   select CRYPTO_HMAC
> > > > +   select CRYPTO_MD5
> > > > +   select CRYPTO_SHA1
> > > > +   default 0
> > > > +   help
> > > > + The Extended Verification Module is an integrity provider.
> > > > + An extensible set of extended attributes, as defined in
> > > > + /etc/evm.conf, are HMAC protected against modification
> > > > + using the TPM's KERNEL ROOT KEY, if configured, or with a
> > > > + pass-phrase.  Possible extended attributes include authenticity,
> > > > + integrity, and revision level.
> > > > +
> > > > + If you are unsure how to answer this question, answer N.
> > > > +
> > > 
> > > Is no dependency upon TPM needed?
> > 
> > It's obviously preferable to have and use a TPM, but if one is not
> > available you can use a pass-phrase.
> > 
> 
> So it will compile and run OK with CONFIG_TPM=n?  And with
> CONFIG_INTEGRITY_EVM=y, CONFIG_TPM=m?

Sorry, I guess I wasn't clear.  If you are using a TPM, then it 
has to be builtin.  In addition, if you don't enable a TPM, then
you can't enable IMA either.

Mimi 


Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread David Howells

Pekka J Enberg <[EMAIL PROTECTED]> wrote:

> But what's more important is, can we do revoke_mapping() for NOMMU? AFAICT 
> we can, we just need to scan all the global vmas, right?

I don't know, what does it do?  Remember, once a NOMMU process thinks it has
the right to access a mapping, there's no way of stopping it doing so short of
killing the process.

With NOMMU as it stands, private mappings are private copies of the data, and
have no impact on the page cache and get no updates from it.  It's as if you
took a private writable mapping, touched every page and then mprotect()'d it.
This isn't necessarily ideal, but we're limited by the lack of an MMU.

David

Re: [PATCH] Fix race between attach_task and cpuset_exit

2007-03-26 Thread Srivatsa Vaddagiri

On Sun, Mar 25, 2007 at 11:22:15PM +0530, Balbir Singh wrote:
> >+struct cpuset *oldcs_tobe_released = NULL;
> 
> How about oldcs_to_be_released?

Yes, I wanted to use that, but my typo I guess.

> >@@ -2242,19 +2241,20 @@ void cpuset_exit(struct task_struct *tsk
> > {
> > struct cpuset *cs;
> >
> >+task_lock(tsk);
> > cs = tsk->cpuset;
> > tsk->cpuset = &top_cpuset;  /* the_top_cpuset_hack - see above */
> >+atomic_dec(&cs->count);
> 
> How about using a local variable like ref_count and using
> 
> ref_count = atomic_dec_and_test(&cs->count); This will avoid the two
> atomic operations, atomic_dec() and atomic_read() below.

Well, someone may have attached to this cpuset while we were waiting on the 
mutex_lock(). So we need to do an atomic_read again to ensure it is still
unused. But I notice that check_for_release() has that
atomic_read-and-check-for-zero-refcount built into it, which means we can 
blindly call it. Modified patch in another mail.

-- 
Regards,
vatsa

Re: [PATCH 00/22 take 3] UBI: Unsorted Block Images

2007-03-26 Thread Jörn Engel

On Mon, 26 March 2007 13:49:06 +0300, Artem Bityutskiy wrote:
> On Sun, 2007-03-25 at 22:08 +0200, Jörn Engel wrote:
> > 
> > Logical volume management can just as easily move its management
> > information into a table, instead of having it spread across all blocks.
> > Blocks can keep their original size.  Since you have to scan flash
> > anyway, you can also scan for a table, compare a magical number and do
> > some extra check to protect yourself against a UBI image inside some
> > logical volume.  No big deal.
> 
> First off, I see these no big deal statements for years already, and no
> decent implementation proved by usage in real world. Could we please,
> move these academic discussions to another thread?

You could wait a day, then reread what I wrote.  Maybe you will notice
that what I wrote is not identical to what we have discussed about a
year ago and you seem to have read.

You may also want to reread this:
||[ This was not a request for UBI to be changed.  The only purpose was to
||illustrate that LogFS is not broken.  The previous thread suggested
||otherwise and I just couldn't leave it at that. ]

Jörn

-- 
tglx1 thinks that joern should get a (TM) for "Thinking Is Hard"
-- Thomas Gleixner

Re: [PATCH 1/2] MSR: Add support for safe variants

2007-03-26 Thread Jean Delvare

Hi Andrew,

On Sun, 25 Mar 2007 14:22:15 -0800, Andrew Morton wrote:
> On Sun, 25 Mar 2007 14:18:23 +0200 Jean Delvare <[EMAIL PROTECTED]> wrote:
> 
> > Add support for _safe (exception handled) variants of rdmsr_on_cpu 
> > and wrmsr_on_cpu.  This is needed for the upcoming coretemp hardware
> > monitoring driver, which might step into non-existing (poorly
> > documented) MSR.
> 
> Crappy changelog.  What do these functions do?  What is "safe" about them? 
> This is described in neither the changelog nor the code comments, but it
> should be described in both.

This is needed when you attempt to access an MSR which might not
actually exist. Exactly what the changelog says... The core safe msr
functions (in include/asm/msr.h) also have a comment that says: "with
exception handling".
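
As a rough userspace illustration of what "safe" means here: the safe variant
reports a missing register through its return value instead of faulting. The
register table, the names fake_rdmsr_safe() and struct fake_reg, the register
numbers and the -EIO error code are all made up for this sketch; the real
functions execute the rdmsr instruction and rely on the kernel's exception
tables.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file standing in for the MSRs of one CPU. */
struct fake_reg {
	uint32_t no;
	uint32_t lo, hi;
};

static const struct fake_reg fake_regs[] = {
	{ 0x10,  0x12345678, 0x0 },	/* arbitrary numbers and values */
	{ 0x19c, 0x88340000, 0x0 },
};

/*
 * "Safe" read: returns 0 and fills *lo / *hi if the register exists,
 * or a negative error code (and leaves the outputs untouched) if not.
 */
static int fake_rdmsr_safe(uint32_t no, uint32_t *lo, uint32_t *hi)
{
	size_t i;

	assert(lo && hi);
	for (i = 0; i < sizeof(fake_regs) / sizeof(fake_regs[0]); i++) {
		if (fake_regs[i].no == no) {
			*lo = fake_regs[i].lo;
			*hi = fake_regs[i].hi;
			return 0;
		}
	}
	return -EIO;	/* error value chosen arbitrarily for the sketch */
}
```

A driver probing a poorly documented register would check the return value
and skip the feature on error, rather than assume the register is present.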

So I don't think this is as crappy as you say. But I still updated the
changelog and added a comment in the code, hope you like it better this
way.

> Also, Andi presently has several MSR patches queued (most prominently
> ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt-current/patches/rwmsr-on-cpu)
> but I'm not including them in -mm due to rejects.  If I get a properly
> changlogged and commented version of this patch I can merge it, but it
> might get wrecked again when the x86_64 tree gets sorted out.

That patch you point me to is _already_ in Linus' tree:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=b077ffb3b767c3efb44d00b998385a9cb127255c
So I'm not surprised you get rejects... Andi seemingly didn't update his
tree for over a month now.

The patch from Rudolf Marek which I am posting here builds on top of
what is already in Linus' tree. Taking it in your tree should not cause
any problem.

Thanks.

* * * * * Updated patch * * * * *

From: Rudolf Marek <[EMAIL PROTECTED]>

Add safe (exception handled) variants of rdmsr_on_cpu and wrmsr_on_cpu.
You should use these when the target MSR may not actually exist, as
doing so could trigger an exception which the regular functions do not
handle. The safe variants are slower, though.

The upcoming coretemp hardware monitoring driver will need this.

Signed-off-by: Rudolf Marek <[EMAIL PROTECTED]>
Cc: Alexey Dobriyan <[EMAIL PROTECTED]>
Cc: Dave Jones <[EMAIL PROTECTED]>
Signed-off-by: Jean Delvare <[EMAIL PROTECTED]>
---
 arch/i386/lib/msr-on-cpu.c |   73 
 include/asm-i386/msr.h |   12 +++
 include/asm-x86_64/msr.h   |   11 +++
 3 files changed, 89 insertions(+), 7 deletions(-)

--- linux-2.6.21-rc4.orig/arch/i386/lib/msr-on-cpu.c	2007-03-25 14:31:37.0 +0200
+++ linux-2.6.21-rc4/arch/i386/lib/msr-on-cpu.c	2007-03-26 13:11:26.0 +0200
@@ -6,6 +6,7 @@
 struct msr_info {
u32 msr_no;
u32 l, h;
+   int err;
 };
 
 static void __rdmsr_on_cpu(void *info)
@@ -15,20 +16,38 @@ static void __rdmsr_on_cpu(void *info)
rdmsr(rv->msr_no, rv->l, rv->h);
 }
 
-void rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h)
+static void __rdmsr_safe_on_cpu(void *info)
 {
+   struct msr_info *rv = info;
+
+   rv->err = rdmsr_safe(rv->msr_no, &rv->l, &rv->h);
+}
+
+static int _rdmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 *l, u32 *h, int safe)
+{
+   int err = 0;
preempt_disable();
if (smp_processor_id() == cpu)
-   rdmsr(msr_no, *l, *h);
+   if (safe)
+   err = rdmsr_safe(msr_no, l, h);
+   else
+   rdmsr(msr_no, *l, *h);
else {
struct msr_info rv;
 
rv.msr_no = msr_no;
-   smp_call_function_single(cpu, __rdmsr_on_cpu, &rv, 0, 1);
+   if (safe) {
+   smp_call_function_single(cpu, __rdmsr_safe_on_cpu,
+   &rv, 0, 1);
+   err = rv.err;
+   } else {
+   smp_call_function_single(cpu, __rdmsr_on_cpu, &rv, 0, 1);
+   }
*l = rv.l;
*h = rv.h;
}
preempt_enable();
+   return err;
 }
 
 static void __wrmsr_on_cpu(void *info)
@@ -38,21 +57,63 @@ static void __wrmsr_on_cpu(void *info)
wrmsr(rv->msr_no, rv->l, rv->h);
 }
 
-void wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h)
+static void __wrmsr_safe_on_cpu(void *info)
 {
+   struct msr_info *rv = info;
+
+   rv->err = wrmsr_safe(rv->msr_no, rv->l, rv->h);
+}
+
+static int _wrmsr_on_cpu(unsigned int cpu, u32 msr_no, u32 l, u32 h, int safe)
+{
+   int err = 0;
preempt_disable();
if (smp_processor_id() == cpu)
-   wrmsr(msr_no, l, h);
+   if (safe)
+   err = wrmsr_safe(msr_no, l, h);
+   else
+   wrmsr(msr_no, l, h);
else {
struct msr_info rv;
 
rv.msr_no = msr_no;
rv.l = l;
rv.h = h;
-

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Pekka J Enberg

Hi David,

On Mon, 26 Mar 2007, David Howells wrote:
> The reason is that, at the moment, VMAs are a global *shared* resource in
> NOMMU-mode.  Each process has a list of global VMAs that it subscribes to, but
> that's it.  This (a) slightly reduces the amount of metadata allocated
> (possibly), and (b) makes sharing of executables and libraries much easier.

On Mon, 26 Mar 2007, David Howells wrote:
> I wonder if revoke_mm() is something that you can't do in NOMMU-mode.  What
> does it do?

The revoke_mm() function scans all vmas of a mm and revokes those that 
are shared and point to the inode being revoked. So, NOMMU can't do that.

But what's more important is, can we do revoke_mapping() for NOMMU? AFAICT 
we can, we just need to scan all the global vmas, right?
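
The scan described above has the same shape whether the list walked is one
mm's vmas or, on NOMMU, the global list. A minimal userspace sketch, with
entirely hypothetical types (struct toy_vma, struct toy_inode) that only
mimic the fields the description mentions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_inode {
	int id;
};

struct toy_vma {
	struct toy_inode *inode;	/* backing file, NULL if anonymous */
	bool shared;			/* shared mapping? */
	bool revoked;
	struct toy_vma *next;
};

/*
 * Walk one vma list and revoke every shared mapping that points at
 * the given inode. Returns the number of mappings revoked.
 */
static int toy_revoke_list(struct toy_vma *head, const struct toy_inode *inode)
{
	struct toy_vma *v;
	int n = 0;

	assert(inode);
	for (v = head; v; v = v->next) {
		if (!v->shared || v->inode != inode || v->revoked)
			continue;
		v->revoked = true;
		n++;
	}
	return n;
}
```

What revoking a mapping actually involves (zapping pages, locking) is left
out; the sketch only shows the filter applied while scanning.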

Pekka

Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread David Howells

Andrew Morton <[EMAIL PROTECTED]> wrote:

> Offending patch is
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc4/2.6.21-rc4-mm1/broken-out/nommu-hide-vm_mm-in-nommu-mode.patch,
> which seems rather dumb.  Or at least, its changelog does a good job of
> making it look dumb.

vm_mm is always NULL under NOMMU as it currently stands.  As far as I know,
this has been true since the NOMMU mm stuff was first included (I'm not sure
the VMAs of the first NOMMU mm code *had* a vm_mm).  Hugh (I think it was)
suggested that since this was always NULL, then it should be excised from the
struct in NOMMU-mode.

The reason is that, at the moment, VMAs are a global *shared* resource in
NOMMU-mode.  Each process has a list of global VMAs that it subscribes to, but
that's it.  This (a) slightly reduces the amount of metadata allocated
(possibly), and (b) makes sharing of executables and libraries much easier.

I have started on a patch to make VMAs non-shared, but it's not trivial.  In
NOMMU-mode, there needs to be something to pin the memory a private mapping
uses that isn't required in MMU-mode.  In the shared VMA thing, the VMA itself
can pin that memory...

I wonder if revoke_mm() is something that you can't do in NOMMU-mode.  What
does it do?

David


Re: [PATCH 2.6.21-rc4-mm1 4/4] sys_futex64 : allows 64bit futexes

2007-03-26 Thread Andrew Morton

On Wed, 21 Mar 2007 10:54:36 +0100 [EMAIL PROTECTED] wrote:

> It does not provide the functionality for all architectures (only for x64 for now).

Well that scuppers our chances of getting -mm kernels tested on ia64, s390
and sparc64.  Which is a problem - people do test s390 and ia64 and so these
patches impact the testing quality of everyone else's work.

Do we have a plan to fix this (promptly, please)?

kernel/built-in.o(.text+0x683a2): In function `futex_requeue_pi':
: undefined reference to `futex_atomic_cmpxchg_inatomic64'

Re: Wrong IDE cable detection in libata [Re: 2.6.21-rc4-mm1]

2007-03-26 Thread Tejun Heo

J.A. Magallón wrote:
> Libata seems to misdetect my cable.
> I have double-checked and the cable is 80 pin...

Does the following patch fix your problem?

  http://article.gmane.org/gmane.linux.ide/17444

(You can get the raw message by appending /raw to the URL).

-- 
tejun

Re: [PATCH -mm] vdso print fatal signals: fix compiling error bug in nommu arch

2007-03-26 Thread Ingo Molnar


* Wu, Bryan <[EMAIL PROTECTED]> wrote:

> +#ifdef CONFIG_MMU
>   struct mm_struct *mm = vma->vm_mm;
> +#else
> + struct mm_struct *mm = 0;
> +#endif

s/0/NULL ?

Ingo

Re: [PATCH 00/22 take 3] UBI: Unsorted Block Images

2007-03-26 Thread Artem Bityutskiy

On Sun, 2007-03-25 at 22:08 +0200, Jörn Engel wrote:
> And there is no fundamental reason why UBI should export blocks with
> non-power-of-two sizes.

False. There is.

>   UBI currently consists of two parts that are
> intimately intertwined in the current implementation, but have
> relatively little connection otherwise.

False. They do have connection.

> 1. Logical volume management.
> 2. Static volumes.
> 
> Logical volume management can just as easily move its management
> information into a table, instead of having it spread across all blocks.
> Blocks can keep their original size.  Since you have to scan flash
> anyway, you can also scan for a table, compare a magical number and do
> some extra check to protect yourself against a UBI image inside some
> logical volume.  No big deal.

First off, I have been seeing these "no big deal" statements for years
already, with no decent implementation proven by real-world usage. Could we
please move these academic discussions to another thread?

Second, it is much more robust to keep erase counter and mapping
information on a per-eraseblock basis than to keep any on-flash table -
you may always scan the whole media and gracefully recover from errors and
corruptions. And you do not lose a lot in case of corruption.

Third, it is much simpler than keeping any on-flash table, and thus more
robust. We do not need a journal to update any table.

Fourth, if needed, an on-flash table may be _added_ to increase scalability,
so "since you have to scan flash anyway" may become false when there is
a real need for better scalability. For now scanning is OK. And still,
the scanning method will be a good fall-back way to recover from errors.
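
To make the per-eraseblock argument concrete, here is a small userspace
sketch of rebuilding a logical-to-physical map by scanning headers. The
header layout, the magic value and all toy_* names are assumptions made for
illustration; this is not UBI's on-flash format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAGIC	0x55424921u	/* arbitrary marker for this sketch */
#define TOY_UNMAPPED	(-1)

/* Hypothetical per-eraseblock header: erase counter plus mapping info. */
struct toy_eb_hdr {
	uint32_t magic;
	uint32_t erase_count;
	int32_t lnum;		/* logical eraseblock, TOY_UNMAPPED if free */
};

/*
 * Rebuild the logical-to-physical map by scanning every header.
 * A header with a bad magic is skipped, so one corrupted header costs
 * only that eraseblock. Returns the number of headers skipped.
 */
static size_t toy_scan(const struct toy_eb_hdr *hdrs, size_t nr_phys,
		       int *map, size_t nr_logical)
{
	size_t p, l, bad = 0;

	assert(hdrs && map);
	for (l = 0; l < nr_logical; l++)
		map[l] = TOY_UNMAPPED;

	for (p = 0; p < nr_phys; p++) {
		if (hdrs[p].magic != TOY_MAGIC) {
			bad++;
			continue;
		}
		if (hdrs[p].lnum >= 0 && (size_t)hdrs[p].lnum < nr_logical)
			map[hdrs[p].lnum] = (int)p;
	}
	return bad;
}
```

The cost is the linear scan at attach time, which is the scalability
trade-off mentioned above; no journal is needed because there is no single
table to update.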

> UBI is just as broken as LogFS is.  It works with every user in mainline
> (which comes down to JFFS2).  LogFS works with every MTD device in
> mainline.  The only combination that doesn't work is LogFS on UBI - due
> to deliberate design decisions on both sides.

You are welcome to discuss other irrelevant things to this thread.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)


Re: [patch] sched: accurate user accounting

2007-03-26 Thread malc


On Mon, 26 Mar 2007, Con Kolivas wrote:


On Monday 26 March 2007 09:01, Con Kolivas wrote:

On Monday 26 March 2007 03:14, malc wrote:

On Mon, 26 Mar 2007, Con Kolivas wrote:

On Monday 26 March 2007 01:19, malc wrote:

Erm... i just looked at the code and suddenly it stopped making any sense
at all:

 p->last_ran = rq->most_recent_timestamp = now;
 /* Sanity check. It should never go backwards or ruin accounting */
 if (unlikely(now < p->last_ran))
         return;
 time_diff = now - p->last_ran;

First `now' is assigned to `p->last_ran' and the very next line
compares those two values, and then the difference is taken.. I quite
frankly am either very tired or fail to see the point.. time_diff is
either always zero or there's always a race here.
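
For illustration, the ordering problem goes away if the old timestamp is
used before it is overwritten. The following is only a userspace sketch with
hypothetical names (struct toy_task, toy_account()); the corrected kernel
patch is not shown in this thread and may differ.

```c
#include <assert.h>
#include <stdint.h>

struct toy_task {
	uint64_t last_ran;	/* timestamp of the previous accounting, in ns */
};

/*
 * Return the time elapsed since the task last ran and record `now`.
 * If time appears to go backwards, nothing is accounted and the stored
 * timestamp is left alone (a choice made for this sketch).
 */
static uint64_t toy_account(struct toy_task *p, uint64_t now)
{
	uint64_t diff;

	assert(p);
	/* Sanity check first, against the *old* timestamp. */
	if (now < p->last_ran)
		return 0;

	diff = now - p->last_ran;
	p->last_ran = now;	/* update only after the old value was used */
	return diff;
}
```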


Bah major thinko error on my part! That will teach me to post patches
untested at 1:30 am. I'll try again shortly sorry.


Ok this one is heavily tested. Please try it when you find the time.


[..snip..]

Done, works. However there's a problem with accuracy coming from a
different angle.

I have this USB video grabber and also a quite efficient way of putting
the pixels on the screen. Video is grabbed using isochronous
transfers, i.e. lots of small (on the order of 1K) chunks of data are
being transferred continuously instead of in one big burst as in, for
instance, a PCI setup. With your accounting change idle from
`/proc/stat' is accurate, but unfortunately top(1)/icewm's monitor/etc.
apparently use user+sys+nice+intr+softirq+iowait to show the system
load, so system tools claim that the load is 10-12% while in reality
it is ~3%.

This situation is harder to write a hog-like testcase for. Anyhow it
seems the difference in percentage stems from the `intr' field of
`/proc/stat', which fits. And the following patch (which should be applied
on top of yours) seems to help. I wouldn't really know what to do with
softirq and the rest of the counts touched by this function, so I left them
alone.
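
The idea behind the patch, accumulating nanoseconds and only bumping the
tick counter once a full tick has been collected, can be sketched in
userspace as follows. TOY_TICK_NS (a 1 ms tick) and the toy_* names are
assumptions for this sketch. Unlike the patch, it uses a loop and a ">="
comparison so that a delta spanning several ticks is carried over completely.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_TICK_NS 1000000ull	/* assumes a 1 ms tick */

struct toy_stat {
	uint64_t irq_ticks;	/* whole ticks, as reported to userspace */
	uint64_t irq_ns;	/* sub-tick remainder in nanoseconds */
};

/* Add delta_ns of interrupt time, carrying whole ticks into irq_ticks. */
static void toy_account_irq(struct toy_stat *s, uint64_t delta_ns)
{
	assert(s);
	s->irq_ns += delta_ns;
	while (s->irq_ns >= TOY_TICK_NS) {
		s->irq_ns -= TOY_TICK_NS;
		s->irq_ticks++;
	}
}
```

With this scheme, many short interrupts that each take a fraction of a tick
add up to their real total instead of each being charged a full tick.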

Comments?

diff -ru linux-2.6.21-rc4/include/linux/kernel_stat.h linux-2.6.21-rc4-load/include/linux/kernel_stat.h
--- linux-2.6.21-rc4/include/linux/kernel_stat.h	2007-03-26 14:33:19.0 +0400
+++ linux-2.6.21-rc4-load/include/linux/kernel_stat.h	2007-03-26 14:06:21.0 +0400
@@ -22,6 +22,7 @@
cputime64_t system;
cputime64_t softirq;
cputime64_t irq;
+   cputime64_t irq_ns;
cputime64_t idle;
cputime64_t idle_ns;
cputime64_t iowait;
diff -ru linux-2.6.21-rc4/kernel/sched.c linux-2.6.21-rc4-load/kernel/sched.c
--- linux-2.6.21-rc4/kernel/sched.c 2007-03-26 14:33:19.0 +0400
+++ linux-2.6.21-rc4-load/kernel/sched.c	2007-03-26 14:11:36.0 +0400
@@ -3148,9 +3148,14 @@

/* Add system time to cpustat. */
tmp = cputime_to_cputime64(cputime);
-   if (hardirq_count() - hardirq_offset)
-   cpustat->irq = cputime64_add(cpustat->irq, tmp);
-   else if (softirq_count())
+   if (hardirq_count() - hardirq_offset) {
+   cpustat->irq_ns = cputime64_add(cpustat->irq_ns, tmp);
+   if (cpustat->irq_ns > JIFFY_NS) {
+   cpustat->irq_ns = cputime64_sub(cpustat->irq_ns,
+   JIFFY_NS);
+   cpustat->irq = cputime64_add(cpustat->irq, 1);
+   }
+   } else if (softirq_count())
cpustat->softirq = cputime64_add(cpustat->softirq, tmp);
else if (p != rq->idle)
cpustat->system = cputime64_add(cpustat->system, tmp);

--
vale

Re: 2.6.21-rc4-mm1

2007-03-26 Thread Andrew Morton

On Mon, 26 Mar 2007 12:34:33 +0200 Eric Rannaud <[EMAIL PROTECTED]> wrote:

> On Mon, Mar 26, 2007 at 01:22:32AM -0800, Andrew Morton wrote:
> > On Mon, 26 Mar 2007 11:09:49 +0200 Cornelia Huck <[EMAIL PROTECTED]> wrote:
> > > > If so, do you think I should labour on with
> > > > uevent-improve-error-checking-and-handling.patch plus your fix, or should I
> > > > drop the lot?  (I'm inclined toward the latter, but I'm still not
> > > > sure which patch(es) need to be dropped).
> > > 
> > > This depends on what semantics uevent returning an error code should
> > > have. The firmware code was using it to suppress uevents, but
> > > uevent_suppress is a better idea now. So if we want uevent returning !=
> > > 0 to imply "something really bad happened", all uevent functions have
> > > to be audited and those that work like firmware_uevent have to be
> > > converted to uevent_suppress. This would be cleaner, but I'm not sure
> > > it's worth the work.
> > 
> > We're generally struggling to stay alive amongst all the bugs at present -
> > I'll drop all those patches.
> 
> My mistake, I wrote the guilty patch
> uevent-improve-error-checking-and-handling.patch assuming it was safe to
> treat the return value as an error code, since several uevent functions
> returns things like -ENOMEM.
> 
> Should I rework the patch as Cornelia suggests and resubmit later, when
> things have settled down a little?

Sure, when we've fixed all the bugs ;)

I think we now know what to test for - firmware loading simply collapsed
all over the place with these changes.


Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Andrew Morton

On Mon, 26 Mar 2007 18:23:57 +0800 "Wu, Bryan" <[EMAIL PROTECTED]> wrote:

> Hi folks,
> 
> As struct mm_struct vm_mm is hidden in struct vm_area_struct in NOMMU
> arch, this is a fixing method when compiling failure on blackfin arch.
> 
> Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
> ---
> 
>  fs/revoke.c |   22 +++---
>  1 file changed, 15 insertions(+), 7 deletions(-)
> 
> Index: linux-2.6/fs/revoke.c
> ===
> --- linux-2.6.orig/fs/revoke.c
> +++ linux-2.6/fs/revoke.c
> @@ -207,13 +207,21 @@
>  /*
>   *   LOCKING: spin_lock(>i_mmap_lock)
>   */
> -static int revoke_mm(struct mm_struct *mm, struct address_space *mapping,
> +static int revoke_mm(struct vm_area_struct *vma, struct address_space *mapping,
>struct file *to_exclude)
>  {
> - struct vm_area_struct *vma;
> +#ifdef CONFIG_MMU
> + struct mm_struct *mm = vma->vm_mm;
> +#else
> + struct mm_struct *mm = 0;
> +#endif



Offending patch is
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.21-rc4/2.6.21-rc4-mm1/broken-out/nommu-hide-vm_mm-in-nommu-mode.patch,
which seems rather dumb.  Or at least, its changelog does a good job of
making it look dumb.

David, what on earth does "this isn't used there" mean?  Surely it is
logical to have the mm backpointer in the vma in nommu mode?  What's going
on here?


Re: [PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Pekka J Enberg

On Mon, 26 Mar 2007, Wu, Bryan wrote:
> As struct mm_struct vm_mm is hidden in struct vm_area_struct in NOMMU
> arch, this is a fixing method when compiling failure on blackfin arch.

What compile error is that? I don't see any #ifdef around ->vm_mm for 
struct vm_area_struct in .

On Mon, 26 Mar 2007, Wu, Bryan wrote:
> + if (!mm)
> + return -ENOENT;
> +
>   details.i_mmap_lock = &mapping->i_mmap_lock;

This means you won't be able to revoke memory mapped files with no-MMU. 
I'd rather fix it properly if someone can give me a clue how mmap works on 
non-MMU.

Pekka

Re: [3/5] 2.6.21-rc4: known regressions (v2)

2007-03-26 Thread Frederic Riss


2007/3/26, Thomas Gleixner <[EMAIL PROTECTED]>:

On Mon, 2007-03-26 at 08:45 +0200, Frédéric RISS wrote:
> Additional data point: I just tried with -rc5 and the issue is still
> present. The config I used for this test defines neither NO_HZ nor
> HIGH_RES_TIMERS.

Do you have CONFIG_HPET_TIMER enabled and does the box have one ?
If yes, can you please turn it off and retry ?


IIRC the box has a HPET and it gets used. I'll test and confirm when I
get home tonight.

Thanks,
Fred

Re: 2.6.21-rc4-mm1

2007-03-26 Thread Eric Rannaud

On Mon, Mar 26, 2007 at 01:22:32AM -0800, Andrew Morton wrote:
> On Mon, 26 Mar 2007 11:09:49 +0200 Cornelia Huck <[EMAIL PROTECTED]> wrote:
> > > If so, do you think I should labour on with
> > > uevent-improve-error-checking-and-handling.patch plus your fix, or should I
> > > drop the lot?  (I'm inclined toward the latter, but I'm still not
> > > sure which patch(es) need to be dropped).
> > 
> > This depends on what semantics uevent returning an error code should
> > have. The firmware code was using it to suppress uevents, but
> > uevent_suppress is a better idea now. So if we want uevent returning !=
> > 0 to imply "something really bad happened", all uevent functions have
> > to be audited and those that work like firmware_uevent have to be
> > converted to uevent_suppress. This would be cleaner, but I'm not sure
> > it's worth the work.
> 
> We're generally struggling to stay alive amongst all the bugs at present -
> I'll drop all those patches.

My mistake, I wrote the guilty patch
uevent-improve-error-checking-and-handling.patch assuming it was safe to
treat the return value as an error code, since several uevent functions
return things like -ENOMEM.

Should I rework the patch as Cornelia suggests and resubmit later, when
things have settled down a little?

Thanks.

Re: [PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-03-26 Thread Tomoki Sekiyama

Hi,
Thanks for your reply.

>>3) Use "dirty_ratio" as the blocking ratio. And add
>>   "start_writeback_ratio", and start writeback at
>>   start_writeback_ratio(default:90) * dirty_ratio / 100 [%].
>>   In this way, specifying blocking ratio can be done in the same way
>>   as current kernel, but high/low watermark algorithm is enabled.
>I like 3 better, it should make tuning behavior more precise.

Then, what do you think of the following idea?

(4) add `dirty_start_writeback_ratio' as percentage of memory,
at which a generator of dirty pages itself starts writeback
(that is, non-blocking ratio).

In this way, `dirty_ratio' is used as the blocking ratio, so we don't
need to modify sysctl.conf etc. I think it's easier for system
administrators to understand, because the interface is similar to
`dirty_background_ratio' and `dirty_ratio'.
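
As a rough sketch of how the two thresholds in idea (4) would be applied,
assuming both are percentages of total memory: the function name, the enum
and the use of ">=" at the boundaries are choices made for this
illustration, not taken from the patch.

```c
#include <assert.h>

enum toy_action { TOY_NONE, TOY_START_WRITEBACK, TOY_BLOCK };

/*
 * Decide what a task that generates dirty pages should do.
 * start_ratio: percentage of memory at which the writer itself starts
 *              writeback without blocking (dirty_start_writeback_ratio).
 * dirty_ratio: percentage of memory at which the writer is blocked.
 */
static enum toy_action toy_throttle(unsigned long dirty_pages,
				    unsigned long total_pages,
				    unsigned int start_ratio,
				    unsigned int dirty_ratio)
{
	unsigned long start_limit, block_limit;

	assert(start_ratio <= dirty_ratio && dirty_ratio <= 100);

	start_limit = total_pages * start_ratio / 100;
	block_limit = total_pages * dirty_ratio / 100;

	if (dirty_pages >= block_limit)
		return TOY_BLOCK;
	if (dirty_pages >= start_limit)
		return TOY_START_WRITEBACK;
	return TOY_NONE;
}
```

For example, with 1000 pages of memory and made-up ratios of 30 and 40, a
writer starts writeback at 300 dirty pages and is blocked at 400.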

If this is OK, I'll repost the patch.

> You can make an argument for absolute values for writeback,
> if my disk will only write 70MB/s I may only want 203 sec of
> pending writes, regardless of available memory.

To realize tuning with absolute values, I consider that we need to
modify handling of `dirty_background_ratio,' `dirty_ratio' and so on as
well as `dirty_start_writeback_ratio.' I think this should be done in
another patch if this feature is required.

Regards,
--
Tomoki Sekiyama
Hitachi, Ltd., Systems Development Laboratory



Re: [Patch 0/7] integrity service framework and provider

2007-03-26 Thread Andrew Morton

On Fri, 23 Mar 2007 12:08:59 -0400 Mimi Zohar <[EMAIL PROTECTED]> wrote:

> This is a set of updates to the integrity service framework, previously 
> accepted into -mm, EVM a new integrity service provider, and a new LSM 
> module called Integrity Based Access Control(IBAC), a sample consumer of
> the integrity framework API.

I'll fix this:

security/integrity_dummy.c: In function 'dummy_inode_setxattr':
security/integrity_dummy.c:94: warning: implicit declaration of function 'capable'
security/integrity_dummy.c:94: error: 'CAP_SYS_ADMIN' undeclared (first use in this function)
security/integrity_dummy.c:94: error: (Each undeclared identifier is reported only once
security/integrity_dummy.c:94: error: for each function it appears in.)

Then I'll ask you to fix these, some of which are real bugs:

security/integrity_dummy.c: In function 'dummy_verify_metadata':
security/integrity_dummy.c:30: warning: 'error' may be used uninitialized in this function
security/integrity_dummy.c:28: warning: 'value' may be used uninitialized in this function
security/integrity_dummy.c:29: warning: 'size' may be used uninitialized in this function

And then I'll probably end up fixing some of this lot too:

security/evm/evm_main.c: In function 'evm_verify_xattr':
security/evm/evm_main.c:165: warning: format '%d' expects type 'int', but argument 5 has type 'ssize_t'
security/evm/evm_crypto.c: In function 'update_link_hash':
security/evm/evm_crypto.c:94: warning: implicit declaration of function 'kernel_readlink'
security/evm/evm_crypto.c: In function 'evm_init_integrity':
security/evm/evm_crypto.c:187: warning: format '%d' expects type 'int', but argument 4 has type 'size_t'
security/evm/evm_main.c: In function 'init_evm':
security/evm/evm_main.c:903: warning: control may reach end of non-void function 'evm_ima_init' being inlined


How does stuff like this get through?  It's just x86_64 allmodconfig.
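
The format warnings above only show up on 64-bit builds, where ssize_t and
size_t are wider than int. In standard C the usual fix is the `z' length
modifier, shown here in a userspace sketch with snprintf(); the helper name
toy_format_sizes() is hypothetical, and the kernel's printk is assumed to
accept the same modifier.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Format a ssize_t and a size_t with matching length modifiers:
 * %zd for the signed type, %zu for the unsigned one.
 * Returns the number of characters snprintf() would have written.
 */
static int toy_format_sizes(char *buf, size_t len, ssize_t rc, size_t sz)
{
	assert(buf && len > 0);
	return snprintf(buf, len, "rc=%zd size=%zu", rc, sz);
}
```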

[PATCH -mm] vdso print fatal signals: fix compiling error bug in nommu arch

2007-03-26 Thread Wu, Bryan

Hi folks,

Since the vm_mm field (struct mm_struct *) of struct vm_area_struct is hidden
on NOMMU architectures, this patch fixes the resulting compile failure on the
Blackfin architecture.

Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
---

 kernel/signal.c |4 
 1 file changed, 4 insertions(+)

Index: linux-2.6/kernel/signal.c
===
--- linux-2.6.orig/kernel/signal.c
+++ linux-2.6/kernel/signal.c
@@ -807,7 +807,11 @@
 
 static int print_vma(struct vm_area_struct *vma)
 {
+#ifdef CONFIG_MMU
struct mm_struct *mm = vma->vm_mm;
+#else
+   struct mm_struct *mm = 0;
+#endif
struct file *file = vma->vm_file;
int flags = vma->vm_flags;
unsigned long ino = 0;
_

Thanks,
-Bryan Wu

[PATCH -mm] Revoke core code: fix nommu arch compiling error bug

2007-03-26 Thread Wu, Bryan

Hi folks,

Since the vm_mm field (struct mm_struct *) of struct vm_area_struct is hidden
on NOMMU architectures, this patch fixes the resulting compile failure on the
Blackfin architecture.

Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
---

 fs/revoke.c |   22 +++---
 1 file changed, 15 insertions(+), 7 deletions(-)

Index: linux-2.6/fs/revoke.c
===
--- linux-2.6.orig/fs/revoke.c
+++ linux-2.6/fs/revoke.c
@@ -207,13 +207,21 @@
 /*
  * LOCKING: spin_lock(&mapping->i_mmap_lock)
  */
-static int revoke_mm(struct mm_struct *mm, struct address_space *mapping,
+static int revoke_mm(struct vm_area_struct *vma, struct address_space *mapping,
 struct file *to_exclude)
 {
-   struct vm_area_struct *vma;
+#ifdef CONFIG_MMU
+   struct mm_struct *mm = vma->vm_mm;
+#else
+   struct mm_struct *mm = 0;
+#endif
+   struct vm_area_struct *_vma;
struct zap_details details;
int err = 0;
 
+   if (!mm)
+   return -ENOENT;
+
details.i_mmap_lock = &mapping->i_mmap_lock;
 
/*
@@ -224,11 +232,11 @@
err = -EAGAIN;
goto out;
}
-   for (vma = mm->mmap; vma != NULL; vma = vma->vm_next) {
-   if (!need_revoke(vma, to_exclude))
+   for (_vma = mm->mmap; _vma != NULL; _vma = _vma->vm_next) {
+   if (!need_revoke(_vma, to_exclude))
continue;
 
-   err = revoke_vma(vma, &details);
+   err = revoke_vma(_vma, &details);
if (err)
break;
}
@@ -254,7 +262,7 @@
if (likely(!need_revoke(vma, to_exclude)))
continue;
 
-   err = revoke_mm(vma->vm_mm, mapping, to_exclude);
+   err = revoke_mm(vma, mapping, to_exclude);
if (err == -EAGAIN) {
try_again = 1;
continue;
@@ -284,7 +292,7 @@
if (likely(!need_revoke(vma, to_exclude)))
continue;
 
-   err = revoke_mm(vma->vm_mm, mapping, to_exclude);
+   err = revoke_mm(vma, mapping, to_exclude);
if (err == -EAGAIN) {
try_again = 1;
continue;
_

Thanks,
-Bryan Wu

[PATCH -mm] Blackfin: spi driver cleanup and coding style fixing

2007-03-26 Thread Wu, Bryan

Hi folks,

This patch cleans up the Blackfin SPI driver code and fixes some coding style
problems.

Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
---
 drivers/spi/spi_bfin5xx.c |  322 +-
 1 file changed, 179 insertions(+), 143 deletions(-)

Index: linux-2.6/drivers/spi/spi_bfin5xx.c
===
--- linux-2.6.orig/drivers/spi/spi_bfin5xx.c
+++ linux-2.6/drivers/spi/spi_bfin5xx.c
@@ -7,8 +7,6 @@
  * Description:  SPI controller driver for Blackfin 5xx
  * Bugs: Enter bugs at http://blackfin.uclinux.org/
  *
- * Rev:  $Id: spi_bfin5xx.c 2508 2006-12-06 07:35:43Z sonicz $
- *
  * Modified:
  * March 10, 2006  bfin5xx_spi.c Created. (Luke Yang)
  *  August 7, 2006  added full duplex mode (Axel Weiss & Luke Yang)
@@ -55,17 +53,6 @@
 MODULE_DESCRIPTION("Blackfin 5xx SPI Contoller");
 MODULE_LICENSE("GPL");
 
-#ifdef DEBUG
-#define ASSERT(expr) \
-   if (!(expr)) { \
-   printk(KERN_DEBUG "assertion failed! %s[%d]: %s\n", \
-  __FUNCTION__, __LINE__, #expr); \
-   panic(KERN_DEBUG "%s", __FUNCTION__); \
-   }
-#else
-#define ASSERT(expr)
-#endif
-
 #define IS_DMA_ALIGNED(x) (((u32)(x)&0x07)==0)
 
 #define DEFINE_SPI_REG(reg, off) \
@@ -82,15 +69,12 @@
 DEFINE_SPI_REG(RDBR, 0x10)
 DEFINE_SPI_REG(BAUD, 0x14)
 DEFINE_SPI_REG(SHAW, 0x18)
-
 #define START_STATE ((void*)0)
 #define RUNNING_STATE ((void*)1)
 #define DONE_STATE ((void*)2)
 #define ERROR_STATE ((void*)-1)
-
 #define QUEUE_RUNNING 0
 #define QUEUE_STOPPED 1
-
 int dma_requested;
 char chip_select_flag;
 
@@ -175,12 +159,13 @@
 static u16 hz_to_spi_baud(u32 speed_hz)
 {
u_long sclk = get_sclk();
-   u16 spi_baud = (sclk / (2*speed_hz));
+   u16 spi_baud = (sclk / (2 * speed_hz));
 
-   if ((sclk % (2*speed_hz)) > 0)
+   if ((sclk % (2 * speed_hz)) > 0)
spi_baud++;
 
-   pr_debug("sclk = %ld, speed_hz = %d, spi_baud = %d\n", sclk, speed_hz, spi_baud);
+   pr_debug("sclk = %ld, speed_hz = %d, spi_baud = %d\n", sclk, speed_hz,
+spi_baud);
 
return spi_baud;
 }
@@ -190,7 +175,8 @@
unsigned long limit = loops_per_jiffy << 1;
 
/* wait for stop and clear stat */
-   do {} while (!(read_STAT() & BIT_STAT_SPIF) && limit--);
+   do {
+   } while (!(read_STAT() & BIT_STAT_SPIF) && limit--);
write_STAT(BIT_STAT_CLR);
 
return limit;
@@ -265,7 +251,8 @@
 
while (drv_data->tx < drv_data->tx_end) {
write_TDBR(0);
-   do {} while ((read_STAT() & BIT_STAT_TXS));
+   do {
+   } while ((read_STAT() & BIT_STAT_TXS));
drv_data->tx += n_bytes;
}
 }
@@ -276,7 +263,8 @@
dummy_read();
 
while (drv_data->rx < drv_data->rx_end) {
-   do {} while (!(read_STAT() & BIT_STAT_RXS));
+   do {
+   } while (!(read_STAT() & BIT_STAT_RXS));
dummy_read();
drv_data->rx += n_bytes;
}
@@ -286,13 +274,15 @@
 {
pr_debug("cr8-s is 0x%x\n", read_STAT());
while (drv_data->tx < drv_data->tx_end) {
-   write_TDBR(*(u8 *)(drv_data->tx));
-   do {} while (read_STAT() & BIT_STAT_TXS);
+   write_TDBR(*(u8 *) (drv_data->tx));
+   do {
+   } while (read_STAT() & BIT_STAT_TXS);
++drv_data->tx;
}
 
-   // poll for SPI completion before returning
-   do {} while (!(read_STAT() & BIT_STAT_SPIF));
+   /* poll for SPI completion before returning */
+   do {
+   } while (!(read_STAT() & BIT_STAT_SPIF));
 }
 
 static void u8_cs_chg_writer(struct driver_data *drv_data)
@@ -303,10 +293,12 @@
write_FLAG(chip->flag);
SSYNC();
 
-   write_TDBR(*(u8 *)(drv_data->tx));
-   do {} while (read_STAT() & BIT_STAT_TXS);
-   do {} while (!(read_STAT() & BIT_STAT_SPIF));
-   write_FLAG(0xFF00|chip->flag);
+   write_TDBR(*(u8 *) (drv_data->tx));
+   do {
+   } while (read_STAT() & BIT_STAT_TXS);
+   do {
+   } while (!(read_STAT() & BIT_STAT_SPIF));
+   write_FLAG(0xFF00 | chip->flag);
SSYNC();
if (chip->cs_chg_udelay)
udelay(chip->cs_chg_udelay);
@@ -320,18 +312,20 @@
 {
pr_debug("cr-8 is 0x%x\n", read_STAT());
 
-   // clear TDBR buffer before read(else it will be shifted out)
+   /* clear TDBR buffer before read(else it will be shifted out) */
write_TDBR(0x);
 
dummy_read();
 
while (drv_data->rx < drv_data->rx_end - 1) {
-   do {} while (!(read_STAT() & BIT_STAT_RXS));
-   *(u8 *)(drv_data->rx) = read_RDBR();
+   do {
+   } while (!(read_STAT() & BIT_STAT_RXS));
+
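For readers following the hz_to_spi_baud() hunk above: the quotient-plus-remainder check is a round-up (ceiling) division. The following is only a userspace sketch of that arithmetic, assuming the divisor relation implied by the code (bit clock = sclk / (2 * divisor)); the function name and the fixed-width types are assumptions, not the driver's own.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the rounding in hz_to_spi_baud():
 * returns the smallest divisor d such that sclk_hz / (2 * d) <= speed_hz.
 */
static uint16_t baud_divisor_ceil(uint32_t sclk_hz, uint32_t speed_hz)
{
	uint64_t step;
	uint64_t q;

	assert(speed_hz != 0);          /* avoid dividing by zero */

	step = 2ULL * speed_hz;         /* 64-bit so 2 * speed_hz cannot wrap */

	/* Quotient, plus one if there is a remainder, as in the patch. */
	q = sclk_hz / step + (sclk_hz % step != 0);

	assert(q <= UINT16_MAX);        /* result must fit a 16-bit register */
	return (uint16_t)q;
}
```

Rounding up rather than down means the resulting bit clock never exceeds the requested speed_hz, which is usually the safer direction for a slave device with a maximum clock rating.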

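The "wait for stop and clear stat" hunk above polls a status bit with a loop budget. With an unsigned counter and a post-decrement in the loop condition, the final limit-- on timeout wraps the counter, so the value left after the loop may not be a reliable timeout indicator. A minimal sketch of a bounded poll that reports the outcome explicitly follows; status_reader, fake_dev and fake_read are hypothetical stand-ins for read_STAT() and the hardware, not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a status-register read such as read_STAT(). */
typedef unsigned int (*status_reader)(void *ctx);

/*
 * Poll until (status & mask) is non-zero or the retry budget is used up.
 * Always reads at least once. Returns true if the bit was seen,
 * false on timeout.
 */
static bool poll_until_set(status_reader read_status, void *ctx,
			   unsigned int mask, unsigned long budget)
{
	assert(read_status != NULL);

	for (;;) {
		if (read_status(ctx) & mask)
			return true;
		if (budget == 0)
			return false;   /* no retries left */
		budget--;
	}
}

/* Simulated device: bit 0 reads as set once ready_after reads have happened. */
struct fake_dev {
	unsigned int reads;
	unsigned int ready_after;
};

static unsigned int fake_read(void *ctx)
{
	struct fake_dev *d = ctx;

	return d->reads++ >= d->ready_after ? 0x1u : 0u;
}
```

A caller would pass its own register accessor in place of fake_read and treat a false return as a timeout.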
[PATCH -mm] Blackfin: spi driver fix reboot kernel mounting spi flash print error bug

2007-03-26 Thread Wu, Bryan

Hi folks,

This patch fixes a bug where an error is printed when rebooting a kernel
that has an SPI flash mounted.

Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
---

 drivers/spi/spi_bfin5xx.c |9 -
 1 file changed, 9 deletions(-)

Index: linux-2.6/drivers/spi/spi_bfin5xx.c
===
--- linux-2.6.orig/drivers/spi/spi_bfin5xx.c
+++ linux-2.6/drivers/spi/spi_bfin5xx.c
@@ -1167,14 +1167,6 @@
return 0;
 }
 
-static void bfin5xx_spi_shutdown(struct platform_device *pdev)
-{
-   int status = 0;
-
-   if ((status = bfin5xx_spi_remove(pdev)) != 0)
-   dev_err(&pdev->dev, "shutdown failed with %d\n", status);
-}
-
 /* PM, do nothing now */
 #ifdef CONFIG_PM
 static int suspend_devices(struct device *dev, void *pm_message)
@@ -1242,7 +1234,6 @@
},
.probe = bfin5xx_spi_probe,
.remove = __devexit_p(bfin5xx_spi_remove),
-   .shutdown = bfin5xx_spi_shutdown,
.suspend = bfin5xx_spi_suspend,
.resume = bfin5xx_spi_resume,
 };
_

Thanks,
-Bryan Wu
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

-rc5: e1000 resume weirdness

2007-03-26 Thread Ingo Molnar


hm, on a T60, after suspend/resume, i get an e1000 timeout:

e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue <0>
  TDH  
  TDT  
  next_to_use  
  next_to_clean<82>
buffer_info[next_to_clean]
  time_stamp   
  next_to_watch<82>
  jiffies  
  next_to_watch.status <1>

It works fine after that reset. The e1000 driver didn't do this before:
after resume the network was always available immediately. So this
appears to be a relatively new regression (post-rc3 or so). High-res
timers were disabled.

Ingo

[PATCH -mm] Blackfin: rtc fix rtc_update_irq argument

2007-03-26 Thread Wu, Bryan

Hi folks,

Pass rtc_dev directly to rtc_update_irq() instead of its class_dev member.

Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
---

 drivers/rtc/rtc-bfin.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/rtc/rtc-bfin.c
===
--- linux-2.6.orig/drivers/rtc/rtc-bfin.c
+++ linux-2.6/drivers/rtc/rtc-bfin.c
@@ -166,7 +166,7 @@
events |= RTC_UF | RTC_IRQF;
}
 
-   rtc_update_irq(&rtc->rtc_dev->class_dev, 1, events);
+   rtc_update_irq(rtc->rtc_dev, 1, events);
 
spin_unlock_irq(&rtc->lock);
 
_

Thanks,
-Bryan Wu

[PATCH -mm] Blackfin arch: cleanup cache header file

2007-03-26 Thread Wu, Bryan

Hi folks,

Following Paul's review, this patch cleans up the comments in
include/asm-blackfin/cache.h.

Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
---

 include/asm-blackfin/cache.h |   20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

Index: linux-2.6/include/asm-blackfin/cache.h
===
--- linux-2.6.orig/include/asm-blackfin/cache.h
+++ linux-2.6/include/asm-blackfin/cache.h
@@ -1,13 +1,21 @@
+/*
+ * include/asm-blackfin/cache.h
+ */
 #ifndef __ARCH_BLACKFIN_CACHE_H
 #define __ARCH_BLACKFIN_CACHE_H
 
-/* bytes per L1 cache line */
-#defineL1_CACHE_SHIFT  5   /* BlackFin loads 32 bytes for cache */
-#defineL1_CACHE_BYTES  (1 << L1_CACHE_SHIFT)
+/*
+ * Bytes per L1 cache line
+ * Blackfin loads 32 bytes for cache
+ */
+#define L1_CACHE_SHIFT 5
+#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)
 
-/* For speed we do need to align these ...MaTed---*/
-/*  But include/linux/cache.h does this for us if we DO not define ...MaTed---*/
-#define __cacheline_aligned/* maybe no need this   Tony */
+/*
+ * Don't make __cacheline_aligned and
+ * cacheline_aligned defined in include/linux/cache.h
+ */
+#define __cacheline_aligned
 #define cacheline_aligned
 
 /*
_

Thanks
-Bryan
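As a worked example of the two constants in the patch above: L1_CACHE_SHIFT of 5 gives L1_CACHE_BYTES = 1 << 5 = 32. The sketch below shows the usual mask arithmetic for rounding a size up to whole cache lines. The two helper functions are hypothetical names used for illustration only and are not taken from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L1_CACHE_SHIFT 5
#define L1_CACHE_BYTES (1 << L1_CACHE_SHIFT)   /* 1 << 5 = 32 bytes */

/* Hypothetical helper: round n up to a multiple of the cache line size. */
static size_t cache_line_round_up(size_t n)
{
	const size_t mask = (size_t)L1_CACHE_BYTES - 1;

	assert(n <= SIZE_MAX - mask);   /* n + mask must not overflow */

	/* Works because the line size is a power of two. */
	return (n + mask) & ~mask;
}

/* Hypothetical helper: cache lines needed for n bytes starting line-aligned. */
static size_t cache_lines_for(size_t n)
{
	return cache_line_round_up(n) >> L1_CACHE_SHIFT;
}
```

For instance, a 100-byte object that starts on a line boundary rounds up to 128 bytes, i.e. four 32-byte lines.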

[PATCH -mm] Blackfin arch: fix reboot kernel mounting spi flash print error bug

2007-03-26 Thread Wu, Bryan

Hi folks,

This patch fixes a bug where an error is printed when rebooting a kernel
that has an SPI flash mounted.

Signed-off-by: Bryan Wu <[EMAIL PROTECTED]> 
---

 arch/blackfin/mach-bf533/boards/cm_bf533.c  |2 +-
 arch/blackfin/mach-bf533/boards/ezkit.c |2 +-
 arch/blackfin/mach-bf533/boards/stamp.c |2 +-
 arch/blackfin/mach-bf537/boards/cm_bf537.c  |2 +-
 arch/blackfin/mach-bf537/boards/generic_board.c |2 +-
 arch/blackfin/mach-bf537/boards/pnav10.c|2 +-
 arch/blackfin/mach-bf537/boards/stamp.c |2 +-
 arch/blackfin/mach-bf561/boards/cm_bf561.c  |2 +-
 8 files changed, 8 insertions(+), 8 deletions(-)

Index: linux-2.6/arch/blackfin/mach-bf533/boards/cm_bf533.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf533/boards/cm_bf533.c
+++ linux-2.6/arch/blackfin/mach-bf533/boards/cm_bf533.c
@@ -57,7 +57,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
Index: linux-2.6/arch/blackfin/mach-bf533/boards/ezkit.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf533/boards/ezkit.c
+++ linux-2.6/arch/blackfin/mach-bf533/boards/ezkit.c
@@ -91,7 +91,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
Index: linux-2.6/arch/blackfin/mach-bf533/boards/stamp.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf533/boards/stamp.c
+++ linux-2.6/arch/blackfin/mach-bf533/boards/stamp.c
@@ -114,7 +114,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
Index: linux-2.6/arch/blackfin/mach-bf537/boards/cm_bf537.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf537/boards/cm_bf537.c
+++ linux-2.6/arch/blackfin/mach-bf537/boards/cm_bf537.c
@@ -59,7 +59,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
Index: linux-2.6/arch/blackfin/mach-bf537/boards/generic_board.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf537/boards/generic_board.c
+++ linux-2.6/arch/blackfin/mach-bf537/boards/generic_board.c
@@ -259,7 +259,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
Index: linux-2.6/arch/blackfin/mach-bf537/boards/pnav10.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf537/boards/pnav10.c
+++ linux-2.6/arch/blackfin/mach-bf537/boards/pnav10.c
@@ -244,7 +244,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
Index: linux-2.6/arch/blackfin/mach-bf537/boards/stamp.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf537/boards/stamp.c
+++ linux-2.6/arch/blackfin/mach-bf537/boards/stamp.c
@@ -293,7 +293,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
Index: linux-2.6/arch/blackfin/mach-bf561/boards/cm_bf561.c
===
--- linux-2.6.orig/arch/blackfin/mach-bf561/boards/cm_bf561.c
+++ linux-2.6/arch/blackfin/mach-bf561/boards/cm_bf561.c
@@ -58,7 +58,7 @@
.offset = 0x2
},{
.name = "file system",
-   .size = 0x30,
+   .size = 0x70,
.offset = 0x0010,
}
 };
_

Thanks
-Bryan
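Since every hunk above changes the .size of the "file system" partition, a small consistency check over a partition table may be a useful illustration. This is a userspace sketch only: struct part is a simplified, hypothetical stand-in for a flash partition entry, and any sizes passed to it are the caller's own values. The hex constants in the archived patch appear truncated and are not reconstructed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a flash partition entry. */
struct part {
	const char *name;
	uint32_t size;     /* length in bytes */
	uint32_t offset;   /* start offset in bytes */
};

/*
 * Returns true if the partitions are listed in ascending order,
 * do not overlap, and all end within flash_size bytes.
 */
static bool parts_fit(const struct part *p, size_t n, uint32_t flash_size)
{
	uint64_t prev_end = 0;
	size_t i;

	assert(p != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* 64-bit sum so offset + size cannot wrap. */
		uint64_t end = (uint64_t)p[i].offset + p[i].size;

		if (p[i].offset < prev_end)
			return false;   /* overlaps the previous entry */
		if (end > flash_size)
			return false;   /* runs past the end of the device */
		prev_end = end;
	}
	return true;
}
```

A board file author could run a check like this against the intended table and the capacity of the fitted flash part before changing a .size value.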