date:20071108

[PATCH mm] unionfs: clear partial read

2007-11-08 Thread Hugh Dickins

unionfs_do_readpage forgot to clear the rest of the page when vfs_read
does not fill the page: fix that.

Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]>
---

 fs/unionfs/mmap.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- 2.6.24-rc1-mm1/fs/unionfs/mmap.c 2007-11-04 13:48:02.0 +
+++ linux/fs/unionfs/mmap.c 2007-11-06 13:51:02.0 +
@@ -176,7 +176,8 @@ static int unionfs_do_readpage(struct fi
 	err = vfs_read(lower_file, page_data, PAGE_CACHE_SIZE,
 		       &lower_file->f_pos);
 	set_fs(old_fs);
-
+	if (err >= 0 && err < PAGE_CACHE_SIZE)
+		memset(page_data + err, 0, PAGE_CACHE_SIZE - err);
 	kunmap(page);
 
 	if (err < 0)
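
For readers outside the kernel tree, the idea of the fix can be sketched in
plain userspace C. This is only an illustration under stated assumptions:
PAGE_SIZE_SKETCH, clear_partial_read() and fill_page() are hypothetical names
standing in for PAGE_CACHE_SIZE and the vfs_read() call site, and the "read"
is simulated with memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_SKETCH 4096	/* hypothetical stand-in for PAGE_CACHE_SIZE */

/*
 * Zero the unread tail of a page-sized buffer.
 * nread mimics a read return value: negative on error, otherwise the
 * number of bytes actually placed in buf.
 */
static void clear_partial_read(char *buf, long nread)
{
	if (nread >= 0 && nread < PAGE_SIZE_SKETCH)
		memset(buf + nread, 0, PAGE_SIZE_SKETCH - (size_t)nread);
}

/* Simulated short read: copy at most one page from src, then clear the rest. */
static long fill_page(char *page, const char *src, size_t len)
{
	size_t n = len < PAGE_SIZE_SKETCH ? len : PAGE_SIZE_SKETCH;

	assert(page && src);
	memcpy(page, src, n);
	clear_partial_read(page, (long)n);
	return (long)n;
}
```

Without the clearing step, whatever stale bytes were in the buffer beyond the
short read would remain visible to anyone mapping the page.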
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [BUG]: Crash with CONFIG_FAIR_CGROUP_SCHED=y

2007-11-08 Thread Srivatsa Vaddagiri

On Thu, Nov 08, 2007 at 03:48:05PM -0800, [EMAIL PROTECTED] wrote:
> With CONFIG_FAIR_CGROUP_SCHED=y, following commands on 2.6.24-rc1 crash
> the system.

Thanks for reporting the problem. It is caused by the fact that the
current task isn't kept in its runqueue in the case of sched_fair class
tasks.

With the patch below, I could run ns_exec w/o any crash. Can you pls
verify it works for you as well?

Ingo,
Once Suka verifies that the patch fixes his crash, I would request you 
to include the same in your tree and route it to Linus.

--

current task is not present in its runqueue in case of sched_fair class
tasks. Take care of this fact in rt_mutex_setprio(),
sched_setscheduler() and sched_move_task() routines.

Signed-off-by : Srivatsa Vaddagiri <[EMAIL PROTECTED]>


---
 kernel/sched.c |   45 +
 1 files changed, 25 insertions(+), 20 deletions(-)

Index: current/kernel/sched.c
===
--- current.orig/kernel/sched.c
+++ current/kernel/sched.c
@@ -3986,11 +3986,13 @@ void rt_mutex_setprio(struct task_struct
 	oldprio = p->prio;
 	on_rq = p->se.on_rq;
 	running = task_running(rq, p);
-	if (on_rq) {
+	if (on_rq)
 		dequeue_task(rq, p, 0);
-		if (running)
-			p->sched_class->put_prev_task(rq, p);
-	}
+	/* current task is not kept in its runqueue in case of sched_fair class.
+	 * Hence we need the 'on_rq?' and 'running?' tests to be separate.
+	 */
+	if (running)
+		p->sched_class->put_prev_task(rq, p);
 
 	if (rt_prio(prio))
 		p->sched_class = &rt_sched_class;
@@ -3999,9 +4001,9 @@ void rt_mutex_setprio(struct task_struct
 
 	p->prio = prio;
 
+	if (running)
+		p->sched_class->set_curr_task(rq);
 	if (on_rq) {
-		if (running)
-			p->sched_class->set_curr_task(rq);
 		enqueue_task(rq, p, 0);
 		inc_load(rq, p);
 		/*
@@ -4298,18 +4300,20 @@ recheck:
 	update_rq_clock(rq);
 	on_rq = p->se.on_rq;
 	running = task_running(rq, p);
-	if (on_rq) {
+	if (on_rq)
 		deactivate_task(rq, p, 0);
-		if (running)
-			p->sched_class->put_prev_task(rq, p);
-	}
+	/* current task is not kept in its runqueue in case of sched_fair class.
+	 * Hence we need the 'on_rq?' and 'running?' tests to be separate.
+	 */
+	if (running)
+		p->sched_class->put_prev_task(rq, p);
 
 	oldprio = p->prio;
 	__setscheduler(rq, p, policy, param->sched_priority);
 
+	if (running)
+		p->sched_class->set_curr_task(rq);
 	if (on_rq) {
-		if (running)
-			p->sched_class->set_curr_task(rq);
 		activate_task(rq, p, 0);
 		/*
 		 * Reschedule if we are currently running on this runqueue and
@@ -7036,19 +7040,20 @@ void sched_move_task(struct task_struct 
 	running = task_running(rq, tsk);
 	on_rq = tsk->se.on_rq;
 
-	if (on_rq) {
+	if (on_rq)
 		dequeue_task(rq, tsk, 0);
-		if (unlikely(running))
-			tsk->sched_class->put_prev_task(rq, tsk);
-	}
+	/* current task is not kept in its runqueue in case of sched_fair class.
+	 * Hence we need the 'on_rq?' and 'running?' tests to be separate.
+	 */
+	if (unlikely(running))
+		tsk->sched_class->put_prev_task(rq, tsk);
 
 	set_task_cfs_rq(tsk);
 
-	if (on_rq) {
-		if (unlikely(running))
-			tsk->sched_class->set_curr_task(rq);
+	if (unlikely(running))
+		tsk->sched_class->set_curr_task(rq);
+	if (on_rq)
 		enqueue_task(rq, tsk, 0);
-	}
 
 done:
 	task_rq_unlock(rq, &flags);
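
To make the control-flow change easier to see outside of sched.c, here is a
toy model of the pattern. Everything in it is hypothetical (struct toy_task,
the toy_* helpers and the call counters are invented for illustration and are
not scheduler APIs); it only shows that the 'running?' bookkeeping now happens
even when the task is not queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much-simplified stand-in for a task's scheduler state. */
struct toy_task {
	bool on_rq;		/* queued in the runqueue */
	bool running;		/* currently executing on the CPU */
	int prio;
	int put_prev_calls;	/* how often put_prev was invoked */
	int set_curr_calls;	/* how often set_curr was invoked */
};

static void toy_dequeue(struct toy_task *t)  { t->on_rq = false; }
static void toy_enqueue(struct toy_task *t)  { t->on_rq = true; }
static void toy_put_prev(struct toy_task *t) { t->put_prev_calls++; }
static void toy_set_curr(struct toy_task *t) { t->set_curr_calls++; }

/* Change priority with the two tests kept independent, as in the patch. */
static void toy_setprio(struct toy_task *t, int prio)
{
	bool on_rq, running;

	assert(t);
	on_rq = t->on_rq;
	running = t->running;

	if (on_rq)
		toy_dequeue(t);
	if (running)		/* no longer nested under 'if (on_rq)' */
		toy_put_prev(t);

	t->prio = prio;

	if (running)
		toy_set_curr(t);
	if (on_rq)
		toy_enqueue(t);
}
```

With the old nesting, a task that is running but not on the runqueue would
have skipped both toy_put_prev() and toy_set_curr() in this model.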


-- 
Regards,
vatsa

[2.6 patch] x86 pci-dma_64.c: cleanups

2007-11-08 Thread Adrian Bunk

This patch contains the following cleanups:
- make the needlessly global iommu_setup() static
- remove the unused EXPORT_SYMBOL(iommu_merge)

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 arch/x86/kernel/pci-dma_64.c |3 +--
 include/asm-x86/pci_64.h |1 -
 2 files changed, 1 insertion(+), 3 deletions(-)

90fe7f396f3428a83dd5301f3d9622e32fb3bac5 
diff --git a/arch/x86/kernel/pci-dma_64.c b/arch/x86/kernel/pci-dma_64.c
index aa805b1..516f85f 100644
--- a/arch/x86/kernel/pci-dma_64.c
+++ b/arch/x86/kernel/pci-dma_64.c
@@ -13,7 +13,6 @@
 #include 
 
 int iommu_merge __read_mostly = 1;
-EXPORT_SYMBOL(iommu_merge);
 
 dma_addr_t bad_dma_address __read_mostly;
 EXPORT_SYMBOL(bad_dma_address);
@@ -230,7 +229,7 @@ EXPORT_SYMBOL(dma_set_mask);
  * See  for the iommu kernel parameter
  * documentation.
  */
-__init int iommu_setup(char *p)
+static __init int iommu_setup(char *p)
 {
iommu_merge = 1;
 
diff --git a/include/asm-x86/pci_64.h b/include/asm-x86/pci_64.h
index ef54226..3746903 100644
--- a/include/asm-x86/pci_64.h
+++ b/include/asm-x86/pci_64.h
@@ -26,7 +26,6 @@ extern int (*pci_config_write)(int seg, int bus, int dev, int fn, int reg, int l
 
 
 extern void pci_iommu_alloc(void);
-extern int iommu_setup(char *opt);
 
 /* The PCI address space does equal the physical memory
  * address space.  The networking and block device layers use


Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-08 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Hugh Dickins writes:
> [Dave, I've Cc'ed you re handle_write_count_underflow, see below.]
> 
> On Wed, 31 Oct 2007, Erez Zadok wrote:
> > 
> > Hi Hugh, I've addressed all of your concerns and am happy to report that the
> > newly revised unionfs_writepage works even better, including under my
> > memory-pressure conditions.  To summarize my changes since the last time:
> > 
> > - I'm only masking __GFP_FS, not __GFP_IO
> > - using find_or_create_page to avoid locking issues around mapping mask
> > - handle for_reclaim case more efficiently
> > - using copy_highpage so we handle KM_USER*
> > - un/locking upper/lower page as/when needed
> > - updated comments to clarify what/why
> > - unionfs_sync_page: gone (yes, vfs.txt did confuse me, plus ecryptfs used
> >   to have it)
> > 
> > Below is the newest version of unionfs_writepage.  Let me know what you
> > think.
> > 
> > I have to say that with these changes, unionfs appears visibly faster under
> > memory pressure.  I suspect the for_reclaim handling is probably the largest
> > contributor to this speedup.
> 
> That's good news, and that unionfs_writepage looks good to me -
> with three reservations I've not observed before.
> 
> One, I think you would be safer to do a set_page_dirty(lower_page)
> before your clear_page_dirty_for_io(lower_page).  I know that sounds
> silly, but see Linus' "Yes, Virginia" comment in clear_page_dirty_for_io:
> there's a lot of subtlety hereabouts, and I think you'd be mimicking the
> usual path closer if you set_page_dirty first - there's nothing else
> doing it on that lower_page, is there?  I'm not certain that you need
> to, but I think you'd do well to look into it and make up your own mind.

Hugh, my code looks like:

	if (wbc->for_reclaim) {
		set_page_dirty(lower_page);
		unlock_page(lower_page);
		goto out_release;
	}
	BUG_ON(!lower_mapping->a_ops->writepage);
	clear_page_dirty_for_io(lower_page); /* emulate VFS behavior */
	err = lower_mapping->a_ops->writepage(lower_page, wbc);

Do you mean I should set_page_dirty(lower_page) unconditionally before
clear_page_dirty_for_io?  (I already do that in the 'if' statement above it.)

> Two, I'm unsure of the way you're clearing or setting PageUptodate on
> the upper page there.  The rules for PageUptodate are fairly obvious
> when reading, but when a write fails, it's not so obvious.  Again, I'm
> not saying what you've got is wrong (it may be unavoidable, to keep
> synch between lower and upper), but it deserves a second thought.

I looked at all mainline filesystems' ->writepage to see what, if anything, they
do with their page's uptodate flag.  Most f/s don't touch the flag one way
or another.

cifs_writepage sets the uptodate flag unconditionally: why?

ecryptfs_writepage has a legit reason: if encrypting the page failed, it
doesn't want anyone to use it, so it clears its page's uptodate flag (else
it sets it as uptodate).

hostfs_writepage clears pageuptodate if it failed to write_file(), which I'm
not sure if it makes sense or not.

ntfs_writepage goes as far as doing BUG_ON(!PageUptodate(page)) which
indicates to me that the page passed to ->writepage should always be
uptodate.  Is that a fair statement?

smb_writepage pretty much unconditionally calls SetPageUptodate(page).  Why?

Is there a reason smbfs and cifs both do this unconditionally?  If so, then
why is ntfs calling BUG_ON if the page isn't uptodate?  Either that BUG_ON
in ntfs is redundant, or cifs/smbfs's SetPageUptodate is redundant, but they
can't both be right.

And finally, unionfs clears the uptodate flag on error from the lower
->writepage, and otherwise sets the flag on success from the lower
->writepage.  My gut feeling is that unionfs shouldn't change the page
uptodate flag at all: if the VFS passes unionfs_writepage a page which isn't
uptodate, then the VFS has a serious problem b/c it'd be asking a f/s to
write out a page which isn't up-to-date, right?  Otherwise, whether
unionfs_writepage manages to write the lower page or not, why should that
invalidate the state of the unionfs page itself?  Come to think of it, I
think clearing pageuptodate on error from ->writepage(lower_page) may be
bad.  Imagine if after such a failed unionfs_writepage, I get a
unionfs_readpage: that ->readpage will get data from the lower f/s page and
copy it *over* the unionfs page, even if the upper page's data was more
recent prior to the failed call to unionfs_writepage.  IOW, we could be
reverting a user-visible mmap'ed page to a previous on-disk version.  What
do you think: could this happen?  Anyway, I'll run some exhaustive testing
next and see what happens if I don't set/clear the uptodate flag in
unionfs_writepage.
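
The revert scenario described above can be sketched with a toy model. This is
not unionfs code: struct toy_page, toy_writepage() and toy_readpage() are
hypothetical, and the model assumes that a read refills the upper page from
the lower one only when the upper page is not uptodate.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE 16	/* hypothetical tiny "page" */

struct toy_page {
	char data[TOY_PAGE];
	bool uptodate;
};

/*
 * Copy the upper page to the lower one. On a (simulated) lower write
 * failure, optionally clear the upper page's uptodate flag.
 */
static int toy_writepage(struct toy_page *upper, struct toy_page *lower,
			 bool lower_write_fails, bool clear_on_error)
{
	assert(upper && lower);
	if (lower_write_fails) {
		if (clear_on_error)
			upper->uptodate = false;
		return -1;
	}
	memcpy(lower->data, upper->data, TOY_PAGE);
	return 0;
}

/* Refill the upper page from the lower one only if it is not uptodate. */
static void toy_readpage(struct toy_page *upper, const struct toy_page *lower)
{
	if (upper->uptodate)
		return;
	memcpy(upper->data, lower->data, TOY_PAGE);
	upper->uptodate = true;
}
```

In this model, clearing the flag after a failed write lets the next read
replace newer upper data with the older lower copy; leaving the flag alone
keeps the newer data in place.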

> Three, I believe you need to add a flush_dcache_page(lower_page)
> after the copy_highpage(lower_page): some architectures will need
> that to see the new data if they have lower_page mapped

[PATCH] make ds1wm driver to check ds1wm_platform_data pointer against NULL

2007-11-08 Thread eric miao

Do a sanity check on the "struct ds1wm_platform_data" pointer passed in
by the platform_device, so that each platform is forced to provide a
valid structure.

Signed-off-by: eric miao <[EMAIL PROTECTED]>
---
 drivers/w1/masters/ds1wm.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/w1/masters/ds1wm.c b/drivers/w1/masters/ds1wm.c
index 5747997..11ce9ec 100644
--- a/drivers/w1/masters/ds1wm.c
+++ b/drivers/w1/masters/ds1wm.c
@@ -351,6 +351,10 @@ static int ds1wm_probe(struct platform_device *pdev)
 		goto err0;
 	}
 	plat = pdev->dev.platform_data;
+	if (!plat) {
+		ret = -ENXIO;
+		goto err0;
+	}
 	ds1wm_data->bus_shift = plat->bus_shift;
 	ds1wm_data->pdev = pdev;
 	ds1wm_data->pdata = plat;
-- 
1.5.2.5.GIT
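
A standalone sketch of the same guard, for illustration only. The toy_*
structures and toy_probe() are hypothetical stand-ins, not the real
platform_device or ds1wm definitions; the point is simply that the probe
fails with -ENXIO instead of dereferencing a NULL platform data pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the platform device structures. */
struct toy_platform_data { int bus_shift; };
struct toy_device { void *platform_data; };
struct toy_driver_state { int bus_shift; };

static int toy_probe(const struct toy_device *dev, struct toy_driver_state *st)
{
	const struct toy_platform_data *plat;

	assert(dev && st);
	plat = dev->platform_data;
	if (!plat)
		return -ENXIO;	/* refuse to bind rather than dereference NULL */

	st->bus_shift = plat->bus_shift;
	return 0;
}
```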

[2.6 patch] x86 pci-calgary_64.c: make a variable static

2007-11-08 Thread Adrian Bunk

"debugging" is a horrible name for a global variable - thankfully it can 
become static.

Also take it out of __read_mostly so that gcc no longer has to emit it
at all.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 arch/x86/kernel/pci-calgary_64.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

22f49393bbb5d8e73f94338be5d1ad52f459779c 
diff --git a/arch/x86/kernel/pci-calgary_64.c b/arch/x86/kernel/pci-calgary_64.c
index 6bf1f71..a262f94 100644
--- a/arch/x86/kernel/pci-calgary_64.c
+++ b/arch/x86/kernel/pci-calgary_64.c
@@ -183,7 +183,7 @@ static struct calgary_bus_info bus_info[MAX_PHB_BUS_NUM] = { { NULL, 0, 0 }, };
 
 /* enable this to stress test the chip's TCE cache */
 #ifdef CONFIG_IOMMU_DEBUG
-int debugging __read_mostly = 1;
+static int debugging = 1;
 
 static inline unsigned long verify_bit_range(unsigned long* bitmap,
int expected, unsigned long start, unsigned long end)
@@ -202,7 +202,7 @@ static inline unsigned long verify_bit_range(unsigned long* bitmap,
return ~0UL;
 }
 #else /* debugging is disabled */
-int debugging __read_mostly = 0;
+static int debugging = 0;
 
 static inline unsigned long verify_bit_range(unsigned long* bitmap,
int expected, unsigned long start, unsigned long end)


[2.6 patch] x86 kprobes_64.c: make 3 functions static

2007-11-08 Thread Adrian Bunk

This patch makes the following needlessly global functions static:
- kprobe_handler()
- trampoline_probe_handler()
- post_kprobe_handler()

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 arch/x86/kernel/kprobes_64.c |7 ---
 include/asm-x86/kprobes_64.h |2 --
 2 files changed, 4 insertions(+), 5 deletions(-)

dec0510a1f75dce9dbdb75458fd870fba14bd7b4 
diff --git a/arch/x86/kernel/kprobes_64.c b/arch/x86/kernel/kprobes_64.c
index 3db3611..7abfc8a 100644
--- a/arch/x86/kernel/kprobes_64.c
+++ b/arch/x86/kernel/kprobes_64.c
@@ -278,7 +278,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
*sara = (unsigned long) &kretprobe_trampoline;
 }
 
-int __kprobes kprobe_handler(struct pt_regs *regs)
+static int __kprobes kprobe_handler(struct pt_regs *regs)
 {
struct kprobe *p;
int ret = 0;
@@ -395,7 +395,8 @@ no_kprobe:
 /*
  * Called when we hit the probe point at kretprobe_trampoline
  */
-int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
+static int __kprobes trampoline_probe_handler(struct kprobe *p,
+ struct pt_regs *regs)
 {
struct kretprobe_instance *ri = NULL;
struct hlist_head *head, empty_rp;
@@ -536,7 +537,7 @@ static void __kprobes resume_execution(struct kprobe *p,
}
 }
 
-int __kprobes post_kprobe_handler(struct pt_regs *regs)
+static int __kprobes post_kprobe_handler(struct pt_regs *regs)
 {
struct kprobe *cur = kprobe_running();
struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
diff --git a/include/asm-x86/kprobes_64.h b/include/asm-x86/kprobes_64.h
index 53f4d85..497dad1 100644
--- a/include/asm-x86/kprobes_64.h
+++ b/include/asm-x86/kprobes_64.h
@@ -81,9 +81,7 @@ static inline void restore_interrupts(struct pt_regs *regs)
local_irq_enable();
 }
 
-extern int post_kprobe_handler(struct pt_regs *regs);
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
-extern int kprobe_handler(struct pt_regs *regs);
 
 extern int kprobe_exceptions_notify(struct notifier_block *self,
unsigned long val, void *data);


[2.6 patch] x86: nmi_64.c: make code static

2007-11-08 Thread Adrian Bunk

This patch makes the following needlessly global code static:
- panic_on_timeout
- setup_nmi_watchdog()

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 arch/x86/kernel/nmi_64.c |4 ++--
 include/asm-x86/nmi_64.h |2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

26ae99846d73a4a501f30a667f27a522f1b4d893 
diff --git a/arch/x86/kernel/nmi_64.c b/arch/x86/kernel/nmi_64.c
index a576fd7..ce08111 100644
--- a/arch/x86/kernel/nmi_64.c
+++ b/arch/x86/kernel/nmi_64.c
@@ -39,7 +39,7 @@ static cpumask_t backtrace_mask = CPU_MASK_NONE;
  *  0: the lapic NMI watchdog is disabled, but can be enabled
  */
 atomic_t nmi_active = ATOMIC_INIT(0);  /* oprofile uses this */
-int panic_on_timeout;
+static int panic_on_timeout;
 
 unsigned int nmi_watchdog = NMI_DEFAULT;
 static unsigned int nmi_hz = HZ;
@@ -135,7 +135,7 @@ int __init check_nmi_watchdog (void)
return 0;
 }
 
-int __init setup_nmi_watchdog(char *str)
+static int __init setup_nmi_watchdog(char *str)
 {
int nmi;
 
diff --git a/include/asm-x86/nmi_64.h b/include/asm-x86/nmi_64.h
index 65b6acf..bc997f9 100644
--- a/include/asm-x86/nmi_64.h
+++ b/include/asm-x86/nmi_64.h
@@ -41,7 +41,6 @@ extern void die_nmi(char *str, struct pt_regs *regs, int do_panic);
 
 #define get_nmi_reason() inb(0x61)
 
-extern int panic_on_timeout;
 extern int unknown_nmi_panic;
 extern int nmi_watchdog_enabled;
 
@@ -60,7 +59,6 @@ extern void enable_timer_nmi_watchdog(void);
 extern int nmi_watchdog_tick (struct pt_regs * regs, unsigned reason);
 
 extern void nmi_watchdog_default(void);
-extern int setup_nmi_watchdog(char *);
 
 extern atomic_t nmi_active;
 extern unsigned int nmi_watchdog;


[2.6 patch] x86_64: remove acpi_pci_link_exit()

2007-11-08 Thread Adrian Bunk

acpi_pci_link_exit() is both unused and empty.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
16d6853a1facb9bcb7a4bc19daad6d3b852cace3 
diff --git a/arch/x86/kernel/acpi/sleep_64.c b/arch/x86/kernel/acpi/sleep_64.c
index 79475d2..da42de2 100644
--- a/arch/x86/kernel/acpi/sleep_64.c
+++ b/arch/x86/kernel/acpi/sleep_64.c
@@ -115,6 +115,3 @@ static int __init acpi_sleep_setup(char *str)
 
 __setup("acpi_sleep=", acpi_sleep_setup);
 
-void acpi_pci_link_exit(void)
-{
-}


[2.6 patch] x86 mce_64.c: make struct mcelog static

2007-11-08 Thread Adrian Bunk

This patch makes the needlessly global struct mcelog static.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
d4c45993fc617ff0f56e3280fcccb7159b31e9e8 
diff --git a/arch/x86/kernel/cpu/mcheck/mce_64.c b/arch/x86/kernel/cpu/mcheck/mce_64.c
index b9f802e..986bf15 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_64.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_64.c
@@ -63,7 +63,7 @@ static DECLARE_WAIT_QUEUE_HEAD(mce_wait);
  * separate MCEs from kernel messages to avoid bogus bug reports.
  */
 
-struct mce_log mcelog = {
+static struct mce_log mcelog = {
MCE_LOG_SIGNATURE,
MCE_LOG_LEN,
 };


[2.6 patch] x86 e820_64.c: make 2 functions static

2007-11-08 Thread Adrian Bunk

This patch makes the following needlessly global functions static:
- e820_print_map()
- early_panic()

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 arch/x86/kernel/e820_64.c |4 ++--
 include/asm-x86/e820_64.h |1 -
 2 files changed, 2 insertions(+), 3 deletions(-)

a57254863b322df7232405b71e4ce5a792143b36 
diff --git a/arch/x86/kernel/e820_64.c b/arch/x86/kernel/e820_64.c
index 04698e0..182586a 100644
--- a/arch/x86/kernel/e820_64.c
+++ b/arch/x86/kernel/e820_64.c
@@ -363,7 +363,7 @@ unsigned long __init e820_hole_size(unsigned long start, unsigned long end)
return end - start - (ram << PAGE_SHIFT);
 }
 
-void __init e820_print_map(char *who)
+static void __init e820_print_map(char *who)
 {
int i;
 
@@ -587,7 +587,7 @@ static int __init copy_e820_map(struct e820entry * biosmap, int nr_map)
return 0;
 }
 
-void early_panic(char *msg)
+static void early_panic(char *msg)
 {
early_printk(msg);
panic(msg);
diff --git a/include/asm-x86/e820_64.h b/include/asm-x86/e820_64.h
index 0bd4787..e535e60 100644
--- a/include/asm-x86/e820_64.h
+++ b/include/asm-x86/e820_64.h
@@ -21,7 +21,6 @@ extern void contig_e820_setup(void);
 extern unsigned long e820_end_of_ram(void);
 extern void e820_reserve_resources(void);
 extern void e820_mark_nosave_regions(void);
-extern void e820_print_map(char *who);
 extern int e820_any_mapped(unsigned long start, unsigned long end, unsigned type);
 extern int e820_all_mapped(unsigned long start, unsigned long end, unsigned type);
 extern unsigned long e820_hole_size(unsigned long start, unsigned long end);


[RFC: 2.6 patch] remove saa7134-oss

2007-11-08 Thread Adrian Bunk

The saa7134-oss driver has been deprecated for quite some time, it's the
only remaining OSS user outside of sound/oss/, and considering how few and
what kind of soundcards are left supported by OSS I hardly see any use
cases left.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 drivers/media/video/saa7134/Kconfig|   12 
 drivers/media/video/saa7134/Makefile   |1 
 drivers/media/video/saa7134/saa7134-alsa.c |   16 
 drivers/media/video/saa7134/saa7134-oss.c  | 1046 -
 4 files changed, 3 insertions(+), 1072 deletions(-)

af04de650322637130a9187df916572df374 
diff --git a/drivers/media/video/saa7134/Kconfig b/drivers/media/video/saa7134/Kconfig
index 3aa8cb2..7cfecb3 100644
--- a/drivers/media/video/saa7134/Kconfig
+++ b/drivers/media/video/saa7134/Kconfig
@@ -23,18 +23,6 @@ config VIDEO_SAA7134_ALSA
  To compile this driver as a module, choose M here: the
  module will be called saa7134-alsa.
 
-config VIDEO_SAA7134_OSS
-   tristate "Philips SAA7134 DMA audio support (OSS, DEPRECATED)"
-   depends on VIDEO_SAA7134 && SOUND_PRIME && !VIDEO_SAA7134_ALSA
-   ---help---
- This is a video4linux driver for direct (DMA) audio in
- Philips SAA713x based TV cards using OSS
-
- This is deprecated in favor of the ALSA module
-
- To compile this driver as a module, choose M here: the
- module will be called saa7134-oss.
-
 config VIDEO_SAA7134_DVB
tristate "DVB/ATSC Support for saa7134 based TV cards"
depends on VIDEO_SAA7134 && DVB_CORE
diff --git a/drivers/media/video/saa7134/Makefile b/drivers/media/video/saa7134/Makefile
index c85c8a8..9aff937 100644
--- a/drivers/media/video/saa7134/Makefile
+++ b/drivers/media/video/saa7134/Makefile
@@ -7,7 +7,6 @@ obj-$(CONFIG_VIDEO_SAA7134) +=  saa7134.o saa7134-empress.o \
saa6752hs.o
 
 obj-$(CONFIG_VIDEO_SAA7134_ALSA) += saa7134-alsa.o
-obj-$(CONFIG_VIDEO_SAA7134_OSS) += saa7134-oss.o
 
 obj-$(CONFIG_VIDEO_SAA7134_DVB) += saa7134-dvb.o
 
diff --git a/drivers/media/video/saa7134/saa7134-alsa.c b/drivers/media/video/saa7134/saa7134-alsa.c
index b9c5cf7..a71fc88 100644
--- a/drivers/media/video/saa7134/saa7134-alsa.c
+++ b/drivers/media/video/saa7134/saa7134-alsa.c
@@ -1069,24 +1069,14 @@ static int saa7134_alsa_init(void)
struct saa7134_dev *dev = NULL;
struct list_head *list;
 
-   if (!saa7134_dmasound_init && !saa7134_dmasound_exit) {
-   saa7134_dmasound_init = alsa_device_init;
-   saa7134_dmasound_exit = alsa_device_exit;
-   } else {
-   printk(KERN_WARNING "saa7134 ALSA: can't load, DMA sound handler already assigned (probably to OSS)\n");
-   return -EBUSY;
-   }
+   saa7134_dmasound_init = alsa_device_init;
+   saa7134_dmasound_exit = alsa_device_exit;
 
printk(KERN_INFO "saa7134 ALSA driver for DMA sound loaded\n");
 
list_for_each(list,&saa7134_devlist) {
dev = list_entry(list, struct saa7134_dev, devlist);
-   if (dev->dmasound.priv_data == NULL) {
-   alsa_device_init(dev);
-   } else {
-   printk(KERN_ERR "saa7134 ALSA: DMA sound is being handled by OSS. ignoring %s\n",dev->name);
-   return -EBUSY;
-   }
+   alsa_device_init(dev);
}
 
if (dev == NULL)
diff --git a/drivers/media/video/saa7134/saa7134-oss.c b/drivers/media/video/saa7134/saa7134-oss.c
deleted file mode 100644
index aedf046..000
--- a/drivers/media/video/saa7134/saa7134-oss.c
+++ /dev/null
@@ -1,1046 +0,0 @@
-/*
- *
- * device driver for philips saa7134 based TV cards
- * oss dsp interface
- *
- * (c) 2001,02 Gerd Knorr <[EMAIL PROTECTED]> [SuSE Labs]
- * 2005 conversion to standalone module:
- * Ricardo Cerqueira <[EMAIL PROTECTED]>
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; either version 2 of the License, or
- *  (at your option) any later version.
- *
- *  This program is distributed in the hope that it will be useful,
- *  but WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- *  GNU General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License
- *  along with this program; if not, write to the Free Software
- *  Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- */
-
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include "saa7134-reg.h"
-#include "saa7134.h"
-
-/* -- */
-
-static unsigned int debug  = 0;
-module_param(debug, int, 0644);
-MODULE_PARM_DESC(debug,"enable debug messages [oss]");
-
-static

[2.6 patch] remove additional pci_scan_child_bus() prototype

2007-11-08 Thread Adrian Bunk

There's already a prototype for pci_scan_child_bus() at the correct 
place in pci.h, so there's no reason for an additional one.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
23fdba3eb3ed7b531fbe005a810d10679b113d2e 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index de33a02..4b73d57 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -486,8 +486,6 @@ static void pci_fixup_parent_subordinate_busnr(struct pci_bus *child, int max)
}
 }
 
-unsigned int pci_scan_child_bus(struct pci_bus *bus);
-
 /*
  * If it's a bridge, configure it and scan the bus behind it.
  * For CardBus bridges, we don't scan behind as the devices will


[2.6 patch] BLK_DEV_IDECD help: remove outdated note

2007-11-08 Thread Adrian Bunk

LILO version 16 was released on 26-02-1995 (sic), so telling people to 
not use older versions no longer has any value.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
b79ad8a00e9bc0008b42eafd5863d758109d0594 
diff --git a/drivers/ide/Kconfig b/drivers/ide/Kconfig
index d1e8df1..e445fe6 100644
--- a/drivers/ide/Kconfig
+++ b/drivers/ide/Kconfig
@@ -203,10 +203,6 @@ config BLK_DEV_IDECD
  CD-ROM drive, you can say N to all other CD-ROM options, but be sure
  to say Y or M to "ISO 9660 CD-ROM file system support".
 
- Note that older versions of LILO (LInux LOader) cannot properly deal
- with IDE/ATAPI CD-ROMs, so install LILO 16 or higher, available from
- .
-
  To compile this driver as a module, choose M here: the
  module will be called ide-cd.
 


[2.6 patch] always export pci_scan_single_device

2007-11-08 Thread Adrian Bunk

This patch fixes the following build error with CONFIG_HOTPLUG=n:

<--  snip  -->

...
  MODPOST 2137 modules
ERROR: "pci_scan_single_device" [drivers/edac/i82875p_edac.ko] undefined!
make[2]: *** [__modpost] Error 1

<--  snip  -->

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 drivers/pci/probe.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

f80ae6568224bd5e4d2387ec162d37c8e7fd545a 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 463a5a9..de33a02 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1012,6 +1012,7 @@ struct pci_dev *pci_scan_single_device(struct pci_bus *bus, int devfn)
 
return dev;
 }
+EXPORT_SYMBOL(pci_scan_single_device);
 
 /**
  * pci_scan_slot - scan a PCI slot on a bus for devices.
@@ -1200,7 +1201,6 @@ EXPORT_SYMBOL(pci_add_new_bus);
 EXPORT_SYMBOL(pci_do_scan_bus);
 EXPORT_SYMBOL(pci_scan_slot);
 EXPORT_SYMBOL(pci_scan_bridge);
-EXPORT_SYMBOL(pci_scan_single_device);
 EXPORT_SYMBOL_GPL(pci_scan_child_bus);
 #endif
 
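
The hunk context suggests the old EXPORT_SYMBOL line sat inside a conditional
block that ends at the #endif above, so with CONFIG_HOTPLUG=n the symbol was
never exported. A rough userspace analogy, assuming a hypothetical export
table (struct toy_export, toy_lookup() and TOY_HOTPLUG are invented here and
are not the kernel's export mechanism):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of an export table; not the kernel's mechanism. */
struct toy_export {
	const char *name;
	void (*fn)(void);
};

static void scan_single_device(void) {}

#ifdef TOY_HOTPLUG
static void scan_slot(void) {}
#endif

static const struct toy_export toy_exports[] = {
	{ "scan_single_device", scan_single_device },	/* always exported */
#ifdef TOY_HOTPLUG
	{ "scan_slot", scan_slot },			/* only with hotplug */
#endif
};

/* Resolve a name the way a module loader would, or return NULL. */
static const struct toy_export *toy_lookup(const char *name)
{
	size_t i;

	assert(name);
	for (i = 0; i < sizeof(toy_exports) / sizeof(toy_exports[0]); i++)
		if (strcmp(toy_exports[i].name, name) == 0)
			return &toy_exports[i];
	return NULL;
}
```

Built without TOY_HOTPLUG, a lookup of "scan_single_device" still succeeds
because its entry sits outside the conditional, which is the effect the patch
is after for modules such as i82875p_edac.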


[2.6 patch] x86: acpi_pciprobe_dmi_table[] must be __devinitdata

2007-11-08 Thread Adrian Bunk

This patch fixes the following section mismatches with CONFIG_HOTPLUG=n:

<--  snip  -->

...
WARNING: vmlinux.o(.data+0x23640): Section mismatch: reference to .init.text.20:can_skip_ioresource_align (between 'acpi_pciprobe_dmi_table' and 'pcibios_irq_mask')
WARNING: vmlinux.o(.data+0x2366c): Section mismatch: reference to .init.text.20:can_skip_ioresource_align (between 'acpi_pciprobe_dmi_table' and 'pcibios_irq_mask')
WARNING: vmlinux.o(.data+0x23698): Section mismatch: reference to .init.text.20:can_skip_ioresource_align (between 'acpi_pciprobe_dmi_table' and 'pcibios_irq_mask')
...

<--  snip  -->

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---
3225b3c19396e0e45f496dfe82e85ebc86951d91 
diff --git a/arch/x86/pci/acpi.c b/arch/x86/pci/acpi.c
index 2d88f7c..a7536dc 100644
--- a/arch/x86/pci/acpi.c
+++ b/arch/x86/pci/acpi.c
@@ -13,7 +13,7 @@ static int __devinit can_skip_ioresource_align(const struct dmi_system_id *d)
return 0;
 }
 
-static struct dmi_system_id acpi_pciprobe_dmi_table[] = {
+static struct dmi_system_id acpi_pciprobe_dmi_table[] __devinitdata = {
 /*
  * Systems where PCI IO resource ISA alignment can be skipped
  * when the ISA enable bit in the bridge control is not set


[2.6 patch] remove references to net-modules.txt

2007-11-08 Thread Adrian Bunk

When I removed net-modules.txt because it only contained ancient 
information I missed that many Kconfig entries pointed to this ancient 
information.

Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>

---

 Documentation/networking/3c505.txt |3 
 drivers/net/Kconfig|  199 ++---
 drivers/net/arcnet/Kconfig |   15 --
 drivers/net/tulip/Kconfig  |   21 +--
 4 files changed, 82 insertions(+), 156 deletions(-)

6451db6bafae46e17885b8c7b243095fc257e8ed 
diff --git a/Documentation/networking/3c505.txt b/Documentation/networking/3c505.txt
index b9d5b72..72f38b1 100644
--- a/Documentation/networking/3c505.txt
+++ b/Documentation/networking/3c505.txt
@@ -14,8 +14,7 @@ If no base address is given at boot time, the driver will autoprobe
 ports 0x300, 0x280 and 0x310 (in that order).  If no IRQ is given, the driver
 will try to probe for it.
 
-The driver can be used as a loadable module.  See net-modules.txt for details
-of the parameters it can take.  
+The driver can be used as a loadable module.
 
 Theoretically, one instance of the driver can now run multiple cards,
 in the standard way (when loading a module, say "modprobe 3c505
diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 5f800a6..0fdcf72 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -365,8 +365,7 @@ config MAC89x0
  read the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- .  This module will
+ To compile this driver as a module, choose M here. This module will
  be called mac89x0.
 
 config MACSONIC
@@ -379,8 +378,7 @@ config MACSONIC
  one of these say Y and read the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- .  This module will
+ To compile this driver as a module, choose M here. This module will
  be called macsonic.
 
 config MACMACE
@@ -618,8 +616,7 @@ config EL1
  have problems.  Some people suggest to ping ("man ping") a nearby
  machine every minute ("man cron") when using this card.
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c501.
 
 config EL2
@@ -631,8 +628,7 @@ config EL2
  the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c503.
 
 config ELPLUS
@@ -644,8 +640,7 @@ config ELPLUS
  this type, say Y and read the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c505.
 
 config EL16
@@ -656,8 +651,7 @@ config EL16
  the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c507.
 
 config EL3
@@ -672,8 +666,7 @@ config EL3
  setup disk to disable Plug & Play mode, and to select the default
  media type.
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c509.
 
 config 3C515
@@ -684,8 +677,7 @@ config 3C515
  network card, say Y and read the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c515.
 
 config ELMC
@@ -696,8 +688,7 @@ config ELMC
  the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c523.
 
 config ELMC_II
@@ -708,8 +699,7 @@ config ELMC_II
  the Ethernet-HOWTO, available from
  .
 
- To compile this driver as a module, choose M here and read
- . The module
+ To compile this driver as a module, choose M here. The module
  will be called 3c527.
 
 config VORTEX
@@ -732,8 +722,7 @@ config VORTEX
   and in the comments at
  the begin

Re: Fwd: same problem with 2.6.24-rc2

2007-11-08 Thread Randy Dunlap

On Wed, 07 Nov 2007 23:05:32 -0800 Randy Dunlap wrote:

Hi Sam,

This is somewhat of a build regression... a confusing one to me.
Maybe you will know what it's up to.


There's also a kernel boot regression: something in
crypto/xor.c::calibrate_xor_blocks() finds a null pointer.
I can't reproduce it.  [more below]

Werner, for the machine that crashes during boot, please send us
the contents of /proc/cpuinfo.  Thanks.

(BTW, for anyone reading along, vger sees Werner's emails as spam,
so you may receive mail directly from him instead of seeing it on
lkml.)


Werner's kernel-build-script is a large multi-purpose script with
package building capability.  He reported the following build output:


> gcc -m32 -m elf_i386  /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o  -o /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile
> gcc: /usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile.o: No such file or directory
> gcc: no input files
> make: [/usr/src/linux-2.6.24-rc2-i486-1mn/arch/x86/Makefile] Error 1 (ignored)

which I can easily reproduce by doing (at kernel top-level dir):

make defconfig
make -B

This does not happen in 2.6.23.  Instead, that sequence just loops
forever with:  (this is what I get:)

make -f /tester/linsrc/linux-2623-pv/Makefile silentoldconfig
make -f scripts/Makefile.build obj=scripts/basic
mkdir -p include/linux include/config
make -f scripts/Makefile.build obj=scripts/kconfig silentoldconfig
  cat scripts/kconfig/zconf.tab.c_shipped > scripts/kconfig/zconf.tab.c
  cat scripts/kconfig/lex.zconf.c_shipped > scripts/kconfig/lex.zconf.c
  cat scripts/kconfig/zconf.hash.c_shipped > scripts/kconfig/zconf.hash.c
  gcc -Wp,-MD,scripts/kconfig/.zconf.tab.o.d -Wall -Wstrict-prototypes -O2 
-fomit-frame-pointer   -DCURSES_LOC="" -DLOCALE -Iscripts/kconfig -c 
-o scripts/kconfig/zconf.tab.o scripts/kconfig/zconf.tab.c
  gcc  -o scripts/kconfig/conf scripts/kconfig/conf.o 
scripts/kconfig/zconf.tab.o -lncursesw 
scripts/kconfig/conf -s arch/x86_64/Kconfig
make -f /tester/linsrc/linux-2623-pv/Makefile silentoldconfig
make -f scripts/Makefile.build obj=scripts/basic
mkdir -p include/linux include/config
make -f scripts/Makefile.build obj=scripts/kconfig silentoldconfig
  cat scripts/kconfig/zconf.tab.c_shipped > scripts/kconfig/zconf.tab.c
  cat scripts/kconfig/lex.zconf.c_shipped > scripts/kconfig/lex.zconf.c
  cat scripts/kconfig/zconf.hash.c_shipped > scripts/kconfig/zconf.hash.c
  gcc -Wp,-MD,scripts/kconfig/.zconf.tab.o.d -Wall -Wstrict-prototypes -O2 
-fomit-frame-pointer   -DCURSES_LOC="" -DLOCALE -Iscripts/kconfig -c 
-o scripts/kconfig/zconf.tab.o scripts/kconfig/zconf.tab.c
  gcc  -o scripts/kconfig/conf scripts/kconfig/conf.o 
scripts/kconfig/zconf.tab.o -lncursesw 
scripts/kconfig/conf -s arch/x86_64/Kconfig
make -f /tester/linsrc/linux-2623-pv/Makefile silentoldconfig
make -f scripts/Makefile.build obj=scripts/basic
mkdir -p include/linux include/config

...

I suppose we could argue that the 2.6.24-rcN handling is better than
the 2.6.23 handling.  Werner does not report any problems like this
with 2.6.23, so he's not reporting what I am seeing.

~

Pasted from earlier email from Werner (typed in):

2.6.24-rc1-git10
EIP 0600:  EFLAGS 00010212 CPU 0
EIUP is at xor_sse_2+0x34/0x200
EAX: 10 EBX fffedb22 ECX c183f000 EDX c183c000 ESS 8005003b EDI c0929614 EBP 
c183f000 ESP c1823ef0   
DS 7b ES 7b FS d8 GS 0 SS 68
Process swapper  pid 1  ti: c182200  task c182 task.ti c=1822000
Stack:  8x 08x 0   fffedb22  0  c04067b3  10  c0849b62  c1030780  c183f000  
c183c000
call trace
c0 4067b3 do_xor_speed+0x53/0xd0
   9a9582 calibrate_xor_blocks 0xe2/0x100 (or 1a0 ?)
  191594  register_filesystem =0X44/0X70
  991565 kernel_init+0x125/0x2f0
   10420a  ret_from_fork +0x6/0x1c  (or 0xb ...)
  991440 kernel_init+0x0/0x2f0
   " again
   c0104edf  kernel_thread_helper+0x7/0x18
code  08 89 74 24 44 0f 20 cf 0f 06 (or 0b) 0f 11 04 24 0f 11 4c 34 10 0f 11 54 
24 20 0f 11 5c 24 30 0f 18 82 00
01 00 00 0f 18 82 20 01 00 00 <00> 20x 0
EIP c0407284 xor_sse_2+0x34/0x200 SS ESP 068: c1823ef0
kernel panic


and later:

> With 2.6.23-rc2 is the same problem:  it crashed at the beginning:  EIP 060 
> c03fdea4
> EFLAGS 00010212 EIP is at xor_sse_2+0x34/0x200

---
~Randy

Re: [PATCH] Add the macro to test if "exactly_one_bit_set" to log2.h.

2007-11-08 Thread Robert P. J. Day

On Thu, 8 Nov 2007, Andrew Morton wrote:

> > On Tue, 6 Nov 2007 11:38:52 -0500 (EST) "Robert P. J. Day" <[EMAIL PROTECTED]> wrote:
> >
> > While this macro is defined in terms of "is_power_of_2" and is
> > therefore functionally equivalent, the visual semantics are
> > sometimes more appropriate for what is actually being tested.
>
> This is getting a bit anal, but I guess you're the is_power_of_2
> maintainer.
>
> > ---
> >
> > diff --git a/include/linux/log2.h b/include/linux/log2.h
> > index c8cf5e8..d0d324e 100644
> > --- a/include/linux/log2.h
> > +++ b/include/linux/log2.h
> > @@ -55,6 +55,12 @@ bool is_power_of_2(unsigned long n)
> >  }
> >
> >  /*
> > + *  And for folks who want slightly different semantics ...
> > + */
> > +
> > +#define exactly_one_bit_set is_power_of_2

actually, i could go either way on this one.  it wasn't originally my
idea, but i tossed it out there because i have, in fact, seen comments
that explicitly said something along the lines of "make sure that
exactly one bit is set".  so i'll leave it up to someone else to
decide whether it should go in.  it's not something i'm going to go to
the mats over one way or the other.

rday

-- 

Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://crashcourse.ca


Re: [PATCH] Add quirk to set AHCI mode on ICH boards

2007-11-08 Thread Jeff Garzik

On Thu, Nov 08, 2007 at 11:44:22PM -0500, Mark Lord wrote:
> Jeff Garzik wrote:
> >On Thu, Nov 08, 2007 at 10:29:37PM -0500, Mark Lord wrote:
> >>And I might even privately patch my own kernels to map the AHCI BAR
> >>in the cases where the BIOS didn't...
> >
> >The inability to do this in the general case is the main reason why
> >AHCI was not unconditionally enabled, even in IDE mode, when it was
> >originally added...  :/
> ..
> 
> Yeah, that one's always puzzled me.
> It's just software,  so why don't we do it?  In the PCI layer, that is?

Ah, but it's not just software:  when trying to find bus address
space for the BAR, we don't know if we are stomping on magic hardware
resources the BIOS has conveniently failed to tell us about.

So while in all likelihood you will have no problem finding a
suitable bus address to use, as a generalized rule it is a far more
difficult proposition.

Mind you, I would /love/ to be proven wrong here.  In addition to the AHCI
BAR, modern ata_piix includes SATA PHY registers that we could make use
of, but cannot...

Jeff


Re: [PATCH] init: Introduce rootdir bootparm to select which dir to sys_chroot

2007-11-08 Thread Al Boldi

Andrew Morton wrote:
> > On Tue, 6 Nov 2007 13:40:26 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> >
> > This patch introduces a rootdir kernel boot parameter, which specifies
> > the path to the kernel sys_chroot boot dir.
> >
> > This is useful for systems that have more than one distribution
> > installed on the same fs/partition.
> >
> >
> > Cc: H. Peter Anvin <[EMAIL PROTECTED]>
> > Cc: Andrew Morton <[EMAIL PROTECTED]>
> > Signed-off-by: Al Boldi <[EMAIL PROTECTED]>
> >
> > ---
> >
> > --- a/init/do_mounts.c
> > +++ b/init/do_mounts.c
> > @@ -252,6 +252,15 @@ __setup("rootflags=", root_data_setup);
> >  __setup("rootfstype=", fs_names_setup);
> >  __setup("rootdelay=", root_delay_setup);
> >
> > +static char * __initdata root_dir;
> > +static int __init root_dir_setup(char *str)
> > +{
> > +   root_dir = strcat("./",str);
> > +   return 1;
> > +}
> > +
> > +__setup("rootdir=", root_dir_setup);
>
> Please update Documentation/kernel-parameters.txt when adding __setup
> options.

Sure.

If you think this feature is useful, which I think it is, then I probably 
need to resend with a doc update.  But bear in mind, this patch needs 
something like a small hack to allow remounting root, which currently isn't 
possible.  I'm sure hpa is probably the genius to help out here.

>
> >  static void __init get_fs_names(char *page)
> >  {
> > char *s = page;
> > @@ -469,6 +478,10 @@ void __init prepare_namespace(void)
> > mount_root();
> >  out:
> > sys_mount(".", "/", NULL, MS_MOVE, NULL);
> > +
> > +   if(root_dir)
> > +   sys_chdir(root_dir);
> > +
>
> Please run scripts/checkpatch.pl across all patches before sending them to
> anyone.

Ok.


Thanks!

--
Al


[PATCH 1/2] NetLabel: Introduce a new kernel configuration API for NetLabel - Version 11 (2.6.24-rc2) Smack: Simplified Mandatory Access Control Kernel

2007-11-08 Thread Casey Schaufler

From: Paul Moore <[EMAIL PROTECTED]>

Add a new set of configuration functions to the NetLabel/LSM API so that
LSMs can perform their own configuration of the NetLabel subsystem without
relying on assistance from userspace.

Signed-off-by: Paul Moore <[EMAIL PROTECTED]>
---

 include/net/netlabel.h |   47 --
 net/ipv4/cipso_ipv4.c  |4 -
 net/netlabel/netlabel_cipso_v4.c   |2 
 net/netlabel/netlabel_cipso_v4.h   |3 +
 net/netlabel/netlabel_domainhash.h |1 
 net/netlabel/netlabel_kapi.c   |  177 
 6 files changed, 225 insertions(+), 9 deletions(-)

diff --git a/include/net/netlabel.h b/include/net/netlabel.h
index 2e5b2f6..facaf68 100644
--- a/include/net/netlabel.h
+++ b/include/net/netlabel.h
@@ -36,6 +36,8 @@
 #include 
 #include 
 
+struct cipso_v4_doi;
+
 /*
  * NetLabel - A management interface for maintaining network packet label
  *mapping tables for explicit packet labling protocols.
@@ -99,12 +101,6 @@ struct netlbl_audit {
uid_t loginuid;
 };
 
-/* Domain mapping definition struct */
-struct netlbl_dom_map;
-
-/* Domain mapping operations */
-int netlbl_domhsh_remove(const char *domain, struct netlbl_audit *audit_info);
-
 /* LSM security attributes */
 struct netlbl_lsm_cache {
atomic_t refcount;
@@ -285,6 +281,19 @@ static inline void netlbl_secattr_free(struct 
netlbl_lsm_secattr *secattr)
 
 #ifdef CONFIG_NETLABEL
 /*
+ * LSM configuration operations
+ */
+int netlbl_cfg_map_del(const char *domain, struct netlbl_audit *audit_info);
+int netlbl_cfg_unlbl_add_map(const char *domain,
+struct netlbl_audit *audit_info);
+int netlbl_cfg_cipsov4_add(struct cipso_v4_doi *doi_def,
+  struct netlbl_audit *audit_info);
+int netlbl_cfg_cipsov4_add_map(struct cipso_v4_doi *doi_def,
+  const char *domain,
+  struct netlbl_audit *audit_info);
+int netlbl_cfg_cipsov4_del(u32 doi, struct netlbl_audit *audit_info);
+
+/*
  * LSM security attribute operations
  */
 int netlbl_secattr_catmap_walk(struct netlbl_lsm_secattr_catmap *catmap,
@@ -318,6 +327,32 @@ void netlbl_cache_invalidate(void);
 int netlbl_cache_add(const struct sk_buff *skb,
 const struct netlbl_lsm_secattr *secattr);
 #else
+static inline int netlbl_cfg_map_del(const char *domain,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_unlbl_add_map(const char *domain,
+  struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_cipsov4_add(struct cipso_v4_doi *doi_def,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_cipsov4_add_map(struct cipso_v4_doi *doi_def,
+const char *domain,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
+static inline int netlbl_cfg_cipsov4_del(u32 doi,
+struct netlbl_audit *audit_info)
+{
+   return -ENOSYS;
+}
 static inline int netlbl_secattr_catmap_walk(
  struct netlbl_lsm_secattr_catmap *catmap,
  u32 offset)
diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
index f18e88b..5e97315 100644
--- a/net/ipv4/cipso_ipv4.c
+++ b/net/ipv4/cipso_ipv4.c
@@ -546,8 +546,8 @@ int cipso_v4_doi_remove(u32 doi,
rcu_read_lock();
list_for_each_entry_rcu(dom_iter, &doi_def->dom_list, list)
if (dom_iter->valid)
-   netlbl_domhsh_remove(dom_iter->domain,
-audit_info);
+   netlbl_cfg_map_del(dom_iter->domain,
+  audit_info);
rcu_read_unlock();
cipso_v4_cache_invalidate();
call_rcu(&doi_def->rcu, callback);
diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c
index ba0ca8d..54f9d1b 100644
--- a/net/netlabel/netlabel_cipso_v4.c
+++ b/net/netlabel/netlabel_cipso_v4.c
@@ -89,7 +89,7 @@ static const struct nla_policy 
netlbl_cipsov4_genl_policy[NLBL_CIPSOV4_A_MAX + 1
  * safely.
  *
  */
-static void netlbl_cipsov4_doi_free(struct rcu_head *entry)
+void netlbl_cipsov4_doi_free(struct rcu_head *entry)
 {
struct cipso_v4_doi *ptr;
 
diff --git a/net/netlabel/netlabel_cipso_v4.h b/net/netlabel/netlabel_cipso_v4.h
index f03cf9b..220cb9d 100644
--- a/net/netlabel/netlabel_cipso_v4.h
+++ b/net/netlabel/netlabel_cipso_v4.h
@@ -163,4 +163,7 @@ enum {
 /* NetLabel protocol functions */
 int netlbl_cipsov4_genl_init(void);
 
+/* Free the memory associated with a CIP

[PATCH 0/2] Version 11 (2.6.24-rc2) Smack: Simplified Mandatory Access Control Kernel

2007-11-08 Thread Casey Schaufler


This is version 11 of the Simplified Mandatory Access Control Kernel.

The whole thing is available on the Smack home page at

http://schaufler-ca.com

The attachments to this message are not kernel code.
They are early versions of the smackload and smackcipso
programs, and are included in the hope that doing so
may reduce (I certainly wouldn't count on it eliminating)
whinging about the revised versions of smack_write_load()
and smack_write_cipso().

The /smack/load and /smack/cipso special files are a minor
component of Smack, and much too much energy has gone into
them, and I would much prefer that people who don't like
Smack crux about things that are important rather than the
details and moral implications of parsers in kernel code.

Writes to /smack/load are now required to have this format:

SubjectLabel ObjectLabel Mode[decorations]
| 24 bytes  || 24 bytes ||4 ||undefined  |

A write to /smack/load must be 52 or more bytes in length.
The 4 mode bytes must be of the form [rR-][wW-][xX-][aA-],
in that order. The regular rules enforced by smack_import()
apply to the strings at offset 0 and offset 24.
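
The /smack/load record layout described above can be checked from user
space before anything is written. The following is only a minimal sketch,
assuming nothing beyond the layout stated in this message; the SMK_* names
and both helper functions are made up for illustration and are not part of
Smack. It checks the minimum length and the four mode bytes only, and does
not attempt the label rules that smack_import() enforces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names for the sizes given in the description above. */
#define SMK_LABEL_FIELD	24	/* bytes per label field */
#define SMK_MODE_LEN	4	/* bytes in the mode field */
#define SMK_RULE_MIN	(2 * SMK_LABEL_FIELD + SMK_MODE_LEN)	/* 52 */

/* Return 1 if the 4 mode bytes match [rR-][wW-][xX-][aA-], else 0. */
static int smk_mode_ok(const char *mode)
{
	static const char *allowed[SMK_MODE_LEN] = { "rR-", "wW-", "xX-", "aA-" };
	int i;

	for (i = 0; i < SMK_MODE_LEN; i++) {
		/* strchr() would also match the terminating NUL, so reject it first */
		if (mode[i] == '\0' || strchr(allowed[i], mode[i]) == NULL)
			return 0;
	}
	return 1;
}

/* Return 1 if buf has the length and mode bytes of a /smack/load write. */
static int smk_rule_ok(const char *buf, size_t len)
{
	if (len < SMK_RULE_MIN)
		return 0;
	/* the mode field follows the two 24-byte label fields */
	return smk_mode_ok(buf + 2 * SMK_LABEL_FIELD);
}
```

A caller would build the 52-byte record first and only write it to
/smack/load when smk_rule_ok() returns 1.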

Writes to /smack/cipso are now required to have this format:

LabelMapped Level CatCount [cat]...
| 24 bytes || 4  ||  4| |4|


A write to /smack/cipso must be at least 32 bytes long,
and also must be 32 + (4 * CatCount) bytes long. If there
are no categories CatCount must be "0   ". The label is
read using smack_import(). All other values are left
justified ("2   ", not "   2") integers in 4 bytes.
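
The /smack/cipso layout above can be produced with plain snprintf()
padding. This is a rough sketch under stated assumptions, not the
smackcipso program itself: the CIPSO_* names and cipso_format() are
hypothetical, and it assumes the label is at most 23 characters and is
padded with spaces to 24 bytes, in the same way smackload pads its labels.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names for the field sizes given in the description above. */
#define CIPSO_LABEL_FIELD	24
#define CIPSO_NUM_FIELD		4
#define CIPSO_HEADER		(CIPSO_LABEL_FIELD + 2 * CIPSO_NUM_FIELD)	/* 32 */

/*
 * Format one record into out: label, level, category count, categories.
 * Returns the record length (32 + 4 * ncats), or -1 if a value does not
 * fit its field or out is too small (one extra byte is needed for the NUL
 * that snprintf() appends; the NUL is not part of the record).
 */
static int cipso_format(char *out, size_t outsz, const char *label,
			int level, const int *cats, int ncats)
{
	size_t need, off;
	int i;

	if (ncats < 0 || ncats > 9999 || level < 0 || level > 9999)
		return -1;
	if (strlen(label) > CIPSO_LABEL_FIELD - 1)
		return -1;
	need = CIPSO_HEADER + (size_t)CIPSO_NUM_FIELD * (size_t)ncats;
	if (outsz < need + 1)
		return -1;

	/* left-justified, space-padded fields: "2   ", not "   2" */
	snprintf(out, outsz, "%-24s%-4d%-4d", label, level, ncats);
	off = CIPSO_HEADER;
	for (i = 0; i < ncats; i++) {
		if (cats[i] < 0 || cats[i] > 9999)
			return -1;
		snprintf(out + off, outsz - off, "%-4d", cats[i]);
		off += CIPSO_NUM_FIELD;
	}
	return (int)need;
}
```

The returned length is what would be passed to write(); with no
categories the count field comes out as "0   ", as required above.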

Since these formats are so fussy I have provided programs
that take care of that. They are still human readable text,
but no longer require parsing in the kernel. It is my sincere
hope that we can put the brouhaha about parsing to bed.

Two patches here. Paul Moore's netlabel api patch has been updated
due to unrelated changes in that code.


/*
 * smackload - properly format smack access rules for
 * loading into the kernel by writing to /smack/load.
 *
 */
#include <sys/types.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define LSIZE 23
#define ASIZE 4

int
main(int argc, char *argv[])
{
	int loadfd;
	char line[80];
	char rule[LSIZE + LSIZE + ASIZE + 3];
	char subject[LSIZE+1];
	char object[LSIZE+1];
	char accesses[ASIZE+1];
	char real[ASIZE+1];
	char *cp;
	int i;
	int err;

	loadfd = open("/smack/load", O_RDWR);
	if (loadfd < 0) {
		perror("opening /smack/load");
		exit(1);
	}

	while (fgets(line, 80, stdin) != NULL) {
		err = 0;
		if ((cp = strchr(line, '\n')) != NULL)
			*cp = '\0';
		if (sscanf(line,"%23s %23s %4s",subject,object,accesses) != 3)
			err = 1;
		else {
			/*
			 * Start from "----" so that unset mode positions
			 * are '-' and real[] is NUL terminated.
			 */
			strcpy(real, "----");
			for (i = 0;
			     i < ASIZE && accesses[i] != '\0' && err == 0;
			     i++) {
				switch (accesses[i]) {
				case 'r':
				case 'R':
					real[0] = 'r';
					break;
				case 'w':
				case 'W':
					real[1] = 'w';
					break;
				case 'x':
				case 'X':
					real[2] = 'x';
					break;
				case 'a':
				case 'A':
					real[3] = 'a';
					break;
				case '\0':
				case '-':
					break;
				default:
					err = 1;
					break;
				}
			}
		}
		if (err == 0) {
			fprintf(stderr, "Good input line \"%s\"\n", line);
			/* 23 + 1 + 23 + 1 + 4 = 52 bytes, plus the NUL */
			sprintf(rule, "%-23s %-23s %4s", subject,object,real);
			fprintf(stderr, "Sending in line \"%s\"\n", rule);
			err = write(loadfd, rule, LSIZE + LSIZE + ASIZE + 2);
			if (err < 0)
				perror("writing /smack/load");
		}
		else
			fprintf(stderr, "Bad input line \"%s\"\n", line);
	}
	exit(0);
}
/*
 * smackcipso - properly format smack access cipsos for
 * loading into the kernel by writing to /smack/cipso.
 *
 */
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#define LSIZE 23
#define

Re: [PATCH] Add quirk to set AHCI mode on ICH boards

2007-11-08 Thread Mark Lord


Jeff Garzik wrote:
> On Thu, Nov 08, 2007 at 10:29:37PM -0500, Mark Lord wrote:
>> And I might even privately patch my own kernels to map the AHCI BAR
>> in the cases where the BIOS didn't...
>
> The inability to do this in the general case is the main reason why
> AHCI was not unconditionally enabled, even in IDE mode, when it was
> originally added...  :/
..

Yeah, that one's always puzzled me.
It's just software,  so why don't we do it?  In the PCI layer, that is?

Re: [PATCH] init: Introduce rootdir bootparm to select which dir to sys_chroot

2007-11-08 Thread Andrew Morton

> On Tue, 6 Nov 2007 13:40:26 +0300 Al Boldi <[EMAIL PROTECTED]> wrote:
> 
> This patch introduces a rootdir kernel boot parameter, which specifies the 
> path to the kernel sys_chroot boot dir.
> 
> This is useful for systems that have more than one distribution installed on 
> the same fs/partition.
> 
> 
> Cc: H. Peter Anvin <[EMAIL PROTECTED]>
> Cc: Andrew Morton <[EMAIL PROTECTED]>
> Signed-off-by: Al Boldi <[EMAIL PROTECTED]>
> 
> ---
> 
> --- a/init/do_mounts.c
> +++ b/init/do_mounts.c
> @@ -252,6 +252,15 @@ __setup("rootflags=", root_data_setup);
>  __setup("rootfstype=", fs_names_setup);
>  __setup("rootdelay=", root_delay_setup);
>  
> +static char * __initdata root_dir;
> +static int __init root_dir_setup(char *str)
> +{
> + root_dir = strcat("./",str);
> + return 1;
> +}
> +
> +__setup("rootdir=", root_dir_setup);

Please update Documentation/kernel-parameters.txt when adding __setup
options.

>  static void __init get_fs_names(char *page)
>  {
>   char *s = page;
> @@ -469,6 +478,10 @@ void __init prepare_namespace(void)
>   mount_root();
>  out:
>   sys_mount(".", "/", NULL, MS_MOVE, NULL);
> +
> + if(root_dir)
> + sys_chdir(root_dir);
> +

Please run scripts/checkpatch.pl across all patches before sending them to
anyone.
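
One more point on the quoted root_dir_setup(): strcat("./", str) appends
to a string literal, which is undefined behaviour in C because the literal
has no room for the extra bytes and may sit in read-only memory. Below is
a user-space sketch of one way to build the "./<dir>" string in a fixed
buffer instead. ROOT_DIR_MAX, root_dir_buf and root_dir_set() are
hypothetical names chosen for this illustration, and an in-kernel version
would differ (for example, it would need suitable __init/__initdata
annotations).

```c
#include <stdio.h>
#include <string.h>

#define ROOT_DIR_MAX 256	/* hypothetical limit for this sketch */

static char root_dir_buf[ROOT_DIR_MAX];
static const char *root_dir;	/* stays NULL until a valid value is set */

/*
 * Build "./<str>" in a writable buffer. Returns 1 on success and 0 if
 * the result would not fit, in which case root_dir is left NULL.
 */
static int root_dir_set(const char *str)
{
	int n = snprintf(root_dir_buf, sizeof(root_dir_buf), "./%s", str);

	if (n < 0 || (size_t)n >= sizeof(root_dir_buf)) {
		root_dir = NULL;
		return 0;
	}
	root_dir = root_dir_buf;
	return 1;
}
```

With this shape, the later "if (root_dir)" test in the patch still works
unchanged, since root_dir is only non-NULL after a successful set.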


Re: [PATCH] Add the macro to test if "exactly_one_bit_set" to log2.h.

2007-11-08 Thread Andrew Morton

> On Tue, 6 Nov 2007 11:38:52 -0500 (EST) "Robert P. J. Day" <[EMAIL PROTECTED]> wrote:
> 
> While this macro is defined in terms of "is_power_of_2" and is
> therefore functionally equivalent, the visual semantics are sometimes
> more appropriate for what is actually being tested.
> 

This is getting a bit anal, but I guess you're the is_power_of_2 maintainer.

> ---
> 
> diff --git a/include/linux/log2.h b/include/linux/log2.h
> index c8cf5e8..d0d324e 100644
> --- a/include/linux/log2.h
> +++ b/include/linux/log2.h
> @@ -55,6 +55,12 @@ bool is_power_of_2(unsigned long n)
>  }
> 
>  /*
> + *  And for folks who want slightly different semantics ...
> + */
> +
> +#define exactly_one_bit_set is_power_of_2

And I'm the dont-code-in-cpp-when-you-could-code-in-C maintainer.
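
For illustration, the C alternative to the #define could look roughly like
the sketch below. It is written as stand-alone user-space code, so
is_power_of_2() is restated here rather than taken from log2.h, and the
wrapper is only one possible shape, not a proposed patch.

```c
#include <stdbool.h>

/* Restated for this sketch: true when n has exactly one bit set. */
static inline bool is_power_of_2(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Same test under the alternative name, as a typed inline function
 * instead of a preprocessor alias.
 */
static inline bool exactly_one_bit_set(unsigned long n)
{
	return is_power_of_2(n);
}
```

Unlike the macro, the inline function gives the compiler a prototype to
check arguments against and does not rename unrelated identifiers that
happen to be spelled exactly_one_bit_set.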

Re: [PATCH] [POWERPC] Fix typo #ifdef -> #ifndef

2007-11-08 Thread Andrew Morton

> On Sat, 03 Nov 2007 20:16:36 +0100 Jochen Friedrich <[EMAIL PROTECTED]> wrote:
> Subject: [PATCH] [POWERPC] Fix typo #ifdef -> #ifndef

Please put the "powerpc" outside the [].  Because things inside [] get
removed when the receiver applies the patch, but the subsystem
identification ("powerpc") is useful information which we want to carry
into the permanent git record. (Although Paul might add it anyway - some
git tree maintainers do).

> User-Agent: Mozilla-Thunderbird 2.0.0.6 (X11/20071009)

This seems to be setting a record for MUA vandalism.  Leading spaces were
removed and various esoteric whitespace transformations were made.  The
diffs were small so I fixed them by hand.

Please strangle your email client.


Re: [patch 2/2] clone: prepare to recycle CLONE_DETACHED and CLONE_STOPPED

2007-11-08 Thread Andrew Morton

> On Thu, 08 Nov 2007 13:31:43 -0800 [EMAIL PROTECTED] wrote:
> From: Andrew Morton <[EMAIL PROTECTED]>
> 
> Ulrich says that we never used these clone flags and that nothing should be
> using them.
> 
> As we're down to only a single bit left in clone's flags argument, let's add a
> warning to check that no userspace is actually using these.  Hopefully we will
> be able to recycle them.
> 
> Cc: Ulrich Drepper <[EMAIL PROTECTED]>
> Cc: Ingo Molnar <[EMAIL PROTECTED]>
> Cc: Roland McGrath <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
> ---
> 
>  kernel/fork.c |   16 
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff -puN kernel/fork.c~clone-prepare-to-recycle-clone_detached-and-clone_stopped kernel/fork.c
> --- a/kernel/fork.c~clone-prepare-to-recycle-clone_detached-and-clone_stopped
> +++ a/kernel/fork.c
> @@ -1420,10 +1420,18 @@ long do_fork(unsigned long clone_flags,
>   int trace = 0;
>   long nr;
>  
> - if (unlikely(current->ptrace)) {
> - trace = fork_traceflag (clone_flags);
> - if (trace)
> - clone_flags |= CLONE_PTRACE;
> + /*
> +  * We hope to recycle these flags after 2.6.26
> +  */
> + if (unlikely(clone_flags & (CLONE_DETACHED|CLONE_STOPPED))) {
> + if (printk_ratelimit()) {
> + char comm[TASK_COMM_LEN];
> +
> + printk(KERN_INFO "fork(): process `%s' used deprecated "
> + "clone flags 0x%lx\n",
> + get_task_comm(comm, current),
> + clone_flags & (CLONE_DETACHED|CLONE_STOPPED));
> + }
>   }
>  
>   p = copy_process(clone_flags, stack_start, regs, stack_size,

That was all screwed up.  Better version:

From: Andrew Morton <[EMAIL PROTECTED]>

Ulrich says that we never used these clone flags and that nothing should be
using them.

As we're down to only a single bit left in clone's flags argument, let's add a
warning to check that no userspace is actually using these.  Hopefully we will
be able to recycle them.

Roland said:

  CLONE_STOPPED was previously used by some NPTL versions when under
  thread_db (i.e.  only when being actively debugged by gdb), but not for a
  long time now, and it never worked reliably when it was used.  Removing it
  seems fine to me.

Cc: Ulrich Drepper <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Roland McGrath <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 kernel/fork.c |   17 +
 1 file changed, 17 insertions(+)

diff -puN kernel/fork.c~clone-prepare-to-recycle-clone_detached-and-clone_stopped kernel/fork.c
--- a/kernel/fork.c~clone-prepare-to-recycle-clone_detached-and-clone_stopped
+++ a/kernel/fork.c
@@ -1420,6 +1420,23 @@ long do_fork(unsigned long clone_flags,
int trace = 0;
long nr;
 
+   /*
+* We hope to recycle these flags after 2.6.26
+*/
+   if (unlikely(clone_flags & (CLONE_DETACHED|CLONE_STOPPED))) {
+   static int __read_mostly count = 100;
+
+   if (count && printk_ratelimit()) {
+   char comm[TASK_COMM_LEN];
+
+   count--;
+   printk(KERN_INFO "fork(): process `%s' used deprecated "
+   "clone flags 0x%lx\n",
+   get_task_comm(comm, current),
+   clone_flags & (CLONE_DETACHED|CLONE_STOPPED));
+   }
+   }
+
if (unlikely(current->ptrace)) {
trace = fork_traceflag (clone_flags);
if (trace)
_
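
The capped warning in the patch above can be tried out in user space. The
sketch below keeps the flag mask and the static budget of 100 messages but
leaves out the time-based printk_ratelimit() check, which has no direct
user-space equivalent; warn_budget and warn_deprecated_clone_flags() are
names invented for this illustration, and the two flag values are restated
locally.

```c
#include <stdio.h>

/* Flag values restated locally for this sketch. */
#define CLONE_DETACHED	0x00400000UL
#define CLONE_STOPPED	0x02000000UL
#define DEPRECATED_CLONE_FLAGS	(CLONE_DETACHED | CLONE_STOPPED)

/* Remaining number of warnings; plays the role of "count" in the patch. */
static int warn_budget = 100;

/*
 * Warn if clone_flags contains a deprecated flag and budget remains.
 * Returns 1 if a warning was printed, 0 otherwise.
 */
static int warn_deprecated_clone_flags(unsigned long clone_flags,
				       const char *comm)
{
	unsigned long bad = clone_flags & DEPRECATED_CLONE_FLAGS;

	if (!bad || warn_budget == 0)
		return 0;
	warn_budget--;
	fprintf(stderr,
		"fork(): process `%s' used deprecated clone flags 0x%lx\n",
		comm, bad);
	return 1;
}
```

Calling it 150 times with CLONE_STOPPED set prints 100 messages and then
goes quiet, which is the point of the counter: a process that forks in a
loop cannot flood the log.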


Re: 2.6.24-rc1: OOPS at acpi_battery_update

2007-11-08 Thread Andrew Morton

> On Thu, 08 Nov 2007 19:35:23 +0300 Alexey Starikovskiy <[EMAIL PROTECTED]> wrote:
> [remove_cycle_at_battery_removal.patch  text/x-patch (1.7KB)]
> ACPI: Battery: remove cycle from battery removal.
> 
> From: Alexey Starikovskiy <[EMAIL PROTECTED]>
> 
> get_property() should not call battery_update() on absent battery to
> avoid cycle and oops.
> 
> Signed-off-by: Alexey Starikovskiy <[EMAIL PROTECTED]>
> Tested-by: Rolf Eike Beer <[EMAIL PROTECTED]>

A patch similar to this one but with an identical changelog was merged into
Len's tree on November 2.

If it had been promptly merged into mainline then quite a lot of people's
time would not have been wasted.


Re: [PATCH] Add quirk to set AHCI mode on ICH boards

2007-11-08 Thread Jeff Garzik

On Thu, Nov 08, 2007 at 10:29:37PM -0500, Mark Lord wrote:
> And I might even privately patch my own kernels to map the AHCI BAR
> in the cases where the BIOS didn't...

The inability to do this in the general case is the main reason why
AHCI was not unconditionally enabled, even in IDE mode, when it was
originally added...  :/

Jeff, taking a trip down memory lane




Re: [PATCH] Add quirk to set AHCI mode on ICH boards

2007-11-08 Thread Mark Lord


Jeff Garzik wrote:
> On Fri, Nov 09, 2007 at 09:02:35AM +0700, Riki Oktarianto wrote:
>> Some BIOSen map AHCI ABAR but lock the SATA controller to IDE mode.
>> This patch adds a quirk to set AHCI mode on ICH boards in such cases.
>>
>> Tested on Macbook2,1 (ICH7M)
>
> Intel will complain but it's awful tempting...

*Very* tempting.

And I might even privately patch my own kernels to map the AHCI BAR
in the cases where the BIOS didn't...

Re: [Lguest] [PATCH] virtio config_ops refactoring

2007-11-08 Thread Anthony Liguori


Dor Laor wrote:
> ron minnich wrote:
>> Hi, I'm sorry, I've been stuck on other things (NFS RDMA anyone?) and
>> missed part of this discussion.
>>
>> Is it really the case that operations on virtio devices will involve
>> outl/inl etc.?
>
> What's the problem with them?
> Except for the kick event it's not performance critical and the
> difference between pio vmexit and hypercall exit is very small.

If you're a nutty guy who's interested in creating the most absolute
minimal VMM to run exotic paravirtual guests on massive clusters, then
requiring PIO implies that you have to have an instruction decoder, which
goes against the earlier goal ;-)

Regards,

Anthony Liguori

> I don't know about problems in other architectures, maybe mmio is better?
> Dor.
>
>> Apologies in advance for my failure to pay attention.
>>
>> thanks
>>
>> ron
>> ___
>> Lguest mailing list
>> [EMAIL PROTECTED]
>> https://ozlabs.org/mailman/listinfo/lguest






Re: msync(2) bug(?), returns AOP_WRITEPAGE_ACTIVATE to userland

2007-11-08 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Dave Hansen writes:
> On Mon, 2007-11-05 at 15:40 +, Hugh Dickins wrote:
[...]
> I have a decent guess what the bug is, too.  In the unionfs code:
> 
> > int init_lower_nd(struct nameidata *nd, unsigned int flags)
> > {
> > ...
> > #ifdef ALLOC_LOWER_ND_FILE
> > file = kzalloc(sizeof(struct file), GFP_KERNEL);
> > if (unlikely(!file)) {
> > err = -ENOMEM;
> > break; /* exit switch statement and thus return */
> > }
> > nd->intent.open.file = file;
> > #endif /* ALLOC_LOWER_ND_FILE */
> 
> The r/o bind mount code will mnt_drop_write() on that file's f_vfsmnt at
> __fput() time.  Since that code never got a write on the mount, we'll
> see an imbalance if the file was opened for a write.  I don't see this
> file's mnt set anywhere, so I'm not completely sure that this is it.  In
> any case, rolling your own 'struct file' without using alloc_file() and
> friends is a no-no.
[...]

This #ifdef'd code in unionfs is actually not enabled.  I left it there as a
reminder of possible future things to come (esp. if nameidata gets split).
There's a related comment earlier in fs/unionfs/lookup.c:init_lower_nd()
that says:

#ifdef ALLOC_LOWER_ND_FILE
/*
 * XXX: one day we may need to have the lower return an open file
 * for us.  It is not needed in 2.6.23-rc1 for nfs2/nfs3, but may
 * very well be needed for nfs4.
 */
struct file *file;
#endif /* ALLOC_LOWER_ND_FILE */

In the interest of keeping unionfs as simple as I can, when I implemented
the whole "pass a lower nd" stuff, I left those bits of semi-experimental
#ifdef code for this lower file upon open-intent.  It's not enabled and up
until now, it didn't seem to be needed.

Do you think unionfs has to start using this nd->intent.open.file stuff?

Thanks,
Erez.

Re: [PATCH] Add quirk to set AHCI mode on ICH boards

2007-11-08 Thread Jeff Garzik

On Fri, Nov 09, 2007 at 09:02:35AM +0700, Riki Oktarianto wrote:
> Some BIOSen map AHCI ABAR but lock the SATA controller to IDE mode.
> This patch adds a quirk to set AHCI mode on ICH boards in such cases.
> 
> Tested on Macbook2,1 (ICH7M)

Intel will complain but it's awful tempting...

Jeff




Re: smbfs/cifs large file support history?

2007-11-08 Thread Steve French

> Does anyone remember when linux smbfs (or cifs) gained large file
> (>2GB, >4GB) file support?

The Linux CIFS client implementation has always had large file support (cifs.ko
was first added to the kernel in 2.5.42), although of course some old servers do
not support large (> 2GB) files.

I thought smbfs added it late in the 2.4 series, but I don't remember; it might
have been in 2.5.



Re: [kvm-devel] [PATCH 3/3] virtio PCI device

2007-11-08 Thread Anthony Liguori


Dor Laor wrote:
> Anthony Liguori wrote:
> > This is a PCI device that implements a transport for virtio.  It allows virtio
> > devices to be used by QEMU based VMMs like KVM or Xen.
>
> While it's a little premature, we can start thinking of irq path
> improvements.
> The current patch acks a private isr and afterwards apic eoi will also
> be hit since it's a level trig irq. This means 2 vmexits per irq.
> We can start with regular pci irqs and move afterwards to msi.
> Some other ugly hack options [we'd better use msi]:
>    - Read the eoi directly from apic and save the first private isr ack


I must admit that I don't know a whole lot about interrupt delivery.
If we can avoid the private ISR ack then that would certainly be a good
thing to do!  I think that would involve adding another bit to the
virtqueues to indicate whether or not there is work to be handled.  It's
really just moving the ISR to shared memory so that there's no penalty
for accessing it.
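
A minimal userspace sketch of that idea, assuming a per-queue "work pending" byte in memory shared between host and guest. The struct and function names below are hypothetical and are not part of the posted patch; C11 atomics stand in for whatever primitive the real code would use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: one "work pending" byte per queue, living in memory
 * shared between host and guest. Nothing here comes from the patch. */
struct shared_vq {
	atomic_uchar pending;	/* set by the host, cleared by the guest */
	unsigned int handled;	/* counts serviced notifications (demo only) */
};

/* Host side: mark the queue as having work before raising the interrupt. */
static void host_mark_pending(struct shared_vq *vq)
{
	atomic_store(&vq->pending, 1);
}

/* Guest interrupt path: returns true if any queue had work.
 * The exchange reads and clears the flag in one step, mirroring the
 * read-to-clear ISR register, but as a plain memory access rather than
 * a port read that would trap to the host. */
static bool guest_interrupt(struct shared_vq *vqs, size_t n)
{
	bool handled = false;

	assert(vqs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (atomic_exchange(&vqs[i].pending, 0)) {
			vqs[i].handled++;
			handled = true;
		}
	}
	return handled;
}
```

With this shape the handler can still report "not ours" on a shared interrupt line, since a queue with no pending flag is simply skipped.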


Regards,

Anthony Liguori


>    - Convert the specific irq line to edge triggered and don't share it
> What do you guys think?
>
> > +/* A small wrapper to also acknowledge the interrupt when it's handled.
> > + * I really need an EIO hook for the vring so I can ack the interrupt once we
> > + * know that we'll be handling the IRQ but before we invoke the callback since
> > + * the callback may notify the host which results in the host attempting to
> > + * raise an interrupt that we would then mask once we acknowledged the
> > + * interrupt. */
> > +static irqreturn_t vp_interrupt(int irq, void *opaque)
> > +{
> > +	struct virtio_pci_device *vp_dev = opaque;
> > +	struct virtio_pci_vq_info *info;
> > +	irqreturn_t ret = IRQ_NONE;
> > +	u8 isr;
> > +
> > +	/* reading the ISR has the effect of also clearing it so it's very
> > +	 * important to save off the value. */
> > +	isr = ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
> > +
> > +	/* It's definitely not us if the ISR was not high */
> > +	if (!isr)
> > +		return IRQ_NONE;
> > +
> > +	spin_lock(&vp_dev->lock);
> > +	list_for_each_entry(info, &vp_dev->virtqueues, node) {
> > +		if (vring_interrupt(irq, info->vq) == IRQ_HANDLED)
> > +			ret = IRQ_HANDLED;
> > +	}
> > +	spin_unlock(&vp_dev->lock);
> > +
> > +	return ret;
> > +}
  





Re: Fw: Buffer overflow in CIFS VFS.

2007-11-08 Thread Steve French

You are correct that the CIFS code calls SendReceive in cases in which
the buffer may be too small to fit a large SMB response, and that
should be fixed (e.g. to avoid possible overflows due to a server
bug). None of the eight cases (SMB TreeDisconnect, SMB uLogoff, SMB
Close, SMB FindClose etc.) in which a small buffer is passed in to
SendReceive return more than a few dozen bytes (and they are fixed
size responses), but I agree that we have to be safe (and we have seen
at least one server corrupt the bcc in the ulogoffX response and
another on the NTCreateX response) so it would be good to fix.

There are probably better ways to handle this though than passing in
the buffer size as your patch does.   Since there are only two buffer
sizes that CIFS uses - it would be easier to pass in (or out) a flag
which indicates the buffer size.  But the function SendReceive2
already does that - and the easier way to handle this seems to be
changing the eight places in fs/cifs/cifssmb.c which call
small_smb_init and then call SendReceive, to call SendReceive2
instead.
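
A rough sketch of the "flag indicates the buffer size" idea, as a standalone bounded copy. The sizes and the copy_response() helper are illustrative assumptions only; they are not the CIFS constants, nor the SendReceive or SendReceive2 signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative sizes only; the real CIFS constants differ. */
#define SMALL_BUF_SIZE 448
#define LARGE_BUF_SIZE 16384

/* Hypothetical helper: copy a response into the caller's buffer, using a
 * flag to say which of the two buffer classes was passed in. An oversized
 * response is rejected instead of overrunning the small buffer. */
static int copy_response(void *out_buf, bool out_is_small,
			 const void *resp, size_t resp_len)
{
	size_t cap = out_is_small ? SMALL_BUF_SIZE : LARGE_BUF_SIZE;

	assert(out_buf != NULL && resp != NULL);
	if (resp_len > cap)
		return -1;	/* server sent more than this buffer can hold */
	memcpy(out_buf, resp, resp_len);
	return 0;
}
```

The point of the flag is that the receive path, not each caller, enforces the limit, so a corrupted length from the server cannot turn into a copy past the end of a small buffer.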

> From: Przemyslaw Wegrzyn <[EMAIL PROTECTED]>
> To: linux-kernel@vger.kernel.org
> Subject: Buffer overflow in CIFS VFS.
>
> just to find something that looks like a buffer overflow  bug.
> The problem is in SendReceive() function in transport.c - it memcpy's
> message payload into a buffer passed via out_buf param. The function
> assumes that all buffers are of size (CIFSMaxBufSize +
> MAX_CIFS_HDR_SIZE) , unfortunately it is also called with smaller
> (MAX_CIFS_SMALL_BUFFER_SIZE) buffers.
>
> To check this finding I patched Samba server to send oversized logoffX
> messages. With ~ 16kB messages the client running 2.6.23.1 crashed upon
> unmounting.
>
> I've done a quick fix, available here:
> http://czajnick.sitenet.pl/cifs-buffer-overflow-fix.patch.gz



-- 
Thanks,

Steve

Re: unionfs and sys_readahead

2007-11-08 Thread Erez Zadok

In message <[EMAIL PROTECTED]>, Paul Albrecht writes:
> On Wed, 2007-11-07 at 19:53 +, Denys Vlasenko wrote:
> > On Tuesday 06 November 2007 22:01, Paul Albrecht wrote:
> > > Hi,
> > > 
> > > Whenever I use readahead-list on a union mounted file system I get a
> > > segfault and kernel oops so I'm wondering whether or not the linux
> > > unionfs supports sys_readahead.
> > > 
> > > Anyone know? I'm not usually subscribed to the lkml so please cc me in
> > > your response. Thanks.
> > 
> > Please show actual oops output.
> > 
> > Also: what kernel version do you use? any extra patch(es) applied? .config?
> 
> I'm using a stock generic kernel that's shipped with ubuntu. Here's my
> dmesg output containing two oops. The first occurs during boot and the
> second happens whenever I run readahead-list from the command line
> against a file list containing an entry in a union mounted filesystem:
[...]

> [   25.551441] Registering unionfs 1.4

The problem appears most likely because this version of Ubuntu is still
using the very old Unionfs 1.x.  I wasn't able to reproduce the bug with the
latest Unionfs 2.x.  I'm in touch with the Ubuntu guys to try and help them
move to 2.x.

Cheers,
Erez.

[PATCH] Add quirk to set AHCI mode on ICH boards

2007-11-08 Thread Riki Oktarianto

Some BIOSen map AHCI ABAR but lock the SATA controller to IDE mode.
This patch adds a quirk to set AHCI mode on ICH boards in such cases.

Tested on Macbook2,1 (ICH7M)

-- 
Riki Oktarianto

--- linux-2.6.24-rc2.orig/drivers/pci/quirks.c
+++ linux-2.6.24-rc2/drivers/pci/quirks.c
@@ -466,6 +466,38 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_I
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL,  PCI_DEVICE_ID_INTEL_ICH8_2, quirk_ich6_lpc_acpi );
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL,  PCI_DEVICE_ID_INTEL_ICH8_3, quirk_ich6_lpc_acpi );
 
+static void __devinit quirk_ich_sata(struct pci_dev *dev)
+{
+   u32 ahci_bar;
+
+   pci_read_config_dword(dev, 0x24, &ahci_bar);
+   if (!ahci_bar) {
+   return;
+   }
+
+   if ((dev->class >> 8) == PCI_CLASS_STORAGE_IDE) {
+   pci_write_config_byte(dev, PCI_CLASS_PROG, 0x01);
+   pci_write_config_byte(dev, PCI_CLASS_DEVICE, 0x06);
+   }
+   dev->class = PCI_CLASS_STORAGE_SATA_AHCI;
+   printk (KERN_INFO "PCI_CLASS_STORAGE_SATA_AHCI set for %s\n",
+   pci_name(dev));
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2652, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2653, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27c0, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x27c4, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2680, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2820, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2825, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2828, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2920, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2921, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2926, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2928, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x292d, quirk_ich_sata);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x292e, quirk_ich_sata);
+
 /*
  * VIA ACPI: One IO region pointed to by longword at
  * 0x48 or 0x20 (256 bytes of ACPI registers)

RE: [Patch] Allocate sparse vmemmap block above 4G

2007-11-08 Thread Christoph Lameter

On Fri, 9 Nov 2007, Zou, Nanhai wrote:

> > More magic values, both the 4GiB address here and the magic "1" at the
> > end are problems.
> > 
> Yes, the 4UL*1024*1024*1024 could be a define here.

The 4GB boundary here is MAX_DMA32_ADDRESS I guess? We are only having 
this problem because of the two DMA zones on x86_64. I thought Andi was 
getting rid of the first one at 16MB. If he would do so then ZONE_DMA 
could be used instead of DMA32 and everything will be fine.

For now you may want to put

#ifdef CONFIG_ZONE_DMA32

around this code since it depends on DMA32.
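
As a sketch of how the named limit and the #ifdef could fit together, here is a toy userspace model. VMEMMAP_DMA32_LIMIT, alloc_at_or_above() and alloc_vmemmap_block() are hypothetical stand-ins, not the bootmem API, and CONFIG_ZONE_DMA32 is defined by hand here where Kconfig would normally provide it.

```c
#include <assert.h>
#include <stdint.h>

/* Normally provided by Kconfig; defined by hand for this standalone demo. */
#define CONFIG_ZONE_DMA32 1

/* Hypothetical name for the 4 GiB boundary instead of a bare magic value. */
#define VMEMMAP_DMA32_LIMIT (UINT64_C(4) << 30)

/* Toy allocator: return goal if size bytes fit between goal and mem_end,
 * otherwise 0 to signal failure. */
static uint64_t alloc_at_or_above(uint64_t goal, uint64_t size, uint64_t mem_end)
{
	if (goal >= mem_end || size > mem_end - goal)
		return 0;
	return goal;
}

/* Try above the DMA32 boundary first, then fall back to the low goal. */
static uint64_t alloc_vmemmap_block(uint64_t size, uint64_t low_goal,
				    uint64_t mem_end)
{
	uint64_t addr = 0;

	assert(size > 0);
#ifdef CONFIG_ZONE_DMA32
	addr = alloc_at_or_above(VMEMMAP_DMA32_LIMIT, size, mem_end);
#endif
	if (!addr)
		addr = alloc_at_or_above(low_goal, size, mem_end);
	return addr;
}
```

On a machine with memory above 4 GiB the block lands at the boundary; on a smaller machine, or with the option compiled out, only the fallback path runs.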

Re: Some interesting observations when trying to optimize vmstat handling

2007-11-08 Thread David Miller

From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 11:58:58 -0800 (PST)

> The problem with cmpxchg_local here is that the differential has to
> be read before we execute the cmpxchg_local. So the cacheline is
> acquired first in read mode and then made exclusive on executing the
> cmpxchg_local.

I bet this can be defeated by prefetching for a write before
the read, but of course this won't help if the read is
being used to conditionally avoid the cmpxchg_local but I don't
think that's what you're trying to do here.

I've always wanted to add a write prefetch at the beginning of all of
the sparc64 atomic operation primitives because of this problem.
I just never got around to measuring if it's worthwhile or not.

Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-08 Thread Tejun Heo

Denys Fedoryshchenko wrote:
> Thanks, it works like that.
> 
> Seems in libata there is no fall-back to non-DMA mode, if DMA didn't work.

There is, it's just too conservative about that.  With improvements
pending for 2.6.24, it should be quite snappy at falling back to PIO if
configured transfer mode doesn't seem to work at all (consecutive IO
command failures after transfer mode configuration change).
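
As a toy model of that policy (not libata's actual code), the sketch below counts consecutive command failures since the transfer mode was set and drops to PIO only if nothing has succeeded in that mode. The threshold and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum xfer_mode { XFER_DMA, XFER_PIO };

/* Hypothetical threshold, not libata's real policy. */
#define MAX_FAILS_AFTER_MODE_SET 3

struct dev_state {
	enum xfer_mode mode;
	bool seen_success;	/* has any command succeeded since mode was set? */
	unsigned int fails;	/* consecutive failures since mode was set */
};

/* Configure a transfer mode and reset the bookkeeping. */
static void set_mode(struct dev_state *d, enum xfer_mode mode)
{
	d->mode = mode;
	d->seen_success = false;
	d->fails = 0;
}

/* Record one command completion; fall back to PIO if DMA never worked. */
static void command_done(struct dev_state *d, bool ok)
{
	assert(d != NULL);
	if (ok) {
		d->seen_success = true;
		d->fails = 0;
		return;
	}
	d->fails++;
	if (d->mode == XFER_DMA && !d->seen_success &&
	    d->fails >= MAX_FAILS_AFTER_MODE_SET)
		set_mode(d, XFER_PIO);
}
```

A device that has completed at least one command in DMA mode is left alone by this rule, so occasional errors on a working link do not trigger the fallback.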

-- 
tejun

RE: [Patch] Allocate sparse vmemmap block above 4G

2007-11-08 Thread Zou, Nanhai


> -Original Message-
> From: Mel Gorman [mailto:[EMAIL PROTECTED]
> Sent: November 8, 2007 22:07
> To: Zou, Nanhai
> Cc: LKML; Linus Torvalds; Greg KH; Dave Jones; Martin Ebourne; Siddha, Suresh
> B; Andi Kleen; Andrew Morton; Christoph Lameter; Andy Whitcroft
> Subject: Re: [Patch] Allocate sparse vmemmap block above 4G
> 
> On (08/11/07 08:52), Zou Nan hai didst pronounce:
> > Resend the patch for more people to review
> >
> > On some single node x64 system with huge amount of physical memory e.g >
> > 64G. the memmap size maybe very big.
> >
> > If the memmap is allocated from low pages, it may occupies too much
> > memory below 4G.
> > then swiotlb could fail to reserve bounce buffer under 4G which will
> > lead to boot failure.
> >
> > This patch will first try to allocate memmap memory above 4G in sparse
> > vmemmap code.
> > If it failed, it will allocate memmap above MAX_DMA_ADDRESS.
> > This patch is against 2.6.24-rc1-git14
> >
> 
> You never state that you depend on the strict_goal patch either here or in
> a leader mail. The usual way of getting peoples attention is to have three
> mails. In your case it would be
> 
> [PATCH 0/2] Allocate vmemmap from highmem where possible
> [PATCH 1/2] Strictly place bootmem allocations when requested
> [PATCH 2/2] Allocate sparse vmemmap block above 4G on x86_64
> 
> Maybe you have gone through all this already and I'm coming so late that I
> missed it all. If that is the case, sorry for the noise.
Hi Mel,
 Thanks for reviewing.

> 
> > Signed-off-by: Zou Nan hai <[EMAIL PROTECTED]>
> > Signed-off-by: Suresh Siddha <[EMAIL PROTECTED]>
> >
> > diff -Nraup a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > --- a/arch/x86/mm/init_64.c 2007-11-06 15:16:12.0 +0800
> > +++ b/arch/x86/mm/init_64.c 2007-11-06 15:55:50.0 +0800
> > @@ -448,6 +448,13 @@ void online_page(struct page *page)
> > num_physpages++;
> >  }
> >
> > +void * __meminit alloc_bootmem_high_node(pg_data_t *pgdat, unsigned long size,
> > +unsigned long align)
> > +{
> 
> Wrong markup there I believe. The __meminit markup is for functions
> that are needed at runtime when memory is hot-added or hot-removed.
> Bootmem functions do not qualify. __init is sufficient.
> 
__meminit is used here to avoid a section mismatch link warning.
This function is called by vmemmap_alloc_block, which is a __meminit function,
so if I mark it as __init, there will be a warning about section mismatch,
although this function will not be called at memory hotplug time.

> The name is confusing as well. I don't know what a high node is, but I
> think you mean alloc_bootmem_highmem  or alloc_bootmem_highmem_zone.
> That in *itself* is confusing on x86_64 because the memory above 4GiB is
> ZONE_NORMAL, not ZONE_HIGHMEM. Calling it alloc_bootmem_nondma() makes
> it worse.
> 
> I think you need to rework this to have an arch-specific function
> like arch_alloc_vmemmap_block() that by default allocates with
> alloc_bootmem_node() and otherwise uses an arch-specific function.
> In this case, it would know to call __alloc_bootmem_core() with the proper
> addressing limits.
> 
> > +return __alloc_bootmem_core(pgdat->bdata, size,
> > +align, (4UL*1024*1024*1024), 0, 1);
> > +}
> 
> More magic values, both the 4GiB address here and the magic "1" at the
> end are problems.
> 
Yes, the 4UL*1024*1024*1024 could be a define here.

However, I think defining a constant for a boolean parameter is overkill. Look at
kernel code, e.g. the force parameter in get_user_pages and the sync parameter
in try_to_wake_up; we don't do things like
#define TRY_TO_WAKEUP_SYNC 1
#define TRY_TO_WAKEUP_NOSYNC 0

There are other examples in kernel code. I believe it is common practice not
to define a constant for a boolean parameter.
What do you think?

> > +
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> >  /*
> >   * Memory is added always to NORMAL zone. This means you will never get
> > diff -Nraup a/include/linux/bootmem.h b/include/linux/bootmem.h
> > --- a/include/linux/bootmem.h   2007-11-06 16:06:31.0 +0800
> > +++ b/include/linux/bootmem.h   2007-11-06 15:50:36.0 +0800
> > @@ -61,6 +61,10 @@ extern void *__alloc_bootmem_core(struct
> >   unsigned long limit,
> >   int strict_goal);
> >
> > +extern void *alloc_bootmem_high_node(pg_data_t *pgdat,
> > +unsigned long size,
> > +unsigned long align);
> > +
> 
> Confusing. Your declaration here makes it look like a bootmem API
> function but it is an arch-specific function that only exists for
> x86-64. I know it gets overridden later when a weak symbol but the more
> common approach is to have Kconfig define something like
> ARCH_HIGHMEM_VMEMMAP and
> 
> #ifdef CONFIG_ARCH_HIGHMEM_VMEMMAP
> extern void *alloc_bootmem_high_node(pg_data_t *pgdat,
> unsigned long size,
> unsigned long align);
> #else
> static inli

Re: [poll] Is the megafreeze development model broken?

2007-11-08 Thread Chris Snook


ciol wrote:
> Chris Snook wrote:
> > Why are you asking the developers?  We do this for the sake of the users.
>
> The kernel is the software of the developers.

The kernel is a technology.  A distribution is a product.  When decisions about
technology and decisions about products are made *entirely* by the same people,
the result is never good.

> It's important to know how they want it to be distributed.


For commercial distributions, the answer is: "In whichever way results in the 
largest paycheck with the least amount of stress and effort", which means doing 
it the way that's best for the customer.


Non-commercial distributions have less of this pressure, but the same principle 
applies if they care about their users.  If you're not interested in the users 
but you are interested in the technology, you should be doing your work 
upstream, so the distribution is irrelevant.


Don't get me wrong, I think stable kernel trees like 2.6.16 are a good thing. 
They serve very well a whole bunch of different niches where users are willing 
to sacrifice the support benefits of a distribution kernel for the control of an 
upstream kernel, while maintaining the stability of their installed base.  These 
users have little interest in the general-purpose distribution kernel anyway, 
aside from perhaps wishing it included some config or patch that its maintainers 
have elected not to include.


-- Chris

Re: Some interesting observations when trying to optimize vmstat handling

2007-11-08 Thread Jeremy Fitzhardinge

Andi Kleen wrote:
> The only problem is that there might be some code who relies on 
> restore_flags() restoring other flags than IF, but at least for interrupts
> and local_irq_save/restore it should be fine to change.
>   

I don't think so.  We don't bother to save/restore the other flags in
Xen paravirt and it doesn't seem to cause a problem.  The semantics
really are specific to the state of the interrupt flag.

J

Re: oops in oprofile/dump_trace/X86 with 2.6.24-rcX

2007-11-08 Thread Robert Fitzsimons

> Philippe, on Sun, 21 Oct you sent a "[patch 1/2] oProfile: oops when
> profile_pc() return ~0LU" which as far as I can tell never got applied.

This patch applied on its own doesn't fix the problem causing my oops.

> I've queued the below revert of Jan's change, in case your lost [2/2] doesn't
> address Robert's oops.
> 
> Robert, can you please test this?

This revert patch applied on its own does fix the oops.

Robert


Re: sata NCQ blacklist entry

2007-11-08 Thread Luca Tettamanti

On Nov 9, 2007 12:32 AM, Robert Hancock <[EMAIL PROTECTED]> wrote:
> Luca Tettamanti wrote:
> > On Nov 7, 2007 1:55 PM, Tejun Heo <[EMAIL PROTECTED]> wrote:
> >> Florian La Roche wrote:
> >>> Hello all,
> >>>
> >>> I've taken email addresses from the last NCQ blacklist changes going
> >>> into the kernel.
> >>> This Fujitsu drive also gives me spurious command completions. Detailed
> >>> output also available at 
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=366181.
> >>>
> >>> Let me know if you need more info or anything else.
> >>>
> >>> --- drivers/ata/libata-core.c
> >>> +++ drivers/ata/libata-core.c
> >>> @@ -4222,6 +4222,7 @@
> >>>   { "WDC WD740ADFD-00NLR1", NULL, ATA_HORKAGE_NONCQ, },
> >>>   { "WDC WD3200AAJS-00RYA0", "12.01B01",  ATA_HORKAGE_NONCQ, },
> >>>   { "FUJITSU MHV2080BH",  "00840028", ATA_HORKAGE_NONCQ, },
> >>> + { "FUJITSU MHW2160BJ G2",   NULL,   ATA_HORKAGE_NONCQ },
> >>>   { "ST9120822AS","3.CLF",ATA_HORKAGE_NONCQ, },
> >>>   { "ST9160821AS","3.CLF",ATA_HORKAGE_NONCQ, },
> >>>   { "ST9160821AS","3.ALD",ATA_HORKAGE_NONCQ, },
> >> Thanks.  We're currently trying to find out what's actually going on
> >> with all these drives.  At first, drives which got blacklisted aren't
> >> many and made sense (had other problems with NCQ, etc..) but with new
> >> generation drives from many vendors showing the same symptom, we aren't
> >> too sure now.
> >
> > Is there a way to tell whether Windows is using NCQ or not? I checked
> > the system log (or whatever it's called) on my notebook and is clean
> > but I'm not sure it's using NCQ (I don't even know if it'd log
> > spurious completions somewhere).
>
> Which driver is installed for the SATA controller in Windows, the
> chipset-manufacturer-provided AHCI driver or the default Microsoft
> driver? You'd need the AHCI driver installed for NCQ to be used.

I'm aware of this. I'm using the AHCI driver (from Intel). Still, I
don't know if it's really used or limited like under Linux.

Luca

Re: [kvm-devel] [PATCH 3/3] virtio PCI device

2007-11-08 Thread Dor Laor


Anthony Liguori wrote:
> This is a PCI device that implements a transport for virtio.  It allows virtio
> devices to be used by QEMU based VMMs like KVM or Xen.

While it's a little premature, we can start thinking of irq path
improvements.
The current patch acks a private isr and afterwards apic eoi will also
be hit since it's a level trig irq. This means 2 vmexits per irq.
We can start with regular pci irqs and move afterwards to msi.
Some other ugly hack options [we'd better use msi]:
   - Read the eoi directly from apic and save the first private isr ack
   - Convert the specific irq line to edge triggered and don't share it
What do you guys think?

> +/* A small wrapper to also acknowledge the interrupt when it's handled.
> + * I really need an EIO hook for the vring so I can ack the interrupt once we
> + * know that we'll be handling the IRQ but before we invoke the callback since
> + * the callback may notify the host which results in the host attempting to
> + * raise an interrupt that we would then mask once we acknowledged the
> + * interrupt. */
> +static irqreturn_t vp_interrupt(int irq, void *opaque)
> +{
> +	struct virtio_pci_device *vp_dev = opaque;
> +	struct virtio_pci_vq_info *info;
> +	irqreturn_t ret = IRQ_NONE;
> +	u8 isr;
> +
> +	/* reading the ISR has the effect of also clearing it so it's very
> +	 * important to save off the value. */
> +	isr = ioread8(vp_dev->ioaddr + VIRTIO_PCI_ISR);
> +
> +	/* It's definitely not us if the ISR was not high */
> +	if (!isr)
> +		return IRQ_NONE;
> +
> +	spin_lock(&vp_dev->lock);
> +	list_for_each_entry(info, &vp_dev->virtqueues, node) {
> +		if (vring_interrupt(irq, info->vq) == IRQ_HANDLED)
> +			ret = IRQ_HANDLED;
> +	}
> +	spin_unlock(&vp_dev->lock);
> +
> +	return ret;
> +}
  



[PATCH] uvesafb: Fix warnings about unused variables on non-x86

2007-11-08 Thread Frank Lichtenheld

Variables that are only used in #ifdef CONFIG_X86 should also
only be declared there.

Signed-off-by: Frank Lichtenheld <[EMAIL PROTECTED]>
---
 drivers/video/uvesafb.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/video/uvesafb.c b/drivers/video/uvesafb.c
index b983d26..d1d6c0f 100644
--- a/drivers/video/uvesafb.c
+++ b/drivers/video/uvesafb.c
@@ -926,8 +926,10 @@ static int uvesafb_setpalette(struct uvesafb_pal_entry *entries, int count,
int start, struct fb_info *info)
 {
struct uvesafb_ktask *task;
+#ifdef CONFIG_X86
struct uvesafb_par *par = info->par;
int i = par->mode_idx;
+#endif
int err = 0;
 
/*
@@ -1103,11 +1105,11 @@ static int uvesafb_pan_display(struct fb_var_screeninfo *var,
 
 static int uvesafb_blank(int blank, struct fb_info *info)
 {
-   struct uvesafb_par *par = info->par;
struct uvesafb_ktask *task;
int err = 1;
-
 #ifdef CONFIG_X86
+   struct uvesafb_par *par = info->par;
+
if (par->vbe_ib.capabilities & VBE_CAP_VGACOMPAT) {
int loop = 1;
u8 seq = 0, crtc17 = 0;
-- 
1.5.3.4


Re: [PATCH] Align PCI memory regions to page size (4K) - Fix

2007-11-08 Thread Linas Vepstas

On Sun, Oct 28, 2007 at 11:52:16PM -0600, Grant Grundler wrote:
> > On Sun, Oct 28, 2007 at 03:53:20PM -0400, Barak Fargoun wrote:
> ...
> > > About your question: today, some of the hypervisors are using linux
> > > kernel as their domain-0 (e.g. Xen). In order to implement direct
> > > hardware access for these native domains (e.g.  running windows in a
> > > virtual machine above Xen), the PCI memory regions should be aligned
> > > at-least at the page-level (so, a virtual machine - can't see data of
> > > other devices which may not be assigned to it). So, for that reason,
> > > we wanted a boot parameter to let us force the kernel to align PCI
> > > memory regions at-least at a PAGE_SIZE alignment. It is very useful
> > > for hypervisors which are developed at Linux environment (e.g.: Xen).
> 
> It's a benefit IFF multiple devices are spread across more than one guest
> _and_ we don't trust every participating guest to play nicely with IO.  That way
> the Hypervisor can assign one device to a specific guest OS for direct access.
> E.g. 4 port Gige card could directly support the host and 3 guests with somewhat
> lower risk of tromping on each other's MMIO space.
> 
> If Xen is cooperative, this seems a bit paranoid. I don't recall ever seeing a
> driver bug where the driver accidentally poked MMIO space at the wrong device.

I presume the issue is not a driver bug per-se, but a
spying/hacking-type security issue: Having root in one guest could in
principle allow one to write a driver that snooped on data in other
guests, and/or intentionally corrupted data on other guests.

I envision some ISP renting out 1/3 of a machine with a 4-port card,
and having some nosey college-kid wannabe hacker getting root on one of
the guests and causing trouble.  But perhaps I'm way off base
here.

(Just like occasional cigarette smoking is known to inevitably lead to
full-fledged heroin addiction, I am pretty sure that the culture of
"cheat codes" among 12-year-olds is going to lead to an epidemic of
hackers in about 10 years. I am attuned to "wannabe hacker culture").

--linas

Re: [patch 02/23] SLUB: Rename NUMA defrag_ratio to remote_node_defrag_ratio

2007-11-08 Thread Matt Mackall

On Thu, Nov 08, 2007 at 01:28:31PM -0800, Christoph Lameter wrote:
> On Thu, 8 Nov 2007, Matt Mackall wrote:
> 
> > But perhaps I should just add a lightweight RNG to random.c and be
> > done with it.
> 
> It would be appreciated.

As someone pointed out privately, there's a random32() in lib/random32.c.

Unfortunately, this function is too heavy for many fast-path uses and
too weak for most other uses. I'll see if I can come up with something
better..

-- 
Mathematics is the supreme nostalgia of our time.

Re: [kvm-devel] [PATCH 3/3] virtio PCI device

2007-11-08 Thread Dor Laor


Anthony Liguori wrote:

Avi Kivity wrote:
  

Anthony Liguori wrote:
  


This is a PCI device that implements a transport for virtio.  It allows virtio
devices to be used by QEMU based VMMs like KVM or Xen.

  

  

Didn't see support for dma.



Not sure what you're expecting there.  Using dma_ops in virtio_ring?

  

 I think that with Amit's pvdma patches you
can support dma-capable devices as well without too much fuss.
  



What is the use case you're thinking of?  A semi-paravirt driver that 
does dma directly to a device?


Regards,

Anthony Liguori

  
You would also lose performance, since pv-dma will trigger an exit for
each virtio IO, while virtio kicks the hypervisor only after several IOs
have been queued.



Re: How do I debug PCI resource allocation problems

2007-11-08 Thread Robert Hancock


Rainer Koenig wrote:
This will get long, sorry. But I'm a bit desperate because I encounter strange 
problems on a new mainboard with Intel Q35 chipset and a shared memory 
graphics card. The logs and data I use here are from a SLED10 SP1 (x86_64) 
installation, but the problem occurs whatever distribution I try out. 


Ok, short description of the problem:

I run 64-bit Linux using 2 GB of RAM, no problem at all. Then I turn off the 
machine, add 2 more GB so that now I have 4 GB of RAM. Turning it on I see 
the splashscreen of the boot loader, starting the kernel turns the screen 
black and that's it. The machine comes up, I can even ssh to it over the net. 
That is how I obtained the following data.


First a look at the boot.msg log. I will point out places of interest (at 
least the lines I think that are important for further analysis).


--8<-boot.msg--start--
Inspecting /boot/System.map-2.6.16.46-0.12-smp
Loaded 23173 symbols from /boot/System.map-2.6.16.46-0.12-smp.
Symbols match kernel version 2.6.16.
No module symbols loaded - kernel modules not enabled.

klogd 1.4.1, log source = ksyslog started.
<4>Bootdata ok (command line is 
root=/dev/disk/by-id/scsi-SATA_ST3160815AS_9RX01AP0-part5 vga=0x31a
resume=/dev/sda1 splash=silent)
<5>Linux version 2.6.16.46-0.12-smp ([EMAIL PROTECTED]) (gcc version 4.1.2 
20070115 (prerelease) (SUSE Linux)) #1 SMP Thu May 17 14:00:09 UTC 2007

<6>BIOS-provided physical RAM map:
<4> BIOS-e820:  - 0009d800 (usable)
<4> BIOS-e820: 0009d800 - 000a (reserved)
<4> BIOS-e820: 000ce000 - 000d (reserved)
<4> BIOS-e820: 000e - 0010 (reserved)
<4> BIOS-e820: 0010 - df5d (usable)
<4> BIOS-e820: df5d - df5dc000 (ACPI data)
<4> BIOS-e820: df5dc000 - df5df000 (ACPI NVS)
<4> BIOS-e820: df5df000 - df70 (reserved)
<4> BIOS-e820: df80 - e010 (reserved)
<4> BIOS-e820: f800 - fc00 (reserved)
<4> BIOS-e820: fec0 - fec1 (reserved)
<4> BIOS-e820: fee0 - fee01000 (reserved)
<4> BIOS-e820: ffb0 - 0001 (reserved)

* Comment: The following 2 lines are added when I upgrade from 2 GB to 
4 GB.


<4> BIOS-e820: 0001 - 00011a00 (usable)
<4> BIOS-e820: 00011a00 - 00011c00 (reserved)
<6>DMI present.
<7>ACPI: RSDP (v000 PTLTD ) @ 
0x000f7240
<7>ACPI: RSDT (v001 PTLTDRSDT   0x0006  LTP 0x) @ 
0xdf5d6d0b
<7>ACPI: FADT (v001 FSC 0x0006  0x000f4240) @ 
0xdf5dba5f
<7>ACPI: TCPA (v001 Phoeni  x   0x0006 TL  0x) @ 
0xdf5dbad3
<7>ACPI: _MAR (v001 Intel  OEMDMAR  0x0006 LOHR 0x0001) @ 
0xdf5dbb05
<7>ACPI: SSDT (v001 FSCPST_CPU0 0x0006  CSF 0x0001) @ 
0xdf5dbb35
<7>ACPI: SSDT (v001 FSCPST_CPU1 0x0006  CSF 0x0001) @ 
0xdf5dbbeb
<7>ACPI: SSDT (v001 FSCPST_CPU2 0x0006  CSF 0x0001) @ 
0xdf5dbca1
<7>ACPI: SSDT (v001 FSCPST_CPU3 0x0006  CSF 0x0001) @ 
0xdf5dbd57
<7>ACPI: SPCR (v001 PTLTD  $UCRTBL$ 0x0006 PTL  0x0001) @ 
0xdf5dbe0d
<7>ACPI: MCFG (v001 PTLTDMCFG   0x0006  LTP 0x) @ 
0xdf5dbe5d
<7>ACPI: HPET (v001 PTLTD  HPETTBL  0x0006  LTP 0x0001) @ 
0xdf5dbe99
<7>ACPI: MADT (v001 PTLTD  	 APIC   0x0006  LTP 0x) @ 
0xdf5dbed1
<7>ACPI: BOOT (v001 PTLTD  $SBFTBL$ 0x0006  LTP 0x0001) @ 
0xdf5dbf55
<7>ACPI: ASF! (v016   CETP CETP 0x0006 PTL  0x0001) @ 
0xdf5dbf7d
<7>ACPI: DSDT (v001 FSCD2587/A1 0x0006 MSFT 0x0301) @ 
0x

<6>No NUMA configuration found
<6>Faking a node at -00011a00
<6>Bootmem setup node 0 -00011a00
<7>On node 0 totalpages: 1004539
<7>  DMA zone: 2979 pages, LIFO batch:0
<7>  DMA32 zone: 896520 pages, LIFO batch:31
<7>  Normal zone: 105040 pages, LIFO batch:31
<7>  HighMem zone: 0 pages, LIFO batch:0
<6>ACPI: PM-Timer IO Port: 0x1008
<7>ACPI: Local APIC address 0xfee0
<6>ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
<6>Processor #0 6:15 APIC version 20
<6>ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
<6>Processor #1 6:15 APIC version 20
<6>ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
<6>Processor #2 6:15 APIC version 20
<6>ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
<6>Processor #3 6:15 APIC version 20
<6>ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
<6>ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
<6>ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
<6>ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
<6>ACPI: IOAPIC (id[0x04] address[0xfec0] gsi_base[0])
<6>IOAPIC[0]: apic_id

Re: [patch 01/28] cpu alloc: The allocator

2007-11-08 Thread David Miller

From: Peter Zijlstra <[EMAIL PROTECTED]>
Date: Thu, 08 Nov 2007 21:19:08 +0100

> 
> On Thu, 2007-11-08 at 10:31 -0800, Christoph Lameter wrote:
> > On Thu, 8 Nov 2007, Peter Zijlstra wrote:
> > 
> > > I don't like those shouting macros.
> > 
> > The convention for macros is to use upper case.
> 
> We have plenty of macros that look like regular functions. And as a primary
> interface to this functionality these shouting things look really out of
> place.

I disagree, macros in upper case make sense here.  Macros should SHOUT
at you because CPP is MAGIC and has side effects that normal real
functions do not have, and therefore you need to be REMINDED.

And, honestly, aren't there more important issues about his patches to
review than macro capitalization?

Re: [patch 01/28] cpu alloc: The allocator

2007-11-08 Thread David Miller

From: Christoph Lameter <[EMAIL PROTECTED]>
Date: Thu, 8 Nov 2007 12:24:22 -0800 (PST)

> On Thu, 8 Nov 2007, Peter Zijlstra wrote:
> 
> > > The convention for macros is to use upper case.
> > 
> > We have plenty of macros that look like regular functions. And as a primary
> > interface to this functionality these shouting things look really out of
> > place.
> 
> One point of the patchset is to clean up the messy handling of the 
> allocpercpu interface which uses lower case for macros. It is a bit 
> confusing that a function like alloc_percpu() can take a type argument. 
> I think this needs to be uppercase for clarity.

Without a doubt.

Re: Some interesting observations when trying to optimize vmstat handling

2007-11-08 Thread Christoph Lameter

On Fri, 9 Nov 2007, Andi Kleen wrote:

> 
> > There is an interrupt enable overhead of 48 cycles that would be good to
> > be able to eliminate (Kernel code usually moves counter increments into
> > a neighboring interrupt disable section so that __ function can be used).
> 
> Replace the push flags ; popf  with test $IFMASK,flags ; jz 1f; sti ; 1:
> That will likely make it much faster (but also bigger) 

Well maybe we should change local_irq_save/restore in general?

The result would be:


	if (!in_interrupt())
		local_irq_disable();



	if (!in_interrupt())
		local_irq_enable();



Somehow we need to remember that we disabled interrupts.

Then it gets more complicated.


	int interrupts_disabled = 0;

	if (!in_interrupt()) {
		local_irq_disable();
		interrupts_disabled = 1;
	}



	if (interrupts_disabled)
		local_irq_enable();



Not sure that this is actually better.
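
To make the second variant concrete, here is a minimal user-space sketch of
the "remember whether we disabled interrupts" pattern. It is only a model:
the model_* names are hypothetical stand-ins, not kernel APIs, and it makes
the same assumption as the pseudocode above, namely that interrupts are
enabled whenever we are not in interrupt context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel primitives. */
static bool model_irqs_enabled = true;   /* models the CPU interrupt flag */
static bool model_in_interrupt = false;  /* models in_interrupt() */

static void model_irq_disable(void) { model_irqs_enabled = false; }
static void model_irq_enable(void)  { model_irqs_enabled = true; }

/*
 * Disable interrupts only when outside interrupt context.
 * The return value records whether this call did the disabling.
 */
static bool counter_lock(void)
{
	if (!model_in_interrupt) {
		model_irq_disable();
		return true;
	}
	return false;
}

/* Re-enable interrupts only if the matching counter_lock() disabled them. */
static void counter_unlock(bool disabled_here)
{
	if (disabled_here)
		model_irq_enable();
}

static long counter;

static void counter_inc(void)
{
	bool disabled_here = counter_lock();

	/* The increment must never run with interrupts enabled. */
	assert(!model_irqs_enabled);
	counter++;

	counter_unlock(disabled_here);
}
```

The extra local variable and the second branch are the cost being weighed
here against the pushf/popf pair.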


> The only problem is that there might be some code that relies on 
> restore_flags() restoring other flags than IF, but at least for interrupts
> and local_irq_save/restore it should be fine to change.

The statistics code surely does not rely on that.


Re: [PATCH 0/6] MN10300: Add the MN10300 architecture to Linux kernel [try #4]

2007-11-08 Thread David Howells

Sam Ravnborg <[EMAIL PROTECTED]> wrote:

> kbuild does not know anything about ASFLAGS.
> In the whole kernel tree I only see frv assign them but the
> value is never used.

Hmmm... I think it used to be.  No matter, I'll drop it from MN10300.

> It just seemed stupid that someone having copyright on a file
> where the person only had contributed the copyright line (for said file).

I'll consult MEI as to what boilerplate copyright messages I should plaster on
such files.

David

Re: sata NCQ blacklist entry

2007-11-08 Thread Robert Hancock


Luca Tettamanti wrote:

On Nov 7, 2007 1:55 PM, Tejun Heo <[EMAIL PROTECTED]> wrote:

Florian La Roche wrote:

Hello all,

I've taken email addresses from the last NCQ blacklist changes going
into the kernel.
This Fujitsu drive also gives me spurious command completions. Detailed
output also available at https://bugzilla.redhat.com/show_bug.cgi?id=366181.

Let me know if you need more info or anything else.

--- drivers/ata/libata-core.c
+++ drivers/ata/libata-core.c
@@ -4222,6 +4222,7 @@
  { "WDC WD740ADFD-00NLR1", NULL, ATA_HORKAGE_NONCQ, },
  { "WDC WD3200AAJS-00RYA0", "12.01B01",  ATA_HORKAGE_NONCQ, },
  { "FUJITSU MHV2080BH",  "00840028", ATA_HORKAGE_NONCQ, },
+ { "FUJITSU MHW2160BJ G2",   NULL,   ATA_HORKAGE_NONCQ },
  { "ST9120822AS","3.CLF",ATA_HORKAGE_NONCQ, },
  { "ST9160821AS","3.CLF",ATA_HORKAGE_NONCQ, },
  { "ST9160821AS","3.ALD",ATA_HORKAGE_NONCQ, },

Thanks.  We're currently trying to find out what's actually going on
with all these drives.  At first, drives which got blacklisted aren't
many and made sense (had other problems with NCQ, etc..) but with new
generation drives from many vendors showing the same symptom, we aren't
too sure now.


Is there a way to tell whether Windows is using NCQ or not? I checked
the system log (or whatever it's called) on my notebook and is clean
but I'm not sure it's using NCQ (I don't even know if it'd log
spurious completions somewhere).


Which driver is installed for the SATA controller in Windows, the 
chipset-manufacturer-provided AHCI driver or the default Microsoft 
driver? You'd need the AHCI driver installed for NCQ to be used.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Some interesting observations when trying to optimize vmstat handling

2007-11-08 Thread Andi Kleen


> There is an interrupt enable overhead of 48 cycles that would be good to
> be able to eliminate (Kernel code usually moves counter increments into
> a neighboring interrupt disable section so that __ function can be used).

Replace the push flags ; popf  with test $IFMASK,flags ; jz 1f; sti ; 1:
That will likely make it much faster (but also bigger) 

The only problem is that there might be some code that relies on 
restore_flags() restoring other flags than IF, but at least for interrupts
and local_irq_save/restore it should be fine to change.

-Andi


Re: [PATCH RFC 3/7] x86: clean up asm-x86/page*.h

2007-11-08 Thread Jeremy Fitzhardinge

Glauber de Oliveira Costa wrote:
> Not exactly.
> for the native_ functions,  there's room for code sharing.
> native_pgd_val, and native_pte_val seem to be the same, for at least
> pae and x86_64.
> As for the typedefs, the same thing can be done. Much like you did in
> paravirt.h, just split out between the < 3 and >= 3 levels.

Yeah, I see what you mean.  I'll play with it and see how it turns out.

J

Re: [Lguest] [PATCH] virtio config_ops refactoring

2007-11-08 Thread Anthony Liguori


ron minnich wrote:

Hi, I'm sorry, I've been stuck on other things (NFS RDMA anyone?) and
missed part of this discussion.

Is it really the case that operations on virtio devices will involve
outl/inl etc.?
  


No, this is just for the PCI virtio transport.  lguest's virtio 
transport uses hypercalls and shared memory.


Regards,

Anthony Liguori


Apologies in advance for my failure to pay attention.

thanks

ron
  



snd_hda_intel 2.6.24-rc2 bug: interrupts don't always work on Lenovo X60s

2007-11-08 Thread Roland Dreier

With 2.6.24-rc2 on my Lenovo X60s, I sometimes get:

hda_intel: No response from codec, disabling MSI: last cmd=0x002f0d00
hda_intel: azx_get_response timeout, switching to polling mode: last 
cmd=0x002f0d00

when loading snd_hda_intel.  Sound still works but interrupts aren't generated.

I tried bisecting but I haven't gotten good info yet because it seems
that this is not completely reproducible -- sometimes when I load the
module, it works fine, and other times I get the message.

I think this is a regression, since I don't recall ever seeing the
message with 2.6.22 or so.

Any idea on how to help debug this?

Thanks,
  Roland

Re: [Lguest] [PATCH] virtio config_ops refactoring

2007-11-08 Thread ron minnich

Hi, I'm sorry, I've been stuck on other things (NFS RDMA anyone?) and
missed part of this discussion.

Is it really the case that operations on virtio devices will involve
outl/inl etc.?

Apologies in advance for my failure to pay attention.

thanks

ron

Re: SC1200 failure in 2.6.23 and 2.6.24-rc1-git10

2007-11-08 Thread Alan Cox

On Thu, 8 Nov 2007 21:56:32 +0200
"Denys Fedoryshchenko" <[EMAIL PROTECTED]> wrote:

> Thanks, it works like that.
> 
> Seems in libata there is no fall-back to non-DMA mode, if DMA didn't work.


It should be falling back from UDMA or MWDMA to PIO; if not, please file a
bug.

Alan

Re: [PATCH RFC 3/7] x86: clean up asm-x86/page*.h

2007-11-08 Thread Glauber de Oliveira Costa

On Nov 8, 2007 6:59 PM, Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
> Glauber de Oliveira Costa wrote:
> > On Nov 7, 2007 11:50 PM, Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
> >
> >> +#define PAGETABLE_LEVELS   3
> >> +
> >> +typedef u64pteval_t;
> >> +typedef u64pmdval_t;
> >> +typedef u64pudval_t;
> >> +typedef u64pgdval_t;
> >> +
> >>
> >
> >
> >> -static inline unsigned long long native_pgd_val(pgd_t pgd)
> >> +static inline pgdval_t native_pgd_val(pgd_t pgd)
> >>  {
> >>
> > Maybe these kind of things, the typedef and native_xxx definitions can
> > go into the common header, after we define the PAGETABLE_LEVELS
> > constant?
> > I think the more goes into common headers, the better.
> >
>
> You mean put them in a common header, but conditionally by #if
> PAGETABLE_LEVELS?  I don't think that would be much of an improvement;
> it would just add more #ifs, which adds lines and conceptual
> complexity.  If you go that way, you may as well put everything in one
> header wrapped in #ifs, but personally I don't think that would help.

Not exactly.
for the native_ functions,  there's room for code sharing.
native_pgd_val, and native_pte_val seem to be the same, for at least
pae and x86_64.
As for the typedefs, the same thing can be done. Much like you did in
paravirt.h, just split out between the < 3 and >= 3 levels.

But if it turns out to be just code movement, and I'm wrong in my
supposition that we can turn three variants of the same code into two,
then I agree
with you, let's keep it this way.


-- 
Glauber de Oliveira Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

Re: 2.6.23-mm1 breaks C-state support on Intel T7200 x86_64

2007-11-08 Thread Mark Gross

On Thu, Nov 08, 2007 at 12:19:44PM -0500, [EMAIL PROTECTED] wrote:
> (Sorry for not reporting this sooner - I haven't been running off battery
> much in the last 3 weeks, so I didn't notice it till now...)
> 
> Dell Latitude D820 laptop, T7200 Core2 Duo CPU, x86_64 kernel.
> 
> As reported by 'powertop' on a basically idle machine:
> 
> 2.6.23-mm1:
> 
> CnAvg residency   P-states (frequencies)
> C0 (cpu running)(100.0%)2.00 Ghz 0.8%
> C10.0ms ( 0.0%) 1.67 Ghz 0.0%
> C20.0ms ( 0.0%) 1333 Mhz 0.0%
> C30.0ms ( 0.0%) 1000 Mhz99.2%
> 
> 2.6.23-rc8-mm2:
> 
> CnAvg residency   P-states (frequencies)
> C0 (cpu running)( 0.3%) 2.00 Ghz 0.0%
> C10.0ms ( 0.0%) 1.67 Ghz 0.0%
> C20.0ms ( 0.0%) 1333 Mhz 0.0%
> C3   31.5ms (99.7%) 1000 Mhz   100.0%
> 
> In addition, the ACPI power estimate reported about 25 watts for 23-mm1,
> but only 21 watts for -rc8-mm2, a significant regression.
> 
> I bisected this down to this set of patches:
> 
> pm-qos-infrastructure-and-interface.patch
> pm-qos-infrastructure-and-interface-fix.patch
> pm-qos-infrastructure-and-interface-vs-git-acpi.patch
> pm-qos-infrastructure-and-interface-vs-git-acpi-2.patch
> latencyc-use-qos-infrastructure.patch
> 
> The patch says:
> 
>   To register the default pm_qos target for the specific parameter, the
>   process must open one of /dev/[cpu_dma_latency, network_latency,
>   network_throughput]
> 
>   As long as the device node is held open that process has a registered
>   requirement on the parameter.  The name of the requirement is
>   "process_" derived from the current->pid from within the open system
>   call.
> 
> I shouldn't have to have a process open a /dev/file, write a number, and then
> stay around forever so the file doesn't close in order to get the same
> behavior I was getting by default before.  What needs to happen to get this
> to not be a behavior regression/change?
> 
> 
> 
> 

The following patch fixes up the cpuidle / pm-qos integration.

I suspect that this is folded into another mm patch but it should fix
the C-state issue identified.

--mgross



Signed-off-by: mark gross <[EMAIL PROTECTED]>

-

Index: linux-2.6.23-mm1/drivers/cpuidle/cpuidle.c
===
--- linux-2.6.23-mm1.orig/drivers/cpuidle/cpuidle.c 2007-11-08 13:09:53.0 -0800
+++ linux-2.6.23-mm1/drivers/cpuidle/cpuidle.c  2007-11-08 13:25:13.0 -0800
@@ -268,7 +268,7 @@
 
 static inline void latency_notifier_init(struct notifier_block *n)
 {
-pm_qos_add_notifier(PM_QOS_CPUIDLE, n);
+   pm_qos_add_notifier(PM_QOS_CPU_DMA_LATENCY, n);
 }
 
 #else /* CONFIG_SMP */
Index: linux-2.6.23-mm1/drivers/cpuidle/governors/ladder.c
===
--- linux-2.6.23-mm1.orig/drivers/cpuidle/governors/ladder.c2007-11-08 13:09:53.0 -0800
+++ linux-2.6.23-mm1/drivers/cpuidle/governors/ladder.c 2007-11-08 13:11:30.0 -0800
@@ -82,7 +82,7 @@
if (last_idx < dev->state_count - 1 &&
last_residency > last_state->threshold.promotion_time &&
dev->states[last_idx + 1].exit_latency <=
-   pm_qos_requirement(PM_QOS_CPUIDLE)) {
+   pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY)) {
last_state->stats.promotion_count++;
last_state->stats.demotion_count = 0;
if (last_state->stats.promotion_count >= last_state->threshold.promotion_count) {
Index: linux-2.6.23-mm1/drivers/cpuidle/governors/menu.c
===
--- linux-2.6.23-mm1.orig/drivers/cpuidle/governors/menu.c  2007-11-08 13:12:11.0 -0800
+++ linux-2.6.23-mm1/drivers/cpuidle/governors/menu.c   2007-11-08 13:24:03.0 -0800
@@ -48,7 +48,8 @@
break;
if (s->target_residency > data->predicted_us)
break;
-   if (s->exit_latency > pm_qos_requirement(PM_QOS_CPUIDLE))
+   if (s->exit_latency >
+   pm_qos_requirement(PM_QOS_CPU_DMA_LATENCY))
break;
}
 
Index: linux-2.6.23-mm1/include/linux/pm_qos_params.h
===
--- linux-2.6.23-mm1.orig/include/linux/pm_qos_params.h 2007-11-08 13:09:53.0 -0800
+++ linux-2.6.23-mm1/include/linux/pm_qos_params.h  2007-11-08 13:14:05.0 -0800
@@ -6,23 +6,12 @@
 #include 
 #include 
 
-struct requirement_list {
-   struct list_head list;
-   union {
-   s32 value;
-   s32 usec;
-   s32 kbps;
-   };
-   char *name;
-};
-
 #define PM_QOS_RESERVED 0
 #define PM_QOS_CPU_DM

Re: cannot "hibernate" if program being debugged in gdb is paused after SIGABRT in linux 2.6.23 (but can in 2.6.22.7)

2007-11-08 Thread Rafael J. Wysocki

On Tuesday, 30 of October 2007, CSights wrote:
> >
> > Thanks, I'll try to reproduce the problem here and fix it.
> 
> No really, thank you!

Sorry for the long delay.

Can you please check if you are able to reproduce the problem with 2.6.24-rc2?

Thanks,
Rafael

Re: Buffer overflow in CIFS VFS.

2007-11-08 Thread Jörn Engel

Not everyone has the time to read lkml.  Added Steve to Cc:, just in
case.

On Thu, 8 November 2007 22:20:03 +0100, Przemyslaw Wegrzyn wrote:
> 
> I was looking at CIFS VFS code recently, trying to solve another issue,
> just to find something that looks like a buffer overflow bug.
> The problem is in SendReceive() function in transport.c - it memcpy's
> message payload into a buffer passed via out_buf param. The function
> assumes that all buffers are of size (CIFSMaxBufSize +
> MAX_CIFS_HDR_SIZE) , unfortunately it is also called with smaller
> (MAX_CIFS_SMALL_BUFFER_SIZE) buffers.
> 
> To check this finding I patched Samba server to send oversized logoffX
> messages. With ~ 16kB messages the client running 2.6.23.1 crashed upon
> unmounting.
> 
> I've done a quick fix, available here:
> http://czajnick.sitenet.pl/cifs-buffer-overflow-fix.patch.gz
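
For readers who don't want to fetch the patch: the class of bug described
above is a memcpy() whose length comes from the peer and is never compared
against the size of the destination buffer. The sketch below shows the
general shape of a bounds-checked copy. It is not the linked fix (which is
not reproduced here), and copy_response_checked() is a hypothetical helper
name, not a CIFS function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a received message into out_buf only if it fits.
 * Returns 0 on success, -1 if the message is larger than the buffer.
 * The caller must pass the real size of the buffer it allocated
 * instead of assuming every buffer is the large variant.
 */
static int copy_response_checked(void *out_buf, size_t out_buf_size,
				 const void *msg, size_t msg_len)
{
	assert(out_buf != NULL && msg != NULL);

	if (msg_len > out_buf_size)
		return -1;	/* oversized reply: refuse rather than overflow */

	memcpy(out_buf, msg, msg_len);
	return 0;
}
```

The point is only that the destination size has to travel with the buffer
pointer, so that small and large buffers can share one receive path.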

Jörn

-- 
When in doubt, punt.  When somebody actually complains, go back and fix it...
The 90% solution is a good thing.
-- Rob Landley

Re: [PATCH] virtio config_ops refactoring

2007-11-08 Thread Anthony Liguori


Rusty Russell wrote:

On Thursday 08 November 2007 13:41:16 Anthony Liguori wrote:
  

Rusty Russell wrote:


On Thursday 08 November 2007 04:30:50 Anthony Liguori wrote:
  

I would prefer that the virtio API not expose a little endian standard.
I'm currently converting config->get() ops to ioreadXX depending on the
size which already does the endianness conversion for me so this just
messes things up.  I think it's better to let the backend deal with
endianness since it's trivial to handle for both the PCI backend and the
lguest backend (lguest doesn't need to do any endianness conversion).


-ETOOMUCHMAGIC.  We should either expose all the XX interfaces (but this
isn't a high-speed interface, so let's not) or not "sometimes" convert
endianness. Getting surprises because a field happens to be packed into 4
bytes is counter-intuitive.
  

Then I think it's necessary to expose the XX interfaces.  Otherwise, the
backend has to deal with doing all register operations at a per-byte
granularity which adds a whole lot of complexity on a per-device basis
(as opposed to a little complexity once in the transport layer).



Huh?  Take a look at the drivers, this simply isn't true.  Do you have 
evidence that it will be true later?
  


I'm a bit confused.  So right now, the higher level virtio functions do 
endianness conversion.  I really want to make sure that if a guest tries 
to read a 4-byte PCI config field, that it does so using an "outl" 
instruction so that in my QEMU backend, I don't have to deal with a 
guest reading/writing a single byte within a 4-byte configuration 
field.  It's the difference between having in the PIO handler:


switch (addr) {
case VIRTIO_BLK_CONFIG_MAX_SEG:
   return vdev->max_seg;
case VIRTIO_BLK_CONFIG_MAX_SIZE:
   return vdev->max_size;

}

and:

switch (addr) {
case VIRTIO_BLK_CONFIG_MAX_SEG:
  return vdev->max_seg & 0xFF;
case VIRTIO_BLK_CONFIG_MAX_SEG + 1:
  return (vdev->max_seg >> 8) & 0xFF;
case VIRTIO_BLK_CONFIG_MAX_SEG + 2:
  return (vdev->max_seg >> 16) & 0xFF;
case VIRTIO_BLK_CONFIG_MAX_SEG + 3:
  return (vdev->max_seg >> 24) & 0xFF;
case VIRTIO_BLK_CONFIG_MAX_SIZE:
  return vdev->max_size & 0xFF;
case VIRTIO_BLK_CONFIG_MAX_SIZE + 1:
  return (vdev->max_size >> 8) & 0xFF;
case VIRTIO_BLK_CONFIG_MAX_SIZE + 2:
  return (vdev->max_size >> 16) & 0xFF;
case VIRTIO_BLK_CONFIG_MAX_SIZE + 3:
  return (vdev->max_size >> 24) & 0xFF;
...
}


It's the host-side code I'm concerned about, not the guest-side code.  
I'm happy to just ignore the whole endianness conversion thing and 
always pass values through in the CPU bitness but it's very important to 
me that the PCI config registers are accessed with their natural sized 
instructions (just as they would with a real PCI device).


Regards,

Anthony Liguori

Plus your code will be smaller doing a single writeb/readb loop than trying to 
do a switch statement.


  

You really want to be able to rely on multi-byte atomic operations too
when setting values.  Otherwise, you need another register just to
signal when it's okay for the device to examine any given register.



You already do; the status register fills this role.  For example, you can't 
tell what features a guest understands until it updates the status register.


Hope that clarifies,
Rusty.

  



Re: 2.6.24-rc1-gb4f5550 oops

2007-11-08 Thread Rafael J. Wysocki

On Thursday, 8 of November 2007, Grant Wilson wrote:
> On Thu, 8 Nov 2007 22:42:21 +0100
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> 
> > On Thursday, 8 of November 2007, Grant Wilson wrote:
> > > On Thu, 8 Nov 2007 16:53:10 +0100
> > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Thursday, 8 of November 2007, Grant Wilson wrote:
> > > > > On Thu, 8 Nov 2007 01:06:21 +0100
> > > > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > > > > 
> > > > > > On Monday, 5 of November 2007, Grant Wilson wrote:
> > > > > > > Hi,
> > > > > > > I got this oops on 2.6.24-rc1-641-gb4f5550:
> > > > > > 
> > > > > > (1) Is this reproducible?
> > > > > > (2) Did it happen previously on your system?
> > > > > >
> > > > > > [18073.371126] Unable to handle kernel NULL pointer dereference at 
> > > > > > 0120 RIP: 
> > > > > > [18073.371134]  [] check_preempt_wakeup+0x6e/0x110
> > > > > 
> > > > > This has now happened twice - the second time was last night when
> > > > > running 2.6.24-rc2.
> > > > > 
> > > > > Here's that second occurrence:
> > > > > 
> > > [snip]
> > > > 
> > > > Hmm.
> > > > 
> > > > Please run "gdb vmlinux" and see what code corresponds to
> > > > check_preempt_wakeup+0x6e in your kernel.
> > >
> > > 
> > > Dump of assembler code for function check_preempt_wakeup:
> > 
> > Well, thanks, but I meant the source code.  Please do "gdb vmlinux" and then
> > "l *check_preempt_wakeup+0x6e" in gdb.
> 
> Here's the requested output:
> 
> (gdb) l *check_preempt_wakeup+0x6e
> 0x802329ae is in check_preempt_wakeup (kernel/sched_fair.c:668).
> 663
> 664 /* Do the two (enqueued) entities belong to the same group ? */
> 665 static inline int
> 666 is_same_group(struct sched_entity *se, struct sched_entity *pse)
> 667 {
> 668 if (se->cfs_rq == pse->cfs_rq)
> 669 return 1;
> 670
> 671 return 0;
> 672 }

Well, it looks like either se or pse is NULL.

Ingo, can you please have a look?

Thanks,
Rafael

Re: [PATCH] virtio config_ops refactoring

2007-11-08 Thread Rusty Russell

On Thursday 08 November 2007 13:41:16 Anthony Liguori wrote:
> Rusty Russell wrote:
> > On Thursday 08 November 2007 04:30:50 Anthony Liguori wrote:
> >> I would prefer that the virtio API not expose a little endian standard.
> >> I'm currently converting config->get() ops to ioreadXX depending on the
> >> size which already does the endianness conversion for me so this just
> >> messes things up.  I think it's better to let the backend deal with
> >> endianness since it's trivial to handle for both the PCI backend and the
> >> lguest backend (lguest doesn't need to do any endianness conversion).
> >
> > -ETOOMUCHMAGIC.  We should either expose all the XX interfaces (but this
> > isn't a high-speed interface, so let's not) or not "sometimes" convert
> > endianness. Getting surprises because a field happens to be packed into 4
> > bytes is counter-intuitive.
>
> Then I think it's necessary to expose the XX interfaces.  Otherwise, the
> backend has to deal with doing all register operations at a per-byte
> granularity which adds a whole lot of complexity on a per-device basis
> (as opposed to a little complexity once in the transport layer).

Huh?  Take a look at the drivers, this simply isn't true.  Do you have 
evidence that it will be true later?

Plus your code will be smaller doing a single writeb/readb loop than trying to 
do a switch statement.
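
A rough sketch of what such a byte-wise loop could look like. The names fake_config, fake_readb() and config_get() are hypothetical stand-ins made up for this illustration, not the virtio API; the point is only that one loop covers every field width without a switch and without any byte swapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device config space, standing in for real device registers. */
static const uint8_t fake_config[8] = {
	0x10, 0x32, 0x54, 0x76, 0x98, 0xba, 0xdc, 0xfe
};

/* Stand-in for a readb()-style single-byte register read. */
static uint8_t fake_readb(size_t offset)
{
	assert(offset < sizeof(fake_config));
	return fake_config[offset];
}

/* Copy len bytes of config space starting at offset, one byte at a time. */
static void config_get(size_t offset, void *buf, size_t len)
{
	uint8_t *dst = buf;
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = fake_readb(offset + i);
}
```

The caller gets the bytes exactly as the device laid them out, whatever the field size, so nothing changes behaviour just because a field happens to be 4 bytes wide.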

> You really want to be able to rely on multi-byte atomic operations too
> when setting values.  Otherwise, you need another register to just to
> signal when it's okay for the device to examine any given register.

You already do; the status register fills this role.  For example, you can't 
tell what features a guest understands until it updates the status register.

Hope that clarifies,
Rusty.

Re: 2.6.24-rc1-gb4f5550 oops

2007-11-08 Thread Grant Wilson

On Thu, 8 Nov 2007 22:42:21 +0100
"Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:

> On Thursday, 8 of November 2007, Grant Wilson wrote:
> > On Thu, 8 Nov 2007 16:53:10 +0100
> > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > 
> > > On Thursday, 8 of November 2007, Grant Wilson wrote:
> > > > On Thu, 8 Nov 2007 01:06:21 +0100
> > > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > > On Monday, 5 of November 2007, Grant Wilson wrote:
> > > > > > Hi,
> > > > > > I got this oops on 2.6.24-rc1-641-gb4f5550:
> > > > > 
> > > > > (1) Is this reproducible?
> > > > > (2) Did it happen previously on your system?
> > > > >
> > > > > [18073.371126] Unable to handle kernel NULL pointer dereference at 
> > > > > 0120 RIP: 
> > > > > [18073.371134]  [] check_preempt_wakeup+0x6e/0x110
> > > > 
> > > > This has now happened twice - the second time was last night when
> > > > running 2.6.24-rc2.
> > > > 
> > > > Here's that second occurrence:
> > > > 
> > [snip]
> > > 
> > > Hmm.
> > > 
> > > Please run "gdb vmlinux" and see what code corresponds to
> > > check_preempt_wakeup+0x6e in your kernel.
> >
> > 
> > Dump of assembler code for function check_preempt_wakeup:
> 
> Well, thanks, but I meant the source code.  Please do "gdb vmlinux" and then
> "l *check_preempt_wakeup+0x6e" in gdb.

Here's the requested output:

(gdb) l *check_preempt_wakeup+0x6e
0x802329ae is in check_preempt_wakeup (kernel/sched_fair.c:668).
663
664 /* Do the two (enqueued) entities belong to the same group ? */
665 static inline int
666 is_same_group(struct sched_entity *se, struct sched_entity *pse)
667 {
668 if (se->cfs_rq == pse->cfs_rq)
669 return 1;
670
671 return 0;
672 }

Grant

Re: [patch 2/2] clone: prepare to recycle CLONE_DETACHED and CLONE_STOPPED

2007-11-08 Thread Roland McGrath

CLONE_STOPPED was previously used by some NPTL versions when under
thread_db (i.e. only when being actively debugged by gdb), but not for a
long time now, and it never worked reliably when it was used.
Removing it seems fine to me.


Thanks,
Roland

Re: [PATCH] pf broken

2007-11-08 Thread Ondrej Zary

Sorry for the broken patch. Hope it's OK now.


The pf driver for parallel port floppy drives seems to be broken. At least
with Imation SuperDisk with EPAT chip, the driver calls pi_connect() and
pi_disconnect() after each transferred sector. At least with EPAT, this
operation is very expensive - causes drive recalibration. Thus,
transferring even a single byte (dd if=/dev/pf0 of=/dev/null bs=1 count=1)
takes 20 seconds, making the driver useless.

The pf_next_buf() function seems to be broken as it always returns 1
(except when pf_run has dropped to zero), causing the loop in do_pf_read_drq
(and do_pf_write_drq) to be executed only once.
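
To make the effect concrete, here is a small user-space model of the transfer loop. It is only a sketch: struct pf_model and the helper names are invented for illustration, the return values are taken from the diff further down, and the patched driver's fetching of the next request when pf_count runs out is left out.

```c
#include <assert.h>

/* Hypothetical model of the driver state: sectors left in the current run. */
struct pf_model {
	int run;
};

/* Pre-patch return values: 0 once the run is used up, 1 otherwise. */
static int next_buf_old(struct pf_model *m)
{
	m->run--;
	if (!m->run)
		return 0;
	return 1;
}

/* Post-patch return values: 1 (stop) once the run is used up, else 0. */
static int next_buf_new(struct pf_model *m)
{
	m->run--;
	if (!m->run)
		return 1;
	return 0;
}

/*
 * Count the sectors moved within one connect/disconnect cycle.
 * The loop stops as soon as next_buf() returns non-zero.
 */
static int sectors_per_connect(int (*next_buf)(struct pf_model *), int sectors)
{
	struct pf_model m = { sectors };
	int moved = 0;

	assert(sectors > 0);
	do {
		moved++;	/* stands in for pi_read_block() */
	} while (!next_buf(&m) && m.run > 0);	/* run guard keeps the model bounded */
	return moved;
}
```

With an 8-sector run the old variant leaves the loop after one sector, so every sector pays for a full connect/disconnect, while the new variant moves all 8 in one cycle.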

The following patch fixes this problem. It also fixes swapped descriptions
in pf_lock() function and removes DBMSG macro, which seems useless.

Signed-off-by: Ondrej Zary <[EMAIL PROTECTED]>

-- 
Ondrej Zary

--- linux-2.6.23-orig/drivers/block/paride/pf.c 2007-10-09 22:31:38.0 +0200
+++ linux/drivers/block/paride/pf.c 2007-11-08 22:29:31.0 +0100
@@ -488,13 +488,11 @@
return r;
 }
 
-#define DBMSG(msg)  ((verbose>1)?(msg):NULL)
-
 static void pf_lock(struct pf_unit *pf, int func)
 {
char lo_cmd[12] = { ATAPI_LOCK, pf->lun << 5, 0, 0, func, 0, 0, 0, 0, 0, 0, 0 };
 
-   pf_atapi(pf, lo_cmd, 0, pf_scratch, func ? "unlock" : "lock");
+   pf_atapi(pf, lo_cmd, 0, pf_scratch, func ? "lock" : "unlock");
 }
 
 static void pf_eject(struct pf_unit *pf)
@@ -555,7 +553,7 @@
{ ATAPI_MODE_SENSE, pf->lun << 5, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0 };
char buf[8];
 
-   pf_atapi(pf, ms_cmd, 8, buf, DBMSG("mode sense"));
+   pf_atapi(pf, ms_cmd, 8, buf, "mode sense");
pf->media_status = PF_RW;
if (buf[3] & 0x80)
pf->media_status = PF_RO;
@@ -591,7 +589,7 @@
char buf[8];
int bs;
 
-   if (pf_atapi(pf, rc_cmd, 8, buf, DBMSG("get capacity"))) {
+   if (pf_atapi(pf, rc_cmd, 8, buf, "get capacity")) {
pf->media_status = PF_NM;
return;
}
@@ -804,13 +802,18 @@
pf_buf += 512;
pf_block++;
if (!pf_run)
-   return 0;
-   if (!pf_count)
return 1;
-   spin_lock_irqsave(&pf_spin_lock, saved_flags);
-   pf_end_request(1);
-   spin_unlock_irqrestore(&pf_spin_lock, saved_flags);
-   return 1;
+   if (!pf_count) {
+   spin_lock_irqsave(&pf_spin_lock, saved_flags);
+   pf_end_request(1);
+   pf_req = elv_next_request(pf_queue);
+   spin_unlock_irqrestore(&pf_spin_lock, saved_flags);
+   if (!pf_req)
+   return 1;
+   pf_count = pf_req->current_nr_sectors;
+   pf_buf = pf_req->buffer;
+   }
+   return 0;
 }
 
 static inline void next_request(int success)

Re: [PATCH 0/6] MN10300: Add the MN10300 architecture to Linux kernel [try #4]

2007-11-08 Thread Sam Ravnborg

Hi David.

> > arch/mn10300/Makefile:
> > 
> > 1) Use KBUILD_CFLAGS & KBUILD_AFLAGS & KBUILD_CPPFLAGS - they
> > have replaced the former xFLAGS.
> 
> Done.  What about ASFLAGS and LDFLAGS?
kbuild does not know anything about ASFLAGS.
In the whole kernel tree I only see frv assign them but the
value is never used.

LDFLAGS has not been changed - it was not such an easy cut.
And we anyway use the linker in much more ways so when to
use it and when not...

> > 4) Drop the symlinks - they are evil...
> > Use a structure like:
> > include/asm-mn10300/asb2303/proc/*.h
> 
> No.  I think all the arch headers should really appear to be #included under
> asm/.  That means that the structure would have to be:
> 
>   include/asm-mn10300/asb2303/asm/proc/*.h
> 
> That then puts all these header files several levels further down, which isn't
> that good.
I would prefer that any day as replacement for the symlinks.
But as everyone else seems so happy with the symlinks do whatever you prefer.

> > Then you adjust -I include/asm-mn10300/asb2303
> > and no symlinks needed.
> > And if you change processor kbuild will notice
> > and recompile everything.
> 
> That's the only plus, but it's a smaller plus than not interpolating several
> levels of almost empty directory into the paths for the proc- and 
> unit-specific
> headers.
> 
> > No-one else does it this way but mn10300 could show how to do it.
> 
> I could.  Or I could do what everyone else does.
> 
> Actually, what you perhaps ought to do for a start is to move the individual
> include/asm-$ARCH dirs to arch/$ARCH/asm and then you can avoid that symlink
> too.

It will wait until we save ARCH so we can detect
ARCH changes.

> 
> > 5)
> > +zImage: vmlinux
> > +   $(Q)$(MAKE) $(build)=$(boot) $(KBUILD_IMAGE)
> > +
> > +all: zImage
> > +
> > +Image: vmlinux
> > +   $(Q)$(MAKE) $(build)=arch/mn10300/boot $@
> > 
> > This could be done as:
> > +Image zImage: vmlinux
> > +   $(Q)$(MAKE) $(build)=$(boot) $(boot)/$@
> 
> Done.
> 
> > And then modify boot/Makefile accordingly
> 
> I don't see why it needs modification.  $(KBUILD_IMAGE) == $(boot)/zImage for
> the zImage target.
For the Image target it is Image without $(boot)/ but that target
was never supported anyway.

> > 1) You can safely remove the copyright of Linus..
> >Same goes for other places where his copyright are kept but filecontent
> >is new.
> 
> The copyright assignment on some of these files was made by MEI.  I'm not sure
> I can change them.

It just seemed stupid for someone to hold the copyright on a file
where that person had only contributed the copyright line (for said file).

Sam

[PATCH] pf broken

2007-11-08 Thread Ondrej Zary

Hello,
the pf driver for parallel port floppy drives seems to be broken. At least
with Imation SuperDisk with EPAT chip, the driver calls pi_connect() and
pi_disconnect() after each transferred sector. At least with EPAT, this
operation is very expensive - causes drive recalibration. Thus, transferring
even a single byte (dd if=/dev/pf0 of=/dev/null bs=1 count=1) takes 20
seconds, making the driver useless.

The pf_next_buf() function seems to be broken as it always returns 1 (except
when pf_run has dropped to zero), causing the loop in do_pf_read_drq (and
do_pf_write_drq) to be executed only once.

The following patch fixes this problem. It also fixes swapped descriptions in 
pf_lock() function and removes DBMSG macro, which seems useless.

-- 
Ondrej Zary

--- linux-2.6.23-orig/drivers/block/paride/pf.c 2007-10-09 22:31:38.0 
+0200
+++ linux/drivers/block/paride/pf.c 2007-11-08 22:29:31.0 +0100
@@ -488,13 +488,11 @@
return r;
 }
 
-#define DBMSG(msg)  ((verbose>1)?(msg):NULL)
-
 static void pf_lock(struct pf_unit *pf, int func)
 {
char lo_cmd[12] = { ATAPI_LOCK, pf->lun << 5, 0, 0, func, 0, 0, 0, 0, 
0, 0, 
0 };
 
-   pf_atapi(pf, lo_cmd, 0, pf_scratch, func ? "unlock" : "lock");
+   pf_atapi(pf, lo_cmd, 0, pf_scratch, func ? "lock" : "unlock");
 }
 
 static void pf_eject(struct pf_unit *pf)
@@ -555,7 +553,7 @@
{ ATAPI_MODE_SENSE, pf->lun << 5, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0 };
char buf[8];
 
-   pf_atapi(pf, ms_cmd, 8, buf, DBMSG("mode sense"));
+   pf_atapi(pf, ms_cmd, 8, buf, "mode sense");
pf->media_status = PF_RW;
if (buf[3] & 0x80)
pf->media_status = PF_RO;
@@ -591,7 +589,7 @@
char buf[8];
int bs;
 
-   if (pf_atapi(pf, rc_cmd, 8, buf, DBMSG("get capacity"))) {
+   if (pf_atapi(pf, rc_cmd, 8, buf, "get capacity")) {
pf->media_status = PF_NM;
return;
}
@@ -804,13 +802,18 @@
pf_buf += 512;
pf_block++;
if (!pf_run)
-   return 0;
-   if (!pf_count)
return 1;
-   spin_lock_irqsave(&pf_spin_lock, saved_flags);
-   pf_end_request(1);
-   spin_unlock_irqrestore(&pf_spin_lock, saved_flags);
-   return 1;
+   if (!pf_count) {
+   spin_lock_irqsave(&pf_spin_lock, saved_flags);
+   pf_end_request(1);
+   pf_req = elv_next_request(pf_queue);
+   spin_unlock_irqrestore(&pf_spin_lock, saved_flags);
+   if (!pf_req)
+   return 1;
+   pf_count = pf_req->current_nr_sectors;
+   pf_buf = pf_req->buffer;
+   }
+   return 0;
 }
 
 static inline void next_request(int success)

[PATCH] printk: trivial optimizations

2007-11-08 Thread Denys Vlasenko

Hi Andrew,

This patch exploits some optimization opportunities
similar to those in first two patches I sent a while ago.

In particular:

In arch/x86/boot/printf.c, it gets rid of the unused tail of digits:
const char *digits = "0123456789abcdefghijklmnopqrstuvwxyz";
(we are using 0-9a-f only)

Uses smaller/faster lowercasing (by ORing with 0x20)
if we know that we work on numbers/digits. Makes
strtoul smaller, and also we are getting rid of 
  static const char small_digits[] = "0123456789abcdefx";
  static const char large_digits[] = "0123456789ABCDEFX";
since this works equally well:
  static const char digits[16] = "0123456789ABCDEF";
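
The trick relies on ASCII: '0'..'9' are 0x30..0x39 and already have bit 0x20 set, so ORing with 0x20 leaves them alone and only turns 'A'..'F' (and 'X') into lowercase. A stand-alone sketch of both directions follows; format_hex() and hex_digit_value() are illustrative names, not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SMALL 0x20	/* lowercase flag; must equal the ASCII case bit */

/*
 * Format value as hex into buf (NUL-terminated). locase is 0 or SMALL.
 * Returns the number of digits written.
 */
static size_t format_hex(char *buf, size_t size, unsigned long value,
			 char locase)
{
	static const char digits[16] = "0123456789ABCDEF";
	char tmp[2 * sizeof(unsigned long)];
	size_t n = 0, i;

	do {
		/* digits keep their value, letters become lowercase */
		tmp[n++] = digits[value & 0xf] | locase;
		value >>= 4;
	} while (value != 0);

	assert(n < size);
	for (i = 0; i < n; i++)
		buf[i] = tmp[n - 1 - i];
	buf[n] = '\0';
	return n;
}

/* Value of a hex digit; the caller is assumed to have checked isxdigit(). */
static int hex_digit_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	return (c | 0x20) - 'a' + 10;
}
```

One upper-case table plus a single OR replaces the two digit tables, and the parser no longer needs the islower()/toupper() branches.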

Size savings:

$ size vmlinux.org vmlinux
   text    data     bss     dec     hex filename
 877320  112252   90112 1079684  107984 vmlinux.org
 877048  112252   90112 1079412  107874 vmlinux

It may also be a tiny bit faster because the code has fewer
branches now, but I doubt it is measurable.

Patch is run-tested.

Signed-off-by: Denys Vlasenko <[EMAIL PROTECTED]>
-- 
vda
diff -urpN linux-2.6.23-rc9/arch/i386/boot/printf.c linux-2.6.23-rc9-printf/arch/i386/boot/printf.c
--- linux-2.6.23-rc9/arch/x86/boot/printf.c	2007-10-08 15:40:45.0 +0100
+++ linux-2.6.23-rc9-printf/arch/x86/boot/printf.c	2007-10-08 16:39:57.0 +0100
@@ -33,8 +33,8 @@ static int skip_atoi(const char **s)
 #define PLUS	4		/* show plus */
 #define SPACE	8		/* space if plus */
 #define LEFT	16		/* left justified */
-#define SPECIAL	32		/* 0x */
-#define LARGE	64		/* use 'ABCDEF' instead of 'abcdef' */
+#define SMALL	32		/* Must be 32 == 0x20 */
+#define SPECIAL	64		/* 0x */
 
 #define do_div(n,base) ({ \
 int __res; \
@@ -45,12 +45,16 @@ __res; })
 static char *number(char *str, long num, int base, int size, int precision,
 		int type)
 {
-	char c, sign, tmp[66];
-	const char *digits = "0123456789abcdefghijklmnopqrstuvwxyz";
+	/* we are called with base 8, 10 or 16, only, thus don't need "G..."  */
+	static const char digits[16] = "0123456789ABCDEF"; /* "GHIJKLMNOPQRSTUVWXYZ"; */
+
+	char tmp[66];
+	char c, sign, locase;
 	int i;
 
-	if (type & LARGE)
-		digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
+	/* locase = 0 or 0x20. ORing digits or letters with 'locase'
+	 * produces same digits or (maybe lowercased) letters */
+	locase = (type & SMALL);
 	if (type & LEFT)
 		type &= ~ZEROPAD;
 	if (base < 2 || base > 36)
@@ -81,7 +85,7 @@ static char *number(char *str, long num,
 		tmp[i++] = '0';
 	else
 		while (num != 0)
-			tmp[i++] = digits[do_div(num, base)];
+			tmp[i++] = (digits[do_div(num, base)] | locase);
 	if (i > precision)
 		precision = i;
 	size -= precision;
@@ -95,7 +99,7 @@ static char *number(char *str, long num,
 			*str++ = '0';
 		else if (base == 16) {
 			*str++ = '0';
-			*str++ = digits[33];
+			*str++ = ('X' | locase);
 		}
 	}
 	if (!(type & LEFT))
@@ -244,9 +248,9 @@ int vsprintf(char *buf, const char *fmt,
 			base = 8;
 			break;
 
-		case 'X':
-			flags |= LARGE;
 		case 'x':
+			flags |= SMALL;
+		case 'X':
 			base = 16;
 			break;
 
diff -urpN linux-2.6.23-rc9/lib/vsprintf.c linux-2.6.23-rc9-printf/lib/vsprintf.c
--- linux-2.6.23-rc9/lib/vsprintf.c	2007-10-08 15:40:48.0 +0100
+++ linux-2.6.23-rc9-printf/lib/vsprintf.c	2007-10-08 16:41:35.0 +0100
@@ -26,6 +26,9 @@
 #include 		/* for PAGE_SIZE */
 #include 
 
+/* Works only for digits and letters, but small and fast */
+#define TOLOWER(x) ((x) | 0x20)
+
 /**
  * simple_strtoul - convert a string to an unsigned long
  * @cp: The start of the string
@@ -41,17 +44,17 @@ unsigned long simple_strtoul(const char 
 		if (*cp == '0') {
 			base = 8;
 			cp++;
-			if ((toupper(*cp) == 'X') && isxdigit(cp[1])) {
+			if ((TOLOWER(*cp) == 'x') && isxdigit(cp[1])) {
 cp++;
 base = 16;
 			}
 		}
 	} else if (base == 16) {
-		if (cp[0] == '0' && toupper(cp[1]) == 'X')
+		if (cp[0] == '0' && TOLOWER(cp[1]) == 'x')
 			cp += 2;
 	}
 	while (isxdigit(*cp) &&
-	   (value = isdigit(*cp) ? *cp-'0' : toupper(*cp)-'A'+10) < base) {
+	   (value = isdigit(*cp) ? *cp-'0' : TOLOWER(*cp)-'a'+10) < base) {
 		result = result*base + value;
 		cp++;
 	}
@@ -92,17 +95,17 @@ unsigned long long simple_strtoull(const
 		if (*cp == '0') {
 			base = 8;
 			cp++;
-			if ((toupper(*cp) == 'X') && isxdigit(cp[1])) {
+			if ((TOLOWER(*cp) == 'x') && isxdigit(cp[1])) {
 cp++;
 base = 16;
 			}
 		}
 	} else if (base == 16) {
-		if (cp[0] == '0' && toupper(cp[1]) == 'X')
+		if (cp[0] == '0' && TOLOWER(cp[1]) == 'x')
 			cp += 2;
 	}
-	while (isxdigit(*cp) && (value = isdigit(*cp) ? *cp-'0' : (islower(*cp)
-	? toupper(*cp) : *cp)-'A'+10) < base) {
+	while (isxdigit(*cp)
+	 && (value = isdigit(*cp) ? *cp-'0' : TOLOWER(*cp)-'a'+10) < base) {
 		result = result*base + value;
 		cp++;
 	}
@@ -237,24 +240,25 @@ static noinline char* put_dec(char *buf,
 #define PLUS	4		/* show plus */
 #define SPACE	8		/* space if plus */
 #define LEFT	16		/* left justified */
-#define SPECIAL	32		/* 0x */
-#define LARGE	64		/* use 'ABCD

Re: kbuild: possible regression?

2007-11-08 Thread Sam Ravnborg

On Thu, Nov 08, 2007 at 08:45:01PM +0100, Jan Altenberg wrote:
> Hi Sam,
> 
> > > commit 0b35786d77ba4037f181982cc8ca20a7a3bf0fd2
> > > Author: Milton Miller <[EMAIL PROTECTED]>
> > > Date:   Fri Sep 21 18:09:02 2007 -0500
> > > 
> > > kbuild: call make once for all targets when O=.. is used
> > > 
> > > Change the invocations of make in the output directory Makefile and 
> > > the
> > > main Makefile for separate object trees to pass all goals to one 
> > > $(MAKE)
> > > via a new phony target "sub-make" and the existing target _all.
> > > 
> > > When compiling with separate object directories, a separate make is 
> > > called
> > > in the context of another directory (from the output directory the 
> > > main
> > > Makefile is called, the Makefile is then restarted with current 
> > > directory
> > > set to the object tree).  Before this patch, when multiple make 
> > > command
> > > goals are specified, each target results in a separate make 
> > > invocation.
> > > With make -j, these invocations may run in parallel, resulting in 
> > > multiple
> > > commands running in the same directory clobbering each others results.
> > > 
> > > I did not try to address make -j for mixed dot-config and 
> > > no-dot-config
> > > targets.  Because the order does matter, a solution was not obvious.
> > > Perhaps a simple check for MAKEFLAGS having -j and refusing to run 
> > > would
> > > be appropriate.
> > > 
> > > Signed-off-by: Milton Miller <[EMAIL PROTECTED]>
> > > Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
> > > 
> > > So, am I facing a kbuild regression?
> > 
> > Yes - I will try to fix it during the weekend (if Milton does not beat me).
> > Thanks for reporting and bisecting!
> 
> Have you made any progress on this? Let me know, if I can assist with
> testing.

Hi Jan.

Not at all. My limited linux time goes into some x86 unification
that I have given higher priority.
But your report is saved and I will return to it.

Sam

[PATCH 0/3] Kvm clocksource, new spin

2007-11-08 Thread Glauber de Oliveira Costa

Hi folks,

Here's a new spin of the clocksource implementation.
In this new version:
* followed avi's suggestion of:
  - letting the cpu itself register its memory area.
  - using a gfn instead of a phys addr as a parameter, to be sure we can 
cover the whole memory area
  - write guest time at exits.

Also, I'm not using an anonymous struct in the kvm_hv_clock union, so the
vcpu struct can grab just what it needs, and not the whole padding the guest
needs.
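
A user-space sketch of why naming the inner struct helps. The field names mirror the union in patch 1/3; the *_sketch type names and the 4096-byte page size are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for this sketch */

/* Guest-side layout: the fields padded out to a whole page. */
union hv_clock_sketch {
	struct hv_clock_fields {
		uint64_t tsc_mult;
		uint64_t now_ns;
		uint64_t wc_sec;
		uint64_t last_tsc;
		uint32_t version;
	} fields;
	char page_align[SKETCH_PAGE_SIZE];
};

/* Host-side state embeds only the named struct, not the padded union. */
struct vcpu_sketch {
	struct hv_clock_fields hv_clock;
};

static_assert(sizeof(union hv_clock_sketch) == SKETCH_PAGE_SIZE,
	      "guest area must fill exactly one page");
```

Because the struct has a tag, the host can declare a member of that type directly and pays for a few dozen bytes per vcpu instead of a full page.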

This is it.

Have fun



Re: 2.6.23.1: mdadm/raid5 hung/d-state

2007-11-08 Thread Carlos Carvalho

Jeff Lessem ([EMAIL PROTECTED]) wrote on 6 November 2007 22:00:
 >Dan Williams wrote:
 > > The following patch, also attached, cleans up cases where the code looks
 > > at sh->ops.pending when it should be looking at the consistent
 > > stack-based snapshot of the operations flags.
 >
 >I tried this patch (against a stock 2.6.23), and it did not work for
 >me.  Not only did I/O to the affected RAID5 & XFS partition stop, but
 >also I/O to all other disks.  I was not able to capture any debugging
 >information, but I should be able to do that tomorrow when I can hook
 >a serial console to the machine.
 >
 >I'm not sure if my problem is identical to these others, as mine only
 >seems to manifest with RAID5+XFS.  The RAID rebuilds with no problem,
 >and I've not had any problems with RAID5+ext3.

Us too! We're stuck trying to build a disk server with several disks
in a raid5 array, and the rsync from the old machine stops writing to
the new filesystem. It only happens under heavy IO. We can make it
lock without rsync, using 8 simultaneous dd's to the array. All IO
stops, including the resync after a newly created raid or after an
unclean reboot.

We could not trigger the problem with ext3 or reiser3; it only happens
with xfs.

Buffer overflow in CIFS VFS.

2007-11-08 Thread Przemyslaw Wegrzyn

Hello all,

I was looking at the CIFS VFS code recently, trying to solve another issue,
and found something that looks like a buffer overflow bug.
The problem is in the SendReceive() function in transport.c - it memcpy's
the message payload into a buffer passed via the out_buf param. The function
assumes that all buffers are of size (CIFSMaxBufSize +
MAX_CIFS_HDR_SIZE); unfortunately, it is also called with smaller
(MAX_CIFS_SMALL_BUFFER_SIZE) buffers.
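
The general shape of a fix for this class of bug is to pass the real capacity of the destination along with the pointer and check it before copying. The sketch below is a generic illustration under that assumption; copy_response() and its parameters are hypothetical and are not taken from the CIFS code or from the linked patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a received payload only if it fits the buffer that was actually
 * allocated. Returns 0 on success, -1 if the payload is too large.
 */
static int copy_response(void *out_buf, size_t out_size,
			 const void *payload, size_t payload_len)
{
	assert(out_buf != NULL && payload != NULL);
	if (payload_len > out_size)
		return -1;	/* oversized reply: refuse instead of overflowing */
	memcpy(out_buf, payload, payload_len);
	return 0;
}
```

The key point is that the size check uses the capacity the caller really has, not a size the callee assumes for every buffer.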

To check this finding I patched Samba server to send oversized logoffX
messages. With ~ 16kB messages the client running 2.6.23.1 crashed upon
unmounting.

I've done a quick fix, available here:
http://czajnick.sitenet.pl/cifs-buffer-overflow-fix.patch.gz


[PATCH 2/3] kvmclock - the host part.

2007-11-08 Thread Glauber de Oliveira Costa

This is the host part of the kvm clocksource implementation. As it does
not include clockevents, it is a fairly simple implementation. We
only have to register a per-vcpu area, and start writing to it periodically.
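
The writes are bracketed by a version counter that is odd while an update is in flight. A single-threaded sketch of that idea follows, with hypothetical names (clock_area_sketch, begin_update, end_update, try_read); memory barriers and the actual guest-memory write are deliberately omitted, so this shows the protocol only, not code that is safe across CPUs.

```c
#include <assert.h>
#include <stdint.h>

struct clock_area_sketch {
	uint32_t version;	/* odd while an update is in progress */
	uint64_t now_ns;
	uint64_t last_tsc;
};

/* Writer, step 1: move to the next odd version before touching the data. */
static void begin_update(struct clock_area_sketch *a)
{
	a->version++;
	assert(a->version & 1);
}

/* Writer, step 2: store the new values, then move to the next even version. */
static void end_update(struct clock_area_sketch *a, uint64_t now_ns,
		       uint64_t tsc)
{
	a->now_ns = now_ns;
	a->last_tsc = tsc;
	a->version++;
}

/* Reader: returns 1 on a consistent snapshot, 0 if the caller should retry. */
static int try_read(const struct clock_area_sketch *a, uint64_t *now_ns,
		    uint64_t *tsc)
{
	uint32_t before = a->version;

	if (before & 1)
		return 0;	/* update in progress */
	*now_ns = a->now_ns;
	*tsc = a->last_tsc;
	return a->version == before;	/* unchanged means nothing raced with us */
}
```

A reader would loop on try_read() until it returns 1, retrying both when the version is odd and when it changed during the read.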

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
---
 drivers/kvm/kvm_main.c |1 +
 drivers/kvm/x86.c  |   32 
 drivers/kvm/x86.h  |4 
 3 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/drivers/kvm/kvm_main.c b/drivers/kvm/kvm_main.c
index d095002..c2c79b8 100644
--- a/drivers/kvm/kvm_main.c
+++ b/drivers/kvm/kvm_main.c
@@ -1243,6 +1243,7 @@ static long kvm_dev_ioctl(struct file *filp,
case KVM_CAP_MMU_SHADOW_CACHE_CONTROL:
case KVM_CAP_USER_MEMORY:
case KVM_CAP_SET_TSS_ADDR:
+   case KVM_CAP_CLOCKSOURCE:
r = 1;
break;
default:
diff --git a/drivers/kvm/x86.c b/drivers/kvm/x86.c
index e905d46..ef31fed 100644
--- a/drivers/kvm/x86.c
+++ b/drivers/kvm/x86.c
@@ -19,6 +19,7 @@
 #include "segment_descriptor.h"
 #include "irq.h"
 
+#include 
 #include 
 #include 
 #include 
@@ -1628,6 +1629,28 @@ int kvm_emulate_halt(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_emulate_halt);
 
+static void kvm_write_guest_time(struct kvm_vcpu *vcpu)
+{
+   struct timespec ts;
+   int r;
+
+   if (!vcpu->clock_gpa)
+   return;
+
+   /* Updates version to the next odd number, indicating we're writing */
+   vcpu->hv_clock.version++;
+   kvm_write_guest(vcpu->kvm, vcpu->clock_gpa, &vcpu->hv_clock, PAGE_SIZE);
+
+   kvm_get_msr(vcpu, MSR_IA32_TIME_STAMP_COUNTER,
+ &vcpu->hv_clock.last_tsc);
+
+   ktime_get_ts(&ts);
+   vcpu->hv_clock.now_ns = ts.tv_nsec + (NSEC_PER_SEC * (u64)ts.tv_sec);
+   vcpu->hv_clock.wc_sec = get_seconds();
+   vcpu->hv_clock.version++;
+   kvm_write_guest(vcpu->kvm, vcpu->clock_gpa, &vcpu->hv_clock, PAGE_SIZE);
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
unsigned long nr, a0, a1, a2, a3, ret;
@@ -1648,7 +1671,15 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
a3 &= 0x;
}
 
+   ret = 0;
switch (nr) {
+   case  KVM_HCALL_REGISTER_CLOCK:
+
+   vcpu->clock_gpa = a0 << PAGE_SHIFT;
+   vcpu->hv_clock.tsc_mult = clocksource_khz2mult(tsc_khz, 22);
+
+   break;
+
default:
ret = -KVM_ENOSYS;
break;
@@ -1924,6 +1955,7 @@ out:
goto preempted;
}
 
+   kvm_write_guest_time(vcpu);
post_kvm_run_save(vcpu, kvm_run);
 
return r;
diff --git a/drivers/kvm/x86.h b/drivers/kvm/x86.h
index 663b822..fd77b66 100644
--- a/drivers/kvm/x86.h
+++ b/drivers/kvm/x86.h
@@ -83,6 +83,10 @@ struct kvm_vcpu {
/* emulate context */
 
struct x86_emulate_ctxt emulate_ctxt;
+
+   struct kvm_hv_clock_s hv_clock;
+   gpa_t clock_gpa; /* guest frame number, physical addr */
+
 };
 
 int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u32 error_code);
-- 
1.5.0.6


[PATCH 1/3] include files for kvmclock

2007-11-08 Thread Glauber de Oliveira Costa

This patch introduces the include files for kvm clock.
They'll be needed for both guest and host part.

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
---
 include/asm-x86/kvm_para.h |   25 +
 include/linux/kvm.h|1 +
 include/linux/kvm_para.h   |2 ++
 3 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
index c6f3fd8..0f6b813 100644
--- a/include/asm-x86/kvm_para.h
+++ b/include/asm-x86/kvm_para.h
@@ -10,15 +10,40 @@
  * paravirtualization, the appropriate feature bit should be checked.
  */
 #define KVM_CPUID_FEATURES 0x4001
+#define KVM_FEATURE_CLOCKSOURCE 0
 
 #ifdef __KERNEL__
 #include 
+extern void kvmclock_init(void);
+
+/*
+ * Guest has page alignment and padding requirements. At the host, it will
+ * only lead to wasted space at the vcpu struct. For this reason, the struct
+ * is not anonymous
+ */
+union kvm_hv_clock {
+   struct kvm_hv_clock_s {
+   u64 tsc_mult;
+   u64 now_ns;
+   /* That's the wall clock, not the water closet */
+   u64 wc_sec;
+   u64 last_tsc;
+   /* At first, we could use the tsc value as a marker, but Jeremy
+* well noted that it will cause us locking problems in 32-bit
+* sys, so we have a special version field */
+   u32 version;
+   } fields;
+   char page_align[PAGE_SIZE];
+};
+
 
 /* This instruction is vmcall.  On non-VT architectures, it will generate a
  * trap that we will then rewrite to the appropriate instruction.
  */
 #define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1"
 
+#define KVM_HCALL_REGISTER_CLOCK   1
+
 /* For KVM hypercalls, a three-byte sequence of either the vmrun or the vmmrun
  * instruction.  The hypervisor may replace it with something else but only the
  * instructions are guaranteed to be supported.
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 71d33d6..9862241 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -359,6 +359,7 @@ struct kvm_signal_mask {
 #define KVM_CAP_MMU_SHADOW_CACHE_CONTROL 2
 #define KVM_CAP_USER_MEMORY 3
 #define KVM_CAP_SET_TSS_ADDR 4
+#define KVM_CAP_CLOCKSOURCE  5
 
 /*
  * ioctls for VM fds
diff --git a/include/linux/kvm_para.h b/include/linux/kvm_para.h
index e4db25f..094efc7 100644
--- a/include/linux/kvm_para.h
+++ b/include/linux/kvm_para.h
@@ -11,6 +11,8 @@
 
 /* Return values for hypercalls */
 #define KVM_ENOSYS 1000
+#define KVM_EINVAL 1019
+#define KVM_ENODEV 1022
 
 #ifdef __KERNEL__
 /*
-- 
1.5.0.6


[PATCH 3/3] kvmclock implementation, the guest part.

2007-11-08 Thread Glauber de Oliveira Costa

This is the guest part of the kvm clock implementation.
It does not do tsc-only timing, as tsc can have deltas
between cpus, and it did not seem worthwhile to me to keep
adjusting them.

We do use it, however, for fine-grained adjustment.

Other than that, time comes from the host.
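
The fine-grained adjustment boils down to scaling a tsc delta into nanoseconds with a multiplier and a shift of 22, then adding it to the time the host last published. A sketch of that arithmetic is below; khz_to_mult() is a hypothetical helper written for this example (it is not the kernel's clocksource_khz2mult()), and a 2 GHz tsc is assumed in the usage note.

```c
#include <assert.h>
#include <stdint.h>

#define SCALE_SHIFT 22

/*
 * Multiplier such that (cycles * mult) >> SCALE_SHIFT yields nanoseconds
 * for a counter running at khz kilohertz. Rounds to nearest.
 */
static uint64_t khz_to_mult(uint64_t khz)
{
	uint64_t tmp = (uint64_t)1000000 << SCALE_SHIFT;

	assert(khz != 0);
	return (tmp + khz / 2) / khz;
}

/* Scale a cycle count to nanoseconds; large deltas can overflow 64 bits. */
static uint64_t cycles_to_ns(uint64_t cycles, uint64_t mult)
{
	return (cycles * mult) >> SCALE_SHIFT;
}

/* Host-published time plus the tsc cycles elapsed since it was written. */
static uint64_t guest_now_ns(uint64_t host_now_ns, uint64_t last_tsc,
			     uint64_t current_tsc, uint64_t mult)
{
	return host_now_ns + cycles_to_ns(current_tsc - last_tsc, mult);
}
```

For a 2,000,000 kHz tsc the multiplier comes out as 2^21, so 2000 cycles scale to 1000 ns. The result is only as good as the tsc between host updates, which is why the host value remains the base.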

Signed-off-by: Glauber de Oliveira Costa <[EMAIL PROTECTED]>
---
 arch/x86/Kconfig.i386   |   10 +++
 arch/x86/kernel/Makefile_32 |1 +
 arch/x86/kernel/kvmclock.c  |  171 +++
 arch/x86/kernel/setup_32.c  |5 +
 4 files changed, 187 insertions(+), 0 deletions(-)
 create mode 100644 arch/x86/kernel/kvmclock.c

diff --git a/arch/x86/Kconfig.i386 b/arch/x86/Kconfig.i386
index 7331efe..5fe4025 100644
--- a/arch/x86/Kconfig.i386
+++ b/arch/x86/Kconfig.i386
@@ -257,6 +257,16 @@ config VMI
  at the moment), by linking the kernel to a GPL-ed ROM module
  provided by the hypervisor.
 
+config KVM_CLOCK
+   bool "KVM paravirtualized clock"
+   select PARAVIRT
+   help
+ Turning on this option will allow you to run a paravirtualized clock
+ when running over the KVM hypervisor. Instead of relying on a PIT
+ (or probably other) emulation by the underlying device model, the host
+ provides the guest with timing infrastructure, as time of day, and
+ timer expiration.
+
 source "arch/x86/lguest/Kconfig"
 
 endif
diff --git a/arch/x86/kernel/Makefile_32 b/arch/x86/kernel/Makefile_32
index b9d6798..df76d8c 100644
--- a/arch/x86/kernel/Makefile_32
+++ b/arch/x86/kernel/Makefile_32
@@ -43,6 +43,7 @@ obj-$(CONFIG_K8_NB)   += k8.o
 obj-$(CONFIG_MGEODE_LX)+= geode_32.o mfgpt_32.o
 
 obj-$(CONFIG_VMI)  += vmi_32.o vmiclock_32.o
+obj-$(CONFIG_KVM_CLOCK)+= kvmclock.o
 obj-$(CONFIG_PARAVIRT) += paravirt_32.o
 obj-y  += pcspeaker.o
 
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
new file mode 100644
index 000..df14613
--- /dev/null
+++ b/arch/x86/kernel/kvmclock.c
@@ -0,0 +1,171 @@
+/*  KVM paravirtual clock driver. A clocksource implementation
+Copyright (C) 2007 Glauber de Oliveira Costa, Red Hat Inc.
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+*/
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#define KVM_SCALE 22
+
+#define get_clock(cpu, field) hv_clock[cpu].fields.field
+
+static int kvmclock = 1;
+
+static int parse_no_kvmclock(char *arg)
+{
+   kvmclock = 0;
+   return 0;
+}
+early_param("no-kvmclock", parse_no_kvmclock);
+
+/* The hypervisor will put information about time periodically here */
+union kvm_hv_clock hv_clock[NR_CPUS] __attribute__((__aligned__(PAGE_SIZE)));
+
+static inline u64 kvm_get_delta(u64 last_tsc)
+{
+   int cpu = smp_processor_id();
+   u64 delta = native_read_tsc() - last_tsc;
+   return (delta * get_clock(cpu, tsc_mult)) >> KVM_SCALE;
+}
+
+/*
+ * The wallclock is the time of day when we booted. Since then, some time may
+ * have elapsed since the hypervisor wrote the data. So we try to account for
+ * that. Even if the tsc is not accurate, it gives us a more accurate timing
+ * than not adjusting at all
+ */
+unsigned long kvm_get_wallclock(void)
+{
+   u64 wc_sec, delta, last_tsc;
+   struct timespec ts;
+   int version, nsec, cpu = smp_processor_id();
+
+   do {
+   version = get_clock(cpu, version);
+   rmb();
+   last_tsc = get_clock(cpu, last_tsc);
+   rmb();
+   wc_sec = get_clock(cpu, wc_sec);
+   rmb();
+   } while ((get_clock(cpu, version) != version) && !(version & 1));
+
+   delta = kvm_get_delta(last_tsc);
+   nsec = do_div(delta, NSEC_PER_SEC);
+   set_normalized_timespec(&ts, wc_sec + delta, nsec);
+
+   /*
+* Of all mechanisms of time adjustment I've tested, this one
+* was the champion!
+*/
+   return ts.tv_sec + 1;
+}
+
+int kvm_set_wallclock(unsigned long now)
+{
+   return 0;
+}
+
+/*
+ * This is our read_clock function. The host puts a tsc timestamp each time
+ * it updates a new time, and then we can use it to derive a slightly more
+ * precise notion of elapsed time, converted to nano
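
The delta computation in kvm_get_delta() above is a fixed-point multiply:
the TSC delta is multiplied by tsc_mult and shifted right by KVM_SCALE (22).
A minimal userspace sketch of that arithmetic follows. The helper
tsc_mult_for_khz() is hypothetical and only illustrates one way a host could
derive such a multiplier; the patch as quoted does not show how tsc_mult is
actually computed.

```c
#include <stdint.h>

#define KVM_SCALE 22

/*
 * Hypothetical helper (not in the patch): pick mult so that
 * (cycles * mult) >> KVM_SCALE is approximately nanoseconds.
 * One cycle lasts 1e6 / tsc_khz nanoseconds.
 */
static uint32_t tsc_mult_for_khz(uint32_t tsc_khz)
{
	return (uint32_t)(((uint64_t)1000000 << KVM_SCALE) / tsc_khz);
}

/* Same shape as kvm_get_delta(): multiply, then drop the fraction bits. */
static uint64_t cycles_to_ns(uint64_t cycles, uint32_t mult)
{
	return (cycles * mult) >> KVM_SCALE;
}
```

With a multiplier near 2^22 the 64-bit product wraps once the delta exceeds
roughly 2^42 cycles, so a scheme like this relies on the host refreshing
last_tsc often enough.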

AppArmor Security Goal

2007-11-08 Thread Crispin Cowan

re-sent due to a typo in addressing.

AppArmor Security Goal
Crispin Cowan, PhD
MercenaryLinux.com

This document is intended to specify the security goal that AppArmor is
intended to achieve, so that users can evaluate whether AppArmor will
meet their needs, and kernel developers can evaluate whether AppArmor is
living up to its claims. This document is *not* a general purpose
explanation of how AppArmor works, nor is it an explanation for why one
might want to use AppArmor rather than some other system.

AppArmor is intended to protect systems from attackers exploiting
vulnerabilities in applications that the system hosts. The threat is
that an attacker can cause a vulnerable application to do something
unexpected and undesirable. AppArmor addresses this threat by confining
the application to access only the resources it needs to access to
execute properly, effectively imposing "least privilege" execution on
the application.

Applications have access to a number of resources including files,
interprocess communication, networking, capabilities, and execution of
other applications. The purpose of least privilege is to bound the
damage that a malicious user or code can do by removing access to all
resources that the application does not need for its intended function.
For instance, a policy for a web server might grant read only access to
most web documents, preventing an attacker who can corrupt the web
server from defacing the web pages.

An "application" is one or more related processes performing a function,
e.g. the gang of processes that constitute an Apache web server, or a
Postfix mail server. AppArmor *only* confines processes that the
AppArmor policy says it should confine, and other processes are
permitted to do anything that DAC permits. This is sometimes known as a
targeted security policy.

AppArmor does not provide a "default" policy that applies to all
processes. So to defend an entire host, you have to piece-wise confine
each process that is exposed to potential attack. For instance, to
defend a system against network attack, place AppArmor profiles around
every application that accesses the network. This limits the damage a
network attacker can do to the file system to only those files granted
by the profiles for the network-available applications. Similarly, to
defend a system against attack from the console, place AppArmor profiles
around every application that accesses the keyboard and mouse. The
system is "defended" in that the worst the attacker can do to corrupt
the system is limited to the transitive closure of what the confined
processes are allowed to access.

AppArmor currently mediates access to files, ability to use POSIX.1e
Capabilities, and coarse-grained control on network access. This is
sufficient to prevent a confined process from *directly* corrupting the
file system. It is not sufficient to prevent a confined process from
*indirectly* corrupting the system by influencing some other process to
do the dirty deed. But to do so requires a complicit process that can be
manipulated through another channel such as IPC. A "complicit" process
is either a malicious process the attacker somehow got control of, or is
a process that is actively listening to IPC of some kind and can be
corrupted via IPC.

The only IPC that AppArmor mediates is access to named sockets, FIFOs,
etc. that appear in the file system name space, a side effect of
AppArmor's file access mediation. Future versions of AppArmor will
mediate more resources, including finer grained network access controls,
and controls on various forms of IPC.

AppArmor specifies the programs to be confined and the resources they
can access in a syntax similar to how users are accustomed to accessing
those resources. So file access controls are specified using absolute
paths with respect to the name space the process is in. POSIX.1e
capabilities are specified by name. Network access controls currently
are specified by simply naming the protocol that can be used e.g. tcp,
udp, and in the future will be more general, resembling firewall rules.

Thus the AppArmor security goal should be considered piecewise from the
point of view of a single confined process: that process should only be
able to access the resources specified in its profile:

* can only access files that are reachable in its name space by path
  names matching its profile, and only with the permitted modes:
  read, append, write, memory map, execute, and link
* can only use the POSIX.1e capabilities listed in the profile
* can only perform the network operations listed in the profile
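
The file-access part of this goal can be pictured as a default-deny check
against a list of path patterns with permitted modes. The sketch below is
only an illustration under stated assumptions: struct rule, the AA_* mode
bits and profile_allows() are hypothetical names, and POSIX fnmatch() stands
in for AppArmor's own pattern syntax and matcher, which differ.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Illustrative mode bits, not AppArmor's real encoding. */
enum { AA_READ = 1, AA_WRITE = 2, AA_EXEC = 4 };

/* Hypothetical profile entry: a path pattern and the modes it grants. */
struct rule {
	const char *glob;
	unsigned modes;
};

/*
 * Default deny: access is allowed only if the rules matching the path
 * together grant every requested mode.
 */
static int profile_allows(const struct rule *rules, size_t n,
			  const char *path, unsigned want)
{
	unsigned granted = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (fnmatch(rules[i].glob, path, FNM_PATHNAME) == 0)
			granted |= rules[i].modes;

	return (granted & want) == want;
}
```

For the web server example given earlier, a rule such as
{ "/var/www/*.html", AA_READ } would let the server read its documents while
a write to the same files, or any access to a path with no matching rule,
would be refused.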

Security issues that AppArmor explicitly does *not* address:

* Processes that are not confined by AppArmor are not restricted in
  any way by AppArmor. If an unconfined process is considered an
  unacceptable threat, then confine additional applications until
  adequate security is achieved.
* A process that is not permitted to directly access a

[patch 2/2] clone: prepare to recycle CLONE_DETACHED and CLONE_STOPPED

2007-11-08 Thread akpm

From: Andrew Morton <[EMAIL PROTECTED]>

Ulrich says that we never used these clone flags and that nothing should be
using them.

As we're down to only a single bit left in clone's flags argument, let's add a
warning to check that no userspace is actually using these.  Hopefully we will
be able to recycle them.

Cc: Ulrich Drepper <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Roland McGrath <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 kernel/fork.c |   16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff -puN kernel/fork.c~clone-prepare-to-recycle-clone_detached-and-clone_stopped kernel/fork.c
--- a/kernel/fork.c~clone-prepare-to-recycle-clone_detached-and-clone_stopped
+++ a/kernel/fork.c
@@ -1420,10 +1420,18 @@ long do_fork(unsigned long clone_flags,
int trace = 0;
long nr;
 
-   if (unlikely(current->ptrace)) {
-   trace = fork_traceflag (clone_flags);
-   if (trace)
-   clone_flags |= CLONE_PTRACE;
+   /*
+* We hope to recycle these flags after 2.6.26
+*/
+   if (unlikely(clone_flags & (CLONE_DETACHED|CLONE_STOPPED))) {
+   if (printk_ratelimit()) {
+   char comm[TASK_COMM_LEN];
+
+   printk(KERN_INFO "fork(): process `%s' used deprecated "
+   "clone flags 0x%lx\n",
+   get_task_comm(comm, current),
+   clone_flags & (CLONE_DETACHED|CLONE_STOPPED));
+   }
}
 
p = copy_process(clone_flags, stack_start, regs, stack_size,
_
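
The check the patch adds to do_fork() is a plain bit mask test. A small
userspace sketch of it follows; the flag values are the ones recalled from
kernel headers of that period and should be treated as an assumption here,
and deprecated_clone_flags() is a hypothetical name.

```c
/* Assumed values, as in include/linux/sched.h of that era. */
#define CLONE_DETACHED 0x00400000UL
#define CLONE_STOPPED  0x02000000UL

/*
 * Returns the deprecated bits present in clone_flags, or 0 if none.
 * This is the value the patch prints with "0x%lx".
 */
static unsigned long deprecated_clone_flags(unsigned long clone_flags)
{
	return clone_flags & (CLONE_DETACHED | CLONE_STOPPED);
}
```

A nonzero result corresponds to the case where the patch would log the
rate-limited warning.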

[patch 1/2] get_task_comm(): return the result

2007-11-08 Thread akpm

From: Andrew Morton <[EMAIL PROTECTED]>

It was dumb to make get_task_comm() return void.  Change it to return a
pointer to the resulting output for caller convenience.

Cc: Ulrich Drepper <[EMAIL PROTECTED]>
Cc: Ingo Molnar <[EMAIL PROTECTED]>
Cc: Roland McGrath <[EMAIL PROTECTED]>
Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>
---

 fs/exec.c |3 ++-
 include/linux/sched.h |2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff -puN include/linux/sched.h~get_task_comm-return-the-result include/linux/sched.h
--- a/include/linux/sched.h~get_task_comm-return-the-result
+++ a/include/linux/sched.h
@@ -1692,7 +1692,7 @@ extern long do_fork(unsigned long, unsig
 struct task_struct *fork_idle(int);
 
 extern void set_task_comm(struct task_struct *tsk, char *from);
-extern void get_task_comm(char *to, struct task_struct *tsk);
+extern char *get_task_comm(char *to, struct task_struct *tsk);
 
 #ifdef CONFIG_SMP
 extern void wait_task_inactive(struct task_struct * p);
diff -puN fs/exec.c~get_task_comm-return-the-result fs/exec.c
--- a/fs/exec.c~get_task_comm-return-the-result
+++ a/fs/exec.c
@@ -947,12 +947,13 @@ static void flush_old_files(struct files
spin_unlock(&files->file_lock);
 }
 
-void get_task_comm(char *buf, struct task_struct *tsk)
+char *get_task_comm(char *buf, struct task_struct *tsk)
 {
/* buf must be at least sizeof(tsk->comm) in size */
task_lock(tsk);
strncpy(buf, tsk->comm, sizeof(tsk->comm));
task_unlock(tsk);
+   return buf;
 }
 
 void set_task_comm(struct task_struct *tsk, char *buf)
_
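
The convenience this change buys is that the call can sit directly inside
an argument list, as patch 2/2 does in its printk(). A userspace sketch of
the same pattern, with struct task and get_comm() as hypothetical stand-ins
for task_struct and get_task_comm() (no locking here):

```c
#include <string.h>

#define TASK_COMM_LEN 16

/* Stand-in for the comm field of struct task_struct. */
struct task {
	char comm[TASK_COMM_LEN];
};

/* Copies the name into buf and returns buf so the call can be nested. */
static char *get_comm(char *buf, const struct task *tsk)
{
	/* buf must be at least sizeof(tsk->comm) in size */
	strncpy(buf, tsk->comm, sizeof(tsk->comm));
	return buf;
}
```

A caller can then write printf("%s\n", get_comm(comm, &t)) instead of
calling the function on one line and using the buffer on the next.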

Re: [patch 02/23] SLUB: Rename NUMA defrag_ratio to remote_node_defrag_ratio

2007-11-08 Thread Christoph Lameter

On Thu, 8 Nov 2007, Matt Mackall wrote:

> But perhaps I should just add a lightweight RNG to random.c and be
> done with it.

It would be appreciated.

Re: [patch 00/23] Slab defragmentation V6

2007-11-08 Thread Christoph Lameter

On Thu, 8 Nov 2007, Lee Schermerhorn wrote:

> > ZONE_MOVABLE also contains mlocked pages that are also not reclaimable. 
> > The question is at what level would it be possible to make them MOVABLE? 
> > It may take some improvements to the kick() methods to make eviction more 
> > reliable. Allowing the moving of objects in the kick() methods will 
> > likely get us there.
> 
> Christoph:  Although mlocked pages are not reclaimable, they ARE
> migratable.  You fixed that a long time ago.  [And I just verified with
> memtoy.]  Doesn't this make them "movable"?

I know. They are movable but not reclaimable.

Re: 2.6.24-rc1-gb4f5550 oops

2007-11-08 Thread Rafael J. Wysocki

On Thursday, 8 of November 2007, Grant Wilson wrote:
> On Thu, 8 Nov 2007 16:53:10 +0100
> "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> 
> > On Thursday, 8 of November 2007, Grant Wilson wrote:
> > > On Thu, 8 Nov 2007 01:06:21 +0100
> > > "Rafael J. Wysocki" <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Monday, 5 of November 2007, Grant Wilson wrote:
> > > > > Hi,
> > > > > I got this oops on 2.6.24-rc1-641-gb4f5550:
> > > > 
> > > > (1) Is this reproducible?
> > > > (2) Did it happen previously on your system?
> > > >
> > > > [18073.371126] Unable to handle kernel NULL pointer dereference at 
> > > > 0120 RIP: 
> > > > [18073.371134]  [] check_preempt_wakeup+0x6e/0x110
> > > 
> > > This has now happened twice - the second time was last night when
> > > running 2.6.24-rc2.
> > > 
> > > Here's that second occurrence:
> > > 
> [snip]
> > 
> > Hmm.
> > 
> > Please run "gdb vmlinux" and see what code corresponds to
> > check_preempt_wakeup+0x6e in your kernel.
>
> 
> Dump of assembler code for function check_preempt_wakeup:

Well, thanks, but I meant the source code.  Please do "gdb vmlinux" and then
"l *check_preempt_wakeup+0x6e" in gdb.

Thanks,
Rafael

Re: [patch 00/23] Slab defragmentation V6

2007-11-08 Thread Lee Schermerhorn

On Thu, 2007-11-08 at 11:12 -0800, Christoph Lameter wrote:
> On Thu, 8 Nov 2007, Mel Gorman wrote:
> 
> > On Tue, 2007-11-06 at 17:11 -0800, Christoph Lameter wrote:
> > > Slab defragmentation is mainly an issue if Linux is used as a fileserver
> > 
> > Was hoping this would get renamed to SLUB Targetted Reclaim from
> > discussions at VM Summit. As no copying is taking place, it's confusing
> > to call it defragmentation to me anyway. Not a major deal but it made
> > reading the patches a little confusing.
> 
> The problem is that people are focusing on one feature here and forget 
> about the rest. Targetted reclaim is one feature that was added later when 
> lumpy reclaim was added to the kernel. The primary intent of this patchset 
> was always to reduce the fragmentation. The name is appropriate and the 
> patchset will support copying of objects as soon as support for that is 
> added to the kick(). In that case the copying you are looking for will be 
> there. The simple implementation for the kick() methods is to simply copy
> pieces of the reclaim code. That is what is included here.
> 
> > > With lumpy reclaim slab defragmentation can be used to enhance the
> > > ability to recover larger contiguous areas of memory. Lumpy reclaim
> > > currently cannot do anything if a slab page is encountered. With slab
> > > defragmentation that slab page can be removed and a large contiguous
> > > page freed. It may be possible to have slab pages also part of
> > > ZONE_MOVABLE (Mel's defrag scheme in 2.6.23)
> > 
> > More terminology nit-pick - ZONE_MOVABLE is not defragmenting anything.
> > It's just partitioning memory. The slab pages need to be 100%
> > reclaimable or movable for that to happen but even with targetted
> > reclaim, some dentries such as the root directory one cannot be
> > reclaimed, right?
> 
> 100%? I am so fond of these categorical statements 
> 
> ZONE_MOVABLE also contains mlocked pages that are also not reclaimable. 
> The question is at what level would it be possible to make them MOVABLE? 
> It may take some improvements to the kick() methods to make eviction more 
> reliable. Allowing the moving of objects in the kick() methods will 
> likely get us there.

Christoph:  Although mlocked pages are not reclaimable, they ARE
migratable.  You fixed that a long time ago.  [And I just verified with
memtoy.]  Doesn't this make them "movable"?

Lee


Re: [PATCH] r8169 fix regression on ASUS motherboards (updated again)

2007-11-08 Thread Francois Romieu

Mark Lord <[EMAIL PROTECTED]> :
[snip snip] 
> Stop the presses:  Another person has now reported back to me.
> This guy has the VER_17 chip inside a notebook, and the same problem.
> 
> So the unified fix for all of these is to just get rid of
> rtl8168b_hw_phy_config(), as that code is all that is really different
> from 2.6.23 (where all is okay).
> 
> Updated patch below.

Thanks. I'll push an update for it and probably a fix for a different
annoyance before going to bed.

It will help future development if you can convince the shadow man to
send some information: I have no known tester for the VER_17 chip.

As a side note, people should really really really consider posting on
[EMAIL PROTECTED] when reporting test results.

-- 
Ueimor

Re: [patch 02/23] SLUB: Rename NUMA defrag_ratio to remote_node_defrag_ratio

2007-11-08 Thread Matt Mackall

On Thu, Nov 08, 2007 at 12:01:24PM -0800, Christoph Lameter wrote:
> On Thu, 8 Nov 2007, Matt Mackall wrote:
> 
> > Not really. drivers/char/random.c does:
> > 
> > __get_cpu_var(trickle_count)++ & 0xfff
> 
> That is incremented on each call to add_timer_randomness. Not a high 
> enough resolution there. I guess I am stuck with get_cycles().

I'm not suggesting you use trickle_count, silly. I'm suggesting you
use a similar approach.

But perhaps I should just add a lightweight RNG to random.c and be
done with it.

-- 
Mathematics is the supreme nostalgia of our time.
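
One reading of "a similar approach" is a cheap counter whose low bits gate
how often some action runs, as the quoted random.c expression does with
0xfff. The sketch below assumes that reading; call_count and one_in_4096()
are hypothetical names, and a plain global replaces the per-CPU variable.

```c
#include <stdbool.h>

/* Per-CPU in the kernel; a single global is enough for this sketch. */
static unsigned int call_count;

/*
 * True once every 4096 calls: whenever the low 12 bits of the
 * pre-increment count are all zero.
 */
static bool one_in_4096(void)
{
	return (call_count++ & 0xfff) == 0;
}
```

This gives a fixed 1-in-4096 cadence with a single increment and mask per
call. It is not random, which is presumably why a lightweight RNG is
floated as the alternative.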

Re: [PATCH RFC 3/7] x86: clean up asm-x86/page*.h

2007-11-08 Thread Jeremy Fitzhardinge

Glauber de Oliveira Costa wrote:
> On Nov 7, 2007 11:50 PM, Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
>   
>> +#define PAGETABLE_LEVELS   3
>> +
>> typedef u64 pteval_t;
>> typedef u64 pmdval_t;
>> typedef u64 pudval_t;
>> typedef u64 pgdval_t;
>> +
>> 
>
>   
>> -static inline unsigned long long native_pgd_val(pgd_t pgd)
>> +static inline pgdval_t native_pgd_val(pgd_t pgd)
>>  {
>> 
> Maybe these kind of things, the typedef and native_xxx definitions can
> go into the common header, after we define the PAGETABLE_LEVELS
> constant?
> I think the more goes into common headers, the better.
>   

You mean put them in a common header, but conditionally by #if
PAGETABLE_LEVELS?  I don't think that would be much of an improvement;
it would just add more #ifs, which adds lines and conceptual
complexity.  If you go that way, you may as well put everything in one
header wrapped in #ifs, but personally I don't think that would help.

J

Re: [poll] Is the megafreeze development model broken?

2007-11-08 Thread ciol


Adrian Bunk wrote:
[...]


Your reasoning makes sense.
But it may not be suited to applications like Apache.

Thanks.


Re: [PATCH] hugetlb: follow_hugetlb_page for write access

2007-11-08 Thread Ken Chen

On Nov 7, 2007 11:51 AM, Adam Litke <[EMAIL PROTECTED]> wrote:
> When calling get_user_pages(), a write flag is passed in by the caller to
> indicate if write access is required on the faulted-in pages.  Currently,
> follow_hugetlb_page() ignores this flag and always faults pages for
> read-only access.  This can cause data corruption because a device driver
> that calls get_user_pages() with write set will not expect COW faults to
> occur on the returned pages.
>
> This patch passes the write flag down to follow_hugetlb_page() and makes
> sure hugetlb_fault() is called with the right write_access parameter.
>
> Signed-off-by: Adam Litke <[EMAIL PROTECTED]>

Adam, this looks good.

Reviewed-by: Ken Chen <[EMAIL PROTECTED]>

Re: [poll] Is the megafreeze development model broken?

2007-11-08 Thread ciol


Chris Snook wrote:
> Why are you asking the developers?  We do this for the sake of the users.

The kernel is the software of the developers.
It's important to know how they want it to be distributed.


Re: Plans for Onezonelist patch series ???

2007-11-08 Thread Mel Gorman

On (08/11/07 12:20), Christoph Lameter didst pronounce:
> On Thu, 8 Nov 2007, Mel Gorman wrote:
> 
> > I've rebased the patches to mm-broken-out-2007-11-06-02-32. However, the
> > vanilla -mm and the one with onezonelist applied are locking up in the
> > same manner. I'm way too behind at the moment to guess if it is a new bug
> > or reported already. At best, I can say the patches are not making things
> > any worse :) I'll go through the archives in the morning and do a bit more
> > testing to see what happens.
> 
> I usually base my patches on Linus' tree as long as there is no tree 
> available from Andrew. But that means that I may have to 
> approximate what is in there by adding this and that.
> 

Unfortunately for me, there are several collisions with the patches when
applied against -mm if the patches are based on latest git. They are mainly in
mm/vmscan.c due to the memory controller work. For the purposes of testing and
merging, it makes more sense for me to work against -mm as much as possible.

-- 
Mel Gorman
Part-time Phd Student  Linux Technology Center
University of Limerick IBM Dublin Software Lab

Re: [patch 00/23] Slab defragmentation V6

2007-11-08 Thread Christoph Lameter

On Thu, 8 Nov 2007, Mel Gorman wrote:

> It certainly can be tried out. However, this is a future problem and
> independent of the current patchset. I don't want to drag us down a blind
> alley about a problem that isn't even at hand.

Right. That is why I took it out.
 

Re: [PATCH] ipconfig.c : implement DHCP Class-identifier

2007-11-08 Thread Ilpo Järvinen

On Thu, 8 Nov 2007, Rainer Jochem wrote:

> @@ -620,6 +622,17 @@ ic_dhcp_init_options(u8 *options)
>   *e++ = sizeof(ic_req_params);
>   memcpy(e, ic_req_params, sizeof(ic_req_params));
>   e += sizeof(ic_req_params);
> +
> + // Send it only if the according kernel parameter was set

No C99 comments please. Though I'm not sure if this comment is that 
necessary anyway...

> + if (*vendor_class_identifier) {
> + printk(KERN_INFO "Sending class identifier \"%s\"\n",
> +vendor_class_identifier);
> + *e++ = 60;  /* Class-identifier */
> + *e++ = strlen(vendor_class_identifier);
> + memcpy(e, vendor_class_identifier,
> +strlen(vendor_class_identifier));
> + e += strlen(vendor_class_identifier);
> + }
>   }
>  
>   *e++ = 255; /* End of the list */

-- 
 i.
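
The quoted hunk appends a DHCP option in type/length/value form: code 60
(class identifier, per RFC 2132), one length byte, then the string. A
standalone sketch of the same encoding is below; put_class_id() is a
hypothetical helper, and unlike the patch it computes the length once and
refuses strings that do not fit the one-byte length field.

```c
#include <stdint.h>
#include <string.h>

/*
 * Appends option 60 to the option area at e and returns the new end
 * pointer. Returns e unchanged if id is empty or longer than 255 bytes.
 * The caller must ensure there is room for len + 2 bytes.
 */
static uint8_t *put_class_id(uint8_t *e, const char *id)
{
	size_t len = strlen(id);

	if (len == 0 || len > 255)
		return e;

	*e++ = 60;		/* Class-identifier */
	*e++ = (uint8_t)len;	/* length of the value that follows */
	memcpy(e, id, len);
	return e + len;
}
```

The caller would still terminate the list with the end option (255), as the
surrounding code in ic_dhcp_init_options() does.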
