Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-28 Thread Satyam Sharma


On Sun, 29 Jul 2007, Domen Puncer wrote:

> On 29/07/07 00:02 +0200, Jesper Juhl wrote:
> > Hi,
> > 
> > Here's a small patch, prompted by a find by the Coverity checker, 
> > that removes a potential NULL pointer dereference from 
> > drivers/net/sb1000.c::sb1000_dev_ioctl().
> > The checker spotted that we do a NULL test of 'dev', yet we 
> > dereference the pointer prior to that check.
> > This patch simply moves the dereference after the NULL test.
> 
> But... it can't be called without a valid 'dev', no?
> A quick 'grep do_ioctl net/' confirms that all calls are in
> the form of 'dev->do_ioctl(dev, ...'.

Yup, I think so too ...


> > @@ -991,11 +991,13 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
> > struct ifreq *ifr, int cmd)
> > short PID[4];
> > int ioaddr[2], status, frequency;
> > unsigned int stats[5];
> > -   struct sb1000_private *lp = netdev_priv(dev);
> > +   struct sb1000_private *lp;
> >  
> > if (!(dev && dev->flags & IFF_UP))
> > return -ENODEV;

I think we could get rid of the !dev check itself. Actually, the IFF_UP
check /also/ looks suspect to me for two reasons: (1) I remember Stephen
Hemminger once telling me dev->flags is legacy and unsafe, and one of
the netif_xxx() functions be used instead, and, (2) I wonder if we really
require the interface to be up and *running* when we do this ioctl.
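
For illustration, the ordering issue the checker flagged can be reduced to a
few lines of plain C. This is a minimal sketch and not the sb1000 driver:
struct fake_dev, fake_priv() and FAKE_IFF_UP are stand-ins invented here, and
only the order of the NULL test and the first read through the pointer
mirrors the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define FAKE_IFF_UP 0x1u    /* hypothetical flag, stands in for IFF_UP */

struct fake_dev {           /* hypothetical stand-in for struct net_device */
    unsigned int flags;
    void *priv;
};

/* Hypothetical accessor: it reads through dev, so dev must not be NULL. */
static void *fake_priv(const struct fake_dev *dev)
{
    assert(dev != NULL);
    return dev->priv;
}

/* Validate dev first, and only then touch anything reachable through it. */
static int fake_ioctl(struct fake_dev *dev)
{
    void *lp;

    if (!(dev && (dev->flags & FAKE_IFF_UP)))
        return -ENODEV;

    lp = fake_priv(dev);    /* safe: dev was tested above */
    return lp ? 0 : -EINVAL;
}
```

If every caller really does pass a valid pointer, as the grep suggests, the
NULL half of the test is dead code and dropping it is the simpler cleanup.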


Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Intel Turbo Memory

2007-07-28 Thread Stephen Rothwell
Hi all,

My new laptop came with some of the above.  Has anyone tried looking at
this to see what is involved in using it?

http://www.intel.com/support/chipsets/itm/

-- 
Cheers,
Stephen Rothwell [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/




Re: [PATCH] fix return value of i8042_aux_test_irq

2007-07-28 Thread Dmitry Torokhov
On Thursday 26 July 2007 11:57, [EMAIL PROTECTED] wrote:
> On Fri, July 27, 2007 12:29 am, Alan Cox wrote:
> >> > A small number of boxes do share IRQ12 and it was switched to shared
> >> for
> >> > them.
> >> If that is the case interrupt handlers should be able to determine
> >> whether
> >> a certain interrupt comes from their respective devices, and return
> >> IRQ_HANDLED or IRQ_NONE accordingly. Returning IRQ_HANDLED
> >> unconditionally
> >> when IRQF_SHARED is set seems strange. Is this behavior intended?
> >
> > Sometimes you simply can't tell and in those cases you have no choice.
> As I mentioned in a previous email, i8042_interrupt considers that it
> should not handle an interrupt when there is no data to read and,
> accordingly, it returns IRQ_NONE in such cases. I was just wondering if we
> could follow the same approach to make i8042_aux_test_irq more
> IRQF_SHARED-friendly.
>

Yes, you are right. Patch applied to 'for-linus' branch of input tree.

Thank you.

-- 
Dmitry


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Roland Dreier
 > It's like CONFIG_HZ - more or less often debated, and now we have everyone
 > happy by giving them the choice.

That's an interesting analogy -- since really the right answer there
seems not to be modal at all, but rather to do CONFIG_NO_HZ.

 - R.


Re: [PATCH] sb1000: prevent a potential NULL pointer dereference in sb1000_dev_ioctl()

2007-07-28 Thread Domen Puncer
On 29/07/07 00:02 +0200, Jesper Juhl wrote:
> Hi,
> 
> Here's a small patch, prompted by a find by the Coverity checker, 
> that removes a potential NULL pointer dereference from 
> drivers/net/sb1000.c::sb1000_dev_ioctl().
> The checker spotted that we do a NULL test of 'dev', yet we 
> dereference the pointer prior to that check.
> This patch simply moves the dereference after the NULL test.

But... it can't be called without a valid 'dev', no?
A quick 'grep do_ioctl net/' confirms that all calls are in
the form of 'dev->do_ioctl(dev, ...'.


Domen


> @@ -991,11 +991,13 @@ static int sb1000_dev_ioctl(struct net_device *dev, 
> struct ifreq *ifr, int cmd)
>   short PID[4];
>   int ioaddr[2], status, frequency;
>   unsigned int stats[5];
> - struct sb1000_private *lp = netdev_priv(dev);
> + struct sb1000_private *lp;
>  
>   if (!(dev && dev->flags & IFF_UP))
>   return -ENODEV;
>  
> + lp = netdev_priv(dev);
> +
>   ioaddr[0] = dev->base_addr;
>   /* mem_start holds the second I/O address */
>   ioaddr[1] = dev->mem_start;
> 


[PATCH] Add /sys/module/name/notes

2007-07-28 Thread Roland McGrath

This patch adds the /sys/module/<name>/notes/ magic directory, which has
a file for each allocated SHT_NOTE section that appears in <name>.ko.
This is the counterpart for each module of /sys/kernel/notes for vmlinux.
Reading this delivers the contents of the module's SHT_NOTE sections.
This lets userland easily glean any detailed information about that
module's build that was stored there at compile time (e.g. by ld --build-id).

Signed-off-by: Roland McGrath <[EMAIL PROTECTED]>
---
 include/linux/module.h |3 +
 kernel/module.c|  106 
 2 files changed, 109 insertions(+), 0 deletions(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index b6a646c..65d0752 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -346,6 +346,9 @@ struct module
 
/* Section attributes */
struct module_sect_attrs *sect_attrs;
+
+   /* Notes attributes */
+   struct module_notes_attrs *notes_attrs;
 #endif
 
/* Per-cpu data. */
diff --git a/kernel/module.c b/kernel/module.c
index 33c04ad..d7bbe1a 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1054,6 +1055,100 @@ static void remove_sect_attrs(struct module *mod)
}
 }
 
+/*
+ * /sys/module/foo/notes/.section.name gives contents of SHT_NOTE sections.
+ */
+
+struct module_notes_attrs {
+   struct kobject *dir;
+   unsigned int notes;
+   struct bin_attribute attrs[0];
+};
+
+static ssize_t module_notes_read(struct kobject *kobj,
+struct bin_attribute *bin_attr,
+char *buf, loff_t pos, size_t count)
+{
+   /*
+* The caller checked the pos and count against our size.
+*/
+   memcpy(buf, bin_attr->private + pos, count);
+   return count;
+}
+
+static void free_notes_attrs(struct module_notes_attrs *notes_attrs,
+unsigned int i)
+{
+   if (notes_attrs->dir) {
+   while (i-- > 0)
+   sysfs_remove_bin_file(notes_attrs->dir,
+ &notes_attrs->attrs[i]);
+   kobject_del(notes_attrs->dir);
+   }
+   kfree(notes_attrs);
+}
+
+static void add_notes_attrs(struct module *mod, unsigned int nsect,
+   char *secstrings, Elf_Shdr *sechdrs)
+{
+   unsigned int notes, loaded, i;
+   struct module_notes_attrs *notes_attrs;
+   struct bin_attribute *nattr;
+
+   /* Count notes sections and allocate structures.  */
+   notes = 0;
+   for (i = 0; i < nsect; i++)
+   if ((sechdrs[i].sh_flags & SHF_ALLOC) &&
+   (sechdrs[i].sh_type == SHT_NOTE))
+   ++notes;
+
+   if (notes == 0)
+   return;
+
+   notes_attrs = kzalloc(sizeof(*notes_attrs)
+ + notes * sizeof(notes_attrs->attrs[0]),
+ GFP_KERNEL);
+   if (notes_attrs == NULL)
+   return;
+
+   notes_attrs->notes = notes;
+   nattr = &notes_attrs->attrs[0];
+   for (loaded = i = 0; i < nsect; ++i) {
+   if (!(sechdrs[i].sh_flags & SHF_ALLOC))
+   continue;
+   if (sechdrs[i].sh_type == SHT_NOTE) {
+   nattr->attr.name = mod->sect_attrs->attrs[loaded].name;
+   nattr->attr.mode = S_IRUGO;
+   nattr->size = sechdrs[i].sh_size;
+   nattr->private = (void *) sechdrs[i].sh_addr;
+   nattr->read = module_notes_read;
+   ++nattr;
+   }
+   ++loaded;
+   }
+
+   notes_attrs->dir = kobject_add_dir(&mod->mkobj.kobj, "notes");
+   if (!notes_attrs->dir)
+   goto out;
+
+   for (i = 0; i < notes; ++i)
+   if (sysfs_create_bin_file(notes_attrs->dir,
+ &notes_attrs->attrs[i]))
+   goto out;
+
+   mod->notes_attrs = notes_attrs;
+   return;
+
+  out:
+   free_notes_attrs(notes_attrs, i);
+}
+
+static void remove_notes_attrs(struct module *mod)
+{
+   if (mod->notes_attrs)
+   free_notes_attrs(mod->notes_attrs, mod->notes_attrs->notes);
+}
+
 #else
 
 static inline void add_sect_attrs(struct module *mod, unsigned int nsect,
@@ -1064,6 +1159,15 @@ static inline void add_sect_attrs(struct module *mod, 
unsigned int nsect,
 static inline void remove_sect_attrs(struct module *mod)
 {
 }
+
+static inline void add_notes_attrs(struct module *mod, unsigned int nsect,
+  char *sectstrings, Elf_Shdr *sechdrs)
+{
+}
+
+static inline void remove_notes_attrs(struct module *mod)
+{
+}
 #endif /* CONFIG_KALLSYMS */
 
 #ifdef CONFIG_SYSFS
@@ -1198,6 +1302,7 @@ static void free_module(struct module *mod)
 {
/* Delete from 
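
On the userland side, the bytes read from one of these notes files are a
sequence of ELF notes. A minimal sketch of picking the build ID out of such a
buffer follows. It assumes host byte order and the usual 4-byte padding of
the name and descriptor fields; find_build_id() and struct note_hdr are names
invented for this example, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF note header: three 32-bit words (same layout as Elf32_Nhdr/Elf64_Nhdr). */
struct note_hdr {
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

#define NOTE_ALIGN(x)      (((size_t)(x) + 3u) & ~(size_t)3u)
#define NOTE_TYPE_BUILD_ID 3u   /* same value as NT_GNU_BUILD_ID in <elf.h> */

/*
 * Walk a buffer of notes and return a pointer to the build-ID descriptor,
 * storing its length in *len. Returns NULL if there is no "GNU" build-ID
 * note or if a note runs past the end of the buffer.
 */
static const unsigned char *find_build_id(const unsigned char *buf,
                                          size_t size, size_t *len)
{
    size_t off = 0;

    assert(buf != NULL && len != NULL);

    while (size - off >= sizeof(struct note_hdr)) {
        struct note_hdr h;
        size_t name_off, desc_off, next;

        memcpy(&h, buf + off, sizeof(h));
        name_off = off + sizeof(h);
        if (h.n_namesz > size - name_off)
            return NULL;            /* name runs past the buffer */
        desc_off = name_off + NOTE_ALIGN(h.n_namesz);
        if (desc_off > size || h.n_descsz > size - desc_off)
            return NULL;            /* descriptor runs past the buffer */

        if (h.n_type == NOTE_TYPE_BUILD_ID && h.n_namesz == 4 &&
            memcmp(buf + name_off, "GNU", 4) == 0) {
            *len = h.n_descsz;
            return buf + desc_off;
        }

        next = desc_off + NOTE_ALIGN(h.n_descsz);
        off = next > size ? size : next;   /* last note may lack padding */
    }
    return NULL;
}
```

A caller would read the whole sysfs file into a buffer and pass it in along
with the number of bytes read.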

Re: [PATCH] arch/i386/kernel/apm.c: apm_init() warning fix

2007-07-28 Thread Stephen Rothwell
On Sun, 29 Jul 2007 10:49:18 +0800 Eugene Teo <[EMAIL PROTECTED]> wrote:
>
> arch/i386/kernel/apm.c: In function 'apm_init':
> arch/i386/kernel/apm.c:2240: warning: format '%lx' expects type 'long
>   unsigned int', but argument 3 has type 'u32'
> 
> apm_info.bios.offset is of type 'u32'.
> 
> Signed-off-by: Eugene Teo <[EMAIL PROTECTED]>
Acked-by: Stephen Rothwell <[EMAIL PROTECTED]>

-- 
Cheers,
Stephen Rothwell [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/
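
The warning quoted above is a mismatch between the conversion specifier and
the argument type. The actual one-line change is not visible in this archive,
so the following is only a userspace sketch of the same idea: print a 32-bit
value with a specifier that matches it. format_offset() is a name invented
for the example, and PRIx32 comes from <inttypes.h>.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 32-bit offset as hex; PRIx32 always matches uint32_t. */
static int format_offset(char *out, size_t outsz, uint32_t offset)
{
    assert(out != NULL && outsz > 0);
    return snprintf(out, outsz, "0x%" PRIx32, offset);
}
```

Passing the same value to "%lx" would be wrong wherever long is wider than
32 bits, which is why the compiler complains.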




Re: [PATCH] Merge the Sonics Silicon Backplane subsystem

2007-07-28 Thread Dmitry Torokhov
On Friday 27 July 2007 16:12, Andrew Morton wrote:
> On Fri, 27 Jul 2007 21:43:59 +0200
> Michael Buesch <[EMAIL PROTECTED]> wrote:
> 
> > > Sure, but why is the locking interruptible rather than plain old
> > > mutex_lock()?
> > 
> > Hm, well. We hold this mutex for several seconds, as writing takes
> > this long. So I simply thought it was worth allowing the waiter
> > to interrupt here. If you say that's not an issue, I'll be happy
> > to use mutex_lock() and reduce code complexity in this area.
> 
> So..  is that what the _interruptible() is for?  To allow an impatient user 
> to ^c
> a read?
> 
> If so, that sounds reasonable.  It's worth a comment explaining these 
> decisions
> to future readers, because it is hard to work out this sort of thinking just
> from the bare C code.

I think most of sysfs ->show() and ->store() implementations use
_interruptible() variant to allow user to interrupt and return early.

-- 
Dmitry


Re: [PATCH] TSDEV - Don't flood dmesg with removal warnings

2007-07-28 Thread Dmitry Torokhov
Hi Parag,

On Friday 27 July 2007 10:43, Parag Warudkar wrote:
> Ignore my previous whitespace damaged patch. This one should be good.
> 
> tsdev.c warns about scheduled removal each time tsdev_open is called -
> So even for a default boot I get to see the warning 3 times -
> 
> [  340.537078] tsdev (compaq touchscreen emulation) is scheduled for
> removal.
> [  340.537081] See Documentation/feature-removal-schedule.txt for details.
> [  340.550314] tsdev (compaq touchscreen emulation) is scheduled for
> removal.
> [  340.550318] See Documentation/feature-removal-schedule.txt for details.
> [  340.565065] tsdev (compaq touchscreen emulation) is scheduled for
> removal.
> [  340.565068] See Documentation/feature-removal-schedule.txt for details.
> 
> Move the warning to tsdev_init() from tsdev_open so we don't end up
> printing a large string in dmesg everytime tsdev_open is called.
>

The printk was moved per Andrew's request to make it more annoying.
Obviously it is working ;) Do you know what is opening /dev/input/tsX
nodes?

-- 
Dmitry


Re: How can we make page replacement smarter (was: swap-prefetch)

2007-07-28 Thread Rik van Riel

Al Boldi wrote:
> Chris Snook wrote:
>> At best, reads can be read-ahead and cached, which is why
>> sequential swap-in sucks less.  On-demand reads are as expensive as I/O
>> can get.
>
> Which means that it should be at least as fast as swap-out, even faster
> because write to disk is usually slower than read on modern disks.  But
> linux currently shows a distinct 2x slowdown for sequential swap-in wrt
> swap-out.


That's because writes are faster than reads in moderate
quantities.

The disk caches writes, allowing the OS to write a whole
bunch of data into the disk cache and the disk can optimize
the IO a bit internally.

The same optimization is not possible for reads.

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.


Re: [linux-usb-devel] Edgeport UPS Monitoring Problems

2007-07-28 Thread Adam Kropelin

Andrew Morton wrote:
> On Fri, 27 Jul 2007 13:37:08 -0700
> Nick Pasich <[EMAIL PROTECTED]> wrote:
>
>> Greg/Peter/Al,
>
> added linux-usb-devel.
>
>> I've been using the edgeport 4 port USB to Serial Converter
>> to monitor APC Smart UPS's via apcupsd for quite awhile on
>> various Linux boxes.
>>
>> I just upgraded to Kernel Version 2.6.22.1 from 2.6.20.6 on a
>> couple of systems and both the edgeports stopped communicating.
>>
>> I tried applying various patches, "PATCH 026/149" and "PATCH 082/149"
>> and one by Alan Cox..  but they didn't fix the problem.
>>
>> I copied the 2.6.20.6 edgeport module sources to the new
>> 2.6.22.1 tree and everything works again.
>>
>>   linux/drivers/usb/serial/io_edgeport.c
>>   linux/drivers/usb/serial/io_edgeport.h
>>   linux/drivers/usb/serial/io_edgeport.mod.c
>>   linux/drivers/usb/serial/io_tables.h
>
> Straightforward regression, most serious.  Thanks for reporting it.


I don't know much of anything about usb-serial, but I'll take a whack at 
it. Could you enable debug for that driver, launch apcupsd, and report 
any interesting messages that show up in dmesg? I'd be especially 
interested in any "Not setting..." or "Not writing..." messages, because 
some critical-looking code for baud rate setting and similar became 
conditional in 2.6.22.1 whereas it was always executed before. Apcupsd 
is going to be rather unhappy if the baud rate doesn't change when it 
asks. The debug should show if these operations are being ignored on 
your hw.


--Adam



Re: ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not

2007-07-28 Thread Gabriel C
Danny ter Haar wrote:
[ added  linux-acpi and Len to CC ]
> Quoting Gabriel C ([EMAIL PROTECTED]):
>> Maybe try to : 
>> disable BSG ( maybe some leftover bug )
>> boot acpi=off ( that got merged kind late )
> 
> My first git-bisected kernel wouldn't boot, but with
> acpi=off it would indeed boot!

Now that we think it is ACPI, this should be easy for you to bisect.

This commit 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=39804b20f62532fa05c2a8c3e2d1ae551fd0327b
merged ACPI so this one should be your first bad one.

Maybe Len has some idea and you don't need to bisect :)

> 
> As did the 2.6.23-rc1-git5 kernel...
> 
> I will bisect further to find out exactly what patch is
> playing up in my particular setup.
> 
> thanks for the tip! ;-)

You are welcome :)

> 
> Danny
> 

Gabriel


[PATCH] bsg: Fix warning with CONFIG_BLK_DEV_BSG=n

2007-07-28 Thread Roland Dreier
The current stub definitions of bsg_register_queue() and
bsg_unregister_queue() as macros leads to

drivers/scsi/scsi_sysfs.c: In function 'scsi_sysfs_add_sdev':
drivers/scsi/scsi_sysfs.c:718: warning: unused variable 'rq'

because the first parameter of bsg_register_queue() is completely
discarded.  As akpm says, "program in C, not in cpp."  We might as
well get a little bit better type-checking when we fix this by
converting the stubs to empty inline functions.

Signed-off-by: Roland Dreier <[EMAIL PROTECTED]>
---
 include/linux/bsg.h |9 +++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/bsg.h b/include/linux/bsg.h
index f415f89..69e23e1 100644
--- a/include/linux/bsg.h
+++ b/include/linux/bsg.h
@@ -60,8 +60,13 @@ struct bsg_class_device {
 extern int bsg_register_queue(struct request_queue *, struct device *, const 
char *);
 extern void bsg_unregister_queue(struct request_queue *);
 #else
-#define bsg_register_queue(disk, dev, name)(0)
-#define bsg_unregister_queue(disk) do { } while (0)
+static inline int bsg_register_queue(struct request_queue *q, struct device 
*gdev,
+const char *name)
+{
+   return 0;
+}
+
+static inline void bsg_unregister_queue(struct request_queue *q) { }
 #endif
 
 #endif /* __KERNEL__ */
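
The difference between the two kinds of stub can be seen outside the kernel
as well. A minimal sketch, with struct fake_queue and both stub names
invented for the example:

```c
#include <assert.h>

struct fake_queue { int id; };  /* hypothetical stand-in for struct request_queue */

/* Macro stub: the arguments vanish in the preprocessor, unchecked and unused. */
#define register_queue_macro(q, name) (0)

/* Inline stub: same zero cost, but the arguments are type-checked and "used". */
static inline int register_queue_inline(struct fake_queue *q, const char *name)
{
    (void)q;
    (void)name;
    return 0;
}

static int add_device_with_macro(void)
{
    /* The first argument is not even a declared identifier, yet this
     * compiles, because the macro throws it away. A real local passed
     * only to this macro would be reported as unused by gcc -Wall. */
    return register_queue_macro(never_evaluated, "sda");
}

static int add_device_with_inline(struct fake_queue *dev_queue)
{
    struct fake_queue *rq = dev_queue;  /* used: it is passed to the stub */

    return register_queue_inline(rq, "sda");
}
```

With the inline version, passing the wrong pointer type is diagnosed even
when the feature is configured out.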


Re: [RFC/RFT 1/5] Input: implement proper locking in input core

2007-07-28 Thread Dmitry Torokhov
Hi Indan,

On Friday 27 July 2007 19:28, Indan Zupancic wrote:
> Hi,
> 
> Not real feedback, just some nitpicks.
> 
> On Tue, July 24, 2007 06:45, Dmitry Torokhov wrote:
> > +static int input_defuzz_abs_event(int value, int old_val, int fuzz)
> > +{
> > +   if (fuzz) {
> > +   if (value > old_val - fuzz / 2 && value < old_val + fuzz / 2)
> > +   return value;
> >
> > -   add_input_randomness(type, code, value);
> > +   if (value > old_val - fuzz && value < old_val + fuzz)
> > +   return (old_val * 3 + value) / 4;
> >
> > -   switch (type) {
> > +   if (value > old_val - fuzz * 2 && value < old_val + fuzz * 2)
> > +   return (old_val + value) / 2;
> > +   }
> 
> Shouldn't the return values of the second and third case be reversed?
> In the 2nd check the new value is weighted for 1/4, while in the 3rd
> case it counts for 1/2, which breaks the "account new value more when
> it is closer to the old one" logic that I thought I saw here. So to sum up,
> should the second return be "return (old_val + value * 3) / 4"?

Thank you for bringing this up. Actually the 1st return value should be
"old_val", not value. The logic is to "gravitate towards old" when
the difference is small.
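
Put together, the "gravitate towards old" logic with the corrected first
return would look roughly like this as a standalone function. The final
fall-through "return value" is an assumption, since it is not part of the
quoted hunk, and defuzz_abs_event() is just a local name for the sketch.

```c
#include <assert.h>

static int defuzz_abs_event(int value, int old_val, int fuzz)
{
    assert(fuzz >= 0);

    if (fuzz) {
        /* Within half the fuzz: treat as noise, keep the old value. */
        if (value > old_val - fuzz / 2 && value < old_val + fuzz / 2)
            return old_val;

        /* Within the fuzz: move a quarter of the way to the new value. */
        if (value > old_val - fuzz && value < old_val + fuzz)
            return (old_val * 3 + value) / 4;

        /* Within twice the fuzz: meet halfway. */
        if (value > old_val - fuzz * 2 && value < old_val + fuzz * 2)
            return (old_val + value) / 2;
    }

    /* Assumed fall-through (not in the quoted hunk): accept the new value. */
    return value;
}
```

With old_val = 100 and fuzz = 8, a reading of 102 stays at 100, 106 becomes
101, 112 becomes 106, and 130 is passed through unchanged.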

> 
> 
> > +/*
> > + * Generate software autorepeat event. Note that we take
> > + * dev->event_lock here to avoid racing with input_event
> > + * which may cause keys get "stuck".
> > + */
> 
> Hurray. :-)
> 
> > -   if (code > SW_MAX || !test_bit(code, dev->swbit) || 
> > !!test_bit(code, dev->sw) == value)
> > -   return;
> > +   if (dev->rep[REP_PERIOD])
> > +   mod_timer(&dev->timer, jiffies +
> > +   msecs_to_jiffies(dev->rep[REP_PERIOD]));
> > +   }
> 
> Perhaps use a local var for the "msecs_to_jiffies(dev->rep[REP_PERIOD])" part.
>

What would be the benefit of doing so?

> 
> > +static void input_start_autorepeat(struct input_dev *dev, int code)
> > +{
> > +   if (test_bit(EV_REP, dev->evbit) &&
> > +   dev->rep[REP_PERIOD] && dev->rep[REP_DELAY] &&
> > +   dev->timer.data) {
> > +   dev->repeat_key = code;
> > +   mod_timer(&dev->timer,
> > + jiffies + msecs_to_jiffies(dev->rep[REP_DELAY]));
> > +   }
> > +}
> 
> Same here.
> 
> 
> > +   case EV_KEY:
> > +   if (is_event_supported(code, dev->keybit, KEY_MAX) &&
> > +   !!test_bit(code, dev->key) != value) {
> 
> A bit confusing, test_bit() only returns 0 or 1 anyway, doesn't it?
> So "test_bit(code, dev->key) != value" should be all right.
> I noticed that the old code did it too, but still.

Is it guaranteed? I only expect it to return 0/non-0 values, not necessarily
0 and 1.

> 
> > -   case EV_MSC:
> > +   case EV_SW:
> > +   if (is_event_supported(code, dev->swbit, SW_MAX) &&
> > +   !!test_bit(code, dev->sw) != value) {
> 
> Same.
> 
> > -   break;
> > +   case EV_LED:
> > +   if (is_event_supported(code, dev->ledbit, LED_MAX) &&
> > +   !!test_bit(code, dev->led) != value) {
> 
> And here.
> 
> 
> > +void input_inject_event(struct input_handle *handle,
> > +   unsigned int type, unsigned int code, int value)
> >  {
> > -   struct input_dev *dev = (void *) data;
> > +   struct input_dev *dev = handle->dev;
> > +   struct input_handle *grab;
> >
> > -   if (!test_bit(dev->repeat_key, dev->key))
> > -   return;
> > +   if (is_event_supported(type, dev->evbit, EV_MAX)) {
> > +   spin_lock_irq(&dev->event_lock);
> >
> > -   input_event(dev, EV_KEY, dev->repeat_key, 2);
> > -   input_sync(dev);
> > +   grab = rcu_dereference(dev->grab);
> > +   if (!grab || grab == handle)
> > +   input_handle_event(dev, type, code, value);
> 
> 'handle' can't be NULL, so can drop the "!grab" check, as checking
> "grab == handle" should be sufficient.
>

It is "or", not "and". The idea is to pass the event if device is not
grabbed by anyone _or_ if source of event is handle that grabbed the
device.
 
> 
> > +/**
> > + * input_open_device - open input device
> > + * @handle: handle through which device is being accessed
> > + *
> > + * This function should be called by input handlers when they
> > + * want to start receive events from given input device.
> > + */
> >  int input_open_device(struct input_handle *handle)
> >  {
> > struct input_dev *dev = handle->dev;
> > -   int err;
> > +   int retval;
> >
> > -   err = mutex_lock_interruptible(&dev->mutex);
> > -   if (err)
> > -   return err;
> > +   retval = mutex_lock_interruptible(&dev->mutex);
> > +   if (retval)
> > +   return retval;
> > +
> > +   if (dev->going_away) {
> > +   retval = -ENODEV;
> > +   goto out;
> > +   }
> >
> > handle->open++;
> >
> > if (!dev->users++ && dev->open)
> 
> Ugh, not your code, and perhaps it's me, but that looks weird.
> The ++ hidden 

Re: [PATCH 2/2] ehca: correction include order according kernel coding style

2007-07-28 Thread Roland Dreier
thanks, I applied this by hand since it was so trivial.


Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]

2007-07-28 Thread Andrew Morton
On Sat, 28 Jul 2007 21:33:59 -0400 Rik van Riel <[EMAIL PROTECTED]> wrote:

> Andrew Morton wrote:
> 
> > What I think is killing us here is the blockdev pagecache: the pagecache
> > which backs those directory entries and inodes.  These pages get read
> > multiple times because they hold multiple directory entries and multiple
> > inodes.  These multiple touches will put those pages onto the active list
> > so they stick around for a long time and everything else gets evicted.
> > 
> > I've never been very sure about this policy for the metadata pagecache.  We
> > read the filesystem objects into the dcache and icache and then we won't
> > read from that page again for a long time (I expect).  But the page will
> > still hang around for a long time.
> > 
> > It could be that we should leave those pages inactive.
> 
> Good idea for updatedb.
> 
> However, it may be a bad idea for files that are often
> written to.  Turning an inode write into a read plus a
> write does not sound like such a hot idea, we really
> want to keep those in the cache.

Remember that this problem applies to both inode blocks and to directory
blocks.  Yes, it might be useful to hold onto an inode block for a future
write (atime, mtime, usually), but not a directory block.

> I think what you need is to ignore multiple references
> to the same page when they all happen in one time
> interval, counting them only if they happen in multiple
> time intervals.

Yes, the sudden burst of accesses for adjacent inode/dirents will be a
common pattern, and it'd make heaps of sense to treat that as a single
touch.  It'd have to be done in the fs I guess, and it might be a bit hard
to do.  And it turns out that embedding the touch_buffer() all the way down
in __find_get_block() was convenient, but it's going to be tricky to
change.

For now I'm fairly inclined to just nuke the touch_buffer() on the read side
and maybe add one on the modification codepaths and see what happens.

As always, testing is the problem.

> The use-once cleanup (which takes a page flag for PG_new,
> I know...) would solve that problem.
> 
> However, it would introduce the problem of having to scan
> all the pages on the list before a page becomes freeable.
> We would have to add some background scanning (or a separate
> list for PG_new pages) to make the initial pageout run use
> an acceptable amount of CPU time.
> 
> Not sure that complexity will be worth it...
> 

I suspect that the situation we have now is so bad that pretty much
anything we do will be an improvement.  I've always wondered "ytf is there
so much blockdev pagecache?"

This machine I'm typing at:

MemTotal:      3975080 kB
MemFree:        750400 kB
Buffers:        547736 kB
Cached:        1299532 kB
SwapCached:      12772 kB
Active:        1789864 kB
Inactive:       861420 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3975080 kB
LowFree:        750400 kB
SwapTotal:     4875716 kB
SwapFree:      4715660 kB
Dirty:              76 kB
Writeback:           0 kB
Mapped:         638036 kB
Slab:           522724 kB
CommitLimit:   6863256 kB
Committed_AS:  1115632 kB
PageTables:      14452 kB
VmallocTotal: 34359738367 kB
VmallocUsed:     36432 kB
VmallocChunk: 34359696379 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

More than a quarter of my RAM in fs metadata!  Most of it I'll bet is on the
active list.  And the fs on which I do most of the work is mounted
noatime..




Re: [RFC/RFT 0/5] Input locking patches

2007-07-28 Thread Dmitry Torokhov
Hi Indan,

On Friday 27 July 2007 18:25, Indan Zupancic wrote:
> Sorry for the babbling, just wanted to say that I've tested these
> patches and that they seem to fix real problems.
> 

Thank you for testing the patches.

-- 
Dmitry


Re: [ofa-general] [PATCH 1/2] ehca: remove checkpatch.pl's warnings "externs should be avoided in .c files"

2007-07-28 Thread Roland Dreier
the patch looks fine except your mailer seems to have mangled
it... can you resend so I can apply it?

thanks...


Re: ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not

2007-07-28 Thread Danny ter Haar
Quoting Gabriel C ([EMAIL PROTECTED]):
> Maybe try to : 
> disable BSG ( maybe some leftover bug )
> boot acpi=off ( that got merged kind late )

My first git-bisected kernel wouldn't boot, but with
acpi=off it would indeed boot!

As did the 2.6.23-rc1-git5 kernel...

I will bisect further to find out exactly what patch is
playing up in my particular setup.

thanks for the tip! ;-)

Danny

-- 


Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-28 Thread Chris Snook

Tong Li wrote:
> Without the global locking, the global synchronization here is simply
> ping-ponging a cache line once in a while. This doesn't look expensive to
> me, but if it does after benchmarking, adjusting sysctl_base_round_slice
> can reduce the ping-pong frequency. There might also be a smart
> implementation that can alleviate this problem.

Scaling it proportionally to migration cost and log2(cpus) should suffice.

> I don't understand why quantizing CPU time is a bad thing. Could you
> educate me on this?


It depends on how precisely you do it.  We save a lot of power going 
tickless.  If round expiration is re-introducing ticks on idle CPUs, we 
could waste a lot of power.  Hardware is getting even more aggressive 
about power saving, to the point of allowing individual cores to be 
completely powered off when idle.  We need to make sure the scheduler 
doesn't interfere with power management.


-- Chris


[PATCH] fs/partitions/check.c: add_partition() warning fixes

2007-07-28 Thread Eugene Teo
This patch fixes these warnings:

fs/partitions/check.c: In function ‘add_partition’:
fs/partitions/check.c:391: warning: ignoring return value of ‘kobject_add’,
declared with attribute warn_unused_result
fs/partitions/check.c:394: warning: ignoring return value of
‘sysfs_create_link’, declared with attribute warn_unused_result
fs/partitions/check.c:401: warning: ignoring return value of
‘sysfs_create_file’, declared with attribute warn_unused_result

Signed-off-by: Eugene Teo <[EMAIL PROTECTED]>
---
 fs/partitions/check.c |   21 ++---
 1 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/fs/partitions/check.c b/fs/partitions/check.c
index 783c57e..01397c1 100644
--- a/fs/partitions/check.c
+++ b/fs/partitions/check.c
@@ -371,6 +371,7 @@ void delete_partition(struct gendisk *disk, int part)
 void add_partition(struct gendisk *disk, int part, sector_t start, sector_t 
len, int flags)
 {
struct hd_struct *p;
+   int err;
 
p = kzalloc(sizeof(*p), GFP_KERNEL);
if (!p)
@@ -388,20 +389,34 @@ void add_partition(struct gendisk *disk, int part, 
sector_t start, sector_t len,
p->kobj.parent = &disk->kobj;
p->kobj.ktype = &ktype_part;
kobject_init(&p->kobj);
-   kobject_add(&p->kobj);
+   err = kobject_add(&p->kobj);
+   if (err)
+   goto err_out;
if (!disk->part_uevent_suppress)
kobject_uevent(&p->kobj, KOBJ_ADD);
-   sysfs_create_link(&p->kobj, &block_subsys.kobj, "subsystem");
+   err = sysfs_create_link(&p->kobj, &block_subsys.kobj, "subsystem");
+   if (err)
+   goto err_out_del_kobj;
if (flags & ADDPART_FLAG_WHOLEDISK) {
static struct attribute addpartattr = {
.name = "whole_disk",
.mode = S_IRUSR | S_IRGRP | S_IROTH,
};
 
-   sysfs_create_file(&p->kobj, &addpartattr);
+   err = sysfs_create_file(&p->kobj, &addpartattr);
+   if (err)
+   goto err_out_del_link;
}
partition_sysfs_add_subdir(p);
disk->part[part-1] = p;
+   return;
+
+err_out_del_link:
+   sysfs_remove_link(&p->kobj, "subsystem");
+err_out_del_kobj:
+   kobject_del(&p->kobj);
+err_out:
+   kfree(p);
 }
 
 static char *make_block_name(struct gendisk *disk)
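
The error handling added by the patch above follows the usual staged-cleanup
pattern: each setup step that can fail jumps to a label that undoes only the
steps already completed. Below is a minimal userspace sketch of that pattern.
All names (demo_part, demo_stage, demo_add_partition) are hypothetical
stand-ins for illustration, not kernel API, and the "stages" only set flags
instead of touching sysfs.

```c
#include <stdlib.h>

/* Hypothetical object with three setup stages, standing in for
 * kobject_add(), sysfs_create_link() and sysfs_create_file(). */
struct demo_part {
	int registered;
	int linked;
	int has_file;
};

/* fail_at selects which stage (1..3) reports an error; 0 means none. */
static int demo_stage(int stage, int fail_at)
{
	return stage == fail_at ? -1 : 0;
}

/* Returns the new object, or NULL after undoing every completed stage. */
static struct demo_part *demo_add_partition(int fail_at)
{
	struct demo_part *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;

	if (demo_stage(1, fail_at))
		goto err_out;
	p->registered = 1;

	if (demo_stage(2, fail_at))
		goto err_out_del_kobj;
	p->linked = 1;

	if (demo_stage(3, fail_at))
		goto err_out_del_link;
	p->has_file = 1;

	return p;

	/* Labels run in reverse order of setup, so each failure point
	 * undoes only what had already succeeded. */
err_out_del_link:
	p->linked = 0;
err_out_del_kobj:
	p->registered = 0;
err_out:
	free(p);
	return NULL;
}
```

Calling demo_add_partition(0) returns a fully set-up object; any value from
1 to 3 exercises one of the unwind paths and returns NULL.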



Re: ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not

2007-07-28 Thread Gabriel C
Danny ter Haar wrote:
> Quoting Bartlomiej Zolnierkiewicz ([EMAIL PROTECTED]):
>> Please retry with the latest -git kernel and if the problem is still
>> there install git, get kernel tree and run git-bisect.
> 
> I ran over "make menuconfig" and did a few changes.
> 
> http://www.dth.net/kernel/config-2.6.23-rc1-git5
> 

Maybe try to : 

disable BSG ( maybe some leftover bug )
boot acpi=off ( that got merged kind late )

> 
> Danny


Gabriel


[PATCH] arch/i386/kernel/apm.c: apm_init() warning fix

2007-07-28 Thread Eugene Teo
arch/i386/kernel/apm.c: In function 'apm_init':
arch/i386/kernel/apm.c:2240: warning: format '%lx' expects type 'long
unsigned int', but argument 3 has type 'u32'

apm_info.bios.offset is of type 'u32'.

Signed-off-by: Eugene Teo <[EMAIL PROTECTED]>
---
 arch/i386/kernel/apm.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/i386/kernel/apm.c b/arch/i386/kernel/apm.c
index 47001d5..f02a8ac 100644
--- a/arch/i386/kernel/apm.c
+++ b/arch/i386/kernel/apm.c
@@ -2235,7 +2235,7 @@ static int __init apm_init(void)
apm_info.bios.cseg_16_len = 0; /* 64k */
 
if (debug) {
-   printk(KERN_INFO "apm: entry %x:%lx cseg16 %x dseg %x",
+   printk(KERN_INFO "apm: entry %x:%x cseg16 %x dseg %x",
apm_info.bios.cseg, apm_info.bios.offset,
apm_info.bios.cseg_16, apm_info.bios.dseg);
if (apm_info.bios.version > 0x100)
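
The warning fixed above comes from a conversion specifier that does not match
the argument type: '%lx' expects an unsigned long, while a u32 on i386 is an
unsigned int, so '%x' is the matching specifier there. As a rough userspace
illustration, the sketch below uses the standard PRIx32 macro to keep the
format and a 32-bit argument in agreement on any platform. The function name
format_entry is hypothetical and only for this example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Formats a segment:offset pair where both values are 32 bits wide.
 * PRIx32 expands to the conversion that matches uint32_t on the
 * current platform, so the format string and the arguments agree. */
static int format_entry(char *buf, size_t len, uint32_t cseg, uint32_t offset)
{
	return snprintf(buf, len, "apm: entry %" PRIx32 ":%" PRIx32,
			cseg, offset);
}
```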



[PATCH] List VESA graphics videomodes when vesafb is present

2007-07-28 Thread Petr Vandrovec
Hello,
  I had something like this in video.S for years, so it is probably time
to try to push it upstream...  Besides other problems it confirmed that
when I connect HDTV to my nVidia, BIOS decides that 640x480 is largest
possible resolution, as it is largest standard resolution smaller than
non-interlaced 1920x540, and apparently BIOS I have does not believe
that interlaced modes exist.
Thanks,
Petr Vandrovec


List VESA videomodes when vesafb is available

There is no CONFIG_VIDEO_VESA option, so code to retrieve VESA modes
(even text ones) was always disabled - it was introduced by conversion,
old video.S had #define CONFIG_VIDEO_VESA at the beginning.

Modify video-vesa.c to list graphics videomodes when vesafb is built in
and videomode is acceptable.

Because there are a lot of videomodes in some VESA BIOSes, list them in three
columns (should be good for anybody - videocards with more than ~70 videomodes
cannot be used with some OSes, so vendors usually try to not cross more
than 60 videomodes reported by VESA).  Which unfortunately means that
card_name for "BIOS (scanned)" needs to be made shorter.

And display color depth for videomodes.  To avoid confusion depth
is shown only if non-text videomodes are present in the list.

---
commit 9453a2a4afee6cadb020838d0a35c2d11af25aa6
tree 5903315f7fd62db34f22107e63ec37a24bbf8075
parent 5de7fc0bf0e2e36a8dcf619e95576219dcc13d70
author Petr Vandrovec <[EMAIL PROTECTED]> Sun, 22 Jul 2007 23:54:34 -0700
committer Petr Vandrovec <[EMAIL PROTECTED]> Sun, 22 Jul 2007 23:54:34 -0700

 arch/i386/boot/video-bios.c |3 ++-
 arch/i386/boot/video-vesa.c |   48 +++
 arch/i386/boot/video-vga.c  |   20 +-
 arch/i386/boot/video.c  |   39 ---
 arch/i386/boot/video.h  |3 ++-
 5 files changed, 89 insertions(+), 24 deletions(-)

diff --git a/arch/i386/boot/video-bios.c b/arch/i386/boot/video-bios.c
index afea46c..376ff71 100644
--- a/arch/i386/boot/video-bios.c
+++ b/arch/i386/boot/video-bios.c
@@ -104,6 +104,7 @@ static int bios_probe(void)
 
mi = GET_HEAP(struct mode_info, 1);
mi->mode = VIDEO_FIRST_BIOS+mode;
+   mi->depth = 0;  /* text */
mi->x = rdfs16(0x44a);
mi->y = rdfs8(0x484)+1;
nmodes++;
@@ -116,7 +117,7 @@ static int bios_probe(void)
 
 __videocard video_bios =
 {
-   .card_name  = "BIOS (scanned)",
+   .card_name  = "BIOS",
.probe  = bios_probe,
.set_mode   = bios_set_mode,
.unsafe = 1,
diff --git a/arch/i386/boot/video-vesa.c b/arch/i386/boot/video-vesa.c
index e6aa9eb..af7019f 100644
--- a/arch/i386/boot/video-vesa.c
+++ b/arch/i386/boot/video-vesa.c
@@ -28,7 +28,7 @@ static void vesa_store_mode_params_graphics(void);
 
 static int vesa_probe(void)
 {
-#if defined(CONFIG_VIDEO_VESA) || defined(CONFIG_FIRMWARE_EDID)
+#if defined(CONFIG_VIDEO_SELECT) || defined(CONFIG_FIRMWARE_EDID)
u16 ax;
u16 mode;
addr_t mode_ptr;
@@ -47,8 +47,8 @@ static int vesa_probe(void)
vginfo.signature != VESA_MAGIC ||
vginfo.version < 0x0102)
return 0;   /* Not present */
-#endif /* CONFIG_VIDEO_VESA || CONFIG_FIRMWARE_EDID */
-#ifdef CONFIG_VIDEO_VESA
+#endif /* CONFIG_VIDEO_SELECT || CONFIG_FIRMWARE_EDID */
+#ifdef CONFIG_VIDEO_SELECT
set_fs(vginfo.video_mode_ptr.seg);
mode_ptr = vginfo.video_mode_ptr.off;
 
@@ -75,19 +75,49 @@ static int vesa_probe(void)
/* Text Mode, TTY BIOS supported,
   supported by hardware */
mi = GET_HEAP(struct mode_info, 1);
-   mi->mode = mode + VIDEO_FIRST_VESA;
-   mi->x= vminfo.h_res;
-   mi->y= vminfo.v_res;
+   mi->mode  = mode + VIDEO_FIRST_VESA;
+   mi->depth = 0; /* text */
+   mi->x = vminfo.h_res;
+   mi->y = vminfo.v_res;
nmodes++;
} else if ((vminfo.mode_attr & 0x99) == 0x99) {
 #ifdef CONFIG_FB
/* Graphics mode, color, linear frame buffer
-  supported -- register the mode but hide from
+  supported -- register the mode, and if there
+  is no VESA framebuffer then hide from
   the menu.  Only do this if framebuffer is
   configured, however, otherwise the user will
   be left without a screen. */
mi = GET_HEAP(struct mode_info, 1);
-   mi->mode = mode + VIDEO_FIRST_VESA;
+   mi->mode = mode + VIDEO_FIRST_VESA;
+   mi->depth = 7;  /* 
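
The probe code in the patch above decides how to list a mode from the VBE
mode attribute word: the graphics branch requires all bits of 0x99 to be set.
The sketch below spells out those bits as named constants. The text-mode test
(mask 0x15, value 0x05) is an assumption inferred from the "Text Mode, TTY
BIOS supported, supported by hardware" comment, since that condition is not
visible in the excerpt. The names classify_mode, mode_kind and the ATTR_*
constants are hypothetical.

```c
#include <stdint.h>

enum mode_kind { MODE_OTHER, MODE_TEXT, MODE_GRAPHICS_LFB };

/* VBE mode attribute bits used below. */
#define ATTR_HW_SUPPORTED 0x01	/* bit 0: mode supported by hardware */
#define ATTR_TTY_OUTPUT   0x04	/* bit 2: BIOS TTY output supported */
#define ATTR_COLOR        0x08	/* bit 3: color mode */
#define ATTR_GRAPHICS     0x10	/* bit 4: graphics (not text) mode */
#define ATTR_LFB          0x80	/* bit 7: linear frame buffer available */

static enum mode_kind classify_mode(uint16_t mode_attr)
{
	/* 0x99: hardware-supported color graphics mode with an LFB. */
	const uint16_t gfx = ATTR_HW_SUPPORTED | ATTR_COLOR |
			     ATTR_GRAPHICS | ATTR_LFB;
	/* Assumed text test: hardware and TTY bits set, graphics bit clear. */
	const uint16_t text_mask = ATTR_HW_SUPPORTED | ATTR_TTY_OUTPUT |
				   ATTR_GRAPHICS;
	const uint16_t text = ATTR_HW_SUPPORTED | ATTR_TTY_OUTPUT;

	if ((mode_attr & text_mask) == text)
		return MODE_TEXT;
	if ((mode_attr & gfx) == gfx)
		return MODE_GRAPHICS_LFB;
	return MODE_OTHER;
}
```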

Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-28 Thread Chris Snook

Tong Li wrote:

On Fri, 27 Jul 2007, Chris Snook wrote:


Bill Huey (hui) wrote:
You have to consider the target for this kind of code. There are
applications where you need something that falls within a constant error
bound. According to the numbers, the current CFS rebalancing logic doesn't
achieve that to any degree of rigor. So CFS is ok for SCHED_OTHER, but not
for anything more strict than that.


I've said from the beginning that I think that anyone who desperately 
needs perfect fairness should be explicitly enforcing it with the aid 
of realtime priorities.  The problem is that configuring and tuning a 
realtime application is a pain, and people want to be able to 
approximate this behavior without doing a whole lot of dirty work 
themselves.  I believe that CFS can and should be enhanced to ensure 
SMP-fairness over potentially short, user-configurable intervals, even 
for SCHED_OTHER.  I do not, however, believe that we should take it to 
the extreme of wasting CPU cycles on migrations that will not improve 
performance for *any* task, just to avoid letting some tasks get ahead 
of others.  We should be as fair as possible but no fairer.  If we've 
already made it as fair as possible, we should account for the margin 
of error and correct for it the next time we rebalance.  We should not 
burn the surplus just to get rid of it.


Proportional-share scheduling actually has one of its roots in real-time 
and having a p-fair scheduler is essential for real-time apps (soft 
real-time).


Sounds like another scheduler class might be in order.  I find CFS to be 
fair enough for most purposes.  If the code that gives us near-perfect 
fairness at the expense of efficiency only runs when tasks have been 
given boosted priority by a privileged user, and only on the CPUs that 
have such tasks queued on them, the run time overhead and code 
complexity become much smaller concerns.




On a non-NUMA box with single-socket, non-SMT processors, a constant 
error bound is fine.  Once we add SMT, go multi-core, go NUMA, and add 
inter-chassis interconnects on top of that, we need to multiply this 
error bound at each stage in the hierarchy, or else we'll end up 
wasting CPU cycles on migrations that actually hurt the processes 
they're supposed to be helping, and hurt everyone else even more.  I 
believe we should enforce an error bound that is proportional to 
migration cost.




I think we are actually in agreement. When I say constant bound, it can 
certainly be a constant that's determined based on inputs from the 
memory hierarchy. The point is that it needs to be a constant 
independent of things like # of tasks.


Agreed.
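
The point both sides agree on, an error bound that scales with migration cost
at each level of the hierarchy but does not depend on the number of tasks,
can be sketched roughly as follows. This is an illustration only, not code
from CFS or from the patch under discussion: the topology levels, the cost
table and the function names are all hypothetical, and real costs would have
to be measured.

```c
/* Hypothetical topology levels, ordered from cheapest to most
 * expensive migration. */
enum topo_level { TOPO_SMT, TOPO_CORE, TOPO_NUMA, TOPO_CHASSIS, TOPO_LEVELS };

/* Illustrative relative migration costs; real values would have to be
 * measured or derived from the memory hierarchy. */
static const unsigned long migration_cost[TOPO_LEVELS] = { 1, 4, 16, 64 };

/* Allowed imbalance at a level: a base bound scaled by migration cost.
 * It does not depend on the number of tasks. */
static unsigned long error_bound(unsigned long base, enum topo_level level)
{
	return base * migration_cost[level];
}

/* Migrate only when the observed imbalance exceeds the bound. */
static int should_migrate(unsigned long imbalance, unsigned long base,
			  enum topo_level level)
{
	return imbalance > error_bound(base, level);
}
```

With a base bound of 2, an imbalance of 5 would trigger a migration between
SMT siblings but not between cores, where the bound grows to 8.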

But this patch is only relevant to SCHED_OTHER.  The realtime 
scheduler doesn't have a concept of fairness, just priorities.  That's 
why each realtime priority level has its own separate runqueue.  
Realtime schedulers are supposed to be dumb as a post, so they cannot 
heuristically decide to do anything other than precisely what you 
configured them to do, and so they don't get in the way when you're 
context switching a million times a second.


Are you referring to hard real-time? As I said, an infrastructure that 
enables p-fair scheduling, EDF, or things alike is the foundation for 
real-time. I designed DWRR, however, with a target of non-RT apps, 
although I was hoping the research results might be applicable to RT.


I'm referring to the static priority SCHED_FIFO and SCHED_RR schedulers, 
which are (intentionally) dumb as a post, allowing userspace to manage 
CPU time explicitly.  Proportionally fair scheduling is a cool 
capability, but not a design goal of those schedulers.


-- Chris


Re: [PATCH 2/3] core_pattern: allow passing of arguments to user mode helper when core_pattern is a pipe

2007-07-28 Thread Neil Horman
On Sat, Jul 28, 2007 at 03:52:02PM -0700, Jeremy Fitzhardinge wrote:
> Neil Horman wrote:
> > Jeremy asked that I make a patch next week to address split_argv's 
> > requirement
> > that the argc parameter be non-NULL.  I'll be fixing that next week, and 
> > what I
> > can do is further enhance it such that it ignores spaces in quoted strings,
> > which should address the case that concerns you.  I.E I can make split_argv
> > behave such that:
> > echo "|\"foo bar\" --pid %p" > /proc/sys/kernel/core_pattern
> > results in the following argv:
> > {{"foo bar"}, {"--pid"}, {"1234"}}
> >
> > Which I think handles what you are looking for.
> >   
> 
> No, please don't.  My original argv_split did that, and it was just way
> too complex.  If you need complex quoting, you can always point it at a
> shell script and handle it there.
> 
> J

Ok, well then, it seems this corner case is much too hairy to just fix up
immediately.  Given that we certainly don't handle quoted strings now, and the
fact that this is a case that will almost never come up, and can be easily
worked around, let's address it at some time after we get this base functionality
in place.

Regards
Neil


-- 
/***
 *Neil Horman
 *Software Engineer
 *Red Hat, Inc.
 [EMAIL PROTECTED]
 *gpg keyid: 1024D / 0x92A74FA1
 *http://pgp.mit.edu
 ***/
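
For reference, the simple behavior being kept (split on whitespace, no
quoting rules) can be sketched in a few lines of userspace C. This is not the
kernel's implementation: split_words is a hypothetical helper that modifies
its input in place, and it shows why a quoted "foo bar" ends up as two
separate words.

```c
#include <ctype.h>
#include <stddef.h>

/* Splits s in place on whitespace, storing up to max pointers in argv.
 * Quotes get no special treatment. Returns the number of words stored. */
static size_t split_words(char *s, char **argv, size_t max)
{
	size_t argc = 0;

	while (*s && argc < max) {
		/* Skip leading whitespace before the next word. */
		while (isspace((unsigned char)*s))
			s++;
		if (!*s)
			break;
		argv[argc++] = s;
		/* Advance to the end of the word and terminate it. */
		while (*s && !isspace((unsigned char)*s))
			s++;
		if (*s)
			*s++ = '\0';
	}
	return argc;
}
```

A pattern that needs arguments containing spaces can point at a shell script
instead, as suggested in the quoted reply.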


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Charles philip Chan
Con Kolivas <[EMAIL PROTECTED]> writes:

> Interesting... Trying to avoid reading email but with a flooded inbox
> it's quite hard to do.

Con, good to hear from you. Good luck with your future endeavors.

Charles

-- 
"Are [Linux users] lemmings collectively jumping off of the cliff of
reliable, well-engineered commercial software?"
(By Matt Welsh)




Re: [PATCH] UML - Console should handle spurious IRQS

2007-07-28 Thread Eduard-Gabriel Munteanu

Jeff Dike wrote:

The previous DEBUG_SHIRQ patch missed one case.  The console doesn't
set its host descriptors non-blocking.


Sorry, things looked okay when I tested on my UML environment (Puppy 
Linux). Some xterms popped around (because I was using "con=xterm") and 
the system was usable, so it gave me no indication something was wrong.


I thought of adding an extra debugging option to warn us when a blocking 
 I/O operation is issued for a socket/fd, but UML-specific code is not 
consistent regarding glibc functions. That is, most of the time it calls 
os_*(), but sometimes it calls functions like recvfrom() directly. I'll 
grep the source code for such calls and send a patch to clean it up a bit.


There might still be such cases, I haven't tested all channel types yet.


Re: How can we make page replacement smarter

2007-07-28 Thread Rik van Riel

Al Boldi wrote:

Good idea, but unless we understand the problems involved, we are bound to 
repeat it.  So my first question would be:  Why is swap-in so slow?


As I have posted in other threads, swap-in of consecutive pages suffers a 2x 
slowdown wrt swap-out, whereas swap-in of random pages suffers over 6x 
slowdown.


Because it is hard to quantify the expected swap-in speed for random pages, 
let's first tackle the swap-in of consecutive pages, which should be at 
least as fast as swap-out.  So again, why is swap-in so slow?


I suspect that this is a locality of reference issue.

Anonymous memory can get jumbled up by repeated free and
malloc cycles of many smaller objects.  The amount of
anonymous memory is often smaller than or roughly the same
size as system memory.

Locality of reference to anonymous memory tends to be
temporal in nature, with the same sets of pages being
accessed over and over again.

Files are different.  File content tends to be grouped
in large related chunks, both logically in the file and
on disk.  Generally there is a lot more file data on a
system than what fits in memory.

Locality of reference to file data tends to be spatial
in nature, with one file access leading up to the system
accessing "nearby" data.  The data is not necessarily
touched again any time soon.

Once we understand this problem, we may be able to suggest a smart 
improvement.


Like the one on http://linux-mm.org/PageoutFailureModes ?

I have the LRU lists split and am working on getting SEQ
replacement implemented for the anonymous pages.

The most recent (untested) patches are attached.

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
--- linux-2.6.21.noarch/drivers/base/node.c.vmsplit	2007-04-25 23:08:32.0 -0400
+++ linux-2.6.21.noarch/drivers/base/node.c	2007-07-23 11:42:52.0 -0400
@@ -44,33 +44,37 @@ static ssize_t node_read_meminfo(struct 
 	si_meminfo_node(&i, nid);
 
 	n = sprintf(buf, "\n"
-		   "Node %d MemTotal: %8lu kB\n"
-		   "Node %d MemFree:  %8lu kB\n"
-		   "Node %d MemUsed:  %8lu kB\n"
-		   "Node %d Active:   %8lu kB\n"
-		   "Node %d Inactive: %8lu kB\n"
+		   "Node %d MemTotal:   %8lu kB\n"
+		   "Node %d MemFree:%8lu kB\n"
+		   "Node %d MemUsed:%8lu kB\n"
+		   "Node %d Active(anon):   %8lu kB\n"
+		   "Node %d Inactive(anon): %8lu kB\n"
+		   "Node %d Active(file):   %8lu kB\n"
+		   "Node %d Inactive(file): %8lu kB\n"
 #ifdef CONFIG_HIGHMEM
-		   "Node %d HighTotal:%8lu kB\n"
-		   "Node %d HighFree: %8lu kB\n"
-		   "Node %d LowTotal: %8lu kB\n"
-		   "Node %d LowFree:  %8lu kB\n"
+		   "Node %d HighTotal:  %8lu kB\n"
+		   "Node %d HighFree:   %8lu kB\n"
+		   "Node %d LowTotal:   %8lu kB\n"
+		   "Node %d LowFree:%8lu kB\n"
 #endif
-		   "Node %d Dirty:%8lu kB\n"
-		   "Node %d Writeback:%8lu kB\n"
-		   "Node %d FilePages:%8lu kB\n"
-		   "Node %d Mapped:   %8lu kB\n"
-		   "Node %d AnonPages:%8lu kB\n"
-		   "Node %d PageTables:   %8lu kB\n"
-		   "Node %d NFS_Unstable: %8lu kB\n"
-		   "Node %d Bounce:   %8lu kB\n"
-		   "Node %d Slab: %8lu kB\n"
-		   "Node %d SReclaimable: %8lu kB\n"
-		   "Node %d SUnreclaim:   %8lu kB\n",
+		   "Node %d Dirty:  %8lu kB\n"
+		   "Node %d Writeback:  %8lu kB\n"
+		   "Node %d FilePages:  %8lu kB\n"
+		   "Node %d Mapped: %8lu kB\n"
+		   "Node %d AnonPages:  %8lu kB\n"
+		   "Node %d PageTables: %8lu kB\n"
+		   "Node %d NFS_Unstable:   %8lu kB\n"
+		   "Node %d Bounce: %8lu kB\n"
+		   "Node %d Slab:   %8lu kB\n"
+		   "Node %d SReclaimable:   %8lu kB\n"
+		   "Node %d SUnreclaim: %8lu kB\n",
 		   nid, K(i.totalram),
 		   nid, K(i.freeram),
 		   nid, K(i.totalram - i.freeram),
-		   nid, node_page_state(nid, NR_ACTIVE),
-		   nid, node_page_state(nid, NR_INACTIVE),
+		   nid, node_page_state(nid, NR_ACTIVE_ANON),
+		   nid, node_page_state(nid, NR_INACTIVE_ANON),
+		   nid, node_page_state(nid, NR_ACTIVE_FILE),
+		   nid, node_page_state(nid, NR_INACTIVE_FILE),
 #ifdef CONFIG_HIGHMEM
 		   nid, K(i.totalhigh),
 		   nid, K(i.freehigh),
--- linux-2.6.21.noarch/fs/proc/proc_misc.c.vmsplit	2007-07-05 12:06:14.0 -0400
+++ linux-2.6.21.noarch/fs/proc/proc_misc.c	2007-07-23 11:42:52.0 -0400
@@ -146,43 +146,47 @@ static int meminfo_read_proc(char *page,
 	 * Tagged format, for easy grepping and expansion.
 	 */
 	len = sprintf(page,
-		"MemTotal: %8lu kB\n"
-		"MemFree:  %8lu kB\n"
-		"Buffers:  %8lu kB\n"
-		"Cached:   %8lu kB\n"
-		"SwapCached:   %8lu kB\n"
-		"Active:   %8lu kB\n"
-		

RE: 2.6.23-rc1-git3 init failure

2007-07-28 Thread Sid Boyce

> Boot failure on x86_64 (64X2), says it can't find init, specifically
> /init. 2.6.23-rc1-git1 boots and runs successfully. I haven't tried
> -git2. I shall reboot on 2.6.23-rc1-git3 tomorrow and record the full
> message.
> Strings from vmlinux in both the above:-
>
> Kernel alive
> /dev/console
> <4>Warning: unable to open an initial console.
> <4>Failed to execute %s
> <4>Failed to execute %s.  Attempting defaults...
> /sbin/init
> /etc/init
> /bin/init
> /bin/sh
> No init found.  Try passing init= option to kernel.
>
> Tried option "init=/sbin/init" and got the same failure.
> Regards
> Sid.

I see the same problem with 2.6.23-rc1-git5.

Freeing unused kernel memory: 236k freed
failed to execute /init
kernel panic - not syncing: No init found. Try passing init= option to 
kernel


Copying /sbin/init to / results in the same error.
openSUSE 10.3Alpha6plus
# rpm -qf /sbin/init
sysvinit-2.86-90

Regards
Sid.

--
Sid Boyce ... Hamradio License G3VBV, Licensed Private Pilot
Emeritus IBM/Amdahl Mainframes and Sun/Fujitsu Servers Tech Support 
Specialist, Cricket Coach

Microsoft Windows Free Zone - Linux used for all Computing Tasks



Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]

2007-07-28 Thread Rik van Riel

Andrew Morton wrote:


What I think is killing us here is the blockdev pagecache: the pagecache
which backs those directory entries and inodes.  These pages get read
multiple times because they hold multiple directory entries and multiple
inodes.  These multiple touches will put those pages onto the active list
so they stick around for a long time and everything else gets evicted.

I've never been very sure about this policy for the metadata pagecache.  We
read the filesystem objects into the dcache and icache and then we won't
read from that page again for a long time (I expect).  But the page will
still hang around for a long time.

It could be that we should leave those pages inactive.


Good idea for updatedb.

However, it may be a bad idea for files that are often
written to.  Turning an inode write into a read plus a
write does not sound like such a hot idea, we really
want to keep those in the cache.

I think what you need is to ignore multiple references
to the same page when they all happen in one time
interval, counting them only if they happen in multiple
time intervals.

The use-once cleanup (which takes a page flag for PG_new,
I know...) would solve that problem.

However, it would introduce the problem of having to scan
all the pages on the list before a page becomes freeable.
We would have to add some background scanning (or a separate
list for PG_new pages) to make the initial pageout run use
an acceptable amount of CPU time.

Not sure that complexity will be worth it...
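
The idea of counting references only when they fall in different time
intervals can be sketched as below. This is an illustration under
assumptions, not the use-once cleanup patch: struct page_refs, page_touch and
page_should_activate are hypothetical names, and the threshold of two
intervals is chosen only for the example.

```c
/* Hypothetical per-page bookkeeping; not a real kernel structure. */
struct page_refs {
	unsigned long last_interval;	/* interval of the most recent touch */
	unsigned int intervals_seen;	/* distinct intervals with a touch */
};

/* Records a reference made during interval `now`. Repeated touches in
 * the same interval count once. */
static void page_touch(struct page_refs *pr, unsigned long now)
{
	if (pr->intervals_seen == 0 || pr->last_interval != now) {
		pr->intervals_seen++;
		pr->last_interval = now;
	}
}

/* A page qualifies for the active list only after references in at
 * least two distinct intervals. */
static int page_should_activate(const struct page_refs *pr)
{
	return pr->intervals_seen >= 2;
}
```

Under this scheme a burst of directory-entry lookups that all hit one
metadata page within a single interval leaves the page inactive, while a page
that is touched again in a later interval gets promoted.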

--
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.


Re: ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not

2007-07-28 Thread Danny ter Haar
Quoting Bartlomiej Zolnierkiewicz ([EMAIL PROTECTED]):
> Please retry with the latest -git kernel and if the problem is still
> there install git, get kernel tree and run git-bisect.

I ran over "make menuconfig" and did a few changes.

http://www.dth.net/kernel/config-2.6.23-rc1-git5

It boots, but freezes solid before doing any "work" as a firewall.
I was able to catch boot messages with netconsole

http://www.dth.net/kernel/via_output_2.6.23-rc1-git5

It didn't respond after the last line.
magic sysrq etc., all nada.

Will start to get acquainted with git ;-)

Danny
-- 


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread hui
On Sat, Jul 28, 2007 at 03:18:24PM -0700, Linus Torvalds wrote:
> I don't think anything was suppressed here.

I disagree. See below.

> You seem to say that more modular code would have helped make for a nicer 
> way to do schedulers, but if so, where were those patches to do that? 
> Con's patches didn't do that either. They just replaced the code.

They replaced code because he would have liked to have taken scheduler code
in possibly a completely different direction. This is a large conceptual
change from what is currently there. That might also mean how the notion of
bandwidth with regards to core frequency might be expressed in the system
with regards to power saving and other things. Things get dropped often
not because of pure technical reasons but because of personal preference
and the lack of willingness to ask where this might take us.

The way that Con works and conceptualizes things is quite a bit different
and more comprehensive in a lot of ways compared to how the regular kernel
community operates. He's strong in this area and weak in general kernel
hackery as a function of time and experience. That doesn't mean that he,
his ideas and his code should be subject to an either/or situation with the
scheduler and other ideas that have been rejected by various folks. He
maintained the -ck branch successfully for a long time and is a very capable
developer.

I do acknowledge that having a maintainer that you can trust is more
important, but it should not be exclusionary in this way. I totally
understand his reaction.

> In fact, Ingo's patches _do_ add some modularity, and might make it easier 
> to replace the scheduler. So it would seem that you would argue for CFS, 
> not against it?

It's not the same as sched plugin. Some folks might not like to use the
rbtree that's in place and express things in a completely different
manner. Take, for instance, Tong Li's stuff: with CFS it is a bit of a
conceptual mismatch with his attempt at expressing rebalancing in terms of
expiry rounds, yet it would be more seamlessly integrated with something like
either the old O(1) scheduler or Con's stuff. It's also the only method posted
to lkml that can deal with fairness across SMP situations with low error. Yet
what's happening here is that his implementation is being rejected for its
size and complexity, which stem from a conceptual mismatch in data structures.

Because of this, his notion of trio as a general method of getting
aggressive group fairness (by far the most complete conceptually on lkml,
over design is a different topic altogether) may never see the light of
day in Linux because of people's collective lack of foresight.

To answer the question that you posed, no. I'm not arguing against it. I'm
in favor of it going into the kernel like any deadline mechanism since
it can be generalized, but the current development processes in the Linux
kernel should not create an either/or situation with the scheduler code.
There have been multiple rejections of ideas with regards to the scheduler
code over the years that could have taken things in a very different and
possibly completely kick-ass direction, and that were suppressed because of
the development attitude of various Linux kernel developers.

It's only because of Con's work that there is all of a sudden a flurry of
development in this area, now that this idea has been shown to be superior,
and even then it's conceptually incomplete and subject to a lot of arbitrary
hacking. This is very different from Con's development style and mine as well.

This is an area that could have been addressed sooner if the general
community admitted that there was a problem earlier and permitted more
conscious and open change. I've seen changes in this area from Con be
rejected time and time again, which affected the technical direction in
which he originally wanted to take this.

Now, Con might have a communication problem here, but nobody asked to
clarify what he might have wanted and why, yet folks were very quick at
dismissing him and nitpicking him to death, even when he explained why he might
have wanted a particular change in the first place. This is the
"facilitation" part that's missing in the current kernel culture.

This is a very important idea as the community grows, because I see folks
that are capable of doing work get discouraged and locked out because of
code maintainability issues and an inability to get folks to move in that
direction because of a missing consensus mechanism in the community
other than sucking up to developers.

Con and folks like him should be permitted the opportunity to fail on
their own account. If Linux were truly open, it would have dealt with this
issue by now and there wouldn't be so much flamage from the general
community.

> > I think that's kind of a bogus assumption from the very get go. Scheduling
> > in Linux is one of the most unevolved systems in the kernel that still
> > could go through a large transformation and get big gains like what
> > we've had over the last few months. This evident with both 

Re: [PATCH] Fix lguest bzImage loading with CONFIG_RELOCATABLE=y

2007-07-28 Thread Rusty Russell
On Fri, 2007-07-27 at 12:45 +0200, Andi Kleen wrote:
> Rusty Russell <[EMAIL PROTECTED]> writes:
> 
> > Jason Yeh sent his crashing .config: bzImages made with
> > CONFIG_RELOCATABLE=y put the relocs where the BSS is expected, and we
> > crash with unusual results such as:
> 
> The normal kernel startup should already clear BSS. Why does
> this not work here? Can it be fixed?

Unfortunately, lguest doesn't go through the normal startup path (which
does this in asm).

Thanks,
Rusty.




Re: [PATCH] Framebuffer: Consolidated cleanup of pvr2fb.c for Sega Dreamcast

2007-07-28 Thread Adrian McMenamin
On 29/07/07, Adrian McMenamin <[EMAIL PROTECTED]> wrote:
> Tony,
>
> Second time attempt at this and a much better job I think.
>
Sorry, given I've just said this *does* work at 24bpp and 32bpp I'd
better clean up the Documentation patch...
diff --git a/Documentation/fb/pvr2fb.txt b/Documentation/fb/pvr2fb.txt
index 2bf6c23..0e4a3c6 100644
--- a/Documentation/fb/pvr2fb.txt
+++ b/Documentation/fb/pvr2fb.txt
@@ -9,14 +9,14 @@ one found in the Dreamcast.
 Advantages:
 
  * It provides a nice large console (128 cols + 48 lines with 1024x768)
-   without using tiny, unreadable fonts.
+   without using tiny, unreadable fonts (this size is NOT available on the 
+   Dreamcast)
  * You can run XF86_FBDev on top of /dev/fb0
  * Most important: boot logo :-)
 
 Disadvantages:
 
- * Driver is currently limited to the Dreamcast PowerVR 2 implementation
-   at the time of this writing.
+ * Driver is largely untested on non-Dreamcast systems.
 
 Configuration
 =
@@ -29,11 +29,13 @@ Accepted options:
 font:X- default font to use. All fonts are supported, including the
 SUN12x22 font which is very nice at high resolutions.
 
-mode:X- default video mode. The following video modes are supported:
-640x240-60, 640x480-60.
+mode:X- default video mode
+The following video modes are supported:
+[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] The Dreamcast
+	defaults to [EMAIL PROTECTED]
 	
 Note: the 640x240 mode is currently broken, and should not be
-used for any reason. It is only mentioned as a reference.
+used for any reason. It is only mentioned here as a reference.
 
 inverse   - invert colors on screen (for LCD displays)
 
@@ -49,13 +51,20 @@ cable:X   - cable type. This can be any of the following: vga, rgb, and
 output:X  - output type. This can be any of the following: pal, ntsc, and
 vga. If none is specified, we guess.
 
+Video mode may also be specified in the form:
+
+	   [xres]x[yres][-[EMAIL PROTECTED]
+
+	   eg [EMAIL PROTECTED]
+
+
 X11
 ===
 
-XF86_FBDev should work, in theory. At the time of this writing it is
-totally untested and may or may not even portray the beginnings of
-working. If you end up testing this, please let me know!
+XF86_FBDev has been shown to work on the Dremcast in the past - though not yet
+on any 2.6 series kernel.
 
 --
 Paul Mundt <[EMAIL PROTECTED]>
+Updated by Adrian McMenamin <[EMAIL PROTECTED]>
 
diff --git a/drivers/video/pvr2fb.c b/drivers/video/pvr2fb.c
index 3ac32f3..264b6a6 100644
--- a/drivers/video/pvr2fb.c
+++ b/drivers/video/pvr2fb.c
@@ -94,6 +94,7 @@
 #define DISP_DIWCONF (DISP_BASE + 0xe8)
 #define DISP_DIWHSTRT (DISP_BASE + 0xec)
 #define DISP_DIWVSTRT (DISP_BASE + 0xf0)
+#define DISP_PIXDEPTH (DISP_BASE + 0x108)
 
 /* Pixel clocks, one for TV output, doubled for VGA output */
 #define TV_CLK 74239
@@ -143,6 +144,7 @@ static struct pvr2fb_par {
 	unsigned char is_lowres;	/* Is horizontal pixel-doubling enabled? */
 
 	unsigned long mmio_base;	/* MMIO base */
+	u32 palette[16];
 } *currentpar;
 
 static struct fb_info *fb_info;
@@ -320,7 +322,7 @@ static int pvr2fb_setcolreg(unsigned int regno, unsigned int red,
 
 	if (regno > info->cmap.len)
 		return 1;
-
+	
 	/*
 	 * We only support the hardware palette for 16 and 32bpp. It's also
 	 * expected that the palette format has been set by the time we get
@@ -333,24 +335,25 @@ static int pvr2fb_setcolreg(unsigned int regno, unsigned int red,
 		  ((blue  & 0xf800) >> 11);
 
 		pvr2fb_set_pal_entry(par, regno, tmp);
-		((u16*)(info->pseudo_palette))[regno] = tmp;
 		break;
 	case 24: /* RGB 888 */
 		red >>= 8; green >>= 8; blue >>= 8;
-		((u32*)(info->pseudo_palette))[regno] = (red << 16) | (green << 8) | blue;
+		tmp = (red << 16) | (green << 8) | blue;
 		break;
 	case 32: /* ARGB  */
 		red >>= 8; green >>= 8; blue >>= 8;
 		tmp = (transp << 24) | (red << 16) | (green << 8) | blue;
 
 		pvr2fb_set_pal_entry(par, regno, tmp);
-		((u32*)(info->pseudo_palette))[regno] = tmp;
 		break;
 	default:
 		pr_debug("Invalid bit depth %d?!?\n", info->var.bits_per_pixel);
 		return 1;
 	}
 
+	if (regno < 16)
+		((u32*)(info->pseudo_palette))[regno] = tmp;
+
 	return 0;
 }
 
@@ -598,6 +601,7 @@ static void pvr2_init_display(struct fb_info *info)
 
 	/* bits per pixel */
 	fb_writel(fb_readl(DISP_DIWMODE) | (--bytesperpixel << 2), DISP_DIWMODE);
+	fb_writel(bytesperpixel << 2, DISP_PIXDEPTH);
 
 	/* video enable, color sync, interlace,
 	 * hsync and vsync polarity (currently unused) */
@@ -789,7 +793,7 @@ static int __devinit pvr2fb_common_init(void)
 	fb_info->fbops		= &pvr2fb_ops;
 	fb_info->fix		= pvr2_fix;
 	fb_info->par		= currentpar;
-	fb_info->pseudo_palette	= (void *)(fb_info->par + 1);
+	fb_info->pseudo_palette	= currentpar->palette;
 	fb_info->flags		= FBINFO_DEFAULT | FBINFO_HWACCEL_YPAN;
 
 	if (video_output == VO_VGA)
@@ -806,6 +810,8 @@ static int __devinit 

Re: Linus 2.6.23-rc1

2007-07-28 Thread Kasper Sandberg
On Sun, 2007-07-29 at 01:41 +0200, Volker Armin Hemmann wrote:
> Hi,
> 
> I never tried Con's patchset, for two reasons:
> I tried his 2.4 patches once, and I never saw any improvements. So when
> people were reporting huge improvements with his SD scheduler, I compared
> that with the reports of huge improvements with his 2.4 kernel patches.

Well, that's a reason if there ever was one...

> ...
> The second: too many patches. I only would have tried one or two, but the 
> ck-patchset is a lot bigger.. and I am a little bit uneasy about that.

So use only the scheduler? Nobody forces you to apply the whole patchset.

> 
> But I tried a lot of Ingo's cfs patches - and it was a very pleasant 
> experience. Ingo reacted very fast on my feedback and when I hit a problem he 
> really tried to find the cause and solve it - and it always was one patch, so 
> I felt a lot less scared ;)
> 
> My usual workload is very 'usual'. KDE desktop, kmail, konqueror, sometimes 
> xine or amarok providing some background noise while typing away in kate, 
> triplea, wesnoth or some other game when I need to 'rest' for a while. A lot 
> of compiling in the background, because I am one of these gentoo users.
> 
> With cfs the experience was much more pleasant than with the 'old' scheduler. 
> Compiling did not hurt as much as usual anymore - the only thing that hurts 
> is swap 
> 
> But there is another thing I do regularly: I play ut2004. Not every single 
> day, but sometimes several times a day. 20minutes of mayhem and then back to 
> the desktop.
> 
> And I do not see any problems with cfs and ut2004. The maximum FPS are indeed 
> a little bit lower (and you can argue that this really is not important if 
> the pre-game FPS in a level looking down on the floor go down from 390 to 
> 380FPS), but the minimum FPS went up!

Well, surely CFS is better than the old vanilla scheduler, also with 3D,
and if you have FPS that high, I doubt you will notice the effects that
I and others are seeing. It is not that it is bad, it's just not as good
as SD has shown to be possible.

> 
> In scenes when my system is fighting hard to provide the FPS, when the action 
> is high (like when fighting with half a dozen bots at a power node, while 
> some other bots are shooting into the mess) CFS is much better than the old 
> scheduler. It is a big difference if you get 6-10FPS or 15-25.
> (I am playing with maximum 'beautifullness' - I would be able to get a lot 
> more FPS, if I wanted, but I want a nice scenery and maximum visual 
> effects ...)
> 
> From my point of view 3D is a lot better with cfs. 

Better than old vanilla, yes, but better than SD? Well, you should give
it a try.

> 
> Now the question for all the people who are bashing cfs for its bad 3d 
> performance: what am I doing wrong?

As said, we never said CFS was worse than old vanilla, and we never said
it was BAD; we did however say it's not as good as SD :)

> 
> Glück Auf,
> Volker
> 


[PATCH] Framebuffer: Consolidated cleanup of pvr2fb.c for Sega Dreamcast

2007-07-28 Thread Adrian McMenamin
Tony,

Second attempt at this, and a much better job I think.

This patch consolidates your earlier patch, some cleanup of the
documentation and, crucially, some better handling of the pvr2
registers based on more up to date information.

Testing suggests that it works pretty well at 16bpp, 24bpp and 32bpp,
including proper rendering of the boot logo at all depths (previously
this was a bit broken even at 16bpp) and white-on-black text. Really
detailed testing (e.g. with X11) requires support for the maple bus,
which isn't currently available (that is the next project, assuming this
is okay), but I have no reason to think this is broken.

Incidentally, substituting DIRECTCOLOR for TRUECOLOR appears to break the driver.

Signed-off-by: Adrian McMenamin <[EMAIL PROTECTED]>
diff --git a/Documentation/fb/pvr2fb.txt b/Documentation/fb/pvr2fb.txt
index 2bf6c23..3d08551 100644
--- a/Documentation/fb/pvr2fb.txt
+++ b/Documentation/fb/pvr2fb.txt
@@ -9,14 +9,14 @@ one found in the Dreamcast.
 Advantages:
 
  * It provides a nice large console (128 cols + 48 lines with 1024x768)
-   without using tiny, unreadable fonts.
+   without using tiny, unreadable fonts (this size is NOT available on the 
+   Dreamcast)
  * You can run XF86_FBDev on top of /dev/fb0
  * Most important: boot logo :-)
 
 Disadvantages:
 
- * Driver is currently limited to the Dreamcast PowerVR 2 implementation
-   at the time of this writing.
+ * Driver is largely untested on non-Dreamcast systems.
 
 Configuration
 =
@@ -29,11 +29,15 @@ Accepted options:
 font:X- default font to use. All fonts are supported, including the
 SUN12x22 font which is very nice at high resolutions.
 
-mode:X- default video mode. The following video modes are supported:
-640x240-60, 640x480-60.
+mode:X- default video mode
+The following video modes are supported:
+[EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] The Dreamcast
+	defaults to [EMAIL PROTECTED] At the time of writing the 
+24bpp and 32bpp modes function poorly. Work to fix that is
+ongoing
 	
 Note: the 640x240 mode is currently broken, and should not be
-used for any reason. It is only mentioned as a reference.
+used for any reason. It is only mentioned here as a reference.
 
 inverse   - invert colors on screen (for LCD displays)
 
@@ -49,13 +53,20 @@ cable:X   - cable type. This can be any of the following: vga, rgb, and
 output:X  - output type. This can be any of the following: pal, ntsc, and
 vga. If none is specified, we guess.
 
+Video mode may also be specified in the form:
+
+	   [xres]x[yres][-[EMAIL PROTECTED]
+
+	   eg [EMAIL PROTECTED]
+
+
 X11
 ===
 
-XF86_FBDev should work, in theory. At the time of this writing it is
-totally untested and may or may not even portray the beginnings of
-working. If you end up testing this, please let me know!
+XF86_FBDev has been shown to work on the Dreamcast in the past - though not yet
+on any 2.6 series kernel.
 
 --
 Paul Mundt <[EMAIL PROTECTED]>
+Updated by Adrian McMenamin <[EMAIL PROTECTED]>
 
diff --git a/drivers/video/pvr2fb.c b/drivers/video/pvr2fb.c
index 3ac32f3..264b6a6 100644
--- a/drivers/video/pvr2fb.c
+++ b/drivers/video/pvr2fb.c
@@ -94,6 +94,7 @@
 #define DISP_DIWCONF (DISP_BASE + 0xe8)
 #define DISP_DIWHSTRT (DISP_BASE + 0xec)
 #define DISP_DIWVSTRT (DISP_BASE + 0xf0)
+#define DISP_PIXDEPTH (DISP_BASE + 0x108)
 
 /* Pixel clocks, one for TV output, doubled for VGA output */
 #define TV_CLK 74239
@@ -143,6 +144,7 @@ static struct pvr2fb_par {
 	unsigned char is_lowres;	/* Is horizontal pixel-doubling enabled? */
 
 	unsigned long mmio_base;	/* MMIO base */
+	u32 palette[16];
 } *currentpar;
 
 static struct fb_info *fb_info;
@@ -320,7 +322,7 @@ static int pvr2fb_setcolreg(unsigned int regno, unsigned int red,
 
 	if (regno > info->cmap.len)
 		return 1;
-
+	
 	/*
 	 * We only support the hardware palette for 16 and 32bpp. It's also
 	 * expected that the palette format has been set by the time we get
@@ -333,24 +335,25 @@ static int pvr2fb_setcolreg(unsigned int regno, unsigned int red,
 		  ((blue  & 0xf800) >> 11);
 
 		pvr2fb_set_pal_entry(par, regno, tmp);
-		((u16*)(info->pseudo_palette))[regno] = tmp;
 		break;
 	case 24: /* RGB 888 */
 		red >>= 8; green >>= 8; blue >>= 8;
-		((u32*)(info->pseudo_palette))[regno] = (red << 16) | (green << 8) | blue;
+		tmp = (red << 16) | (green << 8) | blue;
 		break;
 	case 32: /* ARGB  */
 		red >>= 8; green >>= 8; blue >>= 8;
 		tmp = (transp << 24) | (red << 16) | (green << 8) | blue;
 
 		pvr2fb_set_pal_entry(par, regno, tmp);
-		((u32*)(info->pseudo_palette))[regno] = tmp;
 		break;
 	default:
 		pr_debug("Invalid bit depth %d?!?\n", info->var.bits_per_pixel);
 		return 1;
 	}
 
+	if (regno < 16)
+		((u32*)(info->pseudo_palette))[regno] = tmp;
+
 	return 0;
 }
 
@@ -598,6 +601,7 @@ static void 

Re: ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not

2007-07-28 Thread Danny ter Haar
Quoting Bartlomiej Zolnierkiewicz ([EMAIL PROTECTED]):
> Should be harmless for now but we would like to fix it in the long term,
> please send "hdparm --Istdout /dev/hda" output.


I already made that available in the same subdir:

http://www.dth.net/kernel/

Output repeated here:


voyage:~# hdparm -I /dev/hda

/dev/hda:

ATA device, with non-removable media
	Model Number:       PQI IDE DiskOnModule
	Serial Number:
	Firmware Revision:  db01.20a
Standards:
	Likely used: 1
Configuration:
	hard sectored
	not MFM encoded
	head switch time > 15us
	fixed drive
	disk xfer rate > 5Mbs
	Logical		max	current
	cylinders	500	500
	heads		16	16
	sectors/track	32	32
	--
	bytes/track: 0	bytes/sector: 528
	CHS current addressable sectors:     256000
	LBA    user addressable sectors:     256000
	device size with M = 1024*1024:         125 MBytes
	device size with M = 1000*1000:         131 MBytes
Capabilities:
	LBA, IORDY not likely
	Buffer type: 0002: dual port, multi-sector
	Buffer size: 1.0kB	bytes avail on r/w long: 4
	Cannot perform double-word IO
	R/W multiple sector transfer: Max = 1	Current = 0
	DMA: not supported
	PIO: pio0 pio1 pio2

voyage:~# hdparm -tT /dev/hda
/dev/hda:
 Timing cached reads:   116 MB in  2.00 seconds =  57.92 MB/sec
 Timing buffered disk reads:   16 MB in  3.01 seconds =   5.32 MB/sec
# hdparm --Istdout /dev/hda

/dev/hda:
045a 01f4  0010  0210 0020 0003
e800  2020 2020 2020 2020 2020 2020
2020 2020 2020 2020 0002 0002 0004 6462
3031 2e32 3061 5051 4920 4944 4520 4469
736b 4f6e 4d6f 6475 6c65 2020 2020 2020
2020 2020 2020 2020 2020 2020 2020 0001
 0200  0200  0001 01f4 0010
0020 e800 0003 0100 e800 0003  
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       
       

> No IDE changes yet after 2.6.22-git17 so something else must have broken.

darn

I also tried the libata driver on the 2.6.22 series, but that didn't
work either. I contacted Alan Cox and he made some patches/suggestions
that made 2.6.22-git16 with libata work!
I hope his patch will make it into mainline soon.

For the dmesg output of this kernel, see:
http://www.dth.net/kernel/via_output_2.6.22-git16-libata_fix_alan_cox

> Please retry with the latest -git kernel and if the problem is still
> there install git, get kernel tree and run git-bisect.
> 
> Since 2.6.22-git17 works fine the initial "good" commit would be
> e51f802babc5e368c60fbfd08c6c11269c9253b0 and the initial "bad" one
> f695baf2df9e0413d3521661070103711545207a (for 2.6.23-rc1).
> 
> If you need some practical examples on using git-bisect, this one
> 
> http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/
> 
> is a very good one (IMO).

I will certainly read and try that.

> > VP_IDE: VIA vt8231 (rev 10) IDE UDMA100 controller on pci:00:11.1
> > ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:pio, hdb:pio
> > ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:pio
> > Marking TSC unstable due to: possible TSC halt in C2.
> > Time: acpi_pm clocksource has been installed.
> > hda: PQI IDE DiskOnModule, ATA DISK drive
> > hda: IRQ probe failed (0xfff2)
> 
> This message would indicate broken IRQ routing, however it is no
> longer present in the log for the kernel 2.6.23-rc1-git4 ([3]).

I booted with routeirq=pci, but it gave the same negative result.
 
> Interesting that the kernel 2.6.22-git17 (log [2]) doesn't use via82cxxx
> IDE host driver while the kernel 2.6.23-rc1-git4 (log [3]) does...?

The kernel configs are also in the above-mentioned web directory.

I basically copied the 2.6.22-git17 config and ran make oldconfig.


I did a diff -u on those config files and a lot really is different:

[snip]
-# ATA/IDE/MFM/RLL support
+# Device Drivers
 #
[snip]
 #
 # IDE chipset support/bugfixes
 #
+# CONFIG_BLK_DEV_TRM290 is not set
 

Re: [2/3] 2.6.23-rc1: known regressions v2

2007-07-28 Thread Bartlomiej Zolnierkiewicz
On Friday 27 July 2007, Michal Piotrowski wrote:

> IDE
> 
> Subject : ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not
> References  : http://lkml.org/lkml/2007/7/27/298
> Last known good : ?
> Submitter   : dth <[EMAIL PROTECTED]>
> Caused-By   : ?
> Handled-By  : ?
> Status  : unknown

No IDE changes after 2.6.22-git17.

Despite this I will try to follow this bugreport.

Thanks,
Bart


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Con Kolivas
Interesting... Trying to avoid reading email but with a flooded inbox it's 
quite hard to do.

A lot of useful discussion seems to have been generated in response to people's 
_interpretation_ of my interview rather than what I actually said. For 
example, everyone seems to think I quit because CFS was chosen over SD (hint: 
it wasn't). Since it's generating good discussion I'll otherwise leave it as 
is.


As a parting gesture, a couple of hints for CFS.

Since both aim for fairness, any difference in behaviour between CFS and SD 
would come down to the way they interpret "fair". CFS accounts for sleep 
time whereas SD does not; that would be the reason.

As for volanomark regressions, they're always the sched_yield implementation. 
SD addressed a similar regression a few months back.

Good luck.

-- 
-ck


Re: [PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-28 Thread Jesper Juhl
On 29/07/07, Satyam Sharma <[EMAIL PROTECTED]> wrote:
> Hi,
>
> On 7/29/07, Jesper Juhl <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > This patch makes sure we don't dereference a NULL pointer in
> > drivers/net/usb/pegasus.c::write_bulk_callback() in the initial
> > struct net_device *net = pegasus->net; assignment.
> > The existing code checks if 'pegasus' is NULL and bails out if
> > it is, so we better not touch that pointer until after that check.
> > [...]
> > diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
> > index a05fd97..04cba6b 100644
> > --- a/drivers/net/usb/pegasus.c
> > +++ b/drivers/net/usb/pegasus.c
> > @@ -768,11 +768,13 @@ done:
> >  static void write_bulk_callback(struct urb *urb)
> >  {
> > pegasus_t *pegasus = urb->context;
> > -   struct net_device *net = pegasus->net;
> > +   struct net_device *net;
> >
> > if (!pegasus)
> > return;
> >
> > +   net = pegasus->net;
> > +
> > if (!netif_device_present(net) || !netif_running(net))
> > return;
>
> Is it really possible that we're called into this function with
> urb->context == NULL? If not, I'd suggest let's just get rid of
> the check itself, instead.
>
I'm not sure. I am not very familiar with this code. I just figured
that moving the assignment is potentially a little safer and it is
certainly no worse than the current code, so that's a safe and
potentially beneficial change. Removing the check may be safe but I'm
not certain enough to make that call...
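
To make the ordering argument concrete, here is a minimal userspace
sketch of the same pattern. The types and the function name are
hypothetical stand-ins, not the real pegasus_t, struct net_device or
write_bulk_callback(); the only point is that the pointer is tested
before anything is read through it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct net_device and pegasus_t. */
struct netdev {
	int running;
};

struct adapter {
	struct netdev *net;
};

/* Returns 1 if the callback would carry on, 0 if it bails out early. */
static int callback_should_run(struct adapter *ad)
{
	struct netdev *net;

	if (!ad)		/* test the pointer first ... */
		return 0;

	net = ad->net;		/* ... and only then read through it */

	if (!net || !net->running)
		return 0;

	return 1;
}
```

If the caller can never pass NULL, the first test is dead code and could
go, which is the alternative raised in this thread; the sketch only
shows that keeping the test and initializing at the declaration do not
mix.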

-- 
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html


Re: ide problems: 2.6.22-git17 working, 2.6.23-rc1* is not

2007-07-28 Thread Bartlomiej Zolnierkiewicz

Hi,

On Friday 27 July 2007, dth wrote:
> I have a via mini-itx epia 5000 motherboard as a firewall.
> It has an ide device: PQI 128MB flash DOM [1] which just plugs
> directly into the ide connector on the motherboard.
> 2.6.22-git17 works, although it gives some warning 
> hda: set_drive_speed_status: status=0x51 { DriveReady SeekComplete Error }
> hda: set_drive_speed_status: error=0x04 { DriveStatusError }

Should be harmless for now but we would like to fix it in the long term,
please send "hdparm --Istdout /dev/hda" output.

> Full dmesg output [2]
> 
> Every 2.6.23-rc1 kernel I have compiled so far froze on me.
> Yesterday I tried 2.6.23-rc1-git3 and it booted until:

No IDE changes yet after 2.6.22-git17 so something else must have broken.

Please retry with the latest -git kernel and if the problem is still
there install git, get kernel tree and run git-bisect.

Since 2.6.22-git17 works fine the initial "good" commit would be
e51f802babc5e368c60fbfd08c6c11269c9253b0 and the initial "bad" one
f695baf2df9e0413d3521661070103711545207a (for 2.6.23-rc1).

If you need some practical examples on using git-bisect, this one

http://www.reactivated.net/weblog/archives/2006/01/using-git-bisect-to-find-buggy-kernel-patches/

is a very good one (IMO).

> VP_IDE: VIA vt8231 (rev 10) IDE UDMA100 controller on pci:00:11.1
> ide0: BM-DMA at 0xd000-0xd007, BIOS settings: hda:pio, hdb:pio
> ide1: BM-DMA at 0xd008-0xd00f, BIOS settings: hdc:pio, hdd:pio
> Marking TSC unstable due to: possible TSC halt in C2.
> Time: acpi_pm clocksource has been installed.
> hda: PQI IDE DiskOnModule, ATA DISK drive
> hda: IRQ probe failed (0xfff2)

This message would indicate broken IRQ routing, however it is no
longer present in the log for the kernel 2.6.23-rc1-git4 ([3]).

Interesting that the kernel 2.6.22-git17 (log [2]) doesn't use via82cxxx
IDE host driver while the kernel 2.6.23-rc1-git4 (log [3]) does...?

> Clocksource tsc unstable (delta = 84358771493 ns)
> 
> So i went back to 2.6.22-git17.
> This morning i saw a fresh git4 and compiled/installed that.
> This kernel actually booted! (once)
> I had a netconsole running to catch the lucky event [3]
> 
> After about 2 minutes of working however, the whole machine froze.
> No message at the console, not able to toggle numlock, no magic
> sysrq key features: solid frozen.
> After the power cycle i was not able to boot the same kernel
> ever again. It was just like a timer had run out and it wouldn't 
> work anymore. 
> 
> I know my hardware is "ancient" and the flash thing is probably
> not current. My firewall however is solar powered (10 watt 

Neat. :)

> power consumption) and it would be fun if i could run the latest/
> greatest kernels on it.

With the buggy commit number obtained from git-bisect it should be
fixed quite quickly.

> Config files and hdparm info on the DOM in the same directory as
> dmesg output.
> 
> Any comments appreciated.
> 
> Danny
> 
> [1] http://www.memorydepot.com/ssd_diskonmodule.asp
> [2] http://www.dth.net/kernel/via_output_2.6.22_git17
> [3] http://www.dth.net/kernel/via_output_2.6.23-rc1-git4

Thanks,
Bart


Re: [PATCH] ide: sis5513.c: Add FSC Amilo A1630 PCI subvendor/dev to laptops

2007-07-28 Thread Bartlomiej Zolnierkiewicz
On Friday 27 July 2007, David Lamparter wrote:
> [PATCH] ide: sis5513.c: Add FSC Amilo A1630 PCI subvendor/dev to laptops
> 
> Recognise the FSC Amilo A1630's incarnation of a SiS5513 chip as laptop to
> get UDMA100 support.
> 
> Signed-off-by: David Lamparter <[EMAIL PROTECTED]>

applied, thanks


Re: Linus 2.6.23-rc1

2007-07-28 Thread Volker Armin Hemmann
Hi,

I never tried Con's patchset, for two reasons:
I tried his 2.4 patches once, and I never saw any improvements. So when people 
were reporting huge improvements with his SD scheduler, I compared that with 
the reports of huge improvements with his 2.4 kernel patches.
...
The second: too many patches. I only would have tried one or two, but the 
ck-patchset is a lot bigger.. and I am a little bit uneasy about that.

But I tried a lot of Ingo's cfs patches - and it was a very pleasant 
experience. Ingo reacted very fast on my feedback and when I hit a problem he 
really tried to find the cause and solve it - and it always was one patch, so 
I felt a lot less scared ;)

My usual workload is very 'usual'. KDE desktop, kmail, konqueror, sometimes 
xine or amarok providing some background noise while typing away in kate, 
triplea, wesnoth or some other game when I need to 'rest' for a while. A lot 
of compiling in the background, because I am one of these gentoo users.

With cfs the experience was much more pleasant than with the 'old' scheduler. 
Compiling did not hurt as much as usual anymore - the only thing that hurts 
is swap 

But there is another thing I do regularly: I play ut2004. Not every single 
day, but sometimes several times a day. 20minutes of mayhem and then back to 
the desktop.

And I do not see any problems with cfs and ut2004. The maximum FPS are indeed 
a little bit lower (and you can argue that this really is not important if 
the pre-game FPS in a level looking down on the floor go down from 390 to 
380FPS), but the minimum FPS went up!

In scenes when my system is fighting hard to provide the FPS, when the action 
is high (like when fighting with half a dozen bots at a power node, while 
some other bots are shooting into the mess) CFS is much better than the old 
scheduler. It is a big difference if you get 6-10FPS or 15-25.
(I am playing with maximum 'beautifullness' - I would be able to get a lot 
more FPS, if I wanted, but I want a nice scenery and maximum visual 
effects ...)

From my point of view 3D is a lot better with cfs. 

Now the question for all the people who are bashing cfs for its bad 3d 
performance: what am I doing wrong?

Glück Auf,
Volker



Re: [PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-28 Thread Satyam Sharma
Hi,

On 7/29/07, Jesper Juhl <[EMAIL PROTECTED]> wrote:
> Hello,
>
> This patch makes sure we don't dereference a NULL pointer in
> drivers/net/usb/pegasus.c::write_bulk_callback() in the initial
> struct net_device *net = pegasus->net; assignment.
> The existing code checks if 'pegasus' is NULL and bails out if
> it is, so we better not touch that pointer until after that check.
> [...]
> diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
> index a05fd97..04cba6b 100644
> --- a/drivers/net/usb/pegasus.c
> +++ b/drivers/net/usb/pegasus.c
> @@ -768,11 +768,13 @@ done:
>  static void write_bulk_callback(struct urb *urb)
>  {
> pegasus_t *pegasus = urb->context;
> -   struct net_device *net = pegasus->net;
> +   struct net_device *net;
>
> if (!pegasus)
> return;
>
> +   net = pegasus->net;
> +
> if (!netif_device_present(net) || !netif_running(net))
> return;

Is it really possible that we're called into this function with
urb->context == NULL? If not, I'd suggest let's just get rid of
the check itself, instead.

Satyam


Re: [PATCH 2/3] core_pattern: allow passing of arguments to user mode helper when core_pattern is a pipe

2007-07-28 Thread Jeremy Fitzhardinge
Neil Horman wrote:
> Jeremy asked that I make a patch next week to address split_argv's requirement
> that the argc parameter be non-NULL.  I'll be fixing that next week, and what
> I can do is further enhance it such that it ignores spaces in quoted strings,
> which should address the case that concerns you.  I.e. I can make split_argv
> behave such that:
> echo "|\"foo bar\" --pid %p" > /proc/sys/kernel/core_pattern
> results in the following argv:
> {{"foo bar"}, {"--pid"}, {"1234"}}
>
> Which I think handles what you are looking for.
>   

No, please don't.  My original argv_split did that, and it was just way
too complex.  If you need complex quoting, you can always point it at a
shell script and handle it there.
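
For illustration, a minimal userspace sketch of the simple behaviour
argued for here: split on runs of whitespace and nothing else. The names
split_words() and free_words() are made up for this sketch and this is
not the kernel's argv_split(); the count pointer is optional, in line
with the non-NULL argc issue mentioned above.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Release an array returned by split_words(); accepts NULL. */
static void free_words(char **argv)
{
	char **p;

	if (!argv)
		return;
	for (p = argv; *p; p++)
		free(*p);
	free(argv);
}

/*
 * Split str on runs of whitespace, with no quoting or escaping.
 * Returns a NULL-terminated array, or NULL on allocation failure.
 * countp may be NULL if the caller does not need the word count.
 */
static char **split_words(const char *str, int *countp)
{
	const char *p;
	char **argv;
	int count = 0;
	int i;

	/* First pass: count the words. */
	for (p = str; *p; ) {
		while (isspace((unsigned char)*p))
			p++;
		if (!*p)
			break;
		count++;
		while (*p && !isspace((unsigned char)*p))
			p++;
	}

	/* calloc() zeroes the array, so it is always NULL-terminated. */
	argv = calloc((size_t)count + 1, sizeof(*argv));
	if (!argv)
		return NULL;

	/* Second pass: copy each word into its own allocation. */
	for (p = str, i = 0; i < count; i++) {
		const char *start;
		size_t len;

		while (isspace((unsigned char)*p))
			p++;
		start = p;
		while (*p && !isspace((unsigned char)*p))
			p++;
		len = (size_t)(p - start);

		argv[i] = malloc(len + 1);
		if (!argv[i]) {
			free_words(argv);
			return NULL;
		}
		memcpy(argv[i], start, len);
		argv[i][len] = '\0';
	}

	if (countp)
		*countp = count;
	return argv;
}
```

With this rule a pattern such as "\"foo bar\" --pid 1234" splits into
four words, quote characters included, which is exactly the case a
wrapper shell script would have to take care of.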

J


Re: [PATCH] lib: move kasprintf to a separate file

2007-07-28 Thread Jeremy Fitzhardinge
Sam Ravnborg wrote:
> kasprintf pulls in kmalloc which proved to be fatal for at least
> bootimage target on alpha.
> Move it to a separate file so only users of kasprintf are exposed
> to the dependency on kmalloc.
>   

OK by me (that's what my original patch did), but it might be worth
documenting what environmental constraints vsprintf.c is under.  I
didn't realize it was used by non-kernel code.

J


[PATCH][ACPI] Let's not gamble that a possible double free will never happen in asus_hotk_get_info()

2007-07-28 Thread Jesper Juhl
Hi,

The Coverity checker points out (CID: 1500) that we can in some 
cases end up doing a double free of 'model' in asus_hotk_get_info().
I'm not 100% sure it is right, but better safe than sorry, 
especially since this is so simple to turn into a non-issue - simply 
set 'model' to NULL after the first kfree() and then the second 
kfree() is harmless (if it actually can happen, and if it cannot
happen then the cost is just a single extra assignment).
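
As a rough userspace analogue of that idea (free() standing in for
kfree(), and a made-up struct and helper name rather than the real
union acpi_object), clearing the pointer right after the first free
turns any later free of the same variable into a no-op:

```c
#include <stdlib.h>

/* Hypothetical stand-in for the ACPI object; not the real union. */
struct object {
	int type;
};

/*
 * Free *objp and clear the caller's pointer, so that a later
 * release_object() or free() on the same variable does nothing.
 * free(NULL), like kfree(NULL), is defined to be a no-op.
 */
static void release_object(struct object **objp)
{
	free(*objp);
	*objp = NULL;
}
```

Calling release_object() twice on the same variable is then safe, at the
cost of one extra store per call.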

Here is the function with Coverity's annotations 
(proposed patch at the end of the mail)

...
/*
 * This function is used to initialize the hotk with right values. In this
 * method, we can make all the detection we want, and modify the hotk struct
 */
static int asus_hotk_get_info(void)
{
	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
	union acpi_object *model = NULL;
	int bsts_result;
	char *string = NULL;
	acpi_status status;

	/*
	 * Get DSDT headers early enough to allow for differentiating between
	 * models, but late enough to allow acpi_bus_register_driver() to fail
	 * before doing anything ACPI-specific. Should we encounter a machine,
	 * which needs special handling (i.e. its hotkey device has a different
	 * HID), this bit will be moved. A global variable asus_info contains
	 * the DSDT header.
	 */
	status = acpi_get_table(ACPI_SIG_DSDT, 1, &asus_info);
	if (ACPI_FAILURE(status))
		printk(KERN_WARNING "  Couldn't get the DSDT table header\n");

	/* We have to write 0 on init this far for all ASUS models */
	if (!write_acpi_int(hotk->handle, "INIT", 0, &buffer)) {
		printk(KERN_ERR "  Hotkey initialization failed\n");
		return -ENODEV;
	}

	/* This needs to be called for some laptops to init properly */
	if (!read_acpi_int(hotk->handle, "BSTS", &bsts_result))
		printk(KERN_WARNING "  Error calling BSTS\n");
	else if (bsts_result)
		printk(KERN_NOTICE "  BSTS called, 0x%02x returned\n",
		       bsts_result);

	/*
	 * Try to match the object returned by INIT to the specific model.
	 * Handle every possible object (or the lack of thereof) the DSDT
	 * writers might throw at us. When in trouble, we pass NULL to
	 * asus_model_match() and try something completely different.
	 */
	if (buffer.pointer) {

Event alias: aliasing "(buffer).pointer" with "model"
Also see events: [freed_arg][double_free]

		model = buffer.pointer;
		switch (model->type) {
		case ACPI_TYPE_STRING:
			string = model->string.pointer;
			break;
		case ACPI_TYPE_BUFFER:
			string = model->buffer.pointer;
			break;

At conditional (1): "default" taking true path

		default:

Event freed_arg: Pointer "model" freed by function "kfree"
Also see events: [alias][double_free]

			kfree(model);
			break;
		}
	}
	hotk->model = asus_model_match(string);

At conditional (2): "(hotk)->model == 23" taking false path

	if (hotk->model == END_MODEL) {	/* match failed */
		if (asus_info &&
		    strncmp(asus_info->oem_table_id, "ODEM", 4) == 0) {
			hotk->model = P30;
			printk(KERN_NOTICE
			       "  Samsung P30 detected, supported\n");
		} else {
			hotk->model = M2E;
			printk(KERN_NOTICE "  unsupported model %s, trying "
			       "default values\n", string);
			printk(KERN_NOTICE
			       "  send /proc/acpi/dsdt to the developers\n");
		}
		hotk->methods = &model_conf[hotk->model];
		return AE_OK;
	}
	hotk->methods = &model_conf[hotk->model];
	printk(KERN_NOTICE "  %s model detected, supported\n", string);

	/* Sort of per-model blacklist */

At conditional (3): "strncmp == 0" taking false path

	if (strncmp(string, "L2B", 3) == 0)
		hotk->methods->lcd_status = 

[PATCH 2/2] Wait for page writeback when directly reclaiming contiguous areas

2007-07-28 Thread Andy Whitcroft

From: Mel Gorman <[EMAIL PROTECTED]>

Lumpy reclaim works by selecting a lead page from the LRU list and then
selecting pages for reclaim from the order-aligned area of pages. In the
situation where all pages in that region are inactive and not referenced by
any process over time, it works well.

In the situation where there is even light load on the system, the pages may
not free quickly. Out of an area of 1024 pages, maybe only 950 of them are
freed when the allocation attempt occurs because lumpy reclaim returned early.
This patch alters the behaviour of direct reclaim for large contiguous blocks.

The first attempt to call shrink_page_list() is asynchronous but if it
fails, the pages are submitted a second time and the calling process waits
for the IO to complete. It'll retry up to 5 times for the pages to be
fully freed. This may stall allocators waiting for contiguous memory but
that should be expected behaviour for high-order users. It is preferable
behaviour to potentially queueing unnecessary areas for IO. Note that kswapd
will not stall in this fashion.
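
The two-pass behaviour described above can be modelled in userspace roughly as follows. This is only a sketch: struct fake_page, shrink_pages() and reclaim_block() are invented names, not kernel interfaces, and the simulated wait always succeeds.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum io_mode { IO_ASYNC, IO_SYNC };

struct fake_page {
    int under_writeback;    /* 1 while simulated IO is in flight */
    int freed;              /* 1 once the page has been reclaimed */
};

/* Stand-in for waiting on IO: in this model the wait always completes. */
static void wait_for_writeback(struct fake_page *p)
{
    p->under_writeback = 0;
}

/* One pass over the block; returns how many pages this pass freed. */
static size_t shrink_pages(struct fake_page *pages, size_t n, enum io_mode mode)
{
    size_t freed = 0;

    for (size_t i = 0; i < n; i++) {
        struct fake_page *p = &pages[i];

        if (p->freed)
            continue;
        if (p->under_writeback) {
            if (mode == IO_SYNC)
                wait_for_writeback(p);
            else
                continue;   /* async pass: skip it, keep the page */
        }
        p->freed = 1;
        freed++;
    }
    return freed;
}

/* Async first; only if the block is not fully free, repeat and wait. */
static size_t reclaim_block(struct fake_page *pages, size_t n)
{
    size_t freed = shrink_pages(pages, n, IO_ASYNC);

    if (freed < n)
        freed += shrink_pages(pages, n, IO_SYNC);
    return freed;
}
```

In the model, a block with some pages under writeback is only partly freed by the first pass and fully freed after the second, which is the trade-off the description accepts for high-order allocators.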

[[EMAIL PROTECTED]: update to version 2]
Signed-off-by: Mel Gorman <[EMAIL PROTECTED]>
Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>

Changelog:

Changes in V2:
 - remove retry loop
 - fix up active accounting (count deactivate events correctly)
 - use our own sync/async flag type
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 99ec7fa..1c21714 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -271,6 +271,12 @@ static void handle_write_error(struct address_space *mapping,
unlock_page(page);
 }
 
+/* Request for sync pageout. */
+typedef enum {
+   PAGEOUT_IO_ASYNC,
+   PAGEOUT_IO_SYNC,
+} pageout_io_t;
+
 /* possible outcome of pageout() */
 typedef enum {
/* failed to write page out, page is locked */
@@ -287,7 +293,8 @@ typedef enum {
  * pageout is called by shrink_page_list() for each dirty page.
  * Calls ->writepage().
  */
-static pageout_t pageout(struct page *page, struct address_space *mapping)
+static pageout_t pageout(struct page *page, struct address_space *mapping,
+   pageout_io_t sync_writeback)
 {
/*
 * If the page is dirty, only perform writeback if that write
@@ -346,6 +353,15 @@ static pageout_t pageout(struct page *page, struct address_space *mapping)
ClearPageReclaim(page);
return PAGE_ACTIVATE;
}
+
+   /*
+* Wait on writeback if requested to. This happens when
+* direct reclaiming a large contiguous area and the
+* first attempt to free a range of pages fails
+*/
+   if (PageWriteback(page) && sync_writeback == PAGEOUT_IO_SYNC)
+   wait_on_page_writeback(page);
+
if (!PageWriteback(page)) {
/* synchronous write or broken a_ops? */
ClearPageReclaim(page);
@@ -423,7 +439,8 @@ cannot_free:
  * shrink_page_list() returns the number of reclaimed pages
  */
 static unsigned long shrink_page_list(struct list_head *page_list,
-   struct scan_control *sc)
+   struct scan_control *sc,
+   pageout_io_t sync_writeback)
 {
LIST_HEAD(ret_pages);
struct pagevec freed_pvec;
@@ -458,8 +475,12 @@ static unsigned long shrink_page_list(struct list_head *page_list,
if (page_mapped(page) || PageSwapCache(page))
sc->nr_scanned++;
 
-   if (PageWriteback(page))
-   goto keep_locked;
+   if (PageWriteback(page)) {
+   if (sync_writeback == PAGEOUT_IO_SYNC)
+   wait_on_page_writeback(page);
+   else
+   goto keep_locked;
+   }
 
referenced = page_referenced(page, 1);
/* In active use or really unfreeable?  Activate it. */
@@ -505,7 +526,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto keep_locked;
 
/* Page is dirty, try to write it out here */
-   switch(pageout(page, mapping)) {
+   switch (pageout(page, mapping, sync_writeback)) {
case PAGE_KEEP:
goto keep_locked;
case PAGE_ACTIVATE:
@@ -786,7 +807,29 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
spin_unlock_irq(&zone->lru_lock);
 
nr_scanned += nr_scan;
-   nr_freed = shrink_page_list(&page_list, sc);
+   nr_freed = shrink_page_list(&page_list, sc, PAGEOUT_IO_ASYNC);
+
+   /*
+* If we are direct reclaiming for contiguous pages and we do
+

[PATCH 1/2] ensure we count pages transitioning inactive via clear_active_flags

2007-07-28 Thread Andy Whitcroft

We are transitioning pages from active to inactive in
clear_active_flags, those need counting as PGDEACTIVATE vm events.

Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]>
Acked-by: Mel Gorman <[EMAIL PROTECTED]>
---
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d419e10..99ec7fa 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -777,6 +777,7 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
 (sc->order > PAGE_ALLOC_COSTLY_ORDER)?
 ISOLATE_BOTH : ISOLATE_INACTIVE);
nr_active = clear_active_flags(&page_list);
+   __count_vm_events(PGDEACTIVATE, nr_active);
 
__mod_zone_page_state(zone, NR_ACTIVE, -nr_active);
__mod_zone_page_state(zone, NR_INACTIVE,
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 0/2] Synchronous Lumpy Reclaim V2

2007-07-28 Thread Andy Whitcroft
As pointed out by Mel, when reclaim is applied at higher orders a
significant amount of IO may be started.  As this takes finite time
to drain, reclaim will consider more areas than ultimately needed
to satisfy the request.  This leads to more reclaim than strictly
required and reduced success rates.

I was able to confirm Mel's test results on systems locally.
These show that even under light load the success rates drop off far
more than expected.  Testing with a modified version of his patch
(which follows) I was able to allocate almost all of ZONE_MOVABLE
with a near idle system.  I ran 5 test passes sequentially following
system boot (the system has 29 hugepages in ZONE_MOVABLE):

  2.6.23-rc1  11  8  6  7  7
  sync_lumpy v2   28 28 29 29 26

These show that, although hugely better than the near 0% success
normally expected, we can only allocate about 1/4 of the zone.
Using synchronous reclaim for these allocations we get close to 100%,
as expected.
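
For reference, the "about 1/4" and "close to 100%" figures follow directly from the table. A small sketch of the arithmetic, with mean_success_percent() being a made-up helper and 29 being the hugepage count quoted above:

```c
#include <stddef.h>

/*
 * Mean allocation success over several passes, as a percentage of the
 * pool size (29 hugepages in ZONE_MOVABLE in the runs quoted above).
 */
static double mean_success_percent(const int *allocated, size_t passes,
                                   int pool_size)
{
    long total = 0;

    for (size_t i = 0; i < passes; i++)
        total += allocated[i];
    return 100.0 * (double)total / ((double)passes * (double)pool_size);
}
```

Fed the two rows of the table, this gives roughly 27% for 2.6.23-rc1 (39 of 145 attempts) and roughly 97% for sync_lumpy v2 (140 of 145).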

I have also run our standard high order tests and these show no
regressions in allocation success rates at rest, and some significant
improvements under load.

Following this email are two patches, both should be considered as
bug fixes to lumpy reclaim:

ensure-we-count-pages-transitioning-inactive-via-clear_active_flags:
  this is a bug fix for Lumpy Reclaim fixing up a bug in VM Event
  accounting when it marks pages inactive, and

Wait-for-page-writeback-when-directly-reclaiming-contiguous-areas:
  updates reclaim making direct reclaim synchronous when applied
  at orders above PAGE_ALLOC_COSTLY_ORDER.

Patches against 2.6.23-rc1.  Andrew please consider for -mm and
for pushing to mainline.

-apw


Re: [patch] mm: reduce pagetable-freeing latencies

2007-07-28 Thread Benjamin Herrenschmidt

> The onstack array seems fine to me, even if you do end up deciding on
> an array of one.  Is there any evidence that it's a problem getting a
> page for the freeing (other than in circumstances that are already
> badly slowed down)?  It's obvious that we need a fallback route,
> but optimizing throughput on that route seems premature.

Hrm, no evidence of that so far indeed. I'm worried by the stack usage
of unmap_mapping_ranges(), but apart from that, no.

Apart from that, yeah, I suppose we can have a macro defining how many
on-stack backup entries we have and adjust it if we see that being a problem.
I'm not a fan of dynamic on-stack allocations.
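
As a rough userspace illustration of that idea (not the actual patch): a batch that normally lives in an allocated buffer but falls back to a small array whose size is fixed at compile time by a macro. NR_ONSTACK_BACKUP, struct free_batch and the batch_*() helpers are all hypothetical names, and the allocation failure is simulated with a flag.

```c
#include <stdlib.h>

#define NR_ONSTACK_BACKUP 8     /* hypothetical tunable, fixed at build time */

struct free_batch {
    void **slots;               /* heap buffer, or onstack[] as fallback */
    size_t cap;                 /* number of usable slots */
    size_t nr;                  /* slots currently filled */
    void *onstack[NR_ONSTACK_BACKUP];
};

/* Try the big buffer first; fall back to the fixed array if that fails. */
static void batch_init(struct free_batch *b, size_t want,
                       int simulate_alloc_failure)
{
    b->nr = 0;
    b->slots = simulate_alloc_failure ? NULL : malloc(want * sizeof(void *));
    if (b->slots) {
        b->cap = want;
    } else {
        b->slots = b->onstack;
        b->cap = NR_ONSTACK_BACKUP;
    }
}

/* Returns 1 if queued, 0 if the batch is full and must be flushed first. */
static int batch_add(struct free_batch *b, void *item)
{
    if (b->nr == b->cap)
        return 0;
    b->slots[b->nr++] = item;
    return 1;
}

static void batch_release(struct free_batch *b)
{
    if (b->slots != b->onstack)
        free(b->slots);
    b->slots = NULL;
    b->cap = b->nr = 0;
}
```

With a fixed macro the stack cost is known when the kernel is built, which is the property a dynamic on-stack allocation would give up.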

Ben.




Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Alex Besogonov

Linus Torvalds wrote:
> I personally feel that modal behaviour is bad, so it would introduce what 
> is in my opinion bad code, and likely result in problems not being found 
> and fixed as well (because people would pick the thing that "works for 
> them", and ignore the problems in the other module). 

I'm sorry, but this argument doesn't hold water. It was invoked years 
ago and turned out to be incorrect - the new CFS scheduler is not just a
fixed old scheduler, it's a completely redesigned one.

--
With respect,
Alex Besogonov ([EMAIL PROTECTED])



Re: sound is interrupting with new kernels

2007-07-28 Thread [EMAIL PROTECTED]
On 7/23/07, Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
> could you try CONFIG_HZ_1000 instead of the 250 you are using currently?
> Also, please enable CONFIG_SCHED_DEBUG to improve the output of
> cfs-debug-info.sh.
>
> Ingo
>
Hi, Ingo.
Sorry for the late response; I hadn't had an opportunity to reboot the
machine into the new kernel.
I've built 2.6.22 with CONFIG_HZ_1000 and CONFIG_PREEMPT - nothing changed =(.
Interestingly, in Totem (the GNOME video player) sound isn't interrupted
while watching video.
I'll try other kernels later to find out what works well in my case.
In gxine (xine-based) sound is interrupted too!

>firstly, could you check whether the ogg123/mpg321 console apps work
>without any audio skipping? If they work fine, does Amarok work fine?
>(Amarok is an X app that has a high-quality latency design - most other
>X based players are affected by X communication latencies.)

I'm not using Amarok but Audacious. It seems that everything is
all right with it.

I'll send more test results later.

Best wishes!

Dima


Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Jan Engelhardt wrote:
> 
> I generally run with CONFIG_HZ=100, CONFIG_NO_HZ=n, CONFIG_PREEMPT_NONE.

Ok, that HZ=100 is likely the worst case, as it effectively multiplies 
all the scheduler latencies by 10 (rather than by 4, which is what the 
default 250Hz does).

That said, I think most testing showed that the CFS scheduler tunables 
didn't have a huge amount of impact on how things felt, so that 
factor-of-ten might not even matter that much. The 3D game issues may well 
be totally elsewhere.

But it's certainly worth looking at.

Linus


[PATCH] USB Pegasus driver - avoid a potential NULL pointer dereference.

2007-07-28 Thread Jesper Juhl
Hello,

This patch makes sure we don't dereference a NULL pointer in 
drivers/net/usb/pegasus.c::write_bulk_callback() in the initial 
struct net_device *net = pegasus->net; assignment.
The existing code checks if 'pegasus' is NULL and bails out if 
it is, so we better not touch that pointer until after that check.

Please consider merging.


Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>
---

 drivers/net/usb/pegasus.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/net/usb/pegasus.c b/drivers/net/usb/pegasus.c
index a05fd97..04cba6b 100644
--- a/drivers/net/usb/pegasus.c
+++ b/drivers/net/usb/pegasus.c
@@ -768,11 +768,13 @@ done:
 static void write_bulk_callback(struct urb *urb)
 {
pegasus_t *pegasus = urb->context;
-   struct net_device *net = pegasus->net;
+   struct net_device *net;
 
if (!pegasus)
return;
 
+   net = pegasus->net;
+
if (!netif_device_present(net) || !netif_running(net))
return;
 



Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Bill Huey wrote:
> 
> My argument is that scheduler development is open ended. Although having
> a central scheduler to hack is a good thing, it shouldn't lock out or
> suppress development from other groups that might be trying to solve the
> problem in unique ways.

I don't think anything was suppressed here.

You seem to say that more modular code would have helped make for a nicer 
way to do schedulers, but if so, where were those patches to do that? 
Con's patches didn't do that either. They just replaced the code.

In fact, Ingo's patches _do_ add some modularity, and might make it easier 
to replace the scheduler. So it would seem that you would argue for CFS, 
not against it?

> I think that's kind of a bogus assumption from the very get go. Scheduling
> in Linux is one of the most unevolved systems in the kernel that still
> could go through a large transformation and get big gains like what
> we've had over the last few months. This is evident with both schedulers,
> both do well and it's a good thing overall the CFS is going in.
> 
> Now, the way it happened is completely screwed up in so many ways that I
> don't see how folks can miss it. This is not just Ingo versus Con, this
> is the current Linux community and how it makes decision from the top down
> and the current cultural attitude towards developers doing things that
> are:

I don't think so.

I think you're barking up the totally wrong tree here.

I think that what happened was very simple: somebody showed that we did 
badly and had benchmarks to show for it, and that in turn resulted in a 
huge spurt of coding from the people involved.

The fact that you think this is "broken" is interesting. I can point to a 
very real example of where this also happened, and where I bet you don't 
think the process was "broken".

Do you remember the mindcraft study?

Exact same thing. Somebody came in, and showed that Linux did really badly 
on some benchmark, and that an alternate approach was much better.

What happened? A huge spurt of development in a pretty short timeframe, 
that totally _obliterated_ the mindcraft results. 

It could have happened independently, but the fact is, it didn't. These 
kinds of events where somebody shows (with real numbers and code) that 
things can be done better really *are* a good way to do development, and 
it's how development generally ends up happening. It's hugely 
motivational, both because competition is motivational in itself, but also 
because somebody showing that things can be done so much better opens 
people's eyes to it.

And if you think the scheduler situation is different, why? Was it just 
because the mindcraft study compared against Windows NT, not another 
version of Linux patches?

The thing is, development is slow and gradual, but at the same time, it 
happens in spurts (btw, if you have ever studied evolution, you'll find 
the exact same thing: evolution is slow and gradual, but it also happens 
in sudden "spurts" where you have relatively much bigger changes happening 
because you get into a feedback loop).

Another comparison to evolution: most of the competitive pressure actually 
comes from the _same_ species, not from outside. It's not so much rabbits 
competing against foxes (although that happens too), quite a lot of it is 
rabbits competing against other rabbits!

Linus


Re: Linus 2.6.23-rc1

2007-07-28 Thread Jan Engelhardt

On Jul 28 2007 14:33, Linus Torvalds wrote:
>
>Btw, people who actually have 3D games installed (I have exactly one: 
>ppracer, and I can't really say that I care about how it feels), if you 
>don't have CONFIG_HZ=1000, this really is worth testing.
>
>I think Ingo probably ran with CONFIG_NO_HZ and HZ_1000, but the default 
>timer tick is actually 250Hz, which makes all the default scheduler values 
>come out four times bigger than they are documented/supposed to be.

I generally run with CONFIG_HZ=100, CONFIG_NO_HZ=n, CONFIG_PREEMPT_NONE.



Jan
-- 


Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]

2007-07-28 Thread Daniel Hazelton
On Saturday 28 July 2007 17:06:50 [EMAIL PROTECTED] wrote:
> On Sat, 28 Jul 2007, Daniel Hazelton wrote:
> > On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote:
> >> On Sat, 28 Jul 2007, Rene Herman wrote:
> >>> On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote:
> >>>> On Fri, 27 Jul 2007, Rene Herman wrote:
> >>>>> On 07/27/2007 07:45 PM, Daniel Hazelton wrote:
> >>
> >> nobody is arguing that swap prefetch helps in the second case.
> >
> > Actually, I made a mistake when tracking the thread and reading the code
> > for the patch and started to argue just that. But I have to admit I made
> > a mistake - the patch's author has stated (as Rene was kind enough to
> > point out) that swap prefetch can't help when memory is filled.
>
> I stand corrected, thanks for speaking up and correcting your position.

If you had made the statement before I decided to speak up you would have been 
correct :)

Anyway, I try to always admit when I've made a mistake - it's part of my 
philosophy. (There have been times when I haven't done it, but I'm trying to 
make that stop entirely.)

> >> what people are arguing is that there are situations where it helps for
> >> the first case. on some machines and versions of updatedb the nightly run
> >> of updatedb can cause both sets of problems. but the nightly updatedb
> >> run is not the only thing that can cause problems
> >
> > Solving the cache filling memory case is difficult. There have been a
> > number of discussions about it. The simplest solution, IMHO, would be to
> > place a (configurable) hard limit on the maximum size any of the kernel's
> > caches can grow to. (The only solution that was discussed, however, is a
> > complex beast)
>
> limiting the size of the cache is also the wrong thing to do in many
> situations. it's only right if the cache pushes out other data you care
> about, if you are trying to do one thing as fast as you can you really do
> want the system to use all the memory it can for the cache.

After thinking about this, you are partially correct. There are those sorts of 
situations where you want the system to use all the memory it can for caches. 
OTOH, if those situations could be described in some sort of simple 
heuristic, then a soft limit that uses that heuristic to determine when to 
let the cache expand could exploit the benefits of having both a limited and 
an unlimited cache. (And, potentially, if the heuristic has allowed a cache to 
expand beyond the limit then, when the heuristic shows the oversized cache is 
no longer necessary, it could trigger an automatic reclaim of that 
memory.)
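
To make the soft-limit idea a bit more concrete, here is a toy userspace model. Everything in it is hypothetical: struct soft_cache, the helper names, and in particular the heuristic_ok flag, which simply stands in for whatever heuristic would decide that the workload really wants a big cache.

```c
#include <stddef.h>

/* Toy model of a cache with a configurable soft limit. */
struct soft_cache {
    size_t used;        /* current size, in arbitrary units */
    size_t soft_limit;  /* the configurable limit */
};

/*
 * Grow the cache by 'want' units. Growth past the soft limit is only
 * allowed while the (hypothetical) heuristic says it is worthwhile.
 * Returns the amount actually granted.
 */
static size_t cache_grow(struct soft_cache *c, size_t want, int heuristic_ok)
{
    if (!heuristic_ok && c->used + want > c->soft_limit)
        return 0;
    c->used += want;
    return want;
}

/*
 * Once the heuristic no longer justifies an oversized cache, shrink it
 * back to the soft limit. Returns the amount reclaimed.
 */
static size_t cache_trim(struct soft_cache *c, int heuristic_ok)
{
    size_t excess;

    if (heuristic_ok || c->used <= c->soft_limit)
        return 0;
    excess = c->used - c->soft_limit;
    c->used = c->soft_limit;
    return excess;
}
```

The hard part, of course, is the heuristic itself; the model only shows where it would plug in.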

(I'm willing to help write and test code to do this exactly. There is no 
guarantee that I'll be able to help with more than testing - I don't 
understand the parts of the code involved all that well)

DRH

-- 
Dialup is like pissing through a pipette. Slow and excruciatingly painful.


Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Linus Torvalds wrote:
> 
> Yes, it's what "/proc/sys/kernel/sched_granularity_ns" is supposed to 
> tweak, but maybe there's some misfeature there, or maybe the default is 
> just bad for games, or whatever.
> 
> Ingo: that sysctl_sched_granularity initialization doesn't make sense. You 
> talk about it being in units of nanoseconds, but then you do
> 
>   20ULL/HZ
> 
> which is nonsensical.

Btw, people who actually have 3D games installed (I have exactly one: 
ppracer, and I can't really say that I care about how it feels), if you 
don't have CONFIG_HZ=1000, this really is worth testing.

I think Ingo probably ran with CONFIG_NO_HZ and HZ_1000, but the default 
timer tick is actually 250Hz, which makes all the default scheduler values 
come out four times bigger than they are documented/supposed to be.

On SMP, that scheduler granularity then gets doubled once more if you have 
two CPUs, so rather than 2ms by default, it ends up being 16ms (and the 
time slices themselves end up being bigger than that). 
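
The scaling described above can be written out as a small calculation. This is only a sketch of the arithmetic in this mail, assuming the documented 2ms default corresponds to HZ=1000, that the value scales inversely with HZ, and that it doubles on a two-CPU box; effective_granularity_ns() is a made-up helper, not a kernel function.

```c
/* Assumed from the text: 2 ms documented default at HZ=1000. */
#define DOCUMENTED_GRANULARITY_NS 2000000ULL
#define REFERENCE_HZ 1000ULL

/*
 * Effective default granularity in nanoseconds for a given timer tick
 * rate, optionally doubled for the two-CPU case.
 */
static unsigned long long effective_granularity_ns(unsigned int hz,
                                                   int two_cpus)
{
    unsigned long long ns = DOCUMENTED_GRANULARITY_NS * REFERENCE_HZ / hz;

    if (two_cpus)
        ns *= 2;
    return ns;
}
```

With hz=250 and two CPUs this gives 16000000 ns (four times the documented value, then doubled); with hz=100 on one CPU it gives 20000000 ns, ten times the documented value.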

So doing some testing with a simple

echo 200 > /proc/sys/kernel/sched_granularity_ns
echo 100 > /proc/sys/kernel/sched_batch_wakeup_granularity_ns
echo 800 > /proc/sys/kernel/sched_runtime_limit_ns

might be worth doing (and if you vary numbers to see if it matters, 
please do let people know!)

Linus


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread hui
On Sat, Jul 28, 2007 at 11:06:09PM +0200, Diego Calleja wrote:
> So your argument is that SD shouldn't have been merged either, because it
> would have resulted in one scheduler over the other?

My argument is that scheduler development is open ended. Although having
a central scheduler to hack is a good thing, it shouldn't lock out or
suppress development from other groups that might be trying to solve the
problem in unique ways.

This can be accomplished in a couple of ways:

1) scheduler modularity

Clearly Con is highly qualified to experiment with scheduler code and
this should be technically facilitated by some means, if not by a maintainer.
He's only a part-time maintainer and nobody helped him with this stuff,
nor did they try to understand what his scheduler was trying to do, other
than Tong Li.

2) better code modularity

Now, cleaner code would help with this a lot. If that was in place, we
might not need (1) and a pluggable scheduler. It would limit the amount
of refactoring for folks so that their code can drop in more easily. There's
such a significant amount of churn that it locks out developers by default,
since they have to constantly clean up the code in question while another
developer can commit without consideration of how it affects others.
That's their right as a maintainer, but also as a maintainer, they should
give a proper amount of consideration to how others might intend to extend
the code so that development remains "inclusive".

This notion of "open source, open development" is false when working
under those circumstances.

> > were capable but one is locked out now because of the choices of
> > current high level kernel developers in Linux.
> 
> Well, there are two schedulers...it's obvious that "high level kernel
> developers" needed to choose one.

I think that's kind of a bogus assumption from the very get go. Scheduling
in Linux is one of the most unevolved systems in the kernel that still
could go through a large transformation and get big gains like what
we've had over the last few months. This is evident with both schedulers,
both do well and it's a good thing overall the CFS is going in.

Now, the way it happened is completely screwed up in so many ways that I
don't see how folks can miss it. This is not just Ingo versus Con, this
is the current Linux community and how it makes decision from the top down
and the current cultural attitude towards developers doing things that
are:

1) architecturally significant

which they will get flamed to death for by the established Linux kernel
culture before they can get any users to report bugs after their posting
on lkml.

2) conceptually different

which is subject to the reasons above, but also gets flamed to death unless
it comes from folks internal to the Linux development processes.

When groups get to a certain size, as this one has, there needs to be a
revision of development processes so that they can scale and be "inclusive"
to the overall spirit of the Linux development process. When that breaks down,
we get situations like what we have with Con leaving development. Other
developers like me get turned off by the situation, feel the same as
Con and stop Linux development. That's my current status as well.

> The main problem is clearly that no scheduler was clearly better than the
> other. This reminds me of the LVM2/MD vs EVMS in the 2.5 days - both
> of them were good enough, but only one of them could be merged. The
> difference is that EVMS developers didn't get that annoyed, and not only
> did they not quit but they continued developing their userspace tools to
> make it work with the solution included in the kernel

That's a good point to have folks not go down that particular path. But
Con was kind of put down during the arguments with Ingo about his
assumptions of the problems and then was personally crapped on by having
his own idea undergo a complete reversal in opinion by Ingo, with
Ingo then doing his own version of Con's work, displacing him.

How would you feel in that situation ? I'd be pretty damn pissed.

[For the record, Peter Zijlstra did the same thing to me, which is annoying,
but since he's my buddy, didn't get as rude as in the above situation, and
included me in every private mail about his work so that I don't feel like RH
is paying him to undermine my brilliance, it's ok :)]

bill



Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Diego Calleja wrote:

> On Sat, 28 Jul 2007 11:05:25 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]> 
> wrote:
> > 
> > So "modal" things are good for fixing behaviour in the short run. But they 
> > are a total disaster in the long run, and even in the short run they tend 
> > to have problems (simply because there will be cases that straddle the 
> > line, and show some of _both_ issues, and now *neither* mode is the right 
> > one)
> 
> I fully agree with this, but plugsched could have avoided this useless
> "division" on the topic of SD vs CFS. IMO that counts as an advantage, too ;)

Sure. I actually think it's a huge advantage (see the ManagementStyle file 
on pissing people off), but at the same time, I don't like playing 
politics with technology. The kernel is a technical project, and I make 
technical decisions.

So I absolutely detest adding code for "political" reasons.

I personally feel that modal behaviour is bad, so it would introduce what 
is in my opinion bad code, and likely result in problems not being found 
and fixed as well (because people would pick the thing that "works for 
them", and ignore the problems in the other module). So while I don't like 
making irreversible decisions (and the choice of CFS wasn't irreversible 
in itself, but if it pisses off Con, _that_ is generally not reversible), 
I dislike even more making a half-assed decision.

So rather than making a choice at all, my other choice would have been to 
not merge _either_ scheduler, and let people just continue to fight it 
out. Would that have made people happier? I seriously doubt it.

Linus


Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]

2007-07-28 Thread david

On Sat, 28 Jul 2007, Daniel Hazelton wrote:

> On Saturday 28 July 2007 04:55:58 [EMAIL PROTECTED] wrote:
>> On Sat, 28 Jul 2007, Rene Herman wrote:
>>> On 07/27/2007 09:43 PM, [EMAIL PROTECTED] wrote:
>>>> On Fri, 27 Jul 2007, Rene Herman wrote:
>>>>> On 07/27/2007 07:45 PM, Daniel Hazelton wrote:
>>
>> nobody is arguing that swap prefetch helps in the second case.
>
> Actually, I made a mistake when tracking the thread and reading the code for
> the patch and started to argue just that. But I have to admit I made a
> mistake - the patch's author has stated (as Rene was kind enough to point
> out) that swap prefetch can't help when memory is filled.

I stand corrected, thanks for speaking up and correcting your position.

>> what people are arguing is that there are situations where it helps for
>> the first case. on some machines and versions of updatedb the nightly run of
>> updatedb can cause both sets of problems. but the nightly updatedb run is
>> not the only thing that can cause problems
>
> Solving the cache filling memory case is difficult. There have been a number
> of discussions about it. The simplest solution, IMHO, would be to place a
> (configurable) hard limit on the maximum size any of the kernel's caches can
> grow to. (The only solution that was discussed, however, is a complex beast)

limiting the size of the cache is also the wrong thing to do in many 
situations. it's only right if the cache pushes out other data you care 
about, if you are trying to do one thing as fast as you can you really do 
want the system to use all the memory it can for the cache.


David Lang


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Jory A. Pratt
On Fri, 2007-07-27 at 19:35 -0700, Linus Torvalds wrote:

> As a long-term maintainer, trust me, I know what matters. And a person who 
> can actually be bothered to follow up on problem reports is a *hell* of a 
> lot more important than one who just argues with reporters.
> 
>   Linus
Once again Linus blows a nut getting off about this and that. The fact
of the matter is that Linus is one-sided. The fact is Linus says what he
wants and people think he is god. The fact is no one gets code in unless
they are a major player in a Linux distro. Ingo had much advantage by using
Fedora users. The fact that Con did not take all bugs seriously, yes, that
is a player of the game, but Linus is GOD so all bow before him before he
blows his back out while jacking off to his rants about how the kernel
and other projects should run.

Jory



Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Diego Calleja
On Sat, 28 Jul 2007 13:07:05 -0700, Bill Huey (hui) <[EMAIL PROTECTED]> 
wrote:

> of how crappy X is. This is an open argument on how to solve, but it
> should not have resulted in really one scheduler over the other. Both

So your argument is that SD shouldn't have been merged either, because it
would have resulted in one scheduler over the other?

> were capable but one is locked out now because of the choices of
> current high level kernel developers in Linux.

Well, there are two schedulers...it's obvious that "high level kernel
developers" needed to choose one.

The main problem is clearly that no scheduler was clearly better than the
other. This reminds me of the LVM2/MD vs EVMS in the 2.5 days - both
of them were good enough, but only one of them could be merged. The
difference is that EVMS developers didn't get that annoyed, and not only
did they not quit but they continued developing their userspace tools to
make it work with the solution included in the kernel
(http://lwn.net/Articles/14714/)


Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]

2007-07-28 Thread david

On Sat, 28 Jul 2007, Alan Cox wrote:


It is. Prefetched pages can be dropped on the floor without additional I/O.


Which is essentially free for most cases. In addition your disk access
may well have been in idle time (and should be for this sort of stuff)
and if it was in the same chunk as something nearby was effectively free
anyway.


as I understand it the swap-prefetch only kicks in if the device is idle


Actual physical disk ops are a precious resource and anything that mostly
reduces the number will be a win - not to say swap prefetch is the right
answer, but accidentally or otherwise there are good reasons it may happen
to help.

Bigger more linear chunks of writeout/readin is much more important I
suspect than swap prefetching.


I'm sure this is true while you are doing the swapout or swapin and the 
system is waiting for it. but with prefetch you may be able to avoid doing 
the swapin at a time when the system is waiting for it by doing it at a 
time when the system is otherwise idle.


David Lang


Re: RFT: updatedb "morning after" problem [was: Re: -mm merge plans for 2.6.23]

2007-07-28 Thread david

On Sat, 28 Jul 2007, Rene Herman wrote:


On 07/28/2007 10:55 AM, [EMAIL PROTECTED] wrote:


 it looks to me like unless the code was really bad (and after 23 months in
 -mm it doesn't sound like it is)


Not to sound pretentious or anything but I assume that Andrew has a fairly 
good overview of exactly how broken -mm can be at times. How many -mm users 
use it anyway? He himself said he's not convinced of usefulness having not 
seen it help for him (and notice that most developers are also users), turned 
it off due to it annoying him at some point and hasn't seen a serious 
investigation into potential downsides.


if that was the case then people should be responding to the request to 
get it merged with 'but it caused problems for me when I tried it'


I haven't seen any comments like that.


 that the only significant con left is the potential to mask other
 problems.


Which is not a madeup issue, mind you. As an example, I just now tried GNU 
locate and saw it's a complete pig and specifically unsuitable for the low 
memory boxes under discussion. Upon completion, it actually frees enough 
memory that swap-prefetch _could_ help on some boxes, while the real issue is 
that they should first and foremost dump GNU locate.


I see the conclusion as being exactly the opposite.

here is a workload with some badly designed userspace software that the 
kernel can make much more pleasant for users.


arguing that users should never use badly designed software in userspace 
doesn't seem like an argument that will gain much traction. I'm not saying 
the kernel needs to fix the software itself (a la the sched_yield issues), 
but the kernel should try to keep such software from hurting the rest of 
the system where it can.


in this case it can't help it while the bad software is running, but it 
could minimize the impact after it finishes.



 however there are many legitimate cases where it is definitely doing the
 right thing (swapout was correct in pushing out the pages, but now the
 cause of that pressure is gone). the amount of benefit from this will vary
 from situation to situation, but it's not reasonable to claim that this
 provides no benefit (you have benchmark numbers that show it in synthetic
 benchmarks, and you have user reports that show it in the real world)


I certainly would not want to argue anything of the sort no. As said a few 
times, I agree that swap-prefetch makes sense and has at least the potential 
to help some situations that you really wouldn't even want to try and fix any 
other way, simply because nothing's broken.


so there is a legitimate situation where swap-prefetch will help 
significantly, what is the downside that prevents it from being included? 
(reading this thread it sometimes seems like the downside is that updatedb 
shouldn't cause this problem and so if you fixed updatedb there would be no 
legitimate benefit, or alternately this patch doesn't help updatedb so 
there's no legitimate benefit)



 there are lots of things in the kernel whose job is to pre-fill the memory
 with data that may (or may not) be useful in the future. this is just
 another method of filling the cache. it does so by saying "the user
 wanted these pages in the recent past, so it's a reasonable guess to say
 that the user will want them again in the future"


Well, _that_ is what the kernel is already going to great lengths at doing, 
and it decided that those pages us poor overnight OO.o users want in the 
morning weren't reasonable guesses. The kernel also won't any time soon be 
reading our minds, so any solution would need either user intervention (we 
could devise a way to tell the kernel "hey ho, I consider these pages to be 
very important -- try not to swap them out" possibly even with a "and if you 
do, please pull them back in when possible") or we can let swap-prefetch do 
the "just in case" thing it is doing.


it's not that they shouldn't have been swapped out (they should have 
been), it's that the reason they were swapped out no longer exists.


While swap-prefetch may not be the be all end all of solutions I agree that 
having a machine sit around with free memory and applications in swap seems 
not too useful if (as is the case) fetched pages can be dropped immediately 
when it turns out swap-prefetch made the wrong decision.


So that's for the concept. As to implementation, if I try and look at the 
code, it seems to be trying hard to really be free and as such, potential 
downsides seem limited. It's a rather core concept though and as such needs 
someone with a _lot_ more VM clue to ack. Sorry for not knowing, but who's 
maintaining/submitting the thing now that Con's not? He or she should 
preferably address any concerns it seems.


I've seen it mentioned that there is still a maintainer but I missed who 
it is, but I haven't seen any concerns that can be addressed, they all 
seem to be 'this is a core concept, people need to think about it' or 'but 
someone may find 

Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Jan Engelhardt

On Jul 28 2007 22:51, Diego Calleja wrote:
>On Sat, 28 Jul 2007 11:05:25 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>> So "modal" things are good for fixing behaviour in the short run. But they 
>> are a total disaster in the long run, and even in the short run they tend 
>> to have problems (simply because there will be cases that straddle the 
>> line, and show some of _both_ issues, and now *neither* mode is the right 
>> one)
>
>I fully agree with this, but plugsched could have avoided this useless
>"division" on the topic of SD vs CFS. IMO that counts as an advantage, too ;)
>

It's like CONFIG_HZ - more or less often debated, and now we have everyone
happy by giving them the choice.




Jan
-- 

Re: keyboard stopped working after de9ce703c6b807b1dfef5942df4f2fadd0fdb67a

2007-07-28 Thread Vojtech Pavlik
On Tue, Jul 17, 2007 at 03:52:07PM -0400, Dmitry Torokhov wrote:
> Hi Christoph,
> 
> On 7/17/07, Christoph Pfister <[EMAIL PROTECTED]> wrote:
> >
> >Yep, attached (cold reboot, not pressing any keys, 2.6.21).
> >
> 
> Ok, I see. You don't use PS/2 mouse and so BIOS told us that mouse is
> absent and reassigned IRQ12 to EHCI controller. However we do not
> listen to BIOS on i386 (for historical reasons) and proceed with
> registering AUX port... Now IRQ12 is shared between AUX port and EHCI
> and the keyboard controller is unhappy whereas before (with polling
> timer) it would release IRQ12 and close port...

Here I should add that IRQ sharing between ISA/LPC where i8042 lives and
PCI where EHCI lives doesn't work - only one of the sides will ever be
able to trigger interrupts, depending on the bridge config.

> Does your keyboard start working if you boot with i8042.noaux?

-- 
Vojtech Pavlik
Director SuSE Labs


[PATCH] add a missing LIB_Y to arch/alpha/boot Makefile

2007-07-28 Thread Meelis Roos
> kasprintf pulls in kmalloc which proved to be fatal for at least
> bootimage target on alpha.
> Move it to a separate file so only users of kasprintf are exposed
> to the dependency on kmalloc.

With the addition of the missing $(LIBS_Y) it really seems to compile now. 
I cannot boot test it before some time next week.

Add $(LIBS_Y) to get lib/lib.a so srm_printk is present.

Signed-off-by: Meelis Roos <[EMAIL PROTECTED]>

diff --git a/arch/alpha/boot/Makefile b/arch/alpha/boot/Makefile
index e1ae14c..cd14388 100644
--- a/arch/alpha/boot/Makefile
+++ b/arch/alpha/boot/Makefile
@@ -104,7 +104,7 @@ OBJ_bootlx   := $(obj)/head.o $(obj)/main.o
 OBJ_bootph   := $(obj)/head.o $(obj)/bootp.o
 OBJ_bootpzh  := $(obj)/head.o $(obj)/bootpz.o $(obj)/misc.o
 
-$(obj)/bootloader: $(obj)/bootloader.lds $(OBJ_bootlx) FORCE
+$(obj)/bootloader: $(obj)/bootloader.lds $(OBJ_bootlx) $(LIBS_Y) FORCE
 	$(call if_changed,ld)
 
 $(obj)/bootpheader: $(obj)/bootloader.lds $(OBJ_bootph) $(LIBS_Y) FORCE

-- 
Meelis Roos ([EMAIL PROTECTED])


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Diego Calleja
On Sat, 28 Jul 2007 11:05:25 -0700 (PDT), Linus Torvalds <[EMAIL PROTECTED]> wrote:

> So "modal" things are good for fixing behaviour in the short run. But they 
> are a total disaster in the long run, and even in the short run they tend 
> to have problems (simply because there will be cases that straddle the 
> line, and show some of _both_ issues, and now *neither* mode is the right 
> one)


I fully agree with this, but plugsched could have avoided this useless
"division" on the topic of SD vs CFS. IMO that counts as an advantage, too ;)


Re: kernel panic w/ 2.6.22.1, VIA EPIA Mini ITX

2007-07-28 Thread Herbert Rosmanith
> Checked the RAM on the box? Kinda weird if you're getting VRAM corruption, I 
> wonder if this is due to the RAM failing at the point where the framebuffer 
> is mapped?

you are probably right... I removed one of the RAMs, no crash
anymore. This is not ... nice. The manual says that the board can
handle up to 1GB (2x512MB). I don't think the RAM itself is damaged --
these are brand-new Kingston "ValueRam" PC133 SD-RAMs with lifelong 
guarantee. I wonder if this behaviour only appears with my particular
board, or if all VIA EPIA Mini ITX 500 are affected (I've run out
of boards to test ...)

thanks,
herp

> Try running memtest86 on it.

good idea.



Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, jos poortvliet wrote:
>
> Your point here seems to be: this is how it went, and it was right. Ok, got 
> that.

But I wanted to bring out more than what you make sound like "that's what 
happened, deal with it". I tried to explain _why_ the choices that were 
made were in fact made.

And it's a (I think) important thing for people to be aware of. The fact 
is, "personality" and "work with the other developers" is a big issue.

You cannot just go off and do your own thing in your private world, and 
then expect it to be accepted without any discussion or other people 
showing up and doing alternate things. That's _especially_ true in an area 
that has a respected and working maintainer.

>Yet, Con walked away (and not just over SD). Seeing Con go, I wonder 
> how many did leave without this splash.

We've had people go with a splash before. Quite frankly, the current 
scheduler situation looks very much like the CML2 situation. Anybody 
remember that? The developer there also got rejected, the improvement was 
made differently (and much more in line with existing practices and 
maintainership), and life went on. Eric Raymond, however, left with a 
splash.

It's not common, but it's not unheard of. Anybody who thinks that 
developers don't have huge egos probably hasn't ever met a software 
engineer. And I suspect kernel people have bigger egos than most. No 
wonder there are clashes every once in a while - it's a wonder there 
aren't _more_ of them.

> How and why? And is it due to a deeper problem?

Well, one part of it is that the way to make changes in the kernel 
community is to do them incrementally.

Small and incremental improvements are much easier to merge. If you go off 
and rewrite a subsystem, you shouldn't expect it to get merged, at least 
not unless it can live side-by-side with the old one (the new firewire 
stack is an example of that, and most filesystems are this way too). And 
the closer to some central part you get, the harder that gets.

So the *bulk* of the kernel stuff can be handled either incrementally, or 
side-by-side, and as a result, you actually seldom see issues like this. 
The kernel is extremely modular, and a large reason for that is exactly to 
avoid couplings.

Some (very few) things cannot be done incrementally. That's why I bring 
up CML2 as a fairly good example of this having happened before. Some 
things require flag-days. But you should pretty much *assume* that if 
there is a flag-day, and if there is a maintainer, the maintainer has to 
be involved.

Does "maintainership" give infinite powers? No. I'll take patches that 
bypass maintainers, but there needs to be some reason for them (ie in some 
sense the maintainer needs to have done a bad job, or the patch just needs 
to be trivial enough - or it cuts across maintainership areas - that it's 
not even _worth_ going through all maintainers).

So maintainers aren't "everything". But they are important. You can't just 
ignore them and go do your own thing, and then expect it to be merged.

Linus


Re: Volanomark slows by 80% under CFS

2007-07-28 Thread Dave Jones
On Fri, Jul 27, 2007 at 10:47:21PM -0400, Rik van Riel wrote:
 > Tim Chen wrote:
 > > Ingo,
 > > 
 > > Volanomark slows by 80% with CFS scheduler on 2.6.23-rc1.  
 > > Benchmark was run on a 2 socket Core2 machine.
 > > 
 > > The change in scheduler treatment of sched_yield 
 > > could play a part in changing Volanomark behavior.
 > > In CFS, sched_yield is implemented
 > > by dequeueing and requeueing a process.  The time a process 
 > > has spent running probably reduced the cpu time due it 
 > > by only a bit. The process could get re-queued pretty close
 > > to head of the queue, and may get scheduled again pretty
 > > quickly if there is still a lot of cpu time due.  
 > 
 > I wonder if this explains the 30% drop in top performance
 > seen with the MySQL sysbench benchmark when the scheduler
 > changed to CFS...
 > 
 > See http://people.freebsd.org/~jeff/sysbench.png

 From the author's blog when he did that graph:
 http://jeffr-tech.livejournal.com/10103.html

"So I updated the image for the second time today to include Ingo's cfs
 scheduler. This kernel is from the rpm on his website. I double checked
 that it was not using tcmalloc at the time and switching back to a
 2.6.21 kernel returned to the expected perf.

 Basically, it has the same performance as the FreeBSD 4BSD scheduler
 now. Which is to say the peak is terrible but it has virtually no
 dropoff and performs better under load than the default 2.6.21
 scheduler. "



Dave

-- 
http://www.codemonkey.org.uk


Re: alpha compile failure (srm_printk)

2007-07-28 Thread Sam Ravnborg
On Sat, Jul 28, 2007 at 10:04:50PM +0200, Sam Ravnborg wrote:
> 
> The fix is to split kasprintf out to a separate file to
> avoid pulling in more stuff than necessary.
I just sent a patch doing this and I assume you take care of the alpha changes.

Thanks,
Sam


[PATCH] lib: move kasprintf to a separate file

2007-07-28 Thread Sam Ravnborg
kasprintf pulls in kmalloc which proved to be fatal for at least
bootimage target on alpha.
Move it to a separate file so only users of kasprintf are exposed
to the dependency on kmalloc.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
Cc: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: Meelis Roos <[EMAIL PROTECTED]>
---
 lib/Makefile|2 +-
 lib/kasprintf.c |   44 
 lib/vsprintf.c  |   35 ---
 3 files changed, 45 insertions(+), 36 deletions(-)
 create mode 100644 lib/kasprintf.c

diff --git a/lib/Makefile b/lib/Makefile
index 6149663..d9e5f1c 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -2,7 +2,7 @@
 # Makefile for some libs needed in the kernel.
 #
 
-lib-y := ctype.o string.o vsprintf.o cmdline.o \
+lib-y := ctype.o string.o vsprintf.o kasprintf.o cmdline.o \
 rbtree.o radix-tree.o dump_stack.o \
 idr.o int_sqrt.o bitmap.o extable.o prio_tree.o \
 sha1.o irq_regs.o reciprocal_div.o argv_split.o
diff --git a/lib/kasprintf.c b/lib/kasprintf.c
new file mode 100644
index 000..c5ff1fd
--- /dev/null
+++ b/lib/kasprintf.c
@@ -0,0 +1,44 @@
+/*
+ *  linux/lib/kasprintf.c
+ *
+ *  Copyright (C) 1991, 1992  Linus Torvalds
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+/* Simplified asprintf. */
+char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap)
+{
+   unsigned int len;
+   char *p;
+   va_list aq;
+
+   va_copy(aq, ap);
+   len = vsnprintf(NULL, 0, fmt, aq);
+   va_end(aq);
+
+   p = kmalloc(len+1, gfp);
+   if (!p)
+   return NULL;
+
+   vsnprintf(p, len+1, fmt, ap);
+
+   return p;
+}
+EXPORT_SYMBOL(kvasprintf);
+
+char *kasprintf(gfp_t gfp, const char *fmt, ...)
+{
+   va_list ap;
+   char *p;
+
+   va_start(ap, fmt);
+   p = kvasprintf(gfp, fmt, ap);
+   va_end(ap);
+
+   return p;
+}
+EXPORT_SYMBOL(kasprintf);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 6b6734d..7b481ce 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -978,38 +978,3 @@ int sscanf(const char * buf, const char * fmt, ...)
 }
 
 EXPORT_SYMBOL(sscanf);
-
-
-/* Simplified asprintf. */
-char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap)
-{
-   unsigned int len;
-   char *p;
-   va_list aq;
-
-   va_copy(aq, ap);
-   len = vsnprintf(NULL, 0, fmt, aq);
-   va_end(aq);
-
-   p = kmalloc(len+1, gfp);
-   if (!p)
-   return NULL;
-
-   vsnprintf(p, len+1, fmt, ap);
-
-   return p;
-}
-EXPORT_SYMBOL(kvasprintf);
-
-char *kasprintf(gfp_t gfp, const char *fmt, ...)
-{
-   va_list ap;
-   char *p;
-
-   va_start(ap, fmt);
-   p = kvasprintf(gfp, fmt, ap);
-   va_end(ap);
-
-   return p;
-}
-EXPORT_SYMBOL(kasprintf);
-- 
1.5.1.rc3.g84b7-dirty
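
For readers who have not looked at kvasprintf() before, the measure-then-allocate
pattern that the patch moves can be sketched in userspace C. This is only an
illustration under stated assumptions: malloc() stands in for kmalloc(), there
is no gfp argument, and the names my_vasprintf() and my_asprintf() are made up
for this sketch and are not kernel functions.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace analogue of kvasprintf(): format into a freshly allocated buffer. */
char *my_vasprintf(const char *fmt, va_list ap)
{
	va_list aq;
	int len;
	char *p;

	/* First pass: a NULL buffer of size 0 only measures the output length. */
	va_copy(aq, ap);
	len = vsnprintf(NULL, 0, fmt, aq);
	va_end(aq);
	if (len < 0)
		return NULL;

	/* malloc() is the stand-in for kmalloc(); this is the dependency
	 * that the patch keeps away from users of plain vsprintf(). */
	p = malloc((size_t)len + 1);
	if (!p)
		return NULL;

	/* Second pass: write the string for real, including the NUL. */
	vsnprintf(p, (size_t)len + 1, fmt, ap);
	return p;
}

/* Variadic wrapper, mirroring how kasprintf() wraps kvasprintf(). */
char *my_asprintf(const char *fmt, ...)
{
	va_list ap;
	char *p;

	va_start(ap, fmt);
	p = my_vasprintf(fmt, ap);
	va_end(ap);
	return p;
}
```

The caller owns the returned buffer and has to free it, e.g.
my_asprintf("%s-%d", "ata", 2) followed by free(). The allocator call is the
only reason this helper needs more than the formatting code does, which is
the dependency the patch isolates in its own file.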



Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread hui
On Sat, Jul 28, 2007 at 09:28:36PM +0200, jos poortvliet wrote:
> Your point here seems to be: this is how it went, and it was right. Ok, got 
> that. Yet, Con walked away (and not just over SD). Seeing Con go, I wonder 
> how many did leave without this splash. How many didn't even get involved at 
> all??? Did THAT have to happen? I don't blame you for it - the point is that 
> somewhere in the process a valuable kernel hacker went away. How and why? And 
> is it due to a deeper problem?

Absolutely, the current Linux community hasn't realized how large the
community has gotten and the internal processes for dealing with new
developers, that aren't at companies like SuSE or RedHat, haven't been
extended to deal with it yet. It comes off as elitism which it partially
is.

Nobody tries to facilitate or understand ideas in the larger community,
which locks out folks like Con who try to do provocative things outside
of the normal technical development mindset. He was punished for doing
so, and that is a huge failure of this community.

Con basically got caught in a scheduler philosophical argument of whether
to push a policy into userspace or to nice a process instead because
of how crappy X is. This is an open argument on how to solve, but it
should not have resulted in really one scheduler over the other. Both
were capable but one is locked out now because of the choices of
current high level kernel developers in Linux.

There are a lot of good kernel folks in many different communities that
look at something like this and would be turned off to participating
in Linux development. And I have a good record of doing rather
interesting stuff in kernel.

bill



Re: kernel panic w/ 2.6.22.1, VIA EPIA Mini ITX

2007-07-28 Thread Alistair John Strachan
> I have absolutely no idea what triggers this crash.

Checked the RAM on the box? Kinda weird if you're getting VRAM corruption, I 
wonder if this is due to the RAM failing at the point where the framebuffer 
is mapped?

Try running memtest86 on it.

-- 
Cheers,
Alistair.

137/1 Warrender Park Road, Edinburgh, UK.



Re: alpha compile failure (srm_printk)

2007-07-28 Thread Sam Ravnborg
On Sat, Jul 28, 2007 at 08:05:08PM +0300, Meelis Roos wrote:
> Retested this compile error with todays 2.6.23-rc1+git, still the same.
> 
> > >   LD  arch/alpha/boot/bootloader
> > > arch/alpha/boot/bootloader.lds:25: undefined symbol `srm_printk' 
> > > referenced in expression
> > 
> > I was unable to reproduce these errors on 2.6.22-rc4 with your config.
> 
> Hmm. Just make works, make bootimage does not. I debugged this further 
> today and I can not see how it can work.
> 
> The link command in question is
> ld   -static -uvsprintf -T   arch/alpha/boot/bootloader.lds 
> arch/alpha/boot/head.o arch/alpha/boot/main.o -o arch/alpha/boot/bootloader
> and it still tells
> arch/alpha/boot/bootloader.lds:25: undefined symbol `srm_printk' referenced 
> in expression
> 
> arch/alpha/boot/bootloader.lds contains a single related line referring 
> to srm_printk:
> printk = srm_printk;
> This only seems to define an alias to srm_printk so not important (and 
> unused)?

Hi Meelis.
I took the time to investigate this a bit. The relevant files in
this case (arch/alpha/Makefile + boot/Makefile) have not seen many changes
when browsing the git tree.
So I looked back a bit further in the bitkeeper based tree.

Before boot/Makefile was converted to kbuild style, bootloader indeed
referenced $(LIBS). That was lost in the process.
So adding $(LIBS_Y) is the right thing to do.

The linker error you get is because kasprintf uses kmalloc, and
it gets pulled in when srm_printk uses vsprintf.

The fix is to split kasprintf out to a separate file to
avoid pulling in more stuff than necessary.

PS. I had trouble compiling objstrip and had to do a lot of ugly
hacks before it compiled. I assume it is a toolchain issue.

Sam


Re: ATA scsi driver misbehavior under kdump capture kernel

2007-07-28 Thread Guennadi Liakhovetski
On Fri, 27 Jul 2007, Cliff Wickman wrote:

> I've run into a problem with the ATA SCSI disk driver when running in a
> kdump dump-capture kernel.
> 
> I'm running on 2-processor x86_64 box.  It has 2 scsi disks, /dev/sda and
> /dev/sdb
> 
> My kernel is 2.6.22, and built to be a dump capturing kernel loaded by kexec.
> When I boot this kernel by itself, it finds both sda and sdb.
> 
> But when it is loaded by kexec and booted on a panic it only finds sda.
> 
> Any ideas from those familiar with the ATA driver?

No, I just wanted to suggest asking on the kexec mailing list (fastboot
seems to be deprecated).

Thanks
Guennadi

> 
> 
> -Cliff Wickman
>  SGI
> 
> 
> 
> I put some printk's into it and get this:
> 
> Standalone:
> 
>[nv_adma_error_handler]
> cpw: ata_host_register probe port 1 (error_handler:81348625)
> cpw: ata_host_register call ata_port_probe
> cpw: ata_host_register call ata_port_schedule
> cpw: ata_host_register call ata_port_wait_eh
> cpw: ata_port_wait_eh entered
> cpw: ata_port_wait_eh, preparing to wait
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> cpw: ata_dev_configure entered
> cpw: ata_dev_configure testing class
> cpw: ata_dev_configure class is ATA_DEV_ATA
> ata2.00: ATA-6: ST3200822AS, 3.01, max UDMA/133
> ata2.00: 390721968 sectors, multi 16: LBA48
> cpw: ata_dev_configure exiting
> cpw: ata_dev_configure entered
> cpw: ata_dev_configure testing class
> cpw: ata_dev_configure class is ATA_DEV_ATA
> cpw: ata_dev_configure exiting
> cpw: ata_dev_set_mode printing:
> ata2.00: configured for UDMA/133
> cpw: ata_port_wait_eh, finished wait
> cpw: ata_port_wait_eh exiting
> cpw: ata_host_register done with probe port 1
> 
> 
> When loaded with kexec and booted on a panic:
> 
> cpw: ata_host_register probe port 1 (error_handler:81348625)
> cpw: ata_host_register call ata_port_probe
> cpw: ata_host_register call ata_port_schedule
> cpw: ata_host_register call ata_port_wait_eh
> cpw: ata_port_wait_eh entered
> cpw: ata_port_wait_eh, preparing to wait
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> cpw: ata_port_wait_eh, finished wait
> cpw: ata_port_wait_eh exiting
> cpw: ata_host_register done with probe port 1
> 

---
Guennadi Liakhovetski


Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-28 Thread Tong Li

On Fri, 27 Jul 2007, Chris Snook wrote:


Bill Huey (hui) wrote:
You have to consider the target for this kind of code. There are
applications where you need something that falls within a constant error
bound. According to the numbers, the current CFS rebalancing logic doesn't
achieve that to any degree of rigor. So CFS is ok for SCHED_OTHER, but not
for anything more strict than that.


I've said from the beginning that I think that anyone who desperately needs 
perfect fairness should be explicitly enforcing it with the aid of realtime 
priorities.  The problem is that configuring and tuning a realtime 
application is a pain, and people want to be able to approximate this 
behavior without doing a whole lot of dirty work themselves.  I believe that 
CFS can and should be enhanced to ensure SMP-fairness over potentially short, 
user-configurable intervals, even for SCHED_OTHER.  I do not, however, 
believe that we should take it to the extreme of wasting CPU cycles on 
migrations that will not improve performance for *any* task, just to avoid 
letting some tasks get ahead of others.  We should be as fair as possible but 
no fairer.  If we've already made it as fair as possible, we should account 
for the margin of error and correct for it the next time we rebalance.  We 
should not burn the surplus just to get rid of it.


Proportional-share scheduling actually has one of its roots in real-time 
and having a p-fair scheduler is essential for real-time apps (soft 
real-time).




On a non-NUMA box with single-socket, non-SMT processors, a constant error 
bound is fine.  Once we add SMT, go multi-core, go NUMA, and add 
inter-chassis interconnects on top of that, we need to multiply this error 
bound at each stage in the hierarchy, or else we'll end up wasting CPU cycles 
on migrations that actually hurt the processes they're supposed to be 
helping, and hurt everyone else even more.  I believe we should enforce an 
error bound that is proportional to migration cost.




I think we are actually in agreement. When I say constant bound, it can 
certainly be a constant that's determined based on inputs from the memory 
hierarchy. The point is that it needs to be a constant independent of 
things like # of tasks.
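
One way to picture the bound both sides describe here is a per-level factor
that grows with migration cost but never looks at the number of tasks. The
sketch below is purely illustrative: the level names, the factors in
level_factor[], and the functions error_bound() and should_migrate() are all
hypothetical and come from neither CFS nor the DWRR patch.

```c
/* Hypothetical topology levels, cheapest migration first. */
enum topo_level { LVL_SMT, LVL_CORE, LVL_NODE, LVL_CHASSIS, LVL_COUNT };

/* Made-up relative migration cost added at each level of the hierarchy. */
static const unsigned long level_factor[LVL_COUNT] = { 1, 2, 4, 8 };

/*
 * Allowed unfairness (in ns) before a migration across 'top' is worth it.
 * The base bound is multiplied once per level crossed, so the result is a
 * constant for a given machine and does not depend on the number of tasks.
 */
unsigned long error_bound(unsigned long base_ns, enum topo_level top)
{
	unsigned long bound = base_ns;
	int l;

	for (l = 0; l <= (int)top; l++)
		bound *= level_factor[l];
	return bound;
}

/* Migrate only when a task's lag exceeds the bound for that level. */
int should_migrate(unsigned long lag_ns, unsigned long base_ns,
		   enum topo_level top)
{
	return lag_ns > error_bound(base_ns, top);
}
```

With a 1000 ns base, the bound in this sketch is 2000 ns across cores and
8000 ns across NUMA nodes, so a task lagging by 5000 ns would be moved to a
sibling core but not to another node.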


But this patch is only relevant to SCHED_OTHER.  The realtime scheduler 
doesn't have a concept of fairness, just priorities.  That's why each realtime 
priority level has its own separate runqueue.  Realtime schedulers are 
supposed to be dumb as a post, so they cannot heuristically decide to do 
anything other than precisely what you configured them to do, and so they 
don't get in the way when you're context switching a million times a second.


Are you referring to hard real-time? As I said, an infrastructure that 
enables p-fair scheduling, EDF, or the like is the foundation for 
real-time. I designed DWRR, however, with a target of non-RT apps, 
although I was hoping the research results might be applicable to RT.


  tong


[PATCH -mm] Fix libata warnings with CONFIG_PM=n

2007-07-28 Thread Gabriel C
Hi Jeff,

I noticed these warnings when CONFIG_PM=n:

...

drivers/ata/libata-core.c:5993: warning: 'ata_host_disable_link_pm' defined but not used
drivers/ata/libata-core.c:6004: warning: 'ata_host_enable_link_pm' defined but not used

...

Signed-off-by: Gabriel Craciunescu <[EMAIL PROTECTED]>

---

--- linux-2.6.23-rc1/drivers/ata/libata-core.c.orig 2007-07-28 
21:17:31.0 +0200
+++ linux-2.6.23-rc1/drivers/ata/libata-core.c  2007-07-28 21:17:48.0 
+0200
@@ -5989,6 +5989,7 @@ int ata_flush_cache(struct ata_device *d
return 0;
 }
 
+#ifdef CONFIG_PM
 static void ata_host_disable_link_pm(struct ata_host *host)
 {
int i;
@@ -6011,7 +6012,7 @@ static void ata_host_enable_link_pm(stru
}
 }
 
-#ifdef CONFIG_PM
+
 static int ata_host_request_pm(struct ata_host *host, pm_message_t mesg,
   unsigned int action, unsigned int ehi_flags,
   int wait)



Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Jan Engelhardt wrote:
> 
> Time to investigate...

Well, one thing that would be worth doing is to simply create a trace of 
time-slices for both schedulers.

It could easily be some hacky thing that just saves the process name and 
TSC at each scheduling event in some fairly small fixed-sized per-CPU 
circular buffer, and have a /sys interface that reads it out, and then you 
do

sleep 60 ; cat /sys/cpubuffer > buffer

and play the game for 60 seconds (so that you get a buffer that represents 
perhaps the last 10 seconds of play).
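
For illustration, a minimal userspace sketch of the kind of buffer described
above. This is not kernel code: the names trace_buf, trace_record, trace_get
and TRACE_SLOTS are invented for the example, the caller supplies the
timestamp that the TSC would provide, and a real version would need one
buffer per CPU plus the /sys read-out hook.

```c
#include <stdint.h>
#include <string.h>

#define TRACE_SLOTS 8          /* small fixed size; must be a power of two */
#define COMM_LEN    16         /* room for a short process name */

struct trace_entry {
    char     comm[COMM_LEN];   /* process name at the scheduling event */
    uint64_t stamp;            /* timestamp (the TSC in the real thing) */
};

struct trace_buf {
    struct trace_entry slot[TRACE_SLOTS];
    uint64_t head;             /* total number of events ever recorded */
};

/* Record one scheduling event, overwriting the oldest entry when full. */
static void trace_record(struct trace_buf *b, const char *comm, uint64_t stamp)
{
    struct trace_entry *e = &b->slot[b->head & (TRACE_SLOTS - 1)];

    strncpy(e->comm, comm, COMM_LEN - 1);
    e->comm[COMM_LEN - 1] = '\0';
    e->stamp = stamp;
    b->head++;
}

/* Number of valid entries currently held. */
static unsigned int trace_count(const struct trace_buf *b)
{
    return b->head < TRACE_SLOTS ? (unsigned int)b->head : TRACE_SLOTS;
}

/* Return the i-th oldest entry still in the buffer (0 = oldest). */
static const struct trace_entry *trace_get(const struct trace_buf *b,
                                           unsigned int i)
{
    uint64_t first = b->head - trace_count(b);

    return &b->slot[(first + i) & (TRACE_SLOTS - 1)];
}
```

Because the buffer is fixed-size and overwrites its oldest slot, reading it
out after a while yields only the most recent events, which is the "last few
seconds of play" effect mentioned above.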

It could *literally* just be an effect of the time quanta used, and CFS 
just deciding that it's not interactive and giving things too long of a 
CPU slice.

Yes, it's what "/proc/sys/kernel/sched_granularity_ns" is supposed to 
tweak, but maybe there's some misfeature there, or maybe the default is 
just bad for games, or whatever.

Ingo: that sysctl_sched_granularity initialization doesn't make sense. You 
talk about it being in units of nanoseconds, but then you do

2000000000ULL/HZ

which is nonsensical. That value is "2 seconds" (not 2ms like the comment 
says) in nanoseconds, but then divided by HZ, so what's the meaning of 
that HZ thing? Nothing in the scheduler should care about jiffies, why is 
that related to HZ? All the scheduler clocks are in ns.
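
To make the arithmetic concrete, taking the constant to be two seconds
expressed in nanoseconds as described above, this is what the division yields
for a few common HZ settings. It is a standalone sketch; granularity_ns is a
name made up for the example, not a kernel function.

```c
/* 2,000,000,000 ns is two seconds; dividing by HZ yields 2 s / HZ. */
static unsigned long long granularity_ns(unsigned int hz)
{
    return 2000000000ULL / hz;
}

/*
 * HZ=100  -> 20,000,000 ns (20 ms)
 * HZ=250  ->  8,000,000 ns ( 8 ms)
 * HZ=1000 ->  2,000,000 ns ( 2 ms)
 */
```

So the result is 2 ms only when HZ is 1000; with any other HZ the default
granularity changes along with the tick rate.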

Linus


kernel panic w/ 2.6.22.1, VIA EPIA Mini ITX

2007-07-28 Thread Herbert Rosmanith

good day,

when setting up a more or less old board, 2.6.22.1 crashes during an
"emerge --sync" (gentoo installation).

first, the screen will start funny - like a crashed C64 (from the "good"
old days). I made a photo, you can look at it here:

http://wildsau.enemy.org/~kernel/epia-crash.jpg

actually, that's not a static picture, but there's also a lotta
blinking going on. the image you see is from a non-framebuffer
mode kernel, since I first suspected the framebuffer being the
culprit, because in framebuffer mode the crash results
in many coloured pixels on the screen, like when doing
"cat /dev/random > /dev/fb0". But since this happens in textmode too,
something else is triggering this.

anyway, hardware: the board is a VIA EPIA 500MHz fanless (Mini ITX)
with a C3 processor and 1GB (2x512) SD-RAM.

the kernel complains about this:
Jul 28 22:30:17 localhost BUG: unable to handle kernel paging request at virtual
 address 00ff


or, to be more specific:


Jul 28 22:30:17 localhost BUG: unable to handle kernel paging request at virtual
 address 00ff
Jul 28 22:30:17 localhost printing eip:
Jul 28 22:30:17 localhost c016c506
Jul 28 22:30:17 localhost *pde = 
Jul 28 22:30:17 localhost Oops:  [#1]
Jul 28 22:30:17 localhost Modules linked in: parport_pc parport uhci_hcd usbcore
Jul 28 22:30:17 localhost CPU:0
Jul 28 22:30:17 localhost EIP:0060:[]Not tainted VLI
Jul 28 22:30:17 localhost EFLAGS: 00010206   (2.6.22.1 #2)
Jul 28 22:30:17 localhost EIP is at __d_lookup+0x66/0xe0
Jul 28 22:30:17 localhost eax: 4c482617   ebx: 00ff   ecx: 0011   edx: c
1819180
Jul 28 22:30:17 localhost esi: 00ff   edi: f7634005   ebp: ef848780   esp: f
7a87dc0
Jul 28 22:30:17 localhost ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Jul 28 22:30:17 localhost Process bash (pid: 4209, ti=f7a86000 task=c1bb8070 tas
k.ti=f7a86000)
Jul 28 22:30:17 localhost Stack: 4c482617 f7a87e1c f7a87e3c f7fe7114 f7a87f30 00
11 f7634005 f7634016 
Jul 28 22:30:17 localhost f7a87e3c f7a87f04 f7a87e3c c0162f48 f7a87e48 c18e6220 
c01a82b0 f7ef12a0 
Jul 28 22:30:17 localhost f7634016 f7a87e3c f7ef12a0 f7a87f04 c0164a69 f7634005 
0001 4c482617 
Jul 28 22:30:17 localhost Call Trace:
Jul 28 22:30:17 localhost [] do_lookup+0x28/0x190
Jul 28 22:30:17 localhost [] ext3_permission+0x0/0x10
Jul 28 22:30:17 localhost [] __link_path_walk+0x669/0xb20
Jul 28 22:30:17 localhost [] mntput_no_expire+0x1b/0x70
Jul 28 22:30:17 localhost [] link_path_walk+0x63/0xc0
Jul 28 22:30:17 localhost [] link_path_walk+0x43/0xc0
Jul 28 22:30:17 localhost [] do_path_lookup+0x64/0x180
Jul 28 22:30:17 localhost [] getname+0xb3/0xe0
Jul 28 22:30:17 localhost [] __user_walk_fd+0x3b/0x60
Jul 28 22:30:17 localhost [] vfs_stat_fd+0x22/0x60
Jul 28 22:30:17 localhost [] sys_stat64+0xf/0x30
Jul 28 22:30:17 localhost [] syscall_call+0x7/0xb
Jul 28 22:30:17 localhost ===
Jul 28 22:30:17 localhost Code: ea 31 d0 8b 15 b4 9e 3b c0 89 7c 24 18 21 d0 8b 
15 bc 9e 3b c0 8b 34 82 85 f6 75 0f eb 54 8d b4 26 00 00 00 00 85 db 89 de 74 47
 <8b> 1e 8d 74 26 00 8d 6e f4 8b 04 24 3b 45 18 75 e9 8b 44 24 0c 
Jul 28 22:30:17 localhost EIP: [] __d_lookup+0x66/0xe0 SS:ESP 0068:f7a
87dc0
Jul 28 22:30:17 localhost BUG: unable to handle kernel paging request at virtual
 address 00ff
Jul 28 22:30:17 localhost printing eip:
Jul 28 22:30:17 localhost c016c506
Jul 28 22:30:17 localhost *pde = 
Jul 28 22:30:17 localhost Oops:  [#2]
Jul 28 22:30:17 localhost Modules linked in: parport_pc parport uhci_hcd usbcore
Jul 28 22:30:17 localhost CPU:0
Jul 28 22:30:17 localhost EIP:0060:[]Not tainted VLI
Jul 28 22:30:17 localhost EFLAGS: 00010206   (2.6.22.1 #2)
Jul 28 22:30:17 localhost EIP is at __d_lookup+0x66/0xe0
Jul 28 22:30:17 localhost eax: 00284951   ebx: 00ff   ecx: 0011   edx: c
1819180
Jul 28 22:30:17 localhost esi: 00ff   edi: f7bb1005   ebp: efa49688   esp: e
4837dc0
Jul 28 22:30:17 localhost ds: 007b   es: 007b   fs:   gs: 0033  ss: 0068
Jul 28 22:30:17 localhost Process utempter (pid: 4308, ti=e4836000 task=f7c34090
 task.ti=e4836000)
Jul 28 22:30:17 localhost Stack: 00284951 e4837e1c e4837e3c f7d3a5ec e4837e48 00
03 f7bb1005 f7bb1009 
Jul 28 22:30:17 localhost e4837e3c e4837f04 e4837e3c c0162f48 e4837e48 c18e6e20 
c0158690 c1b54cc0 


I have absolutely no idea what triggers this crash.

cheers,
herp


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread jos poortvliet
On Saturday 28 July 2007, Linus Torvalds wrote:

>
> Compare this to SD for a while. Ponder.
>
>   Linus

Your point here seems to be: this is how it went, and it was right. Ok, got 
that. Yet, Con walked away (and not just over SD). Seeing Con go, I wonder 
how many did leave without this splash. How many didn't even get involved at 
all??? Did THAT have to happen? I don't blame you for it - the point is that 
somewhere in the process a valuable kernel hacker went away. How and why? And 
is it due to a deeper problem?



-- 
Disclaimer:

Everything I do, think and say is based on the worldview I have now. 
I am not responsible for changes to the world, or to my view of it, 
nor for any resulting behaviour of my own. 
Everything I say is meant kindly, unless explicitly stated otherwise.

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

   A: Because it destroys the flow of the conversation
   Q: Why is top-posting bad?




Re: [RFC] scheduler: improve SMP fairness in CFS

2007-07-28 Thread Tong Li

On Fri, 27 Jul 2007, Chris Snook wrote:

> I don't think that achieving a constant error bound is always a good thing. 
> We all know that fairness has overhead.  If I have 3 threads and 2 
> processors, and I have a choice between fairly giving each thread 1.0 billion 
> cycles during the next second, or unfairly giving two of them 1.1 billion 
> cycles and giving the other 0.9 billion cycles, then we can have a useful 
> discussion about where we want to draw the line on the fairness/performance 
> tradeoff.  On the other hand, if we can give two of them 1.1 billion cycles 
> and still give the other one 1.0 billion cycles, it's madness to waste those 
> 0.2 billion cycles just to avoid user jealousy.  The more complex the memory 
> topology of a system, the more "free" cycles you'll get by tolerating 
> short-term unfairness.  As a crude heuristic, scaling some fairly low 
> tolerance by log2(NCPUS) seems appropriate, but eventually we should take the 
> boot-time computed migration costs into consideration.
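
One possible reading of the log2(NCPUS) heuristic, sketched in plain C. The
function names and the "1 +" term (so that a single CPU still gets the base
tolerance rather than zero) are assumptions made for this example; the
message does not specify them.

```c
/* Integer floor(log2(n)) for n >= 1. */
static unsigned int ilog2_u(unsigned int n)
{
    unsigned int r = 0;

    while (n >>= 1)
        r++;
    return r;
}

/*
 * Hypothetical heuristic: allow base_ns of unfairness on one CPU and
 * add another base_ns for each doubling of the CPU count.
 */
static unsigned long long tolerance_ns(unsigned long long base_ns,
                                       unsigned int ncpus)
{
    return base_ns * (1 + ilog2_u(ncpus));
}
```

With a 1 ms base this gives 1 ms on one CPU, 2 ms on two and 4 ms on eight,
so the tolerance grows slowly as the machine gets larger.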


I think we are in agreement. To avoid confusion, I think we should be more 
precise on what fairness means. Lag (i.e., ideal fair time - actual 
service time) is the commonly used metric for fairness. The definition is 
that a scheduler is proportionally fair if for any task in any time 
interval, the task's lag is bounded by a constant (note it's in terms of 
absolute time). The knob here is this constant and can help trade off 
performance and fairness. The reason for a constant bound is that we want 
consistent fairness properties regardless of the number of tasks. For 
example, we don't want the system to be much less fair as the number of 
tasks increases. With DWRR, the lag bound is the max weight of currently 
running tasks, multiplied by sysctl_base_round_slice. So if all tasks are 
of nice 0, i.e., weight 1, and sysctl_base_round_slice equals 30 ms, then 
we are guaranteed each task is at most 30ms off of the ideal case. This is 
a useful property. Just like what you mentioned about the migration cost, 
this property allows the scheduler or user to accurately reason about the 
tradeoffs. If we want to trade fairness for performance, we can increase 
sysctl_base_round_slice to, say, 100ms; doing so we also know accurately 
the worst impact it has on fairness.
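
The definitions above can be written down as a small sketch. Only the formula
(max weight times sysctl_base_round_slice) comes from the text; the function
names are invented for the example and times are taken to be in nanoseconds.

```c
#include <stdint.h>

/* Lag: ideal fair service time minus the service actually received. */
static int64_t lag_ns(int64_t ideal_ns, int64_t actual_ns)
{
    return ideal_ns - actual_ns;
}

/* DWRR bound as stated above: max weight times the base round slice. */
static int64_t dwrr_lag_bound_ns(unsigned int max_weight,
                                 int64_t base_round_slice_ns)
{
    return (int64_t)max_weight * base_round_slice_ns;
}

/* Nonzero if the magnitude of the lag is within the bound. */
static int within_bound(int64_t lag, int64_t bound)
{
    return lag <= bound && lag >= -bound;
}
```

For weight 1 and a 30 ms slice the bound is 30,000,000 ns, so a task that is
20 ms short of its ideal service is within the bound and one that is 40 ms
short is not.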


> Adding system calls, while great for research, is not something which is done 
> lightly in the published kernel.  If we're going to implement a user 
> interface beyond simply interpreting existing priorities more precisely, it 
> would be nice if this was part of a framework with a broader vision, such as 
> a scheduler economy.


Agreed. I've seen papers on scheduler economy but I'm not familiar enough 
with it to comment.





>> Scheduling Algorithm:
>>
>> The scheduler keeps a set of data structures, called Trio groups, to maintain 
>> the weight or reservation of each thread group (including one or more 
>> threads) and the local weight of each member thread. When scheduling a 
>> thread, it consults these data structures and computes (in constant time) a 
>> system-wide weight for the thread that represents an equivalent CPU share. 
>> Consequently, the scheduling algorithm, DWRR, operates solely based on the 
>> system-wide weight (or weight for short, hereafter) of each thread. Having 
>> a flat space of system-wide weights for individual threads avoids 
>> performing separate scheduling at each level of the group hierarchy and 
>> thus greatly simplifies the implementation for group scheduling.


> Implementing a flat weight space efficiently is nontrivial.  I'm curious to 
> see how you reworked the original patch without global locking.


I simply removed the locking and changed idle_balance() a little. 
The lock was there to prevent a thread from reading or writing the global 
highest round value while another thread is writing to it. For writes, 
it's simple to ensure, without locking, that only one write takes effect when 
multiple writes are concurrent. For the case where one write is in progress 
and multiple threads read, without locking, the only problem is that a 
reader may read a stale value and thus think the current highest round is 
X while it's actually X + 1. The end effect is that a thread can be at 
most two rounds behind the highest round. This changes DWRR's lag bound to 
2 * (max weight of current tasks) * sysctl_base_round_slice, which is 
still constant.
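
The message does not say how the "only one write takes effect" property is
obtained. One common way to get it, shown here purely as an assumption, is a
compare-and-exchange on the shared counter. The sketch uses userspace C11
atomics rather than kernel primitives, and the names highest_round,
advance_round and read_round are invented for the example.

```c
#include <stdatomic.h>

static _Atomic unsigned long highest_round;

/*
 * Try to advance the global highest round from `seen` to `seen + 1`.
 * If several CPUs race with the same `seen`, exactly one exchange succeeds.
 * Returns 1 if this caller performed the advance, 0 otherwise.
 */
static int advance_round(unsigned long seen)
{
    unsigned long expected = seen;

    return atomic_compare_exchange_strong(&highest_round, &expected, seen + 1);
}

/* Readers take no lock; the value may already be one round stale. */
static unsigned long read_round(void)
{
    return atomic_load_explicit(&highest_round, memory_order_relaxed);
}
```

A reader that sees X while another CPU has just moved the counter to X + 1
acts on the stale value, which is the source of the extra factor of 2 in the
relaxed lag bound described above.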


> I had a feeling this patch was originally designed for the O(1) scheduler, 
> and this is why.  The old scheduler had expired arrays, so adding a 
> round-expired array wasn't a radical departure from the design.  CFS does not 
> have an expired rbtree, so adding one *is* a radical departure from the 
> design.  I think we can implement DWRR or something very similar without 
> using this implementation method.  Since we've already got a tree of queued 
> tasks, it might be easiest to basically break off one subtree (usually just 
> one task, but not necessarily) and migrate it to a less loaded tree whenever 
> we can reduce the difference 

Re: Linus 2.6.23-rc1

2007-07-28 Thread Jan Engelhardt

On Jul 28 2007 10:50, Linus Torvalds wrote:
>On Sat, 28 Jul 2007, Kasper Sandberg wrote:
>>
>> First off, i've personally run tests on many more machines than my own,
>> i've had lots of people try on their machines, and i've seen totally
>> unrelated posts to lkml, plus i've seen the experiences people are
>> writing about on IRC. Frankly, im not just thinking of myself.
>
>Ok, good. Has anybody tried to figure out why 3D games seem to be such a 
>special case? 

Is it specific to 3D? I would not think so. dosbox, bochs should have
the same issue. Games with "a lot of motion" usually implement their event
handling and screen drawing in a busy loop to get the maximum possible
frame rate.

Usually, only the GL thread would need to run at full power, and the input
subsystem could be reduced to a simple event-based loop (for example reading
a pipe in blocking mode). This could IMO make games a bit more responsive.
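
A minimal sketch of such an event-based input loop body, assuming events
arrive as single bytes on a pipe. The function name wait_for_event and the
one-byte event format are made up for the example.

```c
#include <unistd.h>

/*
 * Block until the input side delivers one event byte on `fd`.
 * Returns the byte (0..255), or -1 on EOF or error.
 * The thread sleeps inside read() and uses no CPU while it waits.
 */
static int wait_for_event(int fd)
{
    unsigned char ev;
    ssize_t n = read(fd, &ev, 1);

    return n == 1 ? ev : -1;
}
```

An input thread would call this in a loop and hand each event to the game
logic, leaving the busy loop to the rendering thread alone. A fuller version
would also retry when read() is interrupted by a signal.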

However, most games combine the input subsystem and graphics output in one
thread. Due to the way CFS works, it may mean that processes get scheduled
too fairly, though I'd suspect that a GL busy loop has no interactivity bonus
at all anyway in the old scheduler or SD.




I/O is also something that can hurt games in their framerate and/or handling
(something the user cares most about). Since I have not tried 2.6.23-rc yet, I
can only speak for the old scheduler. I have always turned cron off so that
updatedb does not run, because it makes games sluggish for some reason,
even though updatedb (or subordinate processes) doesn't take a lot of CPU time
according to `top`. What's more, with BOINC running in the background (nice 20)
while running unreal (nice 0), everything is ok.
(But not if BOINC is at nice 0).

Time to investigate...



Jan
-- 


RE: Volanomark slows by 80% under CFS

2007-07-28 Thread David Schwartz

> > Volanomark runs better
> > and is only 40% (instead of 80%) down from old scheduler
> > without CFS.

> 40 or 80 % is still a huge regression.
> Dmitry Adamushko

Can anyone explain precisely what Volanomark is doing? If it's something
dumb like "looping on sched_yield until the 'right' thread runs and finishes
what we're waiting for" then I think any regression can be ignored.

This applies if and only if CFS' sched_yield behavior is sane and Volano's
is insane.

A sane sched_yield implementation must do two things:

1) Reward processes that actually do yield most of their CPU time to another
process.

2) Make an effort to run every ready-to-run process at the same or higher
static priority level before re-scheduling this process. (That won't always
be possible due to SMP issues, but a reasonable effort is needed.)

If CFS is doing these two things, and Volanomark is looping on sched_yield
until the 'right thread' runs, then CFS is doing the right thing and Volanomark
isn't. Volanomark deserves to lose.

If CFS binds processes to processors more tightly and thus sched_yield can't
yield to a process that was planned to run on another CPU in the future,
that would be a legitimate complaint about CFS.

DS




Re: NETPOLL=y , NETDEVICES=n compile error ( Re: 2.6.23-rc1-mm1 )

2007-07-28 Thread Gabriel C
Andrew Morton wrote:
> On Sat, 28 Jul 2007 17:44:45 +0200 Gabriel C <[EMAIL PROTECTED]> wrote:
> 
>> Hi,
>>
>> I got this compile error with a randconfig ( 
>> http://194.231.229.228/MM/randconfig-auto-82.broken.netpoll.c ).
>>
>> ...
>>
>> net/core/netpoll.c: In function 'netpoll_poll':
>> net/core/netpoll.c:155: error: 'struct net_device' has no member named 
>> 'poll_controller'
>> net/core/netpoll.c:159: error: 'struct net_device' has no member named 
>> 'poll_controller'
>> net/core/netpoll.c: In function 'netpoll_setup':
>> net/core/netpoll.c:670: error: 'struct net_device' has no member named 
>> 'poll_controller'
>> make[2]: *** [net/core/netpoll.o] Error 1
>> make[1]: *** [net/core] Error 2
>> make: *** [net] Error 2
>> make: *** Waiting for unfinished jobs
>>
>> ...
>>
>>
>> I think it is because KGDBOE selects just NETPOLL.
>>
> 
> Looks like it.
> 
> Select went and selected NETPOLL and NETPOLL_TRAP but things like
> CONFIG_NETDEVICES and CONFIG_NET_POLL_CONTROLLER remain unset.  `select'
> remains evil.
> 
> Something like this..
> 
> --- a/lib/Kconfig.kgdb~kgdb-kconfig-fix
> +++ a/lib/Kconfig.kgdb
> @@ -175,8 +175,7 @@ endchoice
>  config KGDBOE
>   tristate "KGDB: On ethernet" if !KGDBOE_NOMODULE
>   depends on m && KGDB
> - select NETPOLL
> - select NETPOLL_TRAP
> + depends on NETPOLL_TRAP && NET_POLL_CONTROLLER
>   help
> Uses the NETPOLL API to communicate with the host GDB via UDP.
> In order for this to work, the ethernet interface specified must
> _
> 
> 


That doesn't fix it. With that patch, after a 'make oldconfig', all the NETPOLL 
options are gone and we end up with:

...

drivers/built-in.o: In function `option_setup':
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:160: undefined 
reference to `netpoll_parse_options'
drivers/built-in.o: In function `configure_kgdboe':
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:183: undefined 
reference to `netpoll_setup'
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:189: undefined 
reference to `netpoll_cleanup'
drivers/built-in.o: In function `eth_post_exception_handler':
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:119: undefined 
reference to `netpoll_set_trap'
drivers/built-in.o: In function `eth_pre_exception_handler':
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:111: undefined 
reference to `netpoll_set_trap'
drivers/built-in.o: In function `eth_flush_buf':
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:138: undefined 
reference to `netpoll_send_udp'
drivers/built-in.o: In function `eth_get_char':
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:127: undefined 
reference to `netpoll_poll'
drivers/built-in.o: In function `cleanup_kgdboe':
/work/crazy/linux-git/MM/linux-2.6.23-rc1/drivers/net/kgdboe.c:217: undefined 
reference to `netpoll_cleanup'
make: *** [.tmp_vmlinux1] Error 1

...


If I get that right, select is needed here because all the NETPOLL{_*} options 
depend on 'if NETDEVICES' and 'if NET_ETHERNET'.

Also doing 

...
select NETPOLL_TRAP 
select NETPOLL
select NET_POLL_CONTROLLER
...

makes the driver happy and everything compiles fine.

I think there may be a logical issue (again, if I got it right).
We need some ethernet card to work with kgdboe, right? But we don't have any if 
!NETDEVICES && !NET_ETHERNET.

So maybe some 'depends on ... && NETDEVICES!=n && NET_ETHERNET!=n' is needed 
too?
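
Put together, the combination discussed in this message might look like the
fragment below. It is an untested sketch that only merges the three selects
and the extra dependencies mentioned above into the existing KGDBOE entry; it
is not a submitted patch.

```
config KGDBOE
	tristate "KGDB: On ethernet" if !KGDBOE_NOMODULE
	depends on m && KGDB
	# assumption: require the network device options instead of hoping
	# that select pulls them in
	depends on NETDEVICES && NET_ETHERNET
	select NETPOLL
	select NETPOLL_TRAP
	select NET_POLL_CONTROLLER
```

The point of the two extra depends lines is that select sets a symbol without
checking that symbol's own dependencies, which is how NETPOLL ended up enabled
while NETDEVICES was off.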

(Really sorry if I said something stupid; these Kconfig dependencies are not 
really easy for me to figure out.)


Gabriel 


Re: CONFIG_SUSPEND? (was: Re: [GIT PATCH] ACPI patches for 2.6.23-rc1)

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Rafael J. Wysocki wrote:
> 
> OK, I'll prepare a patch to introduce CONFIG_SUSPEND, but that will require
> quite a bit of (compilation) testing on different architectures.

Sure. I'm not too worried, the fallout should be of the trivial kind. 

Also, mind basing it on the (independent) cleanups that Adrian already 
sent out. This is all intertwined..

Linus


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, jos poortvliet wrote:
> 
> 

Actually, the tag you were looking for was ""

> http://osnews.com/permalink.php?news_id=18350_id=259044
> 
> Now I wonder. Apparently, one person complaining about SD was reason to keep 
> it out http://osnews.com/permalink.php?news_id=18350_id=258997
> 
> Will this first post stop CFS from entering the kernel?

You seem to be not understanding the argument.

It wasn't about "one person complaining". Of *course* people will 
complain. That always happens, and sometimes with totally bogus complaints 
(the most common being "I'm not used to it").

The problem was the reaction to complaints. 

Ingo got lots of complaints too. He was very responsive to them (which is 
not something surprising - he's been doing this a long time), and while 
some of the tangents he went off on were definitely bogus (the whole 
renicing thing), they were still useful as part of the discussion.

And Ingo got other - totally unrelated - developers involved too, ie the 
group fairness logic came from Vatsa. And he ended up supporting not just 
scheduler people, but also talking to the block layer people about the 
scheduler timer usage as a fast clock for block requests etc.

And you have to realize that to me, as the top-level maintainer and one 
who seldom actually does big coding things, but just ends up making sure 
that people work with others, and fix the problems that crop up, *that* 
kind of behaviour is much much MUCH more important than the code itself.

Can you see that?

Can you see how big of a difference those whole approaches make? 

> Now I'll try to be a bit more constructive. I hope your benevolent 
> dictatorship allows self reflection.

Nobody is very good at self-reflection, I'm afraid. 

> Sure, the difference in behaviour (not in code) between SD and CFS is small, 
> and for me it doesn't matter. I'm fine with CFS in the kernel, it's a huge 
> improvement over the previous one. But why, while there was a seemingly good 
> alternative, did THAT one stay in that long? And this argument goes for more 
> code 'out there', btw.

Actually, nobody pushed SD to me, and neither Con nor anybody else tried 
to get me to merge it until some time in March of this year, I think.

Do you think I go trolling for code to merge? No. I actually _require_ 
that people send it to me, and that I also get the feeling that people are 
asking for it! 

In other words, my job is not to "merge code" (even though I sometimes 
describe it that way), my job is actually largely to "say no". You 
shouldn't see me as the person who goes out and tries to get everything 
together - quite the reverse. My job is to say "too late for the merge 
window", or "too experimental", or "you need to show numbers" or "are 
there going to be any _users_ for this"?

>  Some things get into the kernel, other don't. Some get in too soon, others 
> too late. Sure. But shouldn't we try to improve this process, instead of 
> saying 'it is what it is, get over it'?

Umm. The absolute *last* thing we want to do is to merge earlier. In fact, 
one of our biggest problems is that people send half-cooked stuff to me 
(and even more so, to Andrew). 

So in this case, if you've been on the CK mailing list, ask yourself: why 
wasn't parts of it pushed up to the standard kernel? Asking "why didn't 
Linus take it earlier" is exactly the wrong thing to do, since nobody even 
_asked_ me to. I never _ever_ got a patch saying "please merge this".

Seriously.

(Btw, on that note: please don't send me patches saying "please merge 
this". I want more than just that. I want an explanation, and I want it to 
be in many small pieces, and I want to feel like it got tested and is 
likely to be an obvious improvement).

So now look at what happened to CFS:

 - Ingo pushed it, and has been a maintainer of the area and shown himself 
   over years to be able to work with others and react to reports of 
   problems.

 - It was fairly obviously an improvement over the previous status quo
   (although I expect that there will be regressions - almost nothing is 
   ever a _pure_ improvement, if it's in any way non-trivial)

 - Even so, I asked for (and got) a series of independent patches.

Compare this to SD for a while. Ponder.

Linus


Re: [2.6 patch] let SUSPEND select HOTPLUG_CPU

2007-07-28 Thread Rafael J. Wysocki
On Saturday, 28 July 2007 00:25, Adrian Bunk wrote:
> On Thu, Jul 26, 2007 at 01:55:18PM -0700, Linus Torvalds wrote:
> > 
> > 
> > On Thu, 26 Jul 2007, Rafael J. Wysocki wrote:
> > > 
> > > My point is we have ACPI dependent on PM, so if you want ACPI, you end
> > > up with all of the STR stuff built in, which is what you don't like (if I
> > > understand that correctly).  If we have CONFIG_SUSPEND, you'll be able to
> > > choose ACPI alone. :-)
> > 
> > Good point. 
> > 
> > Anyway, I think the ACPI problem really is as trivial as the following 
> > three-liner removal fix. If the user doesn't want suspend, ACPI shouldn't 
> > force it on him.
> > 
> > A nicer fix might be to also make some of the ACPI helper routines depend 
> > on whether they are needed or not (which in turn will depend on whether 
> > suspend support has been compiled into the kernel), but quite frankly, 
> > that's secondary at least for me.
> > 
> > So if we have a few ACPI routines that will never get called (because we 
> > don't even enable the interfaces that would *cause* them to be called), I 
> > don't think that's a huge problem. It's a beauty wart, but nobody really 
> > cares (and it's even something that we could get the compiler to optimize 
> > away for us if we really cared).
> > 
> > Linus
> > 
> > ---
> > Don't force-enable suspend/hibernate support just for ACPI
> > 
> > It's a totally independent decision for the user whether he wants
> > suspend and/or hibernation support, and ACPI shouldn't care.
> > 
> > Signed-off-by: Linus Torvalds <[EMAIL PROTECTED]>
> > ---
> >  drivers/acpi/Kconfig |3 ---
> >  1 files changed, 0 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> > index 251344c..22b401b 100644
> > --- a/drivers/acpi/Kconfig
> > +++ b/drivers/acpi/Kconfig
> > @@ -11,9 +11,6 @@ menuconfig ACPI
> > depends on PCI
> > depends on PM
> > select PNP
> > -   # for sleep
> > -   select HOTPLUG_CPU if X86 && SMP
> > -   select SUSPEND_SMP if X86 && SMP
> > default y
> > ---help---
> >   Advanced Configuration and Power Interface (ACPI) support for 
> 
> The dependency of SUSPEND_SMP on HOTPLUG_CPU is quite unintuitive, so 
> what about something like the patch below?
> 
> This should address a main issue behind Len's patch.
> 
> cu
> Adrian
> 
> 
> <--  snip  -->
> 
> 
> An implementation detail of the suspend code that is not intuitive for 
> the user is the HOTPLUG_CPU dependency of SOFTWARE_SUSPEND if SMP.
> 
> This patch changes SOFTWARE_SUSPEND if SMP to select HOTPLUG_CPU instead 
> of depending on it.
> 
> Signed-off-by: Adrian Bunk <[EMAIL PROTECTED]>
> 
> ---
> 
>  kernel/power/Kconfig |   20 ++--
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> --- a/kernel/power/Kconfig
> +++ b/kernel/power/Kconfig
> @@ -72,9 +72,22 @@ config PM_TRACE
>   CAUTION: this option will cause your machine's real-time clock to be
>   set to an invalid time after a resume.
>  
> +config SUSPEND_SMP_POSSIBLE
> + bool
> + depends on (X86 && !X86_VOYAGER) || (PPC64 && (PPC_PSERIES || PPC_PMAC))
> + depends on SMP
> + default y
> +
> +config SUSPEND_SMP
> + bool
> + depends on SUSPEND_SMP_POSSIBLE && SOFTWARE_SUSPEND
> + select HOTPLUG_CPU
> + default y

That should not depend on SOFTWARE_SUSPEND (it's equivalent to HIBERNATION).

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


Re: CONFIG_SUSPEND? (was: Re: [GIT PATCH] ACPI patches for 2.6.23-rc1)

2007-07-28 Thread Rafael J. Wysocki
On Saturday, 28 July 2007 18:55, Linus Torvalds wrote:
> 
> On Sat, 28 Jul 2007, Linus Torvalds wrote:
> > 
> > And it's the *top*level* code that selects HOTPLUG_CPU. Through 
> > SUSPEND_SMP (which will select HOTPLUG_CPU) and SOFTWARE_SUSPEND.
> 
> In other words, the problem seems to be that 
> 
>   kernel/power/main.c:
>   suspend_devices_and_enter()
> 
> does the proper "disable/enable_nonboot_cpus()", but it does so without 
> having enabled CPU hotplug.
> 
> And you seem to think that it's ACPI that should enable the hotplug, even 
> though the code that actually needs it is _outside_ ACPI. And I think 
> that's wrong, and that this is a bug.
> 
> So I think the real issue is that we allow that 
> "suspend_devices_and_enter()" code to be compiled without HOTPLUG_CPU in 
> the first place. It's not supposed to work that way.
> 
> Of course, it may well be that other architectures can happily suspend 
> even with multiple CPU's active, which may be the cause of this mess. But 
> I really think it shouldn't be ACPI that has to select the CPU hotplug, 
> since it's not ACPI that _uses_ it in the first place.
> 
> Rafael: making a config option for STR (the same way we have a config 
> option for hibernate), and just not allowing it on SMP without HOTPLUG_CPU 
> seems to be the right thing. Len is right in that we do insane things 
> right now (trying to STR with multiple CPU's still active), and I just 
> don't think he's the one that should work around it!

Well, I agree and that's why I asked. :-)

OK, I'll prepare a patch to introduce CONFIG_SUSPEND, but that will require
quite a bit of (compilation) testing on different architectures.

Greetings,
Rafael


-- 
"Premature optimization is the root of all evil." - Donald Knuth


Re: [PATCH] fix IDE legacy mode resource

2007-07-28 Thread Jeff Garzik

Yoichi Yuasa wrote:

Hi,

I got the following error on MIPS Cobalt.
MIPS Cobalt has the 0x1000 offset between resource and bus region. 


PCI: Unable to reserve I/O region #1:[EMAIL PROTECTED] for device :00:09.1
pata_via :00:09.1: failed to request/iomap BARs for port 0 (errno=-16)
PCI: Unable to reserve I/O region #3:[EMAIL PROTECTED] for device :00:09.1
pata_via :00:09.1: failed to request/iomap BARs for port 1 (errno=-16)
pata_via :00:09.1: no available native port

At this point, these resources should be the bus regions.

Signed-off-by: Yoichi Yuasa <[EMAIL PROTECTED]>


I'm not sure I understand what's going on here... could you or someone 
provide additional explanation as to why this is a fix?


Thanks,

Jeff





Re: Linus 2.6.23-rc1

2007-07-28 Thread Kasper Sandberg
On Sat, 2007-07-28 at 10:50 -0700, Linus Torvalds wrote:
> 
> On Sat, 28 Jul 2007, Kasper Sandberg wrote:
> >
> > First off, i've personally run tests on many more machines than my own,
> > i've had lots of people try on their machines, and i've seen totally
> > unrelated posts to lkml, plus i've seen the experiences people are
> > writing about on IRC. Frankly, im not just thinking of myself.
> 
> Ok, good. Has anybody tried to figure out why 3D games seem to be such a 
> special case? 
> 
> I know Ingo looked at it, and seemed to think that he found and fixed 
> something. But it sounds like it's worth a lot more discussion.
> 

Yes, but the various patches I've received do not seem to solve it; they
simply changed the load at which CFS seemed to perform well.

On IRC there has been wild speculation as to whether it's the
sched_yield() stuff in most 3D drivers, but my tests with stubbing it
out and altering its behavior have not changed anything.

> > Okay, i wasnt going to ask, but ill do it anyway, did you even read the
> > threads about SD?
> 
> I don't _ever_ go on specialty mailing lists. I don't read -mm, and I 
> don't read the -fs mailing lists. I don't think they are interesting. 
> 
> And I tried to explain why: people who concentrate on one thing tend to 
> become this self-selecting group that never looks at anything else, and 
> then rejects outside input from people who hadn't become part of the "mind 
> meld". 
> 
> That's what I think I saw - I saw the reactions from where external people 
> were talking and cc'ing me.
> 
> And yes, it's quite possible that I also got a very one-sided picture of 
> it. I'm not disputing that. Con was also ill for a rather critical period, 
> which was certainly not helping it all.
> 
> > Con was extremely polite to everyone, and he did work
> > with a multitude of people, you seem to be totally deadlocked into the
> > ONE incident with a person that was unhappy with SD, simply for being a
> > fair scheduler.
> 
> Hey, maybe that one incident just ended up being a rather big portion of 
> what I saw. Too bad. That said, the end result (Con's public gripes about 
> other kernel developers) mostly reinforced my opinion that I did the right 
> choice.
> 
> But maybe you can show a better side of it all. I don't think _any_ 
> scheduler is perfect, and almost all of the time, the RightAnswer(tm) ends 
> up being not "one or the other", but "somewhere in between".
> 
> It's not like we've come to the end of the road: the baseline has just 
> improved. If you guys can show that SD actually is better at some loads, 
> without penalizing others, we can (and will) revisit this issue.

Well, as far as my tests show, the only real difference between SD and
CFS in terms of performance is 3D. Both deliver basically the same FPS
in a given application, but SD does it smoothly, which is the best way
to explain it. What happens with CFS, as I experience it, is that it
seems to allocate resources in bursts.

> 
> So what you should take away from this is that: from what I saw over the 
> last couple of months, it really wasn't much of a decision. The difference 
> in how Ingo and Con reacted to peoples reports was pretty stark. And no, I 
> haven't followed the ck mailing list, and so yes, I obviously did get just 
> a part of the picture, but the part I got was pretty damn unambiguous.

I really think you should try reading the SD and RSDL threads on lkml
again. The only place where Con hasn't been extremely forthcoming was
deep in the thread where Mike was unhappy with SD not giving X more
priority than fairness dictates.

> 
> But at the same time, no technical decision is ever written in stone. It's 
> all a balancing act. I've replaced the scheduler before, I'm 100% sure 
> we'll replace it again. Schedulers are actually not at all that important 
> in the end: they are a very very small detail in the kernel.
> 
>   Linus


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Jan Engelhardt wrote:
> 
> You cannot please everybody in the scheduler question, that is clear,
> then why not offer dedicated scheduling alternatives (plugsched comes to mind)
> and let them choose what pleases them most, and handles their workload best?

This is one approach, but it's actually one that I personally think is  
often the worst possible choice. 

Why? Because it ends up meaning that you never get the cross-pollination 
from different approaches (they stay separate "modes"), and it's also 
usually really bad for users in that it forces the user to make some 
particular choice that the user is usually not even aware of.

So I personally think that it's much better to find a setup that works 
"well enough" for people, without having modal behaviour. People complain 
and gripe now, but what people seem to be missing is that it's a journey, 
not an end-of-the-line destination. We haven't had a single release kernel 
with the new scheduler yet, so the only people who have tried it are 
either

 (a) interested in schedulers in the first place (which I think is *not* a 
 good subset, because they have very specific expectations of what is 
 right and what is wrong, and they come into the whole thing with that 
 mental baggage)

 (b) people who test -rc1 kernels (I love you guys, but sadly, you're not 
 nearly as common as I'd like ;)

so the fact is, we'll find out more information about where CFS falls 
down, and where it does well,  and we'll be able to *fix* it and tweak it.

In contrast, if you go for a modal approach, you tend to always fixate 
those two modes forever, and you'll never get something that works well: 
people have to switch modes when they switch workloads.

[ This, btw, has nothing to do with schedulers per se. We have had these 
  exact same issues in the memory management too - which is a lot more 
  complex than scheduling, btw. The whole page replacement algorithm is 
  something where you could easily have "specialized" algorithms in order 
  to work really well under certain loads, but exactly as with scheduling, 
  I will argue that it's a lot better to be "good across a wide swath of 
  loads" than to try to be "perfect in one particular modal setup". ]

This is also, btw, why I think that people who argue for splitting desktop 
kernels from server kernels are total morons, and only show that they 
don't know what the hell they are talking about.

The fact is, the work we've done on server loads has improved the desktop 
experience _immensely_, with all the scalability work (or the work on 
large memory configurations, etc etc) that went on there, and that used to 
be totally irrelevant for the desktop.

And btw, the same is very much true in reverse: a lot of the stuff that 
was done for desktop reasons (hotplug etc) has been a _huge_ boon for the 
server side, and while there were certainly issues that had to be resolved 
(the sysfs stuff so central to the hotplug model used tons of memory when 
you had ten thousand disks, and server people were sometimes really 
unhappy), a lot of the big improvements actually happen because something 
totally _unrelated_ needed them, and then it just turns out that it's good 
for the desktop too, even if it started out as a server thing or vice 
versa.

This is why the whole "modal" mindset is stupid. It basically freezes a 
choice that shouldn't be frozen. It sets up an artificial barrier between 
two kinds of uses (whether they be about "server" vs "desktop" or "3D 
gaming" vs "audio processing", or anything else), and that frozen choice 
actually ends up being a barrier to development in the long run.

So "modal" things are good for fixing behaviour in the short run. But they 
are a total disaster in the long run, and even in the short run they tend 
to have problems (simply because there will be cases that straddle the 
line, and show some of _both_ issues, and now *neither* mode is the right 
one)

Linus


Re: [ck] Re: Linus 2.6.23-rc1

2007-07-28 Thread jos poortvliet
On Saturday 28 July 2007, Linus Torvalds wrote:
> On Sat, 28 Jul 2007, Michael Chang wrote:
> > I do recall there is one issue on which Con wouldn't budge -- anything
> > that involved boosting certain kinds of processes in the kernel.
>
> I did that myself, so that's a non-issue.
>
> No. The complaints were about the CK scheduler not being as responsive
> under load as even the _old_ scheduler was. I don't know why people ignore
> this fact. It was a long thread back in March or April, and I'm pretty
> sure the CK mailing list was cc'd.

Of course it wasn't. The speed of tasks slows proportionally with the amount 
of system usage. That's the whole point, and CFS can't fix that either, can 
it?

> Sure, most people don't actually have load-averages above ten etc, but
> it's important to do those well _too_.
>
>   Linus


http://osnews.com/permalink.php?news_id=18350_id=259044

Now I wonder. Apparently, one person complaining about SD was reason to keep 
it out http://osnews.com/permalink.php?news_id=18350_id=258997

Will this first post stop CFS from entering the kernel?



Now I'll try to be a bit more constructive. I hope your benevolent 
dictatorship allows self reflection.

Sure, the difference in behaviour (not in code) between SD and CFS is small, 
and for me it doesn't matter. I'm fine with CFS in the kernel, it's a huge 
improvement over the previous one. But why, while there was a seemingly good 
alternative, did THAT one stay in that long? And this argument goes for more 
code 'out there', btw.
 
 Some things get into the kernel, others don't. Some get in too soon, others 
too late. Sure. But shouldn't we try to improve this process, instead of 
saying 'it is what it is, get over it'?
 
 For me, that's the purpose of this whole discussion. We're losing valuable 
code and contributors, yet at the same time code which isn't mature yet 
enters the kernel. Acknowledging there is a problem is the first step in 
solving it.

 Of course, I don't have answers - but I do feel strongly that you think there 
is no issue. Is there, or isn't there? And if there is, what do you plan to 
do about it?



Your influence on the behaviour of the people around you, your 'lieutenants', 
is huge. Larger than you might think. And in many cases, people following 
someone behave more extremely. That's a big reason why the LKML is neither 
very polite nor inviting (mind you, I don't think that's necessarily a bad 
thing, that's up to you to decide).

You might want to think about ways to improve the whole process. Again, I'm no 
Linus, it's your call. And you can make a big difference, I'm sure.


Greetings,

Jos




Re: [PATCH] Framebuffer: Fix 16bpp colour output in Dreamcast pvr2fb

2007-07-28 Thread Adrian McMenamin
On 28/07/07, Ondrej Zajicek <[EMAIL PROTECTED]> wrote:
> On Sat, Jul 28, 2007 at 03:51:38PM +0100, Adrian McMenamin wrote:
> > Tony,
> >
> > This patch - on top of your others - fixes the colour output for 16bpp
> > RGB565 output in the Dreamcast - it was a simple out by one error in
> > the bit shift.
>
> > @@ -330,27 +331,28 @@ static int pvr2fb_setcolreg(unsigned int regno, 
> > unsigned int red,
> >   case 16: /* RGB 565 */
> >   tmp =  (red   & 0xf800)   |
> > ((green & 0xfc00) >> 5) |
> > -   ((blue  & 0xf800) >> 11);
> > +   ((blue  & 0xf800) >> 10);
>
> This mixes lsb of green with msb of blue. If you want RGB 565,
> then >> 11 is correct. If you want RGB 555, green should
> be anded with 0xf800.
>
You are, of course, quite right, which makes it all the more strange
that it appeared to fix the problem. Back to the drawing board then.
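
For illustration, a minimal sketch of the packing described in the quoted
review, assuming 16-bit colour components as in the quoted hunk. It shows
why a shift of 11 keeps blue clear of green in RGB 565, while a shift of 10
makes the top blue bit land on the lowest green bit. The helper names
pack_rgb565, field_mask and rgb565_self_check are hypothetical and are not
part of pvr2fb.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack 16-bit-per-channel components into one RGB 565 pixel:
 * red in bits 15..11, green in bits 10..5, blue in bits 4..0.
 */
static uint16_t pack_rgb565(uint16_t red, uint16_t green, uint16_t blue)
{
    return (uint16_t)((red & 0xf800) |
                      ((green & 0xfc00) >> 5) |
                      ((blue & 0xf800) >> 11));
}

/* Bits a channel can occupy in the pixel after masking and shifting. */
static uint16_t field_mask(uint16_t mask, unsigned int shift)
{
    return (uint16_t)(mask >> shift);
}

static void rgb565_self_check(void)
{
    uint16_t green_bits = field_mask(0xfc00, 5);   /* 0x07e0 */

    /* With >> 11, blue occupies 0x001f and does not touch green. */
    assert((green_bits & field_mask(0xf800, 11)) == 0);

    /* With >> 10, blue occupies 0x003e and collides with green at bit 5. */
    assert((green_bits & field_mask(0xf800, 10)) == 0x0020);
}
```

Since the overlap check depends only on the masks and shifts, it holds for
any colour value, which is consistent with the review comment that the
change to 10 cannot be the real fix.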


2.6.23-rc1, KVM-AMD problem

2007-07-28 Thread Alistair John Strachan
Hi,

I'm getting periodic oopses running KVM-33 on 2.6.23-rc1. Here is a digital 
photo of the oops. Alarmingly, a lot of the time it triple faults the machine 
and I don't get a chance to grab it. This time I was lucky, though.

http://devzero.co.uk/~alistair/kvm-2.6.23-rc1.jpg

Unfortunately, some of the oops text scrolled out of the screen. I will 
endeavour to reproduce the bug over serial console, but I can make no 
guarantees.

The CPU is an AMD X2 BE-2350, chipset is AMD 690G.

-- 
Cheers,
Alistair.

137/1 Warrender Park Road, Edinburgh, UK.



Re: Linus 2.6.23-rc1

2007-07-28 Thread Linus Torvalds


On Sat, 28 Jul 2007, Kasper Sandberg wrote:
>
> First off, i've personally run tests on many more machines than my own,
> i've had lots of people try on their machines, and i've seen totally
> unrelated posts to lkml, plus i've seen the experiences people are
> writing about on IRC. Frankly, im not just thinking of myself.

Ok, good. Has anybody tried to figure out why 3D games seem to be such a 
special case? 

I know Ingo looked at it, and seemed to think that he found and fixed 
something. But it sounds like it's worth a lot more discussion.

> Okay, i wasnt going to ask, but ill do it anyway, did you even read the
> threads about SD?

I don't _ever_ go on specialty mailing lists. I don't read -mm, and I 
don't read the -fs mailing lists. I don't think they are interesting. 

And I tried to explain why: people who concentrate on one thing tend to 
become this self-selecting group that never looks at anything else, and 
then rejects outside input from people who hadn't become part of the "mind 
meld". 

That's what I think I saw - I saw the reactions from where external people 
were talking and cc'ing me.

And yes, it's quite possible that I also got a very one-sided picture of 
it. I'm not disputing that. Con was also ill for a rather critical period, 
which was certainly not helping it all.

> Con was extremely polite to everyone, and he did work
> with a multitude of people, you seem to be totally deadlocked into the
> ONE incident with a person that was unhappy with SD, simply for being a
> fair scheduler.

Hey, maybe that one incident just ended up being a rather big portion of 
what I saw. Too bad. That said, the end result (Con's public gripes about 
other kernel developers) mostly reinforced my opinion that I did the right 
choice.

But maybe you can show a better side of it all. I don't think _any_ 
scheduler is perfect, and almost all of the time, the RightAnswer(tm) ends 
up being not "one or the other", but "somewhere in between".

It's not like we've come to the end of the road: the baseline has just 
improved. If you guys can show that SD actually is better at some loads, 
without penalizing others, we can (and will) revisit this issue.

So what you should take away from this is that: from what I saw over the 
last couple of months, it really wasn't much of a decision. The difference 
in how Ingo and Con reacted to peoples reports was pretty stark. And no, I 
haven't followed the ck mailing list, and so yes, I obviously did get just 
a part of the picture, but the part I got was pretty damn unambiguous.

But at the same time, no technical decision is ever written in stone. It's 
all a balancing act. I've replaced the scheduler before, I'm 100% sure 
we'll replace it again. Schedulers are actually not at all that important 
in the end: they are a very very small detail in the kernel.

Linus

