date:20071122

Re: snd_hda_intel 2.6.24-rc2 bug: interrupts don't always work on Lenovo X60s

2007-11-22 Thread Takashi Iwai

At Thu, 22 Nov 2007 12:42:42 -0500,
Theodore Tso wrote:
> 
> On Tue, Nov 13, 2007 at 04:43:05AM +0100, Takashi Iwai wrote:
> > > By the way, the "polling mode" seems to work OK: I still get normal
> > > playback of music etc.
> > 
> > Yes, the polling mode should work in most cases, too.
> 
> Out of curiosity, how many wakeups/interrupts are involved with the
> sound going into polling mode?  Is it going to make a difference as
> far as battery life is concerned?

The switch to polling mode is decided by whether the response arrives
within one second.  It is independent of the number of interrupts.
This mechanism was introduced at a time when many devices had a broken
BIOS and the irq was indeed screwed up.

But polling mode is no dramatic change.  It only means that the
driver polls the response value itself rather than only waiting for an
ACK via irq.  Thus it makes no big difference to battery life.  The new
option, CONFIG_SND_HDA_POWER_SAVE, is a far bigger behavior change and
would influence battery life a lot more.


> I'm seeing the message:
> 
> hda_intel: azx_get_response timeout, switching to polling mode: last cmd=0x005f000c
> 
> on my X61s laptop as well, where the last_cmd varies quite a bit.
> Over the past two weeks, I've seen last cmd be:
> 
> 0x003f000c, 0x004f000c, 0x005f000c, 0x006f000c, 0x00db8000,
> 0x011b8000, 0x011ba000, 0x012ba000, 0x012f000c, 0x013f000c,
> 0x014f000d, 0x019f000c, 0x020b0001, 0x020b0003, 0x020b2000,
> 0x020b2001, 0x020b2002, 0x025f0012

Half of them should go away with my patch; those are wrong PINCAP
verbs.  The others apparently will not.

> Interestingly, when I was using a post 2.6.24-rc1 and -rc2 kernel, I
> was getting a lot of these "switching to polling mode" messages,
> usually within a minute of the machine booting.  Now that I have
> switched to a recent rc3 kernel, they seem to have largely gone away.

Interesting, indeed.

> Looking at my kernel, it looks like the patch you suggested to Roland
> was *not* applied, and "git log sound/pci/hda" shows that the only
> change to that directory was a patch from Ingo Molnar that I had
> cherry picked from LKML.  Given that we were doing a
> schedule_timeout_uninterruptible for a full second, that certainly
> seems to be a likely candidate for why we were getting the response
> timeout message!  Does this analysis make sense to you?

It might be true.  As mentioned, the threshold for switching to polling
mode is one second.  If schedule_timeout_uninterruptible(1) takes one
second, of course, this fails.  I didn't expect that schedule_timeout()
could take that long.
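
To illustrate the threshold logic discussed here, a minimal userspace
sketch follows.  It is not the driver code: wait_for_response(),
ready_after_countdown() and now_ms() are hypothetical names, and
nanosleep() merely stands in for whatever pause the real loop uses.  It
shows why a pause as long as the whole timeout leaves room for only one
or two checks before the deadline.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns true once the response has arrived. */
typedef bool (*response_ready_fn)(void *ctx);

/* Monotonic time in milliseconds. */
static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Example callback (assumption): ctx is a countdown, ready when it hits 0. */
static bool ready_after_countdown(void *ctx)
{
	int *remaining = ctx;

	return --(*remaining) <= 0;
}

/*
 * Check ready() repeatedly, pausing pause_us microseconds between checks,
 * until timeout_ms has elapsed.  Returns true if the response arrived in
 * time; false is the case where a caller would fall back to polling mode.
 */
static bool wait_for_response(response_ready_fn ready, void *ctx,
			      long long timeout_ms, long pause_us)
{
	const long long deadline = now_ms() + timeout_ms;
	const struct timespec pause = {
		.tv_sec = pause_us / 1000000,
		.tv_nsec = (pause_us % 1000000) * 1000,
	};

	assert(ready != NULL && timeout_ms >= 0 && pause_us >= 0);
	do {
		if (ready(ctx))
			return true;
		nanosleep(&pause, NULL);
	} while (now_ms() <= deadline);

	return false;
}
```

With a 1000 ms timeout and a 10 us pause the loop gets many chances to
see the response; with a pause close to the timeout it gets almost none.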

If Ingo's patch really helps in such a situation, we should get it
into 2.6.24.  It's already in the ALSA tree (perex/alsa.git mm branch),
but I didn't ask Jaroslav to include it in the push chunk.

Jaroslav, care to push again?


thanks,

Takashi

> 
> Regards,
> 
>   - Ted
> 
> commit 2f7e58208e0d59ca6e4ad1561f47391d4efa19fa
> Author: Ingo Molnar <[EMAIL PROTECTED]>
> Date:   Fri Nov 16 11:35:05 2007 -0500
> 
> snd hda suspend latency: shorten codec read
> 
> not sleeping for every codec read/write but doing a short udelay and
> a conditional reschedule has cut suspend+resume latency by about 1
> second on my T60.
> 
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
> 
> diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
> index 3fa0f97..62b9fb3 100644
> --- a/sound/pci/hda/hda_intel.c
> +++ b/sound/pci/hda/hda_intel.c
> @@ -555,7 +555,8 @@ static unsigned int azx_rirb_get_response(struct hda_codec *codec)
>   }
>   if (!chip->rirb.cmds)
>   return chip->rirb.res; /* the last value */
> - schedule_timeout_uninterruptible(1);
> + udelay(10);
> + cond_resched();
>   } while (time_after_eq(timeout, jiffies));
>  
>   if (chip->msi) {
> 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Laptop keyboard unusable when ACPI is active

2007-11-22 Thread Mats Johannesson

Many responsible CCs added.

On 2007-10-20 18:33:09 Pavel Machek responded to Daniele C.:
> > > Try disabling acpi embedded controller.
> > >
> > How can I accomplish this? Are you referring to the i8042?
>
> rmmod acpi_ec or how is it called. But I'm not sure how easy this is.

Designed to be 'hard' because it in effect breaks several functions.
See "config ACPI_EC" in drivers/acpi/Kconfig which one must edit
manually, as well as arch/*/configs/*_defconfig

Either way, the result is exactly the same as not loading the modular
battery and ac drivers (the thermal module still worked on my system).

But this whole issue is getting ridiculous. INPUT subsystem is addled
more and more by ACPI (in real laptop life) while the maintainers,
from a user view, seem stumped.

The bad interaction between the ACPI-controlled EC (embedded controller)
and the i8042 interrupt handler is theorized about in detail at OLPC's
http://dev.laptop.org/ticket/2401 - almost at the end of that page.
Thanks to Daniele C for the link.

And the bug scenario has been present for many _years_ now. From my own
experience it is only getting worse. Here is the current state of
affairs in 2.6.24-rc3-git1:

_EC-reading ACPI modules unloaded_

Notebook keyboard and Synaptics touchpad operations all perfectly OK.
Nothing related shows up in the logs.


_EC-reading ACPI modules loaded_

Keyboard keys get stuck when EC is accessed for battery, temperature
etc.

Touchpad (mouse pointer) gets stuck for up to ca 5 seconds if I eg move
a window while EC data is read.

The logs are spammed when EC screws over i8042:
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 4
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 - driver resynched.
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: TouchPad at isa0060/serio4/input0 lost sync at byte 1
psmouse.c: issuing reconnect request
[etc]

Messages about an unknown key show up (possibly in relation to a stuck
key). I don't have this on disk, so I'll copy the message from the bug
http://bugzilla.kernel.org/show_bug.cgi?id=9147
atkbd.c: Unknown key released (translated set 2, code 0xe0 on isa0060/serio0).
atkbd.c: Use 'setkeycodes e060 <keycode>' to make it known.


_EC-reading ACPI modules loaded. Command line option i8042.nomux used_

Keyboard keys _can_ get stuck when EC is accessed for battery,
temperature etc. Much less frequent than without i8042.nomux though.

Synaptics touchpad operations OK. Nothing related shows up in the logs.


I know that some kind of US holiday is in effect presently, but please
make an effort to take a coordinated look at this afterwards. The
issues are so old and widespread that even the angels cry (I'm told).

PS. No, Dmitry. Lowering the report rate, eg psmouse.rate=40 does not
fix anything, as I tried on your suggestion already in August 2006.

Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-22 Thread Hannes Reinecke

Laurent Riffard wrote:
> On 21.11.2007 23:41, Andrew Morton wrote:
>> On Wed, 21 Nov 2007 22:45:22 +0100
>> Laurent Riffard <[EMAIL PROTECTED]> wrote:
>>
>>> On 21.11.2007 05:45, Andrew Morton wrote:
>>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>>> Hello, 
>>>
>>> My system hangs shortly after I log in to the Gnome desktop. SysRq-W
>>> shows that a bunch of tasks are blocked in "D" state; they seem to be
>>> waiting for some I/O completion. I can try to hand-copy some data if
>>> requested.
>>>
>>> I found these messages in dmesg:
>>>
>>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>>> EXT3-fs: mounted filesystem with ordered data mode.
>>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sda, sector 16460
>>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>>> ReiserFS: sda7: using ordered data mode
>>> --
>>> ReiserFS: sda7: Using r5 hash to sort names
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 19632
>>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>>> driverbyte=DRIVER_OK,SUGGEST_OK
>>> end_request: I/O error, dev sdb, sector 40037363
>>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 
>>> across:1048568k
>>> lp0: using parport0 (interrupt-driven).
>>>
>>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>>
>>> Maybe something is broken in pata_via driver ?
>>>
>> Could be - 
>> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
>> and 
>> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
>> touch pata_via.c.
> 
> None of the above...
> 
> I did a bisection; it spotted git-scsi-misc.patch.
> I just ran 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.
> 
> I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
> requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
> commits are touching documentation or drivers I don't use. I'll try 
> to revert only this one this evening.
> 
Hmm. Weird. I'll have a look into it. Apparently I'll be returning an error
where I shouldn't. Checking ...

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
[EMAIL PROTECTED] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)

Re: [RFC/PATCH] Export force_sig_info

2007-11-22 Thread Andrew Morton

On Mon, 12 Nov 2007 16:36:52 +1100 Jeremy Kerr <[EMAIL PROTECTED]> wrote:

> This change allows force_sig_info to be called from modules.
> 
> Signed-off-by: Jeremy Kerr <[EMAIL PROTECTED]>
> 
> --
> 
> Any objections to exporting this symbol? I'm  planning to move some
> SPU fault-handling code from the kernel to the spufs.ko object.
> 
> ---
> 
>  kernel/signal.c |1 +
>  1 file changed, 1 insertion(+)
> 
> Index: linux-2.6-spufs/kernel/signal.c
> ===
> --- linux-2.6-spufs.orig/kernel/signal.c
> +++ linux-2.6-spufs/kernel/signal.c
> @@ -815,6 +815,7 @@ force_sig_info(int sig, struct siginfo *
>  
>   return ret;
>  }
> +EXPORT_SYMBOL_GPL(force_sig_info);
>  
>  void
>  force_sig_specific(int sig, struct task_struct *t)

Perhaps export it from within a powerpc-specific C file (along with
suitable comment) to prevent people from generally relying upon the export?


Re: [PATCH 2.6.24-rc3-mm1] IPC: make struct ipc_ids static in ipc_namespace

2007-11-22 Thread Pavel Emelyanov

Cedric Le Goater wrote:
> Pierre Peiffer wrote:
>> Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (one
>> each for msg, sem and shm; the structure is used to store all ipcs).
>> These 'struct ipc_ids' are dynamically allocated for each ipc_namespace,
>> as is the ipc_namespace itself (for the init namespace, they are
>> initialized with pointers to static variables instead).
>>
>> This is so for historical reasons: before idr was used to store the ipcs,
>> they were stored in tables of variable length, depending on the maximum
>> number of ipcs allowed.
>> Now these 'struct ipc_ids' have a fixed size. As they are allocated in any
>> case for each new ipc_namespace, there is no memory gain in allocating
>> them separately from the struct ipc_namespace.
>>
>> This patch proposes to make this table static in the struct ipc_namespace.
>> Thus, we can allocate everything at once and get rid of all the code
>> needed to allocate and free these ipc_ids separately.
> 
> It looks safe and saves quite a lot of lines.
> 
> Pavel, what do you think of it ? 

Looks sane, good catch, Pierre.

But I'd find out whether these three ipc_ids share any cache line.
In other words, I'd mark struct ipc_ids as cacheline_aligned and
check for any differences.
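
A rough userspace sketch of that check follows.  Everything in it is an
assumption for illustration: the 64-byte line size, the stand-in structs
(the real struct ipc_ids has a different layout), and the helper
elements_share_line().  C11 _Alignas plays the role that cacheline_aligned
would play in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed cache line size in bytes */

/* Hypothetical stand-in for struct ipc_ids; not the real layout. */
struct ids_demo {
	int in_use;
	unsigned short seq;
	unsigned short seq_max;
	char payload[40];
};

/* Same fields, but every instance starts on its own cache line. */
struct ids_demo_aligned {
	_Alignas(CACHE_LINE) int in_use;
	unsigned short seq;
	unsigned short seq_max;
	char payload[40];
};

/*
 * For an array of count elements of elem_size bytes whose base is
 * line-aligned, report whether two neighbouring elements ever occupy
 * the same cache line.
 */
static bool elements_share_line(size_t elem_size, size_t count)
{
	assert(elem_size > 0);
	for (size_t i = 0; i + 1 < count; i++) {
		size_t last_line_of_i = ((i + 1) * elem_size - 1) / CACHE_LINE;
		size_t first_line_of_next = ((i + 1) * elem_size) / CACHE_LINE;

		if (last_line_of_i == first_line_of_next)
			return true;
	}
	return false;
}
```

Calling elements_share_line(sizeof(struct ids_demo), 3) versus the
aligned variant shows the trade-off: alignment removes sharing at the
cost of padding each element up to a multiple of the line size.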

> Acked-by: Cedric Le Goater <[EMAIL PROTECTED]>
> 
> Thanks,

Thanks,
Pavel

> C.
> 
> 
>> Signed-off-by: Pierre Peiffer <[EMAIL PROTECTED]>
>> ---
>>  include/linux/ipc_namespace.h |   13 +++--
>>  ipc/msg.c |   26 --
>>  ipc/namespace.c   |   25 -
>>  ipc/sem.c |   26 --
>>  ipc/shm.c |   26 --
>>  ipc/util.c|6 +++---
>>  ipc/util.h|   16 
>>  7 files changed, 34 insertions(+), 104 deletions(-)
>>
>> Index: b/include/linux/ipc_namespace.h
>> ===
>> --- a/include/linux/ipc_namespace.h
>> +++ b/include/linux/ipc_namespace.h
>> @@ -2,11 +2,20 @@
>>  #define __IPC_NAMESPACE_H__
>>  
>>  #include 
>> +#include 
>> +#include 
>> +
>> +struct ipc_ids {
>> +int in_use;
>> +unsigned short seq;
>> +unsigned short seq_max;
>> +struct rw_semaphore rw_mutex;
>> +struct idr ipcs_idr;
>> +};
>>  
>> -struct ipc_ids;
>>  struct ipc_namespace {
>>  struct kref kref;
>> -struct ipc_ids  *ids[3];
>> +struct ipc_ids  ids[3];
>>  
>>  int sem_ctls[4];
>>  int used_sems;
>> Index: b/ipc/msg.c
>> ===
>> --- a/ipc/msg.c
>> +++ b/ipc/msg.c
>> @@ -67,9 +67,7 @@ struct msg_sender {
>>  #define SEARCH_NOTEQUAL 3
>>  #define SEARCH_LESSEQUAL4
>>  
>> -static struct ipc_ids init_msg_ids;
>> -
>> -#define msg_ids(ns) (*((ns)->ids[IPC_MSG_IDS]))
>> +#define msg_ids(ns) ((ns)->ids[IPC_MSG_IDS])
>>  
>>  #define msg_unlock(msq) ipc_unlock(&(msq)->q_perm)
>>  #define msg_buildid(id, seq)ipc_buildid(id, seq)
>> @@ -80,30 +78,17 @@ static int newque(struct ipc_namespace *
>>  static int sysvipc_msg_proc_show(struct seq_file *s, void *it);
>>  #endif
>>  
>> -static void __msg_init_ns(struct ipc_namespace *ns, struct ipc_ids *ids)
>> +void msg_init_ns(struct ipc_namespace *ns)
>>  {
>> -ns->ids[IPC_MSG_IDS] = ids;
>>  ns->msg_ctlmax = MSGMAX;
>>  ns->msg_ctlmnb = MSGMNB;
>>  ns->msg_ctlmni = MSGMNI;
>>  atomic_set(&ns->msg_bytes, 0);
>>  atomic_set(&ns->msg_hdrs, 0);
>> -ipc_init_ids(ids);
>> +ipc_init_ids(&ns->ids[IPC_MSG_IDS]);
>>  }
>>  
>>  #ifdef CONFIG_IPC_NS
>> -int msg_init_ns(struct ipc_namespace *ns)
>> -{
>> -struct ipc_ids *ids;
>> -
>> -ids = kmalloc(sizeof(struct ipc_ids), GFP_KERNEL);
>> -if (ids == NULL)
>> -return -ENOMEM;
>> -
>> -__msg_init_ns(ns, ids);
>> -return 0;
>> -}
>> -
>>  void msg_exit_ns(struct ipc_namespace *ns)
>>  {
>>  struct msg_queue *msq;
>> @@ -126,15 +111,12 @@ void msg_exit_ns(struct ipc_namespace *n
>>  }
>>  
>>  up_write(&msg_ids(ns).rw_mutex);
>> -
>> -kfree(ns->ids[IPC_MSG_IDS]);
>> -ns->ids[IPC_MSG_IDS] = NULL;
>>  }
>>  #endif
>>  
>>  void __init msg_init(void)
>>  {
>> -__msg_init_ns(&init_ipc_ns, &init_msg_ids);
>> +msg_init_ns(&init_ipc_ns);
>>  ipc_init_proc_interface("sysvipc/msg",
>>  "   key  msqid perms  cbytes   
>> qnum lspid lrpid   uid   gid  cuid  cgid  stime  rtime  ctime\n",
>>  IPC_MSG_IDS, sysvipc_msg_proc_show);
>> Index: b/ipc/namespace.c
>> ===
>> --- a/ipc/namespace.c
>> +++ b/ipc/namespace.c
>> @@ -14,35 +14,18 @@
>>  
>>  static struct ipc_namespace *clone_ipc_ns(struct ipc_namespace *old_ns)
>>  {
>> -int err;
>>  struct ipc_namespace

Re: Is it possible to give the user the option to cancel forkbombs?

2007-11-22 Thread AstralStorm

On Sat, 17 Nov 2007 09:55:01 -0800
Dane Mutters <[EMAIL PROTECTED]> wrote:

> I don't know if this is at all feasible, but is it possible to have a
> mechanism that would detect a fork bomb in progress and either stop the
> fork, or allow the user to cancel the operation?  For example, are there
> any legitimate processes (i.e. ones that really need to fork like crazy)
> that would need to generate 200+ processes in less than 1 second?
> 
> (Note: I'm not a programmer; I'm just throwing out the idea.)
> 

If the parent PID of the new task is exported through TASKSTATS, you can
do it already in userspace. If not, that data should be exported.

Then you could write a root daemon using netlink, set it to RT priority
and create an inheritable counter in it to thwart binary forking.
The counter would be cleared every x seconds.
No need to do it in the kernel.
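
The counting part of such a daemon could look roughly like the sketch
below.  It deliberately leaves out the netlink side: how fork events
reach the daemon is not shown, and struct fork_counter and its helpers
are hypothetical names.  The limit of 200 per second used in the usage
note simply echoes the figure from the question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-process-tree fork counter with a fixed time window. */
struct fork_counter {
	long window_start;	/* start of the current window, in seconds */
	long window_len;	/* the "x seconds" after which the count clears */
	unsigned int count;	/* forks seen in the current window */
	unsigned int limit;	/* forks tolerated per window */
};

static void fork_counter_init(struct fork_counter *fc, long window_len,
			      unsigned int limit)
{
	assert(fc != NULL && window_len > 0);
	fc->window_start = 0;
	fc->window_len = window_len;
	fc->count = 0;
	fc->limit = limit;
}

/*
 * Record one fork observed at time now (seconds).  Returns true when the
 * tree has exceeded its limit within the current window, which is the
 * point where the daemon would step in.
 */
static bool fork_counter_hit(struct fork_counter *fc, long now)
{
	assert(fc != NULL);
	if (now - fc->window_start >= fc->window_len) {
		/* Window expired: clear the counter and start a new one. */
		fc->window_start = now;
		fc->count = 0;
	}
	fc->count++;
	return fc->count > fc->limit;
}
```

Initialised with fork_counter_init(&fc, 1, 200), the first 200 forks in
a second pass, the 201st trips the check, and the counter clears once
the next second begins.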


Re: 2.6.24-rc3-mm1: I/O error, system hangs

2007-11-22 Thread Laurent Riffard


On 21.11.2007 23:41, Andrew Morton wrote:
> On Wed, 21 Nov 2007 22:45:22 +0100
> Laurent Riffard <[EMAIL PROTECTED]> wrote:
> 
>> On 21.11.2007 05:45, Andrew Morton wrote:
>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.24-rc3/2.6.24-rc3-mm1/
>> Hello, 
>>
>> My system hangs shortly after I log in to the Gnome desktop. SysRq-W
>> shows that a bunch of tasks are blocked in "D" state; they seem to be
>> waiting for some I/O completion. I can try to hand-copy some data if
>> requested.
>>
>> I found these messages in dmesg:
>>
>> ~$ grep -C2 end_request dmesg-2.6.24-rc3-mm1 
>> EXT3-fs: mounted filesystem with ordered data mode.
>> sd 0:0:0:0: [sda] Result: hostbyte=DID_NO_CONNECT 
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sda, sector 16460
>> ReiserFS: sda7: found reiserfs format "3.6" with standard journal
>> ReiserFS: sda7: using ordered data mode
>> --
>> ReiserFS: sda7: Using r5 hash to sort names
>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sdb, sector 19632
>> sd 0:0:1:0: [sdb] Result: hostbyte=DID_NO_CONNECT 
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> end_request: I/O error, dev sdb, sector 40037363
>> Adding 1048568k swap on /dev/mapper/vglinux1-lvswap.  Priority:-1 extents:1 
>> across:1048568k
>> lp0: using parport0 (interrupt-driven).
>>
>> These errors occur *only* with 2.6.24-rc3-mm1, they are 100% reproducible.
>> 2.6.24-rc3 and 2.6.24-rc2-mm1 are fine.
>>
>> Maybe something is broken in pata_via driver ?
>>
> 
> Could be - 
> libata-reimplement-ata_acpi_cbl_80wire-using-ata_acpi_gtm_xfermask.patch
> and 
> pata_amd-pata_via-de-couple-programming-of-pio-mwdma-and-udma-timings.patch
> touch pata_via.c.

None of the above...

I did a bisection; it spotted git-scsi-misc.patch.
I just ran 2.6.24-rc3-mm1 + revert-git-scsi-misc.patch, and it works fine.

I guess commit 8655a546c83fc43f0a73416bbd126d02de7ad6c0 "[SCSI] Do not 
requeue requests if REQ_FAILFAST is set" is the real culprit. The other 
commits are touching documentation or drivers I don't use. I'll try 
to revert only this one this evening.

-- 
laurent



Re: [PATCH 2.6.24-rc3-mm1] IPC: make struct ipc_ids static in ipc_namespace

2007-11-22 Thread Cedric Le Goater

Pierre Peiffer wrote:
> 
> Each ipc_namespace contains a table of 3 pointers to struct ipc_ids (one
> each for msg, sem and shm; the structure is used to store all ipcs).
> These 'struct ipc_ids' are dynamically allocated for each ipc_namespace,
> as is the ipc_namespace itself (for the init namespace, they are
> initialized with pointers to static variables instead).
> 
> This is so for historical reasons: before idr was used to store the ipcs,
> they were stored in tables of variable length, depending on the maximum
> number of ipcs allowed.
> Now these 'struct ipc_ids' have a fixed size. As they are allocated in any
> case for each new ipc_namespace, there is no memory gain in allocating
> them separately from the struct ipc_namespace.
> 
> This patch proposes to make this table static in the struct ipc_namespace.
> Thus, we can allocate everything at once and get rid of all the code
> needed to allocate and free these ipc_ids separately.

It looks safe and saves quite a lot of lines.

Pavel, what do you think of it ? 

Acked-by: Cedric Le Goater <[EMAIL PROTECTED]>

Thanks,

C.


> Signed-off-by: Pierre Peiffer <[EMAIL PROTECTED]>
> ---
>  include/linux/ipc_namespace.h |   13 +++--
>  ipc/msg.c |   26 --
>  ipc/namespace.c   |   25 -
>  ipc/sem.c |   26 --
>  ipc/shm.c |   26 --
>  ipc/util.c|6 +++---
>  ipc/util.h|   16 
>  7 files changed, 34 insertions(+), 104 deletions(-)
> 
> Index: b/include/linux/ipc_namespace.h
> ===
> --- a/include/linux/ipc_namespace.h
> +++ b/include/linux/ipc_namespace.h
> @@ -2,11 +2,20 @@
>  #define __IPC_NAMESPACE_H__
>  
>  #include 
> +#include 
> +#include 
> +
> +struct ipc_ids {
> + int in_use;
> + unsigned short seq;
> + unsigned short seq_max;
> + struct rw_semaphore rw_mutex;
> + struct idr ipcs_idr;
> +};
>  
> -struct ipc_ids;
>  struct ipc_namespace {
>   struct kref kref;
> - struct ipc_ids  *ids[3];
> + struct ipc_ids  ids[3];
>  
>   int sem_ctls[4];
>   int used_sems;
> Index: b/ipc/msg.c
> ===
> --- a/ipc/msg.c
> +++ b/ipc/msg.c
> @@ -67,9 +67,7 @@ struct msg_sender {
>  #define SEARCH_NOTEQUAL  3
>  #define SEARCH_LESSEQUAL 4
>  
> -static struct ipc_ids init_msg_ids;
> -
> -#define msg_ids(ns)  (*((ns)->ids[IPC_MSG_IDS]))
> +#define msg_ids(ns)  ((ns)->ids[IPC_MSG_IDS])
>  
>  #define msg_unlock(msq)  ipc_unlock(&(msq)->q_perm)
>  #define msg_buildid(id, seq) ipc_buildid(id, seq)
> @@ -80,30 +78,17 @@ static int newque(struct ipc_namespace *
>  static int sysvipc_msg_proc_show(struct seq_file *s, void *it);
>  #endif
>  
> -static void __msg_init_ns(struct ipc_namespace *ns, struct ipc_ids *ids)
> +void msg_init_ns(struct ipc_namespace *ns)
>  {
> - ns->ids[IPC_MSG_IDS] = ids;
>   ns->msg_ctlmax = MSGMAX;
>   ns->msg_ctlmnb = MSGMNB;
>   ns->msg_ctlmni = MSGMNI;
>   atomic_set(&ns->msg_bytes, 0);
>   atomic_set(&ns->msg_hdrs, 0);
> - ipc_init_ids(ids);
> + ipc_init_ids(&ns->ids[IPC_MSG_IDS]);
>  }
>  
>  #ifdef CONFIG_IPC_NS
> -int msg_init_ns(struct ipc_namespace *ns)
> -{
> - struct ipc_ids *ids;
> -
> - ids = kmalloc(sizeof(struct ipc_ids), GFP_KERNEL);
> - if (ids == NULL)
> - return -ENOMEM;
> -
> - __msg_init_ns(ns, ids);
> - return 0;
> -}
> -
>  void msg_exit_ns(struct ipc_namespace *ns)
>  {
>   struct msg_queue *msq;
> @@ -126,15 +111,12 @@ void msg_exit_ns(struct ipc_namespace *n
>   }
>  
>   up_write(&msg_ids(ns).rw_mutex);
> -
> - kfree(ns->ids[IPC_MSG_IDS]);
> - ns->ids[IPC_MSG_IDS] = NULL;
>  }
>  #endif
>  
>  void __init msg_init(void)
>  {
> - __msg_init_ns(&init_ipc_ns, &init_msg_ids);
> + msg_init_ns(&init_ipc_ns);
>   ipc_init_proc_interface("sysvipc/msg",
>   "   key  msqid perms  cbytes   
> qnum lspid lrpid   uid   gid  cuid  cgid  stime  rtime  ctime\n",
>   IPC_MSG_IDS, sysvipc_msg_proc_show);
> Index: b/ipc/namespace.c
> ===
> --- a/ipc/namespace.c
> +++ b/ipc/namespace.c
> @@ -14,35 +14,18 @@
>  
>  static struct ipc_namespace *clone_ipc_ns(struct ipc_namespace *old_ns)
>  {
> - int err;
>   struct ipc_namespace *ns;
>  
> - err = -ENOMEM;
>   ns = kmalloc(sizeof(struct ipc_namespace), GFP_KERNEL);
>   if (ns == NULL)
> - goto err_mem;
> + return ERR_PTR(-ENOMEM);
>  
> - err = sem_init_ns(ns);
> - if (err)
> - goto err_sem;
> - err = msg_init_ns(ns);
> - if (err)
> - goto err_msg;
> -

Re: sata NCQ blacklist entry

2007-11-22 Thread Andrew Morton

On Tue, 13 Nov 2007 21:55:15 +0100 Jan-Simon Möller <[EMAIL PROTECTED]> wrote:

> Hi!

You removed from cc the guys who are most likely to fix this.  Please always
do reply-to-all.

> Just using kernel 2.6.24-rc2 (325d22df7b19e0116aff3391d3a03f73d0634ded).
> 
> When booting, the system hangs; using the emergency-sync a couple of times
> gets the system to go on at some point.
> It's always around starting X/Firewall (can't actually say what's done at
> this moment).
> 
> Looking at dmesg i found this (NCQ disabled ...)
> 
> dmesg output
> 
> ata1.00: exception Emask 0x2 SAct 0x4 SErr 0x0 action 0x2 frozen
> ata1.00: spurious completions during NCQ issue=0x0 SAct=0x4 
> FIS=004040a1:0002
> ata1.00: cmd 61/08:10:bc:b2:5d/00:00:08:00:00/40 tag 2 cdb 0x0 data 4096 out
>  res 40/00:14:bc:b2:5d/00:00:08:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1: soft resetting link
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata1.00: configured for UDMA/133
> ata1: EH complete
> sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
> DPO or FUA
> ata1.00: NCQ disabled due to excessive errors
> ata1.00: exception Emask 0x2 SAct 0xfffd3 SErr 0x0 action 0x2 frozen
> ata1.00: spurious completions during NCQ issue=0x0 SAct=0xfffd3 
> FIS=004040a1:0020
> ata1.00: cmd 60/10:00:14:05:69/00:00:06:00:00/40 tag 0 cdb 0x0 data 8192 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:08:cc:dd:12/00:00:04:00:00/40 tag 1 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/18:20:84:54:a3/00:00:05:00:00/40 tag 4 cdb 0x0 data 12288 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:30:bc:b1:07/00:00:06:00:00/40 tag 6 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:38:dc:b1:07/00:00:06:00:00/40 tag 7 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/10:40:3c:b2:07/00:00:06:00:00/40 tag 8 cdb 0x0 data 8192 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:48:2c:cf:07/00:00:06:00:00/40 tag 9 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:50:3c:cf:07/00:00:06:00:00/40 tag 10 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:58:64:b4:12/00:00:04:00:00/40 tag 11 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/28:60:6c:b4:12/00:00:04:00:00/40 tag 12 cdb 0x0 data 20480 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/f8:68:7c:e1:07/00:00:06:00:00/40 tag 13 cdb 0x0 data 126976 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:70:a4:04:69/00:00:06:00:00/40 tag 14 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/18:78:b4:04:69/00:00:06:00:00/40 tag 15 cdb 0x0 data 12288 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:80:a4:5c:6a/00:00:06:00:00/40 tag 16 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/20:88:2c:05:69/00:00:06:00:00/40 tag 17 cdb 0x0 data 16384 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/30:90:54:05:69/00:00:06:00:00/40 tag 18 cdb 0x0 data 24576 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1.00: cmd 60/08:98:94:05:69/00:00:06:00:00/40 tag 19 cdb 0x0 data 4096 in
>  res 40/00:04:14:05:69/00:00:06:00:00/40 Emask 0x2 (HSM violation)
> ata1.00: status: { DRDY }
> ata1: soft resetting link
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata1.00: configured for UDMA/133
> ata1: EH complete
> sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors (250059 MB)
> sd 0:0:0:0: [sda] Write Protect is off
> sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support 
> DPO or FUA

Re: [RFC] Documentation about unaligned memory access

2007-11-22 Thread dean gaudet

On Fri, 23 Nov 2007, Alan Cox wrote:

> Its usually faster if you don't misalign on x86 as well.

i'm not sure if i agree with "usually"... but i know you (alan) are 
probably aware of the exact requirements of the hw.

for everyone else:

on intel x86 processors an access is unaligned only if it crosses a 
cacheline boundary (64 bytes).  otherwise it's aligned.  the penalty for 
crossing a cacheline boundary varies from ~12 cycles (core2) to many 
dozens of cycles (p4).

on AMD x86 pre-family 10h the boundary is 8 bytes, and on fam 10h it's 16 
bytes.  the penalty is a mere 3 cycles if an access crosses the specified 
boundary.

if you're making <= 4 byte accesses i recommend not worrying about 
alignment on x86.  it's pretty hard to beat the hardware support.
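
for anyone who wants to check a particular access against the boundaries
listed above, the arithmetic can be sketched as below.  crosses_boundary()
is a made-up helper name, and the boundary values you pass in (64, 8, 16)
are just the figures quoted in this mail, not something the code detects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if an access of size bytes starting at addr touches two
 * different boundary-sized blocks, i.e. crosses a boundary.
 * boundary must be a power of two (64 for a cacheline, 8 or 16 for the
 * AMD cases mentioned above).
 */
static bool crosses_boundary(uintptr_t addr, size_t size, size_t boundary)
{
	assert(size > 0);
	assert(boundary > 0 && (boundary & (boundary - 1)) == 0);

	/* Compare the block index of the first and the last byte accessed. */
	return (addr / boundary) != ((addr + size - 1) / boundary);
}
```

a 4-byte load at offset 60 stays inside one 64-byte line, while the same
load at offset 62 spills into the next one.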

i curse all the RISC and embedded processor designers who pretend 
unaligned accesses are something evil and to be avoided.  in case you're 
worried, MIPS patent 4,814,976 expired in december 2006 :)

-dean

RE: [RFC/PATCH] SO_NO_CHECK for IPv6

2007-11-22 Thread David Schwartz


> Regardless of whatever verifications your application is doing
> on the data, it is not checksumming the ports and that's what
> the pseudo-header is helping with.

So what? We are in the case where the data has already gotten to him. If it
got to him in error, he'll reject it anyway. The receive checksum check will
only reject packets that he would reject anyway. That makes it needless.

Of course, if the check is nearly free, there's no potential win, so no
point in bothering.

DS



Re: 2.6.24-rc3-mm1

2007-11-22 Thread Kirill A. Shutemov

On [Fri, 23.11.2007 01:48], Thomas Gleixner wrote:
> On Thu, 22 Nov 2007, Andrew Morton wrote:
> 
> > On Thu, 22 Nov 2007 12:22:05 +0200 "Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:
> > 
> > > On x86_64 'uname -m' returns 'x86'.  It breaks many userspace programs,
> > > apt and rpm for example.
> > > 
> > 
> > Yes, there have been various discussions about this.  I think Sam is
> > cooking up a fix?
> 
> http://lkml.org/lkml/2007/11/19/323
> 
> I'll push it Linus-wards ASAP.
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 116b03a..7aa1dc6 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -11,10 +11,9 @@ endif
 $(srctree)/arch/x86/Makefile%: ;
 
 ifeq ($(CONFIG_X86_32),y)
+UTS_MACHINE := i386
 include $(srctree)/arch/x86/Makefile_32
 else
+UTS_MACHINE := x86_64
 include $(srctree)/arch/x86/Makefile_64
 endif

Many programs expect i686 on Pentium II.
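
As a rough illustration of what such programs do, here is a userspace
sketch.  get_machine() and machine_is_known_x86() are hypothetical
helpers, not code from apt or rpm; only uname(2) and struct utsname are
real interfaces.  A tool written like this accepts "i686" or "x86_64"
but rejects a bare "x86".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/utsname.h>

/* Copy the machine string reported by uname(2) into buf; 0 on success. */
static int get_machine(char *buf, size_t len)
{
	struct utsname u;

	assert(buf != NULL && len > 0);
	if (uname(&u) < 0)
		return -1;
	strncpy(buf, u.machine, len - 1);
	buf[len - 1] = '\0';
	return 0;
}

/*
 * Hypothetical check of the kind a packaging tool might make:
 * accept "x86_64" and "i386" through "i686", nothing else.
 */
static bool machine_is_known_x86(const char *m)
{
	assert(m != NULL);
	if (strcmp(m, "x86_64") == 0)
		return true;
	return strlen(m) == 4 && m[0] == 'i' &&
	       m[1] >= '3' && m[1] <= '6' && m[2] == '8' && m[3] == '6';
}
```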

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/



Re: 2.6.24-rc3-mm1

2007-11-22 Thread Gabriel C

Andrew Morton wrote:
> On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <[EMAIL PROTECTED]> wrote:
> 
>> I have some warnings on each SCSI disc:
>>
>>
>> ...
>>
>> [   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   
>> 0109 PQ: 0 ANSI: 3
>> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
>> [   30.724435]  target0:0:0: Beginning Domain Validation
>> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
>> [   30.724572]  target0:0:0: Ending Domain Validation
>> [   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP
>> 0114 PQ: 0 ANSI: 4
>> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
>> [   30.729771]  target0:0:1: Beginning Domain Validation
>> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
>> [   30.729908]  target0:0:1: Ending Domain Validation
>>
> 
> Don't know what would have caused that.  But yes, something is wrong in
> scsi land.

Actually I'm lucky the author didn't fix that FIXME in scsi_transport_spi.c and 
I still can boot ;)

> 
>> No idea whether this is related, but buffered disk reads are 2.XX MB/sec and
>> the box is somewhat laggy.
>>
>> hdparm -t on sda and sdb reports :
>>
>> /dev/sda:
>>  Timing buffered disk reads:8 MB in  3.26 seconds =   2.46 MB/sec
>>
>> /dev/sdb:
>>  Timing buffered disk reads:8 MB in  3.56 seconds =   2.25 MB/sec
>>
>> My IDE discs are fine.
>>
>> Please let me know if you need my config or any other information.
>>
> 
> And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.
> 

I found the commit which causes these problems. It is in the git-scsi-misc
patch, and reverting it fixes both problems for me.

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff_plain;h=8655a546c83fc43f0a73416bbd126d02de7ad6c0;hp=5bc717b6bdaaf52edf365eb7d9d8c89fec79df5d



[Patch] mm/sparse.c: Improve the error handling for sparse_add_one_section()

2007-11-22 Thread WANG Cong


Improve the error handling for mm/sparse.c::sparse_add_one_section().
And I see no reason to check 'usemap' until holding the
'pgdat_resize_lock'. If someone knows, please let me know.

Note! This patch is _not_ tested yet, since it seems that I can't
configure sparse memory for i386 box. Sorry for this. ;(
I hope someone can help me to test it.

Cc: Christoph Lameter <[EMAIL PROTECTED]>
Cc: Dave Hansen <[EMAIL PROTECTED]>
Cc: Rik van Riel <[EMAIL PROTECTED]>
Signed-off-by: WANG Cong <[EMAIL PROTECTED]>

---
 mm/sparse.c |   17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

Index: linux-2.6/mm/sparse.c
===
--- linux-2.6.orig/mm/sparse.c
+++ linux-2.6/mm/sparse.c
@@ -391,9 +391,17 @@ int sparse_add_one_section(struct zone *
 * no locking for this, because it does its own
 * plus, it does a kmalloc
 */
-   sparse_index_init(section_nr, pgdat->node_id);
+   ret = sparse_index_init(section_nr, pgdat->node_id);
+   if (ret < 0)
+   return ret;
memmap = kmalloc_section_memmap(section_nr, pgdat->node_id, nr_pages);
+   if (!memmap)
+   return -ENOMEM;
usemap = __kmalloc_section_usemap();
+   if (!usemap) {
+   __kfree_section_memmap(memmap, nr_pages);
+   return -ENOMEM;
+   }
 
pgdat_resize_lock(pgdat, );
 
@@ -403,18 +411,13 @@ int sparse_add_one_section(struct zone *
goto out;
}
 
-   if (!usemap) {
-   ret = -ENOMEM;
-   goto out;
-   }
ms->section_mem_map |= SECTION_MARKED_PRESENT;
 
ret = sparse_init_one_section(ms, section_nr, memmap, usemap);
 
 out:
pgdat_resize_unlock(pgdat, );
-   if (ret <= 0)
-   __kfree_section_memmap(memmap, nr_pages);
+
return ret;
 }
 #endif

[GIT PULL] sh updates for 2.6.24-rc4

2007-11-22 Thread Paul Mundt

Please pull from:

master.kernel.org:/pub/scm/linux/kernel/git/lethal/sh-2.6.24.git

Which contains:

Heiko Schocher (1):
  sh: Fix copy_{to,from}_user_page() with cache disabled.

Magnus Damm (3):
  sh: fix R2D-1 CF support
  sh: include ax88796 in the defconfig for r7780mp
  sh: include ax88796 in the defconfig for r7785rp

Paul Mundt (4):
  sh: Kill off UTLB flush in fast-path.
  sh: lockless UTLB miss fast-path.
  sh: Update mailing list info.
  fb: Orphan imsttfb.

 MAINTAINERS   |   16 +--
 arch/sh/boards/renesas/rts7751r2d/setup.c |2 +
 arch/sh/configs/r7780mp_defconfig |  287 +
 arch/sh/configs/r7785rp_defconfig |   10 +-
 arch/sh/mm/fault.c|   33 +---
 include/asm-sh/cacheflush.h   |2 +-
 6 files changed, 108 insertions(+), 242 deletions(-)

Re: [PATCH] PPC: CHRP - fix possible NULL pointer dereference

2007-11-22 Thread Cyrill Gorcunov

Here is the updated version
---
From: Cyrill Gorcunov <[EMAIL PROTECTED]>

This patch fixes a possible NULL pointer dereference
inside strncmp() if of_get_property() fails.

Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]>
---
 arch/powerpc/platforms/chrp/setup.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/chrp/setup.c b/arch/powerpc/platforms/chrp/setup.c
index 5930626..da4e45e 100644
--- a/arch/powerpc/platforms/chrp/setup.c
+++ b/arch/powerpc/platforms/chrp/setup.c
@@ -115,7 +115,7 @@ void chrp_show_cpuinfo(struct seq_file *m)
seq_printf(m, "machine\t\t: CHRP %s\n", model);
 
/* longtrail (goldengate) stuff */
-   if (!strncmp(model, "IBM,LongTrail", 13)) {
+   if (model && !strncmp(model, "IBM,LongTrail", 13)) {
/* VLSI VAS96011/12 `Golden Gate 2' */
/* Memory banks */
sdramen = (in_le32(gg2_pci_config_base + GG2_PCI_DRAM_CTRL)
@@ -203,15 +203,20 @@ static void __init sio_fixup_irq(const char *name, u8 device, u8 level,
 static void __init sio_init(void)
 {
struct device_node *root;
+   const char *model;
 
-   if ((root = of_find_node_by_path("/")) &&
-   !strncmp(of_get_property(root, "model", NULL),
-   "IBM,LongTrail", 13)) {
+   root = of_find_node_by_path("/");
+   if (!root)
+   return;
+
+   model = of_get_property(root, "model", NULL);
+   if (model && !strncmp(model,"IBM,LongTrail", 13)) {
/* logical device 0 (KBC/Keyboard) */
sio_fixup_irq("keyboard", 0, 1, 2);
/* select logical device 1 (KBC/Mouse) */
sio_fixup_irq("mouse", 1, 12, 2);
}
+
of_node_put(root);
 }
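
The guard added by the patch can be tried out in userspace. The following is a minimal sketch, not the kernel code: `is_longtrail` is a hypothetical helper, and its `model` argument stands in for the pointer returned by of_get_property(), which may be NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `model` names a LongTrail machine,
 * 0 otherwise.  `model` may be NULL, as when a property lookup finds
 * nothing. */
int is_longtrail(const char *model)
{
	static const char prefix[] = "IBM,LongTrail";

	/* Test the pointer first: && short-circuits, so strncmp() is
	 * never called with a NULL argument. */
	return model != NULL &&
	       strncmp(model, prefix, sizeof(prefix) - 1) == 0;
}
```

Here sizeof(prefix) - 1 evaluates to 13, the same length the patch passes to strncmp().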

Re: [linux-usb-devel] 2.6.24-rc3-mm1: usb mouse doesn't work

2007-11-22 Thread Kirill A. Shutemov

On [Thu, 22.11.2007 21:51], Alan Stern wrote:
> On Thu, 22 Nov 2007, Marin Mitov wrote:
> 
> > > > > I've had some strangenesses with USB lately.  Sometimes running 
> > > > > `lsusb'
> > > > > makes the USB system notice a newly attached device.
> > > >
> > > > No. But I have new messages in dmesg:
> > > >
> > > > uhci_hcd :00:1d.3: FGR not stopped yet!
> > > > uhci_hcd :00:1d.2: FGR not stopped yet!
> > > > uhci_hcd :00:1d.1: FGR not stopped yet!
> > > > uhci_hcd :00:1d.0: FGR not stopped yet!
> > > >
> > > > > Is that "FGR not stopped yet!" message new behaviour?
> > > >
> > > > It is a new message since 2.6.24-rc3. I have never tried the -mm tree before.
> > >
> > > These messages could indicate a timing problem.  You can see the code
> > > that writes the messages near the end of wakeup_rh() in
> > > drivers/usb/host/uhci-hcd.c.
> > >
> > > The message gets written if the controller hardware hasn't turned off a
> > > particular bit after a 4-us delay.  If the udelay() function wasn't
> > > working right, it could cause this problem.
> > 
> > udelay() _is_ OK for 2.6.24-rc3, so it is not the cause of the problem
> 
> But is it OK for 2.6.24-rc3-mm1?  Kirill said specifically that 
> 2.6.24-rc3 does not display the message but 2.6.24-rc3-mm1 does.

How can I test it?

-- 
Regards,  Kirill A. Shutemov
 + Belarus, Minsk
 + Velesys LLC, http://www.velesys.com/
 + ALT Linux Team, http://www.altlinux.com/



Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Theodore Tso

On Thu, Nov 22, 2007 at 07:56:20PM +, Alan Cox wrote:
> > probably principle of least privilege; the location on physical media
> > for a file is clearly something internal to the OS, and non-trusted
> > users normally don't have any business knowing that. 
> 
> FIBMAP isn't correctly locked against misuse, and that requires that FIBMAP is
> safe against truncate and relocation. There was a thread on l/k about this
> a month ago or so.
> 
> It's also the wrong API (32bit, no notion of extents, compression etc)

The right approach would be to create a new syscall, and a new entry
point in the inode operations table, and filesystems could provide
support for the new system call as their bmap code was audited for
correctness.  

For bonus points, the new interface would also make it more
efficient for filesystems to return information about extents.  (i.e.,
not only is logical block 150 mapped to physical block 5550, it is
part of a 200-block extent that maps logical block 0 to physical
block 5400.)
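
To make the extent arithmetic concrete, here is a minimal sketch in C of a block lookup within a single extent. The names `struct extent_map` and `extent_lookup` are hypothetical and do not correspond to any existing kernel interface; only the numbers are taken from the example above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one extent: a run of `length` logical
 * blocks starting at `logical`, stored contiguously from `physical`.
 * 64-bit fields avoid the 32-bit limit of FIBMAP. */
struct extent_map {
	uint64_t logical;   /* first logical block of the extent */
	uint64_t physical;  /* physical block backing `logical` */
	uint64_t length;    /* number of blocks in the extent */
};

/* Translate a logical block to a physical block within one extent.
 * Returns 0 and stores the result in *out on success; returns -1 if
 * the arguments are invalid or the block lies outside the extent. */
int extent_lookup(const struct extent_map *ext, uint64_t block,
		  uint64_t *out)
{
	if (ext == NULL || out == NULL)
		return -1;
	if (block < ext->logical || block - ext->logical >= ext->length)
		return -1;
	*out = ext->physical + (block - ext->logical);
	return 0;
}
```

With an extent of { 0, 5400, 200 }, looking up logical block 150 yields physical block 5550, and one answer from the filesystem covers all 200 blocks instead of requiring 200 separate queries.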

- Ted

Unkillable gdb process gets system unusably slow

2007-11-22 Thread Timo Sirainen

Fully reproducible with me. v2.6.23.1 x86-64 SMP kernel, Core 2 CPU, gdb
v6.6.90.20070912-debian.

gdb ./hang
run
fr 1
p (char*)base

p command hangs and the entire system becomes unusably slow. kill -9
doesn't kill gdb.

/* gcc hang.c -o hang -g -Wall */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>

int main(void)
{
	int fd;
	char buf[100];
	void *base;

	fd = open("hang.tmp", O_RDWR | O_CREAT, 0600);
	if (fd == -1)
		perror("open");

	base = mmap(NULL, 100, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
	if (base == MAP_FAILED)
		perror("mmap");
	memcpy(buf, base, sizeof(buf));
	return 0;
}




Re: Where is the interrupt going?

2007-11-22 Thread niessner



Quite right. I read it too quickly and thought it had succeeded when  
it had failed. I will modify the module to do the shared IRQ and then  
try the noapic test again. Exactly why I reserved the right to do it  
again.


This is good because it means the hammer may work after all.

Thank you very much and I will post to let you know the outcome.

Quoting Marin Mitov <[EMAIL PROTECTED]>, on Thu 22 Nov 2007 07:18:01 PM PST:


Hi,

On Friday 23 November 2007 02:48:53 am you wrote:

I tried the hammer and the problem persists.
[EMAIL PROTECTED]:~$ cat /proc/cmdline
root=UUID=8b3c3666-22c3-4c04-b399-ece266f2ef30 ro noapic quiet splash

However, I reserve the right to try the hammer again in the future.
When I look at /proc/interrupts without the APIC:
[EMAIL PROTECTED]:~$ cat /proc/interrupts
           CPU0
  0:        144    XT-PIC-XT        timer
  1:         10    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  5:         10    XT-PIC-XT        ohci_hcd:usb5, mxser
  6:          5    XT-PIC-XT        floppy
  7:          1    XT-PIC-XT        parport0
  8:          3    XT-PIC-XT        rtc
  9:          1    XT-PIC-XT        acpi, uhci_hcd:usb2
 10:         10    XT-PIC-XT        ohci_hcd:usb4, ehci_hcd:usb6, [EMAIL PROTECTED]::01:00.0
 11:       2231    XT-PIC-XT        uhci_hcd:usb1, ohci_hcd:usb3, eth0
 12:        130    XT-PIC-XT        i8042
 14:       4362    XT-PIC-XT        libata
 15:      15315    XT-PIC-XT        libata
NMI:  0
LOC: 130125
ERR:  0
MIS:  0

I do not even see the device that I registered unless it is that
r128... line. However the code printed out in /var/log/messages:


No, this is your radeon 128 board (on AGP I suppose). Could be integrated
on the mobo if it is a server mobo.


Nov 22 16:05:27 bbb kernel: [  104.712473] apc8620: VID = 0x10B5
Nov 22 16:05:27 bbb kernel: [  104.712486] apc8620: mapped addr = e0bd4000
Nov 22 16:05:27 bbb kernel: [  104.713022] apc8620: registered carrier 0
Nov 22 16:05:27 bbb kernel: [  104.713028] apc8620: interrupt data
(0xe1083e40) on irq (10) and status (0x10)


Here is the problem (I suppose):
if status (0x10 hex or 16 decimal) is the value returned by request_irq:
status = request_irq (apcsi[i].board_irq,
  apc8620_handler,
  IRQF_DISABLED,
  DEVICE_NAME,
  (void*)[i]);
(from your first post), that means the irq is NOT registered, because
according to the LDD v.3 book:

The value returned from request_irq to the requesting function is either 0
to indicate success or a negative error code, as usual. It’s not uncommon
for the function to return -EBUSY to signal that another driver is already
using the requested interrupt line.

If you grep the kernel's include directory for EBUSY you will find:
#define EBUSY           16      /* Device or resource busy */
in include/asm-generic/errno-base.h

So I think your mobo has an irq line shared with other devices on the
PCI/PCIe slot you use for your hardware, and these other devices have
already registered shared irq handlers for the same irq (10), so the
attempt to register a nonshared irq fails.

Either try to register the irq as shared, or put the hardware in
another slot whose irq line is not shared with other devices
(if such a slot exists). This info should be available in the mobo
manual.


which indicates it successfully registered without being shared.


No, as I already explained.
The only problem :-) in my explanation is:
request_irq returns EBUSY (not -EBUSY as should be)

Marin Mitov


When
I have more time, I will change the code to be a shared IRQ and try
the noapic again.

However, without the noapic /proc/interrupts looks like:
[EMAIL PROTECTED]:~$ cat /proc/interrupts
CPU0
   0:154   IO-APIC-edge  timer
   1: 10   IO-APIC-edge  i8042
   6:  5   IO-APIC-edge  floppy
   7:  0   IO-APIC-edge  parport0
   8:  3   IO-APIC-edge  rtc
   9:  1   IO-APIC-fasteoi   acpi
  10:  0   IO-APIC-edge  apc8620
  12:130   IO-APIC-edge  i8042
  14:   2861   IO-APIC-edge  libata
  15:   1049   IO-APIC-edge  libata
  16: 11   IO-APIC-fasteoi   ohci_hcd:usb5, mxser
  17:  0   IO-APIC-fasteoi   uhci_hcd:usb1, ohci_hcd:usb3
  18:  0   IO-APIC-fasteoi   uhci_hcd:usb2
  19:187   IO-APIC-fasteoi   eth0
  20:  0   IO-APIC-fasteoi   ohci_hcd:usb4, [EMAIL PROTECTED]::01:00.0
  21:  0   IO-APIC-fasteoi   ehci_hcd:usb6
NMI:  0
LOC:   8820
ERR:  0
MIS:  0


I have attached the kernel module. The apc8620 is an IndustryPack
carrier card. I can therefore open up N (in this specific case 5) sub
memory windows in the memory mapped PCI address. The kernel module
keeps track of the slot offsets from the

Chelsio driver bug in offload mode

2007-11-22 Thread TEJ

hey
I am using the Chelsio offload stack to transfer data between two systems
connected back to back.
I am using a lighttpd server and an HTTP browser. After the three-way
handshake completes successfully and data transfer starts, I get the
following warning and then the socket is closed.

cxgb3 :01:00.0 CIM SDRAM address out of range (0x2)
cxgb3 :01:00.0 encountered fatal error. operation suspended.
cxgb3 :01:00.0 FW status: 0x0,0x0,0x0,0x0

After that I don't get anything on the browser side.

While debugging the problem, I found that the bug appears when the skb is
nonlinear and skb->len becomes larger than the MTU.
I then tried to linearize the skb before sending it, but the other side
then shows garbage HTML data.

I am not able to debug the problem clearly. What am I doing wrong?

thanks

Re: [PATCH] PPC: CHRP - fix possible NULL pointer dereference

2007-11-22 Thread Cyrill Gorcunov

On 11/23/07, Stephen Rothwell <[EMAIL PROTECTED]> wrote:
> On Thu, 22 Nov 2007 22:54:23 +0300 Cyrill Gorcunov <[EMAIL PROTECTED]>
> wrote:
> >
> > This patch does fix possible NULL pointer dereference
> > inside of strncmp() if of_get_property() failed.
>
> Thanks for this.
>
> >  static void __init sio_init(void)
> >  {
> >   struct device_node *root;
> > + const char *model = NULL;
>
> You don't need this initialization as you always assign the variable
> before you use it.
>
> > + root = of_find_node_by_path("/");
> > + if (root) {
>
>  if (!root)
>   return;
>
> would save a level of indentation. Not important.
>
> --
> Cheers,
> Stephen Rothwell[EMAIL PROTECTED]
> http://www.canb.auug.org.au/~sfr/
>
Oh my :) Thanks. I'll fix it and resend.

Re: 2.6.24-rc3-mm1

2007-11-22 Thread Andrew Morton

On Fri, 23 Nov 2007 02:39:08 +0100 Gabriel C <[EMAIL PROTECTED]> wrote:

> I have some warnings on each SCSI disc:
> 
> 
> ...
> 
> [   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   0109 
> PQ: 0 ANSI: 3
> [   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
> [   30.724435]  target0:0:0: Beginning Domain Validation
> [   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
> [   30.724572]  target0:0:0: Ending Domain Validation
> [   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP0114 
> PQ: 0 ANSI: 4
> [   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
> [   30.729771]  target0:0:1: Beginning Domain Validation
> [   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
> [   30.729908]  target0:0:1: Ending Domain Validation
> 

Don't know what would have caused that.  But yes, something is wrong in
scsi land.

> 
> No idea whether this is related, but buffered disk reads are 2.XX MB/sec and
> the box is somewhat laggy.
> 
> hdparm -t on sda and sdb reports :
> 
> /dev/sda:
>  Timing buffered disk reads:8 MB in  3.26 seconds =   2.46 MB/sec
> 
> /dev/sdb:
>  Timing buffered disk reads:8 MB in  3.56 seconds =   2.25 MB/sec
> 
> My IDE discs are fine.
> 
> Please let me know if you need my config or any other information.
> 

And you're the second to report very slow scsi throughput in 2.6.24-rc3-mm1.

Re: [PATCH 2/9]: Reduce Log I/O latency

2007-11-22 Thread David Chinner

On Fri, Nov 23, 2007 at 03:53:17AM +0100, Andi Kleen wrote:
> On Fri, Nov 23, 2007 at 12:15:39AM +1100, David Chinner wrote:
> > On Thu, Nov 22, 2007 at 01:06:11PM +0100, Andi Kleen wrote:
> > > > FWIW from a "real time" database POV this seems to make sense to me...
> > > > in fact, we probably rely on filesystem metadata way too much
> > > > (historically it's just "worked" although we do seem to get issues
> > > > on ext3).
> > > 
> > > For that case you really would need priority inheritance: any metadata
> > > IO on behalf or blocking a process needs to use the process' block IO 
> > > priority.
> > 
> > > How do you do that when the processes are blocking on semaphores,
> > > mutexes or rw-semaphores in the filesystem three layers removed from
> > > the I/O in progress?
> 
> [...] I didn't say it was easy (or rather, I explicitly said it would be
> tricky).
> Probably it would be possible to fold it somehow into rt mutexes PI,
> but it's not easy and semaphores would need to be handled too.
> 
> My point was just that solving the metadata RT problem by unconditionally
> increasing the priority is a bad idea and not really a replacement for a
> "full" solution. Short term, a user can just increase the priority of all
> the XFS threads anyway.

The point is that it's not actually a thread-based problem - the priority
can't be inherited in the traditional mutex-like manner. There is no
connection between a thread and an I/O it has already issued, and so you
can't transfer a priority from a blocked thread to an issued-but-blocked
I/O.

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: [PATCH] dmaengine: Driver for the AVR32 DMACA controller

2007-11-22 Thread David Brownell

On Tuesday 20 November 2007, Haavard Skinnemoen wrote:
> This patch depends on "DMA: Correct invalid assumptions in the Kconfig
> text" (without the part that adds AVR32 to the dependency list) and
> "DMAENGINE: Convert from class_device to device".

That regression fix still doesn't seem to be merged, or
even in the MM tree.

Here's a tweaked version of what Haavard sent.

- Dave

CUT HERE
From: Haavard Skinnemoen <[EMAIL PROTECTED]>

This patch corrects recently changed (and now invalid) Kconfig
descriptions for the DMA engine framework:

 - Non-Intel(R) hardware also has DMA engines;
 - DMA is used for more than memcpy and RAID offloading.
 
In fact, on most platforms memcpy and RAID aren't factors, and DMA
exists so that peripherals can transfer data to/from memory while
the CPU does other work.

Signed-off-by: Haavard Skinnemoen <[EMAIL PROTECTED]>
Signed-off-by: David Brownell <[EMAIL PROTECTED]>
---
 drivers/dma/Kconfig |8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

--- g26.orig/drivers/dma/Kconfig2007-10-30 23:58:27.0 -0700
+++ g26/drivers/dma/Kconfig 2007-11-22 17:43:33.0 -0800
@@ -3,11 +3,13 @@
 #
 
 menuconfig DMADEVICES
-   bool "DMA Offload Engine support"
+   bool "DMA Engine support"
depends on (PCI && X86) || ARCH_IOP32X || ARCH_IOP33X || ARCH_IOP13XX
help
- Intel(R) offload engines enable offloading memory copies in the
- network stack and RAID operations in the MD driver.
+ DMA engines can do asynchronous data transfers without
+ involving the host CPU.  Currently, this framework can be
+ used to offload memory copies in the network stack and
+ RAID operations in the MD driver.
 
 if DMADEVICES
 


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-22 Thread Rusty Russell

On Friday 23 November 2007 12:36:22 Andi Kleen wrote:
> On Friday 23 November 2007 01:25, Rusty Russell wrote:
> > That's my point.  If there's a whole class of modules which can use a
> > symbol, why are we ruling out external modules?
>
> The point is to get cleaner interfaces.

But this doesn't change interfaces at all.  It makes modules fail to load 
unless they're on a permitted list, which now requires maintenance.

> Anything which is kind of internal 
> should only be used by closely related in tree modules which can be
> updated.

Is there evidence that this is a problem for us?  Are there any interfaces 
you've restricted so far which are causing problems?

> The point is not to be some kind of license enforcer or similar;
> there are already other mechanisms for that. It is just to get the set of
> really public kernel interfaces down to a manageable level.

Why do we care what is "really public"?  We treat them all the same, as
changeable interfaces, i.e. none of them are "really public".

For example, you put all the udp functions in the "udp" namespace.  But what
have we gained?  What has become easier to maintain?  All those functions
start with "udp_": are people having trouble telling what they're for?

If you really want to reduce "public interfaces" then it's much simpler to 
mark explicitly what out-of-tree modules can use.  We can have a list of 
symbol names in include/linux/public-exports.h.

I just don't see what problems this separation solves.
Rusty.

Re: Where is the interrupt going?

2007-11-22 Thread Marin Mitov

Hi,

On Friday 23 November 2007 02:48:53 am you wrote:
> I tried the hammer and the problem persists.
> [EMAIL PROTECTED]:~$ cat /proc/cmdline
> root=UUID=8b3c3666-22c3-4c04-b399-ece266f2ef30 ro noapic quiet splash
>
> However, I reserve the right to try the hammer again in the future.
> When I look at /proc/interrupts without the APIC:
> [EMAIL PROTECTED]:~$ cat /proc/interrupts
>            CPU0
>   0:        144    XT-PIC-XT        timer
>   1:         10    XT-PIC-XT        i8042
>   2:          0    XT-PIC-XT        cascade
>   5:         10    XT-PIC-XT        ohci_hcd:usb5, mxser
>   6:          5    XT-PIC-XT        floppy
>   7:          1    XT-PIC-XT        parport0
>   8:          3    XT-PIC-XT        rtc
>   9:          1    XT-PIC-XT        acpi, uhci_hcd:usb2
>  10:         10    XT-PIC-XT        ohci_hcd:usb4, ehci_hcd:usb6, [EMAIL PROTECTED]::01:00.0
>  11:       2231    XT-PIC-XT        uhci_hcd:usb1, ohci_hcd:usb3, eth0
>  12:        130    XT-PIC-XT        i8042
>  14:       4362    XT-PIC-XT        libata
>  15:      15315    XT-PIC-XT        libata
> NMI:  0
> LOC: 130125
> ERR:  0
> MIS:  0
>
> I do not even see the device that I registered unless it is that
> r128... line. However the code printed out in /var/log/messages:

No, this is your radeon 128 board (on AGP I suppose). Could be integrated
on the mobo if it is a server mobo.

> Nov 22 16:05:27 bbb kernel: [  104.712473] apc8620: VID = 0x10B5
> Nov 22 16:05:27 bbb kernel: [  104.712486] apc8620: mapped addr = e0bd4000
> Nov 22 16:05:27 bbb kernel: [  104.713022] apc8620: registered carrier 0
> Nov 22 16:05:27 bbb kernel: [  104.713028] apc8620: interrupt data
> (0xe1083e40) on irq (10) and status (0x10)

Here is the problem (I suppose): 
if status (0x10 hex or 16 decimal) is the value returned by request_irq:
status = request_irq (apcsi[i].board_irq,
  apc8620_handler,
  IRQF_DISABLED,
  DEVICE_NAME,
  (void*)[i]);
(from your first post), that means the irq is NOT registered, because
according to the LDD v.3 book:

The value returned from request_irq to the requesting function is either 0 
to indicate success or a negative error code, as usual. It’s not uncommon 
for the function to return -EBUSY to signal that another driver is already 
using the requested interrupt line. 

If you grep the kernel's include directory for EBUSY you will find:
#define EBUSY           16      /* Device or resource busy */
in include/asm-generic/errno-base.h

So I think your mobo has an irq line shared with other devices on the
PCI/PCIe slot you use for your hardware, and these other devices have
already registered shared irq handlers for the same irq (10), so the
attempt to register a nonshared irq fails.

Either try to register the irq as shared, or put the hardware in
another slot whose irq line is not shared with other devices
(if such a slot exists). This info should be available in the mobo
manual.
>
> which indicates it successfully registered without being shared. 

No, as I already explained. 
The only problem :-) in my explanation is:
request_irq returns EBUSY (not -EBUSY as should be)
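
The return convention quoted from LDD3 (0 on success, a negative errno value on failure) and the effect of sharing a line can be illustrated with a small userspace mock. This is only a sketch under stated assumptions: `mock_request_irq`, `MOCK_IRQF_SHARED`, `mock_lines` and `report_status` are hypothetical stand-ins, not the kernel API, and the real request_irq() also takes a handler, a name and a dev_id.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define MOCK_IRQF_SHARED 0x1u	/* hypothetical, stands in for IRQF_SHARED */
#define MOCK_NR_IRQS     16

/* Per-line state of the mock: how many handlers are installed and
 * whether the existing handlers agreed to share the line. */
struct mock_irq_line {
	int users;
	int shared;
};

struct mock_irq_line mock_lines[MOCK_NR_IRQS];

/* Mimics the return convention: 0 on success, negative errno on failure. */
int mock_request_irq(unsigned int irq, unsigned int flags)
{
	struct mock_irq_line *line;
	int want_shared = (flags & MOCK_IRQF_SHARED) != 0;

	if (irq >= MOCK_NR_IRQS)
		return -EINVAL;

	line = &mock_lines[irq];
	/* A line already in use can only be joined if both sides share. */
	if (line->users > 0 && !(line->shared && want_shared))
		return -EBUSY;

	line->shared = want_shared;
	line->users++;
	return 0;
}

/* Report the outcome; a failure is any negative status. */
void report_status(unsigned int irq, int status)
{
	if (status < 0)
		printf("irq %u: request failed: %s (%d)\n",
		       irq, strerror(-status), status);
	else
		printf("irq %u: registered\n", irq);
}
```

In this mock, a non-shared request on a line that already has users fails with -EBUSY, while a request carrying the shared flag succeeds as long as the existing users also share. Checking `status < 0` and logging the signed value keeps the sign of the error code visible in the log.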

Marin Mitov

> When 
> I have more time, I will change the code to be a shared IRQ and try
> the noapic again.
>
> However, without the noapic /proc/interrupts looks like:
> [EMAIL PROTECTED]:~$ cat /proc/interrupts
> CPU0
>0:154   IO-APIC-edge  timer
>1: 10   IO-APIC-edge  i8042
>6:  5   IO-APIC-edge  floppy
>7:  0   IO-APIC-edge  parport0
>8:  3   IO-APIC-edge  rtc
>9:  1   IO-APIC-fasteoi   acpi
>   10:  0   IO-APIC-edge  apc8620
>   12:130   IO-APIC-edge  i8042
>   14:   2861   IO-APIC-edge  libata
>   15:   1049   IO-APIC-edge  libata
>   16: 11   IO-APIC-fasteoi   ohci_hcd:usb5, mxser
>   17:  0   IO-APIC-fasteoi   uhci_hcd:usb1, ohci_hcd:usb3
>   18:  0   IO-APIC-fasteoi   uhci_hcd:usb2
>   19:187   IO-APIC-fasteoi   eth0
>   20:  0   IO-APIC-fasteoi   ohci_hcd:usb4, [EMAIL PROTECTED]::01:00.0
>   21:  0   IO-APIC-fasteoi   ehci_hcd:usb6
> NMI:  0
> LOC:   8820
> ERR:  0
> MIS:  0
>
>
> I have attached the kernel module. The apc8620 is an IndustryPack
> carrier card. I can therefore open up N (in this specific case 5) sub
> memory windows in the memory mapped PCI address. The kernel module
> keeps track of the slot offsets from the memory mapped address so that
> the user can simply use read and write instead of a zillion ugly ioctl
> calls. Because the kernel module tracks the slot offsets, I place acp
> state into the private data of the file pointer. There can also be
> multiple carriers on the bus. So, the array in the

Re: [RFC] Documentation about unaligned memory access

2007-11-22 Thread Kyle Moffett


On Nov 22, 2007, at 20:29:11, Alan Cox wrote:
> > Most architectures are unable to perform unaligned memory
> > accesses. Any unaligned access causes a processor exception.
>
> Not all. Some simply produce the wrong answer - that's oh so much
> more exciting.


As one example, the MicroBlaze soft-core processor family designed  
for use on Xilinx FPGAs will (by default) simply forcibly zero the  
lower bits of the unaligned address, such that the following code  
will fail mysteriously:


const char foo[] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07 };
printf("0x%08lx 0x%08lx 0x%08lx 0x%08lx\n",
*((u32 *)(foo+0)),
*((u32 *)(foo+1)),
*((u32 *)(foo+2)),
*((u32 *)(foo+3)));

Instead of outputting:
0x00010203 0x01020304 0x02030405 0x03040506

It will output:
0x00010203 0x00010203 0x00010203 0x00010203

Other embedded architectures have very similar problems.  Some may  
provide an "unaligned data access" exception, but offer insufficient  
information to repair the damage and resume execution.
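
For comparison, here is a minimal userspace sketch of two ways to read a 32-bit value from an arbitrary address without issuing an unaligned load. The function names `read_be32` and `read_u32` are hypothetical; in-kernel code would normally use the get_unaligned() helpers instead.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian value from any address, one byte at a
 * time, so no multi-byte load is ever issued at an odd address. */
uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 32-bit value in host byte order; memcpy() lets the compiler
 * choose an access sequence that is legal for the target. */
uint32_t read_u32(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Applied to the foo[] array above at offsets 0 through 3, read_be32() gives 0x00010203, 0x01020304, 0x02030405 and 0x03040506 regardless of how the processor treats unaligned addresses.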


Cheers,
Kyle Moffett


Re: [PATCH 2/9]: Reduce Log I/O latency

2007-11-22 Thread Andi Kleen

On Fri, Nov 23, 2007 at 12:15:39AM +1100, David Chinner wrote:
> On Thu, Nov 22, 2007 at 01:06:11PM +0100, Andi Kleen wrote:
> > > FWIW from a "real time" database POV this seems to make sense to me...
> > > in fact, we probably rely on filesystem metadata way too much
> > > (historically it's just "worked" although we do seem to get issues
> > > on ext3).
> > 
> > For that case you really would need priority inheritance: any metadata
> > IO on behalf or blocking a process needs to use the process' block IO 
> > priority.
> 
> How do you do that when the processes are blocking on semaphores,
> mutexes or rw-semaphores in the filesystem three layers removed from
> the I/O in progress?

[...] I didn't say it was easy (or rather, I explicitly said it would be tricky).
Probably it would be possible to fold it somehow into rt mutexes PI,
but it's not easy and semaphores would need to be handled too.

My point was just that solving the metadata RT problem by unconditionally
increasing the priority is a bad idea and not really a replacement for a
"full" solution. Short term, a user can just increase the priority of all
the XFS threads anyway.

-Andi

Re: [linux-usb-devel] 2.6.24-rc3-mm1: usb mouse doesn't work

2007-11-22 Thread Alan Stern

On Thu, 22 Nov 2007, Marin Mitov wrote:

> > > > I've had some strangenesses with USB lately.  Sometimes running `lsusb'
> > > > makes the USB system notice a newly attached device.
> > >
> > > No. But I have new messages in dmesg:
> > >
> > > uhci_hcd :00:1d.3: FGR not stopped yet!
> > > uhci_hcd :00:1d.2: FGR not stopped yet!
> > > uhci_hcd :00:1d.1: FGR not stopped yet!
> > > uhci_hcd :00:1d.0: FGR not stopped yet!
> > >
> > > > Is that "FGR not stopped yet!" message new behaviour?
> > >
> > > It is a new message since 2.6.24-rc3. I have never tried the -mm tree before.
> >
> > These messages could indicate a timing problem.  You can see the code
> > that writes the messages near the end of wakeup_rh() in
> > drivers/usb/host/uhci-hcd.c.
> >
> > The message gets written if the controller hardware hasn't turned off a
> > particular bit after a 4-us delay.  If the udelay() function wasn't
> > working right, it could cause this problem.
> 
> udelay() _is_ OK for 2.6.24-rc3, so it is not the cause of the problem

But is it OK for 2.6.24-rc3-mm1?  Kirill said specifically that 
2.6.24-rc3 does not display the message but 2.6.24-rc3-mm1 does.

Alan Stern


Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-22 Thread Dave Young

On Nov 23, 2007 2:19 AM, Andi Kleen <[EMAIL PROTECTED]> wrote:
>
> > Andy, I like your idea.  IMHO, as Rusty said a simple EXPORT_SYMBOL_TO
> > is better.
>
> I don't think so. e.g. tcpcong would be very very messy this way.
>
> > And I wonder if it is possible to export to something like  the struct
> > device_driver? If it's possible then it will not limited to modules.
>
> Not sure I follow you. Can you expand?
>
I know little about module internals, so if I'm wrong, please point it
out or just ignore this.

Couldn't kernel symbols be tied to the kernel object model?
I was thinking that, because struct device_driver has a mod_name member
(for built-in modules), something could be done when a device driver is
registered.

Re: [RFC] Documentation about unaligned memory access

2007-11-22 Thread Andi Kleen

Robert Hancock <[EMAIL PROTECTED]> writes:
>
> Also, x86 doesn't prohibit unaligned accesses,

That depends, e.g. for SSE2 they can be forbidden.

> but I believe they have
> a significant performance cost and are best avoided where possible.

On Opteron the typical cost of a misaligned access is a single cycle
and some possible penalty to load-store forwarding.

On Intel it is a bit worse, but not all that much. Unless you do 
a lot of such accesses in a loop, it's not really worth
caring about too much.

-Andi

[PATCH] sata_nv: don't use legacy DMA in ADMA mode (v2)

2007-11-22 Thread Robert Hancock

We need to run any DMA command with result taskfile requested in ADMA mode
when the port is in ADMA mode, otherwise it may try to use the legacy DMA engine
in ADMA mode which is not allowed. Enforce this with BUG_ON() since data
corruption could potentially result if this happened. Also WARN_ON() if we try
and send result taskfile commands while NCQ commands are still active, since the
hardware doesn't allow this.

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

--- linux-2.6.24-rc3-git1/drivers/ata/sata_nv.c 2007-11-20 17:40:09.0 -0600
+++ linux-2.6.24-rc3-git1edit/drivers/ata/sata_nv.c 2007-11-22 19:40:58.0 -0600
@@ -791,11 +791,13 @@
 
 static void nv_adma_tf_read(struct ata_port *ap, struct ata_taskfile *tf)
 {
-   /* Since commands where a result TF is requested are not
-  executed in ADMA mode, the only time this function will be called
-  in ADMA mode will be if a command fails. In this case we
-  don't care about going into register mode with ADMA commands
-  pending, as the commands will all shortly be aborted anyway. */
+   /* Other than when internal or pass-through commands are executed,
+  the only time this function will be called in ADMA mode will be
+  if a command fails. In the failure case we don't care about going
+  into register mode with ADMA commands pending, as the commands will
+  all shortly be aborted anyway. We assume that NCQ commands are not
+  issued via passthrough, which is the only way that switching into
+  ADMA mode could abort outstanding commands. */
nv_adma_register_mode(ap);
 
ata_tf_read(ap, tf);
@@ -1359,11 +1361,9 @@
struct nv_adma_port_priv *pp = qc->ap->private_data;
 
/* ADMA engine can only be used for non-ATAPI DMA commands,
-  or interrupt-driven no-data commands, where a result taskfile
-  is not required. */
+  or interrupt-driven no-data commands. */
if ((pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) ||
-  (qc->tf.flags & ATA_TFLAG_POLLING) ||
-  (qc->flags & ATA_QCFLAG_RESULT_TF))
+  (qc->tf.flags & ATA_TFLAG_POLLING))
return 1;
 
if ((qc->flags & ATA_QCFLAG_DMAMAP) ||
@@ -1381,6 +1381,8 @@
   NV_CPB_CTL_IEN;
 
if (nv_adma_use_reg_mode(qc)) {
+   BUG_ON(!(pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) &&
+   (qc->flags & ATA_QCFLAG_DMAMAP));
nv_adma_register_mode(qc->ap);
ata_qc_prep(qc);
return;
@@ -1425,9 +1427,17 @@
 
VPRINTK("ENTER\n");
 
+   /* We can't handle result taskfile with NCQ commands active, since
+  retrieving the taskfile switches us out of ADMA mode and would abort
+  existing commands. */
+   WARN_ON((qc->flags & ATA_QCFLAG_RESULT_TF) &&
+   (qc->ap->qc_allocated & ~(1 << qc->tag)));
+
if (nv_adma_use_reg_mode(qc)) {
/* use ATA register mode */
VPRINTK("using ATA register mode: 0x%lx\n", qc->flags);
+   BUG_ON(!(pp->flags & NV_ADMA_ATAPI_SETUP_COMPLETE) &&
+   (qc->flags & ATA_QCFLAG_DMAMAP));
nv_adma_register_mode(qc->ap);
return ata_qc_issue_prot(qc);
} else


[PATCH] sata_nv: fix ADMA ATAPI issues with memory over 4GB (v3)

2007-11-22 Thread Robert Hancock

This fixes some problems with ATAPI devices on nForce4 controllers in ADMA mode
on systems with memory located above 4GB. We need to delay setting the 64-bit
DMA mask until the PRD table and padding buffer are allocated so that they don't
get allocated above 4GB and break legacy mode (which is needed for ATAPI
devices).

Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>

--- linux-2.6.24-rc3-git1edit/drivers/ata/sata_nv.c.before2 2007-11-22 19:42:28.0 -0600
+++ linux-2.6.24-rc3-git1edit/drivers/ata/sata_nv.c 2007-11-22 19:48:25.0 -0600
@@ -247,6 +247,7 @@
void __iomem*ctl_block;
void __iomem*gen_block;
void __iomem*notifier_clear_block;
+   u64 adma_dma_mask;
u8  flags;
int last_issue_ncq;
 };
@@ -748,7 +749,7 @@
adma_enable = 0;
nv_adma_register_mode(ap);
} else {
-   bounce_limit = *ap->dev->dma_mask;
+   bounce_limit = pp->adma_dma_mask;
segment_boundary = NV_ADMA_DMA_BOUNDARY;
sg_tablesize = NV_ADMA_SGTBL_TOTAL_LEN;
adma_enable = 1;
@@ -1134,10 +1135,20 @@
void *mem;
dma_addr_t mem_dma;
void __iomem *mmio;
+   struct pci_dev *pdev = to_pci_dev(dev);
u16 tmp;
 
VPRINTK("ENTER\n");
 
+   /* Ensure DMA mask is set to 32-bit before allocating legacy PRD and
+  pad buffers */
+   rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+   if (rc)
+   return rc;
+   rc = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+   if (rc)
+   return rc;
+
rc = ata_port_start(ap);
if (rc)
return rc;
@@ -1153,6 +1164,15 @@
pp->notifier_clear_block = pp->gen_block +
   NV_ADMA_NOTIFIER_CLEAR + (4 * ap->port_no);
 
+   /* Now that the legacy PRD and padding buffer are allocated we can
+  safely raise the DMA mask to allocate the CPB/APRD table.
+  These are allowed to fail since we store the value that ends up
+  being used to set as the bounce limit in slave_config later if
+  needed. */
+   pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
+   pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+   pp->adma_dma_mask = *dev->dma_mask;
+
mem = dmam_alloc_coherent(dev, NV_ADMA_PORT_PRIV_DMA_SZ,
  &mem_dma, GFP_KERNEL);
if (!mem)
@@ -2414,12 +2434,6 @@
hpriv->type = type;
host->private_data = hpriv;
 
-   /* set 64bit dma masks, may fail */
-   if (type == ADMA) {
-   if (pci_set_dma_mask(pdev, DMA_64BIT_MASK) == 0)
-   pci_set_consistent_dma_mask(pdev, DMA_64BIT_MASK);
-   }
-
/* request and iomap NV_MMIO_BAR */
rc = pcim_iomap_regions(pdev, 1 << NV_MMIO_BAR, DRV_NAME);
if (rc)


Re: Where is the interrupt going?

2007-11-22 Thread Bartlomiej Zolnierkiewicz

On Friday 23 November 2007, Alan Cox wrote:
> On Thu, 22 Nov 2007 16:48:53 -0800
> [EMAIL PROTECTED] wrote:
> 
> > 
> > I tried the hammer and the problem persists.
> 
> See my earlier email - your driver registers the irq with IRQF_DISABLED
> then never enables it.

As already explained by Kyle IRQF_DISABLED shouldn't matter here.

[ Nowadays IRQF_DISABLED only tells kernel/irq/handle.c::handle_IRQ_event()
  to not enable local interrupts before calling your IRQ handler.

  I've recently removed IRQF_DISABLED from IDE after noticing this. ]

Bart

Re: Inotify fails to send IN_ATTRIB events

2007-11-22 Thread Morten Welinder

> Wanna try the patch below?

With this patch I am seeing an endless stream of IN_IGNORED events
for a removed watch.  I don't see a reason that user space should
ever see any IN_IGNORED, but an endless stream is not good.

Utterly unrelated, inotify does not work in /proc/.  The list archives
suggest that it isn't likely to start working anytime soon, but shouldn't
inotify_add_watch then fail with ENOSYS instead of pretending
it worked?

Morten




Failed to create inotify watch for /home/welinder/hi: No such file or directory
Created inotify watch 1 with mask 0x07c0 for /home/welinder
# "touch hi" here
Got event 0100 for 1
Created inotify watch 2 with mask 0x0fc6 for /home/welinder/hi
Removing notify watch 1
Got event 8000 for 1
Got event 0004 for 2
# "rm hi" here
Got event 0400 for 2
Removing notify watch 2
Got event 8000 for 2
Failed to create inotify watch for /home/welinder/hi: No such file or directory
Created inotify watch 3 with mask 0x07c0 for /home/welinder
Got event 0200 for 3
Failed to create inotify watch for /home/welinder/hi: No such file or directory
Created inotify watch 3 with mask 0x07c0 for /home/welinder
# "touch hi" here
Got event 0100 for 3
Created inotify watch 4 with mask 0x0fc6 for /home/welinder/hi
Removing notify watch 3
Got event 8000 for 3
Got event 0004 for 4
Got event 0004 for 4
# "rm hi" here
Got event 0400 for 4
Removing notify watch 4
Failed to create inotify watch for /home/welinder/hi: No such file or directory
Created inotify watch 5 with mask 0x07c0 for /home/welinder
Got event 8000 for 4
Got event 8000 for 4
Got event 8000 for 4
Got event 8000 for 4
...
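
[Editorial sketch, not part of the original message.] The event values in the
log above are easier to follow when decoded into their inotify names. The
helper below is a minimal userspace sketch that assumes the logged values are
the hexadecimal event mask as defined in <sys/inotify.h>; the function name
inotify_mask_names() is hypothetical and only illustrates the decoding.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/inotify.h>

/* Write the names of the inotify bits set in `mask` into `buf`,
 * separated by '|'. Bits that are not in the table are ignored.
 * Returns `buf` so the call can be used directly in printf(). */
static const char *inotify_mask_names(uint32_t mask, char *buf, size_t len)
{
    static const struct { uint32_t bit; const char *name; } tbl[] = {
        { IN_ACCESS,        "IN_ACCESS" },
        { IN_MODIFY,        "IN_MODIFY" },
        { IN_ATTRIB,        "IN_ATTRIB" },
        { IN_CLOSE_WRITE,   "IN_CLOSE_WRITE" },
        { IN_CLOSE_NOWRITE, "IN_CLOSE_NOWRITE" },
        { IN_OPEN,          "IN_OPEN" },
        { IN_MOVED_FROM,    "IN_MOVED_FROM" },
        { IN_MOVED_TO,      "IN_MOVED_TO" },
        { IN_CREATE,        "IN_CREATE" },
        { IN_DELETE,        "IN_DELETE" },
        { IN_DELETE_SELF,   "IN_DELETE_SELF" },
        { IN_MOVE_SELF,     "IN_MOVE_SELF" },
        { IN_UNMOUNT,       "IN_UNMOUNT" },
        { IN_Q_OVERFLOW,    "IN_Q_OVERFLOW" },
        { IN_IGNORED,       "IN_IGNORED" },
    };
    size_t i, used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (i = 0; i < sizeof(tbl) / sizeof(tbl[0]); i++) {
        int n;

        if (!(mask & tbl[i].bit))
            continue;
        n = snprintf(buf + used, len - used, "%s%s",
                     used ? "|" : "", tbl[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break;              /* output truncated: stop here */
        used += (size_t)n;
    }
    return buf;
}
```

Read this way, and under the same assumption, 0100 is IN_CREATE, 0004 is
IN_ATTRIB, 0200 is IN_DELETE, 0400 is IN_DELETE_SELF, and the repeating 8000
is IN_IGNORED, which matches the endless stream described above.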

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-22 Thread Andi Kleen

On Friday 23 November 2007 01:25, Rusty Russell wrote:
> On Thursday 22 November 2007 22:05:45 Christoph Hellwig wrote:
> > On Thu, Nov 22, 2007 at 02:56:22PM +1100, Rusty Russell wrote:
> > > This is an interesting idea, thanks for the code!  My only question
> > > is whether we can get most of this benefit by dropping the indirection
> > > of namespaces and have something like "EXPORT_SYMBOL_TO(sym, modname)"?
> > >  It doesn't work so well for exporting to a group of modules, but that
> > > seems a reasonable line to draw anyway.
> >
> > I'd say exporting to a group of modules is the main use case.  E.g. in
> > scsi there would be symbols exported to transport class modules only
> > or lots of the vfs_ symbols would be exported only to stackable
> > filesystems or nfsd.
>
> That's my point.  If there's a whole class of modules which can use a
> symbol, why are we ruling out external modules? 

The point is to get cleaner interfaces. Anything which is kind of internal
should only be used by closely related in-tree modules which can be updated. 
The point is not to be some kind of license enforcer or similar; there 
are already other mechanisms for that. It is just to get the set of really
public kernel interfaces down to a manageable level.

But I still think exporting only to a single module would be too limiting 
even for this case. It would work for the TCP<->ipv6.ko poster child,
but not for some of the other networking cases where it makes sense.

> If that's what you want, 
> why not have a list of permitted modules compiled into the kernel and allow
> no others?

That would not make the relationship explicit, which would not further
the goal.

-Andi

Re: 2.6.24-rc3-mm1

2007-11-22 Thread Gabriel C

I have some warnings on each SCSI disc:


...

[   30.724410] scsi 0:0:0:0: Direct-Access SEAGATE  ST318406LW   0109 PQ: 0 ANSI: 3
[   30.724419] scsi0:A:0:0: Tagged Queuing enabled.  Depth 32
[   30.724435]  target0:0:0: Beginning Domain Validation
[   30.724446]  target0:0:0: Domain Validation Initial Inquiry Failed <--
[   30.724572]  target0:0:0: Ending Domain Validation
[   30.729747] scsi 0:0:1:0: Direct-Access FUJITSU  MAH3182MP0114 PQ: 0 ANSI: 4
[   30.729754] scsi0:A:1:0: Tagged Queuing enabled.  Depth 32
[   30.729771]  target0:0:1: Beginning Domain Validation
[   30.729780]  target0:0:1: Domain Validation Initial Inquiry Failed <--
[   30.729908]  target0:0:1: Ending Domain Validation

...

No idea whether this is related, but buffered disk reads are 2.XX MB/sec and 
the box is somewhat laggy.

hdparm -t on sda and sdb reports :

/dev/sda:
 Timing buffered disk reads:    8 MB in  3.26 seconds =   2.46 MB/sec

/dev/sdb:
 Timing buffered disk reads:    8 MB in  3.56 seconds =   2.25 MB/sec

My IDE discs are fine.

Please let me know if you need my config or any other information.


Gabriel

Re: Where is the new timerfd?

2007-11-22 Thread Andrew Morton

On Thu, 22 Nov 2007 16:35:38 -0800 (PST) Davide Libenzi <[EMAIL PROTECTED]> 
wrote:

> On Thu, 22 Nov 2007, Andrew Morton wrote:
> 
> > On Thu, 22 Nov 2007 11:46:13 -0800 (PST) Davide Libenzi <[EMAIL PROTECTED]> 
> > wrote:
> > 
> > > On Thu, 22 Nov 2007, Michael Kerrisk wrote:
> > > 
> > > > On Nov 22, 2007 6:34 PM, Davide Libenzi <[EMAIL PROTECTED]> wrote:
> > > > > On Thu, 22 Nov 2007, Michael Kerrisk wrote:
> > > > >
> > > > > > Hey Davide,
> > > > > >
> > > > > > Where is the new timerfd API.  In 2.6.24-rc3, I see the *old* API...
> > > > >
> > > > > Maybe Andrew stuffed the turkey with it? :) It was there. I remember it was
> > > > > merged. Some screw up reverted it?
> > > > 
> > > > It looks that way.
> > > 
> > > I'm looking at the log now. It never went in actually. Andrew-san, what happened?
> > > 
> > 
> > Last I recall, we removed the API for 2.6.23 because we intended to do a
> > different interface for 2.6.24.
> > 
> > But I don't recall seeing any timerfd patches in maybe a month.
> 
> Was sent on Sep 23, Subject: new timerfd API

Half of us weren't born then ;)

> Do you want me to repost?

yes please.

Re: [RFC] Documentation about unaligned memory access

2007-11-22 Thread Alan Cox

> Most architectures are unable to perform unaligned memory accesses. Any
> unaligned access causes a processor exception.

Not all. Some simply produce the wrong answer - that's oh so much more
exciting.

> You may be wondering why you have never seen these problems on your own
> architecture. Some architectures (such as i386 and x86_64) do not have this
> limitation, but nevertheless it is important for you to write portable code
> that works everywhere.

It's usually faster if you don't misalign on x86 as well.

Alan
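
[Editorial sketch, not part of the original message.] The portability concern
discussed in this thread can be shown with a small userspace example. It
assumes a 32-bit field sitting at an arbitrary byte offset in a buffer; the
helper names read_u32_unaligned() and read_le32() are hypothetical. Inside the
kernel the get_unaligned()/put_unaligned() helpers serve the same purpose.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from a possibly misaligned
 * address. memcpy() makes no alignment assumption, so this avoids the
 * direct *(uint32_t *)p dereference that can trap, or silently return
 * the wrong value, on some architectures. */
static uint32_t read_u32_unaligned(const void *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Byte-wise little-endian read: independent of both alignment and
 * host endianness. */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}
```

On x86 a compiler will typically turn either helper into a single load, so the
portable form should not cost anything there.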

Re: [RFC] Documentation about unaligned memory access

2007-11-22 Thread David Miller


Thank you for working proactively on these problems.

Re: Where is the interrupt going?

2007-11-22 Thread Alan Cox

On Thu, 22 Nov 2007 16:48:53 -0800
[EMAIL PROTECTED] wrote:

> 
> I tried the hammer and the problem persists.

See my earlier email - your driver registers the irq with IRQF_DISABLED
then never enables it.

Re: [RFC] Documentation about unaligned memory access

2007-11-22 Thread Robert Hancock


Daniel Drake wrote:
> Being spoilt by the luxuries of i386/x86_64 I've never really had a good
> grasp on unaligned memory access problems on other architectures and decided
> it was time to figure it out. As a result I've written this documentation
> which I plan to submit for inclusion as
> Documentation/unaligned_memory_access.txt
> 
> Before I do so, any comments on the following?
> 
> ...
> 
> You may be wondering why you have never seen these problems on your own
> architecture. Some architectures (such as i386 and x86_64) do not have this
> limitation, but nevertheless it is important for you to write portable code
> that works everywhere.


Also, x86 doesn't prohibit unaligned accesses, but I believe they have a 
significant performance cost and are best avoided where possible.


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: Where is the interrupt going?

2007-11-22 Thread Robert Hancock


[EMAIL PROTECTED] wrote:
> I tried the hammer and the problem persists.
> [EMAIL PROTECTED]:~$ cat /proc/cmdline
> root=UUID=8b3c3666-22c3-4c04-b399-ece266f2ef30 ro noapic quiet splash
> 
> However, I reserve the right to try the hammer again in the future. When
> I look at /proc/interrupts without the APIC:
> 
> [EMAIL PROTECTED]:~$ cat /proc/interrupts
>            CPU0
>   0:        144    XT-PIC-XT        timer
>   1:         10    XT-PIC-XT        i8042
>   2:          0    XT-PIC-XT        cascade
>   5:         10    XT-PIC-XT        ohci_hcd:usb5, mxser
>   6:          5    XT-PIC-XT        floppy
>   7:          1    XT-PIC-XT        parport0
>   8:          3    XT-PIC-XT        rtc
>   9:          1    XT-PIC-XT        acpi, uhci_hcd:usb2
>  10:         10    XT-PIC-XT        ohci_hcd:usb4, ehci_hcd:usb6, [EMAIL PROTECTED]::01:00.0
>  11:       2231    XT-PIC-XT        uhci_hcd:usb1, ohci_hcd:usb3, eth0
>  12:        130    XT-PIC-XT        i8042
>  14:       4362    XT-PIC-XT        libata
>  15:      15315    XT-PIC-XT        libata
> NMI:          0
> LOC:     130125
> ERR:          0
> MIS:          0
> 
> I do not even see the device that I registered unless it is that r128...
> line. However the code printed out in /var/log/messages:
> 
> Nov 22 16:05:27 bbb kernel: [  104.712473] apc8620: VID = 0x10B5
> Nov 22 16:05:27 bbb kernel: [  104.712486] apc8620: mapped addr = e0bd4000
> Nov 22 16:05:27 bbb kernel: [  104.713022] apc8620: registered carrier 0
> Nov 22 16:05:27 bbb kernel: [  104.713028] apc8620: interrupt data (0xe1083e40) on irq (10) and status (0x10)
> 
> which indicates it successfully registered without being shared. When I
> have more time, I will change the code to be a shared IRQ and try the
> noapic again.


You're not calling pci_enable_device anywhere. Unless you do this before 
requesting the IRQ, the IRQ routing may not be set up properly for your 
device and it may not even give you the right IRQ number. You should see 
a line like this somewhere in dmesg for the IRQ your card is on:


ACPI: PCI Interrupt :00:1f.2[D] -> GSI 19 (level, low) -> IRQ 17

I think this behavior changed in the somewhat recent past..

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/


Re: [PATCH] sata_nv: fix ADMA ATAPI issues with memory over 4GB (v2)

2007-11-22 Thread Tejun Heo

Robert Hancock wrote:
>>> +/* Ensure DMA mask is set to 32-bit before allocating legacy PRD and
>>> +   pad buffers */
>>> +pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
>>> +pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
>> [--snip--]
>>> +pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
>>> +pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
>>
>> I'm probably being paranoid here but please add error checks.  Just
>> checking return value and returning error suffices.
> 
> In the 32-bit case, I'm pretty sure those are guaranteed not to fail
> because 32-bit is the default. For the 64-bit ones, we don't care if
> they fail, because then we'll just use whatever mask ends up being set
> (we store the actual set DMA mask in adma_dma_mask for use when we need
> to reconfigure the bounce limit). We definitely don't want to fail
> initialization if the 64-bit set doesn't succeed..

Then please add BUG or WARN_ON after 32bit switching (but then again if
you're gonna do that why not just add if (rc) return rc?) and add
comment stating setting 64 bit dma masks is allowed to fail.

Thanks.

-- 
tejun

Re: [RFC/PATCH] SO_NO_CHECK for IPv6

2007-11-22 Thread David Miller

From: Jeff Garzik <[EMAIL PROTECTED]>
Date: Wed, 21 Nov 2007 19:17:40 -0500

> YOSHIFUJI Hideaki / 吉藤英明 wrote:
> > In article <[EMAIL PROTECTED]> (at Wed, 21 Nov 2007 07:45:32 -0500), Jeff 
> > Garzik <[EMAIL PROTECTED]> says:
> > 
> >> SO_NO_CHECK support for IPv6 appeared to be missing. This is presented,
> >> based on a reading of net/ipv4/udp.c.
> > 
> > Disagree. UDP checksum is mandatory in IPv6.
> 
> Ah, you mean that I need to turn off UDP checksum on receive end as well 
> in IPv6...  true.
> 
> For those interested, I am dealing with a UDP app that already does very 
> strong checksumming and encryption, so additional software checksumming 
> at the lower layers is quite simply a waste of CPU cycles.  Hardware 
> checksumming is fine, as long as its "free."

Regardless of whatever verifications your application is doing
on the data, it is not checksumming the ports and that's what
the pseudo-header is helping with.

You cannot disable checksums in IPv6/UDP; they are not optional.
99.999% of cards do the checksum in hardware, and even if we do
have to compute it, it's free during the copy in recvmsg().
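
[Editorial sketch, not part of the original message.] The point about the
pseudo-header can be made concrete with a small userspace example. It follows
the upper-layer checksum layout of RFC 2460, section 8.1, where the addresses,
the upper-layer length and the next-header value are summed together with the
UDP header (ports included) and payload. The function names sum16() and
udp6_checksum() are hypothetical, and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add `len` bytes to a running ones'-complement sum, treating the data
 * as big-endian 16-bit words; an odd trailing byte is padded with zero. */
static uint32_t sum16(uint32_t sum, const uint8_t *data, size_t len)
{
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (i < len)
        sum += (uint32_t)data[i] << 8;
    return sum;
}

/* UDP checksum over the IPv6 pseudo-header: source address, destination
 * address, 32-bit upper-layer length, three zero bytes and next header
 * (17 for UDP), followed by the UDP header and payload.
 * `udp` points at the UDP header. Its checksum field must be zero when
 * computing; when verifying a received packet the result is 0 if the
 * packet is intact. A sender must transmit 0xffff if the result is 0. */
static uint16_t udp6_checksum(const uint8_t src[16], const uint8_t dst[16],
                              const uint8_t *udp, size_t udp_len)
{
    uint8_t tail[8] = {
        (uint8_t)(udp_len >> 24), (uint8_t)(udp_len >> 16),
        (uint8_t)(udp_len >> 8),  (uint8_t)udp_len,
        0, 0, 0, 17
    };
    uint32_t sum = 0;

    assert(udp_len >= 8 && udp_len <= 65535);
    sum = sum16(sum, src, 16);
    sum = sum16(sum, dst, 16);
    sum = sum16(sum, tail, sizeof(tail));
    sum = sum16(sum, udp, udp_len);
    while (sum >> 16)                   /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Because the ports are part of the summed UDP header and the addresses are part
of the pseudo-header, a packet delivered to the wrong port or host fails
verification even when the payload itself is untouched.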

Re: Use of mutex in interrupt context flawed/impossible, need advice.

2007-11-22 Thread Robert Hancock


Leon Woestenberg wrote:
> Hello,
> 
> I'm converting an out-of-tree (*1) driver from binary semaphore to mutex.
> 
> Userspace updates a look-up-table using write(). The driver tries to
> write this LUT to the FPGA in the (video frame) interrupt handler. It
> is important that the LUT is consistent and thus changed atomically.
> Note that it is not important that the LUT is updated each interrupt.
> 
> The current approach is to try-down()ing a binary semaphore in
> interrupt context, and write the LUT to the FPGA if the semaphore was
> down()ed, do nothing else.
> The write() down()s the semaphore as well before updating the
> in-driver-copy of the LUT, then up()s it again.
> 
> I understand this design is not clean (*2), and not even possible with
> mutexes, as mutex_trylock() is not interrupt safe.
> 
> My current approach would be to have userspace write into a shadow
> copy, and use a spinlock to update the live copy. The interrupt then
> would try a spinlock.

Unless this update into the FPGA takes a significant amount of time, I 
wouldn't bother with that complexity - just do spin_lock_irq/irqsave on 
that spinlock.

Using a trylock for this rather sucks since the behavior is entirely 
non-deterministic. It could take a really long time in some cases for 
the trylock to ever succeed.

> My feeling is that we have a valid use of mutex_trylock() in
> interrupt context; "i.e. update LUT if we can do so consistently and
> in time, or not at all".
> 
> I would like to know why this is not so, and if someone has a cleaner
> proposal than the "try spinlock" approach?


--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/
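
[Editorial sketch, not part of the original message.] The "shadow copy plus
try-lock" design discussed above can be modelled in userspace with a C11
atomic_flag standing in for the spinlock. Everything here is an assumption
made for illustration: struct lut_state, LUT_SIZE and the three functions are
hypothetical names, and the `live` array stands in for the FPGA registers. In
the driver itself the write() path would presumably take the lock with
spin_lock_irqsave(), as suggested in the reply, which removes the need for the
"try" on the interrupt side.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define LUT_SIZE 256                /* hypothetical table size */

struct lut_state {
    atomic_flag busy;               /* held while the shadow copy is touched */
    bool dirty;                     /* shadow holds an update not yet pushed */
    uint8_t shadow[LUT_SIZE];       /* written from the write() path */
    uint8_t live[LUT_SIZE];         /* stands in for the FPGA registers */
};

static void lut_init(struct lut_state *s)
{
    memset(s, 0, sizeof(*s));
    atomic_flag_clear(&s->busy);
}

/* write() path: take the flag, update the shadow copy, mark it dirty. */
static void lut_write(struct lut_state *s, const uint8_t *src)
{
    assert(src != NULL);
    while (atomic_flag_test_and_set_explicit(&s->busy, memory_order_acquire))
        ;                           /* spin until the flag is free */
    memcpy(s->shadow, src, LUT_SIZE);
    s->dirty = true;
    atomic_flag_clear_explicit(&s->busy, memory_order_release);
}

/* "Interrupt" path: try once and skip this frame if the writer holds
 * the flag. Returns true only if the live table was updated. */
static bool lut_try_push(struct lut_state *s)
{
    bool pushed = false;

    if (atomic_flag_test_and_set_explicit(&s->busy, memory_order_acquire))
        return false;               /* writer is mid-update: do nothing */
    if (s->dirty) {
        memcpy(s->live, s->shadow, LUT_SIZE);
        s->dirty = false;
        pushed = true;
    }
    atomic_flag_clear_explicit(&s->busy, memory_order_release);
    return pushed;
}
```

The sketch also shows the drawback raised in the reply: lut_try_push() gives
no bound on how many frames may pass before an update finally lands.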


Re: network driver usage count

2007-11-22 Thread David Miller

From: Wagner Ferenc <[EMAIL PROTECTED]>
Date: Wed, 21 Nov 2007 23:16:59 +0100

> Hmm, that would warrant nuking all the reference counts on every
> driver.

That's not true.  When packets are in flight, references go
to the device and the device cannot be unloaded until those
references get dropped.

This behavior makes sense because otherwise you have to figure
out the myriad of references (each ipv4 address, each ipv6
address, routes, ARP entries, etc.) just to perform such a
simple operation.

If you do not mean to unload the device, simply do not do it.

Re: [PATCH] sata_nv: fix ADMA ATAPI issues with memory over 4GB (v2)

2007-11-22 Thread Robert Hancock


Tejun Heo wrote:
> Hello, Robert.
> 
> Robert Hancock wrote:
>> This fixes some problems with ATAPI devices on nForce4 controllers in ADMA
>> mode on systems with memory located above 4GB. We need to delay setting the
>> 64-bit DMA mask until the PRD table and padding buffer are allocated so that
>> they don't get allocated above 4GB and break legacy mode (which is needed for
>> ATAPI devices). Also, explicitly set a 32-bit DMA mask before allocating the
>> legacy buffers since setting the DMA mask affects both ports and we need to
>> ensure the second port's buffers are allocated properly (fixes a problem
>> with the previous version of this patch).
>> 
>> Signed-off-by: Robert Hancock <[EMAIL PROTECTED]>
>> 
>> +   /* Ensure DMA mask is set to 32-bit before allocating legacy PRD and
>> +  pad buffers */
>> +   pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
>> +   pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
> [--snip--]
>> +   pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
>> +   pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
> 
> I'm probably being paranoid here but please add error checks.  Just
> checking return value and returning error suffices.


In the 32-bit case, I'm pretty sure those are guaranteed not to fail 
because 32-bit is the default. For the 64-bit ones, we don't care if 
they fail, because then we'll just use whatever mask ends up being set 
(we store the actual set DMA mask in adma_dma_mask for use when we need 
to reconfigure the bounce limit). We definitely don't want to fail 
initialization if the 64-bit set doesn't succeed..
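
[Editorial sketch, not part of the original message.] The reason the order of
the mask changes matters can be illustrated with plain mask arithmetic. The
macro below uses the same definition as the kernel's DMA_BIT_MASK(); the helper
dma_range_fits() is a hypothetical name introduced only to show which buffer
addresses a device limited to a given mask can reach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low n bits set. n == 64 is special-cased because shifting a 64-bit
 * value by 64 is undefined in C. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: can a device restricted to `mask` address every
 * byte of the buffer [addr, addr + len)? */
static bool dma_range_fits(uint64_t addr, uint64_t len, uint64_t mask)
{
    assert(len > 0);
    return addr <= mask && len - 1 <= mask - addr;
}
```

A legacy PRD table placed at or above 4GB (0x100000000) fails this check
against DMA_BIT_MASK(32), which is why the patch allocates the legacy buffers
while the 32-bit mask is still in effect and only then raises the mask to 64
bits.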


Re: 2.6.24-rc3-mm1

2007-11-22 Thread Thomas Gleixner

On Thu, 22 Nov 2007, Andrew Morton wrote:

> On Thu, 22 Nov 2007 12:22:05 +0200 "Kirill A. Shutemov" <[EMAIL PROTECTED]> wrote:
> 
> > On x86_64 'uname -m' returns 'x86'.  It breaks many userspace programs, apt
> > and rpm for example.
> > 
> 
> Yes, there have been various discussions about this.  I think Sam is cooking up
> a fix?

http://lkml.org/lkml/2007/11/19/323

I'll push it Linus-wards ASAP.

Thanks,

tglx

Re: Where is the interrupt going?

2007-11-22 Thread niessner



I tried the hammer and the problem persists.
[EMAIL PROTECTED]:~$ cat /proc/cmdline
root=UUID=8b3c3666-22c3-4c04-b399-ece266f2ef30 ro noapic quiet splash

However, I reserve the right to try the hammer again in the future.  
When I look at /proc/interrupts without the APIC:

[EMAIL PROTECTED]:~$ cat /proc/interrupts
           CPU0
  0:        144    XT-PIC-XT        timer
  1:         10    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  5:         10    XT-PIC-XT        ohci_hcd:usb5, mxser
  6:          5    XT-PIC-XT        floppy
  7:          1    XT-PIC-XT        parport0
  8:          3    XT-PIC-XT        rtc
  9:          1    XT-PIC-XT        acpi, uhci_hcd:usb2
 10:         10    XT-PIC-XT        ohci_hcd:usb4, ehci_hcd:usb6, [EMAIL PROTECTED]::01:00.0
 11:       2231    XT-PIC-XT        uhci_hcd:usb1, ohci_hcd:usb3, eth0
 12:        130    XT-PIC-XT        i8042
 14:       4362    XT-PIC-XT        libata
 15:      15315    XT-PIC-XT        libata
NMI:          0
LOC:     130125
ERR:          0
MIS:          0

I do not even see the device that I registered unless it is that  
r128... line. However the code printed out in /var/log/messages:

Nov 22 16:05:27 bbb kernel: [  104.712473] apc8620: VID = 0x10B5
Nov 22 16:05:27 bbb kernel: [  104.712486] apc8620: mapped addr = e0bd4000
Nov 22 16:05:27 bbb kernel: [  104.713022] apc8620: registered carrier 0
Nov 22 16:05:27 bbb kernel: [  104.713028] apc8620: interrupt data (0xe1083e40) on irq (10) and status (0x10)


which indicates it successfully registered without being shared. When  
I have more time, I will change the code to be a shared IRQ and try  
the noapic again.


However, without the noapic /proc/interrupts looks like:
[EMAIL PROTECTED]:~$ cat /proc/interrupts
   CPU0
  0:154   IO-APIC-edge  timer
  1: 10   IO-APIC-edge  i8042
  6:  5   IO-APIC-edge  floppy
  7:  0   IO-APIC-edge  parport0
  8:  3   IO-APIC-edge  rtc
  9:  1   IO-APIC-fasteoi   acpi
 10:  0   IO-APIC-edge  apc8620
 12:130   IO-APIC-edge  i8042
 14:   2861   IO-APIC-edge  libata
 15:   1049   IO-APIC-edge  libata
 16: 11   IO-APIC-fasteoi   ohci_hcd:usb5, mxser
 17:  0   IO-APIC-fasteoi   uhci_hcd:usb1, ohci_hcd:usb3
 18:  0   IO-APIC-fasteoi   uhci_hcd:usb2
 19:187   IO-APIC-fasteoi   eth0
 20:  0   IO-APIC-fasteoi   ohci_hcd:usb4, [EMAIL PROTECTED]::01:00.0
 21:  0   IO-APIC-fasteoi   ehci_hcd:usb6
NMI:  0
LOC:   8820
ERR:  0
MIS:  0
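
One way to tell whether the handler is ever reached is to watch the
count on the apc8620 line between two reads of /proc/interrupts. Below
is a minimal userspace sketch of that check, assuming the single-CPU
layout shown above (IRQ number, one count column, then the names). The
names irq_hit, match_irq_line and find_irq are made up for this
illustration and are not part of any kernel or libc interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Result of looking for one device in a /proc/interrupts line. */
struct irq_hit {
	int irq;              /* IRQ number at the start of the line */
	unsigned long count;  /* interrupt count in the CPU0 column */
};

/*
 * Returns 1 and fills *hit if the line is a numbered IRQ line that
 * mentions dev, 0 otherwise (header, NMI/LOC/ERR/MIS, other devices).
 */
static int match_irq_line(const char *line, const char *dev,
			  struct irq_hit *hit)
{
	int irq;
	unsigned long count;

	assert(line != NULL && dev != NULL && hit != NULL);
	if (sscanf(line, " %d: %lu", &irq, &count) != 2)
		return 0;
	if (strstr(line, dev) == NULL)
		return 0;
	hit->irq = irq;
	hit->count = count;
	return 1;
}

/* Scans a file in /proc/interrupts format for the first line naming dev. */
static int find_irq(const char *path, const char *dev, struct irq_hit *hit)
{
	char line[512];
	int found = 0;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return 0;
	while (!found && fgets(line, sizeof(line), f) != NULL)
		found = match_irq_line(line, dev, hit);
	fclose(f);
	return found;
}
```

Calling find_irq("/proc/interrupts", "apc8620", &hit) before and after
the card raises its interrupt, and comparing hit.count, would show
whether the kernel counted anything on that line at all.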


I have attached the kernel module. The apc8620 is an IndustryPack  
carrier card. I can therefore open up N (in this specific case 5) sub  
memory windows in the memory mapped PCI address. The kernel module  
keeps track of the slot offsets from the memory mapped address so that  
the user can simply use read and write instead of a zillion ugly ioctl  
calls. Because the kernel module tracks the slot offsets, I place acp  
state into the private data of the file pointer. There can also be  
multiple carriers on the bus. So, the array in the kernel module keeps  
track of the card specific details with the file pointer the slot  
specific information. Both are the same structure (bad on my part I  
know but I never intended to show my dirty underwear). To get data  
from interrupts (asynchronous IO) I was using readv. Now I am using
aio_read and had to make some minor changes, which are commented in
the code, to accommodate the change.


Just noticed that r128 is not the carrier card...

Thanks for all of the help so far and I hope this information is helpful.

I almost forgot. I also attached the dmesg output and will try
irqpoll as it suggests. It is just that IRQ 16 is not the one I am
looking for, but it is probably related to my mxser problems that I will
get to later.


Quoting Kyle McMartin <[EMAIL PROTECTED]>, on Wed 21 Nov 2007 06:20:04 PM PST:


On Wed, Nov 21, 2007 at 05:08:30PM -0800, Al Niessner wrote:

On with the detailed technical information. I developed a kernel module
for a PCI card back in 2.4, moved it to 2.6.3, then 2.6.11 or so and
now I am trying to move it to 2.6.22. When I began the move to
2.6.22, I changed all of the deprecated calls for finding the card on
the PCI bus, modified the interrupt handler prototype, and changed my
readv/writev to aio_read/aio_write following
http://lwn.net/Articles/202449/. So initialization looks like this:



Hi Al,

From the sounds of it, you might have an interrupt routing problem. Can
you describe the machine you have this plugged into? Possibly attaching
a copy of "dmesg" and "/proc/interrupts"?

Feel free to attach the driver source to your email if the size is
reasonable (which it sounds like it is.)

As a "big hammer" in case it is an APIC problem, please try booting the
kernel with the "noapic" parameter.

cheers,
Kyle





Re: Where is the interrupt going?

2007-11-22 Thread niessner



I do not think so. I have printk (KERN_NOTICE ...) scattered
throughout to make sure the ioctl() is succeeding and to print out
registers on the hardware. Those are showing up in /var/log/messages
without a hitch. If there is a separate setting for printk in interrupt
context, then maybe, but I would not know which macro to look for in the
configuration.


Quoting Jesper Juhl <[EMAIL PROTECTED]>, on Wed 21 Nov 2007  
06:16:45 PM PST:



On 22/11/2007, Al Niessner <[EMAIL PROTECTED]> wrote:


Quickly stated, I have a piece of hardware on the PCI bus that is
generating an interrupt (can watch it with a scope) but my handler is
not being called (no printk in /var/log/messages). So, where has the
interrupt gone?


Just to rule out the trivial causes. Could it be that you've simply
not configured your system to log messages at the loglevel that your
printk() is using?

--
Jesper Juhl <[EMAIL PROTECTED]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please  http://www.expita.com/nomime.html




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Where is the new timerfd?

2007-11-22 Thread Davide Libenzi

On Thu, 22 Nov 2007, Andrew Morton wrote:

> On Thu, 22 Nov 2007 11:46:13 -0800 (PST) Davide Libenzi <[EMAIL PROTECTED]> 
> wrote:
> 
> > On Thu, 22 Nov 2007, Michael Kerrisk wrote:
> > 
> > > On Nov 22, 2007 6:34 PM, Davide Libenzi <[EMAIL PROTECTED]> wrote:
> > > > On Thu, 22 Nov 2007, Michael Kerrisk wrote:
> > > >
> > > > > Hey Davide,
> > > > >
> > > > > Where is the new timerfd API.  In 2.6.24-rc3, I see the *old* API...
> > > >
> > > > Maybe Andrew stuffed the turkey with it? :) It was there. I remember it
> > > > was merged. Some screw up reverted it?
> > > 
> > > It looks that way.
> > 
> > I'm looking at the log now. It never went in actually. Andrew-san, what 
> > happened?
> > 
> 
> Last I recall, we removed the API for 2.6.23 because we intended to do a
> different interface for 2.6.24.
> 
> But I don't recall seeing any timerfd patches in maybe a month.

Was sent on Sep 23, Subject: new timerfd API
Do you want me to repost?



- Davide



Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-22 Thread Rusty Russell

On Thursday 22 November 2007 22:05:45 Christoph Hellwig wrote:
> On Thu, Nov 22, 2007 at 02:56:22PM +1100, Rusty Russell wrote:
> > This is an interesting idea, thanks for the code!  My only question
> > is whether we can get most of this benefit by dropping the indirection of
> > namespaces and have something like "EXPORT_SYMBOL_TO(sym, modname)"?  It
> > doesn't work so well for exporting to a group of modules, but that seems
> > a reasonable line to draw anyway.
>
> I'd say exporting to a group of modules is the main use case.  E.g. in
> scsi there would be symbols exported to transport class modules only
> or lots of the vfs_ symbols would be exported only to stackable filesystems
> or nfsd.

That's my point.  If there's a whole class of modules which can use a symbol, 
why are we ruling out external modules?  If that's what you want, why not 
have a list of permitted modules compiled into the kernel and allow no 
others?

Cheers,
Rusty.

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-22 Thread Rusty Russell

On Thursday 22 November 2007 22:46:23 Andi Kleen wrote:
> On Thursday 22 November 2007 04:56, Rusty Russell wrote:
> > This is an interesting idea, thanks for the code!  My only question
> > is whether we can get most of this benefit by dropping the indirection of
> > namespaces and have something like "EXPORT_SYMBOL_TO(sym, modname)"?  It
> > doesn't work so well for exporting to a group of modules, but that seems
> > a reasonable line to draw anyway.
>
> That would explode quickly already even for my example "inet" namespace.
> It already has several modules.  I don't think so much duplication would be
> a good idea.

Yes, and if a symbol is already used by multiple modules, it's generically 
useful.  And if so, why restrict it to in-tree modules?

If your real intent is to bias against out-of-tree modules, let's just 
generate a list of in-tree module names, and restrict some or all exports to 
that set.

Rusty.

Re: [RFC] Documentation about unaligned memory access

2007-11-22 Thread Avuton Olrich

On Nov 22, 2007 4:15 PM, Daniel Drake <[EMAIL PROTECTED]> wrote:
> Before I do so, any comments on the following?
>

< above case it would insert 2 bytes of padding inbetween field1 and field2.
> above case it would insert 2 bytes of padding in between field1 and field2.


< moving field3 to sit inbetween field1 and field2 (where the padding is
> moving field3 to sit in between field1 and field2 (where the padding is
-- 
avuton
--
 Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

Re: Where is the new timerfd?

2007-11-22 Thread Andrew Morton

On Thu, 22 Nov 2007 11:46:13 -0800 (PST) Davide Libenzi <[EMAIL PROTECTED]> 
wrote:

> On Thu, 22 Nov 2007, Michael Kerrisk wrote:
> 
> > On Nov 22, 2007 6:34 PM, Davide Libenzi <[EMAIL PROTECTED]> wrote:
> > > On Thu, 22 Nov 2007, Michael Kerrisk wrote:
> > >
> > > > Hey Davide,
> > > >
> > > > Where is the new timerfd API.  In 2.6.24-rc3, I see the *old* API...
> > >
> > > Maybe Andrew stuffed the turkey with it? :) It was there. I remember it was
> > > merged. Some screw up reverted it?
> > 
> > It looks that way.
> 
> I'm looking at the log now. It never went in actually. Andrew-san, what 
> happened?
> 

Last I recall, we removed the API for 2.6.23 because we intended to do a
different interface for 2.6.24.

But I don't recall seeing any timerfd patches in maybe a month.

Re: [PATCH 2/9]: Reduce Log I/O latency

2007-11-22 Thread Matt Mackall

On Fri, Nov 23, 2007 at 10:09:22AM +1100, David Chinner wrote:
> On Fri, Nov 23, 2007 at 09:29:09AM +1100, David Chinner wrote:
> > On Thu, Nov 22, 2007 at 12:10:29PM -0600, Matt Mackall wrote:
> > > If I've got XFS on filesystems A and B on the same spindle (or volume
> > > group?) and my real RT I/O takes place only on B, then I want log
> > > flushing to happen in RT on B. But -never on A-. If I can do this with
> > > a tunable, I'm perfectly happy.
> > 
> > No, not another mount option. I'm just going to drop this one for
> > now...
> 
> Actually, I might change it to use the highest non-rt priority, which
> would solve the latency issues in the normal cases and still leave
> the RT rope dangling for those that want to use it.
> 
> Is that an acceptable compromise, Matt?

Yes, that's perfectly fine.

-- 
Mathematics is the supreme nostalgia of our time.

Re: [PATCH 2/9]: Reduce Log I/O latency

2007-11-22 Thread Matt Mackall

On Fri, Nov 23, 2007 at 09:29:09AM +1100, David Chinner wrote:
> On Thu, Nov 22, 2007 at 12:10:29PM -0600, Matt Mackall wrote:
> > On Thu, Nov 22, 2007 at 09:31:59PM +1100, David Chinner wrote:
> > [...]
> > > > In other words, I/O priority is per-spindle and not per-filesystem and
> > > > thus this change has consequences that leak outside the filesystem in
> > > > question. That's bad.
> > > 
> > > This has nothing to do with this patch - it's a problem with sharing
> > > a single resource in a RT system between two non-deterministic
> > > constructs. e.g. I can put two ext3 filesystems on the one spindle,
> > > run two completely independent RT workloads on the different
> > > filesystems and have one workload DOS the other due to differences
> > > in priority at the spindle.
> > 
> > Sure. And it's up to the RT system designer not to do something stupid
> > like that. The problem is that your patch potentially promotes a
> > non-RT I/O activity to an RT one without regard to the rest of the
> > system.
> 
> So this:
> 
> http://marc.info/?l=linux-kernel&m=119247074517414&w=2
> 
> shouldn't be allowed, either? (rt kjournald for ext3)

No, I think not. If a user wants to manually promote kjournald, that's fine.

> > Perfectly understood. And that's fine. A system designer is allowed to
> > shoot himself in the foot.
> 
> Ok. I'll point anyone that complains at you, Matt ;)
> 
> > I don't think there's any fundamental reason the I/O subsystem or
> > filesystems can't be taught to handle priority inversion, which is
> > much more acceptable and general fix.
> 
> See my reply to Andi.

I did. And I'll admit it's pretty thorny and I certainly don't know
enough about XFS internals to comment further.

> > If I've got XFS on filesystems A and B on the same spindle (or volume
> > group?) and my real RT I/O takes place only on B, then I want log
> > flushing to happen in RT on B. But -never on A-. If I can do this with
> > a tunable, I'm perfectly happy.
> 
> No, not another mount option. I'm just going to drop this one for
> now...

I was actually just suggesting allowing a user to do ioprio_set on the
appropriate kernel threads.
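
A minimal sketch of what such a call could look like from userspace,
assuming the raw ioprio_set system call is used (glibc provides no
wrapper). The constants are redeclared locally to mirror the kernel's
ioprio ABI; ioprio_value and set_thread_ioprio are names made up for
this illustration, and whether a given I/O scheduler honours the value
for a kernel thread is not something this sketch can promise.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Local copies of the kernel's ioprio ABI values. */
enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE };
enum { IOPRIO_WHO_PROCESS = 1, IOPRIO_WHO_PGRP, IOPRIO_WHO_USER };
#define IOPRIO_CLASS_SHIFT 13

/* Packs a scheduling class and a level (0 = highest, 7 = lowest). */
static int ioprio_value(int class, int level)
{
	assert(level >= 0 && level <= 7);
	return (class << IOPRIO_CLASS_SHIFT) | level;
}

/* Sets the I/O priority of one thread; returns 0 on success, -1 on error. */
static long set_thread_ioprio(pid_t tid, int class, int level)
{
	return syscall(SYS_ioprio_set, IOPRIO_WHO_PROCESS, tid,
		       ioprio_value(class, level));
}
```

With this, set_thread_ioprio(tid, IOPRIO_CLASS_RT, 4) would promote one
log thread to the RT class, while set_thread_ioprio(tid,
IOPRIO_CLASS_BE, 0) corresponds to the "highest non-rt priority"
compromise discussed in this thread. Here tid stands for the PID of the
kernel thread in question, which the administrator has to look up.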

-- 
Mathematics is the supreme nostalgia of our time.

Re: 2.6.24-rc3-mm1

2007-11-22 Thread Andrew Morton

On Thu, 22 Nov 2007 12:22:05 +0200 "Kirill A. Shutemov" <[EMAIL PROTECTED]> 
wrote:

> On x86_64 'uname -m' returns 'x86'.  It breaks many userspace programs, apt
> and rpm for example.
> 

Yes, there have been various discussions about this.  I think Sam is cooking up
a fix?

[RFC] Documentation about unaligned memory access

2007-11-22 Thread Daniel Drake

Being spoilt by the luxuries of i386/x86_64 I've never really had a good
grasp on unaligned memory access problems on other architectures and decided
it was time to figure it out. As a result I've written this documentation
which I plan to submit for inclusion as
Documentation/unaligned_memory_access.txt

Before I do so, any comments on the following?

Thanks,
Daniel




UNALIGNED MEMORY ACCESSES
=========================

Linux runs on a wide variety of architectures which have varying behaviour
when it comes to memory access. This document presents some details about
unaligned accesses, why you need to write code that doesn't cause them,
and how to write such code!


What's the definition of an unaligned access?
=============================================

Unaligned memory accesses occur when you try to read N bytes of data starting
from an address that is not evenly divisible by N (i.e. addr % N != 0).
For example, reading 4 bytes of data from address 0x1004 is fine, but
reading 4 bytes of data from address 0x1005 would be an unaligned memory
access.
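
The rule can be written down as a one-line check. This is only an
illustration of the definition above; is_naturally_aligned is a name
made up here, not an existing kernel helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if an n-byte access starting at addr satisfies addr % n == 0. */
static int is_naturally_aligned(uintptr_t addr, size_t n)
{
	assert(n != 0);
	return addr % n == 0;
}
```

For the two addresses mentioned above, is_naturally_aligned(0x1004, 4)
is true and is_naturally_aligned(0x1005, 4) is false.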


Why unaligned access is bad
===========================

Most architectures are unable to perform unaligned memory accesses. Any
unaligned access causes a processor exception.

Some architectures have an exception handler implemented in the kernel which
corrects the memory access, but this is very expensive and is not true for
all architectures. You cannot rely on the exception handler to correct your
memory accesses.

In summary: if your code causes unaligned memory accesses to happen, your code
will not work on some platforms, and will perform *very* badly on others.

You may be wondering why you have never seen these problems on your own
architecture. Some architectures (such as i386 and x86_64) do not have this
limitation, but nevertheless it is important for you to write portable code
that works everywhere.


Natural alignment
=================

The rule we mentioned earlier forms what we refer to as natural alignment:
When accessing N bytes of memory, the base memory address must be evenly
divisible by N, i.e. addr % N == 0

When writing code, assume the target architecture has natural alignment
requirements.

Sidenote: in reality, only a few architectures require natural alignment
on all sizes of memory access. However, again we must consider ALL supported
architectures; natural alignment is the only way to achieve full portability.


Code that doesn't cause unaligned access
========================================

At first, the concepts above may seem a little hard to relate to actual
coding practice. After all, you don't have a great deal of control over
memory addresses of certain variables, etc.

Fortunately things are not too complex, as in most cases, the compiler
ensures that things will work for you. For example, take the following
structure:

	struct foo {
		u16 field1;
		u32 field2;
		u8 field3;
	};

Let us assume that an instance of the above structure resides in memory
starting at address 0x1000. With a basic level of understanding, it would
not be unreasonable to expect that accessing field2 would cause an unaligned
access. You'd be expecting field2 to be located at offset 2 bytes into the
structure, i.e. address 0x1002, but that address is not evenly divisible
by 4 (remember, we're reading a 4 byte value here).

Fortunately, the compiler understands the alignment constraints, so in the
above case it would insert 2 bytes of padding inbetween field1 and field2.
Therefore, for standard structure types you can always rely on the compiler
to pad structures so that accesses to fields are suitably aligned (assuming
you do not cast the field to a type of different length).

Similarly, you can also rely on the compiler to align variables and function
parameters to a naturally aligned scheme, based on the size of the type of
the variable.

Sidenote: in the above example, you may wish to reorder the fields in the
above structure so that the overall structure uses less memory. For example,
moving field3 to sit inbetween field1 and field2 (where the padding is
inserted) would shrink the overall structure from 12 bytes to 8 under
natural alignment, because the trailing padding after field3 disappears:

	struct foo {
		u16 field1;
		u8 field3;
		u32 field2;
	};
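
The padding can be inspected directly with offsetof and sizeof. The
sketch below is a userspace approximation: the <stdint.h> types stand
in for the kernel's u16/u32/u8, and struct foo_reordered and the two
helper functions are names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original field order. */
struct foo {
	uint16_t field1;
	uint32_t field2;
	uint8_t field3;
};

/* field3 moved into the gap between field1 and field2. */
struct foo_reordered {
	uint16_t field1;
	uint8_t field3;
	uint32_t field2;
};

/* Padding bytes the compiler placed between field1 and field2 of struct foo. */
static size_t foo_padding_before_field2(void)
{
	return offsetof(struct foo, field2) - sizeof(uint16_t);
}

/* Bytes saved by the reordering on the ABI this is compiled for. */
static size_t foo_bytes_saved(void)
{
	return sizeof(struct foo) - sizeof(struct foo_reordered);
}
```

On an ABI where a 32-bit integer is 4-byte aligned, one would expect
foo_padding_before_field2() to be 2 and the two structure sizes to be
12 and 8. The exact numbers are ABI-dependent, which is why the helpers
compute them rather than hard-coding them.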

Sidenote: it should be obvious by now, but in case it is not, accessing a
single byte (u8 or char) can never cause an unaligned access, because all
memory addresses are evenly divisible by 1.


Code that causes unaligned access
=================================

With the above in mind, let's move onto a real life example of a function
that can cause an unaligned memory access. The following function adapted
from include/linux/etherdevice.h is an optimized routine to compare two
ethernet MAC addresses for equality.

unsigned int compare_ether_addr(const u8 *addr1, const u8 *addr2)
{
const u16 *a = (const u16 *) addr1;
const u16 *b =

Re: uml doesn't work on 2.6.24-rc2

2007-11-22 Thread Jeff Dike

On Thu, Nov 22, 2007 at 07:08:47PM +0100, Miklos Szeredi wrote:
> Thanks.  My other problem is (probably you are aware) that recent -mm
> kernels don't compile for UML.

The patch below fixes the build for rc3-mm1 for me.

Jeff

-- 
Work email - jdike at linux dot intel dot com

Index: linux-2.6.22/include/asm-um/asm.h
===
--- /dev/null   1970-01-01 00:00:00.0 +
+++ linux-2.6.22/include/asm-um/asm.h   2007-11-21 12:15:03.0 -0500
@@ -0,0 +1,6 @@
+#ifndef __UM_ASM_H
+#define __UM_ASM_H
+
+#include "asm/arch/asm.h"
+
+#endif

Re: Use CR0 defines.

2007-11-22 Thread Thomas Gleixner

On Mon, 19 Nov 2007, Dave Jones wrote:

> We have definitions of the CR0 bits in processor-flags.h
> Use them instead of hardcoded values in various places.

Applied, thanks

 tglx

[PATCH] NET: dmfe: don't access configuration space in D3 state

2007-11-22 Thread Maxim Levitsky

From 7e24227257f315e52fe0b494dc1253d2a0ce5dff Mon Sep 17 00:00:00 2001
From: Maxim Levitsky <[EMAIL PROTECTED]>
Date: Fri, 23 Nov 2007 01:15:36 +0200
Subject: [PATCH] NET: dmfe: don't access configuration space in D3 state
 Accidently I reversed the order of pci_save_state and
 pci_set_power_state in .suspend()/.resume() callbacks

Signed-off-by: Maxim Levitsky <[EMAIL PROTECTED]>
---
 drivers/net/tulip/dmfe.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tulip/dmfe.c b/drivers/net/tulip/dmfe.c
index 390d02d..9d199b0 100644
--- a/drivers/net/tulip/dmfe.c
+++ b/drivers/net/tulip/dmfe.c
@@ -2118,8 +2118,8 @@ static int dmfe_suspend(struct pci_dev *pci_dev, pm_message_t state)
pci_enable_wake(pci_dev, PCI_D3cold, 1);
 
/* Power down device*/
-   pci_set_power_state(pci_dev, pci_choose_state (pci_dev,state));
pci_save_state(pci_dev);
+   pci_set_power_state(pci_dev, pci_choose_state (pci_dev,state));
 
return 0;
 }
@@ -2129,8 +2129,8 @@ static int dmfe_resume(struct pci_dev *pci_dev)
struct net_device *dev = pci_get_drvdata(pci_dev);
u32 tmp;
 
-   pci_restore_state(pci_dev);
pci_set_power_state(pci_dev, PCI_D0);
+   pci_restore_state(pci_dev);
 
/* Re-initilize DM910X board */
dmfe_init_dm910x(dev);
-- 
1.5.3.4


NET: dmfe.c : fix access to card's pci config space in D3

2007-11-22 Thread Maxim Levitsky

Hi,

I somehow assumed that pci_save_state should be called while 
device is powered off, but actually the opposite is true.

Thus I am sending this patch to fix it.

Sorry for this mistake,
Best regards,
Maxim Levitsky
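
To make the ordering concrete, here is a toy userspace model of the two
orders. It is a sketch under stated assumptions, not the dmfe driver and
not the PCI core: all toy_* names are made up, and the model simply
assumes that configuration reads outside D0 return all ones, that writes
outside D0 are dropped, and that the register is reset when the device
leaves D0.

```c
#include <assert.h>
#include <stdint.h>

enum toy_power { TOY_D0, TOY_D3 };

struct toy_dev {
	enum toy_power power;
	uint32_t config;  /* live configuration register */
	uint32_t saved;   /* copy kept by the driver across suspend */
};

/* Model assumption: config reads outside D0 return all ones. */
static uint32_t toy_read_config(const struct toy_dev *d)
{
	return d->power == TOY_D0 ? d->config : 0xffffffffu;
}

/* Model assumption: config writes outside D0 are dropped. */
static void toy_write_config(struct toy_dev *d, uint32_t v)
{
	if (d->power == TOY_D0)
		d->config = v;
}

/* Model assumption: the register is reset when the device leaves D0. */
static void toy_set_power(struct toy_dev *d, enum toy_power p)
{
	if (d->power == TOY_D0 && p != TOY_D0)
		d->config = 0;
	d->power = p;
}

static void toy_save_state(struct toy_dev *d)
{
	d->saved = toy_read_config(d);
}

static void toy_restore_state(struct toy_dev *d)
{
	toy_write_config(d, d->saved);
}

/* Order after the patch: save while still in D0, then power down. */
static void toy_suspend_fixed(struct toy_dev *d)
{
	toy_save_state(d);
	toy_set_power(d, TOY_D3);
}

/* Order after the patch: power up first, then restore. */
static void toy_resume_fixed(struct toy_dev *d)
{
	toy_set_power(d, TOY_D0);
	toy_restore_state(d);
}

/* Order before the patch: the save happens once the device is in D3. */
static void toy_suspend_reversed(struct toy_dev *d)
{
	toy_set_power(d, TOY_D3);
	toy_save_state(d);
}

/* Order before the patch: the restore happens while still in D3. */
static void toy_resume_reversed(struct toy_dev *d)
{
	toy_restore_state(d);
	toy_set_power(d, TOY_D0);
}
```

In this model a suspend/resume cycle with the fixed order brings the
register back to its original value, while the reversed order saves
garbage and restores nothing.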


Re: [PATCH] PPC: CHRP - fix possible NULL pointer dereference

2007-11-22 Thread Stephen Rothwell

On Thu, 22 Nov 2007 22:54:23 +0300 Cyrill Gorcunov <[EMAIL PROTECTED]> wrote:
>
> This patch does fix possible NULL pointer dereference
> inside of strncmp() if of_get_property() failed.

Thanks for this.

>  static void __init sio_init(void)
>  {
>   struct device_node *root;
> + const char *model = NULL;

You don't need this initialization as you always assign the variable
before you use it.

> + root = of_find_node_by_path("/");
> + if (root) {

	if (!root)
		return;

would save a level of indentation. Not important.

-- 
Cheers,
Stephen Rothwell                    [EMAIL PROTECTED]
http://www.canb.auug.org.au/~sfr/



Re: [PATCH 2/9]: Reduce Log I/O latency

2007-11-22 Thread David Chinner

On Fri, Nov 23, 2007 at 09:29:09AM +1100, David Chinner wrote:
> On Thu, Nov 22, 2007 at 12:10:29PM -0600, Matt Mackall wrote:
> > If I've got XFS on filesystems A and B on the same spindle (or volume
> > group?) and my real RT I/O takes place only on B, then I want log
> > flushing to happen in RT on B. But -never on A-. If I can do this with
> > a tunable, I'm perfectly happy.
> 
> No, not another mount option. I'm just going to drop this one for
> now...

Actually, I might change it to use the highest non-rt priority, which
would solve the latency issues in the normal cases and still leave
the RT rope dangling for those that want to use it.

Is that an acceptable compromise, Matt?

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: [PATCH 2/9]: Reduce Log I/O latency

2007-11-22 Thread David Chinner

On Thu, Nov 22, 2007 at 12:10:29PM -0600, Matt Mackall wrote:
> On Thu, Nov 22, 2007 at 09:31:59PM +1100, David Chinner wrote:
> [...]
> > > In other words, I/O priority is per-spindle and not per-filesystem and
> > > thus this change has consequences that leak outside the filesystem in
> > > question. That's bad.
> > 
> > This has nothing to do with this patch - it's a problem with sharing
> > a single resource in a RT system between two non-deterministic
> > constructs. e.g. I can put two ext3 filesystems on the one spindle,
> > run two completely independent RT workloads on the different
> > filesystems and have one workload DOS the other due to differences
> > in priority at the spindle.
> 
> Sure. And it's up to the RT system designer not to do something stupid
> like that. The problem is that your patch potentially promotes a
> non-RT I/O activity to an RT one without regard to the rest of the
> system.

So this:

http://marc.info/?l=linux-kernel&m=119247074517414&w=2

shouldn't be allowed, either? (rt kjournald for ext3)

> Perfectly understood. And that's fine. A system designer is allowed to
> shoot himself in the foot.

Ok. I'll point anyone that complains at you, Matt ;)

> I don't think there's any fundamental reason the I/O subsystem or
> filesystems can't be taught to handle priority inversion, which is
> much more acceptable and general fix.

See my reply to Andi.

> If I've got XFS on filesystems A and B on the same spindle (or volume
> group?) and my real RT I/O takes place only on B, then I want log
> flushing to happen in RT on B. But -never on A-. If I can do this with
> a tunable, I'm perfectly happy.

No, not another mount option. I'm just going to drop this one for
now...

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

Re: snd_hda_intel 2.6.24-rc2 bug: interrupts don't always work on Lenovo X60s

2007-11-22 Thread Theodore Tso

On Tue, Nov 13, 2007 at 04:43:05AM +0100, Takashi Iwai wrote:
> > By the way, the "polling mode" seems to work OK: I still get normal
> > playback of music etc.
> 
> Yes, the polling mode should work in most cases, too.

Out of curiosity, how many wakeups/interrupts are involved with the
sound going into polling mode?  Is it going to make a difference as
far as battery life is concerned?

I'm seeing the message:

hda_intel: azx_get_response timeout, switching to polling mode: last cmd=0x005f000c

on my X61s laptop as well, where the last_cmd varies quite a bit.
Over the past two weeks, I've seen last cmd be:

0x003f000c, 0x004f000c, 0x005f000c, 0x006f000c, 0x00db8000,
0x011b8000, 0x011ba000, 0x012ba000, 0x012f000c, 0x013f000c,
0x014f000d, 0x019f000c, 0x020b0001, 0x020b0003, 0x020b2000,
0x020b2001, 0x020b2002, 0x025f0012

Interestingly, when I was using post-2.6.24-rc1 and -rc2 kernels, I
was getting a lot of these "switching to polling mode" messages,
usually within a minute of the machine booting.  Now that I have
switched to a recent rc3 kernel, they seem to have largely gone away.

Looking at my kernel, it looks like the patch you suggested to Roland
was *not* applied, and "git log sound/pci/hda" shows that the only
change to that directory was a patch from Ingo Molnar that I had
cherry picked from LKML.  Given that we were doing a
schedule_timeout_uninterruptible for a full second, that certainly
seems to be a likely candidate for why we were getting the response
timeout message!  Does this analysis make sense to you?

Regards,

- Ted

commit 2f7e58208e0d59ca6e4ad1561f47391d4efa19fa
Author: Ingo Molnar <[EMAIL PROTECTED]>
Date:   Fri Nov 16 11:35:05 2007 -0500

snd hda suspend latency: shorten codec read

not sleeping for every codec read/write but doing a short udelay and
a conditional reschedule has cut suspend+resume latency by about 1
second on my T60.

Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>

diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index 3fa0f97..62b9fb3 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -555,7 +555,8 @@ static unsigned int azx_rirb_get_response(struct hda_codec *codec)
}
if (!chip->rirb.cmds)
return chip->rirb.res; /* the last value */
-   schedule_timeout_uninterruptible(1);
+   udelay(10);
+   cond_resched();
} while (time_after_eq(timeout, jiffies));

if (chip->msi) {
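
The shape of the loop in the hunk above (check for a response, wait
briefly, give up at a deadline) can be sketched in userspace as
follows. This is an analogy only: nanosleep stands in for the
udelay()/cond_resched() pair, the deadline uses CLOCK_MONOTONIC instead
of jiffies, and poll_until, now_ns and ready_after_n_polls are names
made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef int (*ready_fn)(void *ctx);

/* Current monotonic time in nanoseconds. */
static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Calls ready(ctx) until it returns nonzero or timeout_ms has elapsed,
 * pausing delay_us between attempts. Returns 1 on success, 0 on timeout.
 */
static int poll_until(ready_fn ready, void *ctx, long timeout_ms, long delay_us)
{
	long long deadline = now_ns() + timeout_ms * 1000000LL;
	struct timespec delay = { delay_us / 1000000, (delay_us % 1000000) * 1000 };

	assert(ready != NULL);
	do {
		if (ready(ctx))
			return 1;
		nanosleep(&delay, NULL);
	} while (now_ns() <= deadline);
	return 0;
}

/* Example condition: not ready for the first *ctx calls, ready afterwards. */
static int ready_after_n_polls(void *ctx)
{
	int *remaining = ctx;

	return (*remaining)-- <= 0;
}
```

The shorter the pause between attempts, the sooner a ready response is
noticed, which is the effect the commit message describes for suspend
and resume latency.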

Re: [PATCH 17/59] drivers/ide: Add missing "space"

2007-11-22 Thread Bartlomiej Zolnierkiewicz

On Tuesday 20 November 2007, Joe Perches wrote:
> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>

applied

WARNING: at kernel/resource.c:189 __release_resource

2007-11-22 Thread Jiri Slaby

Hi,

Step aside. What's the purpose of having two similar patches for one issue,
it then warns about the same thing twice:
make-sure-nobodys-leaking-resources.patch
releasing-resources-with-children.patch

Ok, I hit the bug, suspend of 00:06 device complains about it:
WARNING: at .../kernel/resource.c:185 __release_resource()

Call Trace:
 [] release_resource+0xb5/0xf0
 [] pnp_release_resources+0x70/0x130
 [] pnp_stop_dev+0x45/0x90
 [] pnp_bus_suspend+0x92/0xb0
 [] suspend_device+0x113/0x180
 [] device_suspend+0x200/0x320
 [] suspend_devices_and_enter+0xa5/0x170
 [] enter_state+0x209/0x270
 [] state_store+0xaf/0xf0
 [] kobj_attr_store+0x17/0x20
 [] sysfs_write_file+0xce/0x140
 [] vfs_write+0xc7/0x170
 [] sys_write+0x50/0x90
 [] system_call+0x7e/0x83

# LANG=en ll /sys/devices/pnp0/00:06/
total 0
lrwxrwxrwx 1 root root    0 Nov 22 22:35 driver -> ../../../bus/pnp/drivers/serial
-r--r--r-- 1 root root 4096 Nov 22 22:35 id
-r--r--r-- 1 root root 4096 Nov 22 22:35 options
drwxr-xr-x 2 root root    0 Nov 22 22:35 power
-rw-r--r-- 1 root root 4096 Nov 22 22:35 resources
lrwxrwxrwx 1 root root    0 Nov 22 22:35 subsystem -> ../../../bus/pnp
drwxr-xr-x 3 root root    0 Nov 22 22:35 tty
-rw-r--r-- 1 root root 4096 Nov 22 22:35 uevent

regards,
-- 
Jiri Slaby ([EMAIL PROTECTED])
Faculty of Informatics, Masaryk University

Re: System reboot triggered by just reading a device file....!?

2007-11-22 Thread devzero

Hi Clemens, 

thanks, but i know i could do this.

this thread is not meant to protect myself from this curiosity; it is meant
to protect others.
it's a trap.
i stepped into it.
now i know the trap, so i can easily sidestep it.

but most people using linux don't know about the watchdog, so i don't think
they will know about this trap.
you can't make that common knowledge.

and we can't expect that they will find out _what_ the trap is at all, if they
step into it.
having this behaviour documented is like putting a sign "don't step into this"
at the back of the trap.

so why shouldn't we help them avoid it?

it may be very seldom that someone steps into this.
but it may happen, and then someone will have trouble and spend time on it.
i think every admin can tell you about weird random reboots of his systems
whose cause he cannot explain.
this may be one of those causes, and this one could be avoided.
i'm thinking of something simple like echo "now you're armed" > /dev/watchdog

regards
roland


> -Original Message-
> From: "Clemens Koller" <[EMAIL PROTECTED]>
> Sent: 22.11.07 21:43:15
> To: [EMAIL PROTECTED]
> CC: Simon Arlott <[EMAIL PROTECTED]>, Robert Hancock <[EMAIL PROTECTED]>,
> linux-kernel@vger.kernel.org
> Subject: Re: System reboot triggered by just reading a device file!?

> 
> [EMAIL PROTECTED] wrote:
> 
>  > [was: reading /dev/watchdog triggers reboot as intended]
>  > need to change my own philosophy now, because i learned that reading isn't
>  > harmless.   ;)
> 
> If you want to protect yourself from your curiosity (or from reading anything),
> you could just disable the watchdog in the kernel.
> See: Device Drivers -> Character devices -> Watchdog Timer Support -> ...
> 
> Regards,
> -- 
> Clemens Koller
> __
> R Imaging Devices
> Anagramm GmbH
> Rupert-Mayer-Straße 45/1
> Linhof Werksgelände
> D-81379 München
> Tel.089-741518-50
> Fax 089-741518-19
> http://www.anagramm-technology.com
> 



Re: [2.6.24-rc] pdflush stuck in D state

2007-11-22 Thread Rafael J. Wysocki

On Thursday, 22 of November 2007, [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] wrote on 22/11/2007 16:28:22:
> 
> > [EMAIL PROTECTED] wrote on 22/11/2007 16:34:01:
> > 
> > > On Thursday, 22 of November 2007, Tvrtko A. Ursulin wrote:
> > > > 
> > > > Hi all,
> > > > 
> > > > So as the subject says - pdflush is stuck i D state which means 
> > > constant load 
> > > > average of at least one:
> > > > 
> > > > root       151  0.0  0.0      0     0 ?        D    Nov20   0:33 [pdflush]
> > > 
> > > Hm, this is supposed to be fixed.  Please see:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=9323
> > 
> > Doesn't look like the same bug to me - I have only ever seen pdflush get 
> 
> > stuck in D, not random processes.
> > 
> > Also, I can confirm that -rc3 I am running contains the patch referenced 
> 
> > in your bugzilla link above.
> 
> Here is a trace for good and bad pdflush processes:
> 
> Nov 22 16:35:09 oxygene kernel:  ===
> Nov 22 16:35:09 oxygene kernel: pdflush   S c014fe4e 0   150 2
> Nov 22 16:35:09 oxygene kernel: 0046  c014fe4e 
> f7c30fa4 c014f9c6 f7ce8560 f7ce869c
> Nov 22 16:35:09 oxygene kernel:c201e100 0001 fffc 0002 
>    
> Nov 22 16:35:09 oxygene kernel:0021 f7c30fc8 c014fe4e  
>  c014ff09 0202 f7ce8560
> Nov 22 16:35:09 oxygene kernel: Call Trace:
> Nov 22 16:35:09 oxygene kernel:  [pdflush+0/434] pdflush+0x0/0x1b2
> Nov 22 16:35:09 oxygene kernel:  [background_writeout+125/173] background_writeout+0x7d/0xad
> Nov 22 16:35:09 oxygene kernel:  [pdflush+0/434] pdflush+0x0/0x1b2
> Nov 22 16:35:09 oxygene kernel:  [pdflush+187/434] pdflush+0xbb/0x1b2
> Nov 22 16:35:09 oxygene kernel:  [kthread+56/96] kthread+0x38/0x60
> Nov 22 16:35:09 oxygene kernel:  [kthread+0/96] kthread+0x0/0x60
> Nov 22 16:35:09 oxygene kernel:  [kernel_thread_helper+7/16] kernel_thread_helper+0x7/0x10
> Nov 22 16:35:09 oxygene kernel:  ===
> Nov 22 16:35:09 oxygene kernel: pdflush   D f7c3aef4 0   151 2
> Nov 22 16:35:09 oxygene kernel:0b83b953 0046 0002 f7c3aef4 
> f7c3aeec  f7ce8ac0 f7ce8bfc
> Nov 22 16:35:09 oxygene kernel:c201e100 0001 f7c3af18 0b83b9cc 
> f7c3af18 0003  0001
> Nov 22 16:35:09 oxygene kernel: f7c3af18 0b83b9cc f7c3af70 
>  c027ddb7 f781e000 c010490c
> Nov 22 16:35:09 oxygene kernel: Call Trace:
> Nov 22 16:35:09 oxygene kernel:  [schedule_timeout+112/141] schedule_timeout+0x70/0x8d
> Nov 22 16:35:09 oxygene kernel:  [apic_timer_interrupt+40/48] apic_timer_interrupt+0x28/0x30
> Nov 22 16:35:09 oxygene kernel:  [process_timeout+0/5] process_timeout+0x0/0x5
> Nov 22 16:35:09 oxygene kernel:  [schedule_timeout+107/141] schedule_timeout+0x6b/0x8d
> Nov 22 16:35:09 oxygene kernel:  [io_schedule_timeout+27/36] io_schedule_timeout+0x1b/0x24
> Nov 22 16:35:09 oxygene kernel:  [congestion_wait+80/100] congestion_wait+0x50/0x64
> Nov 22 16:35:09 oxygene kernel:  [autoremove_wake_function+0/53] autoremove_wake_function+0x0/0x35
> Nov 22 16:35:09 oxygene kernel:  [pdflush+0/434] pdflush+0x0/0x1b2
> Nov 22 16:35:09 oxygene kernel:  [wb_kupdate+152/219] wb_kupdate+0x98/0xdb
> Nov 22 16:35:09 oxygene kernel:  [pdflush+282/434] pdflush+0x11a/0x1b2
> Nov 22 16:35:09 oxygene kernel:  [wb_kupdate+0/219] wb_kupdate+0x0/0xdb
> Nov 22 16:35:09 oxygene kernel:  [kthread+56/96] kthread+0x38/0x60
> Nov 22 16:35:09 oxygene kernel:  [kthread+0/96] kthread+0x0/0x60
> Nov 22 16:35:09 oxygene kernel:  [kernel_thread_helper+7/16] kernel_thread_helper+0x7/0x10
> Nov 22 16:35:09 oxygene kernel:  ===

Can you attach these traces to the Bugzilla entry at:
http://bugzilla.kernel.org/show_bug.cgi?id=9441 , please?

Thanks,
Rafael

Re: [2.6.24-rc] pdflush stuck in D state

2007-11-22 Thread Rafael J. Wysocki

On Thursday, 22 of November 2007, [EMAIL PROTECTED] wrote:
> [EMAIL PROTECTED] wrote on 22/11/2007 16:28:22:
> 
> > [EMAIL PROTECTED] wrote on 22/11/2007 16:34:01:
> > 
> > > On Thursday, 22 of November 2007, Tvrtko A. Ursulin wrote:
> > > > 
> > > > Hi all,
> > > > 
> > > > So as the subject says - pdflush is stuck i D state which means 
> > > constant load 
> > > > average of at least one:
> > > > 
> > > > root       151  0.0  0.0      0     0 ?        D    Nov20   0:33 [pdflush]
> > > 
> > > Hm, this is supposed to be fixed.  Please see:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=9323
> > 
> > Doesn't look like the same bug to me - I have only ever seen pdflush get 
> 
> > stuck in D, not random processes.
> > 
> > Also, I can confirm that -rc3 I am running contains the patch referenced 
> 
> > in your bugzilla link above.

OK, so I'm adding this to the list of known regressions.

Thanks,
Rafael

Re: System reboot triggered by just reading a device file....!?

2007-11-22 Thread Clemens Koller


[EMAIL PROTECTED] wrote:

> [was: reading /dev/watchdog triggers reboot as intended]
> need to change my own philosophy now, because i learned that reading isn't
> harmless.   ;)

If you want to protect yourself from your curiosity (or from reading anything),
you could just disable the watchdog in the kernel.
See: Device Drivers -> Character devices -> Watchdog Timer Support -> ...

Regards,
--
Clemens Koller
__
R Imaging Devices
Anagramm GmbH
Rupert-Mayer-Straße 45/1
Linhof Werksgelände
D-81379 München
Tel.089-741518-50
Fax 089-741518-19
http://www.anagramm-technology.com

Re: nohz and strange sleep latencies

2007-11-22 Thread Thomas Gleixner

On Thu, 22 Nov 2007, Pavel Machek wrote:
> > but perhaps somehow we miss this fact and fail to turn off the lapic 
> > clockevents drivers?
> 
> Ok, I guess I'm lost. If I offline second CPU, I immediately get
> 1000Hz timer tick... is that expected?

Hmm. No. I have no idea why this is happening.

34196 total events, 55.083 events/sec
echo 0 >/sys/devices/system/cpu/cpu1/online
36073 total events, 54.679 events/sec

> I'm trying to decide when system is idle (lets say that means "no user
> task is scheduled to wakeup within 10 seconds)... I added some
> instrumentation to nohz subsystem, but it does not behave like I'd
> expect: even if I run "while true; do sleep .01; done" loop, I see
> nohz preparing for 5 seconds sleep... while it seems obvious that it
> can only be 10msec sleep, and with max_cstate=1, it works that
> way... Plus, nte->start_pid seems to contain some random numbers :-(.
> 
> What am I doing wrong?
> 
> (Patch for illustration, I can generate full diff against vanilla,
> but...)

Just to make sure what we are hunting: Do you have the same problem
with a non-pavel-tainted 2.6.24-rc3?

 tglx

[PATCH] fix plip 2

2007-11-22 Thread Mikulas Patocka

This is my second patch for plip. Plip passes string "name" that is 
allocated on stack to parport_register_device. parport_register_device 
holds the pointer to "name" and when the registering function exits, it 
points nowhere.

On some machines, this bug causes bad names to appear in the /proc filesystem, 
such as /proc/sys/dev/parport/parport0/devices/T^/ÁX^/Á, on others, the 
plip proc node is completely missing.

The patch also fixes documentation to note this requirement.
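
To illustrate the lifetime requirement, here is a minimal userspace sketch. It is not the parport code: struct toy_netdev, struct toy_pardevice, toy_register() and toy_attach() are hypothetical stand-ins, and the only assumption taken from the description above is that registration keeps the pointer rather than copying the string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for struct net_device and struct pardevice. */
struct toy_netdev {
	char name[16];            /* lives as long as the device itself */
};

struct toy_pardevice {
	const char *name;         /* pointer is stored, the string is NOT copied */
};

/* Mimics the relevant behaviour: only the pointer is remembered. */
static void toy_register(struct toy_pardevice *pd, const char *name)
{
	assert(name != NULL);
	pd->name = name;
}

/*
 * Fixed pattern: pass storage owned by the long-lived device object.
 * The buggy pattern would build the name in a local char array here and
 * pass that; the stored pointer would dangle once this function returns.
 */
static void toy_attach(struct toy_netdev *dev, struct toy_pardevice *pd,
		       int unit)
{
	snprintf(dev->name, sizeof(dev->name), "plip%d", unit);
	toy_register(pd, dev->name);
}
```

After toy_attach() returns, pd->name still points into dev, so it stays readable for as long as dev exists. That is the property the one-line change to plip is after.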

Mikulas

Signed-off-by: Mikulas Patocka <[EMAIL PROTECTED]>

diff -u -r linux-2.6.24-rc2/Documentation/parport-lowlevel.txt linux-2.6.24-test/Documentation/parport-lowlevel.txt
--- linux-2.6.24-rc2/Documentation/parport-lowlevel.txt 2007-11-06 22:57:46.0 +0100
+++ linux-2.6.24-test/Documentation/parport-lowlevel.txt 2007-11-22 21:11:28.0 +0100
@@ -339,6 +339,10 @@
 ('port').  Once you have done that, you will be able to use
 parport_claim and parport_release in order to use the port.
 
+The ('name') argument is the name of the device that appears in /proc
+filesystem. The string must be valid for the whole lifetime of the
+device (until parport_unregister_device is called).
+
 This function will register three callbacks into your driver:
 'preempt', 'wakeup' and 'irq'.  Each of these may be NULL in order to
 indicate that you do not want a callback.
diff -u -r linux-2.6.24-rc2/drivers/net/plip.c linux-2.6.24-test/drivers/net/plip.c
--- linux-2.6.24-rc2/drivers/net/plip.c 2007-11-06 22:57:46.0 +0100
+++ linux-2.6.24-test/drivers/net/plip.c 2007-11-22 21:11:28.0 +0100
@@ -1269,7 +1269,7 @@
 
 	nl = netdev_priv(dev);
 	nl->dev = dev;
-	nl->pardev = parport_register_device(port, name, plip_preempt,
+	nl->pardev = parport_register_device(port, dev->name, plip_preempt,
 						 plip_wakeup, plip_interrupt,
 						 0, dev);

Re: [PATCH] fix plip 1

2007-11-22 Thread Mikulas Patocka

I forgot to add:
Signed-off-by: Mikulas Patocka <[EMAIL PROTECTED]>

> diff -u -r linux-2.6.24-rc2/drivers/net/plip.c linux-2.6.24-test/drivers/net/plip.c
> --- linux-2.6.24-rc2/drivers/net/plip.c   2007-11-06 22:57:46.0 +0100
> +++ linux-2.6.24-test/drivers/net/plip.c  2007-11-22 21:11:28.0 +0100
> @@ -663,7 +663,7 @@
>   case PLIP_PK_DONE:
>   /* Inform the upper layer for the arrival of a packet. */
>   rcv->skb->protocol=plip_type_trans(rcv->skb, dev);
> - netif_rx(rcv->skb);
> + netif_rx_ni(rcv->skb);
>   dev->last_rx = jiffies;
>   dev->stats.rx_bytes += rcv->length.h;
>   dev->stats.rx_packets++;
> 

Re: Possible bug from kernel 2.6.22 and above

2007-11-22 Thread Matt Mackall

On Wed, Nov 21, 2007 at 09:58:10PM -0500, Jie Chen wrote:
> Simon Holm Thøgersen wrote:
> >ons, 21 11 2007 kl. 20:52 -0500, skrev Jie Chen:
> 
> >There is a backport of the CFS scheduler to 2.6.21, see
> >http://lkml.org/lkml/2007/11/19/127
> >
> Hi, Simon:
> 
> I will try that after the thanksgiving holiday to find out whether the 
> odd behavior will show up using 2.6.21 with back ported CFS.
> 
> Kernel 2.6.21
> Number of Threads                 2          4          6          8
> SpinLock (Time micro second)  10.5618    10.58538   10.5915    10.643
>          (Overhead)            0.073      0.05746    0.102805   0.154563
> Barrier  (Time micro second)  11.020410  11.678125  11.9889    12.38002
>          (Overhead)            0.531660   1.1502     1.500112   1.891617
> 
> Each thread is bound to a particular core using pthread_setaffinity_np.
> 
> Kernel 2.6.23.8
> Number of Threads                 2          4          6          8
> SpinLock (Time micro second)  14.849915  17.117603  14.4496    10.5990
>          (Overhead)            4.345417   6.617207   3.949435   0.110985
> Barrier  (Time micro second)  19.462255  20.285117  16.19395   12.37662
>          (Overhead)            8.957755   9.784722   5.699590   1.869518
> 
> 
> >
> >
> >Simon Holm Thøgersen
> >
> >
> I just ran a simple test to prove that the problem may be related to 
> load balance of the scheduler. I first started 6 processes using 
> "taskset -c 2 donothing&; taskset -c 3 donothing&; ..., taskset -c 7 
> donothing". These 6 processes will run on core 2 to 7. Then I started my 
> test program using two threads bound to core 0 and 1. Here is the result:
> 
> Two threads on Kernel 2.6.23.8:
> SpinLock (Time micro second) 10.558255
>  (Overhead)  0.068965
> Barrier  (Time micro second) 10.865520
>  (Overhead)  0.376230
> 
> Similarly, I started 4 donothing processes on core 4, 5, 6 and 7, and 
> ran the test program. I have the following result:
> 
> Four threads on Kernel 2.6.23.8:
> SpinLock (Time micro second) 10.579413
>  (Overhead)  0.090023
> Barrier  (Time micro second) 11.363193
>  (Overhead)  0.873803
> 
> Finally, here is the result for 6 threads with two donothing processes 
> running on core 6 and 7:
> 
> Six threads on Kernel 2.6.23.8:
> SpinLock (Time micro second) 10.590030
>  (Overhead)  0.100940
> Barrier  (Time micro second) 11.977548
>  (Overhead)  1.488458
> 
> Now the above results are very much similar to the results obtained for 
> the kernel 2.6.21. I hope this helps you guys in some ways. Thank you.

Yes, this really does look like a scheduling regression. I've added
Ingo to the cc: list. Next time you should pick a more descriptive
subject line - we've got lots of email about possible bugs.

-- 
Mathematics is the supreme nostalgia of our time.

[PATCH] fix plip 1

2007-11-22 Thread Mikulas Patocka

Hi

netif_rx is meant to be called from interrupts because it doesn't wake up 
ksoftirqd. For calling from outside interrupts, netif_rx_ni exists.

This patch fixes plip to use netif_rx_ni. It fixes the infamous error 
"NOHZ: local_softirq_pending 08" that happens on some machines with NOHZ 
and plip --- it is caused by the fact that softirq is pending and 
ksoftirqd is sleeping.
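
The difference between the two calls can be modelled in a few lines of userspace C. This is only a toy model, not kernel code: the toy_* names and the three counters are hypothetical, and it assumes that the _ni variant queues the packet and then runs the softirq itself if one is pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state; none of this is kernel code. */
static bool softirq_pending;   /* models local_softirq_pending() */
static int  backlog;           /* packets queued, not yet processed */
static int  delivered;         /* packets handed to the upper layer */

/* Models do_softirq(): drain the backlog and clear the pending flag. */
static void toy_do_softirq(void)
{
	delivered += backlog;
	backlog = 0;
	softirq_pending = false;
}

/* Models netif_rx(): queue the packet and only mark the softirq pending. */
static void toy_netif_rx(void)
{
	backlog++;
	softirq_pending = true;
}

/* Models netif_rx_ni(): queue, then run the softirq if one is pending. */
static void toy_netif_rx_ni(void)
{
	toy_netif_rx();
	if (softirq_pending)
		toy_do_softirq();
}
```

In the model, a call to toy_netif_rx() from outside an interrupt leaves softirq_pending set with nobody scheduled to clear it, which corresponds to the state the NOHZ message complains about. toy_netif_rx_ni() never returns with the flag set.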

Mikulas

diff -u -r linux-2.6.24-rc2/drivers/net/plip.c linux-2.6.24-test/drivers/net/plip.c
--- linux-2.6.24-rc2/drivers/net/plip.c 2007-11-06 22:57:46.0 +0100
+++ linux-2.6.24-test/drivers/net/plip.c 2007-11-22 21:11:28.0 +0100
@@ -663,7 +663,7 @@
 	case PLIP_PK_DONE:
 		/* Inform the upper layer for the arrival of a packet. */
 		rcv->skb->protocol=plip_type_trans(rcv->skb, dev);
-		netif_rx(rcv->skb);
+		netif_rx_ni(rcv->skb);
 		dev->last_rx = jiffies;
 		dev->stats.rx_bytes += rcv->length.h;
 		dev->stats.rx_packets++;

[patch][v3] x86, ptrace: support for branch trace store(BTS)

2007-11-22 Thread Metzger, Markus T

Changes to previous version(s):
- moved task arrives/departs notifications to __switch_to_xtra()
- added _TIF_BTS_TRACE and _TIF_BTS_TRACE_TS to _TIF_WORK_CTXSW_*
- split _TIF_WORK_CTXSW into ~_PREV and ~_NEXT for x86_64
- ptrace_bts_init_intel() function called from init_intel()
- removed PTRACE_BTS_INIT ptrace command
- cache DEBUGCTRL MSR
- replace struct declarations and operations struct with
  configuration struct defining offset/size pairs and
  generic operations
- added a patch for the ptrace.2 man page for discussing the API
  in this forum


Support for Intel's last branch recording to ptrace. This gives debuggers
access to this hardware feature and allows them to show an execution trace
of the debugged application.

Last branch recording (see section 18.5 in the Intel 64 and IA-32
Architectures Software Developer's Manual) allows taking an execution
trace of the running application without instrumentation. When a branch
is executed, the hardware logs the source and destination address in a
cyclic buffer given to it by the OS.

This can be a great debugging aid. It shows you how exactly you got
where you currently are without requiring you to do lots of single
stepping and rerunning.

This patch manages the various buffers, configures the trace
hardware, disentangles the trace, and provides a user interface via
ptrace. On the high-level design:
- there is one optional trace buffer per thread_struct
- upon a context switch, the trace hardware is reconfigured to either
  disable tracing or to use the appropriate buffer for the new task.
  - tracing induces ~20% overhead as branch records are sent out on
the bus. 
  - the hardware collects trace per processor. To disentangle the
traces for different tasks, we use separate buffers and reconfigure
the trace hardware.
- the low-level data layout is configured at cpu initialization time
  - different processors use different branch record formats
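
As a rough illustration of the cyclic buffer idea, here is a small userspace sketch. It does not reflect the hardware record layout or the code in this patch: struct toy_branch, struct toy_bts, TOY_BTS_SIZE and both functions are hypothetical, and the "age" based read is just one possible way to walk the trace from the newest entry backwards.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BTS_SIZE 4   /* tiny on purpose; hypothetical */

struct toy_branch {
	unsigned long from_ip;
	unsigned long to_ip;
};

struct toy_bts {
	struct toy_branch rec[TOY_BTS_SIZE];
	size_t next;    /* slot the next record will be written to */
	size_t count;   /* number of valid records, at most TOY_BTS_SIZE */
};

/* Log one branch, overwriting the oldest record once the buffer is full. */
static void toy_bts_log(struct toy_bts *b, unsigned long from, unsigned long to)
{
	b->rec[b->next].from_ip = from;
	b->rec[b->next].to_ip = to;
	b->next = (b->next + 1) % TOY_BTS_SIZE;
	if (b->count < TOY_BTS_SIZE)
		b->count++;
}

/*
 * Return the record that is 'age' steps back from the newest one
 * (age 0 is the most recent branch), or NULL if no such record exists.
 */
static const struct toy_branch *toy_bts_read(const struct toy_bts *b, size_t age)
{
	if (age >= b->count)
		return NULL;
	return &b->rec[(b->next + TOY_BTS_SIZE - 1 - age) % TOY_BTS_SIZE];
}
```

With one such buffer per traced task, switching buffers on a context switch keeps the traces of different tasks apart, which is the disentangling step mentioned in the list above.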

Opens:
- support for more processors (to come)
- ptrace command numbers (just picked some numbers randomly)


Here is the patch to the ptrace.2 man page I found on my linux system
---

Index: man/man2/ptrace.2
===
--- man.orig/man2/ptrace.2  2007-11-22 20:25:21.%N +0100
+++ man/man2/ptrace.2   2007-11-22 20:25:33.%N +0100
@@ -40,7 +40,10 @@
 .\"PTRACE_SETSIGINFO, PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP
 .\"(Thanks to Blaisorblade, Daniel Jacobowitz and others who helped.)
 .\"
-.TH PTRACE 2 2006-03-24 "Linux 2.6.16" "Linux Programmer's Manual"
+.\" Modified Nov 2007, Markus Metzger <[EMAIL PROTECTED]>
+.\" Added PTRACE_BTS_* commands
+.\"
+.TH PTRACE 2 2007-11 "Linux 2.6.16" "Linux Programmer's Manual"
 .SH NAME
 ptrace \- process trace
 .SH SYNOPSIS
@@ -312,6 +315,96 @@
 detached in this way regardless of which method was used to initiate
 tracing.
 (\fIaddr\fP is ignored.)
+.LP
+The following ptrace commands provide access to the hardware's last
+branch recording. They may not be available on all architectures.
+.LP
+Last branch recording stores an execution trace of the traced process
+in a circular buffer (called Branch Trace Store). For every
+(conditional) control flow change, the source and destination address
+are stored. On some architectures, control flow changes inside the
+kernel may be recorded, as well. On later architectures, these are
+automatically filtered out.
+.LP
+In addition to branches, timestamps may optionally be recorded when
+the traced process arrives and departs, respectively. This information
+can be used to obtain a qualitative execution order, if more than one
+process is traced.
+.LP
+.nf
+enum ptrace_bts_qualifier {
+   PTRACE_BTS_INVALID = 0,
+   PTRACE_BTS_BRANCH,
+   PTRACE_BTS_TASK_ARRIVES,
+   PTRACE_BTS_TASK_DEPARTS
+};
+.sp
+struct ptrace_bts_record {
+   enum ptrace_bts_qualifier qualifier;
+   union {
+   /* PTRACE_BTS_BRANCH */
+   struct {
+   void *from_ip;
+   void *to_ip;
+   } lbr;
+   /* PTRACE_BTS_TASK_ARRIVES or
+  PTRACE_BTS_TASK_DEPARTS */
+   unsigned long long timestamp;
+   } variant;
+};
+.fi
+.LP
+.TP
+PTRACE_BTS_MAX_BUFFER
+This is not a ptrace command but a macro that defines the maximal size
+of the BTS buffer in number of BTS records.
+.TP
+PTRACE_BTS_ALLOCATE_BUFFER
+Allocate a new BTS buffer big enough to hold \fIdata\fP \fBstruct
+ptrace_bts_record\fP entries.
+\fIData\fP must be in the range of 0..PTRACE_BTS_MAX_BUFFER.
+If a buffer is already allocated, that buffer is freed after the new
+buffer was successfully allocated. The new buffer initially contains
+invalid entries.
+Typically, a buffer is allocated once when tracing starts. It is
+automatically deallocated when the parent detaches from the child.
+(\fIaddr\fP is ignored.)
+.TP
+PTRACE_BTS_GET_BUFFER_SIZE
+Returns the actual BTS buffer size in

RE: [patch][v2] x86, ptrace: support for branch trace store(BTS)

2007-11-22 Thread Metzger, Markus T

>Your patch seems to be still word wrapped.

I hope this is better with the next version I'm going to
send out in a few minutes. Sorry about that.


>The noflags variant should be probably data driven too.

I rewrote the entire code to use an offset/size configuration
instead of declaring structs for the various architecture
variants.

>
>
>>  
>> +case PTRACE_BTS_ALLOCATE_BUFFER:
>> +ret = ptrace_bts_allocate_bts(child, data);
>> +break;
>> +
>> +case PTRACE_BTS_GET_BUFFER_SIZE:
>> +ret = ptrace_bts_get_buffer_size(child);
>> +break;
>> +
>> +case PTRACE_BTS_GET_INDEX:
>> +ret = ptrace_bts_get_index(child);
>> +break;
>> +
>> +case PTRACE_BTS_READ_RECORD:
>> +ret = ptrace_bts_read_record
>> +(child, data,
>> + (struct ptrace_bts_record __user *) addr);
>> +break;
>> +
>> +case PTRACE_BTS_CONFIG:
>> +ptrace_bts_config(child, data);
>> +ret = 0;
>> +break;
>> +
>> +case PTRACE_BTS_STATUS:
>> +ret = ptrace_bts_status(child);
>> +break;
>> +
>
>Regarding your interface (can you please write those manpages to get a
>full rationale)? 

They will be part of the next version. I keep the two patches separate,
I hope that does not confuse any scripts that try to extract patches 
from the email.


>I'm not sure it's a good idea to have a READ_RECORD -- better would 
>be likely a batched interface. Probably it would
>be cleaner to combine get_index and read_record into a higher
>level interface that works like normal read() -- gives you data 
>not read before for multiple records. 

I tried to keep the interface as simple and flexible as possible.
A user would typically want to read the trace from back to front until he
has read enough trace. But I could also think of a more random access
pattern.
If you know you're going to read the entire buffer, reading it from
front to back might be preferable.

The memory interface uses peek and poke to read and write a single 
word, respectively. I think that the READ_RECORD command matches the
PEEK command pretty well.

I would expect a user to provide a stream view to higher levels if it
is beneficial for him. Such a view can be easily implemented with
the current interface.


>Also not sure why you have separate config and allocate_buffer steps.
>They could be easily combined, couldn't they? 

The buffer is typically allocated once per session, whereas a user
may want to turn on and off tracing several times during a session.


>> +/*
>> + * Maximal BTS size in number of records
>> + */
>> +#define PTRACE_BTS_MAX_BTS_SIZE 4000
>
>This should be likely some sort of sysctl. Or perhaps just use 
>user supplied
>buffers limited by the mlock ulimit (that would allow zero copy). Ok
>it means the high level read interface proposed above wouldn't work.
>Perhaps the high level interface is better, although zero copy would
>be more efficient. Not 100% sure what is better.

I would not expect many users to want to change that value.
I would make this go into the /usr/include/sys/ptrace.h header file, so the
ptrace user is aware of the limit.
If it turns out that there is a need to make this more flexible, we can
always do it with a later patch.


>> +rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);
>> +debugctl_msr |= ptrace_bts_cfg.debugctrl_mask;
>> +wrmsrl(MSR_IA32_DEBUGCTLMSR, debugctl_msr);
>
>I still think you should cache the DEBUGCTLMSR. If you worry
>about other people changing it provide a general accessor.

I cached the values. The MSR is read during initialization and the 
enabled and disabled variants are stored in the processor configuration.

grep did not find another use of MSR_IA32_DEBUGCTLMSR anywhere in the
kernel.
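
A minimal sketch of the caching idea, in plain userspace C. The struct and function names are hypothetical and do not match the patch; the sketch only assumes that the MSR value is read once at initialization and that a mask selects the trace-related bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-processor configuration; field names are made up. */
struct toy_bts_cfg {
	uint64_t debugctl_on;    /* value to write when tracing is enabled */
	uint64_t debugctl_off;   /* value to write when tracing is disabled */
};

/*
 * Called once at initialization with the value read from the MSR and the
 * mask of trace-related bits. Later switches write a cached value instead
 * of doing a read-modify-write on the MSR.
 */
static void toy_bts_cfg_init(struct toy_bts_cfg *cfg, uint64_t msr_at_init,
			     uint64_t trace_mask)
{
	cfg->debugctl_on = msr_at_init | trace_mask;
	cfg->debugctl_off = msr_at_init & ~trace_mask;
}

/* Pick the cached value for the requested state. */
static uint64_t toy_bts_debugctl(const struct toy_bts_cfg *cfg, int enable)
{
	return enable ? cfg->debugctl_on : cfg->debugctl_off;
}
```

This only stays correct as long as nobody else modifies the register after initialization, which is why the grep result above matters.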


thanks and regards,
markus.
-
Intel GmbH
Dornacher Strasse 1
85622 Feldkirchen/Muenchen Germany
Sitz der Gesellschaft: Feldkirchen bei Muenchen
Geschaeftsfuehrer: Douglas Lusk, Peter Gleissner, Hannes Schwaderer
Registergericht: Muenchen HRB 47456 Ust.-IdNr.
VAT Registration No.: DE129385895
Citibank Frankfurt (BLZ 502 109 00) 600119052


[PATCH] PPC: CHRP - fix possible NULL pointer dereference

2007-11-22 Thread Cyrill Gorcunov

From: Cyrill Gorcunov <[EMAIL PROTECTED]>

This patch fixes a possible NULL pointer dereference
inside strncmp() if of_get_property() fails.

Signed-off-by: Cyrill Gorcunov <[EMAIL PROTECTED]>
---

 arch/powerpc/platforms/chrp/setup.c |   23 +++++++++++++----------
 1 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/chrp/setup.c b/arch/powerpc/platforms/chrp/setup.c
index 5930626..e3f276d 100644
--- a/arch/powerpc/platforms/chrp/setup.c
+++ b/arch/powerpc/platforms/chrp/setup.c
@@ -115,7 +115,7 @@ void chrp_show_cpuinfo(struct seq_file *m)
seq_printf(m, "machine\t\t: CHRP %s\n", model);
 
/* longtrail (goldengate) stuff */
-   if (!strncmp(model, "IBM,LongTrail", 13)) {
+   if (model && !strncmp(model, "IBM,LongTrail", 13)) {
/* VLSI VAS96011/12 `Golden Gate 2' */
/* Memory banks */
sdramen = (in_le32(gg2_pci_config_base + GG2_PCI_DRAM_CTRL)
@@ -203,16 +203,19 @@ static void __init sio_fixup_irq(const char *name, u8 device, u8 level,
 static void __init sio_init(void)
 {
struct device_node *root;
+   const char *model = NULL;
 
-   if ((root = of_find_node_by_path("/")) &&
-   !strncmp(of_get_property(root, "model", NULL),
-   "IBM,LongTrail", 13)) {
-   /* logical device 0 (KBC/Keyboard) */
-   sio_fixup_irq("keyboard", 0, 1, 2);
-   /* select logical device 1 (KBC/Mouse) */
-   sio_fixup_irq("mouse", 1, 12, 2);
-   }
-   of_node_put(root);
+   root = of_find_node_by_path("/");
+   if (root) {
+   model = of_get_property(root, "model", NULL);
+   if (model && !strncmp(model,"IBM,LongTrail", 13)) {
+   /* logical device 0 (KBC/Keyboard) */
+   sio_fixup_irq("keyboard", 0, 1, 2);
+   /* select logical device 1 (KBC/Mouse) */
+   sio_fixup_irq("mouse", 1, 12, 2);
+   }
+   of_node_put(root);
+   }
 }
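
The guard used in both hunks can be tried out in userspace with a small helper. toy_is_longtrail() is a hypothetical name that does not exist in the kernel; it only mirrors the "model && !strncmp(...)" test from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true only if a model string was found and it
 * starts with "IBM,LongTrail". A NULL model (property missing) is
 * treated as "not a LongTrail" instead of being passed to strncmp().
 */
static bool toy_is_longtrail(const char *model)
{
	return model != NULL && strncmp(model, "IBM,LongTrail", 13) == 0;
}
```

Because && short-circuits, strncmp() is never reached with a NULL first argument.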
 
 

Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Alan Cox

> probably principle of least privilege; the location on physical media
> for a file is clearly something internal to the OS, and non-trusted
> users normally don't have any business knowing that. 

FIBMAP isn't correctly locked against misuse, and opening it up requires
that FIBMAP be safe against truncate and relocation. There was a thread on
l/k about this a month ago or so.

It's also the wrong API (32bit, no notion of extents, compression etc)

Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Alan Cox

On Thu, 22 Nov 2007 19:17:14 +0100
Jan Kara <[EMAIL PROTECTED]> wrote:

>   Hi,
> 
>   I guess subject says it all - why is FIBMAP ioctl restricted only to
> root (CAP_SYS_RAWIO)? Corresponding ioctl for XFS is allowed without any
> special capabilities so we are inconsistent here too...
>   Would anyone mind if the check is removed?

It would be great if it was but it involves a *lot* of work.

Alan

Re: System reboot triggered by just reading a device file....!?

2007-11-22 Thread devzero

since i have gotten more or less similar answers from here, i have talked to
some more people privately.

the result is interesting:
if the person i talked to was a sysadmin or similar (i.e. someone
running servers), his opinion was very similar to mine.
if the person was a developer, he didn't really understand why i have a problem
with that. "be careful if you are root" was what i got.

one of the admins made a good statement, which i liked very much and want to
share:

"even if you are root: it's unix philosophy that reading is harmless!"

i never thought about that, but i think that's exactly the point, and that's why
i'm feeling uncomfortable with this.

anyway - it cost me some time to find a bug which was none, and the only
mistake i made was using a tool which i was sure did nothing more than
reading. so why should i care that i was root?

need to change my own philosophy now, because i learned that reading isn't
harmless.   ;)

regards
roland

> -Original Message-
> From: "Simon Arlott" <[EMAIL PROTECTED]>
> Sent: 21.11.07 13:30:05
> To: [EMAIL PROTECTED]
> CC: "Robert Hancock" <[EMAIL PROTECTED]>, linux-kernel@vger.kernel.org
> Subject: Re: System reboot triggered by just reading a device file!?

> 
> On Wed, November 21, 2007 00:01, [EMAIL PROTECTED] wrote:
> >>There is.. it's called "root privileges".
> > yes, true.
> >
> > but - regardless of being a windows app or not - what if you want to take a 
> > look on your system as a whole,
> > especially when using some tool which graphically shows how and where your 
> > diskspace is being used?  if i
> > let this run from ordinary useraccount it would get lot`s of "permission 
> > denied"  and then it`s only telling
> > half of the truth.
> 
> Such a tool shouldn't need to open any files, whether they're device files or 
> not. Do you expect it to open
> /dev/zero etc. too and read from an infinitely sized "file"?
> 
> >> > i`d wish there would be some fence around this or iTCO_wdt /dev/watchdog 
> >> > not being active after a
> >> default desktop installation.
> 
> Delete it?
> 
> -- 
> Simon Arlott
> 


2.6.23 WARNING: at kernel/softirq.c:139 local_bh_enable()

2007-11-22 Thread Simon Arlott

WARN during log message being output to ttyS0 and netconsole:

[2059664.615816] __iptables__: init4 IN=ppp0 OUT=ppp0 WARNING: at kernel/softirq.c:139 local_bh_enable()
[2059664.620535]  [<80120364>] local_bh_enable+0x3c/0x97
[2059664.620553]  [<802e3356>] __nf_ct_ext_destroy+0x35/0x5b
[2059664.620569]  [<802dfbeb>] destroy_conntrack+0x5e/0xf6
[2059664.620577]  [<802db821>] nf_conntrack_destroy+0x1f/0x3f
[2059664.620585]  [<802c0c71>] __kfree_skb+0xb8/0xf6
[2059664.620597]  [<802d00f0>] zap_completion_queue+0x3e/0x64
[2059664.620606]  [<802d0583>] find_skb+0x14/0x6b
[2059664.620612]  [<801167cc>] inc_nr_running+0x12/0x25
[2059664.620622]  [<802d0fd8>] netpoll_send_udp+0x2d/0x251
[2059664.620630]  [<80226135>] uart_console_write+0x2a/0x33
[2059664.620645]  [<80241319>] write_msg+0x32/0x41
[2059664.620657]  [<8011c205>] __call_console_drivers+0x61/0x6d
[2059664.620669]  [<8011c3fc>] release_console_sem+0x164/0x1bf
[2059664.620679]  [<8011c81f>] vprintk+0x27a/0x2ff
[2059664.620692]  [<8013881e>] handle_IRQ_event+0x1a/0x3f
[2059664.620702]  [<801203a5>] local_bh_enable+0x7d/0x97
[2059664.620709]  [<8031ae6a>] fn_hash_lookup+0xb0/0xcd
[2059664.620723]  [<8011c8bf>] printk+0x1b/0x1f
[2059664.620731]  [<8032b372>] ipt_log_packet+0x71/0x15e
[2059664.620747]  [<8032b4a0>] ipt_log_target+0x41/0x4a
[2059664.620757]  [<802ea446>] ipt_limit_match+0x58/0x76
[2059664.620766]  [<8032b45f>] ipt_log_target+0x0/0x4a
[2059664.620774]  [<803288c5>] ipt_do_table+0x3e2/0x479
[2059664.620785]  [<80331228>] xfrm_policy_put_afinfo+0xa/0x1e
[2059664.620800]  [<80332219>] xfrm_lookup+0x15/0x69
[2059664.620809]  [<802db7d0>] nf_iterate+0x38/0x6a
[2059664.620822]  [<802dbb60>] nf_hook_slow+0x57/0xe0
[2059664.620830]  [<802f1e7c>] ip_forward_finish+0x0/0x22
[2059664.620841]  [<802f20d1>] ip_forward+0x233/0x2ae
[2059664.620849]  [<802f1e7c>] ip_forward_finish+0x0/0x22
[2059664.620859]  [<802f0e7a>] ip_rcv+0x46f/0x49c
[2059664.620866]  [<802f04a8>] ip_rcv_finish+0x0/0x2ab
[2059664.620876]  [<802f0a0b>] ip_rcv+0x0/0x49c
[2059664.620884]  [<802c544a>] netif_receive_skb+0x326/0x3ae
[2059664.620894]  [<802c6efe>] process_backlog+0x6d/0xd2
[2059664.620903]  [<802c766e>] net_rx_action+0x86/0x193
[2059664.620911]  [<80120001>] __do_softirq+0x40/0x85
[2059664.620919]  [<8012006d>] do_softirq+0x27/0x2b
[2059664.620925]  [<8012023d>] irq_exit+0x2d/0x37
[2059664.620931]  [<801062d9>] do_IRQ+0x7a/0x8d
[2059664.620943]  [<8010479b>] common_interrupt+0x23/0x28
[2059664.620954]  [<80209f85>] acpi_processor_idle+0x235/0x3a0
[2059664.620967]  [<80102344>] cpu_idle+0x43/0x6c
[2059664.620973]  [<804aa99e>] start_kernel+0x210/0x215
[2059664.620989]  [<804aa317>] unknown_bootoption+0x0/0x195
[2059664.620998]  ===
[2059664.852188] SRC=66.249.93.147 DST=* LEN=40 TOS=0x00 PREC=0x00 TTL=53 ID=23868 DF PROTO=TCP SPT=80 DPT=55219 WINDOW=0 RES=0x00 RST URGP=0


[0.00] Linux version 2.6.23-git ([EMAIL PROTECTED]) (gcc version 4.1.2 20070214 ( (gdc 0.23, using dmd 1.007)) (Gentoo 4.1.2)) #41 PREEMPT Sun Oct 21 20:42:28 BST 2007
[0.00] BIOS-provided physical RAM map:
[0.00]  BIOS-e820:  - 0009fc00 (usable)
[0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
[0.00]  BIOS-e820: 000f - 0010 (reserved)
[0.00]  BIOS-e820: 0010 - 3fee (usable)
[0.00]  BIOS-e820: 3fee - 3fee3000 (ACPI NVS)
[0.00]  BIOS-e820: 3fee3000 - 3fef (ACPI data)
[0.00]  BIOS-e820: 3fef - 3ff0 (reserved)
[0.00]  BIOS-e820: fec0 - fec01000 (reserved)
[0.00]  BIOS-e820: fee0 - fee01000 (reserved)
[0.00]  BIOS-e820:  - 0001 (reserved)
[0.00] 0MB HIGHMEM available.
[0.00] 1022MB LOWMEM available.
[0.00] found SMP MP-table at 000f4d50
[0.00] NX (Execute Disable) protection: active
[0.00] Entering add_active_range(0, 0, 261856) 0 entries of 256 used
[0.00] Zone PFN ranges:
[0.00]   DMA 0 -> 4096
[0.00]   Normal   4096 ->   261856
[0.00]   HighMem261856 ->   261856
[0.00] Movable zone start PFN for each node
[0.00] early_node_map[1] active PFN ranges
[0.00] 0:0 ->   261856
[0.00] On node 0 totalpages: 261856
[0.00]   DMA zone: 32 pages used for memmap
[0.00]   DMA zone: 0 pages reserved
[0.00]   DMA zone: 4064 pages, LIFO batch:0
[0.00]   Normal zone: 2013 pages used for memmap
[0.00]   Normal zone: 255747 pages, LIFO batch:31
[0.00]   HighMem zone: 0 pages used for memmap
[0.00]   Movable zone: 0 pages used for memmap
[0.00] DMI 2.3 present.
[0.00] ACPI: RSDP 000F6E10, 0014 (r0 CN700 )
[0.00] ACPI: RSDT 3FEE3040, 002C (r1 CN700  AWRDACPI 42302E31 AWRD  0)

Re: Where is the new timerfd?

2007-11-22 Thread Davide Libenzi

On Thu, 22 Nov 2007, Michael Kerrisk wrote:

> On Nov 22, 2007 6:34 PM, Davide Libenzi <[EMAIL PROTECTED]> wrote:
> > On Thu, 22 Nov 2007, Michael Kerrisk wrote:
> >
> > > Hey Davide,
> > >
> > > Where is the new timerfd API.  In 2.6.24-rc3, I see the *old* API...
> >
> > Maybe Andrew stuffed the turkey with it? :) It was there. I remember it was
> > merged. Some screw up reverted it?
> 
> It looks that way.

I'm looking at the log now. It never went in actually. Andrew-san, what 
happened?



- Davide



[PATCH] mmc: Add missing sg_init_table() call

2007-11-22 Thread Haavard Skinnemoen

mmc_init_queue only initializes the scatterlists with sg_init_table()
when using a bounce buffer. This leads to a BUG() when CONFIG_DEBUG_SG
is set.

Signed-off-by: Haavard Skinnemoen <[EMAIL PROTECTED]>
---
 drivers/mmc/card/queue.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/mmc/card/queue.c b/drivers/mmc/card/queue.c
index 1b9c9b6..30cd13b 100644
--- a/drivers/mmc/card/queue.c
+++ b/drivers/mmc/card/queue.c
@@ -180,12 +180,13 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card, spinlock_t *lock
 		blk_queue_max_hw_segments(mq->queue, host->max_hw_segs);
 		blk_queue_max_segment_size(mq->queue, host->max_seg_size);
 
-		mq->sg = kzalloc(sizeof(struct scatterlist) *
+		mq->sg = kmalloc(sizeof(struct scatterlist) *
 			host->max_phys_segs, GFP_KERNEL);
 		if (!mq->sg) {
 			ret = -ENOMEM;
 			goto cleanup_queue;
 		}
+		sg_init_table(mq->sg, host->max_phys_segs);
 	}
 
 	init_MUTEX(&mq->thread_sem);
-- 
1.5.3.4
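
As an illustration of why the patch above matters, here is a minimal userspace sketch of the idea, not the kernel code itself. It assumes that the debug build checks a magic value in every scatterlist entry and that sg_init_table() zeroes the table, stamps that magic, and marks the last entry; if so, a table that is only zero-filled fails the check, and the kzalloc() zeroing becomes redundant once sg_init_table() is called. All model_* names and the struct layout are made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_SG_MAGIC 0x87654321UL  /* illustrative debug marker */
#define MODEL_SG_END   0x02UL        /* illustrative "last entry" flag */

/* Simplified stand-in for struct scatterlist (hypothetical layout). */
struct model_sg {
	unsigned long magic;      /* checked by debug code before use */
	unsigned long page_link;  /* low bits carry flags such as END */
	unsigned int length;
};

/* Model of sg_init_table(): zero the table, stamp every entry with the
 * magic value, and flag the final entry as the end of the list. */
void model_sg_init_table(struct model_sg *sgl, unsigned int nents)
{
	unsigned int i;

	if (nents == 0)
		return;
	memset(sgl, 0, sizeof(*sgl) * nents);
	for (i = 0; i < nents; i++)
		sgl[i].magic = MODEL_SG_MAGIC;
	sgl[nents - 1].page_link |= MODEL_SG_END;
}

/* Model of the debug check: returns 1 if the entry was initialized. */
int model_sg_is_valid(const struct model_sg *sg)
{
	return sg->magic == MODEL_SG_MAGIC;
}

/* Returns 1 if the entry carries the end-of-table flag. */
int model_sg_is_last(const struct model_sg *sg)
{
	return (sg->page_link & MODEL_SG_END) != 0;
}
```

In this model a calloc()'d table fails model_sg_is_valid() until model_sg_init_table() has run, which mirrors the BUG() described in the changelog.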


Re: MMC/SDIO sub-system: block mode versus byte mode

2007-11-22 Thread Pierre Ossman

On Thu, 22 Nov 2007 12:15:11 +
Dean Jenkins <[EMAIL PROTECTED]> wrote:

> Hi Pierre,
> 
> Thanks for information.
> 
> My card does support 256 byte transfers in byte mode but I was expecting
> block mode to be used as block mode is supported by the card. Indeed,
> sdio_io_rw_ext_helper() checks for support of block mode before using
> byte mode. eg. block mode is preferred over byte mode in the design of
> sdio_io_rw_ext_helper().

Not really, no. What is preferred is a low number of transfers. Hence we try to 
shuffle as much data as possible using blocks first.
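
The "fewest transfers" idea can be sketched in userspace C. This is a simplified model under stated assumptions, not the logic of sdio_io_rw_ext_helper(): plan_transfer(), struct xfer_plan, and the parameters (block size, maximum blocks per command, maximum bytes per byte-mode command) are hypothetical names chosen for the example. It moves as many whole blocks as possible per block-mode command, then finishes the remainder in byte mode.

```c
#include <assert.h>
#include <stddef.h>

/* Number of commands needed in each mode (hypothetical model). */
struct xfer_plan {
	size_t block_cmds;  /* block-mode commands issued */
	size_t byte_cmds;   /* byte-mode commands issued */
};

/* Plan a transfer of `size` bytes. Passing blksz == 0 or max_blocks == 0
 * models a card without block mode. */
struct xfer_plan plan_transfer(size_t size, size_t blksz,
			       size_t max_blocks, size_t max_byte)
{
	struct xfer_plan plan = { 0, 0 };
	size_t remainder = size;

	/* Move whole blocks first, as many per command as allowed. */
	if (blksz > 0 && max_blocks > 0) {
		while (remainder >= blksz) {
			size_t blocks = remainder / blksz;

			if (blocks > max_blocks)
				blocks = max_blocks;
			remainder -= blocks * blksz;
			plan.block_cmds++;
		}
	}

	/* Finish whatever is left with byte-mode transfers. */
	while (remainder > 0 && max_byte > 0) {
		size_t n = remainder < max_byte ? remainder : max_byte;

		remainder -= n;
		plan.byte_cmds++;
	}
	return plan;
}
```

With a 256-byte block size and a 512-byte byte-mode limit, a 4096-byte request takes one block-mode command in this model, against eight commands if only byte mode is available.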

> 
> Yes, I do think single block transfers is broken :(
> 
> Both sdio_io_rw_ext_helper() and mmc_io_rw_extended() are broken for
> single block transfers.
> 
> Will you be doing a fix or how does a fix get proposed ?
> 

I agree that mmc_io_rw_extended() is broken, but I do not think 
sdio_io_rw_ext_helper() is. I'll be doing a fix eventually, but it probably 
won't get that high on my todo list as it will only be useful together with 
some new API. Since your card was content with byte transfers, there shouldn't 
be much need for a rush.

Rgds
-- 
 -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org

Re: [PATCH 18/59] drivers/ieee1394: Add missing "space"

2007-11-22 Thread Stefan Richter

Joe Perches wrote:
>  drivers/ieee1394/raw1394.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
...
> - DBGMSG("arm_read  called by node: %X"
> + DBGMSG("arm_read  called by node: %X "
...

Committed to linux1394-2.6.git.  Thanks,
-- 
Stefan Richter
-=-=-=== =-== =-==-
http://arcgraph.de/sr/

Re: [build bug] ./net/rxrpc/ar-key.c fails to build

2007-11-22 Thread David Howells

Ingo Molnar <[EMAIL PROTECTED]> wrote:

> on the latest kernel (2.6.24-rc3-git1) the attached config triggers the 
> following build error:
> 
> net/built-in.o: In function `rxrpc_destroy_s':
> ar-key.c:(.text+0x9c50d): undefined reference to `crypto_free_tfm'
> net/built-in.o: In function `rxrpc_instantiate_s':
> ar-key.c:(.text+0x9c613): undefined reference to `crypto_alloc_base'

I see it.  The simplest answer is just to make AF_RXRPC select CRYPTO.
However, that's probably not the right solution in the long run (the common
secret key management code assumes stuff about the key payload and the crypto
algorithms used that it shouldn't).

David

Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Olivier Galibert

Original thread btw:
  http://www.ussg.indiana.edu/hypermail/linux/kernel/9907.0/0132.html

  OG.

Re: nohz and strange sleep latencies

2007-11-22 Thread Pavel Machek

Hi!

> > > to me this has the feeling of lapic breakage in C2 mode. Does it get any 
> > > better if you boot with 'nolapic'? (but that might in turn turn off 
> > > high-res timers and nohz in essence) Thomas, any ideas?
> > 
> > Hmm, lapic is considered unstable in c2 by default. You have to tell 
> > the kernel that you trust it in C2 on the command line.
> 
> yeah, i was wondering about that too. ACPI enumerated them properly at a 
> certain stage:
> 
>  ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
>  ACPI: CPU1 (power states: C1[C1] C2[C2] C3[C3])
> 
> but perhaps somehow we miss this fact and fail to turn off the lapic 
> clockevents drivers?

Ok, I guess I'm lost. If I offline second CPU, I immediately get
1000Hz timer tick... is that expected?

I'm trying to decide when the system is idle (let's say that means "no user
task is scheduled to wake up within 10 seconds")... I added some
instrumentation to the nohz subsystem, but it does not behave like I'd
expect: even if I run a "while true; do sleep .01; done" loop, I see
nohz preparing for a 5 second sleep... while it seems obvious that it
can only be a 10 msec sleep, and with max_cstate=1, it works that
way... Plus, nte->start_pid seems to contain some random numbers :-(.

What am I doing wrong?

(Patch for illustration, I can generate full diff against vanilla,
but...)
Pavel

+++ b/kernel/time/tick-sched.c
@@ -229,11 +232,13 @@ void tick_nohz_stop_sched_tick(void)
 		if (delta_jiffies > 1)
 			cpu_set(cpu, nohz_cpu_mask);
 
+		{
+			int user_wait = get_next_timer_interrupt(last_jiffies, 1) - last_jiffies;
+
+			if ((user_wait > HZ/10) && (num_online_cpus() == 1))
+				printk("NOHZ: user ready for %d:%d sec wait (kernel %d:%d sec wait), naughty %d\n", user_wait/HZ, user_wait%HZ, delta_jiffies/HZ, delta_jiffies%HZ, naughty_pid);
+		}
+
 		/*
 		 * nohz_stop_sched_tick can be called several times before
 		 * the nohz_restart_sched_tick is called. This happens when
+++ b/kernel/timer.c
@@ -691,6 +693,12 @@ static unsigned long __next_timer_interr
 			if (tbase_get_deferrable(nte->base))
 				continue;
 
+			if (flags && (nte->start_pid < 1))
+				continue;
+
+			if (flags)
+				naughty_pid = nte->start_pid;
+
 			found = 1;
 			expires = nte->expires;
 			/* Look at the cascade bucket(s)? */



-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Olivier Galibert

On Thu, Nov 22, 2007 at 07:17:14PM +0100, Jan Kara wrote:
>   Hi,
> 
>   I guess subject says it all - why is FIBMAP ioctl restricted only to
> root (CAP_SYS_RAWIO)? Corresponding ioctl for XFS is allowed without any
> special capabilities so we are inconsistent here too...
>   Would anyone mind if the check is removed?

Once upon a time some filesystems fucked up when given incorrect values
(negative offsets in particular).  So the easy way out was taken and
FIBMAP was restricted, to the eternal annoyance of DVD players which
needed the sector number for CSS reasons.  Since then DVD players have
included a UDF parser and life went on.

Well, psx movie players needed it too, but bah.

Essentially if you remove the restriction you have to audit all
filesystems to be sure that they're not going to be problematic.

  OG.
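
For readers unfamiliar with the ioctl under discussion, a minimal userspace sketch of a FIBMAP call follows. The wrapper name fibmap_block() is made up for illustration; FIBMAP itself comes from <linux/fs.h> and takes a pointer to an int that holds the logical block index on entry and the mapped block number on return. Without CAP_SYS_RAWIO the call is expected to fail with EPERM, which is the restriction being questioned in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FIBMAP */

/* Map a logical block index within the open file `fd` to the block number
 * reported by the filesystem. Returns 0 and stores the result in *physical
 * on success, or a negative errno value on failure. */
int fibmap_block(int fd, int logical, int *physical)
{
	int block = logical;  /* in: logical block, out: mapped block */

	if (ioctl(fd, FIBMAP, &block) < 0)
		return -errno;

	*physical = block;
	return 0;
}
```

A caller would open the file, call fibmap_block(fd, 0, &blk) for the first block, and treat -EPERM as "not privileged"; an invalid descriptor yields -EBADF.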


Re: nohz and strange sleep latencies

2007-11-22 Thread Pavel Machek

Hi!

> > > Clock Event Device: hpet
> > >  set_next_event: hpet_legacy_next_event
> > >  set_mode:   hpet_legacy_set_mode
> > >  event_handler:  tick_handle_oneshot_broadcast
> > > 
> > > Clock Event Device: lapic
> > >  set_next_event: lapic_next_event
> > >  set_mode:   lapic_timer_setup
> > >  event_handler:  hrtimer_interrupt
> > > 
> > > Clock Event Device: lapic
> > >  set_next_event: lapic_next_event
> > >  set_mode:   lapic_timer_setup
> > >  event_handler:  hrtimer_interrupt
> > > 
> > > to me this has the feeling of lapic breakage in C2 mode. Does it get any 
> > > better if you boot with 'nolapic'? (but that might in turn turn off 
> > > high-res timers and nohz in essence) Thomas, any ideas?
> > 
> > Hmm, lapic is considered unstable in c2 by default. You have to tell 
> > the kernel that you trust it in C2 on the command line.
> 
> yeah, i was wondering about that too. ACPI enumerated them properly at a 
> certain stage:
> 
>  ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
>  ACPI: CPU1 (power states: C1[C1] C2[C2] C3[C3])
> 
> but perhaps somehow we miss this fact and fail to turn off the lapic 
> clockevents drivers?

I can confirm that setting max_cstate=1 helps. (Plus, problem gets
_way_ more visible with stripped down config. With usb and irda off,
time sleep .01 easily takes 2 seconds).
Pavel
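
The "time sleep .01" measurement can also be taken from C, which avoids process start-up noise. This is a sketch, assuming a system with clock_gettime() and CLOCK_MONOTONIC; measure_sleep_ns() is a name invented for this example. It requests a sleep and returns how long the sleep actually took, so a 10 msec request that comes back after seconds shows the latency described above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Sleep for req_ns nanoseconds and return the elapsed time in nanoseconds
 * as seen by CLOCK_MONOTONIC, or -1 on error. */
long long measure_sleep_ns(long req_ns)
{
	struct timespec req = { req_ns / 1000000000L, req_ns % 1000000000L };
	struct timespec start, end;

	if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1;

	/* Restart the sleep with the remaining time if a signal interrupts it. */
	while (nanosleep(&req, &req) != 0) {
		if (errno != EINTR)
			return -1;
	}

	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1;

	return (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
		+ (end.tv_nsec - start.tv_nsec);
}
```

Calling measure_sleep_ns(10000000L) in a loop and printing the largest value gives a rough picture of the worst-case wakeup delay for a 10 msec sleep.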
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Where is the new timerfd?

2007-11-22 Thread Michael Kerrisk

On Nov 22, 2007 6:34 PM, Davide Libenzi <[EMAIL PROTECTED]> wrote:
> On Thu, 22 Nov 2007, Michael Kerrisk wrote:
>
> > Hey Davide,
> >
> > Where is the new timerfd API.  In 2.6.24-rc3, I see the *old* API...
>
> Maybe Andrew stuffed the turkey with it? :) It was there. I remember it was
> merged. Some screw up reverted it?

It looks that way.

Re: [linux-usb-devel] 2.6.24-rc3-mm1: usb mouse doesn't work

2007-11-22 Thread Marin Mitov

On Thursday 22 November 2007 07:07:00 pm you wrote:
> On Thu, 22 Nov 2007, Kirill A. Shutemov wrote:
> > > > uhci_hcd :00:1d.3: UHCI Host Controller
> > > > uhci_hcd :00:1d.3: new USB bus registered, assigned bus number 4
> > > > uhci_hcd :00:1d.3: irq 7, io base 0xbf20
> > > > usb usb4: configuration #1 chosen from 1 choice
> > > > hub 4-0:1.0: USB hub found
> > > > hub 4-0:1.0: 2 ports detected
> > > > usb usb4: new device found, idVendor=, idProduct=
> > > > usb usb4: new device strings: Mfr=3, Product=2, SerialNumber=1
> > > > usb usb4: Product: UHCI Host Controller
> > > > usb usb4: Manufacturer: Linux 2.6.24-kas-alt1 uhci_hcd
> > > > usb usb4: SerialNumber: :00:1d.3
> > > > uhci_hcd :00:1d.3: FGR not stopped yet!
> > >
> > > I've had some strangenesses with USB lately.  Sometimes running `lsusb'
> > > makes the USB system notice a newly attached device.
> >
> > No. But I have new messages in dmesg:
> >
> > uhci_hcd :00:1d.3: FGR not stopped yet!
> > uhci_hcd :00:1d.2: FGR not stopped yet!
> > uhci_hcd :00:1d.1: FGR not stopped yet!
> > uhci_hcd :00:1d.0: FGR not stopped yet!
> >
> > > Is that "FGR not stopped yet!" message new behaviour?
> >
> > It is a new message since 2.6.24-rc3. I have never tried the -mm tree before.
>
> These messages could indicate a timing problem.  You can see the code
> that writes the messages near the end of wakeup_rh() in
> drivers/usb/host/uhci-hcd.c.
>
> The message gets written if the controller hardware hasn't turned off a
> particular bit after a 4-us delay.  If the udelay() function wasn't
> working right, it could cause this problem.

udelay() _is_ OK for 2.6.24-rc3, so it is not the cause of the problem.

Marin Mitov
>
> Alan Stern
>



Re: Why is FIBMAP ioctl root only?

2007-11-22 Thread Arjan van de Ven

On Thu, 22 Nov 2007 19:17:14 +0100
Jan Kara <[EMAIL PROTECTED]> wrote:

>   Hi,
> 
>   I guess subject says it all - why is FIBMAP ioctl restricted only to
> root (CAP_SYS_RAWIO)? Corresponding ioctl for XFS is allowed without
> any special capabilities so we are inconsistent here too...
>   Would anyone mind if the check is removed?

probably principle of least privilege; the location on physical media
for a file is clearly something internal to the OS, and non-trusted
users normally don't have any business knowing that. 

I can't think of any immediate exploitable thing with it, but I'm sure
attackers would find a way to use it to increase their privilege once
they can do something like "write 512 bytes to a disk address of my
choice".. (but then again it's game over mostly already)

> 
>   Honza

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

Re: [PATCH RFC] [1/9] Core module symbol namespaces code and intro.

2007-11-22 Thread Andi Kleen


> Andy, I like your idea.  IMHO, as Rusty said a simple EXPORT_SYMBOL_TO
> is better.

I don't think so. e.g. tcpcong would be very very messy this way.

> And I wonder if it is possible to export to something like the struct
> device_driver? If it's possible then it will not be limited to modules.

Not sure I follow you. Can you expand? 

-Andi
