Re: [PATCH] Shrink ext3_inode_info by 8 bytes for !POSIX_ACL.

2008-01-18 Thread Indan Zupancic
On Fri, January 18, 2008 20:16, Mingming Cao wrote:
> On Sat, 2008-01-12 at 21:35 +0100, Indan Zupancic wrote:
>> i_file_acl and i_dir_acl aren't always needed.
>>
>> With certain configs this makes 10 ext3_inode_cache objects fit in
>> one slab instead of the current 9, as the size shrinks from 416 to
>> 408 bytes for 32 bit, !POSIX_ACL and !EXT3_FS_XATTR configs.
>>
>> Signed-off-by: Indan Zupancic <[EMAIL PROTECTED]>
>> ---
>>  fs/ext3/ialloc.c  |2 ++
>>  fs/ext3/inode.c   |   29 +++--
>>  include/linux/ext3_fs_i.h |2 ++
>>  3 files changed, 23 insertions(+), 10 deletions(-)
>>
>> diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
>> index 1bc8cd8..01745bc 100644
>> --- a/fs/ext3/ialloc.c
>> +++ b/fs/ext3/ialloc.c
>> @@ -574,8 +574,10 @@ got:
>>  ei->i_frag_no = 0;
>>  ei->i_frag_size = 0;
>>  #endif
>> +#ifdef CONFIG_EXT3_FS_POSIX_ACL
>>  ei->i_file_acl = 0;
>>  ei->i_dir_acl = 0;
>> +#endif
>
> For regular file, i_dir_acl is being reused as i_size_high to support
> large file.

Only the i_dir_acl of struct ext3_inode, not the one from ext3_inode_info.

Thanks,

Indan


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] Shrink ext3_inode_info by 8 bytes for !POSIX_ACL.

2008-01-18 Thread Indan Zupancic
On Fri, January 18, 2008 20:16, Mingming Cao wrote:
 On Sat, 2008-01-12 at 21:35 +0100, Indan Zupancic wrote:
 i_file_acl and i_dir_acl aren't always needed.

 With certain configs this makes 10 ext3_inode_cache objects fit in
 one slab instead of the current 9, as the size shrinks from 416 to
 408 bytes for 32 bit, !POSIX_ACL and !EXT3_FS_XATTR configs.

 Signed-off-by: Indan Zupancic [EMAIL PROTECTED]
 ---
  fs/ext3/ialloc.c  |2 ++
  fs/ext3/inode.c   |   29 +++--
  include/linux/ext3_fs_i.h |2 ++
  3 files changed, 23 insertions(+), 10 deletions(-)

 diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
 index 1bc8cd8..01745bc 100644
 --- a/fs/ext3/ialloc.c
 +++ b/fs/ext3/ialloc.c
 @@ -574,8 +574,10 @@ got:
  ei-i_frag_no = 0;
  ei-i_frag_size = 0;
  #endif
 +#ifdef CONFIG_EXT3_FS_POSIX_ACL
  ei-i_file_acl = 0;
  ei-i_dir_acl = 0;
 +#endif

 For regular file, i_dir_acl is being reused as i_size_high to support
 large file.

Only the i_dir_acl of struct ext3_inode, not the one from ext3_inode_info.

Thanks,

Indan


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Shrink ext3_inode_info by 8 bytes for !POSIX_ACL.

2008-01-12 Thread Indan Zupancic
i_file_acl and i_dir_acl aren't always needed.

With certain configs this makes 10 ext3_inode_cache objects fit in
one slab instead of the current 9, as the size shrinks from 416 to
408 bytes for 32 bit, !POSIX_ACL and !EXT3_FS_XATTR configs.

Signed-off-by: Indan Zupancic <[EMAIL PROTECTED]>
---
 fs/ext3/ialloc.c  |2 ++
 fs/ext3/inode.c   |   29 +++--
 include/linux/ext3_fs_i.h |2 ++
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
index 1bc8cd8..01745bc 100644
--- a/fs/ext3/ialloc.c
+++ b/fs/ext3/ialloc.c
@@ -574,8 +574,10 @@ got:
ei->i_frag_no = 0;
ei->i_frag_size = 0;
 #endif
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
ei->i_file_acl = 0;
ei->i_dir_acl = 0;
+#endif
ei->i_dtime = 0;
ei->i_block_alloc_info = NULL;
ei->i_block_group = group;
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 9b162cd..20a8aeb 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -46,9 +46,12 @@ static int ext3_writepage_trans_blocks(struct inode *inode);
  */
 static int ext3_inode_is_fast_symlink(struct inode *inode)
 {
-   int ea_blocks = EXT3_I(inode)->i_file_acl ?
-   (inode->i_sb->s_blocksize >> 9) : 0;
+   int ea_blocks = 0;
 
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   if (EXT3_I(inode)->i_file_acl)
+   ea_blocks = inode->i_sb->s_blocksize >> 9;
+#endif
return (S_ISLNK(inode->i_mode) && inode->i_blocks - ea_blocks == 0);
 }
 
@@ -2717,13 +2720,16 @@ void ext3_read_inode(struct inode * inode)
ei->i_frag_no = raw_inode->i_frag;
ei->i_frag_size = raw_inode->i_fsize;
 #endif
-   ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl);
-   if (!S_ISREG(inode->i_mode)) {
-   ei->i_dir_acl = le32_to_cpu(raw_inode->i_dir_acl);
-   } else {
+   if (S_ISREG(inode->i_mode)) {
inode->i_size |=
((__u64)le32_to_cpu(raw_inode->i_size_high)) << 32;
}
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   else {
+   ei->i_dir_acl = le32_to_cpu(raw_inode->i_dir_acl);
+   }
+   ei->i_file_acl = le32_to_cpu(raw_inode->i_file_acl);
+#endif
ei->i_disksize = inode->i_size;
inode->i_generation = le32_to_cpu(raw_inode->i_generation);
ei->i_block_group = iloc.block_group;
@@ -2854,10 +2860,7 @@ static int ext3_do_update_inode(handle_t *handle,
raw_inode->i_frag = ei->i_frag_no;
raw_inode->i_fsize = ei->i_frag_size;
 #endif
-   raw_inode->i_file_acl = cpu_to_le32(ei->i_file_acl);
-   if (!S_ISREG(inode->i_mode)) {
-   raw_inode->i_dir_acl = cpu_to_le32(ei->i_dir_acl);
-   } else {
+   if (S_ISREG(inode->i_mode)) {
raw_inode->i_size_high =
cpu_to_le32(ei->i_disksize >> 32);
if (ei->i_disksize > 0x7fffULL) {
@@ -2883,6 +2886,12 @@ static int ext3_do_update_inode(handle_t *handle,
}
}
}
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   else {
+   raw_inode->i_dir_acl = cpu_to_le32(ei->i_dir_acl);
+   }
+   raw_inode->i_file_acl = cpu_to_le32(ei->i_file_acl);
+#endif
raw_inode->i_generation = cpu_to_le32(inode->i_generation);
if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode)) {
if (old_valid_dev(inode->i_rdev)) {
diff --git a/include/linux/ext3_fs_i.h b/include/linux/ext3_fs_i.h
index 7894dd0..9e7f1b6 100644
--- a/include/linux/ext3_fs_i.h
+++ b/include/linux/ext3_fs_i.h
@@ -75,8 +75,10 @@ struct ext3_inode_info {
__u8i_frag_no;
__u8i_frag_size;
 #endif
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
ext3_fsblk_ti_file_acl;
__u32   i_dir_acl;
+#endif
__u32   i_dtime;
 
/*
-- 
1.5.3.7

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Shrink ext3_inode_info by 8 bytes for !POSIX_ACL.

2008-01-12 Thread Indan Zupancic
i_file_acl and i_dir_acl aren't always needed.

With certain configs this makes 10 ext3_inode_cache objects fit in
one slab instead of the current 9, as the size shrinks from 416 to
408 bytes for 32 bit, !POSIX_ACL and !EXT3_FS_XATTR configs.

Signed-off-by: Indan Zupancic [EMAIL PROTECTED]
---
 fs/ext3/ialloc.c  |2 ++
 fs/ext3/inode.c   |   29 +++--
 include/linux/ext3_fs_i.h |2 ++
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
index 1bc8cd8..01745bc 100644
--- a/fs/ext3/ialloc.c
+++ b/fs/ext3/ialloc.c
@@ -574,8 +574,10 @@ got:
ei-i_frag_no = 0;
ei-i_frag_size = 0;
 #endif
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
ei-i_file_acl = 0;
ei-i_dir_acl = 0;
+#endif
ei-i_dtime = 0;
ei-i_block_alloc_info = NULL;
ei-i_block_group = group;
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 9b162cd..20a8aeb 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -46,9 +46,12 @@ static int ext3_writepage_trans_blocks(struct inode *inode);
  */
 static int ext3_inode_is_fast_symlink(struct inode *inode)
 {
-   int ea_blocks = EXT3_I(inode)-i_file_acl ?
-   (inode-i_sb-s_blocksize  9) : 0;
+   int ea_blocks = 0;
 
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   if (EXT3_I(inode)-i_file_acl)
+   ea_blocks = inode-i_sb-s_blocksize  9;
+#endif
return (S_ISLNK(inode-i_mode)  inode-i_blocks - ea_blocks == 0);
 }
 
@@ -2717,13 +2720,16 @@ void ext3_read_inode(struct inode * inode)
ei-i_frag_no = raw_inode-i_frag;
ei-i_frag_size = raw_inode-i_fsize;
 #endif
-   ei-i_file_acl = le32_to_cpu(raw_inode-i_file_acl);
-   if (!S_ISREG(inode-i_mode)) {
-   ei-i_dir_acl = le32_to_cpu(raw_inode-i_dir_acl);
-   } else {
+   if (S_ISREG(inode-i_mode)) {
inode-i_size |=
((__u64)le32_to_cpu(raw_inode-i_size_high))  32;
}
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   else {
+   ei-i_dir_acl = le32_to_cpu(raw_inode-i_dir_acl);
+   }
+   ei-i_file_acl = le32_to_cpu(raw_inode-i_file_acl);
+#endif
ei-i_disksize = inode-i_size;
inode-i_generation = le32_to_cpu(raw_inode-i_generation);
ei-i_block_group = iloc.block_group;
@@ -2854,10 +2860,7 @@ static int ext3_do_update_inode(handle_t *handle,
raw_inode-i_frag = ei-i_frag_no;
raw_inode-i_fsize = ei-i_frag_size;
 #endif
-   raw_inode-i_file_acl = cpu_to_le32(ei-i_file_acl);
-   if (!S_ISREG(inode-i_mode)) {
-   raw_inode-i_dir_acl = cpu_to_le32(ei-i_dir_acl);
-   } else {
+   if (S_ISREG(inode-i_mode)) {
raw_inode-i_size_high =
cpu_to_le32(ei-i_disksize  32);
if (ei-i_disksize  0x7fffULL) {
@@ -2883,6 +2886,12 @@ static int ext3_do_update_inode(handle_t *handle,
}
}
}
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
+   else {
+   raw_inode-i_dir_acl = cpu_to_le32(ei-i_dir_acl);
+   }
+   raw_inode-i_file_acl = cpu_to_le32(ei-i_file_acl);
+#endif
raw_inode-i_generation = cpu_to_le32(inode-i_generation);
if (S_ISCHR(inode-i_mode) || S_ISBLK(inode-i_mode)) {
if (old_valid_dev(inode-i_rdev)) {
diff --git a/include/linux/ext3_fs_i.h b/include/linux/ext3_fs_i.h
index 7894dd0..9e7f1b6 100644
--- a/include/linux/ext3_fs_i.h
+++ b/include/linux/ext3_fs_i.h
@@ -75,8 +75,10 @@ struct ext3_inode_info {
__u8i_frag_no;
__u8i_frag_size;
 #endif
+#ifdef CONFIG_EXT3_FS_POSIX_ACL
ext3_fsblk_ti_file_acl;
__u32   i_dir_acl;
+#endif
__u32   i_dtime;
 
/*
-- 
1.5.3.7

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-11 Thread Indan Zupancic
Hi,

On Fri, January 11, 2008 09:46, Tetsuo Handa wrote:
> It depends.
> Some users have to continue using brain dead legacy applications
> without modification because ...
>
>the application's source code is not available.

Source isn't needed, as long as the vendor has it.

>the distributor no longer supports the application.

Then why should anyone else support it?

>the application is too difficult/complicated to reconstruct.

Then you can't trust it and it shouldn't have permission to do
potentially dangerous things in /dev/ either. Even if you can
contain the device node creation, it most likely does other
potentially dangerous things too. As a whole it can't be trusted.

> I assume "a static /dev/" means a /dev/ directory in 2.4 kernels.
> This filesystem's advantage:

I'm not talking about devfs, I'm talking about a real static /dev.
I'm using it now and it works fine (I let udev manage /udev/ to see
what's it's doing).

>   (1) Can guarantee filename/attribute pairs.

Wrong. All nodes are created and thus there's never a need to create
new nodes. So /dev/ can't be modified by anyone. This works because
all nodes that anyone might want to create already exist.

>   (2) Can keep nodes that needn't to be deleted/modified for read-only.

This would also be true for all nodes in a static /dev I think.

>   (3) Can hide unwanted device nodes.

In a static /dev you only create the nodes you want. It's true that it
can't hide nodes for hardware that doesn't exist (other than deleting
the nodes manually), but that was the norm for years before the
whole dynamic /dev thing catched up.

> I don't know. I'm not using rare software.

It doesn't have to be rare, anything is fine. You don't know
anything else than udev? (And shell commands like mknod etc.)

Then why all the talk about mysterious apps that might need to
do all kind of crazy things in /dev?

> No. The kernel must be involved.

> Who can guarantee that the daemon can access all namespaces?
> The process who requests "mknod" and the process who performs "mknod"
> are not always using the same namespace.

This is true on a theoretical level. But practically I think you can either
run multiple daemons, one for each namespace where you want to
control /dev/, or if you really want one daemon you can pass the
directory fd to it where the node should be created and use mknodat().
I believe that crosses namespaces correctly.

If the daemon can't be contacted or doesn't want to do a mknod for you,
the preloaded lib can fallback to doing the mknod itself, though normally
that would be disallowed by MAC.

But I think that the chance that any process needs to create device nodes
in a chroot is at the level of fairy existance.

> If "foo" or "bar" is a statically linked or suid-root application
> (where LD_PRELOAD is ignored), they would attempt to create device nodes
> directly (i.e. call sys_mknod() instead of communicating with the daemon)
> and abort due to failure.
> Not only applications who wants to create device nodes in /dev/ ,
> but also all applications who wants to modify entries in /dev/ .

If the preloaded library is setuid, it will also work for setuid programs.
It's true that it won't work for statically linked apps, but so what?

Device node creating apps are rare enough, let alone the ones that are
also statically linked. Nice theoretical problem, but I don't think anyone
will care in practice.

> From the beginning, the kernel is deeply involved because in-kernel MAC
> is essential to realize "only the tiny daemon can modify /dev/".
> Why not do this "filename/attribute" checking in the kernel too?

That "only the tiny daemon can modify /dev/" is done with MAC rules,
the ones that should be the default for all applications except udev by
default already. For teh kernel nothing changes.

>> The ammount of code will be the current parsing code + a few hundred
>> lines of code, including the preloaded library.
>
> You will be bothered with "what is the realpath of /dev/null?" and
> "how can I reach the realpath?" because you have to manage
> namespace information.

Or ignore the problem and see if it's a real problem or a nice theoretical
case. And when it turns out to be a real problem, there are probably
ways to fix it (See above). But you know what exactly is needed only
after problems do turn up.

> OK. I'll consider adding this feature.
> But I'd like to use approach (B) to keep the advantage (3).
>
> (A) White-listing + Black-listing approach.
>
> "Permit any operations if the filename didn't appear
>  in the configuration file".
>
> (B) White-listing + Wild-card approach.
>
> "Support wildcard and permit only operations if
>  the filename-with-wildcard/attributes-with-wildcard appeared
>  in the configuration file".

With this the filesystem at least adds some unique abilities.

If anyone really needs it and where/how it should be implemented is
another matter.

Without it it's a glorified and complicated 

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-11 Thread Indan Zupancic
Hi,

On Fri, January 11, 2008 09:46, Tetsuo Handa wrote:
 It depends.
 Some users have to continue using brain dead legacy applications
 without modification because ...

the application's source code is not available.

Source isn't needed, as long as the vendor has it.

the distributor no longer supports the application.

Then why should anyone else support it?

the application is too difficult/complicated to reconstruct.

Then you can't trust it and it shouldn't have permission to do
potentially dangerous things in /dev/ either. Even if you can
contain the device node creation, it most likely does other
potentially dangerous things too. As a whole it can't be trusted.

 I assume a static /dev/ means a /dev/ directory in 2.4 kernels.
 This filesystem's advantage:

I'm not talking about devfs, I'm talking about a real static /dev.
I'm using it now and it works fine (I let udev manage /udev/ to see
what's it's doing).

   (1) Can guarantee filename/attribute pairs.

Wrong. All nodes are created and thus there's never a need to create
new nodes. So /dev/ can't be modified by anyone. This works because
all nodes that anyone might want to create already exist.

   (2) Can keep nodes that needn't to be deleted/modified for read-only.

This would also be true for all nodes in a static /dev I think.

   (3) Can hide unwanted device nodes.

In a static /dev you only create the nodes you want. It's true that it
can't hide nodes for hardware that doesn't exist (other than deleting
the nodes manually), but that was the norm for years before the
whole dynamic /dev thing catched up.

 I don't know. I'm not using rare software.

It doesn't have to be rare, anything is fine. You don't know
anything else than udev? (And shell commands like mknod etc.)

Then why all the talk about mysterious apps that might need to
do all kind of crazy things in /dev?

 No. The kernel must be involved.

 Who can guarantee that the daemon can access all namespaces?
 The process who requests mknod and the process who performs mknod
 are not always using the same namespace.

This is true on a theoretical level. But practically I think you can either
run multiple daemons, one for each namespace where you want to
control /dev/, or if you really want one daemon you can pass the
directory fd to it where the node should be created and use mknodat().
I believe that crosses namespaces correctly.

If the daemon can't be contacted or doesn't want to do a mknod for you,
the preloaded lib can fallback to doing the mknod itself, though normally
that would be disallowed by MAC.

But I think that the chance that any process needs to create device nodes
in a chroot is at the level of fairy existance.

 If foo or bar is a statically linked or suid-root application
 (where LD_PRELOAD is ignored), they would attempt to create device nodes
 directly (i.e. call sys_mknod() instead of communicating with the daemon)
 and abort due to failure.
 Not only applications who wants to create device nodes in /dev/ ,
 but also all applications who wants to modify entries in /dev/ .

If the preloaded library is setuid, it will also work for setuid programs.
It's true that it won't work for statically linked apps, but so what?

Device node creating apps are rare enough, let alone the ones that are
also statically linked. Nice theoretical problem, but I don't think anyone
will care in practice.

 From the beginning, the kernel is deeply involved because in-kernel MAC
 is essential to realize only the tiny daemon can modify /dev/.
 Why not do this filename/attribute checking in the kernel too?

That only the tiny daemon can modify /dev/ is done with MAC rules,
the ones that should be the default for all applications except udev by
default already. For teh kernel nothing changes.

 The ammount of code will be the current parsing code + a few hundred
 lines of code, including the preloaded library.

 You will be bothered with what is the realpath of /dev/null? and
 how can I reach the realpath? because you have to manage
 namespace information.

Or ignore the problem and see if it's a real problem or a nice theoretical
case. And when it turns out to be a real problem, there are probably
ways to fix it (See above). But you know what exactly is needed only
after problems do turn up.

 OK. I'll consider adding this feature.
 But I'd like to use approach (B) to keep the advantage (3).

 (A) White-listing + Black-listing approach.

 Permit any operations if the filename didn't appear
  in the configuration file.

 (B) White-listing + Wild-card approach.

 Support wildcard and permit only operations if
  the filename-with-wildcard/attributes-with-wildcard appeared
  in the configuration file.

With this the filesystem at least adds some unique abilities.

If anyone really needs it and where/how it should be implemented is
another matter.

Without it it's a glorified and complicated drop-in replacement for
a static /dev/.

Regards,

Indan


--
To 

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-10 Thread Indan Zupancic
On Thu, January 10, 2008 05:57, Tetsuo Handa wrote:
> It seems to me that the alternatives you are proposing include
> modification of userland applications. But my assumption is
> that "Don't require modification of userland applications".

If you want a secure system it isn't that unreasonable to expect
applications to not do brain dead things, so not requiring any
modifications or config changes seems a bit optimistic to me.

> In other words, I want to implement without asking applications
> to use /dev/dynamic/ or something.
> This filesystem is intended to provide support for legacy applications.
> (In fact, this filesystem in TOMOYO Linux is for kernel 2.4.30/2.6.11 and
> later.)

Legacy applications should cope with a static /dev/.

What is the advantage of your filesystem compared to a static /dev/?

> Use of a tiny daemon that communicates with udev is not sufficient.
> The udev is not the only application that modifies /dev files.

Oh, it isn't? Which other applications do modify /dev files? I'd like to
hear about a few, no matter how obscure or proprietary. And please
tell how many of those will stop working with a static /dev with all
nodes they might create already existing.

> At least, the tiny daemon should communicate with the kernel
> so that all requests are checked by the tiny daemon.

No, why should the kernel be involved? The tiny daemon would be
the only one allowed to modify /dev/, so all mknod commands will
be done by it. Of course it means that you might need to modify
the two or three apps wanting to create device nodes, or you can
make an LD_PRELOAD lib that intercepts mknod commands and
sends them to the daemon.

The ammount of code will be the current parsing code + a few hundred
lines of code, including the preloaded library.

> But use of the tiny daemon (which is a process running in userland)
> causes a lot of troubles.

No, it doesn't, and most of those problems are true for all programs
that access /dev! If those are straced or whatever they can be forced
to open the wrong file, practically breaking the filename/attribute pairs.
So all security you think you need to have for the daemon process is
the same security you already need for all processes anyway to protect
them against each other.

>> If an administrator wants something else than
>> 3 or 5, you're breaking something.
> That's the fate of white-list based access control.
>
> Does this filesystem sound too strict to support dynamic device?
> May be this filesystem should be able to permit creation of device
> nodes that are not listed in the policy file.

Actually, I assumed that was the case, because if it's strictly white-list
based it's almost the same as a static /dev with some nodes hidden.
Without it has even less value, because it just complicates matters
compared to a normal static dev.

I thought it checked that if a device name was in the list, it has the
correct attributes, and was free to create nodes without restricted
names.

> From your next posting:
>> But I think doing more is getting ridiculous, because if a process can
>> create a device node, it can also access it and do whatever harm could
>> be done by the confusion caused by unexpected name/attribute pairs.
>
> FYI. Being able to create a device node is different from being able to access
> it and do whatever harm. You will need read and/or write permission to open
> that device.

Yes, but as the process creates the device it can also choose the file mode and
probably also ownership. And as it creates a new file there likely aren't strict
MAC rules in place restricting the process from reading or writing to it. So
yes, you're right, but in practise it isn't as easy to close that hole,
especially
not if the applications isn't very clean and single purpose. If it creates the
node
it probably wans to use it too, and that means read/write access. Even if it can
live without it, it could give access to the node to another process and let the
other process do the dirty work. Very tricky.

Greetings,

Indan


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-09 Thread Indan Zupancic
On Thu, January 10, 2008 00:08, Serge E. Hallyn wrote:
> These emails again are getting really long, but I think the gist of
> Indan's suggestion can be concisely summarized:

No worry, I wasn't planning on extending it, I've said what I've to say.

Except...

>
>   "To confine process P3 to /dev/hda2 being 'b 3 2', create
>   /dev/p3, launch P3 in a new mounts namespace, mount --bind
>   /dev/p3 /dev, exec what you want p3 running, and have
>   MAC prevent umount /dev/p3."
>
> This is a neat idea, but Tetsuo's rebutall is
>
>   "P3 may be legacy code needing to create or delete
>   /dev/floppy, where -EPERM confuses P3 and prevents
>   it working correctly."
>
> Indan's idea is interesting and I like it, but is there an answer to
> Tetsuo's problem with it?

...that I didn't mean that, but a more simple

/dev/ directory protected from any modifications by MAC,

/dev/* all the nodes that need to have guaranteed name/attribute pairs,
like /dev/null, /dev/zero, /dev/random, etc. and:

/dev/dynamic/ being a dir where apps who really need to create/modify
device nodes can do whatever they want to do. It can be multiple dirs
too, like /dev/snd/, /dev/input/ etc.

I guess this covers about 96% of the usecases of this tamper-proof dev fs.

You can think of unlikely cases that aren't solved by this, but those can
be solved in another way if really wanted (like a checking daemon,
modified udev, shadow /dev/, to name a few).

But I think doing more is getting ridiculous, because if a process can
create a device node, it can also access it and do whatever harm could
be done by the confusion caused by unexpected name/attribute pairs.

As for information snooping, that's mostly about /dev/null or other
things that are known beforehand.

> PS - Indan, you also said in essence "if P3 can be trusted to create
> /dev/floppy why can't it be trusted to create /dev/hda1".  I trust that,
> phrased that way, the question answers itself?

Not exactly. If there's a process that dynamically created certain device
nodes, and it wants to create one that doesn't fit the rules, you can't
know if it's wrong or if your rules are wrong. The process has a certain
policy of naming/creating the devices, but you also have a policy at the
kernel side with this fs. If it mismatches you don't know which one is
right.

If you trust a process to create /dev/hd*, you can also trust it to create
the proper /dev/hdXn, no need to verify if /dev/hda1 is really 3 1.

The whole thing about filename/attribute pairs is that it's about what
applications expect. There aren't many expectations about dynamically
created device nodes which might not always be there, because their
name isn't stable.

The use case for this fs is a malicious app that can create device nodes,
and we're worried about mismatching name/attribute pairs. Not about
our data, or anything else. Call me an optimist, but I think you don't
need to worry about name/attribute pairs.

Greetings,

Indan


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-09 Thread Indan Zupancic
Hello,

On Wed, January 9, 2008 05:39, Tetsuo Handa wrote:
> Hello.
>
> Indan Zupancic wrote:
>> I think you focus too much on your way of enforcing filename/attributes
>> pairs.
> So?

So that you miss alternatives and don't see the bigger picture.

>
>> The same can be achieved by creating the device nodes with
>> expected attributes, and preventing processes from changing those files.
> The device nodes have to be deletable if some process (including udev) needs
> to delete.
> Thus, you cannot unconditionally prevent processes from changing those files.
>
>> This because expected combinations are known beforehand.
> Yes.
>
>> And once those files are present, the MAC system used doesn't have to have
>> special
>> device nodes attributes support. Protecting those files is enough to
>> guarantee filename/attributes pairs.
> If MAC system needn't to support this filesystem's functionality,
> who creates those files with warrantee of expected attributes? The udev does?
> If udev is exploited, who can guarantee?

The person that would write the config file for your fs, the one who wants
that guarantee.

>
>> No, this is because rename permission was given for files that it shouldn't
>> had.
> Do you think all MAC implementation have the same granularity and
> functionalities?
> I don't think so. Not all MAC implementation can control with such
> granularity.
> This filesystem is designed to be combined with any MAC,
> although the MAC used with this filesystem should be able to restrict
> namespace manipulation requests so that this filesystem can remain /dev
> and visible to userland applications.

Good point, but I assume they all have at least a directory granularity, and 
then
/dev/ can be static and udev and other can have free reign in e.g. 
/dev/dynamic/.
Just use subdirs for the dynamic stuff and this granularity problem is, with
slight
inconvenience, solved.

>
>> Either you want a process to manage device names and attributes, and then
>> you
>> give it permission to do that, or you want to enforce certain
>> filename/attribute
>> pairs and then you just do it yourself.
> If I modify udev to enforce certain filename/attribute pairs and the modified
> udev
> was exploited, who can guarantee?
> "Don't trust userland application" is the basis of restricting access in
> kernel space.
> If you can trust userland application, you don't need in-kernel access
> control.

Funny, I thought that it was in the kernel because that's the way to protect
processes against eachother, the fs against processes, and for performance
reasons.

Exploits are in code, and where that code is doesn't matter that much, either
kernel or userspace, though if it's exploitable you'll rather not have it in the
kernel. So I think it's more secure if the checking would be done by udev than
in a special filesystem, even if that means that you're screwed if udev is
exploited. Of course you fully trust your own code, naturally.

A tiny daemon that communicates with udev and does the checking you have
now, and if ok it creates the node is really not much more code than your fs,
so as hard to exploit too. Then if udev is hacked you have the same guarantee
as you have now.

I can think of more alternatives that are as secure or more secure than the
current solution.

>
>
>> Will your filesystem prevent the trivial case of
>>
>> rm /dev/hda1
>> ln -s /dev/hda2 /dev/hda1
>>
> Of course. To permit the above operation, the following permissions are
> needed.
>
>   hda1660 0   6   2   b   3   1
>   hda1777 0   0  33   l   .

Yes, I should've read the code before asking that, instead of the other way
round.

>
>> Rename permission can be given for /dev in general, but prohibited for
>> certain files in /dev, the ones you want to have specific attributes.
>> It isn't all or nothing.
> Do you think all MAC implementation can prohibit renaming for certain files in
> /dev ?
>
>> It's "forbid modifying certain nodes that process needn't to modify"
>> versus "forbid breaking filename/attribute pairs of certain nodes".
>>
>> Both have the same effect, except that the first one is generic and
>> can be done by existing MAC systems, while the second one needs
>> a special filesystem and a handful of MAC rules to make it effective.
> Do you think all MAC implementation can do?
> I think the first one is implementation specific and the second one is
> generic.

Protecting certain files from being modified seems to me more generic than
enforcing filename/attributes pairs on device nodes. And if they can't do it
surely they can do it per directory, and the using su

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-09 Thread Indan Zupancic
Hello,

On Wed, January 9, 2008 05:39, Tetsuo Handa wrote:
 Hello.

 Indan Zupancic wrote:
 I think you focus too much on your way of enforcing filename/attributes
 pairs.
 So?

So that you miss alternatives and don't see the bigger picture.


 The same can be achieved by creating the device nodes with
 expected attributes, and preventing processes from changing those files.
 The device nodes have to be deletable if some process (including udev) needs
 to delete.
 Thus, you cannot unconditionally prevent processes from changing those files.

 This because expected combinations are known beforehand.
 Yes.

 And once those files are present, the MAC system used doesn't have to have
 special
 device nodes attributes support. Protecting those files is enough to
 guarantee filename/attributes pairs.
 If MAC system needn't to support this filesystem's functionality,
 who creates those files with warrantee of expected attributes? The udev does?
 If udev is exploited, who can guarantee?

The person that would write the config file for your fs, the one who wants
that guarantee.


 No, this is because rename permission was given for files that it shouldn't
 had.
 Do you think all MAC implementation have the same granularity and
 functionalities?
 I don't think so. Not all MAC implementation can control with such
 granularity.
 This filesystem is designed to be combined with any MAC,
 although the MAC used with this filesystem should be able to restrict
 namespace manipulation requests so that this filesystem can remain /dev
 and visible to userland applications.

Good point, but I assume they all have at least a directory granularity, and 
then
/dev/ can be static and udev and other can have free reign in e.g. 
/dev/dynamic/.
Just use subdirs for the dynamic stuff and this granularity problem is, with
slight
inconvenience, solved.


 Either you want a process to manage device names and attributes, and then
 you
 give it permission to do that, or you want to enforce certain
 filename/attribute
 pairs and then you just do it yourself.
 If I modify udev to enforce certain filename/attribute pairs and the modified
 udev
 was exploited, who can guarantee?
 Don't trust userland application is the basis of restricting access in
 kernel space.
 If you can trust userland application, you don't need in-kernel access
 control.

Funny, I thought that it was in the kernel because that's the way to protect
processes against eachother, the fs against processes, and for performance
reasons.

Exploits are in code, and where that code is doesn't matter that much, either
kernel or userspace, though if it's exploitable you'll rather not have it in the
kernel. So I think it's more secure if the checking would be done by udev than
in a special filesystem, even if that means that you're screwed if udev is
exploited. Of course you fully trust your own code, naturally.

A tiny daemon that communicates with udev and does the checking you have
now, and if ok it creates the node is really not much more code than your fs,
so as hard to exploit too. Then if udev is hacked you have the same guarantee
as you have now.

I can think of more alternatives that are as secure or more secure than the
current solution.



 Will your filesystem prevent the trivial case of

 rm /dev/hda1
 ln -s /dev/hda2 /dev/hda1

 Of course. To permit the above operation, the following permissions are
 needed.

   hda1660 0   6   2   b   3   1
   hda1777 0   0  33   l   .

Yes, I should've read the code before asking that, instead of the other way
round.


 Rename permission can be given for /dev in general, but prohibited for
 certain files in /dev, the ones you want to have specific attributes.
 It isn't all or nothing.
 Do you think all MAC implementation can prohibit renaming for certain files in
 /dev ?

 It's forbid modifying certain nodes that process needn't to modify
 versus forbid breaking filename/attribute pairs of certain nodes.

 Both have the same effect, except that the first one is generic and
 can be done by existing MAC systems, while the second one needs
 a special filesystem and a handful of MAC rules to make it effective.
 Do you think all MAC implementation can do?
 I think the first one is implementation specific and the second one is
 generic.

Protecting certain files from being modified seems to me more generic than
enforcing filename/attributes pairs on device nodes. And if they can't do it
surely they can do it per directory, and the using subdirs solves it.


 It doesn't matter where they are, it's that a different fs than yours could
 be
 mounted over it. You say a MAC can prevent that from happening, but a
 MAC can also prevent all processes except for udev from modifying /dev.
 But MAC cannot prevent udev from modifying /dev . And what if exploited?
 Not all MAC can enforce access control over all processes with the granularity
 you are talking. And what if a process that cannot

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-09 Thread Indan Zupancic
On Thu, January 10, 2008 00:08, Serge E. Hallyn wrote:
 These emails again are getting really long, but I think the gist of
 Indan's suggestion can be concisely summarized:

No worry, I wasn't planning on extending it, I've said what I've to say.

Except...


   To confine process P3 to /dev/hda2 being 'b 3 2', create
   /dev/p3, launch P3 in a new mounts namespace, mount --bind
   /dev/p3 /dev, exec what you want p3 running, and have
   MAC prevent umount /dev/p3.

 This is a neat idea, but Tetsuo's rebutall is

   P3 may be legacy code needing to create or delete
   /dev/floppy, where -EPERM confuses P3 and prevents
   it working correctly.

 Indan's idea is interesting and I like it, but is there an answer to
 Tetsuo's problem with it?

...that I didn't mean that, but a more simple

/dev/ directory protected from any modifications by MAC,

/dev/* all the nodes that need to have guaranteed name/attribute pairs,
like /dev/null, /dev/zero, /dev/random, etc. and:

/dev/dynamic/ being a dir where apps who really need to create/modify
device nodes can do whatever they want to do. It can be multiple dirs
too, like /dev/snd/, /dev/input/ etc.

I guess this covers about 96% of the usecases of this tamper-proof dev fs.

You can think of unlikely cases that aren't solved by this, but those can
be solved in another way if really wanted (like a checking daemon,
modified udev, shadow /dev/, to name a few).

But I think doing more is getting ridiculous, because if a process can
create a device node, it can also access it and do whatever harm could
be done by the confusion caused by unexpected name/attribute pairs.

As for information snooping, that's mostly about /dev/null or other
things that are known beforehand.

 PS - Indan, you also said in essence if P3 can be trusted to create
 /dev/floppy why can't it be trusted to create /dev/hda1.  I trust that,
 phrased that way, the question answers itself?

Not exactly. If there's a process that dynamically created certain device
nodes, and it wants to create one that doesn't fit the rules, you can't
know if it's wrong or if your rules are wrong. The process has a certain
policy of naming/creating the devices, but you also have a policy at the
kernel side with this fs. If it mismatches you don't know which one is
right.

If you trust a process to create /dev/hd*, you can also trust it to create
the proper /dev/hdXn, no need to verify if /dev/hda1 is really 3 1.

The whole thing about filename/attribute pairs is that it's about what
applications expect. There aren't many expectations about dynamically
created device nodes which might not always be there, because their
name isn't stable.

The use case for this fs is a malicious app that can create device nodes,
and we're worried about mismatching name/attribute pairs. Not about
our data, or anything else. Call me an optimist, but I think you don't
need to worry about name/attribute pairs.

Greetings,

Indan


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-08 Thread Indan Zupancic
Hi Tetsuo,

I think you focus too much on your way of enforcing filename/attributes
pairs. The same can be achieved by creating the device nodes with
expected attributes, and preventing processes from changing those files.
This because expected combinations are known beforehand. And once
those files are present, the MAC system used doesn't have to have special
device nodes attributes support. Protecting those files is enough to
guarantee filename/attributes pairs.

On Tue, January 8, 2008 14:50, Tetsuo Handa wrote:
> Hello.
>
>
> Indan Zupancic wrote:
>> > I want to use this filesystem in case where a process with root privilege
>> was
>> > hijacked but the behavior of the hijacked process is still restricted by
>> MAC.
>>
>> 1) If the behaviour can be controlled, why can't the process be
>>disallowed to change anything badly in /dev? Like disallowing anything
>>from modifying existing nodes that weren't created by that process.
>>That would have practically the same effect as your filesystem,
>>won't it?
> MAC system can prevent hijacked processes from changing anything badly in /dev
> .
> But MAC system can't prevent hijacked processes from doing
> "mv /dev/hda1 /dev/hda1.tmp; mv /dev/hda2 /dev/hda1; mv /dev/hda1.tmp
> /dev/hda2"
> if permissions to rename device nodes in /dev are given to hijacked processes.
> This is because MAC implementation doesn't check filename/attribute pairs.

No, this is because rename permission was given for files that it shouldn't had.
Either you want a process to manage device names and attributes, and then you
give it permission to do that, or you want to enforce certain filename/attribute
pairs and then you just do it yourself.

Will your filesystem prevent the trivial case of

rm /dev/hda1
ln -s /dev/hda2 /dev/hda1

>
> But this filesystem can prevent hijacked processes from doing
> "mv /dev/hda1 /dev/hda1.tmp; mv /dev/hda2 /dev/hda1; mv /dev/hda1.tmp
> /dev/hda2"
> even if permissions to rename device nodes in /dev are given to hijacked
> processes.

Rename permission can be given for /dev in general, but prohibited for
certain files in /dev, the ones you want to have specific attributes.
It isn't all or nothing.

>
> This filesystem is not designed to
> "forbid modifying nodes if that process needn't to modify nodes".
> This filesystem is designed to
> "forbid breaking filename/attribute pairs of nodes
> even if that process need to (or permitted to) modify nodes".

It's "forbid modifying certain nodes that process needn't to modify"
versus "forbid breaking filename/attribute pairs of certain nodes".

Both have the same effect, except that the first one is generic and
can be done by existing MAC systems, while the second one needs
a special filesystem and a handful of MAC rules to make it effective.


>> 2) The MAC system may not be able to guarantee certain combinations
>>of device names and properties, but isn't that policy that shouldn't
>>be in the kernel anyway? But if it is, shouldn't all device nodes be
>>checked? That is, shouldn't it be a global check instead of a filesystem
>>specific one?
> I think the reason why MAC system doesn't handle filename/attributes pairs is
> that:
>
> Filename and its attributes pairs are conventionally considered as
> constant and reliable.
>
> It makes the MAC's policy syntax complicated to describe this attribute
> enforcement information in MAC's policy.
>
> Thus, this should be a global check. But usually device nodes are only in /dev
> .

It doesn't matter where they are, it's that a different fs than yours could be
mounted over it. You say a MAC can prevent that from happening, but a
MAC can also prevent all processes except for udev from modifying /dev.
Done globally instead of as a filesystem it can actually guarantee name/attr
pairs, now it can't even do that on its own.

>
>> 3) Code efficiency. Thousand lines of code just to close one very specific
>>attack, which can be done in lots of different other ways that all need
>>to be prevented by the MAC system. (mounting over it, intercepting open
>>calls, duping the fd, etc.) Is it worth it?
> This filesystem is doing what MAC system is not doing.
> So, please don't complain about inability of this filesystem to close all
> attacks.

I don't. What I complain about is that it's too specific and does it one chosen
job badly. It lacks abstraction. As far as I can see any decent MAC can achieve
the same end result as your filesystem, without directly enforcing name/attr
pairs.

> You can use MAC system to prevent attackers from mounting other filesystem
> over this filesystem.
>
> The filename/attribute pairs are so

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-08 Thread Indan Zupancic
Hi Tetsuo,

I think you focus too much on your way of enforcing filename/attributes
pairs. The same can be achieved by creating the device nodes with
expected attributes, and preventing processes from changing those files.
This because expected combinations are known beforehand. And once
those files are present, the MAC system used doesn't have to have special
device nodes attributes support. Protecting those files is enough to
guarantee filename/attributes pairs.

On Tue, January 8, 2008 14:50, Tetsuo Handa wrote:
 Hello.


 Indan Zupancic wrote:
  I want to use this filesystem in case where a process with root privilege
 was
  hijacked but the behavior of the hijacked process is still restricted by
 MAC.

 1) If the behaviour can be controlled, why can't the process be
disallowed to change anything badly in /dev? Like disallowing anything
from modifying existing nodes that weren't created by that process.
That would have practically the same effect as your filesystem,
won't it?
 MAC system can prevent hijacked processes from changing anything badly in /dev
 .
 But MAC system can't prevent hijacked processes from doing
 mv /dev/hda1 /dev/hda1.tmp; mv /dev/hda2 /dev/hda1; mv /dev/hda1.tmp
 /dev/hda2
 if permissions to rename device nodes in /dev are given to hijacked processes.
 This is because MAC implementation doesn't check filename/attribute pairs.

No, this is because rename permission was given for files that it shouldn't had.
Either you want a process to manage device names and attributes, and then you
give it permission to do that, or you want to enforce certain filename/attribute
pairs and then you just do it yourself.

Will your filesystem prevent the trivial case of

rm /dev/hda1
ln -s /dev/hda2 /dev/hda1


 But this filesystem can prevent hijacked processes from doing
 mv /dev/hda1 /dev/hda1.tmp; mv /dev/hda2 /dev/hda1; mv /dev/hda1.tmp
 /dev/hda2
 even if permissions to rename device nodes in /dev are given to hijacked
 processes.

Rename permission can be given for /dev in general, but prohibited for
certain files in /dev, the ones you want to have specific attributes.
It isn't all or nothing.


 This filesystem is not designed to
 forbid modifying nodes if that process needn't to modify nodes.
 This filesystem is designed to
 forbid breaking filename/attribute pairs of nodes
 even if that process need to (or permitted to) modify nodes.

It's forbid modifying certain nodes that process needn't to modify
versus forbid breaking filename/attribute pairs of certain nodes.

Both have the same effect, except that the first one is generic and
can be done by existing MAC systems, while the second one needs
a special filesystem and a handful of MAC rules to make it effective.


 2) The MAC system may not be able to guarantee certain combinations
of device names and properties, but isn't that policy that shouldn't
be in the kernel anyway? But if it is, shouldn't all device nodes be
checked? That is, shouldn't it be a global check instead of a filesystem
specific one?
 I think the reason why MAC system doesn't handle filename/attributes pairs is
 that:

 Filename and its attributes pairs are conventionally considered as
 constant and reliable.

 It makes the MAC's policy syntax complicated to describe this attribute
 enforcement information in MAC's policy.

 Thus, this should be a global check. But usually device nodes are only in /dev
 .

It doesn't matter where they are, it's that a different fs than yours could be
mounted over it. You say a MAC can prevent that from happening, but a
MAC can also prevent all processes except for udev from modifying /dev.
Done globally instead of as a filesystem it can actually guarantee name/attr
pairs, now it can't even do that on its own.


 3) Code efficiency. Thousand lines of code just to close one very specific
attack, which can be done in lots of different other ways that all need
to be prevented by the MAC system. (mounting over it, intercepting open
calls, duping the fd, etc.) Is it worth it?
 This filesystem is doing what MAC system is not doing.
 So, please don't complain about inability of this filesystem to close all
 attacks.

I don't. What I complain about is that it's too specific and does it one chosen
job badly. It lacks abstraction. As far as I can see any decent MAC can achieve
the same end result as your filesystem, without directly enforcing name/attr
pairs.

 You can use MAC system to prevent attackers from mounting other filesystem
 over this filesystem.

 The filename/attribute pairs are something like system call entry tables.
 The application will go wrong if __NR_read is mapped to sys_write() and
 __NR_write is mapped to sys_read().
 Userland applications access special functionalities (e.g. /dev/zero and
 /dev/random)
 by name (i.e. syscall numbers). Therefore, keeping the filename/attribute
 pairs
 tamper-proof is important.

 You recognize that there is a threat that device

Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-07 Thread Indan Zupancic
Hello,

Some questions:

On Sun, January 6, 2008 16:20, Tetsuo Handa wrote:
> I want to use this filesystem in case where a process with root privilege was
> hijacked but the behavior of the hijacked process is still restricted by MAC.

1) If the behaviour can be controlled, why can't the process be
   disallowed to change anything badly in /dev? Like disallowing anything
   from modifying existing nodes that weren't created by that process.
   That would have practically the same effect as your filesystem,
   won't it?

   Or phrased differently, if the MAC system used can't protect /dev, it
   won't be able to protect other directories either, and if it can't
   protect e.g. my homedir, doesn't it make the whole MAC system
   ineffective? And if the MAC system used is ineffective, your
   filesystem is useless and you've bigger problems to fix.

2) The MAC system may not be able to guarantee certain combinations
   of device names and properties, but isn't that policy that shouldn't
   be in the kernel anyway? But if it is, shouldn't all device nodes be
   checked? That is, shouldn't it be a global check instead of a filesystem
   specific one?

3) Code efficiency. Thousand lines of code just to close one very specific
   attack, which can be done in lots of different other ways that all need
   to be prevented by the MAC system. (mounting over it, intercepting open
   calls, duping the fd, etc.) Is it worth it?

I really don't care how you try to protect your system, but I don't think
this is an effective way to do it.

Good luck,

Indan


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH][RFC] Simple tamper-proof device filesystem.

2008-01-07 Thread Indan Zupancic
Hello,

Some questions:

On Sun, January 6, 2008 16:20, Tetsuo Handa wrote:
 I want to use this filesystem in case where a process with root privilege was
 hijacked but the behavior of the hijacked process is still restricted by MAC.

1) If the behaviour can be controlled, why can't the process be
   disallowed to change anything badly in /dev? Like disallowing anything
   from modifying existing nodes that weren't created by that process.
   That would have practically the same effect as your filesystem,
   won't it?

   Or phrased differently, if the MAC system used can't protect /dev, it
   won't be able to protect other directories either, and if it can't
   protect e.g. my homedir, doesn't it make the whole MAC system
   ineffective? And if the MAC system used is ineffective, your
   filesystem is useless and you've bigger problems to fix.

2) The MAC system may not be able to guarantee certain combinations
   of device names and properties, but isn't that policy that shouldn't
   be in the kernel anyway? But if it is, shouldn't all device nodes be
   checked? That is, shouldn't it be a global check instead of a filesystem
   specific one?

3) Code efficiency. Thousand lines of code just to close one very specific
   attack, which can be done in lots of different other ways that all need
   to be prevented by the MAC system. (mounting over it, intercepting open
   calls, duping the fd, etc.) Is it worth it?

I really don't care how you try to protect your system, but I don't think
this is an effective way to do it.

Good luck,

Indan


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-17 Thread Indan Zupancic
Hi,

On Mon, December 17, 2007 01:40, Tetsuo Handa wrote:
> Hello.
>
> Indan Zupancic wrote:
>> What prevents them from mounting tmpfs on top of /dev, bypassing your fs?
> Mandatory access control (MAC) prevents them from mounting tmpfs on top of
> /dev .
> MAC mediates namespace manipulation requests such as mount()/umount().
>
>> Also, if they have root there are plenty of ways to prevent an administrator
>> from logging in, e.g. using iptables or changing the password.
> MAC mediates execution of /sbin/iptables or /usr/bin/passwd .
>
> So, use of this filesystem alone is meaningless because
> attackers with root privileges can do what you are saying.
> But use of this filesystem with MAC is still valid because
> MAC can prevent attackers with root privileges from doing what you are saying.

If MAC can avoid all that, then why can't it also avoid tampering with /dev?
What security does your filesystem add at all, if it's useless without a MAC
doing
all the hard work?

I think you can better spend your time on read-only bind mounts.

Greetings,

Indan


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-17 Thread Indan Zupancic
Hi,

On Mon, December 17, 2007 01:40, Tetsuo Handa wrote:
 Hello.

 Indan Zupancic wrote:
 What prevents them from mounting tmpfs on top of /dev, bypassing your fs?
 Mandatory access control (MAC) prevents them from mounting tmpfs on top of
 /dev .
 MAC mediates namespace manipulation requests such as mount()/umount().

 Also, if they have root there are plenty of ways to prevent an administrator
 from logging in, e.g. using iptables or changing the password.
 MAC mediates execution of /sbin/iptables or /usr/bin/passwd .

 So, use of this filesystem alone is meaningless because
 attackers with root privileges can do what you are saying.
 But use of this filesystem with MAC is still valid because
 MAC can prevent attackers with root privileges from doing what you are saying.

If MAC can avoid all that, then why can't it also avoid tampering with /dev?
What security does your filesystem add at all, if it's useless without a MAC
doing
all the hard work?

I think you can better spend your time on read-only bind mounts.

Greetings,

Indan


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Indan Zupancic
Hi,

On Sun, December 16, 2007 13:03, Tetsuo Handa wrote:
> Hello.
>
> David Newall wrote:
>> > You won't be able to login to the system because /sbin/mingetty
>> > fails to "chown/chmod" /dev/tty* if /dev is mounted for read-only mode.
>>
>> Good point.  So, if only root can modify files in /dev, what's the
>> problem you're fixing?  (I'm sure you tried to explain this in your
>> original post, but your reasons weren't clear to me.)
>
> In 2003, I was trying to make / partition read-only to avoid tampering system
> files.
> Use of policy based mandatory access control (such as SELinux) is
> one of ways to avoid tampering, but management of policy was a daunting task.
> So, I tried to store / partition in a read-only medium so that
> the system is free from tampering system files.
>
> When I attended at Security Stadium 2003 as a defense side,
> I was using devfs for /dev directory. The files in /dev directory
> were deleted by attckers and the administrator was unable to login.
> So I developed this filesystem so that attackers who got root privilege
> can't tamper files in /dev directory.

What prevents them from mounting tmpfs on top of /dev, bypassing your fs?

Also, if they have root there are plenty of ways to prevent an administrator
from logging in, e.g. using iptables or changing the password.

Greetings,

Indan


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 1/2] [RFC] Simple tamper-proof device filesystem.

2007-12-16 Thread Indan Zupancic
Hi,

On Sun, December 16, 2007 13:03, Tetsuo Handa wrote:
 Hello.

 David Newall wrote:
  You won't be able to login to the system because /sbin/mingetty
  fails to chown/chmod /dev/tty* if /dev is mounted for read-only mode.

 Good point.  So, if only root can modify files in /dev, what's the
 problem you're fixing?  (I'm sure you tried to explain this in your
 original post, but your reasons weren't clear to me.)

 In 2003, I was trying to make / partition read-only to avoid tampering system
 files.
 Use of policy based mandatory access control (such as SELinux) is
 one of ways to avoid tampering, but management of policy was a daunting task.
 So, I tried to store / partition in a read-only medium so that
 the system is free from tampering system files.

 When I attended at Security Stadium 2003 as a defense side,
 I was using devfs for /dev directory. The files in /dev directory
 were deleted by attckers and the administrator was unable to login.
 So I developed this filesystem so that attackers who got root privilege
 can't tamper files in /dev directory.

What prevents them from mounting tmpfs on top of /dev, bypassing your fs?

Also, if they have root there are plenty of ways to prevent an administrator
from logging in, e.g. using iptables or changing the password.

Greetings,

Indan


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] Linux-tiny project revival

2007-09-27 Thread Indan Zupancic
On Thu, September 27, 2007 09:00, Arnd Bergmann wrote:
> Assuming that we want to go down that road, I think you can do better with
> more evil macro magic, by using something along the lines of
>
> #define KERN_NOTICE "<5>",
>
> #define PRINTK_CONTINUED "",
>
>  #define printk(level, str, ...) \
>do { \
>  if (sizeof(level) == 1) /* continued printk */\
>   actual_printk(str, __VA_ARGS__); \
>  else if ((level[1] - '0') < CONFIG_PRINTK_DOICARE) \
>actual_printk(level str, __VA_ARGS__); \
>} while(0);
>
> Then you don't have to change every single printk in the kernel, but
> only those that don't currently come with a log level. More importantly,
> you can do the conversion without a flag day, by spreading (an empty)
> PRINTK_CONTINUED in places that do need a printk without a log level.

The problem is, how do you know whether to print a continued printk or not?
It depends on the loglevel of the first printk.

So besides compile-time parsing of the source code, replacing printk with
loglevel specific alternatives (one way or the other) seems the only option.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] Linux-tiny project revival

2007-09-27 Thread Indan Zupancic
On Thu, September 27, 2007 09:00, Arnd Bergmann wrote:
 Assuming that we want to go down that road, I think you can do better with
 more evil macro magic, by using something along the lines of

 #define KERN_NOTICE 5,

 #define PRINTK_CONTINUED ,

  #define printk(level, str, ...) \
do { \
  if (sizeof(level) == 1) /* continued printk */\
   actual_printk(str, __VA_ARGS__); \
  else if ((level[1] - '0')  CONFIG_PRINTK_DOICARE) \
actual_printk(level str, __VA_ARGS__); \
} while(0);

 Then you don't have to change every single printk in the kernel, but
 only those that don't currently come with a log level. More importantly,
 you can do the conversion without a flag day, by spreading (an empty)
 PRINTK_CONTINUED in places that do need a printk without a log level.

The problem is, how do you know whether to print a continued printk or not?
It depends on the loglevel of the first printk.

So besides compile-time parsing of the source code, replacing printk with
loglevel specific alternatives (one way or the other) seems the only option.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] Linux-tiny project revival

2007-09-20 Thread Indan Zupancic
On Fri, September 21, 2007 01:18, Rob Landley wrote:
> On Thursday 20 September 2007 4:26:13 pm Indan Zupancic wrote:
>> A quick scroll through a vmlinux binary shows that there are quite a
>> lot areas consisting only of some repeated pattern. Mostly 0x00, but
>> also 0x90 and ".GCC: (GNU) 4.2.1.". Getting rid of those would save
>> something between 50 and 100KB.
>
> Worse, if you feed an absolute path to O= when you build the kernel out of
> tree, then it uses absolute paths for all the __FILE__ strings and that makes
> kernel BIG.  (Did that by accident a while back.)  Too bad there's no way
> to keep the __FILE__ strings compressed at runtime and gunzip them as needed
> like busybox does with help messages... :)

I suspect that can be fixed by changing the built system. How can using O=
change the source file path anyway? That seems unnecessary.

It seems to be worse, full pathnames are also used when giving a relative path.
(I'm using O=../obj/).

On the other hand, it doesn't seem to cause that much bloat here:

$ strings vmlinux | grep /home/ |wc
119 1816400

CC'ing Sam Ravnborg, perhaps he has some ideas.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] Linux-tiny project revival

2007-09-20 Thread Indan Zupancic
On Thu, September 20, 2007 22:38, Rob Landley wrote:
> I've been playing with an idea for a while to improve the printk() situation,
> but it's a more intrusive change than I've had time to bang on.
>
> Right now, the first argument to printk() is a loglevel, but it's handled via
> string concatenation.  I'd like to change that to be an integer, and make it
> an actual comma-separated first argument.  (Mandatory, not optional.)
>
> So instead of:
>   printk(KERN_NOTICE "Fruit=%d\n", banana);
> It would now be:
>   printk(KERN_NOTICE, "Fruit=%d\n", banana);
>
> Change the header from:
>   #define KERN_NOTICE "<5>"
> to:
>   #define KERN_NOTICE 5

You have to jump through less hoops if you do:

#define KERN_NOTICE 5,

But the problem remains that there are printk's which don't have
a KERN_* as the first argument. Those are also impossible to get
rid off in this way, as the loglevel is unknown (and you don't want
partially printed messages).

So adding the comma is really needed and in addition all printk's
without a loglevel should get one. Which clutters the code and may
increase codesize.

A quick scroll through a vmlinux binary shows that there are quite a
lot areas consisting only of some repeated pattern. Mostly 0x00, but
also 0x90 and ".GCC: (GNU) 4.2.1.". Getting rid of those would save
something between 50 and 100KB.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] Linux-tiny project revival

2007-09-20 Thread Indan Zupancic
On Thu, September 20, 2007 22:38, Rob Landley wrote:
 I've been playing with an idea for a while to improve the printk() situation,
 but it's a more intrusive change than I've had time to bang on.

 Right now, the first argument to printk() is a loglevel, but it's handled via
 string concatenation.  I'd like to change that to be an integer, and make it
 an actual comma-separated first argument.  (Mandatory, not optional.)

 So instead of:
   printk(KERN_NOTICE Fruit=%d\n, banana);
 It would now be:
   printk(KERN_NOTICE, Fruit=%d\n, banana);

 Change the header from:
   #define KERN_NOTICE 5
 to:
   #define KERN_NOTICE 5

You have to jump through less hoops if you do:

#define KERN_NOTICE 5,

But the problem remains that there are printk's which don't have
a KERN_* as the first argument. Those are also impossible to get
rid off in this way, as the loglevel is unknown (and you don't want
partially printed messages).

So adding the comma is really needed and in addition all printk's
without a loglevel should get one. Which clutters the code and may
increase codesize.

A quick scroll through a vmlinux binary shows that there are quite a
lot areas consisting only of some repeated pattern. Mostly 0x00, but
also 0x90 and .GCC: (GNU) 4.2.1.. Getting rid of those would save
something between 50 and 100KB.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Announce] Linux-tiny project revival

2007-09-20 Thread Indan Zupancic
On Fri, September 21, 2007 01:18, Rob Landley wrote:
 On Thursday 20 September 2007 4:26:13 pm Indan Zupancic wrote:
 A quick scroll through a vmlinux binary shows that there are quite a
 lot areas consisting only of some repeated pattern. Mostly 0x00, but
 also 0x90 and .GCC: (GNU) 4.2.1.. Getting rid of those would save
 something between 50 and 100KB.

 Worse, if you feed an absolute path to O= when you build the kernel out of
 tree, then it uses absolute paths for all the __FILE__ strings and that makes
 kernel BIG.  (Did that by accident a while back.)  Too bad there's no way
 to keep the __FILE__ strings compressed at runtime and gunzip them as needed
 like busybox does with help messages... :)

I suspect that can be fixed by changing the built system. How can using O=
change the source file path anyway? That seems unnecessary.

It seems to be worse, full pathnames are also used when giving a relative path.
(I'm using O=../obj/).

On the other hand, it doesn't seem to cause that much bloat here:

$ strings vmlinux | grep /home/ |wc
119 1816400

CC'ing Sam Ravnborg, perhaps he has some ideas.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


2.6.23-rc2: WARNING at kernel/irq/resend.c:70 check_irq_resend()

2007-08-08 Thread Indan Zupancic
Hi,

Just added an old network card, RTL-8029(AS), ne2k-pci driver, and tried to
expand the network (failed because I didn't use a cross-over cable).

The code snippet that spat the thing:

/*
 * IRQ resend
 *
 * Is called with interrupts disabled and desc->lock held.
 */
void check_irq_resend(struct irq_desc *desc, unsigned int irq)
{
unsigned int status = desc->status;

/*
 * Make sure the interrupt is enabled, before resending it:
 */
desc->chip->enable(irq);

/*
 * Temporary hack to figure out more about the problem, which
 * is causing the ancient network cards to die.
 */
if (desc->handle_irq != handle_edge_irq) {
WARN_ON_ONCE(1);
return;
}

and relevant dmesg snippet:

[  446.836399] eth1: RealTek RTL-8029 found at 0x9000, IRQ 16, 
00:00:B4:BB:BF:E5.
[  855.458468] r8169: eth0: link up
[  857.500667] r8169: eth0: link up
[  902.294283] r8169: eth0: link up
[ 4126.348427] r8169: eth0: link up
[ 5598.446233] r8169: eth0: link down
[ 5618.904846] r8169: eth0: link up
[ 5623.133090] r8169: eth0: link down
[ 5624.708071] r8169: eth0: link up
[ 6328.267872] WARNING: at /home/indan/src/git/linux-2.6/kernel/irq/resend.c:70 
check_irq_resend()
[ 6328.267896]  [] check_irq_resend+0x4f/0x8c
[ 6328.267912]  [] enable_irq+0x84/0xaa
[ 6328.267918]  [] ne2k_pci_block_output+0x0/0x138 [ne2k_pci]
[ 6328.267931]  [] ne2k_pci_block_output+0x0/0x138 [ne2k_pci]
[ 6328.267939]  [] ne2k_pci_block_output+0x0/0x138 [ne2k_pci]
[ 6328.267947]  [] ei_start_xmit+0x29f/0x2b9 [8390]
[ 6328.267966]  [] dev_hard_start_xmit+0x19e/0x1fd
[ 6328.267976]  [] __qdisc_run+0x6c/0x151
[ 6328.267986]  [] dev_queue_xmit+0x12a/0x271
[ 6328.267991]  [] arp_send+0x4c/0x64
[ 6328.268002]  [] arp_xmit+0x4d/0x51
[ 6328.268009]  [] arp_process+0x29b/0x50a
[ 6328.268018]  [] ip_forward+0x22c/0x23a
[ 6328.268026]  [] ip_rcv_finish+0x0/0x282
[ 6328.268032]  [] ip_rcv+0x479/0x4a8
[ 6328.268037]  [] ip_rcv_finish+0x0/0x282
[ 6328.268044]  [] arp_rcv+0xf0/0x104
[ 6328.268049]  [] hrtimer_run_queues+0x12/0x1a1
[ 6328.268055]  [] arp_rcv+0x0/0x104
[ 6328.268061]  [] netif_receive_skb+0x2d9/0x317
[ 6328.268068]  [] process_backlog+0x6d/0xd2
[ 6328.268075]  [] net_rx_action+0x81/0x14d
[ 6328.268082]  [] __do_softirq+0x35/0x75
[ 6328.268088]  [] do_softirq+0x3e/0x8d
[ 6328.268098]  [] handle_fasteoi_irq+0x0/0xbe
[ 6328.268103]  [] irq_exit+0x25/0x30
[ 6328.268107]  [] do_IRQ+0x94/0xad
[ 6328.268114]  [] common_interrupt+0x23/0x28
[ 6328.268123]  [] default_idle+0x27/0x39
[ 6328.268128]  [] cpu_idle+0x41/0x55
[ 6328.268133]  [] start_kernel+0x20e/0x213
[ 6328.268140]  [] unknown_bootoption+0x0/0x196
[ 6328.268146]  ===

# cat /proc/interrupts
   CPU0
  0:9804926   IO-APIC-edge  timer
  1:   8270   IO-APIC-edge  i8042
  9:  0   IO-APIC-fasteoi   acpi
 12: 181281   IO-APIC-edge  i8042
 16:  40410   IO-APIC-fasteoi   sata_sil, eth1
 17:  0   IO-APIC-fasteoi   ohci_hcd:usb1, NVidia nForce2
 18: 499428   IO-APIC-fasteoi   ohci_hcd:usb2
 19:  2   IO-APIC-fasteoi   ehci_hcd:usb3
 20: 689588   IO-APIC-fasteoi   eth0
NMI:  0
LOC:9805098
ERR:  0
MIS:  0

So the card is sharing an IRQ with the disk controller.

No idea if the network card died after this warning, for all
practical purposes it didn't work before, so couldn't check.

I can provide more information and run some tests if anyone
wants that, just keep in mind that the card isn't connected to
anything.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


2.6.23-rc2: WARNING at kernel/irq/resend.c:70 check_irq_resend()

2007-08-08 Thread Indan Zupancic
Hi,

Just added an old network card, RTL-8029(AS), ne2k-pci driver, and tried to
expand the network (failed because I didn't use a cross-over cable).

The code snippet that spat the thing:

/*
 * IRQ resend
 *
 * Is called with interrupts disabled and desc-lock held.
 */
void check_irq_resend(struct irq_desc *desc, unsigned int irq)
{
unsigned int status = desc-status;

/*
 * Make sure the interrupt is enabled, before resending it:
 */
desc-chip-enable(irq);

/*
 * Temporary hack to figure out more about the problem, which
 * is causing the ancient network cards to die.
 */
if (desc-handle_irq != handle_edge_irq) {
WARN_ON_ONCE(1);
return;
}

and relevant dmesg snippet:

[  446.836399] eth1: RealTek RTL-8029 found at 0x9000, IRQ 16, 
00:00:B4:BB:BF:E5.
[  855.458468] r8169: eth0: link up
[  857.500667] r8169: eth0: link up
[  902.294283] r8169: eth0: link up
[ 4126.348427] r8169: eth0: link up
[ 5598.446233] r8169: eth0: link down
[ 5618.904846] r8169: eth0: link up
[ 5623.133090] r8169: eth0: link down
[ 5624.708071] r8169: eth0: link up
[ 6328.267872] WARNING: at /home/indan/src/git/linux-2.6/kernel/irq/resend.c:70 
check_irq_resend()
[ 6328.267896]  [b0135d8f] check_irq_resend+0x4f/0x8c
[ 6328.267912]  [b0135888] enable_irq+0x84/0xaa
[ 6328.267918]  [c08e37cc] ne2k_pci_block_output+0x0/0x138 [ne2k_pci]
[ 6328.267931]  [c08e37cc] ne2k_pci_block_output+0x0/0x138 [ne2k_pci]
[ 6328.267939]  [c08e37cc] ne2k_pci_block_output+0x0/0x138 [ne2k_pci]
[ 6328.267947]  [c08e0f8a] ei_start_xmit+0x29f/0x2b9 [8390]
[ 6328.267966]  [b022a53e] dev_hard_start_xmit+0x19e/0x1fd
[ 6328.267976]  [b023620a] __qdisc_run+0x6c/0x151
[ 6328.267986]  [b022c459] dev_queue_xmit+0x12a/0x271
[ 6328.267991]  [b0260ec9] arp_send+0x4c/0x64
[ 6328.268002]  [b0260524] arp_xmit+0x4d/0x51
[ 6328.268009]  [b026117c] arp_process+0x29b/0x50a
[ 6328.268018]  [b0244f0a] ip_forward+0x22c/0x23a
[ 6328.268026]  [b02434a8] ip_rcv_finish+0x0/0x282
[ 6328.268032]  [b0243db9] ip_rcv+0x479/0x4a8
[ 6328.268037]  [b02434a8] ip_rcv_finish+0x0/0x282
[ 6328.268044]  [b02614db] arp_rcv+0xf0/0x104
[ 6328.268049]  [b012ae17] hrtimer_run_queues+0x12/0x1a1
[ 6328.268055]  [b02613eb] arp_rcv+0x0/0x104
[ 6328.268061]  [b022a07d] netif_receive_skb+0x2d9/0x317
[ 6328.268068]  [b022baa0] process_backlog+0x6d/0xd2
[ 6328.268075]  [b022c1c2] net_rx_action+0x81/0x14d
[ 6328.268082]  [b011dd84] __do_softirq+0x35/0x75
[ 6328.268088]  [b0106030] do_softirq+0x3e/0x8d
[ 6328.268098]  [b0136530] handle_fasteoi_irq+0x0/0xbe
[ 6328.268103]  [b011dd44] irq_exit+0x25/0x30
[ 6328.268107]  [b010630f] do_IRQ+0x94/0xad
[ 6328.268114]  [b0104763] common_interrupt+0x23/0x28
[ 6328.268123]  [b0102a0c] default_idle+0x27/0x39
[ 6328.268128]  [b0102347] cpu_idle+0x41/0x55
[ 6328.268133]  [b032f99b] start_kernel+0x20e/0x213
[ 6328.268140]  [b032f317] unknown_bootoption+0x0/0x196
[ 6328.268146]  ===

# cat /proc/interrupts
   CPU0
  0:9804926   IO-APIC-edge  timer
  1:   8270   IO-APIC-edge  i8042
  9:  0   IO-APIC-fasteoi   acpi
 12: 181281   IO-APIC-edge  i8042
 16:  40410   IO-APIC-fasteoi   sata_sil, eth1
 17:  0   IO-APIC-fasteoi   ohci_hcd:usb1, NVidia nForce2
 18: 499428   IO-APIC-fasteoi   ohci_hcd:usb2
 19:  2   IO-APIC-fasteoi   ehci_hcd:usb3
 20: 689588   IO-APIC-fasteoi   eth0
NMI:  0
LOC:9805098
ERR:  0
MIS:  0

So the card is sharing an IRQ with the disk controller.

No idea if the network card died after this warning, for all
practical purposes it didn't work before, so couldn't check.

I can provide more information and run some tests if anyone
wants that, just keep in mind that the card isn't connected to
anything.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-07 Thread Indan Zupancic
On Tue, August 7, 2007 08:05, Arjan van de Ven wrote:
> On Tue, 2007-08-07 at 00:40 -0400, Kyle Moffett wrote:
>> On Aug 01, 2007, at 11:06:00, Indan Zupancic wrote:
>> > Different programs, different keys "stuck", so hard to tell. The
>> > amount the pressed key is repeated differs too, from double to
>> > hundreds.
>> >
>> > As an apparently separate problem my mouse warps once in a while
>> > too, and that's not totally fixed yet, neither with that locking
>> > patch. Throw in that two of my USB ports stopped working after a
>> > rare power outage, I didn't totally trust my hardware any more,
>> > hence my initial reluctance to report these problems.
>> >
>> > We could have independent problems with more or less the same
>> > symptoms, at least Teresa's problem seems much worse than ours, and
>> > if you and Ingo only experience a stuck delete key, it might be
>> > something else.
>
>
> the last time I saw something like this it was a BIOS that kept doing
> USB->PS/2 emulation even when the kernel was running.. that caused great
> havoc somehow...
>
> those who are seeing this... is it all on a USB keyboard? If so, is the
> bios set to do PS/2 emulation (sometimes called USB legacy emulation) ??

No, using PS/2 keyboard and mouse here, and made sure to disable USB
legacy emulation in the BIOS ages ago.

But for me the problems are gone now, since 2.6.23-rc1 (I don't think it was
fixed by CFS though, just in the same timeframe).

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-07 Thread Indan Zupancic
On Tue, August 7, 2007 08:05, Arjan van de Ven wrote:
 On Tue, 2007-08-07 at 00:40 -0400, Kyle Moffett wrote:
 On Aug 01, 2007, at 11:06:00, Indan Zupancic wrote:
  Different programs, different keys stuck, so hard to tell. The
  amount the pressed key is repeated differs too, from double to
  hundreds.
 
  As an apparently separate problem my mouse warps once in a while
  too, and that's not totally fixed yet, neither with that locking
  patch. Throw in that two of my USB ports stopped working after a
  rare power outage, I didn't totally trust my hardware any more,
  hence my initial reluctance to report these problems.
 
  We could have independent problems with more or less the same
  symptoms, at least Teresa's problem seems much worse than ours, and
  if you and Ingo only experience a stuck delete key, it might be
  something else.


 the last time I saw something like this it was a BIOS that kept doing
 USB-PS/2 emulation even when the kernel was running.. that caused great
 havoc somehow...

 those who are seeing this... is it all on a USB keyboard? If so, is the
 bios set to do PS/2 emulation (sometimes called USB legacy emulation) ??

No, using PS/2 keyboard and mouse here, and made sure to disable USB
legacy emulation in the BIOS ages ago.

But for me the problems are gone now, since 2.6.23-rc1 (I don't think it was
fixed by CFS though, just in the same timeframe).

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


OT: enabling Xcomposite [was: Linux Kernel cfs scheduler and xorg kbd]

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 19:46, Rene Herman wrote:
> On 08/06/2007 05:19 PM, Indan Zupancic wrote:
>
> [ Yes, OT, I'll shelve it after this ]

Me too, and I wonder if we should've dropped a bunch of CCs...

>
>> Anyway, if you want to experience a shock, try enabling xcomposite and
>> run xcompmgr -a or something. It's so wonderfully smooth (even though
>> some rendering seems slower than before, it's really worth it for me).
>
> Don't believe I can -- using a Matrox Millenium G550 (driver mga) and
> nothing I do seems to have an effect.

Well, as far as I know adding the following to your xorg.conf should be enough:

Section "Extensions"
Option  "Composite" "enable"
EndSection

and then to enable it at runtime start a composite manager.

Reading the mga manpage it seems you need to add

Option "AccelMethod" "EXA"

to the driver section. (Though it says EXA is possibly unstable.)

Run "xdpyinfo | grep Composite" to see if it works or not.

But as you said, getting off-topic, so if the above doesn't work sent a
private mail, or go to the xorg mailinglist.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.23-rc2

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 19:16, H. Peter Anvin wrote:
> Indan Zupancic wrote:
>>
>> Does executing unicode_start, unicode_stop fixes it?
>>
>
> Does that seem likely, since his display isn't even in text mode?

No idea, just thought it would be worth a try. My console is garbled
too, doing that unicode stuff fixed it for me. (Though it appears to
work better nowaday: new text is fine and other than the first consoles
are fine too, which wasn't the case a while ago I think.)

I thought he was using a VGA console, wouldn't that be in text mode?

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.23-rc2

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 18:06, Jeff Chua wrote:
> On 8/6/07, Jan Engelhardt <[EMAIL PROTECTED]> wrote:
>
>> >���
>> That looks pretty much like all-one-bits (� = 255).
>> What about vcsa1 (use hexdump -C)?
>
> # hexdump -C /dev/vcsa1
>   19 50 00 18 ff ff ff ff  ff ff ff ff ff ff ff ff  |.P..|
> 0010  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ||
> *
> 0fa0  ff ff ff ff   ||
> 0fa4
>
> Yep, it's all one's.
>
> And once "s2ram" (without parameter) is executed, there's no way to
> restore back the console in text-mode even using "s2ram --force
> --acpi_sleep 1 --vbe_mode" until a "suspend-to-disk" or a reboot.

Does executing unicode_start, unicode_stop fixes it?

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 15:10, Rene Herman wrote:
> On 08/06/2007 09:12 AM, Ingo Molnar wrote:
>
>> * Indan Zupancic <[EMAIL PROTECTED]> wrote:
>>
>>> All right, how would you debug it? Give us some insight in how to
>>> solve hard to trigger, happens at most only a few times a day,
>>> annoying input bug? I thought the mouse warping was fixed after 23-rc1
>>> and that input locking patch, but alas, the third day it happened
>>> again.
>>
>> i've got no idea how to debug such input bugs best, but, as a starting
>> point, i've Cc:-ed the current maintainer of the input subsystem :-)
>
> FWIW, I haven't experienced my "stuck delete" key anymore -- "since using
> CFS v19.1", but that might very well just be coincedence. If anyone
> wants/needs me to, I'll try to debug it, but for now I seem to be fine again.

No stuck keys and no warping mouse seen yet since last time (currently running
2.6.23-rc2), so both problems are, for all practical purposes, fixed.

> Now all I need to know is whether or not moving thunderbird's windows around
> is expected to leave such an enormous non-repainting visual trail on the
> screen...
>
> I believe I'm concluding that I'm not all together fond of the "new" modular
> X.org.

I don't think it's fair to blame Xorg for that, except if changing back to an 
older
version fixes it. If Thunderbird is anything comparable to Firefox, then it's 
most
likely the generic slowdown of that program that might have made it worse. If 
it's
Xorg then it's probably something driver/settings related I guess, best to ask 
on
their mailinglist for that.

Anyway, if you want to experience a shock, try enabling xcomposite and run
xcompmgr -a or something. It's so wonderfully smooth (even though some
rendering seems slower than before, it's really worth it for me).

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 15:10, Rene Herman wrote:
 On 08/06/2007 09:12 AM, Ingo Molnar wrote:

 * Indan Zupancic [EMAIL PROTECTED] wrote:

 All right, how would you debug it? Give us some insight in how to
 solve hard to trigger, happens at most only a few times a day,
 annoying input bug? I thought the mouse warping was fixed after 23-rc1
 and that input locking patch, but alas, the third day it happened
 again.

 i've got no idea how to debug such input bugs best, but, as a starting
 point, i've Cc:-ed the current maintainer of the input subsystem :-)

 FWIW, I haven't experienced my stuck delete key anymore -- since using
 CFS v19.1, but that might very well just be coincedence. If anyone
 wants/needs me to, I'll try to debug it, but for now I seem to be fine again.

No stuck keys and no warping mouse seen yet since last time (currently running
2.6.23-rc2), so both problems are, for all practical purposes, fixed.

 Now all I need to know is whether or not moving thunderbird's windows around
 is expected to leave such an enormous non-repainting visual trail on the
 screen...

 I believe I'm concluding that I'm not all together fond of the new modular
 X.org.

I don't think it's fair to blame Xorg for that, except if changing back to an 
older
version fixes it. If Thunderbird is anything comparable to Firefox, then it's 
most
likely the generic slowdown of that program that might have made it worse. If 
it's
Xorg then it's probably something driver/settings related I guess, best to ask 
on
their mailinglist for that.

Anyway, if you want to experience a shock, try enabling xcomposite and run
xcompmgr -a or something. It's so wonderfully smooth (even though some
rendering seems slower than before, it's really worth it for me).

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.23-rc2

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 18:06, Jeff Chua wrote:
 On 8/6/07, Jan Engelhardt [EMAIL PROTECTED] wrote:

 ���
 That looks pretty much like all-one-bits (� = 255).
 What about vcsa1 (use hexdump -C)?

 # hexdump -C /dev/vcsa1
   19 50 00 18 ff ff ff ff  ff ff ff ff ff ff ff ff  |.P..|
 0010  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ||
 *
 0fa0  ff ff ff ff   ||
 0fa4

 Yep, it's all one's.

 And once s2ram (without parameter) is executed, there's no way to
 restore back the console in text-mode even using s2ram --force
 --acpi_sleep 1 --vbe_mode until a suspend-to-disk or a reboot.

Does executing unicode_start, unicode_stop fixes it?

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux 2.6.23-rc2

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 19:16, H. Peter Anvin wrote:
 Indan Zupancic wrote:

 Does executing unicode_start, unicode_stop fixes it?


 Does that seem likely, since his display isn't even in text mode?

No idea, just thought it would be worth a try. My console is garbled
too, doing that unicode stuff fixed it for me. (Though it appears to
work better nowaday: new text is fine and other than the first consoles
are fine too, which wasn't the case a while ago I think.)

I thought he was using a VGA console, wouldn't that be in text mode?

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


OT: enabling Xcomposite [was: Linux Kernel cfs scheduler and xorg kbd]

2007-08-06 Thread Indan Zupancic
On Mon, August 6, 2007 19:46, Rene Herman wrote:
 On 08/06/2007 05:19 PM, Indan Zupancic wrote:

 [ Yes, OT, I'll shelve it after this ]

Me too, and I wonder if we should've dropped a bunch of CCs...


 Anyway, if you want to experience a shock, try enabling xcomposite and
 run xcompmgr -a or something. It's so wonderfully smooth (even though
 some rendering seems slower than before, it's really worth it for me).

 Don't believe I can -- using a Matrox Millenium G550 (driver mga) and
 nothing I do seems to have an effect.

Well, as far as I know adding the following to your xorg.conf should be enough:

Section Extensions
Option  Composite enable
EndSection

and then to enable it at runtime start a composite manager.

Reading the mga manpage it seems you need to add

Option AccelMethod EXA

to the driver section. (Though it says EXA is possibly unstable.)

Run xdpyinfo | grep Composite to see if it works or not.

But as you said, getting off-topic, so if the above doesn't work sent a
private mail, or go to the xorg mailinglist.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ck] Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-02 Thread Indan Zupancic
On Thu, August 2, 2007 14:22, Rene Herman wrote:
> On 08/02/2007 09:11 AM, S�bastien Dugu� wrote:
>
>> This reminds me I had something similar happening about a year or so ago
>> running with Debian's unstable xorg and gnome.
>>
>> The keys would stick for a short time (no, no coffee spilled on the
>> keyboard, I can assure you ;-), the most annoying one being the delete
>> key. Nothing deterministic, it could happen anytime (or not at all for
>> several days).
>>
>> As a final resort, I upgraded the xorg and gnome packages - problem
>> solved.
>
> Thank you. That "about a year" sounds possible; slackware has been one of
> the last to switch to modular xorg and until recently I was still running
> the old 6.9 series, so it could be xorg...

Well, I'm running xorg since 2004, and hadn't the sticking key problem
until a few months ago, half a year max. I didn't install a new xorg
version just before the problem went away, and I'm using Fluxbox.

Oh well, the problem's gone now, no matter how or why.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ck] Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-02 Thread Indan Zupancic
On Thu, August 2, 2007 14:22, Rene Herman wrote:
 On 08/02/2007 09:11 AM, S�bastien Dugu� wrote:

 This reminds me I had something similar happening about a year or so ago
 running with Debian's unstable xorg and gnome.

 The keys would stick for a short time (no, no coffee spilled on the
 keyboard, I can assure you ;-), the most annoying one being the delete
 key. Nothing deterministic, it could happen anytime (or not at all for
 several days).

 As a final resort, I upgraded the xorg and gnome packages - problem
 solved.

 Thank you. That about a year sounds possible; slackware has been one of
 the last to switch to modular xorg and until recently I was still running
 the old 6.9 series, so it could be xorg...

Well, I'm running xorg since 2004, and hadn't the sticking key problem
until a few months ago, half a year max. I didn't install a new xorg
version just before the problem went away, and I'm using Fluxbox.

Oh well, the problem's gone now, no matter how or why.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 17:09, Ingo Molnar wrote:
>
> * Indan Zupancic <[EMAIL PROTECTED]> wrote:
>
>> We could have independent problems with more or less the same
>> symptoms, at least Teresa's problem seems much worse than ours, and if
>> you and Ingo only experience a stuck delete key, it might be something
>> else.
>
> i experienced nothing in this area so far - if i experienced anything of
> the likes it would be long fixed! =B-)

Ah, indeed, you appended Rene's email without too many new lines,
so I thought your wrote it (http://lkml.org/lkml/2007/8/1/4).

All right, how would you debug it? Give us some insight in how to solve
hard to trigger, happens at most only a few times a day, annoying input
bug? I thought the mouse warping was fixed after 23-rc1 and that input
locking patch, but alas, the third day it happened again.

I would be happy just to know if it's bad hardware or a software bug,
I don't ask for too much, do I?

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 15:58, Rene Herman wrote:
> On 08/01/2007 03:33 PM, <:::.. TeresaII ..:::> wrote:
>
>> Problem is, that i can't reproduce it after last reboot.
>> Can this be different on each reboot ? Any idea ?
>>
>> About another kernel: i only had -ck kernels since last year or more, i
>> can remember i had same behavier atleast 6 month ago, or maybe more. But
>> it wasn't that hard and it dissapiered self after some time.
>
> Hmm.
>
>> Now on first run of 2.6.22.1-cfs-v19.1 i got it again. And realy bad. It
>> was even very hard to write report mail.
>
> I notice by the way that you are also using Thunderbird 2.0 -- that was my
> own suspect; I had just switched from Thunderbird 1.5.
>
>> I did reboot, day after that it was back, but not that hard. And now
>> rebooted yesterday and cant reproduce it at all.
>
> I believe that at least for now we can safely remove this from any list of
> possible CFS regresssions. I'll sit on it and see what I can find out
> when/if my stuck delete happens again.
>
> Indan -- under what circumstances/programs did you experience what?

Different programs, different keys "stuck", so hard to tell. The amount
the pressed key is repeated differs too, from double to hundreds.

As an apparently separate problem my mouse warps once in a while too,
and that's not totally fixed yet, neither with that locking patch. Throw in
that two of my USB ports stopped working after a rare power outage, I
didn't totally trust my hardware any more, hence my initial reluctance to
report these problems.

We could have independent problems with more or less the same symptoms,
at least Teresa's problem seems much worse than ours, and if you and Ingo
only experience a stuck delete key, it might be something else.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 14:50, Rene Herman wrote:
> On 08/01/2007 02:34 PM, Indan Zupancic wrote:
>
>> On Wed, August 1, 2007 13:01, Rene Herman wrote:
>
>>> Teresa was already using 2.6.22.1, with CFS (v19.1) patched in, so reverting
>>> that would be a matter of patching it out again. She said she wasn't seeing
>>> trouble on other kernels though.
>>>
>>> I was the one who also saw keyboard trouble on 2.6.22.1 without CFS by the
>>> way -- and haven't (yet) seen any anymore since patching _in_ CFS v19.1...
>>
>> Could the people who had or have keyboard problems try out Dmitry's
>> input locking patches, and see if those help: 
>> http://lkml.org/lkml/2007/7/24/17
>> And if it does, report it to him, either there or here after CCing him.
>
> I'm having trouble reproducing this at will -- it sounded as though Teresa
> had less trouble at least originally. Teresa?

Same here, so testing is a pain. I was more thinking about running a patched
version for a day or longer and see if it happened or not. For me it happened
irregularly, but definitely a few times a day.

(It might help if people report what they're using. So, for what it's worth, I'm
using a PS/2 keyboard with xorg 1.2.0 with the "keyboard" driver.)

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 13:01, Rene Herman wrote:
> On 08/01/2007 01:31 PM, Andi Kleen wrote:
>
>> Ingo Molnar <[EMAIL PROTECTED]> writes:
>
>>> another reaction below in this thread reported kbd problems in vanilla
>>> 2.6.22.1 as well. What is the X versions, etc.? Does the problem go away
>>> if the X kbd driver selection is tweaked to a simpler model, say:
>>
>> Or perhaps just test with CFS reverted? It really sounds more like something
>> independent from the scheduler. Perhaps someone could put a 
>> CFS-revert-for-testing patch
>> on ftp somewhere. It's only a few git changesets so shouldn't be too 
>> difficult to
>> do.
>
> Teresa was already using 2.6.22.1, with CFS (v19.1) patched in, so reverting
> that would be a matter of patching it out again. She said she wasn't seeing
> trouble on other kernels though.
>
> I was the one who also saw keyboard trouble on 2.6.22.1 without CFS by the
> way -- and haven't (yet) seen any anymore since patching _in_ CFS v19.1...

Could the people who had or have keyboard problems try out Dmitry's
input locking patches, and see if those help: http://lkml.org/lkml/2007/7/24/17
And if it does, report it to him, either there or here after CCing him.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 13:01, Rene Herman wrote:
 On 08/01/2007 01:31 PM, Andi Kleen wrote:

 Ingo Molnar [EMAIL PROTECTED] writes:

 another reaction below in this thread reported kbd problems in vanilla
 2.6.22.1 as well. What is the X versions, etc.? Does the problem go away
 if the X kbd driver selection is tweaked to a simpler model, say:

 Or perhaps just test with CFS reverted? It really sounds more like something
 independent from the scheduler. Perhaps someone could put a 
 CFS-revert-for-testing patch
 on ftp somewhere. It's only a few git changesets so shouldn't be too 
 difficult to
 do.

 Teresa was already using 2.6.22.1, with CFS (v19.1) patched in, so reverting
 that would be a matter of patching it out again. She said she wasn't seeing
 trouble on other kernels though.

 I was the one who also saw keyboard trouble on 2.6.22.1 without CFS by the
 way -- and haven't (yet) seen any anymore since patching _in_ CFS v19.1...

Could the people who had or have keyboard problems try out Dmitry's
input locking patches, and see if those help: http://lkml.org/lkml/2007/7/24/17
And if it does, report it to him, either there or here after CCing him.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 14:50, Rene Herman wrote:
 On 08/01/2007 02:34 PM, Indan Zupancic wrote:

 On Wed, August 1, 2007 13:01, Rene Herman wrote:

 Teresa was already using 2.6.22.1, with CFS (v19.1) patched in, so reverting
 that would be a matter of patching it out again. She said she wasn't seeing
 trouble on other kernels though.

 I was the one who also saw keyboard trouble on 2.6.22.1 without CFS by the
 way -- and haven't (yet) seen any anymore since patching _in_ CFS v19.1...

 Could the people who had or have keyboard problems try out Dmitry's
 input locking patches, and see if those help: 
 http://lkml.org/lkml/2007/7/24/17
 And if it does, report it to him, either there or here after CCing him.

 I'm having trouble reproducing this at will -- it sounded as though Teresa
 had less trouble at least originally. Teresa?

Same here, so testing is a pain. I was more thinking about running a patched
version for a day or longer and see if it happened or not. For me it happened
irregularly, but definitely a few times a day.

(It might help if people report what they're using. So, for what it's worth, I'm
using a PS/2 keyboard with xorg 1.2.0 with the keyboard driver.)

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 17:09, Ingo Molnar wrote:

 * Indan Zupancic [EMAIL PROTECTED] wrote:

 We could have independent problems with more or less the same
 symptoms, at least Teresa's problem seems much worse than ours, and if
 you and Ingo only experience a stuck delete key, it might be something
 else.

 i experienced nothing in this area so far - if i experienced anything of
 the likes it would be long fixed! =B-)

Ah, indeed, you appended Rene's email without too many new lines,
so I thought your wrote it (http://lkml.org/lkml/2007/8/1/4).

All right, how would you debug it? Give us some insight in how to solve
hard to trigger, happens at most only a few times a day, annoying input
bug? I thought the mouse warping was fixed after 23-rc1 and that input
locking patch, but alas, the third day it happened again.

I would be happy just to know if it's bad hardware or a software bug,
I don't ask for too much, do I?

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Linux Kernel cfs scheduler and xorg kbd

2007-08-01 Thread Indan Zupancic
On Wed, August 1, 2007 15:58, Rene Herman wrote:
 On 08/01/2007 03:33 PM, :::.. TeresaII ..::: wrote:

 Problem is, that i can't reproduce it after last reboot.
 Can this be different on each reboot ? Any idea ?

 About another kernel: i only had -ck kernels since last year or more, i
 can remember i had same behavier atleast 6 month ago, or maybe more. But
 it wasn't that hard and it dissapiered self after some time.

 Hmm.

 Now on first run of 2.6.22.1-cfs-v19.1 i got it again. And realy bad. It
 was even very hard to write report mail.

 I notice by the way that you are also using Thunderbird 2.0 -- that was my
 own suspect; I had just switched from Thunderbird 1.5.

 I did reboot, day after that it was back, but not that hard. And now
 rebooted yesterday and cant reproduce it at all.

 I believe that at least for now we can safely remove this from any list of
 possible CFS regresssions. I'll sit on it and see what I can find out
 when/if my stuck delete happens again.

 Indan -- under what circumstances/programs did you experience what?

Different programs, different keys stuck, so hard to tell. The amount
the pressed key is repeated differs too, from double to hundreds.

As an apparently separate problem my mouse warps once in a while too,
and that's not totally fixed yet, neither with that locking patch. Throw in
that two of my USB ports stopped working after a rare power outage, I
didn't totally trust my hardware any more, hence my initial reluctance to
report these problems.

We could have independent problems with more or less the same symptoms,
at least Teresa's problem seems much worse than ours, and if you and Ingo
only experience a stuck delete key, it might be something else.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/RFT 1/5] Input: implement proper locking in input core

2007-07-29 Thread Indan Zupancic
On Sun, July 29, 2007 05:50, Dmitry Torokhov wrote:
> Hi Indan,
>
> On Friday 27 July 2007 19:28, Indan Zupancic wrote:
>> Hi,
>>
>> Not real feedback, just some nitpicks.
>>
>> On Tue, July 24, 2007 06:45, Dmitry Torokhov wrote:
>> > +static int input_defuzz_abs_event(int value, int old_val, int fuzz)
>> > +{
>> > +  if (fuzz) {
>> > +  if (value > old_val - fuzz / 2 && value < old_val + fuzz / 2)
>> > +  return value;
>> >
>> > -  add_input_randomness(type, code, value);
>> > +  if (value > old_val - fuzz && value < old_val + fuzz)
>> > +  return (old_val * 3 + value) / 4;
>> >
>> > -  switch (type) {
>> > +  if (value > old_val - fuzz * 2 && value < old_val + fuzz * 2)
>> > +  return (old_val + value) / 2;
>> > +  }
>>
>> Shouldn't the return values of the second and third case be reversed?
>> In the 2nd check the new values is weighted for 1/4, while in the 3rd
>> case it counts for 1/2, which breaks the "account new value more when
>> it is closer to the old one" logic that I thought I saw here. So to sum up,
>> should the second return be "return (old_val + value * 3) / 4"?
>
> Thank you for bringing this up. Actually the 1st return valus should be
> "old_val", not value. The logic is to "gravitate towards old" when
> difference is small.
>
>>
>>
>> > +/*
>> > + * Generate software autorepeat event. Note that we take
>> > + * dev->event_lock here to avoid racing with input_event
>> > + * which may cause keys get "stuck".
>> > + */
>>
>> Hurray. :-)
>>
>> > -  if (code > SW_MAX || !test_bit(code, dev->swbit) || 
>> > !!test_bit(code, dev->sw) == value)
>> > -  return;
>> > +  if (dev->rep[REP_PERIOD])
>> > +  mod_timer(>timer, jiffies +
>> > +  msecs_to_jiffies(dev->rep[REP_PERIOD]));
>> > +  }
>>
>> Perhaps use a local var for the "msecs_to_jiffies(dev->rep[REP_PERIOD])" 
>> part.
>>
>
> What would be the benefit of doing so?

I was hoping it would make things more readable, and less staircase like.

>
>>
>> > +static void input_start_autorepeat(struct input_dev *dev, int code)
>> > +{
>> > +  if (test_bit(EV_REP, dev->evbit) &&
>> > +  dev->rep[REP_PERIOD] && dev->rep[REP_DELAY] &&
>> > +  dev->timer.data) {
>> > +  dev->repeat_key = code;
>> > +  mod_timer(>timer,
>> > +jiffies + msecs_to_jiffies(dev->rep[REP_DELAY]));
>> > +  }
>> > +}
>>
>> Same here.
>>
>>
>> > +  case EV_KEY:
>> > +  if (is_event_supported(code, dev->keybit, KEY_MAX) &&
>> > +  !!test_bit(code, dev->key) != value) {
>>
>> A bit confusing, test_bit(0 only returns 0 or 1 anyway, doesn't it?
>> So "test_bit(code, dev->key) != value" should be all right.
>> I noticed that the old code did it too, but still.
>
> Is it guaranteed? I only expect it to return 0/non-0 values, not necessarily
> 0 and 1.

Not sure, but it seems so. All the test_bit implementations I checked
returned either 0 or 1. No idea where to get that guarantee though.

>
>>
>> > -  case EV_MSC:
>> > +  case EV_SW:
>> > +  if (is_event_supported(code, dev->swbit, SW_MAX) &&
>> > +  !!test_bit(code, dev->sw) != value) {
>>
>> Same.
>>
>> > -  break;
>> > +  case EV_LED:
>> > +  if (is_event_supported(code, dev->ledbit, LED_MAX) &&
>> > +  !!test_bit(code, dev->led) != value) {
>>
>> And here.
>>
>>
>> > +void input_inject_event(struct input_handle *handle,
>> > +  unsigned int type, unsigned int code, int value)
>> >  {
>> > -  struct input_dev *dev = (void *) data;
>> > +  struct input_dev *dev = handle->dev;
>> > +  struct input_handle *grab;
>> >
>> > -  if (!test_bit(dev->repeat_key, dev->key))
>> > -  return;
>> > +  if (is_event_supported(type, dev->evbit, EV_MAX)) {
>> > +  spin_lock_irq(>event_lock);
>> >
>>

Re: [RFC/RFT 0/5] Input locking patches

2007-07-29 Thread Indan Zupancic
On Sun, July 29, 2007 05:38, Dmitry Torokhov wrote:
> Hi Indan,
>
> On Friday 27 July 2007 18:25, Indan Zupancic wrote:
>> Sorry for the babbling, just wanted to say that I've tested these
>> patches and that they seem to fix real problems.
>>
>
> Thank you for testing the patches.

Unfortunately I spoke too soon and some mouse warps occurred
yesterday. I suspect faulty hardware, so I'll try with my old
non-optical mouse. If it still happens then it's either the mobo
or the kernel. In the meantime I'll plug this one in another pc
and see if the warps happen there. After that it's running old
kernels to see if the warps happen there too. Any other ideas
to check whether it's a software or hardware problem?

The more annoying sticking key problem is gone for good though.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/RFT 0/5] Input locking patches

2007-07-29 Thread Indan Zupancic
On Sun, July 29, 2007 05:38, Dmitry Torokhov wrote:
 Hi Indan,

 On Friday 27 July 2007 18:25, Indan Zupancic wrote:
 Sorry for the babbling, just wanted to say that I've tested these
 patches and that they seem to fix real problems.


 Thank you for testing the patches.

Unfortunately I spoke too soon and some mouse warps occurred
yesterday. I suspect faulty hardware, so I'll try with my old
non-optical mouse. If it still happens then it's either the mobo
or the kernel. In the meantime I'll plug this one in another pc
and see if the warps happen there. After that it's running old
kernels to see if the warps happen there too. Any other ideas
to check whether it's a software or hardware problem?

The more annoying sticking key problem is gone for good though.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/RFT 1/5] Input: implement proper locking in input core

2007-07-29 Thread Indan Zupancic
On Sun, July 29, 2007 05:50, Dmitry Torokhov wrote:
 Hi Indan,

 On Friday 27 July 2007 19:28, Indan Zupancic wrote:
 Hi,

 Not real feedback, just some nitpicks.

 On Tue, July 24, 2007 06:45, Dmitry Torokhov wrote:
  +static int input_defuzz_abs_event(int value, int old_val, int fuzz)
  +{
  +  if (fuzz) {
  +  if (value  old_val - fuzz / 2  value  old_val + fuzz / 2)
  +  return value;
 
  -  add_input_randomness(type, code, value);
  +  if (value  old_val - fuzz  value  old_val + fuzz)
  +  return (old_val * 3 + value) / 4;
 
  -  switch (type) {
  +  if (value  old_val - fuzz * 2  value  old_val + fuzz * 2)
  +  return (old_val + value) / 2;
  +  }

 Shouldn't the return values of the second and third case be reversed?
 In the 2nd check the new values is weighted for 1/4, while in the 3rd
 case it counts for 1/2, which breaks the account new value more when
 it is closer to the old one logic that I thought I saw here. So to sum up,
 should the second return be return (old_val + value * 3) / 4?

 Thank you for bringing this up. Actually the 1st return valus should be
 old_val, not value. The logic is to gravitate towards old when
 difference is small.



  +/*
  + * Generate software autorepeat event. Note that we take
  + * dev-event_lock here to avoid racing with input_event
  + * which may cause keys get stuck.
  + */

 Hurray. :-)

  -  if (code  SW_MAX || !test_bit(code, dev-swbit) || 
  !!test_bit(code, dev-sw) == value)
  -  return;
  +  if (dev-rep[REP_PERIOD])
  +  mod_timer(dev-timer, jiffies +
  +  msecs_to_jiffies(dev-rep[REP_PERIOD]));
  +  }

 Perhaps use a local var for the msecs_to_jiffies(dev-rep[REP_PERIOD]) 
 part.


 What would be the benefit of doing so?

I was hoping it would make things more readable, and less staircase like.



  +static void input_start_autorepeat(struct input_dev *dev, int code)
  +{
  +  if (test_bit(EV_REP, dev-evbit) 
  +  dev-rep[REP_PERIOD]  dev-rep[REP_DELAY] 
  +  dev-timer.data) {
  +  dev-repeat_key = code;
  +  mod_timer(dev-timer,
  +jiffies + msecs_to_jiffies(dev-rep[REP_DELAY]));
  +  }
  +}

 Same here.


  +  case EV_KEY:
  +  if (is_event_supported(code, dev-keybit, KEY_MAX) 
  +  !!test_bit(code, dev-key) != value) {

 A bit confusing, test_bit(0 only returns 0 or 1 anyway, doesn't it?
 So test_bit(code, dev-key) != value should be all right.
 I noticed that the old code did it too, but still.

 Is it guaranteed? I only expect it to return 0/non-0 values, not necessarily
 0 and 1.

Not sure, but it seems so. All the test_bit implementations I checked
returned either 0 or 1. No idea where to get that guarantee though.



  -  case EV_MSC:
  +  case EV_SW:
  +  if (is_event_supported(code, dev-swbit, SW_MAX) 
  +  !!test_bit(code, dev-sw) != value) {

 Same.

  -  break;
  +  case EV_LED:
  +  if (is_event_supported(code, dev-ledbit, LED_MAX) 
  +  !!test_bit(code, dev-led) != value) {

 And here.


  +void input_inject_event(struct input_handle *handle,
  +  unsigned int type, unsigned int code, int value)
   {
  -  struct input_dev *dev = (void *) data;
  +  struct input_dev *dev = handle-dev;
  +  struct input_handle *grab;
 
  -  if (!test_bit(dev-repeat_key, dev-key))
  -  return;
  +  if (is_event_supported(type, dev-evbit, EV_MAX)) {
  +  spin_lock_irq(dev-event_lock);
 
  -  input_event(dev, EV_KEY, dev-repeat_key, 2);
  -  input_sync(dev);
  +  grab = rcu_dereference(dev-grab);
  +  if (!grab || grab == handle)
  +  input_handle_event(dev, type, code, value);

 'handle' can't be NULL, so can drop the !grab check, as checking
 grab == handle should be sufficient.


 It is or, not and. The idea is to pass the event if device is not
 grabbed by anyone _or_ if source of event is handle that grabbed the
 device.

Ah, oops.



  +/**
  + * input_open_device - open input device
  + * @handle: handle through which device is being accessed
  + *
  + * This function should be called by input handlers when they
  + * want to start receive events from given input device.
  + */
   int input_open_device(struct input_handle *handle)
   {
 struct input_dev *dev = handle-dev;
  -  int err;
  +  int retval;
 
  -  err = mutex_lock_interruptible(dev-mutex);
  -  if (err)
  -  return err;
  +  retval = mutex_lock_interruptible(dev-mutex);
  +  if (retval)
  +  return retval;
  +
  +  if (dev-going_away) {
  +  retval = -ENODEV;
  +  goto out;
  +  }
 
 handle-open++;
 
 if (!dev-users++  dev-open)

 Ugh, not your code, and perhaps it's me, but that looks weird.
 The ++ hidden inthe if check is ugly, and would mean that users
 can be negative, which

Re: swap-prefetch: A smart way to make good use of idle resources (was: updatedb)

2007-07-27 Thread Indan Zupancic
On Sat, July 28, 2007 01:34, grundig wrote:
> El Fri, 27 Jul 2007 15:06:14 -0700, Arjan van de Ven <[EMAIL PROTECTED]> 
> escribi�:
>
>> how do you know there will be other activity? You start the IO and that
>> basically blacks out the disk for 5 to 10 ms. If the "real" IO gets
>> submitted in that time you add latency. You cannot predict that IO
>> happening or not happening.
>
> If there hasn't be much IO for some time, it looks quite reasonable to expect
> that there won't be more in the near future.

Good argument.

> As most of heuristics can fail, but
> then this is a feature mostly for desktops, not servers.

Bad argument. It doesn't matter for who the feature is intended, it matter
what it does and if it does it well or not. In this case, prefetching swap 
without
disturbing anything else.

> There's an old saying that says something like "an open source project starts
> dying when new people can't participate in the project no matter how hard
> they try". It's hard to understand why there's so many people opposing to
> this when other more controversial features are merged much faster, (like, fe.
> the UIO driver framework).

Could people please stop this emotional crap non-argumentation? At best it 
reduces
the chance of swap-prefetch to be merged.

Perhaps one of the reasons is that this is core kernel code. And that it isn't 
a new
feature, but a performance improvement with doubtful trade-offs. The problem
statement isn't clear either. It seems like a natural enhancement, but is that 
enough
reason to merge it? Maybe, maybe not. But if slow swap-in is the problem, 
shouldn't
that be fixed instead of bypassed?

Yes, there are people that say that it works for them, but of those a lot claim
updatedb damage is fixed by it too, while that can't be true. And how many of 
those
people did test swap prefetch stand-alone? The ck kernel has other mm patches 
too,
perhaps those are the real goodies...

And there don't seem to be many people opposing swap prefetch either. A bunch 
seem
in favour of it, and others seem unconvinced.

Me, I don't know if it should be merged or not, it solves one very specific 
workload, and
nothing else (swap is used, and memory becomes free which won't be used in the 
near
future). All in all it seems good, but doubtful, and when in doubt, don't merge.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/RFT 1/5] Input: implement proper locking in input core

2007-07-27 Thread Indan Zupancic
Hi,

Not real feedback, just some nitpicks.

On Tue, July 24, 2007 06:45, Dmitry Torokhov wrote:
> +static int input_defuzz_abs_event(int value, int old_val, int fuzz)
> +{
> + if (fuzz) {
> + if (value > old_val - fuzz / 2 && value < old_val + fuzz / 2)
> + return value;
>
> - add_input_randomness(type, code, value);
> + if (value > old_val - fuzz && value < old_val + fuzz)
> + return (old_val * 3 + value) / 4;
>
> - switch (type) {
> + if (value > old_val - fuzz * 2 && value < old_val + fuzz * 2)
> + return (old_val + value) / 2;
> + }

Shouldn't the return values of the second and third case be reversed?
In the 2nd check the new values is weighted for 1/4, while in the 3rd
case it counts for 1/2, which breaks the "account new value more when
it is closer to the old one" logic that I thought I saw here. So to sum up,
should the second return be "return (old_val + value * 3) / 4"?


> +/*
> + * Generate software autorepeat event. Note that we take
> + * dev->event_lock here to avoid racing with input_event
> + * which may cause keys get "stuck".
> + */

Hurray. :-)

> - if (code > SW_MAX || !test_bit(code, dev->swbit) || 
> !!test_bit(code, dev->sw) == value)
> - return;
> + if (dev->rep[REP_PERIOD])
> + mod_timer(>timer, jiffies +
> + msecs_to_jiffies(dev->rep[REP_PERIOD]));
> + }

Perhaps use a local var for the "msecs_to_jiffies(dev->rep[REP_PERIOD])" part.


> +static void input_start_autorepeat(struct input_dev *dev, int code)
> +{
> + if (test_bit(EV_REP, dev->evbit) &&
> + dev->rep[REP_PERIOD] && dev->rep[REP_DELAY] &&
> + dev->timer.data) {
> + dev->repeat_key = code;
> + mod_timer(>timer,
> +   jiffies + msecs_to_jiffies(dev->rep[REP_DELAY]));
> + }
> +}

Same here.


> + case EV_KEY:
> + if (is_event_supported(code, dev->keybit, KEY_MAX) &&
> + !!test_bit(code, dev->key) != value) {

A bit confusing, test_bit(0 only returns 0 or 1 anyway, doesn't it?
So "test_bit(code, dev->key) != value" should be all right.
I noticed that the old code did it too, but still.

> - case EV_MSC:
> + case EV_SW:
> + if (is_event_supported(code, dev->swbit, SW_MAX) &&
> + !!test_bit(code, dev->sw) != value) {

Same.

> - break;
> + case EV_LED:
> + if (is_event_supported(code, dev->ledbit, LED_MAX) &&
> + !!test_bit(code, dev->led) != value) {

And here.


> +void input_inject_event(struct input_handle *handle,
> + unsigned int type, unsigned int code, int value)
>  {
> - struct input_dev *dev = (void *) data;
> + struct input_dev *dev = handle->dev;
> + struct input_handle *grab;
>
> - if (!test_bit(dev->repeat_key, dev->key))
> - return;
> + if (is_event_supported(type, dev->evbit, EV_MAX)) {
> + spin_lock_irq(>event_lock);
>
> - input_event(dev, EV_KEY, dev->repeat_key, 2);
> - input_sync(dev);
> + grab = rcu_dereference(dev->grab);
> + if (!grab || grab == handle)
> + input_handle_event(dev, type, code, value);

'handle' can't be NULL, so can drop the "!grab" check, as checking
"grab == handle" should be sufficient.


> +/**
> + * input_open_device - open input device
> + * @handle: handle through which device is being accessed
> + *
> + * This function should be called by input handlers when they
> + * want to start receive events from given input device.
> + */
>  int input_open_device(struct input_handle *handle)
>  {
>   struct input_dev *dev = handle->dev;
> - int err;
> + int retval;
>
> - err = mutex_lock_interruptible(>mutex);
> - if (err)
> - return err;
> + retval = mutex_lock_interruptible(>mutex);
> + if (retval)
> + return retval;
> +
> + if (dev->going_away) {
> + retval = -ENODEV;
> + goto out;
> + }
>
>   handle->open++;
>
>   if (!dev->users++ && dev->open)

Ugh, not your code, and perhaps it's me, but that looks weird.
The ++ hidden inthe if check is ugly, and would mean that "users"
can be negative, which is strange.

> - err = dev->open(dev);
> + retval = dev->open(dev);
>
> - if (err)
> - handle->open--;
> + if (retval && !--handle->open) {

Eek! That -- is hidden well there. Would it hurt to call synchronize_sched()
unconditionally? Something like:

if (retval) {
handle->open--;

It's a rare case anyway.

> + /*
> +  * Make sure we are not delivering any more events
> +  * through this handle
> +  */
> + synchronize_sched();
> + }
>

> 

Re: [RFC/RFT 0/5] Input locking patches

2007-07-27 Thread Indan Zupancic
On Tue, July 24, 2007 06:45, Dmitry Torokhov wrote:
> Hi everyone,
>
> I finally managed to put together some patches implementing
> locking in input core and main input handles. Please look
> over them and give them a spin.

Since kernel 2.6.21 or so I was annoyed by a warping mouse, and
one kernel version later also by "stuck" keys, causing repeated input
at the most inconvenient moments (e.g. when opening a program by
pressing F1).

As it happened irregularly and unpredictable it was hard to debug,
and I suspected faulty hardware. My cpu was quite hot, but after
removing all the dust it seems all right again. Unfortunately that
was about the same time I upgraded to 2.6.23-rc1, so all I can say
is that the stuck key problem seems to be gone, though not sure
thanks to what, but that neither the cleaning nor the upgrade fixed
the warping mouse problem.

I'm running with these locking patches for two days now and the
mouse doesn't warp any more (it can also have fixed the stuck key
problem, not sure). Normally it would warp several times a day,
and that didn't happen yet, so I'm tempted to praise your patches.

Sorry for the babbling, just wanted to say that I've tested these
patches and that they seem to fix real problems.

Thanks,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: swap-prefetch: A smart way to make good use of idle resources (was: updatedb)

2007-07-27 Thread Indan Zupancic
On Sat, July 28, 2007 00:06, Arjan van de Ven wrote:
> On Fri, 2007-07-27 at 23:51 +0200, Indan Zupanci
>> > also, they take up seek time (5 to 10 msec), so if you were to read
>> > something else at the time you get additional latency.
>>
>> If there's other disk activity swap prefetch shouldn't do much, so this isn't
>> really true.
>
> how do you know there will be other activity? You start the IO and that
> basically blacks out the disk for 5 to 10 ms. If the "real" IO gets
> submitted in that time you add latency. You cannot predict that IO
> happening or not happening.

Ah, in that way. Yes, you right about that (though NCQ might help then?),
but that's true for all disk activity. Though I think swap prefetch didn't want
to run when there was CPU activity, so that would reduce the chance that
new IO is submitted right at that moment. I think in practice this isn't worth
worrying about, the real issue is the extra disk activity in the first place.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: swap-prefetch: A smart way to make good use of idle resources (was: updatedb)

2007-07-27 Thread Indan Zupancic
On Fri, July 27, 2007 22:34, Arjan van de Ven wrote:
> On Fri, July 27, 2007 21:43, Al Boldi wrote:
>> IMHO, what everybody agrees on, is that swap-prefetch has a positive effect
>> in some cases, and nobody can prove an adverse effect (excluding power
>> consumption).  The reason for this positive effect is also crystal clear:
>> It prefetches from swap on idle into free memory, ie: it doesn't force
>
> the fact that there is free memory is ... strange. IN principle, Linux
> keeps almost no memory free (except some emergency buffers) so that
> things you swap in prematurely will BY DEFINITION go at the expense of
> other things that could be there

It's not strange, the use case here is if something memory hungry process
is shut down it leaves behind a lot of free memory. Having things swapped
out while there's free memory is strange, so swap prefetch fills it up again.

> also, they take up seek time (5 to 10 msec), so if you were to read
> something else at the time you get additional latency.

If there's other disk activity swap prefetch shouldn't do much, so this isn't
really true.

>> Conclusion:  Either prove swap-prefetch is broken, or get this merged quick.

There are a whole lot of other requirements too than that it isn't broken (of
which most are fulfilled, but anyway). One reason could be that there's a
better solution out there for the problem swap prefetch tries to solve. That
said, as swap prefetch is here now for a while and that other solution not it's
not such a great argument.

Personally I think that a more generic solution would be better, one that
prefetches the lastly evicted pages back in, not favouring either of swap or
file data, like swap prefetch does now.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: swap-prefetch: A smart way to make good use of idle resources (was: updatedb)

2007-07-27 Thread Indan Zupancic
On Fri, July 27, 2007 22:34, Arjan van de Ven wrote:
 On Fri, July 27, 2007 21:43, Al Boldi wrote:
 IMHO, what everybody agrees on, is that swap-prefetch has a positive effect
 in some cases, and nobody can prove an adverse effect (excluding power
 consumption).  The reason for this positive effect is also crystal clear:
 It prefetches from swap on idle into free memory, ie: it doesn't force

 the fact that there is free memory is ... strange. IN principle, Linux
 keeps almost no memory free (except some emergency buffers) so that
 things you swap in prematurely will BY DEFINITION go at the expense of
 other things that could be there

It's not strange, the use case here is if something memory hungry process
is shut down it leaves behind a lot of free memory. Having things swapped
out while there's free memory is strange, so swap prefetch fills it up again.

 also, they take up seek time (5 to 10 msec), so if you were to read
 something else at the time you get additional latency.

If there's other disk activity swap prefetch shouldn't do much, so this isn't
really true.

 Conclusion:  Either prove swap-prefetch is broken, or get this merged quick.

There are a whole lot of other requirements too than that it isn't broken (of
which most are fulfilled, but anyway). One reason could be that there's a
better solution out there for the problem swap prefetch tries to solve. That
said, as swap prefetch is here now for a while and that other solution not it's
not such a great argument.

Personally I think that a more generic solution would be better, one that
prefetches the lastly evicted pages back in, not favouring either of swap or
file data, like swap prefetch does now.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: swap-prefetch: A smart way to make good use of idle resources (was: updatedb)

2007-07-27 Thread Indan Zupancic
On Sat, July 28, 2007 00:06, Arjan van de Ven wrote:
 On Fri, 2007-07-27 at 23:51 +0200, Indan Zupanci
  also, they take up seek time (5 to 10 msec), so if you were to read
  something else at the time you get additional latency.

 If there's other disk activity swap prefetch shouldn't do much, so this isn't
 really true.

 how do you know there will be other activity? You start the IO and that
 basically blacks out the disk for 5 to 10 ms. If the real IO gets
 submitted in that time you add latency. You cannot predict that IO
 happening or not happening.

Ah, in that way. Yes, you right about that (though NCQ might help then?),
but that's true for all disk activity. Though I think swap prefetch didn't want
to run when there was CPU activity, so that would reduce the chance that
new IO is submitted right at that moment. I think in practice this isn't worth
worrying about, the real issue is the extra disk activity in the first place.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/RFT 0/5] Input locking patches

2007-07-27 Thread Indan Zupancic
On Tue, July 24, 2007 06:45, Dmitry Torokhov wrote:
 Hi everyone,

 I finally managed to put together some patches implementing
 locking in input core and main input handles. Please look
 over them and give them a spin.

Since kernel 2.6.21 or so I was annoyed by a warping mouse, and
one kernel version later also by stuck keys, causing repeated input
at the most inconvenient moments (e.g. when opening a program by
pressing F1).

As it happened irregularly and unpredictable it was hard to debug,
and I suspected faulty hardware. My cpu was quite hot, but after
removing all the dust it seems all right again. Unfortunately that
was about the same time I upgraded to 2.6.23-rc1, so all I can say
is that the stuck key problem seems to be gone, though not sure
thanks to what, but that neither the cleaning nor the upgrade fixed
the warping mouse problem.

I'm running with these locking patches for two days now and the
mouse doesn't warp any more (it can also have fixed the stuck key
problem, not sure). Normally it would warp several times a day,
and that didn't happen yet, so I'm tempted to praise your patches.

Sorry for the babbling, just wanted to say that I've tested these
patches and that they seem to fix real problems.

Thanks,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC/RFT 1/5] Input: implement proper locking in input core

2007-07-27 Thread Indan Zupancic
Hi,

Not real feedback, just some nitpicks.

On Tue, July 24, 2007 06:45, Dmitry Torokhov wrote:
 +static int input_defuzz_abs_event(int value, int old_val, int fuzz)
 +{
 + if (fuzz) {
 + if (value  old_val - fuzz / 2  value  old_val + fuzz / 2)
 + return value;

 - add_input_randomness(type, code, value);
 + if (value  old_val - fuzz  value  old_val + fuzz)
 + return (old_val * 3 + value) / 4;

 - switch (type) {
 + if (value  old_val - fuzz * 2  value  old_val + fuzz * 2)
 + return (old_val + value) / 2;
 + }

Shouldn't the return values of the second and third case be reversed?
In the 2nd check the new values is weighted for 1/4, while in the 3rd
case it counts for 1/2, which breaks the account new value more when
it is closer to the old one logic that I thought I saw here. So to sum up,
should the second return be return (old_val + value * 3) / 4?


 +/*
 + * Generate software autorepeat event. Note that we take
 + * dev-event_lock here to avoid racing with input_event
 + * which may cause keys get stuck.
 + */

Hurray. :-)

 - if (code  SW_MAX || !test_bit(code, dev-swbit) || 
 !!test_bit(code, dev-sw) == value)
 - return;
 + if (dev-rep[REP_PERIOD])
 + mod_timer(dev-timer, jiffies +
 + msecs_to_jiffies(dev-rep[REP_PERIOD]));
 + }

Perhaps use a local var for the msecs_to_jiffies(dev-rep[REP_PERIOD]) part.


 +static void input_start_autorepeat(struct input_dev *dev, int code)
 +{
 + if (test_bit(EV_REP, dev-evbit) 
 + dev-rep[REP_PERIOD]  dev-rep[REP_DELAY] 
 + dev-timer.data) {
 + dev-repeat_key = code;
 + mod_timer(dev-timer,
 +   jiffies + msecs_to_jiffies(dev-rep[REP_DELAY]));
 + }
 +}

Same here.


 + case EV_KEY:
 + if (is_event_supported(code, dev-keybit, KEY_MAX) 
 + !!test_bit(code, dev-key) != value) {

A bit confusing, test_bit(0 only returns 0 or 1 anyway, doesn't it?
So test_bit(code, dev-key) != value should be all right.
I noticed that the old code did it too, but still.

 - case EV_MSC:
 + case EV_SW:
 + if (is_event_supported(code, dev-swbit, SW_MAX) 
 + !!test_bit(code, dev-sw) != value) {

Same.

 - break;
 + case EV_LED:
 + if (is_event_supported(code, dev-ledbit, LED_MAX) 
 + !!test_bit(code, dev-led) != value) {

And here.


 +void input_inject_event(struct input_handle *handle,
 + unsigned int type, unsigned int code, int value)
  {
 - struct input_dev *dev = (void *) data;
 + struct input_dev *dev = handle-dev;
 + struct input_handle *grab;

 - if (!test_bit(dev-repeat_key, dev-key))
 - return;
 + if (is_event_supported(type, dev-evbit, EV_MAX)) {
 + spin_lock_irq(dev-event_lock);

 - input_event(dev, EV_KEY, dev-repeat_key, 2);
 - input_sync(dev);
 + grab = rcu_dereference(dev-grab);
 + if (!grab || grab == handle)
 + input_handle_event(dev, type, code, value);

'handle' can't be NULL, so can drop the !grab check, as checking
grab == handle should be sufficient.


 +/**
 + * input_open_device - open input device
 + * @handle: handle through which device is being accessed
 + *
 + * This function should be called by input handlers when they
 + * want to start receive events from given input device.
 + */
  int input_open_device(struct input_handle *handle)
  {
   struct input_dev *dev = handle-dev;
 - int err;
 + int retval;

 - err = mutex_lock_interruptible(dev-mutex);
 - if (err)
 - return err;
 + retval = mutex_lock_interruptible(dev-mutex);
 + if (retval)
 + return retval;
 +
 + if (dev-going_away) {
 + retval = -ENODEV;
 + goto out;
 + }

   handle-open++;

   if (!dev-users++  dev-open)

Ugh, not your code, and perhaps it's me, but that looks weird.
The ++ hidden inthe if check is ugly, and would mean that users
can be negative, which is strange.

 - err = dev-open(dev);
 + retval = dev-open(dev);

 - if (err)
 - handle-open--;
 + if (retval  !--handle-open) {

Eek! That -- is hidden well there. Would it hurt to call synchronize_sched()
unconditionally? Something like:

if (retval) {
handle-open--;

It's a rare case anyway.

 + /*
 +  * Make sure we are not delivering any more events
 +  * through this handle
 +  */
 + synchronize_sched();
 + }


 +/**
 + * input_close_device - close input device
 + * @handle: handle through which device is being accessed
 + *
 + * This function should be called by input handlers 

Re: swap-prefetch: A smart way to make good use of idle resources (was: updatedb)

2007-07-27 Thread Indan Zupancic
On Sat, July 28, 2007 01:34, grundig wrote:
 El Fri, 27 Jul 2007 15:06:14 -0700, Arjan van de Ven [EMAIL PROTECTED] 
 escribi�:

 how do you know there will be other activity? You start the IO and that
 basically blacks out the disk for 5 to 10 ms. If the real IO gets
 submitted in that time you add latency. You cannot predict that IO
 happening or not happening.

 If there hasn't be much IO for some time, it looks quite reasonable to expect
 that there won't be more in the near future.

Good argument.

 As most of heuristics can fail, but
 then this is a feature mostly for desktops, not servers.

Bad argument. It doesn't matter for who the feature is intended, it matter
what it does and if it does it well or not. In this case, prefetching swap 
without
disturbing anything else.

 There's an old saying that says something like an open source project starts
 dying when new people can't participate in the project no matter how hard
 they try. It's hard to understand why there's so many people opposing to
 this when other more controversial features are merged much faster, (like, fe.
 the UIO driver framework).

Could people please stop this emotional crap non-argumentation? At best it 
reduces
the chance of swap-prefetch to be merged.

Perhaps one of the reasons is that this is core kernel code. And that it isn't 
a new
feature, but a performance improvement with doubtful trade-offs. The problem
statement isn't clear either. It seems like a natural enhancement, but is that 
enough
reason to merge it? Maybe, maybe not. But if slow swap-in is the problem, 
shouldn't
that be fixed instead of bypassed?

Yes, there are people that say that it works for them, but of those a lot claim
updatedb damage is fixed by it too, while that can't be true. And how many of 
those
people did test swap prefetch stand-alone? The ck kernel has other mm patches 
too,
perhaps those are the real goodies...

And there don't seem to be many people opposing swap prefetch either. A bunch 
seem
in favour of it, and others seem unconvinced.

Me, I don't know if it should be merged or not, it solves one very specific 
workload, and
nothing else (swap is used, and memory becomes free which won't be used in the 
near
future). All in all it seems good, but doubtful, and when in doubt, don't merge.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFH] Partition table recovery

2007-07-22 Thread Indan Zupancic
On Sun, July 22, 2007 18:28, Theodore Tso wrote:
> On Sun, Jul 22, 2007 at 07:10:31AM +0300, Al Boldi wrote:
>> Sounds great, but it may be advisable to hook this into the partition
>> modification routines instead of mkfs/fsck.  Which would mean that the
>> partition manager could ask the kernel to instruct its fs subsystem to
>> update the backup partition table for each known fs-type that supports such
>> a feature.
>
> Well, let's think about this a bit.  What are the requirements?
>
> 1) The partition manager should be able explicitly request that a new
> backup of the partition tables be stashed in each filesystem that has
> room for such a backup.  That way, when the user affirmatively makes a
> partition table change, it can get backed up in all of the right
> places automatically.
>
> 2) The fsck program should *only* stash a backup of the partition
> table if there currently isn't one in the filesystem.  It may be that
> the partition table has been corrupted, and so merely doing an fsck
> should not transfer a current copy of the partition table to the
> filesystem-secpfic backup area.  It could be that the partition table
> was only partially recovered, and we don't want to overwrite the
> previously existing backups except on an explicit request from the
> system administrator.
>
> 3) The mkfs program should automatically create a backup of the
> current partition table layout.  That way we get a backup in the newly
> created filesystem as soon as it is created.
>
> 4) The exact location of the backup may vary from filesystem to
> filesystem.  For ext2/3/4, bytes 512-1023 are always unused, and don't
> interfere with the boot sector at bytes 0-511, so that's the obvious
> location.  Other filesystems may have that location in use, and some
> other location might be a better place to store it.  Ideally it will
> be a well-known location, that isn't dependent on finding an inode
> table, or some such, but that may not be possible for all filesystems.

To be on the safe side, maybe also add a checksum, timestamp and
something identifying the disk the filesystem was created on.

Regards,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFH] Partition table recovery

2007-07-22 Thread Indan Zupancic
On Sun, July 22, 2007 18:28, Theodore Tso wrote:
 On Sun, Jul 22, 2007 at 07:10:31AM +0300, Al Boldi wrote:
 Sounds great, but it may be advisable to hook this into the partition
 modification routines instead of mkfs/fsck.  Which would mean that the
 partition manager could ask the kernel to instruct its fs subsystem to
 update the backup partition table for each known fs-type that supports such
 a feature.

 Well, let's think about this a bit.  What are the requirements?

 1) The partition manager should be able explicitly request that a new
 backup of the partition tables be stashed in each filesystem that has
 room for such a backup.  That way, when the user affirmatively makes a
 partition table change, it can get backed up in all of the right
 places automatically.

 2) The fsck program should *only* stash a backup of the partition
 table if there currently isn't one in the filesystem.  It may be that
 the partition table has been corrupted, and so merely doing an fsck
 should not transfer a current copy of the partition table to the
 filesystem-secpfic backup area.  It could be that the partition table
 was only partially recovered, and we don't want to overwrite the
 previously existing backups except on an explicit request from the
 system administrator.

 3) The mkfs program should automatically create a backup of the
 current partition table layout.  That way we get a backup in the newly
 created filesystem as soon as it is created.

 4) The exact location of the backup may vary from filesystem to
 filesystem.  For ext2/3/4, bytes 512-1023 are always unused, and don't
 interfere with the boot sector at bytes 0-511, so that's the obvious
 location.  Other filesystems may have that location in use, and some
 other location might be a better place to store it.  Ideally it will
 be a well-known location, that isn't dependent on finding an inode
 table, or some such, but that may not be possible for all filesystems.

To be on the safe side, maybe also add a checksum, timestamp and
something identifying the disk the filesystem was created on.

Regards,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: understanding firmware loader for speedtouch (kernel 2.6.21.5)

2007-07-09 Thread Indan Zupancic
On Mon, July 9, 2007 16:40, mikie wrote:
> 2007/7/9, Kay Sievers <[EMAIL PROTECTED]>:
>> On 7/9/07, mikie <[EMAIL PROTECTED]> wrote:
>> > 2007/7/9, Indan Zupancic <[EMAIL PROTECTED]>:
>> > > On Mon, July 9, 2007 10:49, mikie wrote:
>> > > > 2007/7/6, Indan Zupancic <[EMAIL PROTECTED]>:
>> > > >> On Fri, July 6, 2007 16:20, Duncan Sands wrote:
>> > > >> > On Friday 6 July 2007 14:54:18 mikie wrote:
>> > > >> >> Hi,
>> > > >> >>
>> > > >> >> I experience some problems with the speedtch.c module, especially 
>> > > >> >> in
>> > > >> >> regards to its firmware loader.
>> > > >> >> I am not quite sure if this module is going to load the firmware
>> > > >> >> itself or does it use some external software to do that ?
>> > > >> >
>> > > >> > It loads it itself, using some external software!
>> > > >>
>> > > >> For information, it generates a hotplug event and the kernel calls the
>> > > >> program written at /proc/sys/kernel/hotplug with certain information.
>> > > >> That used to be /sbin/hotplug, became later udev, but today in general
>> > > >> udev reads the uevents generated by the kernel.
>> > > >
>> > > > On my system the /proc/sys/kernel/hotplug points to /sbin/hotplug.
>> > > > I copied your script to /sbin/hotplug and also added simple logging,
>> > > > so I can see whenever the script is being started. It turns out that
>> > > > the script is not started at all by the kernel...
>> > > >
>> > > > I am afraid that the kernel generates uevents only, could this be true?
>> > >
>> > > Not really, it would break backward compatibility, and if it were true,
>> > > they're remove the /proc/sys/kernel/hotplug setting too. As far as I 
>> > > know,
>> > > if it's set, that program will be executed, and if zero is written to 
>> > > it, only
>> > > uevents are generated.
>> > >
>> > > Make sure that the script is executable (chmod +x)
>> >
>> > Yes I have set it to be executable:
>> > [EMAIL PROTECTED]:/sbin# ls -lah hotplug
>> > -rwxr-xr-x1 root root  934 Jul  9 09:13 hotplug
>> >
>> > > and has "#!/bin/sh"
>> > > at the top, or else it won't work. If that's already the case, I've no 
>> > > idea.
>> >
>> > [EMAIL PROTECTED]:/sbin# head hotplug
>> > #!/bin/sh
>> > set -e
>> >
>> >
>> > Everything looks OK, but still it does not fire up the script...
>>
>> It should run, if the kernel creates events. Can you run "udevmonitor
>> --env" while loading the driver? That way we can see if events are
>> generated by the kernel (if you don't have it, no need to install
>> udev, just download the udev tarball, type "make" and run
>> "./udevmonitor --env").
>
> It seems that uevents are generated:

What if you run a very simple script like

#!/bin/sh
echo "test" >> /tmp/test

is it called at all?

If it is then it's probably something with the other script. Try removing
the set -e and/or run it manually with the right parameters set and see if
anything goes wrong. Or use a minimal udev and let it do the hotplugging,
having only rules for speedtouch and firmware loading (attached, but
make sure the paths and pppd call is correct for your system). /dev can stay
static, just change udev.conf to handle /udev or something.

Regards,

Indan


speedtch.rules
Description: Binary data


Re: understanding firmware loader for speedtouch (kernel 2.6.21.5)

2007-07-09 Thread Indan Zupancic
On Mon, July 9, 2007 10:49, mikie wrote:
> 2007/7/6, Indan Zupancic <[EMAIL PROTECTED]>:
>> On Fri, July 6, 2007 16:20, Duncan Sands wrote:
>> > On Friday 6 July 2007 14:54:18 mikie wrote:
>> >> Hi,
>> >>
>> >> I experience some problems with the speedtch.c module, especially in
>> >> regards to its firmware loader.
>> >> I am not quite sure if this module is going to load the firmware
>> >> itself or does it use some external software to do that ?
>> >
>> > It loads it itself, using some external software!
>>
>> For information, it generates a hotplug event and the kernel calls the
>> program written at /proc/sys/kernel/hotplug with certain information.
>> That used to be /sbin/hotplug, became later udev, but today in general
>> udev reads the uevents generated by the kernel.
>
> On my system the /proc/sys/kernel/hotplug points to /sbin/hotplug.
> I copied your script to /sbin/hotplug and also added simple logging,
> so I can see whenever the script is being started. It turns out that
> the script is not started at all by the kernel...
>
> I am afraid that the kernel generates uevents only, could this be true?

Not really, it would break backward compatibility, and if it were true,
they're remove the /proc/sys/kernel/hotplug setting too. As far as I know,
if it's set, that program will be executed, and if zero is written to it, only
uevents are generated.

Make sure that the script is executable (chmod +x) and has "#!/bin/sh"
at the top, or else it won't work. If that's already the case, I've no idea.

Good luck,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: understanding firmware loader for speedtouch (kernel 2.6.21.5)

2007-07-09 Thread Indan Zupancic
On Mon, July 9, 2007 10:49, mikie wrote:
 2007/7/6, Indan Zupancic [EMAIL PROTECTED]:
 On Fri, July 6, 2007 16:20, Duncan Sands wrote:
  On Friday 6 July 2007 14:54:18 mikie wrote:
  Hi,
 
  I experience some problems with the speedtch.c module, especially in
  regards to its firmware loader.
  I am not quite sure if this module is going to load the firmware
  itself or does it use some external software to do that ?
 
  It loads it itself, using some external software!

 For information, it generates a hotplug event and the kernel calls the
 program written at /proc/sys/kernel/hotplug with certain information.
 That used to be /sbin/hotplug, became later udev, but today in general
 udev reads the uevents generated by the kernel.

 On my system the /proc/sys/kernel/hotplug points to /sbin/hotplug.
 I copied your script to /sbin/hotplug and also added simple logging,
 so I can see whenever the script is being started. It turns out that
 the script is not started at all by the kernel...

 I am afraid that the kernel generates uevents only, could this be true?

Not really, it would break backward compatibility, and if it were true,
they're remove the /proc/sys/kernel/hotplug setting too. As far as I know,
if it's set, that program will be executed, and if zero is written to it, only
uevents are generated.

Make sure that the script is executable (chmod +x) and has #!/bin/sh
at the top, or else it won't work. If that's already the case, I've no idea.

Good luck,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: understanding firmware loader for speedtouch (kernel 2.6.21.5)

2007-07-09 Thread Indan Zupancic
On Mon, July 9, 2007 16:40, mikie wrote:
 2007/7/9, Kay Sievers [EMAIL PROTECTED]:
 On 7/9/07, mikie [EMAIL PROTECTED] wrote:
  2007/7/9, Indan Zupancic [EMAIL PROTECTED]:
   On Mon, July 9, 2007 10:49, mikie wrote:
2007/7/6, Indan Zupancic [EMAIL PROTECTED]:
On Fri, July 6, 2007 16:20, Duncan Sands wrote:
 On Friday 6 July 2007 14:54:18 mikie wrote:
 Hi,

 I experience some problems with the speedtch.c module, especially 
 in
 regards to its firmware loader.
 I am not quite sure if this module is going to load the firmware
 itself or does it use some external software to do that ?

 It loads it itself, using some external software!
   
For information, it generates a hotplug event and the kernel calls the
program written at /proc/sys/kernel/hotplug with certain information.
That used to be /sbin/hotplug, became later udev, but today in general
udev reads the uevents generated by the kernel.
   
On my system the /proc/sys/kernel/hotplug points to /sbin/hotplug.
I copied your script to /sbin/hotplug and also added simple logging,
so I can see whenever the script is being started. It turns out that
the script is not started at all by the kernel...
   
I am afraid that the kernel generates uevents only, could this be true?
  
   Not really, it would break backward compatibility, and if it were true,
   they're remove the /proc/sys/kernel/hotplug setting too. As far as I 
   know,
   if it's set, that program will be executed, and if zero is written to 
   it, only
   uevents are generated.
  
   Make sure that the script is executable (chmod +x)
 
  Yes I have set it to be executable:
  [EMAIL PROTECTED]:/sbin# ls -lah hotplug
  -rwxr-xr-x1 root root  934 Jul  9 09:13 hotplug
 
   and has #!/bin/sh
   at the top, or else it won't work. If that's already the case, I've no 
   idea.
 
  [EMAIL PROTECTED]:/sbin# head hotplug
  #!/bin/sh
  set -e
 
 
  Everything looks OK, but still it does not fire up the script...

 It should run, if the kernel creates events. Can you run udevmonitor
 --env while loading the driver? That way we can see if events are
 generated by the kernel (if you don't have it, no need to install
 udev, just download the udev tarball, type make and run
 ./udevmonitor --env).

 It seems that uevents are generated:

What if you run a very simple script like

#!/bin/sh
echo test  /tmp/test

is it called at all?

If it is then it's probably something with the other script. Try removing
the set -e and/or run it manually with the right parameters set and see if
anything goes wrong. Or use a minimal udev and let it do the hotplugging,
having only rules for speedtouch and firmware loading (attached, but
make sure the paths and pppd call is correct for your system). /dev can stay
static, just change udev.conf to handle /udev or something.

Regards,

Indan


speedtch.rules
Description: Binary data


Re: understanding firmware loader for speedtouch (kernel 2.6.21.5)

2007-07-06 Thread Indan Zupancic
On Fri, July 6, 2007 16:20, Duncan Sands wrote:
> On Friday 6 July 2007 14:54:18 mikie wrote:
>> Hi,
>>
>> I experience some problems with the speedtch.c module, especially in
>> regards to its firmware loader.
>> I am not quite sure if this module is going to load the firmware
>> itself or does it use some external software to do that ?
>
> It loads it itself, using some external software!

For information, it generates a hotplug event and the kernel calls the
program written at /proc/sys/kernel/hotplug with certain information.
That used to be /sbin/hotplug, became later udev, but today in general
udev reads the uevents generated by the kernel.


>> I have read and checked the firmware.agent script from some hotplug
>> package, and noticed that it makes something like simple file copy of
>> the firmware to the "data" and echoing 1 or 0 to the "loading" file.
>> But what is the location of these data/loading ? Also the firmware is
>> split into 2 parts -- should I upload them one after another? Is that
>> all that should be done to get the firmware loaded?
>
> You need to write the file that was asked for: the modem starts by asking
> for the first file, uploads it, then asks for the second file and uploads it.
> The file is written to the "data" file, but note that the data file is created
> on the fly when the modem requests a firmware load so you can't write to it
> at some random time.

If you really don't want to use udevd, you could use the below script I wrote
(not really tested, it's an bash version of the Udev rules I've have, more or 
less.)

Echo the script pathname to /proc/sys/kernel/hotplug early at bootup and all
should work, assuming your pppd is setup correctly and the script isn't buggy.

#!/bin/sh
set -e

p()
{
# For debugging/logging, edit for your needs.
echo "$*"
echo "$*" > /dev/vc/1;
echo "$*" >> /tmp/bla;
}

if [ "$SUBSYSTEM" = "firmware" ]; then
p "Loading firmware..."
cd /sys/$DEVPATH
if [ -f "/lib/firmware/$FIRMWARE" ]; then
echo 1 > loading
cat "/lib/firmware/$FIRMWARE" > data
echo 0 > loading
p "Done."
else
echo -1 > loading
p "$FIRMWARE not found."
fi
elif [ "$SUBSYSTEM" = "atm" ]; then
if [[ "$KERNEL" != speedtch* ]]; then
exit
fi
if [ "$ACTION" = "add" ]; then
p "Starting pppd."
pppd call adsl
elif [ "$ACTION" = "remove" ]; then
p "Stopping pppd."
pkill pppd
fi
fi

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: understanding firmware loader for speedtouch (kernel 2.6.21.5)

2007-07-06 Thread Indan Zupancic
On Fri, July 6, 2007 16:20, Duncan Sands wrote:
 On Friday 6 July 2007 14:54:18 mikie wrote:
 Hi,

 I experience some problems with the speedtch.c module, especially in
 regards to its firmware loader.
 I am not quite sure if this module is going to load the firmware
 itself or does it use some external software to do that ?

 It loads it itself, using some external software!

For information, it generates a hotplug event and the kernel calls the
program written at /proc/sys/kernel/hotplug with certain information.
That used to be /sbin/hotplug, became later udev, but today in general
udev reads the uevents generated by the kernel.


 I have read and checked the firmware.agent script from some hotplug
 package, and noticed that it makes something like simple file copy of
 the firmware to the data and echoing 1 or 0 to the loading file.
 But what is the location of these data/loading ? Also the firmware is
 split into 2 parts -- should I upload them one after another? Is that
 all that should be done to get the firmware loaded?

 You need to write the file that was asked for: the modem starts by asking
 for the first file, uploads it, then asks for the second file and uploads it.
 The file is written to the data file, but note that the data file is created
 on the fly when the modem requests a firmware load so you can't write to it
 at some random time.

If you really don't want to use udevd, you could use the below script I wrote
(not really tested, it's an bash version of the Udev rules I've have, more or 
less.)

Echo the script pathname to /proc/sys/kernel/hotplug early at bootup and all
should work, assuming your pppd is setup correctly and the script isn't buggy.

#!/bin/sh
set -e

p()
{
# For debugging/logging, edit for your needs.
echo $*
echo $*  /dev/vc/1;
echo $*  /tmp/bla;
}

if [ $SUBSYSTEM = firmware ]; then
p Loading firmware...
cd /sys/$DEVPATH
if [ -f /lib/firmware/$FIRMWARE ]; then
echo 1  loading
cat /lib/firmware/$FIRMWARE  data
echo 0  loading
p Done.
else
echo -1  loading
p $FIRMWARE not found.
fi
elif [ $SUBSYSTEM = atm ]; then
if [[ $KERNEL != speedtch* ]]; then
exit
fi
if [ $ACTION = add ]; then
p Starting pppd.
pppd call adsl
elif [ $ACTION = remove ]; then
p Stopping pppd.
pkill pppd
fi
fi

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: blink driver power saving

2007-07-02 Thread Indan Zupancic
On Mon, July 2, 2007 13:51, Andi Kleen wrote:
> On Monday 02 July 2007 13:43:23 Indan Zupancic wrote:
>> On Mon, July 2, 2007 01:59, Andi Kleen wrote:
>> > Well only those that could be already hung from user space
>> > with setleds (that was also confirmed).  Actually I thought
>> > they didn't hang completely, but just stopped reacting to
>> > the keyboard (which is actually pretty bad for every user
>> > to be able to trigger)
>>
>> Pavel's lost key events, mine stopped reacting altogether.
>
> Did you try if the network was still alive? Perhaps it was
> just a locked up keyboard.

No, it wasn't locked up, because the leds were blinking.
I think if I waited long enough it eventually reacted to other
key presses too, sometimes. At least flooding it with Ctrl+C
stopped the setleds loop long after I had any hope left.

>>
>> > Anyways, Stephen's patch just doesn't make sense:
>> > he clearly didn't understand the code at all. Before you
>> > apply it and cripple it better drop the driver completely.
>>
>> CC'ing Dmitry, as I think he doesn't like the blink driver much either. ;-)
>
> Perhaps one of you geniuses who all hate it can find a better way to
> solve the "video output dead after kexec; but need visual feedback to the user
> while crash dumping" problem. I'm waiting for your patches.

While someone's at it, perhaps USB keyboards could be made working too.
And make it generic infrastructure too so everyone can (ab)use it?
What about adopting the approach here and share the code:
http://lkml.org/lkml/2007/6/19/112

At least it works with all keyboards...

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: blink driver power saving

2007-07-02 Thread Indan Zupancic
On Mon, July 2, 2007 01:59, Andi Kleen wrote:
> Well only those that could be already hung from user space
> with setleds (that was also confirmed).  Actually I thought
> they didn't hang completely, but just stopped reacting to
> the keyboard (which is actually pretty bad for every user
> to be able to trigger)

Pavel's lost key events, mine stopped reacting altogether.

> I guess the better way to handle those would be to find out the
> minimum frequency of blinking that is still ok and rate limit it to that in
> the keyboard driver.

Dmitry already has a patch for that. He limited it to one event each 50 ms.

> Anyways, Stephen's patch just doesn't make sense:
> he clearly didn't understand the code at all. Before you
> apply it and cripple it better drop the driver completely.

CC'ing Dmitry, as I think he doesn't like the blink driver much either. ;-)

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: blink driver power saving

2007-07-02 Thread Indan Zupancic
On Mon, July 2, 2007 01:59, Andi Kleen wrote:
 Well only those that could be already hung from user space
 with setleds (that was also confirmed).  Actually I thought
 they didn't hang completely, but just stopped reacting to
 the keyboard (which is actually pretty bad for every user
 to be able to trigger)

Pavel's lost key events, mine stopped reacting altogether.

 I guess the better way to handle those would be to find out the
 minimum frequency of blinking that is still ok and rate limit it to that in
 the keyboard driver.

Dmitry already has a patch for that. He limited it to one event each 50 ms.

 Anyways, Stephen's patch just doesn't make sense:
 he clearly didn't understand the code at all. Before you
 apply it and cripple it better drop the driver completely.

CC'ing Dmitry, as I think he doesn't like the blink driver much either. ;-)

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: blink driver power saving

2007-07-02 Thread Indan Zupancic
On Mon, July 2, 2007 13:51, Andi Kleen wrote:
 On Monday 02 July 2007 13:43:23 Indan Zupancic wrote:
 On Mon, July 2, 2007 01:59, Andi Kleen wrote:
  Well only those that could be already hung from user space
  with setleds (that was also confirmed).  Actually I thought
  they didn't hang completely, but just stopped reacting to
  the keyboard (which is actually pretty bad for every user
  to be able to trigger)

 Pavel's lost key events, mine stopped reacting altogether.

 Did you try if the network was still alive? Perhaps it was
 just a locked up keyboard.

No, it wasn't locked up, because the leds were blinking.
I think if I waited long enough it eventually reacted to other
key presses too, sometimes. At least flooding it with Ctrl+C
stopped the setleds loop long after I had any hope left.


  Anyways, Stephen's patch just doesn't make sense:
  he clearly didn't understand the code at all. Before you
  apply it and cripple it better drop the driver completely.

 CC'ing Dmitry, as I think he doesn't like the blink driver much either. ;-)

 Perhaps one of you geniuses who all hate it can find a better way to
 solve the video output dead after kexec; but need visual feedback to the user
 while crash dumping problem. I'm waiting for your patches.

While someone's at it, perhaps USB keyboards could be made working too.
And make it generic infrastructure too so everyone can (ab)use it?
What about adopting the approach here and share the code:
http://lkml.org/lkml/2007/6/19/112

At least it works with all keyboards...

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-16 Thread Indan Zupancic
On Sat, June 16, 2007 05:34, Dmitry Torokhov wrote:
> On Friday 15 June 2007 22:04, Indan Zupancic wrote:
>> On Fri, June 15, 2007 07:41, Dmitry Torokhov wrote:
>> >  /*
>> > + * Schedule switch for execution. We need to throttle requests,
>> > + * otherwise keyboard may become unresponsive.
>> > + */
>> > +static void atkbd_schedule_event_work(struct atkbd *atkbd, int event_bit)
>> > +{
>> > +  unsigned long delay = msecs_to_jiffies(50);
>> > +
>> > +  if (time_after(jiffies, atkbd->event_jiffies + delay))
>> > +  delay = 0;
>> > +
>> > +  atkbd->event_jiffies = jiffies;
>> > +  set_bit(event_bit, >event_mask);
>> > +  wmb();
>> > +  schedule_delayed_work(>event_work, delay);
>> > +}
>>
>> I don't know whether schedule_delayed_work() requeues event_work, or if
>> it adds more work, but both seem to give wrong behaviour:
>
> Well, my advise would be to research the matter before saying that
> it will not work.

Good advise. My advise to others is, that if you do research it, then don't
do it hastily like I did. ;-)

My quick search didn't find the exact behaviour of schedule_delayed_work(),
when looking in LDD, but I should've digged through to the description of
queue_delayed_work().


>> In the first case
>> event_work can be postponed forever if atkbd_schedule_event_work() is
>> called repeatedly each time within 50 ms, and for the second case there's a
>> delay added, but the number of times the LED is switched stays the same,
>> so it's not being throttled.
>>
>
> No, once work is queued for execution subsequent attempts to queue the
> same work will be ignored (until work starts executing). Therefore first
> time work will be scheduled for execution immediately and then execution
> be spaced by ~50 ms.

Thank you for the explanation.

I applied the patch, and it compiles and runs as expected, can't lockup the
keyboard anymore with it applied.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-16 Thread Indan Zupancic
On Sat, June 16, 2007 05:34, Dmitry Torokhov wrote:
 On Friday 15 June 2007 22:04, Indan Zupancic wrote:
 On Fri, June 15, 2007 07:41, Dmitry Torokhov wrote:
   /*
  + * Schedule switch for execution. We need to throttle requests,
  + * otherwise keyboard may become unresponsive.
  + */
  +static void atkbd_schedule_event_work(struct atkbd *atkbd, int event_bit)
  +{
  +  unsigned long delay = msecs_to_jiffies(50);
  +
  +  if (time_after(jiffies, atkbd-event_jiffies + delay))
  +  delay = 0;
  +
  +  atkbd-event_jiffies = jiffies;
  +  set_bit(event_bit, atkbd-event_mask);
  +  wmb();
  +  schedule_delayed_work(atkbd-event_work, delay);
  +}

 I don't know whether schedule_delayed_work() requeues event_work, or if
 it adds more work, but both seem to give wrong behaviour:

 Well, my advise would be to research the matter before saying that
 it will not work.

Good advise. My advise to others is, that if you do research it, then don't
do it hastily like I did. ;-)

My quick search didn't find the exact behaviour of schedule_delayed_work(),
when looking in LDD, but I should've digged through to the description of
queue_delayed_work().


 In the first case
 event_work can be postponed forever if atkbd_schedule_event_work() is
 called repeatedly each time within 50 ms, and for the second case there's a
 delay added, but the number of times the LED is switched stays the same,
 so it's not being throttled.


 No, once work is queued for execution subsequent attempts to queue the
 same work will be ignored (until work starts executing). Therefore first
 time work will be scheduled for execution immediately and then execution
 be spaced by ~50 ms.

Thank you for the explanation.

I applied the patch, and it compiles and runs as expected, can't lockup the
keyboard anymore with it applied.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-15 Thread Indan Zupancic
On Fri, June 15, 2007 07:41, Dmitry Torokhov wrote:
> Does the patch below help?

Didn't try it yet, but will tomorrow.


> Input: atkbd - throttle LED switching
>
> On some boxes keyboard controllers are too slow to withstand
> continuous flow of requests to turn keyboard LEDs on and off
> and start losing some keypresses or even all of them.
>
> Delay executing of LED switching request if we had another one
> withing 50 ms thus easing load on the controller.
>
> Signed-off-by: Dmitry Torokhov <[EMAIL PROTECTED]>
> ---
>
>  drivers/input/keyboard/atkbd.c |   40 
> ++--
>  1 files changed, 26 insertions(+), 14 deletions(-)
>
> Index: work/drivers/input/keyboard/atkbd.c
> ===
> --- work.orig/drivers/input/keyboard/atkbd.c
> +++ work/drivers/input/keyboard/atkbd.c
> @@ -219,7 +219,8 @@ struct atkbd {
>   unsigned long time;
>   unsigned long err_count;
>
> - struct work_struct event_work;
> + struct delayed_work event_work;
> + unsigned long event_jiffies;
>   struct mutex event_mutex;
>   unsigned long event_mask;
>  };
> @@ -565,7 +566,7 @@ static int atkbd_set_leds(struct atkbd *
>
>  static void atkbd_event_work(struct work_struct *work)
>  {
> - struct atkbd *atkbd = container_of(work, struct atkbd, event_work);
> + struct atkbd *atkbd = container_of(work, struct atkbd, event_work.work);
>
>   mutex_lock(>event_mutex);
>
> @@ -579,12 +580,30 @@ static void atkbd_event_work(struct work
>  }
>
>  /*
> + * Schedule switch for execution. We need to throttle requests,
> + * otherwise keyboard may become unresponsive.
> + */
> +static void atkbd_schedule_event_work(struct atkbd *atkbd, int event_bit)
> +{
> + unsigned long delay = msecs_to_jiffies(50);
> +
> + if (time_after(jiffies, atkbd->event_jiffies + delay))
> + delay = 0;
> +
> + atkbd->event_jiffies = jiffies;
> + set_bit(event_bit, >event_mask);
> + wmb();
> + schedule_delayed_work(>event_work, delay);
> +}

I don't know whether schedule_delayed_work() requeues event_work, or if
it adds more work, but both seem to give wrong behaviour: In the first case
event_work can be postponed forever if atkbd_schedule_event_work() is
called repeatedly each time within 50 ms, and for the second case there's a
delay added, but the number of times the LED is switched stays the same,
so it's not being throttled.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-15 Thread Indan Zupancic
On Fri, June 15, 2007 07:41, Dmitry Torokhov wrote:
 Does the patch below help?

Didn't try it yet, but will tomorrow.


 Input: atkbd - throttle LED switching

 On some boxes keyboard controllers are too slow to withstand
 continuous flow of requests to turn keyboard LEDs on and off
 and start losing some keypresses or even all of them.

 Delay executing of LED switching request if we had another one
 withing 50 ms thus easing load on the controller.

 Signed-off-by: Dmitry Torokhov [EMAIL PROTECTED]
 ---

  drivers/input/keyboard/atkbd.c |   40 
 ++--
  1 files changed, 26 insertions(+), 14 deletions(-)

 Index: work/drivers/input/keyboard/atkbd.c
 ===
 --- work.orig/drivers/input/keyboard/atkbd.c
 +++ work/drivers/input/keyboard/atkbd.c
 @@ -219,7 +219,8 @@ struct atkbd {
   unsigned long time;
   unsigned long err_count;

 - struct work_struct event_work;
 + struct delayed_work event_work;
 + unsigned long event_jiffies;
   struct mutex event_mutex;
   unsigned long event_mask;
  };
 @@ -565,7 +566,7 @@ static int atkbd_set_leds(struct atkbd *

  static void atkbd_event_work(struct work_struct *work)
  {
 - struct atkbd *atkbd = container_of(work, struct atkbd, event_work);
 + struct atkbd *atkbd = container_of(work, struct atkbd, event_work.work);

   mutex_lock(atkbd-event_mutex);

 @@ -579,12 +580,30 @@ static void atkbd_event_work(struct work
  }

  /*
 + * Schedule switch for execution. We need to throttle requests,
 + * otherwise keyboard may become unresponsive.
 + */
 +static void atkbd_schedule_event_work(struct atkbd *atkbd, int event_bit)
 +{
 + unsigned long delay = msecs_to_jiffies(50);
 +
 + if (time_after(jiffies, atkbd-event_jiffies + delay))
 + delay = 0;
 +
 + atkbd-event_jiffies = jiffies;
 + set_bit(event_bit, atkbd-event_mask);
 + wmb();
 + schedule_delayed_work(atkbd-event_work, delay);
 +}

I don't know whether schedule_delayed_work() requeues event_work, or if
it adds more work, but both seem to give wrong behaviour: In the first case
event_work can be postponed forever if atkbd_schedule_event_work() is
called repeatedly each time within 50 ms, and for the second case there's a
delay added, but the number of times the LED is switched stays the same,
so it's not being throttled.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-13 Thread Indan Zupancic
On Wed, June 13, 2007 10:18, Pavel Machek wrote:
>> Well, as I said before, I've the "stuck key"/repeated output too (as well
>> as a warping PS/2 mouse), but no blinking led problem, so I believe the
>> two things are totally unrelated.
>
> Well, after turning off CONFIG_BLINK, my problems went away, and with
> a fast-blink done from userspace, I can make them way worse. They
> _are_ related here.

I missed you saying that before, so to me it looked like everyone just
assumed that.

So, just for fun, I tried running:

while true; do setleds +num; setleds -num; done

and it totally locked up my keyboard. Even SysRq didn't work. On the
bright side, the numlock LED was indeed blinking. Though running the
same with a sleep 0.1 added doesn't produce any problems. So maybe
my problem is indeed a bit related to this after all, somehow. Anyone
any ideas how to debug this problem?

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-13 Thread Indan Zupancic
On Wed, June 13, 2007 10:18, Pavel Machek wrote:
 Well, as I said before, I've the stuck key/repeated output too (as well
 as a warping PS/2 mouse), but no blinking led problem, so I believe the
 two things are totally unrelated.

 Well, after turning off CONFIG_BLINK, my problems went away, and with
 a fast-blink done from userspace, I can make them way worse. They
 _are_ related here.

I missed you saying that before, so to me it looked like everyone just
assumed that.

So, just for fun, I tried running:

while true; do setleds +num; setleds -num; done

and it totally locked up my keyboard. Even SysRq didn't work. On the
bright side, the numlock LED was indeed blinking. Though running the
same with a sleep 0.1 added doesn't produce any problems. So maybe
my problem is indeed a bit related to this after all, somehow. Anyone
any ideas how to debug this problem?

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-12 Thread Indan Zupancic
On Tue, June 12, 2007 07:42, Dmitry Torokhov wrote:
> For what it worth I finally tried that setleds loop on my laptop. I am
> not getting any lost keypresses/releases. But then I don't have EC
> (or at least it is not exported via ACPI). This is an old Dell notebook.

Well, as I said before, I've the "stuck key"/repeated output too (as well
as a warping PS/2 mouse), but no blinking led problem, so I believe the
two things are totally unrelated.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-12 Thread Indan Zupancic
On Tue, June 12, 2007 07:42, Dmitry Torokhov wrote:
 For what it worth I finally tried that setleds loop on my laptop. I am
 not getting any lost keypresses/releases. But then I don't have EC
 (or at least it is not exported via ACPI). This is an old Dell notebook.

Well, as I said before, I've the stuck key/repeated output too (as well
as a warping PS/2 mouse), but no blinking led problem, so I believe the
two things are totally unrelated.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-04 Thread Indan Zupancic
On Mon, June 4, 2007 15:12, Jiri Kosina wrote:
> Are you sure that it's this dummy blink driver that makes the kernel
> stuck? I can't see how it could be causing any hogs - see commit f038f9.

To make it clear, I'm not using the blink driver and get the stuck
key problem too. (And also a warpy PS/2 mouse.)

Wildly guessing: As the problem doesn't seem to be widespread,
perhaps it's suspend to ram related. I'll try running a while without
suspending and see if it crops up too then.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-04 Thread Indan Zupancic
On Mon, June 4, 2007 14:38, Jiri Kosina wrote:
> On Mon, 4 Jun 2007, Indan Zupancic wrote:
>
>> > I also get some stuck keys I was not getting before. I hope my
>> > userland did not go crazy...
>> I get that too, and wasn't sure about it either. It happens irregularly,
>> so it's hard to debug, but often enough to be annoying.
>
> What kind of keyboards are they? I guess the x60 case is serio/ps2, what
> keyboard do you experience this with, Indan?

Also serio/ps2:
input: AT Translated Set 2 keyboard as /class/input/input0


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-04 Thread Indan Zupancic
On Mon, June 4, 2007 13:24, Pavel Machek wrote:
> I also get some stuck keys I was not getting before. I hope my
> userland did not go crazy...

I get that too, and wasn't sure about it either. It happens irregularly,
so it's hard to debug, but often enough to be annoying.

My mouse pointer tends to warp too in the same irregular way,
no idea if that's related (it seemed to happen in an earlier kernel
version than the sticky key thing, but not sure, as I don't have this
mouse that long.)

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-04 Thread Indan Zupancic
On Mon, June 4, 2007 13:24, Pavel Machek wrote:
 I also get some stuck keys I was not getting before. I hope my
 userland did not go crazy...

I get that too, and wasn't sure about it either. It happens irregularly,
so it's hard to debug, but often enough to be annoying.

My mouse pointer tends to warp too in the same irregular way,
no idea if that's related (it seemed to happen in an earlier kernel
version than the sticky key thing, but not sure, as I don't have this
mouse that long.)

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-04 Thread Indan Zupancic
On Mon, June 4, 2007 14:38, Jiri Kosina wrote:
 On Mon, 4 Jun 2007, Indan Zupancic wrote:

  I also get some stuck keys I was not getting before. I hope my
  userland did not go crazy...
 I get that too, and wasn't sure about it either. It happens irregularly,
 so it's hard to debug, but often enough to be annoying.

 What kind of keyboards are they? I guess the x60 case is serio/ps2, what
 keyboard do you experience this with, Indan?

Also serio/ps2:
input: AT Translated Set 2 keyboard as /class/input/input0


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: 2.6.22-rc[23]: blinking capslock led, stuck keys?

2007-06-04 Thread Indan Zupancic
On Mon, June 4, 2007 15:12, Jiri Kosina wrote:
 Are you sure that it's this dummy blink driver that makes the kernel
 stuck? I can't see how it could be causing any hogs - see commit f038f9.

To make it clear, I'm not using the blink driver and get the stuck
key problem too. (And also a warpy PS/2 mouse.)

Wildly guessing: As the problem doesn't seem to be widespread,
perhaps it's suspend to ram related. I'll try running a while without
suspending and see if it crops up too then.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22-rc3][ACPI?] Resume from s2r doesn't work.

2007-06-01 Thread Indan Zupancic
On Sat, June 2, 2007 00:17, Olaf Dietsche wrote:
> It doesn't work. I tried all options "s2ram -f (-s, -p, -m, -r, -a 1, -a 2,
> -a 3)" one after the other.
>
> Since the screen (or any other device) works without problems, when I
> skip acpi_enter_sleep_state(), I don't think it's screen related.

I use "s2ram -f -p -s -a 3", maybe you need some exotic combination too.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [2.6.22-rc3][ACPI?] Resume from s2r doesn't work.

2007-06-01 Thread Indan Zupancic
On Sat, June 2, 2007 00:17, Olaf Dietsche wrote:
 It doesn't work. I tried all options s2ram -f (-s, -p, -m, -r, -a 1, -a 2,
 -a 3) one after the other.

 Since the screen (or any other device) works without problems, when I
 skip acpi_enter_sleep_state(), I don't think it's screen related.

I use s2ram -f -p -s -a 3, maybe you need some exotic combination too.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] x86: fix section mismatch warnings in mtrr

2007-05-26 Thread Indan Zupancic
On Mon, May 21, 2007 17:11, Sam Ravnborg wrote:
> On Mon, May 21, 2007 at 02:52:39PM +0100, Jeremy Fitzhardinge wrote:
>> Sam Ravnborg wrote:
>> > There was another patch that removed the __init marker to _fix_ section 
>> > mismatch errors.
>> > I have lost the actual mail but I asked the submitter to send me a copy of
>> > the configuration so I could take a closer look.
>> > Obviously it was the wrong fix to remove the _init marker.
>>
>> Hm, I think there were two overlapping patches to address the problem.
>> One removed the __init markers, and was a bit of a hack.  Mine
>> restructured all the functions so that it was all done properly (ie,
>> init called init; cpuinit called cpuinit).  Looks like both got applied.
>
> OK, then all is good.

Did the patches reach 2.6.22-rc3 yet?

If they did, then the following warnings might need fixing:

WARNING: arch/i386/kernel/built-in.o(.text+0x84bc): Section mismatch: reference 
to
.init.text:amd_init_mtrr (between 'mtrr_bp_init' and 'mtrr_attrib_to_str')
WARNING: arch/i386/kernel/built-in.o(.text+0x84c1): Section mismatch: reference 
to
.init.text:cyrix_init_mtrr (between 'mtrr_bp_init' and 'mtrr_attrib_to_str')
WARNING: arch/i386/kernel/built-in.o(.text+0x84c6): Section mismatch: reference 
to
.init.text:centaur_init_mtrr (between 'mtrr_bp_init' and 'mtrr_attrib_to_str')
WARNING: arch/i386/kernel/built-in.o(.text+0x92d8): Section mismatch: reference 
to .init.text:
(between 'get_mtrr_state' and 'generic_get_mtrr')
WARNING: arch/i386/kernel/built-in.o(.text+0x92ef): Section mismatch: reference 
to .init.text:
(between 'get_mtrr_state' and 'generic_get_mtrr')
WARNING: arch/i386/kernel/built-in.o(.text+0x9313): Section mismatch: reference 
to .init.text:
(between 'get_mtrr_state' and 'generic_get_mtrr')

Config attached.

Greetings,

Indan


config.gz
Description: application/gzip


Re: [PATCH] x86: fix section mismatch warnings in mtrr

2007-05-26 Thread Indan Zupancic
On Mon, May 21, 2007 17:11, Sam Ravnborg wrote:
 On Mon, May 21, 2007 at 02:52:39PM +0100, Jeremy Fitzhardinge wrote:
 Sam Ravnborg wrote:
  There was another patch that removed the __init marker to _fix_ section 
  mismatch errors.
  I have lost the actual mail but I asked the submitter to send me a copy of
  the configuration so I could take a closer look.
  Obviously it was the wrong fix to remove the _init marker.

 Hm, I think there were two overlapping patches to address the problem.
 One removed the __init markers, and was a bit of a hack.  Mine
 restructured all the functions so that it was all done properly (ie,
 init called init; cpuinit called cpuinit).  Looks like both got applied.

 OK, then all is good.

Did the patches reach 2.6.22-rc3 yet?

If they did, then the following warnings might need fixing:

WARNING: arch/i386/kernel/built-in.o(.text+0x84bc): Section mismatch: reference 
to
.init.text:amd_init_mtrr (between 'mtrr_bp_init' and 'mtrr_attrib_to_str')
WARNING: arch/i386/kernel/built-in.o(.text+0x84c1): Section mismatch: reference 
to
.init.text:cyrix_init_mtrr (between 'mtrr_bp_init' and 'mtrr_attrib_to_str')
WARNING: arch/i386/kernel/built-in.o(.text+0x84c6): Section mismatch: reference 
to
.init.text:centaur_init_mtrr (between 'mtrr_bp_init' and 'mtrr_attrib_to_str')
WARNING: arch/i386/kernel/built-in.o(.text+0x92d8): Section mismatch: reference 
to .init.text:
(between 'get_mtrr_state' and 'generic_get_mtrr')
WARNING: arch/i386/kernel/built-in.o(.text+0x92ef): Section mismatch: reference 
to .init.text:
(between 'get_mtrr_state' and 'generic_get_mtrr')
WARNING: arch/i386/kernel/built-in.o(.text+0x9313): Section mismatch: reference 
to .init.text:
(between 'get_mtrr_state' and 'generic_get_mtrr')

Config attached.

Greetings,

Indan


config.gz
Description: application/gzip


Re: sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()]

2007-05-20 Thread Indan Zupancic
On Sun, May 20, 2007 19:17, Tejun Heo wrote:
> Indan Zupancic wrote:
>> So over all it takes half a second longer to detect the disk, but
>> because everything waits on it, it takes more than three seconds
>> longer to resume.
>
> Eeeek.  Extra three secs doesn't sound too hot.  :-(

Hey, without your reset fix it was 11 seconds slower. ;-)

Well, it used to be around 1.7 seconds to resume, and now (after the fixes)
it's 2.8 (according to the dmesg timestamps). But the first timestamp printed
was 0.7, and now it's 2.1, so I don't know if it's a real slowdown or that the 
timer
is set correctly earlier. That was with 2.19 btw, I don't know what happened in
the meantime. I suspect the timer changes. To know for sure I'd need to boot
an older kernel and measure resume time with a stopwatch.


>> Setting manage_start_stop to 0 fixes it and is good enough for me, I
>> didn't notice anything bad yet because of the unmanaged
>> stop. Implementing background spin up will fix it too.
>
> Just commenting out sd_resume() would be a better solution for your
> case tho.

I like your idea below more. ;-)


>>>> Everything seems to work fine without sd_resume(), so why is it needed?
>>> Because not all disks spin up without being told to do so and like it or
>>> not spinning disks up on resume is the default behavior.  As I wrote in
>>> the other reply, it would be worthwhile to make it configurable.
>>
>> Not even after they receive a read command? Ugh.
>
> After receiving a command which requires media access, they do.  What
> I was saying is that the current default behavior is to spin up all
> devices on resume and part of that is achieved by sd_resume().
>
> Hmmm... skipping START_STOP during sd_resume() actually is a pretty
> good solution for ATA devices.  I'll think about it.

I can't speak for others, but for me this would fix all regressions and even
disk spin up seems half a second faster without the START_STOP.
It would also solve the laptopmode thing, not unnecessarily spinning up
disks that don't need to be (if the hardware doesn't do it).

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] libata: implement ata_wait_after_reset()

2007-05-20 Thread Indan Zupancic
Hello Tejun,

On Sun, May 20, 2007 19:09, Tejun Heo wrote:
> Indan Zupancic wrote:
>> Don't controllers support spread spin up of disks, to avoid the peak load?
>> I think I saw something about that in the SiI 3512 spec I downloaded. So
>> maybe something like a special spin up method which can be implemented
>> by drivers, and if it isn't, the current start stop thing is used?
>
> What they support is allowing drivers to control spin up, and it needs
> support from the HDD (many seem to do these days) && BIOS (so that it
> doesn't spin up the disks during POST) too.  To do that properly, we
> need a repository which distributes spinup tokens such that we can do
> parallel, not thundering herd, spinups.

More and more do I get the impression that with full hardware control for Linux,
from BIOS to firmware, great things could be done. Alas.


> The deadline table sets limits to the maximum time a try can take &&
> when the next attempt is made.  So, the initial deadline of 10 secs
> means that the first try is allowed to take upto 10secs and no matter
> when or why it fails, the next attempt is made after 10sec is passed.  I
> would love to bang devices harder but a lot of creepy device are out
> there and I'm pretty sure some of them would crap themselves if pushed
> too hard.  :-(

Well, if it's going to be retried later on, then it's not really a deadline, is 
it?
Perhaps it's better to give up after the deadline, and only try again after an
interrupt is received from the controller?


> Anyways, in your case, the problem was that sata_sil wasted the first
> try for no good reason.  I'm not determined whether we need more
> aggressive deadlines for early tries for cases like this.  It's
> something to think about.

The annoying part is that all this seems to depend on the harddisk, and not
really on the controller (or the combination of the two).


>> Bugger. So it seems like a good idea to do the reset and spinup async 
>> together.
>
> There are a lot of cases.  Depending on various configuration and whims,
> some drives spin up when power is reapplied and don't respond to reset
> till it's ready.  Some drives don't spin up when power is reapplied but
> spin up when PHY is woken up and don't respond to reset till it's ready.
>  Yet others just sit quiet and respond to reset nicely but don't spin up
> till told to do so.
>
> So, well, I agree that the best strategy is doing it all asynchronously.

Another one would be to do it interrupt driven instead of doing polling,
if the controller supports it well.


>> The whole resetting, or just the retry after 10 seconds? But it becomes 
>> clear now that
>> you're right that the spinup problem is solved if sd_resume does background 
>> spin up.
>
> The whole resetting.  ATA controller/port resume itself is completely
> asynchronous.  The resume method just kicks EH thread and tells it to
> wake the thing up and returns.  The EH thread asynchronously resets the
> port, revalidates all the devices and see if any new ones are attached, etc.
>
> The culprit is that while EH thread is active all commands are deferred,
> so the START_STOP command sd_resume() issues waits in the SCSI midlayer
> queue till EH is complete.

Are you saying that right now there is only one EH thread? In that case an
improvent would be to have one EH thread per controller/device, so that at
least independent scsi devices can't block each other.

>  That's why doing it asynchronously is
> tricky.  I'm not sure whether we would be able to do it without
> separating libata from SCSI so that libata has its own suspend/resume
> functions.  :-(

Well, it seems you just moved the other direction, with leaning on sd_resume.


>> Thanks for the pointer, I'll fiddle with it. What do you mean with "proper" 
>> here?
>> Not at all, or in a forced way because power goes away?
>
> Well, basically, your drive isn't told that the power is gonna go off.
> Power just goes off and your drive has to do emergency head unload which
> sometimes causes the funny noise.

Hm, I suspected something like that. Though the shutdown sound of my HD
seems to sound the same as usual. Perhaps the BIOS gives it a sign as part of 
the
suspend? (Or to the controller, which delegates it.)


>> Perhaps a silly question, but how can you ever set manage_start_stop to 0 
>> before
>> the disk is started at bootup? You can change it before doing suspend, but 
>> not at
>> bootup. So is this option at the right place?
>
> The problem is sorta complicated because it involves all devices which
> use SCSI midlayer.  libata, USB, all the regular SCSI disks, whatnot, so
> it's really hard to sell to linux-scsi people if you choose a default
> behavior wh

Re: sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()]

2007-05-20 Thread Indan Zupancic
On Sun, May 20, 2007 11:54, Tejun Heo wrote:
> Indan Zupancic wrote:
>> On Sat, May 19, 2007 21:04, Tejun Heo wrote:
>>> Tejun Heo wrote:
>>>> Yeah, if SCR registers are accessible, 0xff doesn't indicate the device
>>>> isn't there, so the whole skip-0xff logic probably shouldn't apply in
>>>> such cases, but we can also achieve pretty good result by just making
>>>> the first reset tries a bit more aggressive.
>>> So, here's the patch.
>>>
>>> Paul, can you please test this patch without the previous patch?  Indan,
>>> this should reduce the resume delay.  Please test.  But you'll still
>>> feel some added delay compared to 2.6.20 due to the mentioned
>>> suspend/resume change.
>>
>> This removed the COMRESET errors indeed, and with sd_resume()
>> disabled everything is speedy again (2s or so. Still a desktop pc).
>> I didn't try with sd_resume enabled.
>
> Can you try to measure with sd_resume in place?

[2.173366] sd 0:0:0:0: [sda] Starting disk
[2.475422] ata2: SATA link down (SStatus 0 SControl 310)
[5.478403] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[5.481928] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.485904] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.485913] ata1.00: configured for UDMA/100
[5.505109] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
[5.505461] sd 0:0:0:0: [sda] Write Protect is off
[5.505465] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[5.505612] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, 
doesn't support DPO or
FUA
...
[6.157259] Restarting tasks ... done.


And with echo 0 > /sys/class/scsi_disk/0\:0\:0\:0/manage_start_stop:

[2.476476] ata2: SATA link down (SStatus 0 SControl 310)
...
[2.825479] Restarting tasks ... done.
...
[5.022076] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[5.025605] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.028594] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.028606] ata1.00: configured for UDMA/100
[5.028720] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
[5.028767] sd 0:0:0:0: [sda] Write Protect is off
[5.028773] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[5.028831] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, 
doesn't support DPO or
FUA

So over all it takes half a second longer to detect the disk, but because 
everything waits on it,
it takes more than three seconds longer to resume.

Setting manage_start_stop to 0 fixes it and is good enough for me, I didn't 
notice anything bad yet
because of the unmanaged stop. Implementing background spin up will fix it too.


>> Everything seems to work fine without sd_resume(), so why is it needed?
>
> Because not all disks spin up without being told to do so and like it or
> not spinning disks up on resume is the default behavior.  As I wrote in
> the other reply, it would be worthwhile to make it configurable.

Not even after they receive a read command? Ugh.

Greeting,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] sata_sil: Greatly improve DMA support

2007-05-20 Thread Indan Zupancic
On Sun, May 20, 2007 12:19, Tejun Heo wrote:
> Indan Zupancic wrote:
>> On Sun, May 20, 2007 00:03, Jeff Garzik wrote:
>>> Indan Zupancic wrote:
>>>> This patch seems to work with my SiI 3512, though I don't notice any
>>>> difference, neither a speedup, nor a slowdown. Hdparm gives the same
>>>> speeds (-tT), and cp -a'ing kernel sources is abysmal slow in both cases,
>>>> (need to look into that one) so I didn't really test it that well.
>>>
>>> It won't result in much of a speedup, except in situations where IOMMU
>>> or other situation that causes you to run into the 64k boundary being an
>>> issue -- generally only on huge transfers.
>>>
>>> A good measure is to dd(1) to/from the block device, rather than using a
>>> filesystem.  As has been shown on LKML, the filesystem can really slow
>>> things down in some cases.
>>
>> I didn't really expect a speedup, it's more that I've no regression to 
>> report.
>>
>> I could benchmark the patch more thoroughly, but right now I'm more worried
>> about the crawling cp I just discovered. Talking about filesystems slowing 
>> down
>> things...
>>
>> Test:
>>
>> $ cp -a linux-2.6/ /tmp/
>>
>> done on the same ext3 partition. linux-2.6 contains source and git repo only,
>> I'm compiling stuff with O=../obj.
>>
>> $ vmstat 10
>> procs ---memory-- ---swap-- -io -system-- cpu
>>  r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy id wa
>>  0  1  0   4168   3316 19570000   739   494  530  393 15  3 66 16
>>  0  3  4   4120   2040 19819600 14677 14111 1247  435  0 17  0 83
>>  0  1  4   3588   1444 19969600  8892  9472 1362  438  0 12  0 88
>>  1  0  4   3772   4228 19601200   764   454 1161  345  0  4  0 96
>>  0  1  4   3548   6156 19308800   793   851 1158  340  0  4  0 96
>>  0  1  4   3852   7608 18909600   798   523 1160  474  1  4  0 95
>>  1  1  4   3612   8684 18604800  1244   864 1178  430  2  5  0 93
>>  0  1  4  90660   9308  9639600   853   906 1244  578  7  6  0 87
>>  0  1  4  72280   9816 11236800   830   854 1278  429 12  5  0 83
>>  1  0  4  52488  10296 13056000   935   861 1178  418  1  6  0 94
>>  0  1  4  30500  10788 14977600   977   858 1178  371  0  6  0 94
>>  0  1  4   9792  11244 16785600   918  1394 1182  350  1  5  0 94
>>  0  1  4   4016  11216 17250400  1017   858 1181  382  1  6  0 94
>>  0  1  4   3660  11484 17148400   966   861 1182  410  1  6  0 94
>>
>> It never finished, as I had no patience to copy about 900 Mb with this rate.
>>
>> As it's a git tree, I suppose it's heavily fragmented, but this is still 
>> rather
>> pathetic. Should I blame cp, or is something else wrong? Any ideas how
>> to figure this one out would be appreciated. Sorry for the off-topicness.
>
> Do things improve if you change the io scheduler to deadline?
>
>   # echo deadline > /sys/block/sda/queue/scheduler

I also tried noop, anticipatory, and now deadline, but it doesn't matter.

> Also worth looking at is the following bug entry.
>
>   http://bugzilla.kernel.org/show_bug.cgi?id=7372
>
> There seems to be weird interaction among the scheduler / VM / IO.  The
> exact cause is still not verified.  :-(

I know, I posted a bugreport too, but for starvation with the anticipatory 
scheduler.

Anyway, that bug seems unrelated to my case, as they have system 
unresponsiveness
or other nastiness, while I only have a crawling cp, which I blame on weaknesses
within ext3, a badly designed cp program and very fragmented filesystem. I just 
need
to verify that, somehow. I'll try with older kernels later.

Greetings,

Indan


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] libata: implement ata_wait_after_reset()

2007-05-20 Thread Indan Zupancic
Hello Tejun,

Thanks for your answers.

On Sun, May 20, 2007 11:50, Tejun Heo wrote:
> Indan Zupancic wrote:
>>> 1. We dropped libata specific suspend/resume implementation in favor of
>>> sd driven one.  Unfortunately, the new implementation currently can't do
>>> background spinup, so that's probably why you can feel the delay.  We
>>> need to figure out how to do background spinup with the new implementation.
>>
>> What are the advantages of that, except slower resume? ;-)
>
> The change was made primarily to make libata spin down disks properly on
> power down and hibernate.  I don't like the added delay either but it's
> better than shortening the lifespan of harddisks.

Ah, it fixes that "weird noise on shutdown" problem? Fair enough.


>  Making resuming
> asynchronous is planned but we need more infrastructure to do that
> properly.  The previous implementation worked but it also might try to
> spin up a lot of disks at once which can't be good for PSU.  I'm not
> sure yet how to do that properly with sd driving suspend/resume.

Don't controllers support spread spin up of disks, to avoid the peak load?
I think I saw something about that in the SiI 3512 spec I downloaded. So
maybe something like a special spin up method which can be implemented
by drivers, and if it isn't, the current start stop thing is used?


>>> 2. To make reset finish in defined time, each reset try now has defined
>>> deadlines.  I was trying to be conservative so I chose 10sec for the
>>> first try.
>>
>> Right. So to me it seems that failed, because the undefined reset time is 
>> just
>> replaced with a fixed ad hoc sleep, which is longer than any undefined reset
>> mechanism. So where did the advantage go? Better to go back to how it was,
>> is my impression. Or make that deadline 10 seconds and after that give up,
>> instead of the other way round. Or just move to async reset.
>
> Well, yeah, your case degraded for sure but a lot of hotplug or error
> handling cases are now much better which often used to take more than 30
> secs in many cases.  There are just a lot of cases and a lot of weird
> devices.  Aiming generally acceptable delay on most cases is of higher
> priority than optimizing specific cases.

What I meant is that the deadline isn't effective because if it can't be done 
within
that time, it's just retried after 10 seconds or whatever, basically rendering 
the
deadline useless. But now I understand that the retry is done in the background,
and that it was just the sd_resume that caused everything to wait on it.


> That said, the 10 sec delayed
> retry you're seeing is primarily caused by incorrect interpretation of
> 0xff status on sata_sil, so with that fixed, it shouldn't be too much of
> a problem.

True.

>>>  It's driven by timing table near the top of libata-eh.c, so
>>> adjusting the table is easy and maybe we can be a bit more aggressive
>>> with the first two tries.  But if we solve #1, this shouldn't matter too
>>> much.
>>
>> #2 seems totally unrelated to #1, as #1 is about spinup, and #2 about reset.
>> Only relation is that #2 slows down #1 because #1 needs to wait on #2.
>
> Not that easy.  Many drives don't respond to reset while they're
> spinning up.

Bugger. So it seems like a good idea to do the reset and spinup async together.


>>>> Maybe a silly question, but why do we wait for the harddisk to show up? 
>>>> Maybe that
>>>> makes a little bit of sense at bootup, but for resume from ram, where 
>>>> nothing is
>>>> supposed to have changed, it seems a bit strange. Why not wait when 
>>>> something tries
>>>> to access the harddisk or something?
>>> I hope it's explained now.
>>
>> Almost. Explained is why we wait on the disk, but that's only the sd_resume 
>> part.
>> It's still unclear why resume must wait till the reset is done.
>
> Port reset itself is asynchronous.  The wait is solely from sd_resume.

The whole resetting, or just the retry after 10 seconds? But it becomes clear 
now that
you're right that the spinup problem is solved if sd_resume does background 
spin up.


>>>> I wonder if sil_pci_device_resume() and sd_resume() know about each 
>>>> other...
>>>> I'll disable sd_resume() and see how that goes.
>>> Yeap, that should work.  In most configurations, devices spin up
>>> immediately after power is restored.  sd_resume() just waits for the
>>> device to be revalidated in such cases.
>>
>> And it kicks the disk into action. Why? Only thing it does is sending a 
>> START_STOP
>> command. I assume that causes the dis

Re: [PATCH] libata: implement ata_wait_after_reset()

2007-05-20 Thread Indan Zupancic
Hello Tejun,

Thanks for your answers.

On Sun, May 20, 2007 11:50, Tejun Heo wrote:
 Indan Zupancic wrote:
 1. We dropped libata specific suspend/resume implementation in favor of
 sd driven one.  Unfortunately, the new implementation currently can't do
 background spinup, so that's probably why you can feel the delay.  We
 need to figure out how to do background spinup with the new implementation.

 What are the advantages of that, except slower resume? ;-)

 The change was made primarily to make libata spin down disks properly on
 power down and hibernate.  I don't like the added delay either but it's
 better than shortening the lifespan of harddisks.

Ah, it fixes that weird noise on shutdown problem? Fair enough.


  Making resuming
 asynchronous is planned but we need more infrastructure to do that
 properly.  The previous implementation worked but it also might try to
 spin up a lot of disks at once which can't be good for PSU.  I'm not
 sure yet how to do that properly with sd driving suspend/resume.

Don't controllers support spread spin up of disks, to avoid the peak load?
I think I saw something about that in the SiI 3512 spec I downloaded. So
maybe something like a special spin up method which can be implemented
by drivers, and if it isn't, the current start stop thing is used?


 2. To make reset finish in defined time, each reset try now has defined
 deadlines.  I was trying to be conservative so I chose 10sec for the
 first try.

 Right. So to me it seems that failed, because the undefined reset time is 
 just
 replaced with a fixed ad hoc sleep, which is longer than any undefined reset
 mechanism. So where did the advantage go? Better to go back to how it was,
 is my impression. Or make that deadline 10 seconds and after that give up,
 instead of the other way round. Or just move to async reset.

 Well, yeah, your case degraded for sure but a lot of hotplug or error
 handling cases are now much better which often used to take more than 30
 secs in many cases.  There are just a lot of cases and a lot of weird
 devices.  Aiming generally acceptable delay on most cases is of higher
 priority than optimizing specific cases.

What I meant is that the deadline isn't effective because if it can't be done 
within
that time, it's just retried after 10 seconds or whatever, basically rendering 
the
deadline useless. But now I understand that the retry is done in the background,
and that it was just the sd_resume that caused everything to wait on it.


 That said, the 10 sec delayed
 retry you're seeing is primarily caused by incorrect interpretation of
 0xff status on sata_sil, so with that fixed, it shouldn't be too much of
 a problem.

True.

  It's driven by timing table near the top of libata-eh.c, so
 adjusting the table is easy and maybe we can be a bit more aggressive
 with the first two tries.  But if we solve #1, this shouldn't matter too
 much.

 #2 seems totally unrelated to #1, as #1 is about spinup, and #2 about reset.
 Only relation is that #2 slows down #1 because #1 needs to wait on #2.

 Not that easy.  Many drives don't respond to reset while they're
 spinning up.

Bugger. So it seems like a good idea to do the reset and spinup async together.


 Maybe a silly question, but why do we wait for the harddisk to show up? 
 Maybe that
 makes a little bit of sense at bootup, but for resume from ram, where 
 nothing is
 supposed to have changed, it seems a bit strange. Why not wait when 
 something tries
 to access the harddisk or something?
 I hope it's explained now.

 Almost. Explained is why we wait on the disk, but that's only the sd_resume 
 part.
 It's still unclear why resume must wait till the reset is done.

 Port reset itself is asynchronous.  The wait is solely from sd_resume.

The whole resetting, or just the retry after 10 seconds? But it becomes clear 
now that
you're right that the spinup problem is solved if sd_resume does background 
spin up.


 I wonder if sil_pci_device_resume() and sd_resume() know about each 
 other...
 I'll disable sd_resume() and see how that goes.
 Yeap, that should work.  In most configurations, devices spin up
 immediately after power is restored.  sd_resume() just waits for the
 device to be revalidated in such cases.

 And it kicks the disk into action. Why? Only thing it does is sending a 
 START_STOP
 command. I assume that causes the disk to spin up. But for e.g. laptopmode I 
 can
 imagine that's quite annoying. I think sd_resume() is totally unnecessary 
 and can
 be removed, because it seems wrong to always force the harddisk to spin up,
 background spin up or not.

 Most ATA drives would spin up immediately when power is reapplied unless
 configured otherwise and you can configure whether sd_resume issues
 START_STOP or not by echoing to sysfs node
 /sys/class/scsi_disk/h:c:i:l/manage_start_stop.  The drawback here is
 you won't get proper spindown if you turn it off.

Thanks for the pointer, I'll fiddle with it. What do you mean

Re: [PATCH] sata_sil: Greatly improve DMA support

2007-05-20 Thread Indan Zupancic
On Sun, May 20, 2007 12:19, Tejun Heo wrote:
 Indan Zupancic wrote:
 On Sun, May 20, 2007 00:03, Jeff Garzik wrote:
 Indan Zupancic wrote:
 This patch seems to work with my SiI 3512, though I don't notice any
 difference, neither a speedup, nor a slowdown. Hdparm gives the same
 speeds (-tT), and cp -a'ing kernel sources is abysmal slow in both cases,
 (need to look into that one) so I didn't really test it that well.

 It won't result in much of a speedup, except in situations where IOMMU
 or other situation that causes you to run into the 64k boundary being an
 issue -- generally only on huge transfers.

 A good measure is to dd(1) to/from the block device, rather than using a
 filesystem.  As has been shown on LKML, the filesystem can really slow
 things down in some cases.

 I didn't really expect a speedup, it's more that I've no regression to 
 report.

 I could benchmark the patch more thoroughly, but right now I'm more worried
 about the crawling cp I just discovered. Talking about filesystems slowing 
 down
 things...

 Test:

 $ cp -a linux-2.6/ /tmp/

 done on the same ext3 partition. linux-2.6 contains source and git repo only,
 I'm compiling stuff with O=../obj.

 $ vmstat 10
 procs ---memory-- ---swap-- -io -system-- cpu
  r  b   swpd   free   buff  cache   si   sobibo   in   cs us sy id wa
  0  1  0   4168   3316 19570000   739   494  530  393 15  3 66 16
  0  3  4   4120   2040 19819600 14677 14111 1247  435  0 17  0 83
  0  1  4   3588   1444 19969600  8892  9472 1362  438  0 12  0 88
  1  0  4   3772   4228 19601200   764   454 1161  345  0  4  0 96
  0  1  4   3548   6156 19308800   793   851 1158  340  0  4  0 96
  0  1  4   3852   7608 18909600   798   523 1160  474  1  4  0 95
  1  1  4   3612   8684 18604800  1244   864 1178  430  2  5  0 93
  0  1  4  90660   9308  9639600   853   906 1244  578  7  6  0 87
  0  1  4  72280   9816 11236800   830   854 1278  429 12  5  0 83
  1  0  4  52488  10296 13056000   935   861 1178  418  1  6  0 94
  0  1  4  30500  10788 14977600   977   858 1178  371  0  6  0 94
  0  1  4   9792  11244 16785600   918  1394 1182  350  1  5  0 94
  0  1  4   4016  11216 17250400  1017   858 1181  382  1  6  0 94
  0  1  4   3660  11484 17148400   966   861 1182  410  1  6  0 94

 It never finished, as I had no patience to copy about 900 Mb with this rate.

 As it's a git tree, I suppose it's heavily fragmented, but this is still 
 rather
 pathetic. Should I blame cp, or is something else wrong? Any ideas how
 to figure this one out would be appreciated. Sorry for the off-topicness.

 Do things improve if you change the io scheduler to deadline?

   # echo deadline  /sys/block/sda/queue/scheduler

I also tried noop, anticipatory, and now deadline, but it doesn't matter.

 Also worth looking at is the following bug entry.

   http://bugzilla.kernel.org/show_bug.cgi?id=7372

 There seems to be weird interaction among the scheduler / VM / IO.  The
 exact cause is still not verified.  :-(

I know, I posted a bugreport too, but for starvation with the anticipatory 
scheduler.

Anyway, that bug seems unrelated to my case, as they have system 
unresponsiveness
or other nastiness, while I only have a crawling cp, which I blame on weaknesses
within ext3, a badly designed cp program and very fragmented filesystem. I just 
need
to verify that, somehow. I'll try with older kernels later.

Greetings,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: sd_resume redundant? [was: [PATCH] libata: implement ata_wait_after_reset()]

2007-05-20 Thread Indan Zupancic
On Sun, May 20, 2007 11:54, Tejun Heo wrote:
 Indan Zupancic wrote:
 On Sat, May 19, 2007 21:04, Tejun Heo wrote:
 Tejun Heo wrote:
 Yeah, if SCR registers are accessible, 0xff doesn't indicate the device
 isn't there, so the whole skip-0xff logic probably shouldn't apply in
 such cases, but we can also achieve pretty good result by just making
 the first reset tries a bit more aggressive.
 So, here's the patch.

 Paul, can you please test this patch without the previous patch?  Indan,
 this should reduce the resume delay.  Please test.  But you'll still
 feel some added delay compared to 2.6.20 due to the mentioned
 suspend/resume change.

 This removed the COMRESET errors indeed, and with sd_resume()
 disabled everything is speedy again (2s or so. Still a desktop pc).
 I didn't try with sd_resume enabled.

 Can you try to measure with sd_resume in place?

[2.173366] sd 0:0:0:0: [sda] Starting disk
[2.475422] ata2: SATA link down (SStatus 0 SControl 310)
[5.478403] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[5.481928] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.485904] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.485913] ata1.00: configured for UDMA/100
[5.505109] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
[5.505461] sd 0:0:0:0: [sda] Write Protect is off
[5.505465] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[5.505612] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, 
doesn't support DPO or
FUA
...
[6.157259] Restarting tasks ... done.


And with echo 0  /sys/class/scsi_disk/0\:0\:0\:0/manage_start_stop:

[2.476476] ata2: SATA link down (SStatus 0 SControl 310)
...
[2.825479] Restarting tasks ... done.
...
[5.022076] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[5.025605] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.028594] ata1.00: ata_hpa_resize 1: sectors = 234441648, hpa_sectors = 
234441648
[5.028606] ata1.00: configured for UDMA/100
[5.028720] sd 0:0:0:0: [sda] 234441648 512-byte hardware sectors (120034 MB)
[5.028767] sd 0:0:0:0: [sda] Write Protect is off
[5.028773] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[5.028831] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, 
doesn't support DPO or
FUA

So over all it takes half a second longer to detect the disk, but because 
everything waits on it,
it takes more than three seconds longer to resume.

Setting manage_start_stop to 0 fixes it and is good enough for me, I didn't 
notice anything bad yet
because of the unmanaged stop. Implementing background spin up will fix it too.


 Everything seems to work fine without sd_resume(), so why is it needed?

 Because not all disks spin up without being told to do so and like it or
 not spinning disks up on resume is the default behavior.  As I wrote in
 the other reply, it would be worthwhile to make it configurable.

Not even after they receive a read command? Ugh.

Greeting,

Indan


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] libata: implement ata_wait_after_reset()

2007-05-20 Thread Indan Zupancic
Hello Tejun,

On Sun, May 20, 2007 19:09, Tejun Heo wrote:
 Indan Zupancic wrote:
 Don't controllers support spread spin up of disks, to avoid the peak load?
 I think I saw something about that in the SiI 3512 spec I downloaded. So
 maybe something like a special spin up method which can be implemented
 by drivers, and if it isn't, the current start stop thing is used?

 What they support is allowing drivers to control spin up, and it needs
 support from the HDD (many seem to do these days)  BIOS (so that it
 doesn't spin up the disks during POST) too.  To do that properly, we
 need a repository which distributes spinup tokens such that we can do
 parallel, not thundering herd, spinups.

More and more do I get the impression that with full hardware control for Linux,
from BIOS to firmware, great things could be done. Alas.


 The deadline table sets limits to the maximum time a try can take 
 when the next attempt is made.  So, the initial deadline of 10 secs
 means that the first try is allowed to take upto 10secs and no matter
 when or why it fails, the next attempt is made after 10sec is passed.  I
 would love to bang devices harder but a lot of creepy device are out
 there and I'm pretty sure some of them would crap themselves if pushed
 too hard.  :-(

Well, if it's going to be retried later on, then it's not really a deadline, is 
it?
Perhaps it's better to give up after the deadline, and only try again after an
interrupt is received from the controller?


 Anyways, in your case, the problem was that sata_sil wasted the first
 try for no good reason.  I'm not determined whether we need more
 aggressive deadlines for early tries for cases like this.  It's
 something to think about.

The annoying part is that all this seems to depend on the harddisk, and not
really on the controller (or the combination of the two).


 Bugger. So it seems like a good idea to do the reset and spinup async 
 together.

 There are a lot of cases.  Depending on various configuration and whims,
 some drives spin up when power is reapplied and don't respond to reset
 till it's ready.  Some drives don't spin up when power is reapplied but
 spin up when PHY is woken up and don't respond to reset till it's ready.
  Yet others just sit quiet and respond to reset nicely but don't spin up
 till told to do so.

 So, well, I agree that the best strategy is doing it all asynchronously.

Another one would be to do it interrupt driven instead of doing polling,
if the controller supports it well.


 The whole resetting, or just the retry after 10 seconds? But it becomes 
 clear now that
 you're right that the spinup problem is solved if sd_resume does background 
 spin up.

 The whole resetting.  ATA controller/port resume itself is completely
 asynchronous.  The resume method just kicks EH thread and tells it to
 wake the thing up and returns.  The EH thread asynchronously resets the
 port, revalidates all the devices and see if any new ones are attached, etc.

 The culprit is that while EH thread is active all commands are deferred,
 so the START_STOP command sd_resume() issues waits in the SCSI midlayer
 queue till EH is complete.

Are you saying that right now there is only one EH thread? In that case an
improvent would be to have one EH thread per controller/device, so that at
least independent scsi devices can't block each other.

  That's why doing it asynchronously is
 tricky.  I'm not sure whether we would be able to do it without
 separating libata from SCSI so that libata has its own suspend/resume
 functions.  :-(

Well, it seems you just moved the other direction, with leaning on sd_resume.


 Thanks for the pointer, I'll fiddle with it. What do you mean with proper 
 here?
 Not at all, or in a forced way because power goes away?

 Well, basically, your drive isn't told that the power is gonna go off.
 Power just goes off and your drive has to do emergency head unload which
 sometimes causes the funny noise.

Hm, I suspected something like that. Though the shutdown sound of my HD
seems to sound the same as usual. Perhaps the BIOS gives it a sign as part of 
the
suspend? (Or to the controller, which delegates it.)


 Perhaps a silly question, but how can you ever set manage_start_stop to 0 
 before
 the disk is started at bootup? You can change it before doing suspend, but 
 not at
 bootup. So is this option at the right place?

 The problem is sorta complicated because it involves all devices which
 use SCSI midlayer.  libata, USB, all the regular SCSI disks, whatnot, so
 it's really hard to sell to linux-scsi people if you choose a default
 behavior which is specific to libata devices.  What can be done is
 adding a configurable option which doesn't change the default behavior
 and configure it in the low level driver.  This is how manage_start_stop
 is configured.  The default value is zero but libata sets it to one
 while registering each device, so the default value for all libata
 devices is 1 (this should answer your

  1   2   >