[patch 0/3] 2.6.20 fix for PageUptodate memorder problem (try 4)

2007-02-14 Thread Nick Piggin
Various little cleanups and commenting fixes. Fixed up the patchset so
each one, incrementally, should give a properly compiling and running
kernel.

I'd still like Hugh to ack the anon/swap changes when he can find the time.
It would be desirable to get at least one ack as to the overall problem and
design of the fix (Martin's ack is just for the s390 changes at this stage).

Meanwhile, can it go into -mm for wider testing, if it isn't too much
trouble?

Thanks,
Nick

--
SuSE Labs

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[patch 1/3] mm: make read_cache_page synchronous

2007-02-14 Thread Nick Piggin
Ensure pages are uptodate after returning from read_cache_page, which allows
us to cut out most of the filesystem-internal PageUptodate calls.

I didn't look far down the call chains, but this appears to fix 7
possible use-before-uptodate cases in hfs, 2 in hfsplus, 1 in jfs, a few in
ecryptfs, 1 in jffs2, and a possible case of cleared data being overwritten by
readpage in block2mtd. All depend on whether the filler is async and/or can
return with a !uptodate page.

Signed-off-by: Nick Piggin [EMAIL PROTECTED]

 drivers/mtd/devices/block2mtd.c |    3 --
 fs/afs/dir.c                    |    3 --
 fs/afs/mntpt.c                  |   11 +++-
 fs/cramfs/inode.c               |    3 +-
 fs/ecryptfs/mmap.c              |   11
 fs/ext2/dir.c                   |    3 --
 fs/freevxfs/vxfs_subr.c         |    3 --
 fs/minix/dir.c                  |    1
 fs/namei.c                      |   12 -
 fs/nfs/dir.c                    |    5
 fs/nfs/symlink.c                |    6
 fs/ntfs/aops.h                  |    3 --
 fs/ntfs/attrib.c                |   18 +-
 fs/ntfs/file.c                  |    3 --
 fs/ntfs/super.c                 |   30 +++-
 fs/ocfs2/symlink.c              |    7 -
 fs/partitions/check.c           |    3 --
 fs/reiserfs/xattr.c             |    4 ---
 fs/sysv/dir.c                   |   10
 fs/ufs/dir.c                    |    6
 fs/ufs/util.c                   |    6 +---
 include/linux/pagemap.h         |   11
 mm/filemap.c                    |   49 +++-
 mm/swapfile.c                   |    3 --
 24 files changed, 70 insertions(+), 144 deletions(-)

Index: linux-2.6/fs/afs/dir.c
===
--- linux-2.6.orig/fs/afs/dir.c
+++ linux-2.6/fs/afs/dir.c
@@ -187,10 +187,7 @@ static struct page *afs_dir_get_page(str
 
 	page = read_mapping_page(dir->i_mapping, index, NULL);
 	if (!IS_ERR(page)) {
-		wait_on_page_locked(page);
 		kmap(page);
-		if (!PageUptodate(page))
-			goto fail;
 		if (!PageChecked(page))
 			afs_dir_check_page(dir, page);
 		if (PageError(page))
Index: linux-2.6/fs/afs/mntpt.c
===
--- linux-2.6.orig/fs/afs/mntpt.c
+++ linux-2.6/fs/afs/mntpt.c
@@ -77,13 +77,11 @@ int afs_mntpt_check_symlink(struct afs_v
 	}
 
 	ret = -EIO;
-	wait_on_page_locked(page);
-	buf = kmap(page);
-	if (!PageUptodate(page))
-		goto out_free;
 	if (PageError(page))
 		goto out_free;
 
+	buf = kmap(page);
+
 	/* examine the symlink's contents */
 	size = vnode->status.size;
 	_debug("symlink to %*.*s", size, (int) size, buf);
@@ -100,8 +98,8 @@ int afs_mntpt_check_symlink(struct afs_v
 
 	ret = 0;
 
- out_free:
 	kunmap(page);
+ out_free:
 	page_cache_release(page);
  out:
 	_leave(" = %d", ret);
@@ -184,8 +182,7 @@ static struct vfsmount *afs_mntpt_do_aut
 	}
 
 	ret = -EIO;
-	wait_on_page_locked(page);
-	if (!PageUptodate(page) || PageError(page))
+	if (PageError(page))
 		goto error;
 
 	buf = kmap(page);
Index: linux-2.6/fs/cramfs/inode.c
===
--- linux-2.6.orig/fs/cramfs/inode.c
+++ linux-2.6/fs/cramfs/inode.c
@@ -180,7 +180,8 @@ static void *cramfs_read(struct super_bl
 		struct page *page = NULL;
 
 		if (blocknr + i < devsize) {
-			page = read_mapping_page(mapping, blocknr + i, NULL);
+			page = read_mapping_page_async(mapping, blocknr + i,
+									NULL);
 			/* synchronous error? */
 			if (IS_ERR(page))
 				page = NULL;
Index: linux-2.6/fs/ext2/dir.c
===
--- linux-2.6.orig/fs/ext2/dir.c
+++ linux-2.6/fs/ext2/dir.c
@@ -161,10 +161,7 @@ static struct page * ext2_get_page(struc
 	struct address_space *mapping = dir->i_mapping;
 	struct page *page = read_mapping_page(mapping, n, NULL);
 	if (!IS_ERR(page)) {
-		wait_on_page_locked(page);
 		kmap(page);
-		if (!PageUptodate(page))
-			goto fail;
 		if (!PageChecked(page))
 			ext2_check_page(page);
 		if (PageError(page))
Index: linux-2.6/fs/freevxfs/vxfs_subr.c
===
--- linux-2.6.orig/fs/freevxfs/vxfs_subr.c
+++ linux-2.6/fs/freevxfs/vxfs_subr.c
@@ -74,10 +74,7 @@ vxfs_get_page(struct address_space *mapp
pp = read_mapping_page(mapping, n, 

Re: GPL vs non-GPL device drivers

2007-02-14 Thread Neil Brown
On Wednesday February 14, [EMAIL PROTECTED] wrote:
> I am well aware of what Greg KH's position is, in fact he is the reason
> I started the whole rant. This is only a plea to the higher
> authorities. Linus, please save Linux!

Linus is not in any position to do anything.  The die is cast.

You should speak to a lawyer.

The key issue is this:
   Does combining your work with Linux create a derived work?

  If it does not, you have nothing to worry about.
  If it does, then maybe you should worry.

  If someone who owns copyright in part of the Linux kernel that you
  are using decides that they think you have created a derived work,
  then they might bring this to your attention and ask you to abide by
  the conditions in the license under which you obtained the Linux
  kernel.  If no suitable resolution can be found, they might take you
  to court for using their protected work without a valid license (the
  GPL becomes void if you breach its requirements).

  And then the judge might or might not find against you.  But it is
  very hard to know in advance how the judge will decide in a
  particular case.  Hence the best advice is to speak to a lawyer.
  They have the best chance of advising you how to minimise your
  risk.


I hope that makes the situation clear enough.

NeilBrown


[patch 2/3] fs: buffer don't PageUptodate without page locked

2007-02-14 Thread Nick Piggin
__block_write_full_page is calling SetPageUptodate without the page locked.
This is unusual, but not incorrect, as PG_writeback is still set.

However the next patch will require that SetPageUptodate always be called
with the page locked. Simply don't bother setting the page uptodate in this
case (it is unusual that the write path does such a thing anyway). Instead
just leave it to the read side to bring the page uptodate when it notices
that all buffers are uptodate.
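
The read-side check mentioned above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `demo_bh` and `demo_all_buffers_uptodate` are hypothetical names, not kernel API, and the circular list merely imitates the `b_this_page` ring walked by the loop this patch removes from the write path.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for struct buffer_head: the buffers that back
 * one page form a circular singly linked ring.
 */
struct demo_bh {
	bool uptodate;
	struct demo_bh *this_page;
};

/* Return true only if every buffer in the ring starting at head is uptodate. */
static bool demo_all_buffers_uptodate(struct demo_bh *head)
{
	struct demo_bh *bh = head;

	do {
		if (!bh->uptodate)
			return false;
		bh = bh->this_page;
	} while (bh != head);

	return true;
}
```

A reader that finds this returning true could then mark the page uptodate itself, under the page lock, which is the division of labour the changelog argues for.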

Signed-off-by: Nick Piggin [EMAIL PROTECTED]

 fs/buffer.c |   11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

Index: linux-2.6/fs/buffer.c
===
--- linux-2.6.orig/fs/buffer.c
+++ linux-2.6/fs/buffer.c
@@ -1698,17 +1698,8 @@ done:
 		 * clean.  Someone wrote them back by hand with
 		 * ll_rw_block/submit_bh.  A rare case.
 		 */
-		int uptodate = 1;
-		do {
-			if (!buffer_uptodate(bh)) {
-				uptodate = 0;
-				break;
-			}
-			bh = bh->b_this_page;
-		} while (bh != head);
-		if (uptodate)
-			SetPageUptodate(page);
 		end_page_writeback(page);
+
 		/*
 		 * The page and buffer_heads can be released at any time from
 		 * here on.


[patch 3/3] mm: fix PageUptodate memorder

2007-02-14 Thread Nick Piggin
After running SetPageUptodate, the preceding stores to the page contents that
actually bring it uptodate may not be ordered with the store that sets the
page uptodate.

Therefore, another CPU which sees PageUptodate as true and then reads the
page contents can get stale data.

Fix this by ensuring SetPageUptodate is always called with the page locked
(except in the case of a new page that cannot be visible to other CPUs), and
requiring PageUptodate be checked only when the page is locked.

To facilitate lockless checks, SetPageUptodate contains an smp_wmb to order
preceding stores before the store to page flags, and a new PageUptodate_NoLock
is introduced, which issues an smp_rmb after the page flags are loaded for the
test.
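
The barrier pairing described above can be modelled in userspace with C11 atomics. This is a rough sketch, not the patch itself: `demo_page`, the `DEMO_PG_*` bits and both functions are hypothetical names, and the C11 release and acquire fences are assumed stand-ins for smp_wmb and smp_rmb (they are somewhat stronger than the kernel barriers).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_PG_LOCKED   (1u << 0)
#define DEMO_PG_UPTODATE (1u << 1)

/* Hypothetical stand-in for struct page: some data plus a flags word. */
struct demo_page {
	char data[32];
	atomic_uint flags;
};

/* Writer side: the caller fills page->data first, then publishes the flag. */
static void demo_set_page_uptodate(struct demo_page *page)
{
	/* Mirrors the WARN_ON(!PageLocked(page)) added by the patch. */
	assert(atomic_load_explicit(&page->flags, memory_order_relaxed) &
	       DEMO_PG_LOCKED);

	/* Plays the role of smp_wmb(): data stores are ordered before the flag store. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_or_explicit(&page->flags, DEMO_PG_UPTODATE,
				 memory_order_relaxed);
}

/* Reader side, no lock held: test the flag, then order the later data loads. */
static bool demo_page_uptodate_nolock(struct demo_page *page)
{
	bool ret = (atomic_load_explicit(&page->flags, memory_order_relaxed) &
		    DEMO_PG_UPTODATE) != 0;

	/* Plays the role of smp_rmb(): data loads happen after the flag load. */
	if (ret)
		atomic_thread_fence(memory_order_acquire);

	return ret;
}
```

A reader that gets true from `demo_page_uptodate_nolock` may then read `data` and see what the writer stored before publishing; without the two fences that ordering would not be guaranteed on weakly ordered CPUs.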

A DMA memory barrier is not required, because the driver / IO subsystem must
bring that into order before telling the core kernel that the read has
completed.

One thing I like about it is that it unifies the anonymous page handling
with the rest of the page management, by marking anon pages as uptodate
when they _are_ uptodate, rather than when our implementation requires
that they be marked as such. Doing this let me get rid of the smp_wmb's
in the page copying functions which, specially added for anonymous pages
for a closely related issue, didn't quite match file backed page handling.

Convert core code to use PageUptodate_NoLock. Filesystems are unaffected
thanks to the change to read_cache_page.

Signed-off-by: Nick Piggin [EMAIL PROTECTED]
Acked-by: Martin Schwidefsky [EMAIL PROTECTED]

 fs/splice.c                |    4 +--
 include/linux/highmem.h    |    4 ---
 include/linux/page-flags.h |   57 +
 mm/filemap.c               |   20 +++
 mm/hugetlb.c               |    2 +
 mm/memory.c                |    9 +++
 mm/page_io.c               |    2 -
 mm/swap_state.c            |    2 -
 8 files changed, 74 insertions(+), 26 deletions(-)

Index: linux-2.6/include/linux/highmem.h
===
--- linux-2.6.orig/include/linux/highmem.h
+++ linux-2.6/include/linux/highmem.h
@@ -57,8 +57,6 @@ static inline void clear_user_highpage(s
 	void *addr = kmap_atomic(page, KM_USER0);
 	clear_user_page(addr, vaddr, page);
 	kunmap_atomic(addr, KM_USER0);
-	/* Make sure this page is cleared on other CPU's too before using it */
-	smp_wmb();
 }
 
 #ifndef __HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE
@@ -108,8 +106,6 @@ static inline void copy_user_highpage(st
 	copy_user_page(vto, vfrom, vaddr, to);
 	kunmap_atomic(vfrom, KM_USER0);
 	kunmap_atomic(vto, KM_USER1);
-	/* Make sure this page is cleared on other CPU's too before using it */
-	smp_wmb();
 }
 
 #endif
Index: linux-2.6/include/linux/page-flags.h
===
--- linux-2.6.orig/include/linux/page-flags.h
+++ linux-2.6/include/linux/page-flags.h
@@ -126,16 +126,65 @@
 #define ClearPageReferenced(page)	clear_bit(PG_referenced, &(page)->flags)
 #define TestClearPageReferenced(page) test_and_clear_bit(PG_referenced, &(page)->flags)
 
-#define PageUptodate(page)	test_bit(PG_uptodate, &(page)->flags)
-#ifdef CONFIG_S390
+static inline int PageUptodate(struct page *page)
+{
+	WARN_ON(!PageLocked(page));
+	return test_bit(PG_uptodate, &(page)->flags);
+}
+
+/*
+ * PageUptodate to be used when not holding the page lock.
+ */
+static inline int PageUptodate_NoLock(struct page *page)
+{
+	int ret = test_bit(PG_uptodate, &(page)->flags);
+
+	/*
+	 * Must ensure that the data we read out of the page is loaded
+	 * _after_ we've loaded page->flags and found that it is uptodate.
+	 * See SetPageUptodate() for the other side of the story.
+	 */
+	if (ret)
+		smp_rmb();
+
+	return ret;
+}
+
 static inline void SetPageUptodate(struct page *page)
 {
+	WARN_ON(!PageLocked(page));
+#ifdef CONFIG_S390
 	if (!test_and_set_bit(PG_uptodate, &page->flags))
 		page_test_and_clear_dirty(page);
-}
 #else
-#define SetPageUptodate(page)	set_bit(PG_uptodate, &(page)->flags)
+	/*
+	 * Memory barrier must be issued before setting the PG_uptodate bit,
+	 * so all previous writes that served to bring the page uptodate are
+	 * visible before PageUptodate becomes true.
+	 *
+	 * S390 is guaranteed to have a barrier in the test_and_set operation
+	 * (see Documentation/atomic_ops.txt).
+	 *
+	 * This memory barrier should not need to provide ordering against
+	 * DMA writes into the page, because the IO completion should really
+	 * be doing that.
+	 */
+	smp_wmb();
+	set_bit(PG_uptodate, &(page)->flags);
 #endif
+}
+
+static inline void SetNewPageUptodate(struct page *page)
+{
+   /*
+* S390 sets page dirty bit on IO operations, which is why it is
+* cleared in 

[PATCH] fix mempolicy's check on a system with memory-less-node take4

2007-02-14 Thread KAMEZAWA Hiroyuki

please ack if O.K.
-Kame
--
bind_zonelist() can create a zero-length zonelist if there is a
memory-less node. This patch checks the length of the zonelist and
returns -EINVAL if it is 0.
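
The error-pointer convention the patch relies on can be sketched in userspace C. This is an illustration under assumptions: the `demo_*` helpers are hypothetical re-implementations of the idea behind ERR_PTR, PTR_ERR and IS_ERR (the 4095 bound is assumed here, not taken from this patch), and `demo_build_list` is a made-up stand-in for bind_zonelist.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in the top, never-valid range of pointers. */
static inline void *demo_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static inline long demo_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/*
 * Hypothetical stand-in for bind_zonelist(): build a NULL-terminated
 * array of pointers to count items, or return an encoded error.
 */
static int **demo_build_list(int *items, size_t count)
{
	int **list = malloc(sizeof(*list) * (count + 1));
	size_t i;

	if (!list)
		return demo_err_ptr(-ENOMEM);
	if (count == 0) {
		/* An empty list is rejected rather than returned, as in the patch. */
		free(list);
		return demo_err_ptr(-EINVAL);
	}
	for (i = 0; i < count; i++)
		list[i] = &items[i];
	list[count] = NULL;
	return list;
}
```

The point of returning an encoded error instead of NULL is that the caller can tell "out of memory" apart from "nothing to bind to" and pass the right code back up.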

Changelog: v3 -> v4:
- renamed a temporary void * variable to error_code
Changelog: v2 -> v3:
- removed ambiguous void * pointer usage.
- fixed warnings caused by misuse of PTR_ERR.
Changelog: v1 -> v2:
- avoid extra pgdat scanning; it is not necessary.

tested on ia64/NUMA with memory-less-node.

Signed-Off-By: KAMEZAWA Hiroyuki [EMAIL PROTECTED]


Index: linux-2.6.20/mm/mempolicy.c
===
--- linux-2.6.20.orig/mm/mempolicy.c2007-02-13 15:14:13.0 +0900
+++ linux-2.6.20/mm/mempolicy.c 2007-02-15 16:11:17.0 +0900
@@ -144,7 +144,7 @@
 	max++;	/* space for zlcache_ptr (see mmzone.h) */
 	zl = kmalloc(sizeof(struct zone *) * max, GFP_KERNEL);
 	if (!zl)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 	zl->zlcache_ptr = NULL;
 	num = 0;
 	/* First put in the highest zones from all nodes, then all the next 
@@ -162,6 +162,10 @@
 			break;
 		k--;
 	}
+	if (num == 0) {
+		kfree(zl);
+		return ERR_PTR(-EINVAL);
+	}
 	zl->zones[num] = NULL;
 	return zl;
 }
@@ -193,9 +197,10 @@
 		break;
 	case MPOL_BIND:
 		policy->v.zonelist = bind_zonelist(nodes);
-		if (policy->v.zonelist == NULL) {
+		if (IS_ERR(policy->v.zonelist)) {
+			void *error_code = policy->v.zonelist;
 			kmem_cache_free(policy_cache, policy);
-			return ERR_PTR(-ENOMEM);
+			return error_code;
 		}
 		break;
 	}
@@ -1667,7 +1672,7 @@
 		 * then zonelist_policy() will FALL THROUGH to MPOL_DEFAULT.
 		 */
 
-		if (zonelist) {
+		if (!IS_ERR(zonelist)) {
 			/* Good - got mem - substitute new zonelist */
 			kfree(pol->v.zonelist);
 			pol->v.zonelist = zonelist;



Re: GPL vs non-GPL device drivers

2007-02-14 Thread Dave Jones
On Wed, Feb 14, 2007 at 10:46:13PM -0800, v j wrote:
> You don't get it do you. Our source code is meaningless to the Open
> Source community at large.

Linux supports entire _architectures_ of which there are single figures
of people using it.  What makes your hardware special ?

> We are only _using_ Linux.

If you're adding kernel modules, you're more than using Linux, you're
developing _for_ linux.  You're just choosing to keep the fruits of
those labors to yourself.

> Just as we could have used VxWorks or OSE.

You could.  But would you have had access to thousands of worldwide
contributors making your code better?
This is what you've missed out on with your current stance.

> Using our source code would not benefit anybody but
> our competitors.

This excuse has been given time and time again, and repeatedly been
proven false.  And as soon as one of your competitors makes their
drivers open, guess which one gets 1000+ free developers working
on their code ?

> Sure we could make our drivers open-source. This is a
> decision that is made FIRST when evaluating an OS. If we were
> required to make our drivers/HW open, we would just not have chosen
> Linux. It is as simple as that.

Please, revisit the 1990s. Read the cathedral and the bazaar.[1]
Listen to MC Hammer.   Realise the funky horror.  Then when you're ready
to revisit us with some points that haven't already been dismissed
please post again. Until then, you're offering nothing new.

Dave

[1] Jesus, I'm recommending ESR texts, I must be desperate.

-- 
http://www.codemonkey.org.uk


Re: GPL vs non-GPL device drivers

2007-02-14 Thread Arjan van de Ven
On Wed, 2007-02-14 at 21:16 -0800, v j wrote:
> This is in reference to the following thread:
>
> http://lkml.org/lkml/2006/12/14/63
>
> I am not sure if this is ever addressed in LKML, but linux is _very_
> popular in the embedded space. We (an embedded vendor) chose Linux 3
> years back because of its lack of royalty model, robustness and
> availability of infinite number of open-source tools.


I think you have a bit of a misunderstanding... Linux is not royalty
free. Just the royalty is not in the form of cash, but in the form of
having to give your improvements back to the open source world.

(this is paraphrasing the intent of the GPL basically, you can argue for
hours if drivers are separate or improvements, and I'm not interested in
that debate, it has been debated to death before and only lawyers will
in the end be able to settle that on a case by case basis).

If your mindset is "how much can I take take take without giving back
back back" then personally I think you're sort of acting like a parasite
in this context.




Re: GPL vs non-GPL device drivers

2007-02-14 Thread Nick Piggin

Ben Nizette wrote:
> v j wrote:
>
>> This is in reference to the following thread:
>>
>> http://lkml.org/lkml/2006/12/14/63
>>
>> I am not sure if this is ever addressed in LKML, but linux is _very_
>> popular in the embedded space. We (an embedded vendor) chose Linux 3
>> years back because of its lack of royalty model, robustness and
>> availability of infinite number of open-source tools.
>
> [...]
>
>> However we have a worrying trend here. If at some point it becomes
>> illegal to load our modules into the linux kernel, then it is
>> unacceptable to us. We would have been better off choosing VxWorks or
>> OSE 3 years ago when we made an OS choice. The fact that Linux is
>> becoming more and more closed is very very alarming.
>
> Question to the world here:  Distros make, as a matter of course, a
> series of modifications to the Linux Kernel so that their modules or
> features work.  What stops VJ making a patchset which effectively
> s/EXPORT_SYMBOL_GPL/EXPORT_SYMBOL/g 's the kernel source then
> distributing that under the GPL?  He then supplies his un-GPL'd modules
> to the world which just happen to only run on the modified kernel.  I've
> read the GPL of course (IANAL though) and I can't see what this violates
> except the /spirit/ of the license.  Don't get me wrong, I'm strongly
> against anyone doing what I just mentioned, I believe it to be immoral
> taking someone's GPL'd code and mangling it in such a way.  I speak as
> an embedded developer myself whose company decided that running our code
> under Linux and distributing our code under the GPL was far preferable
> to running closed-source software on a closed-source platform.


The best bet would be to read up on lots of past discussions related to
exactly these kinds of questions, then ask your Lawyer.

Rhetorical question: what stops me from taking somebody's copyrighted
work, stripping the copyrights or falsely claiming to have a license
to redistribute it, then selling it?

--
SUSE Labs, Novell Inc.




Re: GPL vs non-GPL device drivers

2007-02-14 Thread Neil Brown
On Wednesday February 14, [EMAIL PROTECTED] wrote:
> On 2/14/07, Randy Dunlap [EMAIL PROTECTED] wrote:
> > We seem to have different definitions of open and closed.
>
> Open = 3rd party Linux drivers can be loaded. Closed = No third party
> Linux drivers can be loaded.

Loading a driver is not at issue.  Anyone may load a driver.

The issue is when you *distribute* a driver.
If that driver is a derived work of the Linux kernel, then you may
only distribute it under the terms of the GPLv2, which essentially
means that you make the source code available - under the GPLv2 - to
everyone you give the driver to.

How do you know if the driver is a derived work?
 Well, if it uses POSIX syscalls only, it isn't. (You can write USB
 drivers in user-space which do this).

 If it uses symbols exported with EXPORT_SYMBOL_GPL, then the author of the
 code which provides those symbols thinks that the driver is a derived
 work.

 If it uses EXPORT_SYMBOL symbols, then it is less clear what people
 believe, though there are certainly some who believe it will still
 be a derived work.

But of course the person whose opinion really counts is the judge.  So
you need to get legal advice.

NeilBrown


Re: [RFC][PATCH 6/6] automatic tuning applied to some kernel components

2007-02-14 Thread Eric W. Biederman
Nadia Derbey [EMAIL PROTECTED] writes:

> But, what do you do with Oracle that's asking maxfiles to be set to 0x1,
> while the default value might be enough for a system that's not running
> Oracle.
> I'm afraid that giving boot time values to the max_* tunables we will lose
> all the benefits from /proc (or /sys): it is impossible to anticipate what
> an OS will be used for. So allowing such things to be changed without
> having to reboot the machine is in my mind quite a powerful feature we
> should keep taking advantage of.

I'm not saying remove user space's ability to set the
denial-of-service limits.  I'm saying if they need to be frequently
changed we need to update the default so they are higher by default.

There really is no cost in moving those values up and down; it is just
an arbitrary integer used in comparisons.  But if we can make a good
guess that still catches runaway programs before they kill the machine
but also allows more programs to work out of the box we are in better
shape.

Eric


Re: GPL vs non-GPL device drivers

2007-02-14 Thread Trent Waddington

On 2/15/07, Neil Brown [EMAIL PROTECTED] wrote:

> [..] then it is less clear what people believe


Another area where it is less clear what people believe is if you are
distributing the module separately to the kernel, but, as I understand
it, vj says he is not.


> But of course the person whose opinion really counts is the judge.


The judge's opinion only counts if you actually get to court and
manage to put up a legal defense.


> So you need to get legal advice.


Or, ya know, you could take the moral/ethical advice that you're being
a worm and stop now.

Trent


Re: GPL vs non-GPL device drivers

2007-02-14 Thread Neil Brown
On Wednesday February 14, [EMAIL PROTECTED] wrote:
> You don't get it do you. Our source code is meaningless to the Open
> Source community at large. It is only useful to our tiny set of
> competitors that have nothing to do with Linux. The Embedded space is
> very specific. We are only _using_ Linux. Just as we could have used
> VxWorks or OSE. Using our source code would not benefit anybody but
> our competitors.

It would also benefit your *customers*.  And you might find that
providing such benefits increases the number of your customers.

NeilBrown


Re: [ANNOUNCE] GIT 1.5.0

2007-02-14 Thread Shawn O. Pearce
Jakub Narebski [EMAIL PROTECTED] wrote:
> Junio C Hamano wrote:
>
> >  - git-blame learned a new option, --incremental, that tells it
> >    to output the blames as they are assigned.  A sample script
> >    to use it is also included as contrib/blameview.
>
> And there are example GUI blameview (Perl GTK2), and example Emacs module
> for incremental git-blame, both in contrib/ area.

Not to mention the incremental blame viewer built into git-gui:

git gui blame HEAD foo.c

-- 
Shawn.


Re: [PATCH 0/6] MODSIGN: Kernel module signing

2007-02-14 Thread Andreas Gruenbacher
On Wednesday 14 February 2007 20:13, Dave Jones wrote:
> I've not investigated it, but I hear rumours that suse has something
> similar.

Actually, no. We don't believe that module signing adds significant value, and
it also doesn't work well with external modules. (The external modules we
really care about are GPL ones; it gives us a way to update drivers without
pushing out entirely new kernels.)

Cheers,
Andreas


Re: [PATCH 0/6] MODSIGN: Kernel module signing

2007-02-14 Thread Dave Jones
On Wed, Feb 14, 2007 at 09:35:40PM -0800, Andreas Gruenbacher wrote:
> On Wednesday 14 February 2007 20:13, Dave Jones wrote:
> > I've not investigated it, but I hear rumours that suse has something
> > similar.
>
> Actually, no. We don't believe that module signing adds significant value,

ok, then I was misinformed.

> and it also doesn't work well with external modules.

well, the situation for external modules is no worse than usual.
They still work, they just aren't signed. Which from a distributor point
of view, is actually a nice thing, as they stick out like a sore thumb
in oops reports with (U) markers :)

> (The external modules we really care about are GPL ones; it gives us a way
> to update drivers without pushing out entirely new kernels.)

external modules still compile, and run just fine. The signed modules code
doesn't prevent loading of them unless the user decides to do so with
a special boot option (which is no different really than say, reducing
the cap-bound sysctl to prevent module loading).

Dave

-- 
http://www.codemonkey.org.uk


[PATCH 0/6] atl1: bugfix, cleanup, enhancement

2007-02-14 Thread Jay Cliburn
Jeff,

Please accept the following patchset for the atl1 network device driver.

* Drop unnecessary NET_PCI config
* Fix incorrect hash table address
* Read MAC address from register
* Remove unused define
* Add Attansic L1 device id to pci_ids
* Bump version number

This patchset contains changes to the following files.

 drivers/net/Kconfig          |    2 +-
 drivers/net/atl1/atl1_hw.c   |   37 +
 drivers/net/atl1/atl1_main.c |    5 ++---
 include/linux/pci_ids.h      |    1 +
 4 files changed, 25 insertions(+), 20 deletions(-)



[PATCH 1/6] atl1: drop NET_PCI from Kconfig

2007-02-14 Thread Jay Cliburn
From: Jay Cliburn [EMAIL PROTECTED]

The atl1 driver doesn't need NET_PCI.  Remove it from Kconfig.
Noticed by Chad Sprouse.

Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
Signed-off-by: Chris Snook [EMAIL PROTECTED]
---

 drivers/net/Kconfig |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index 0bb3c1e..1b624b4 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -2350,7 +2350,7 @@ config QLA3XXX
 
 config ATL1
 	tristate "Attansic L1 Gigabit Ethernet support (EXPERIMENTAL)"
-	depends on NET_PCI && PCI && EXPERIMENTAL
+	depends on PCI && EXPERIMENTAL
 	select CRC32
 	select MII
 	help


[PATCH 2/6] atl1: fix bad ioread address

2007-02-14 Thread Jay Cliburn
From: Al Viro [EMAIL PROTECTED]

An ioread32 statement reads the wrong address.  Fix it.
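
The address arithmetic at issue can be sketched in userspace C with a byte array standing in for the device's register window. Everything named `demo_*` or `DEMO_*` is hypothetical, including the register offset; only the shift-and-mask steps follow the function this patch touches.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_REG_RX_HASH_TABLE 0x10	/* hypothetical offset, not the real one */

struct demo_hw {
	uint8_t *hw_addr;	/* base of the simulated register window */
};

static uint32_t demo_read32(const uint8_t *addr)
{
	uint32_t v;

	memcpy(&v, addr, sizeof(v));
	return v;
}

static void demo_write32(uint32_t v, uint8_t *addr)
{
	memcpy(addr, &v, sizeof(v));
}

/* Set one bit in the two-word hash table, selected by the top bits of the hash. */
static void demo_hash_set(struct demo_hw *hw, uint32_t hash_value)
{
	uint32_t hash_reg = (hash_value >> 31) & 0x1;	/* which 32-bit word */
	uint32_t hash_bit = (hash_value >> 26) & 0x1F;	/* which bit in that word */
	/* Offsets are relative to hw->hw_addr, not to the struct pointer hw. */
	uint8_t *reg = hw->hw_addr + DEMO_REG_RX_HASH_TABLE + (hash_reg << 2);
	uint32_t mta = demo_read32(reg);

	mta |= UINT32_C(1) << hash_bit;
	demo_write32(mta, reg);
}
```

Using `hw` itself as the base, as the old read did, would offset from the driver's private structure rather than from the mapped registers, so the read and the write would touch different locations.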

Signed-off-by: Al Viro [EMAIL PROTECTED]
Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
Signed-off-by: Chris Snook [EMAIL PROTECTED]
---

 drivers/net/atl1/atl1_hw.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/atl1/atl1_hw.c b/drivers/net/atl1/atl1_hw.c
index 08b2d78..e28707a 100644
--- a/drivers/net/atl1/atl1_hw.c
+++ b/drivers/net/atl1/atl1_hw.c
@@ -357,7 +357,7 @@ void atl1_hash_set(struct atl1_hw *hw, u32 hash_value)
 	 */
 	hash_reg = (hash_value >> 31) & 0x1;
 	hash_bit = (hash_value >> 26) & 0x1F;
-	mta = ioread32((hw + REG_RX_HASH_TABLE) + (hash_reg << 2));
+	mta = ioread32((hw->hw_addr + REG_RX_HASH_TABLE) + (hash_reg << 2));
 	mta |= (1 << hash_bit);
 	iowrite32(mta, (hw->hw_addr + REG_RX_HASH_TABLE) + (hash_reg << 2));
 }


[PATCH 3/6] atl1: read MAC address from register

2007-02-14 Thread Jay Cliburn
From: Jay Cliburn [EMAIL PROTECTED]

On some Asus motherboards containing the L1 NIC, the MAC address is
written by the BIOS directly to the MAC register during POST, and is
not stored in eeprom.  If we don't succeed in fetching the MAC address
from eeprom or spi, try reading it directly from the MAC register.
Suggested by Xiong Huang.

And do some cleanup while we've got the hood up...
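
The fallback order described above (eeprom, then spi, then the MAC register) amounts to "take the first source that yields a valid address". A minimal userspace sketch, assuming the candidate addresses have already been read into an array; the `demo_*` names are hypothetical, and the validity test follows the usual rule of rejecting all-zero and multicast addresses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Valid means: not all zero and not a multicast (or broadcast) address. */
static bool demo_is_valid_ether_addr(const uint8_t *addr)
{
	static const uint8_t zero[DEMO_ETH_ALEN];

	return !(addr[0] & 0x01) && memcmp(addr, zero, DEMO_ETH_ALEN) != 0;
}

/*
 * Try each candidate source in order and copy out the first valid one.
 * Returns 0 on success and 1 if no source held a valid address, the same
 * convention as the patched function.
 */
static int demo_get_permanent_address(const uint8_t (*sources)[DEMO_ETH_ALEN],
				      size_t nsources, uint8_t *out)
{
	size_t i;

	for (i = 0; i < nsources; i++) {
		if (demo_is_valid_ether_addr(sources[i])) {
			memcpy(out, sources[i], DEMO_ETH_ALEN);
			return 0;
		}
	}
	return 1;
}
```

In the driver each step performs its own hardware read before the validity check; collapsing them into an array here only keeps the sketch short.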

Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
Signed-off-by: Chris Snook [EMAIL PROTECTED]
---

 drivers/net/atl1/atl1_hw.c |   35 ---
 1 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/drivers/net/atl1/atl1_hw.c b/drivers/net/atl1/atl1_hw.c
index e28707a..314dbaa 100644
--- a/drivers/net/atl1/atl1_hw.c
+++ b/drivers/net/atl1/atl1_hw.c
@@ -243,14 +243,8 @@ static int atl1_get_permanent_address(struct atl1_hw *hw)
i += 4;
}
 
-/*
- * The following 2 lines are the Attansic originals.  Saving for posterity.
- * *(u32 *)  eth_addr[2] = LONGSWAP(addr[0]);
- * *(u16 *)  eth_addr[0] = SHORTSWAP(*(u16 *)  addr[1]);
- */
-   *(u32 *)  eth_addr[2] = swab32(addr[0]);
-   *(u16 *)  eth_addr[0] = swab16(*(u16 *)  addr[1]);
-
+   *(u32 *) eth_addr[2] = swab32(addr[0]);
+   *(u16 *) eth_addr[0] = swab16(*(u16 *) addr[1]);
if (is_valid_ether_addr(eth_addr)) {
memcpy(hw-perm_mac_addr, eth_addr, ETH_ALEN);
return 0;
@@ -281,17 +275,28 @@ static int atl1_get_permanent_address(struct atl1_hw *hw)
i += 4;
}
 
-/*
- * The following 2 lines are the Attansic originals.  Saving for posterity.
- * *(u32 *)  eth_addr[2] = LONGSWAP(addr[0]);
- * *(u16 *)  eth_addr[0] = SHORTSWAP(*(u16 *)  addr[1]);
- */
-   *(u32 *)  eth_addr[2] = swab32(addr[0]);
-   *(u16 *)  eth_addr[0] = swab16(*(u16 *)  addr[1]);
+   *(u32 *) eth_addr[2] = swab32(addr[0]);
+   *(u16 *) eth_addr[0] = swab16(*(u16 *) addr[1]);
if (is_valid_ether_addr(eth_addr)) {
memcpy(hw-perm_mac_addr, eth_addr, ETH_ALEN);
return 0;
}
+
+   /*
+* On some motherboards, the MAC address is written by the
+* BIOS directly to the MAC register during POST, and is
+* not stored in eeprom.  If all else thus far has failed
+* to fetch the permanent MAC address, try reading it directly.
+*/
+   addr[0] = ioread32(hw->hw_addr + REG_MAC_STA_ADDR);
+   addr[1] = ioread16(hw->hw_addr + (REG_MAC_STA_ADDR + 4));
+   *(u32 *) &eth_addr[2] = swab32(addr[0]);
+   *(u16 *) &eth_addr[0] = swab16(*(u16 *) &addr[1]);
+   if (is_valid_ether_addr(eth_addr)) {
+   memcpy(hw->perm_mac_addr, eth_addr, ETH_ALEN);
+   return 0;
+   }
+
return 1;
 }
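
For readers following the byte shuffling in the fallback path, here is a minimal user-space sketch of the same idea. It assumes the register layout implied by the swab32()/swab16() calls on a little-endian host: the 32-bit read carries address bytes 2..5 and the 16-bit read carries bytes 0..1, most significant byte first. The names mac_from_regs() and mac_is_valid() are hypothetical and not part of the driver; the validity check only mirrors what is_valid_ether_addr() tests (not multicast, not all zeros).

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/*
 * Hypothetical helper: rebuild the 6-byte address from the two register
 * values. Shifts are used instead of pointer casts so the result does not
 * depend on host endianness.
 */
static void mac_from_regs(uint32_t lo, uint16_t hi, uint8_t mac[ETH_ALEN])
{
	mac[0] = (uint8_t)(hi >> 8);
	mac[1] = (uint8_t)(hi & 0xff);
	mac[2] = (uint8_t)(lo >> 24);
	mac[3] = (uint8_t)(lo >> 16);
	mac[4] = (uint8_t)(lo >> 8);
	mac[5] = (uint8_t)(lo & 0xff);
}

/* Reject multicast addresses (low bit of byte 0 set) and all zeros. */
static int mac_is_valid(const uint8_t mac[ETH_ALEN])
{
	static const uint8_t zero[ETH_ALEN];

	if (mac[0] & 0x01)
		return 0;
	return memcmp(mac, zero, ETH_ALEN) != 0;
}
```

With register values 0x1a2b3c4d and 0x0011 this yields 00:11:1a:2b:3c:4d, which passes the check; a register pair that still reads as zero would fall through to the final "return 1" in the patch.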
 


[PATCH 4/6] atl1: remove unused define

2007-02-14 Thread Jay Cliburn
From: Chris Snook [EMAIL PROTECTED]

Remove unused define from atl1_main.c.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
---

 drivers/net/atl1/atl1_main.c |1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c
index 6655640..abce97e 100644
--- a/drivers/net/atl1/atl1_main.c
+++ b/drivers/net/atl1/atl1_main.c
@@ -82,7 +82,6 @@
 
 #include "atl1.h"
 
-#define RUN_REALTIME 0
 #define DRIVER_VERSION "2.0.6"
 
 char atl1_driver_name[] = "atl1";


[PATCH 5/6] atl1: add L1 device id to pci_ids, then use it

2007-02-14 Thread Jay Cliburn
From: Chris Snook [EMAIL PROTECTED]

Add device id for the Attansic L1 chip to pci_ids.h, then use it.

Signed-off-by: Chris Snook [EMAIL PROTECTED]
Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
---

 drivers/net/atl1/atl1_main.c |2 +-
 include/linux/pci_ids.h  |1 +
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c
index abce97e..09f3375 100644
--- a/drivers/net/atl1/atl1_main.c
+++ b/drivers/net/atl1/atl1_main.c
@@ -99,7 +99,7 @@ MODULE_VERSION(DRIVER_VERSION);
  * atl1_pci_tbl - PCI Device ID Table
  */
 static const struct pci_device_id atl1_pci_tbl[] = {
-   {PCI_DEVICE(PCI_VENDOR_ID_ATTANSIC, 0x1048)},
+   {PCI_DEVICE(PCI_VENDOR_ID_ATTANSIC, PCI_DEVICE_ID_ATTANSIC_L1)},
/* required last entry */
{0,}
 };
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 68a7be9..bd21933 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2067,6 +2067,7 @@
 #define PCI_DEVICE_ID_TDI_EHCI  0x0101
 
 #define PCI_VENDOR_ID_ATTANSIC 0x1969
+#define PCI_DEVICE_ID_ATTANSIC_L1  0x1048
 
 #define PCI_VENDOR_ID_JMICRON  0x197B
 #define PCI_DEVICE_ID_JMICRON_JMB360   0x2360


[PATCH 6/6] atl1: bump version number

2007-02-14 Thread Jay Cliburn
From: Jay Cliburn [EMAIL PROTECTED]

Bump the version number.

Signed-off-by: Jay Cliburn [EMAIL PROTECTED]
---

 drivers/net/atl1/atl1_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/atl1/atl1_main.c b/drivers/net/atl1/atl1_main.c
index 09f3375..6567348 100644
--- a/drivers/net/atl1/atl1_main.c
+++ b/drivers/net/atl1/atl1_main.c
@@ -82,7 +82,7 @@
 
 #include "atl1.h"
 
-#define DRIVER_VERSION "2.0.6"
+#define DRIVER_VERSION "2.0.7"
 
 char atl1_driver_name[] = "atl1";
 static const char atl1_driver_string[] = "Attansic L1 Ethernet Network Driver";


Re: [PATCH 0/6] MODSIGN: Kernel module signing

2007-02-14 Thread Andreas Gruenbacher
On Wednesday 14 February 2007 21:45, Dave Jones wrote:
> well, the situation for external modules is no worse than usual.
> They still work, they just aren't signed. Which from a distributor point
> of view, is actually a nice thing, as they stick out like a sore thumb
> in oops reports with (U) markers :)

I agree, that's really what should happen. We solve this by marking modules as 
supported, partner supported, or unsupported, but in an insecure way, so 
partners and users could try to fake the support status of a module and/or 
remove status flags from Oopses, and cryptography wouldn't save us. We could 
try to sign Oopses which I guess you guys are doing. This whole issue hasn't 
been a serious problem in the past though, and we generally try to trust 
users not to play games on us.

In the end, it all seems to boil down to a difference in philosophy.

Thanks,
Andreas


Re: [PATCH 0/6] MODSIGN: Kernel module signing

2007-02-14 Thread Dave Jones
On Wed, Feb 14, 2007 at 10:14:53PM -0800, Andreas Gruenbacher wrote:
 > On Wednesday 14 February 2007 21:45, Dave Jones wrote:
 > > well, the situation for external modules is no worse than usual.
 > > They still work, they just aren't signed. Which from a distributor point
 > > of view, is actually a nice thing, as they stick out like a sore thumb
 > > in oops reports with (U) markers :)
 > 
 > I agree, that's really what should happen. We solve this by marking modules as
 > supported, partner supported, or unsupported, but in an insecure way, so
 > partners and users could try to fake the support status of a module and/or
 > remove status flags from Oopses, and cryptography wouldn't save us. We could
 > try to sign Oopses which I guess you guys are doing. This whole issue hasn't
 > been a serious problem in the past though, and we generally try to trust
 > users not to play games on us.

For the most part it works out.  I've had users file oopses where they've
edited out Tainted: P, and left in nvidia(U) for example :-)

Dave

-- 
http://www.codemonkey.org.uk


[PATCH 000/196] V4L/DVB updates

2007-02-14 Thread Mauro Carvalho Chehab
Linus,

Please pull 'master' from:
git://git.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb.git 
master

Basically, this series adds support for a bunch of newer cards and newer
drivers, does some relevant cleanups on cx88 (improving source code
readability and reducing binary code size), adds FM radio support on
pvrusb2, and makes several other fixes and improvements.

A more detailed log:

   - Add support for the ASUS P7131 remote control
   - Add the Composite over S-Video input on the Asus P7131 Dual
   - Update cx2341x documentation.
   - Update cx2341x documentation.
   - Removed unimplemented cx2341x API commands
   - Improve cx2341x documentation
   - Saa7134: add support for the Encore ENL-TV
   - Updated cardlist to reflect the newly added saa7134 board
   - DIB3000MC and NOVA T USB2 #2
   - Cablestar2 support
   - DVB: Remove unneeded void * casts in ttpci/av7110
   - Remove some unused code from kernel mainstream
   - Add support for more Encore TV cards
   - DVB: fix compile error
   - Make usbvision_rvfree() static
   - MAINTAINERS: tag pvrusb2 list as subscribers-only
   - Pvrusb2-hdw kfree cleanup
   - Cpia module_put cleanup
   - Tvmixer module_put cleanup
   - Cleanup: switch to using msecs_to_jiffies() on bttv
   - Improves some USBVision info messages
   - Bt8xx: add support for Ultraview DVB-T Lite
   - SN9C102 driver updates
   - ZC0301 driver updates.
   - ET61X251 driver updates.
   - Fix authorship references
   - Budget-ci: add support for the Technotrend 1500 bundled remote
   - Fix OOPS on some waitqueue conditions
   - Some fixes at stream waitqueue on vivi
   - Pvrusb2: Enable radio mode round #1
   - Pvrusb2: Enable radio mode round #2
   - Pvrusb2: Fix for min/max control value checking
   - Pvrusb2: Implement multiple minor device number handling
   - Pvrusb2: Implement stream claim checking function
   - Pvrusb2: Implement /dev/radioX
   - Pvrusb2: Use enumeration for minor number get / store code
   - Pvrusb2: Use separate enumeration for get/store of minor number
   - Pvrusb2: Make units uniform when tracking tuning frequency
   - Pvrusb2: video standard broadcast fix for radio mode
   - Pvrusb2: Allow overriding vbi and radio device minor numbers
   - Pvrusb2: Fix heap corruption introduced by radio mods
   - Pvrusb2: Fix tuner frequency calculation
   - Pvrusb2: Fix tuning calculation when in radio mode
   - Pvrusb2: v4l2 API implementation frequency tweaks
   - Pvrusb2: Enable radio mode for 24xxx devices
   - Pvrusb2: Newer frequency range checking
   - Pvrusb2: Better radio versus tv frequency handling
   - Pvrusb2: Remove stream claiming hack from /dev/radio
   - Pvrusb2: Change default volume to something sane
   - Pvrusb2: cosmetic comment tweak
   - Pvrusb2: Fix cut/paste bug in auto_mode_switch control
   - Pvrusb2: Stream configuration cleanups
   - Pvrusb2: bug fix involving switch into radio mode
   - Pvrusb2: Be smarter about mode restoration
   - Cpia.c: buffer overflow
   - Bttv cropping support
   - Pvrusb2: It's safe to kfree() a null pointer
   - Pvrusb2: Use kzalloc instead of kmalloc+memset pairs
   - Pvrusb2: Allow streaming from /dev/radioX
   - Pvrusb2: VIDIOC_G_TUNER cleanup
   - Pvrusb2: Slight debug printing efficiency fixup
   - Pvrusb2: Remove automodeswitch control
   - Pvrusb2: Stop hardcoding frequency ranges
   - Pvrusb2: trace print added
   - Pvrusb2: Fix missing break statement on VIDIOC_S_TUNER
   - Pvrusb2: Fix sizeof() calculation foul-up
   - Pvrusb2: Minor dead code / comment cleanups
   - Pvrusb2: V4L EXT_CTRLS fixup
   - Pvrusb2: A patch to use ARRAY_SIZE macro when appropriate
   - Pvrusb2: Use kzalloc in place of kmalloc/memset pairs
   - Pvrusb2: Use ARRAY_SIZE wherever possible
   - Pvrusb2: Emit VIDIOC_S_TUNER correctly
   - Pvrusb2: Introduce fake audio input selection
   - Pvrusb2: Allow VIDIOC_S_FMT with -1 for resolution values
   - Convert cx8800 driver to video_ioctl2 handler
   - Added support for V4L2_STD_NTSC_443
   - Uncommented NTSC/443 video standard
   - Make cx88-blackbird to work again
   - Renamed video_mux to cx88_video_mux
   - make videodev to auto-generate standards
   - Fix vidioc_g_tuner handling
   - Moved several stuff that were at cx88-video to cx88-blackbird.c
   - Reorder some ioctl handlers
   - Do some cleanups at cx88-blackbird
   - Use cx88_set_freq() on cx88-blackbird.c
   - Remove_cx88_ioctl
   - Convert cx88-blackbird to use video_ioctl2
   - Keep the previous tvnorm default for cx88 and cx88-blackbird
   - Saa7134: add support for Terratec Cinergy HT PCI
   - Adds video output routing
   - Cx88: Add support for svideo/composite input of the Terratec
Cinergy 1400 DVB-T
   - Remove some warnings when compiling on x86_64
   - Fix: VIDIOC_G_TUNER were returning an endless number of tuners
   - Various cx2341x documentation updates/fixes.
   - Proper vendor/device ID for the CinergyT2 input device
   - Dvb-usb: Initial support for MSI Mega Sky 580 based on Uli m9206
   - Dvb-usb: 

Re: [PATCH 1/1] Fabric7 VIOC driver source code

2007-02-14 Thread Andrew Morton
On Wed, 07 Feb 2007 13:07:40 -0800 Sriram Chidambaram [EMAIL PROTECTED] wrote:

> This patch provides the Fabric7 VIOC driver source code.
> This git mbox patch is built against
> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/netdev-2.6.git
>
> The patch can be pulled from
> ftp://ftp.fabric7.com/VIOC/Fabric7-VIOC-driver-patch.FEB-07-2007

For people wondering what this is, the documentation file is below.

I'll pull this driver into my queue so that it doesn't get lost and to give
people an opportunity to review it more easily.  From a quick peek, I'd
expect some changes to be needed: stylistic things, plus some suspicious
looking PCI-poking in vioc_irq.c.  But I didn't look at it at all closely.

The driver needed a bit of help to make it compile on ia64 (I haven't tried
any other architectures).  If it's simply not possible that this device
will ever be present on any non-x86 machines then perhaps we should
restrict it to those architectures at kernel configuration time.

But then, all the changes I made were good ones..







Overview
========

A Virtual Input-Output Controller (VIOC) is a PCI device that provides
10Gbps of I/O bandwidth that can be shared by up to 16 virtual network
interfaces (VNICs).  VIOC hardware supports several features such as
large frames, checksum offload, gathered send, MSI/MSI-X, bandwidth
control, interrupt mitigation, etc.

VNICs are provisioned to a host partition via an out-of-band interface
from the System Controller -- typically before the partition boots,
although they can be dynamically added or removed from a running
partition as well.

Each provisioned VNIC appears as an Ethernet netdevice to the host OS,
and maintains its own transmit ring in DMA memory.  VNICs are
configured to share up to 4 of total 16 receive rings and 1 of total
16 receive-completion rings in DMA memory.  VIOC hardware classifies
packets into receive rings based on size, allowing more efficient use
of DMA buffer memory.  The default, and recommended, configuration
uses groups of 'receive sets' (rxsets), each with 3 receive rings, a
receive completion ring, and a VIOC Rx interrupt.  The driver gives
each rxset a NAPI poll handler associated with a phantom (invisible)
netdevice, for concurrency.  VNICs are assigned to rxsets using a
simple modulus.
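
As a rough illustration of the "simple modulus" assignment described above, the mapping could look like the sketch below. The function name and the idea of passing the rxset count as a parameter are assumptions made for the example; the driver source is the authority on how it actually indexes VNICs and rxsets.

```c
/*
 * Hypothetical sketch: pick a receive set for a VNIC by taking the VNIC
 * index modulo the number of configured rxsets. Returns 0 when no rxsets
 * are configured, to avoid dividing by zero.
 */
static unsigned int vnic_to_rxset(unsigned int vnic_id, unsigned int num_rxsets)
{
	return num_rxsets ? vnic_id % num_rxsets : 0;
}
```

With, say, 4 rxsets (an arbitrary number chosen for illustration), VNICs 0..15 would map to rxsets 0, 1, 2, 3, 0, 1, ... so that consecutive VNICs are spread across the NAPI poll handlers.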

VIOC provides 4 interrupts in INTx mode: 2 for Rx, 1 for Tx, and 1 for
out-of-band messages from the System Controller and errors.  VIOC also
provides 19 MSI-X interrupts: 16 for Rx, 1 for Tx, 1 for out-of-band
messages from the System Controller, and 1 for error signalling from
the hardware.  The VIOC driver makes a determination whether MSI-X
functionality is supported and initializes interrupts accordingly.
[Note: The Linux kernel disables MSI-X for VIOCs on modules with AMD
8131, even if the device is on the HT link.]


Module loadable parameters
==========================

- poll_weight (default 8) - the number of received packets that will be
  processed during one call into the NAPI poll handler.

- rx_intr_timeout (default 1) - hardware rx interrupt mitigation
  timer, in units of 5us.

- rx_intr_pkt_cnt (default 64) - hardware rx interrupt mitigation
  counter, in units of packets.

- tx_pkts_per_irq (default 64) - hardware tx interrupt mitigation
  counter, in units of packets.

- tx_pkts_per_bell (default 1) - the number of packets to enqueue on a
  transmit ring before issuing a doorbell to hardware.
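
To make the units concrete, the defaults listed above can be collected in one place as in the sketch below. The struct and helper are hypothetical and exist only for this example; the driver presumably declares these as ordinary module parameters instead.

```c
/* Hypothetical container for the documented module parameters. */
struct vioc_params {
	unsigned int poll_weight;      /* packets per NAPI poll call */
	unsigned int rx_intr_timeout;  /* rx mitigation timer, units of 5us */
	unsigned int rx_intr_pkt_cnt;  /* rx mitigation counter, packets */
	unsigned int tx_pkts_per_irq;  /* tx mitigation counter, packets */
	unsigned int tx_pkts_per_bell; /* packets queued per doorbell */
};

/* Defaults as stated in the parameter list above. */
static const struct vioc_params vioc_defaults = { 8, 1, 64, 64, 1 };

/* Convert the rx mitigation timer from 5us units to microseconds. */
static unsigned int rx_intr_timeout_us(const struct vioc_params *p)
{
	return p->rx_intr_timeout * 5;
}
```

Under these defaults the rx mitigation timer works out to 5 microseconds.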

Performance Tuning
==================

You may want to use the following sysctl settings to improve
performance.  [NOTE: To be re-checked]

# set in /etc/sysctl.conf

net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 0
net.ipv4.tcp_rmem = 1000 1000 1000
net.ipv4.tcp_wmem = 1000 1000 1000
net.ipv4.tcp_mem  = 1000 1000 1000

net.core.rmem_max = 5242879
net.core.wmem_max = 5242879
net.core.rmem_default = 5242879
net.core.wmem_default = 5242879
net.core.optmem_max = 5242879
net.core.netdev_max_backlog = 10

Out-of-band Communications with System Controller
=================================================

System operators can use the out-of-band facility to allow for remote
shutdown or reboot of the host partition.  Upon receiving such a
command, the VIOC driver executes /sbin/reboot or /sbin/shutdown
via call_usermodehelper().

This same communications facility is used for dynamic VNIC
provisioning (plug in and out).

The VIOC driver also registers a callback with
register_reboot_notifier().  When the callback is executed, the driver
records the shutdown event and reason in a VIOC register to notify the
System Controller.





Re: [patch 16/21] Xen-paravirt: Add code into head.S to handle being booted by Xen

2007-02-14 Thread Eric W. Biederman
Jeremy Fitzhardinge [EMAIL PROTECTED] writes:

> Eric W. Biederman wrote:
>> Ok.  If that is all this may be a difference that makes no difference.
>> binutils has a bad habit of looking at sections (which are fully
>> optional) instead of segments on ET_EXEC and ET_DYN objects.  Only
>> ET_REL objects (.o files) are required to have sections.
>>
>
> The Xen domain loader will have to be changed to deal with that, which
> isn't too much of a problem.

Ok.  Please fix the Xen domain loader to not look at sections.  It
is a bug for any kind of executable loader to look at anything other
than segments.

> My main concern is the randomness of it, and whether it will fail in
> some more harmful way on other versions of binutils.

Reasonable and it's probably worth letting the binutils developer know.
I do agree that it is weird.   It might be that something in binutils
doesn't like us dropping some of the notes.

>> So I recommend for testing write a 100 line program that includes
>> elf.h and reads out the note segment.  If all is well we can split
>> this code out.
>>
>
> The Xen readnotes utility is essentially that.  I'll hack it.

Sounds good.
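
For anyone wanting to try the test suggested above, the core of such a program is the walk over the records inside a PT_NOTE segment. The sketch below is only that inner loop, under a few assumptions: the segment bytes have already been read into memory, the local struct mirrors the three 32-bit words of Elf32_Nhdr/Elf64_Nhdr from elf.h, and name and descriptor are each padded to 4 bytes. The function and struct names are made up for the example, and locating the segment through the program headers is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same three 32-bit words as Elf32_Nhdr / Elf64_Nhdr in <elf.h>. */
struct note_hdr {
	uint32_t namesz; /* length of the owner name, including the NUL */
	uint32_t descsz; /* length of the descriptor */
	uint32_t type;   /* owner-defined note type */
};

/* Round up to the 4-byte padding assumed between note fields. */
static size_t note_align(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/*
 * Walk the raw bytes of one note segment and count the well-formed
 * records whose owner name equals `owner`. The walk stops at the first
 * truncated or oversized record.
 */
static int count_notes(const unsigned char *seg, size_t len, const char *owner)
{
	size_t off = 0;
	int found = 0;

	while (len - off >= sizeof(struct note_hdr)) {
		struct note_hdr nh;
		size_t name_off, desc_off, next;

		memcpy(&nh, seg + off, sizeof(nh));
		if (nh.namesz > len || nh.descsz > len)
			break;
		name_off = off + sizeof(nh);
		desc_off = name_off + note_align(nh.namesz);
		next = desc_off + note_align(nh.descsz);
		if (next > len)
			break;
		if (nh.namesz == strlen(owner) + 1 &&
		    memcmp(seg + name_off, owner, nh.namesz) == 0)
			found++;
		off = next;
	}
	return found;
}
```

Calling count_notes() with the owner string "Xen" on the bytes of each PT_NOTE segment would show whether the Xen notes survive in the segment view, independent of what the section headers say.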

Eric


Re: [patch 16/21] Xen-paravirt: Add code into head.S to handle being booted by Xen

2007-02-14 Thread Jeremy Fitzhardinge
Eric W. Biederman wrote:
> Reasonable and it's probably worth letting the binutils developer know.
> I do agree that it is weird.   It might be that something in binutils
> doesn't like us dropping some of the notes.
>

What do you mean by dropping some of the notes?  I think the only
notes (at least in this case) are the Xen ones, and they're all included.

J


Re: [patch 16/21] Xen-paravirt: Add code into head.S to handle being booted by Xen

2007-02-14 Thread Eric W. Biederman
Jeremy Fitzhardinge [EMAIL PROTECTED] writes:

> Eric W. Biederman wrote:
>> Reasonable and it's probably worth letting the binutils developer know.
>> I do agree that it is weird.   It might be that something in binutils
>> doesn't like us dropping some of the notes.
>>
>
> What do you mean by dropping some of the notes?  I think the only
> notes (at least in this case) are the Xen ones, and they're all included.

I'm pretty certain we explicitly drop the weird GNU note that
is automatically generated by gcc and specifies something informational.

Basically into .note we include *(.note.*) but not *(.note).

I don't think anything we are doing is wrong but ld gets confused easily
in the corner cases.  I'm modestly surprised we didn't have to mark our
.note.xxx sections as .section .note.xxx @note  or whatever the proper
gas syntax is.

Eric


Re: [patch 16/21] Xen-paravirt: Add code into head.S to handle being booted by Xen

2007-02-14 Thread Jeremy Fitzhardinge
Eric W. Biederman wrote:
> I'm pretty certain we explicitly drop the weird GNU note that
> is automatically generated by gcc and specifies something informational.
>
But that's something else again, since it appears as a PT_GNU_STACK phdr.

> I don't think anything we are doing is wrong but ld gets confused easily
> in the corner cases.  I'm modestly surprised we didn't have to mark our
> .note.xxx sections as .section .note.xxx @note  or whatever the proper
> gas syntax is.

I did try that, and it didn't make a difference.  The manual says that
the output section type follows the input section type, so I agree it's a
bit surprising we ever get a SHT_NOTE out of it without the @note stuff.

J



Re: [patch 16/21] Xen-paravirt: Add code into head.S to handle being booted by Xen

2007-02-14 Thread Eric W. Biederman
Jeremy Fitzhardinge [EMAIL PROTECTED] writes:

> Eric W. Biederman wrote:
>> I'm pretty certain we explicitly drop the weird GNU note that
>> is automatically generated by gcc and specifies something informational.
>>
> But that's something else again, since it appears as a PT_GNU_STACK phdr.

Not that.  It's more like abi version or gcc version or something
like.  At least there used to be one of those notes in every .o file
and compiled program.

>> I don't think anything we are doing is wrong but ld gets confused easily
>> in the corner cases.  I'm modestly surprised we didn't have to mark our
>> .note.xxx sections as .section .note.xxx @note  or whatever the proper
>> gas syntax is.
>
> I did try that, and it didn't make a difference.  The manual says that
> the output section type follows the input section type, so I agree it's a
> bit surprising we ever get a SHT_NOTE out of it without the @note stuff.

Right.  So the surprise is that SHT_NOTE got set.  There are some
defaults based on the section name somewhere that appear to have done
the right thing.

My best hunch really is that ld treated the .note sections normally
and just missed the handling of the magic SHT_NOTE type.  Which is why
I'm not too worried.

Eric


[PATCH 1/2] autofs4 - fix another race between mount and expire

2007-02-14 Thread Ian Kent

Hi Andrew,

Jeff Moyer has identified a race between mount and expire.

What happens is that during an expire the situation can arise
that a directory is removed and another lookup is done before
the expire issues a completion status to the kernel module.
In this case, since the lookup gets a new dentry, it doesn't
know that there is an expire in progress, and when it posts its
mount request it matches the existing expire request and waits
for its completion. ENOENT is then returned to user space
from lookup (as the dentry passed in is now unhashed) without
having performed the mount request.

The solution used here is to keep track of dentries in this
unhashed state and reuse them, if possible, in order to
preserve the flags. Additionally, this infrastructure will
provide the framework for the reintroduction of caching
of mount fails removed earlier in development.
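
The matching step behind this reuse can be pictured with a small user-space sketch. It is not the kernel code: the fake_dentry struct, the array in place of the rehash list, and the function name are stand-ins invented for the example, and all locking and reference counting is left out. It only shows the order of comparisons (hash, parent, length, name) and that an entry is handed back only if it is still unhashed, so its flags can be preserved.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the few dentry fields the comparison needs. */
struct fake_dentry {
	const void *parent;  /* identity of the parent directory */
	const char *name;
	size_t len;
	unsigned int hash;
	int unhashed;        /* non-zero once removed from the hash */
	int flags;           /* state worth preserving across reuse */
};

/*
 * Scan the tracked entries for one with the same hash, parent and name
 * that is currently unhashed. Returns NULL if nothing can be reused.
 */
static struct fake_dentry *lookup_unhashed(struct fake_dentry *list, size_t n,
					   const void *parent, const char *name,
					   size_t len, unsigned int hash)
{
	size_t i;

	for (i = 0; i < n; i++) {
		struct fake_dentry *d = &list[i];

		if (d->hash != hash)
			continue;
		if (d->parent != parent)
			continue;
		if (d->len != len)
			continue;
		if (memcmp(d->name, name, len))
			continue;
		if (d->unhashed)
			return d;
	}
	return NULL;
}
```

Comparing the cheap fields first keeps the string comparison off the common path, which is the same ordering the patch below uses.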

Signed-off-by: Ian Kent [EMAIL PROTECTED]
Acked-by: Jeff Moyer [EMAIL PROTECTED]

Ian

---

--- linux-2.6.20/fs/autofs4/autofs_i.h.lookup-expire-race   2007-02-05 
03:44:54.0 +0900
+++ linux-2.6.20/fs/autofs4/autofs_i.h  2007-02-12 12:15:17.0 +0900
@@ -52,6 +52,8 @@ struct autofs_info {
 
int flags;
 
+   struct list_head rehash;
+
struct autofs_sb_info *sbi;
unsigned long last_used;
atomic_t count;
@@ -110,6 +112,8 @@ struct autofs_sb_info {
struct mutex wq_mutex;
spinlock_t fs_lock;
struct autofs_wait_queue *queues; /* Wait queue pointer */
+   spinlock_t rehash_lock;
+   struct list_head rehash_list;
 };
 
 static inline struct autofs_sb_info *autofs4_sbi(struct super_block *sb)
--- linux-2.6.20/fs/autofs4/root.c.lookup-expire-race   2007-02-05 
03:44:54.0 +0900
+++ linux-2.6.20/fs/autofs4/root.c  2007-02-12 12:14:51.0 +0900
@@ -263,7 +263,7 @@ static int try_to_fill_dentry(struct den
 */
status = d_invalidate(dentry);
if (status != -EBUSY)
-   return -ENOENT;
+   return -EAGAIN;
}
 
DPRINTK("dentry=%p %.*s ino=%p",
@@ -413,7 +413,16 @@ static int autofs4_revalidate(struct den
 */
status = try_to_fill_dentry(dentry, flags);
if (status == 0)
-   return 1;
+   return 1;
+
+   /*
+* A status of EAGAIN here means that the dentry has gone
+* away while waiting for an expire to complete. If we are
+* racing with expire lookup will wait for it so this must
+* be a revalidate and we need to send it to lookup.
+*/
+   if (status == -EAGAIN)
+   return 0;
 
return status;
}
@@ -459,9 +468,18 @@ void autofs4_dentry_release(struct dentr
de->d_fsdata = NULL;
 
if (inf) {
+   struct autofs_sb_info *sbi = autofs4_sbi(de->d_sb);
+
inf->dentry = NULL;
inf->inode = NULL;
 
+   if (sbi) {
+   spin_lock(&sbi->rehash_lock);
+   if (!list_empty(&inf->rehash))
+   list_del(&inf->rehash);
+   spin_unlock(&sbi->rehash_lock);
+   }
+
autofs4_free_ino(inf);
}
 }
@@ -478,10 +496,80 @@ static struct dentry_operations autofs4_
.d_release  = autofs4_dentry_release,
 };
 
+static struct dentry *autofs4_lookup_unhashed(struct autofs_sb_info *sbi, struct dentry *parent, struct qstr *name)
+{
+   unsigned int len = name->len;
+   unsigned int hash = name->hash;
+   const unsigned char *str = name->name;
+   struct list_head *p, *head;
+
+   spin_lock(&dcache_lock);
+   spin_lock(&sbi->rehash_lock);
+   head = &sbi->rehash_list;
+   list_for_each(p, head) {
+   struct autofs_info *ino;
+   struct dentry *dentry;
+   struct qstr *qstr;
+
+   ino = list_entry(p, struct autofs_info, rehash);
+   dentry = ino->dentry;
+
+   spin_lock(&dentry->d_lock);
+
+   /* Bad luck, we've already been dentry_iput */
+   if (!dentry->d_inode)
+   goto next;
+
+   qstr = &dentry->d_name;
+
+   if (dentry->d_name.hash != hash)
+   goto next;
+   if (dentry->d_parent != parent)
+   goto next;
+
+   if (qstr->len != len)
+   goto next;
+   if (memcmp(qstr->name, str, len))
+   goto next;
+
+   if (d_unhashed(dentry)) {
+   struct autofs_info *ino = autofs4_dentry_ino(dentry);
+   struct inode *inode = dentry->d_inode;
+
+   list_del_init(&ino->rehash);
+   dget(dentry);
+   

[PATCH 2/2] autofs4 - check for directory re-create in lookup

2007-02-14 Thread Ian Kent

Hi Andrew,

This problem was identified and fixed some time ago by Jeff Moyer
but it fell through the cracks somehow.

It is possible that a user space application could remove and
re-create a directory during a request. To avoid returning a
failure from lookup incorrectly when our current dentry is
unhashed we need to check if another positive, hashed dentry
matching this one exists and if so return it instead of a fail.

Signed-off-by: Jeff Moyer [EMAIL PROTECTED]
Signed-off-by: Ian Kent [EMAIL PROTECTED]

Ian

---

--- linux-2.6.20/fs/autofs4/root.c.lookup-check-unhased 2007-02-12 
13:49:46.0 +0900
+++ linux-2.6.20/fs/autofs4/root.c  2007-02-12 13:54:58.0 +0900
@@ -655,14 +655,29 @@ static struct dentry *autofs4_lookup(str
 
/*
 * If this dentry is unhashed, then we shouldn't honour this
-* lookup even if the dentry is positive.  Returning ENOENT here
-* doesn't do the right thing for all system calls, but it should
-* be OK for the operations we permit from an autofs.
+* lookup.  Returning ENOENT here doesn't do the right thing
+* for all system calls, but it should be OK for the operations
+* we permit from an autofs.
 */
if (dentry->d_inode && d_unhashed(dentry)) {
+   /*
+* A user space application can (and has done in the past)
+* remove and re-create this directory during the callback.
+* This can leave us with an unhashed dentry, but a
+* successful mount!  So we need to perform another
+* cached lookup in case the dentry now exists.
+*/
+   struct dentry *parent = dentry->d_parent;
+   struct dentry *new = d_lookup(parent, &dentry->d_name);
+   if (new != NULL)
+   dentry = new;
+   else
+   dentry = ERR_PTR(-ENOENT);
+
if (unhashed)
dput(unhashed);
-   return ERR_PTR(-ENOENT);
+
+   return dentry;
}
 
if (unhashed)


Re: [uml-devel] x86_64: fix 2.6.18 regression - PTRACE_OLDSETOPTIONS should be accepted

2007-02-14 Thread Andrew Morton
On Thu, 15 Feb 2007 04:43:41 +0100 Blaisorblade [EMAIL PROTECTED] wrote:

> > I sent an equivalent patch in earlier today:
> Doh! Interesting this timing...
> 
> > Index: linux-2.6/arch/x86_64/ia32/ptrace32.c
> > ===================================================================
> > --- linux-2.6.orig/arch/x86_64/ia32/ptrace32.c
> > +++ linux-2.6/arch/x86_64/ia32/ptrace32.c
> > @@ -239,6 +239,8 @@ asmlinkage long sys32_ptrace(long reques
> >   __u32 val;
> >
> >   switch (request) {
> > +   case PTRACE_OLDSETOPTIONS:
> > +   request = PTRACE_SETOPTIONS;
> >   case PTRACE_TRACEME:
> >   case PTRACE_ATTACH:
> >   case PTRACE_KILL:
> 
> > I change the request so that PTRACE_OLDSETOPTIONS doesn't need to
> > propagate any further.  However, it is present in include/asm-x86_64,
> > so I guess that counts as being part of the x86_64 ABI.  That being
> > the case, I guess my patch can be dropped in favor of this one.
> 
> It is handled in ptrace_request, unless there are include problems. I'm going
> to reboot and test mine for any remaining problem.

Whatever happens, please ensure that the final fix makes it into -stable
as well.  Jeff's version of this patch wasn't cc'ed to [EMAIL PROTECTED]
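
The two approaches discussed above differ in where the old request number is recognised. A stripped-down sketch of the remapping variant is shown below, outside any kernel context. The DEMO_ constants are local to the example; 21 and 0x4200 are the values the x86 Linux headers are believed to use for PTRACE_OLDSETOPTIONS and PTRACE_SETOPTIONS, but treat them as illustrative rather than authoritative.

```c
/* Example-only constants; see the note above about their values. */
#define DEMO_PTRACE_OLDSETOPTIONS 21
#define DEMO_PTRACE_SETOPTIONS    0x4200

/*
 * Rewrite the legacy request number to the current one so that later
 * code only ever has to handle a single value.
 */
static long normalize_request(long request)
{
	switch (request) {
	case DEMO_PTRACE_OLDSETOPTIONS:
		return DEMO_PTRACE_SETOPTIONS;
	default:
		return request;
	}
}
```

The alternative is to leave the number untouched and let the generic handler accept both values, which is what keeping PTRACE_OLDSETOPTIONS as part of the ABI amounts to.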


sata_nv ADMA controller lockup investigation

2007-02-14 Thread Robert Hancock
While testing out some libata FUA changes I was working on, I was 
inadvertently able to reproduce the kind of NCQ command timeouts in 
sata_nv that a few people have reported. I since verified that the FUA 
stuff had nothing to do with it as it still happens even with FUA 
disabled. However I'm somewhat at a loss as to how to further debug 
this, so I'm posting my findings in the hope that somebody has some more 
ideas (or anyone at NVIDIA decides to come forth with a tip or two).


The conditions in which I can reproduce this are with:

ext3 filesystem mounted with -o barrier=1
Two instances of a program which truncates a file, then writes single bytes
to it, fsyncing after each one.
Simultaneously, repeatedly writing 100MB from /dev/zero to a file
using dd.

A command timeout usually happens within a few minutes. With my working copy
loaded up with a ton of extra debugging, the exception report for one of
these looks like this. My comments are indented.

ata4: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400 cpb count 0x0 next cpb idx 0x0
    This is just dumping all of the ADMA registers when the timeout
    happened.
ata4: last intr at 1171511467:501179, status 0x1540
    This shows the time of the last interrupt in seconds:microseconds and the
    ADMA status register contents at that time.
ata4: cmd 61/08:00:40:36:75/00:00:0c:00:00/40 tag 0 at 1171511467:360525 done 1171511467:393064, stat before 0x400 after 0x400
ata4: cmd 61/40:00:80:a1:64/00:00:0a:00:00/40 tag 0 at 1171511467:393928 done 1171511467:394345, stat before 0x500 after 0x400
ata4: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 at 1171511467:394400 done 1171511467:425548, stat before 0x500 after 0x400
ata4: cmd 61/08:00:c0:a1:64/00:00:0a:00:00/40 tag 0 at 1171511467:425556 done 1171511467:425694, stat before 0x500 after 0x400
ata4: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 at 1171511467:425699 done 1171511467:433896, stat before 0x500 after 0x400
ata4: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 at 1171511467:433958 done 1171511467:433971, stat before 0x500 after 0x400
ata4: cmd 61/08:00:c8:a1:64/00:00:0a:00:00/40 tag 0 at 1171511467:433978 done 1171511467:434152, stat before 0x500 after 0x400
ata4: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 at 1171511467:434160 done 1171511467:442326, stat before 0x500 after 0x400
ata4: cmd 61/08:00:d0:a1:64/00:00:0a:00:00/40 tag 0 at 1171511467:442389 done 1171511467:442843, stat before 0x500 after 0x400
ata4: cmd 61/08:08:88:7e:75/00:00:0c:00:00/40 tag 1 at 1171511467:442395 done 1171511467:442846, stat before 0x400 after 0x400
ata4: cmd 61/e8:10:08:58:77/01:00:0c:00:00/40 tag 2 at 1171511467:442419 done 1171511467:445010, stat before 0x400 after 0x400
ata4: cmd 61/e8:18:f0:59:77/01:00:0c:00:00/40 tag 3 at 1171511467:442437 done 1171511467:447182, stat before 0x0 after 0x0
ata4: cmd 61/e8:20:d8:5b:77/01:00:0c:00:00/40 tag 4 at 1171511467:442455 done 1171511467:449343, stat before 0x0 after 0x0
ata4: cmd 61/e8:28:c0:5d:77/01:00:0c:00:00/40 tag 5 at 1171511467:442475 done 1171511467:451543, stat before 0x0 after 0x0
ata4: cmd 61/30:30:a8:5f:77/00:00:0c:00:00/40 tag 6 at 1171511467:442481 done 1171511467:451833, stat before 0x0 after 0x0
ata4: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 at 1171511467:451858 done 1171511467:492486, stat before 0x500 after 0x400
ata4: cmd 61/08:00:d8:a1:64/00:00:0a:00:00/40 tag 0 at 1171511467:492498 done 1171511467:492666, stat before 0x500 after 0x400
ata4: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 at 1171511467:492671 done 1171511467:500909, stat before 0x500 after 0x400
ata4: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 at 1171511467:501167 done 1171511467:501181, stat before 0x500 after 0x400
ata4: cmd 61/08:00:e0:a1:64/00:00:0a:00:00/40 tag 0 at 1171511467:501187 done 0:0, stat before 0x500 after 0x400
    These lines show the last 20 commands issued, the contents of the
    taskfile, the tag, the time in sec:usec they were issued,
    the time in sec:usec they completed (0:0 for still incomplete),
    the ADMA status register contents before issuing the command,
    and the register contents after issuing the command.
ata4: CPB 0: ctl_flags 0x1f, resp_flags 0x0
    Contents of the outstanding CPB's flags, showing that the controller
    seems not to have touched it; the released and done flags are clear.
ata4: timeout waiting for ADMA IDLE, stat=0x400
ata4: timeout waiting for ADMA LEGACY, stat=0x400
    As part of error handling we try to switch the controller back to
    legacy mode. We time out waiting for the controller to show
    IDLE, and then clear the GO bit, and then time out waiting for it
    to show the LEGACY state. Right after this we beat it over the head
    with NV_ADMA_CTL_CHANNEL_RESET, which finally seems to restore its
    senses, until one of these happens again.
ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 
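
When reading the command history in the report above, it helps to turn each pair of issue and completion timestamps into a latency and to name the opcode at the start of each taskfile. The following is a small sketch for doing that by hand; it assumes the first byte after "cmd" is the ATA command opcode (as in libata's usual taskfile dump), and the struct and function names are invented here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* A sec:usec timestamp as printed in the debug output (hypothetical type). */
struct stamp {
    int64_t sec;
    int64_t usec;
};

/* Microseconds elapsed between issuing a command and its completion. */
static int64_t latency_usec(struct stamp issued, struct stamp done)
{
    return (done.sec - issued.sec) * 1000000 + (done.usec - issued.usec);
}

/* Names of the two ATA command opcodes that appear in the history. */
static const char *ata_cmd_name(uint8_t opcode)
{
    switch (opcode) {
    case 0x61: return "WRITE FPDMA QUEUED";   /* NCQ write */
    case 0xea: return "FLUSH CACHE EXT";
    default:   return "unknown";
    }
}
```

For example, the first entry in the history (issued at 1171511467:360525, done at 1171511467:393064) works out to 32539 microseconds, and the stuck final entry is an NCQ write issued about 6 microseconds after the preceding flush completed.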

[PATCH 2.6.21-rc1 0/5] ehca patch set for 2.6.21-rc1

2007-02-14 Thread Hoang-Nam Nguyen
Hello Roland!
Here is a patch set for ehca with the following changes and bug fixes:
* Reworked irq handler to avoid/reduce missed irq events
* Fix race condition bug in find_next_online_cpu() and other potential
  locking issues in the scaling code
* Allow scaling code to be configurable (en-/disable) via module parameter
* Replace yield() in ehca_destroy_cq() by wait_for_completion()
* ehca_query_port() now returns LINK_UP for phys_state instead of UNKNOWN
Thanks!
Nam


[PATCH 2.6.21-rc1 1/5] ehca: reworked irq handler to avoid/reduce missed irq events

2007-02-14 Thread Hoang-Nam Nguyen
Hi,
here is a patch for ehca with the reworked irq handler.
Thanks
Nam


Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
---


 ehca_classes.h |   18 +++--
 ehca_eq.c  |1 
 ehca_irq.c |  200 -
 ehca_irq.h |1 
 ehca_main.c|   24 +-
 ipz_pt_fn.h|9 ++
 6 files changed, 172 insertions(+), 81 deletions(-)


diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_classes.h 
infiniband_work/drivers/infiniband/hw/ehca/ehca_classes.h
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_classes.h   2007-02-11 
21:31:06.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_classes.h   2007-02-14 
12:53:41.0 +0100
@@ -42,8 +42,6 @@
 #ifndef __EHCA_CLASSES_H__
 #define __EHCA_CLASSES_H__

-#include "ehca_classes.h"
-#include "ipz_pt_fn.h"

 struct ehca_module;
 struct ehca_qp;
@@ -54,14 +52,22 @@ struct ehca_mw;
 struct ehca_pd;
 struct ehca_av;

+#include <rdma/ib_verbs.h>
+#include <rdma/ib_user_verbs.h>
+
 #ifdef CONFIG_PPC64
 #include "ehca_classes_pSeries.h"
 #endif
+#include "ipz_pt_fn.h"
+#include "ehca_qes.h"
+#include "ehca_irq.h"

-#include <rdma/ib_verbs.h>
-#include <rdma/ib_user_verbs.h>
+#define EHCA_EQE_CACHE_SIZE 20

-#include "ehca_irq.h"
+struct ehca_eqe_cache_entry {
+   struct ehca_eqe *eqe;
+   struct ehca_cq *cq;
+};

 struct ehca_eq {
u32 length;
@@ -74,6 +80,8 @@ struct ehca_eq {
spinlock_t spinlock;
struct tasklet_struct interrupt_task;
u32 ist;
+   spinlock_t irq_spinlock;
+   struct ehca_eqe_cache_entry eqe_cache[EHCA_EQE_CACHE_SIZE];
 };

 struct ehca_sport {
diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_eq.c 
infiniband_work/drivers/infiniband/hw/ehca/ehca_eq.c
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_eq.c2007-02-11 
21:31:06.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_eq.c2007-02-14 
12:53:40.0 +0100
@@ -61,6 +61,7 @@ int ehca_create_eq(struct ehca_shca *shc
struct ib_device *ib_dev = &shca->ib_device;

spin_lock_init(&eq->spinlock);
+   spin_lock_init(&eq->irq_spinlock);
eq->is_initialized = 0;

if (type != EHCA_EQ && type != EHCA_NEQ) {
diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c 
infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-11 
21:36:12.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-14 
13:07:54.0 +0100
@@ -401,87 +400,143 @@ irqreturn_t ehca_interrupt_eq(int irq, v
return IRQ_HANDLED;
 }

-void ehca_tasklet_eq(unsigned long data)
-{
-   struct ehca_shca *shca = (struct ehca_shca*)data;
-   struct ehca_eqe *eqe;
-   int int_state;
-   int query_cnt = 0;

-   do {
-   eqe = (struct ehca_eqe *)ehca_poll_eq(shca, &shca->eq);
+static inline void process_eqe(struct ehca_shca *shca, struct ehca_eqe *eqe)
+{
+   u64 eqe_value;
+   u32 token;
+   unsigned long flags;
+   struct ehca_cq *cq;
+   eqe_value = eqe->entry;
+   ehca_dbg(&shca->ib_device, "eqe_value=%lx", eqe_value);
+   if (EHCA_BMASK_GET(EQE_COMPLETION_EVENT, eqe_value)) {
+   ehca_dbg(&shca->ib_device, "... completion event");
+   token = EHCA_BMASK_GET(EQE_CQ_TOKEN, eqe_value);
+   spin_lock_irqsave(&ehca_cq_idr_lock, flags);
+   cq = idr_find(&ehca_cq_idr, token);
+   if (cq == NULL) {
+   spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
+   ehca_err(&shca->ib_device,
+"Invalid eqe for non-existing cq token=%x",
+token);
+   return;
+   }
+   reset_eq_pending(cq);
+#ifdef CONFIG_INFINIBAND_EHCA_SCALING
+   queue_comp_task(cq);
+   spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
+#else
+   spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
+   comp_event_callback(cq);
+#endif
+   } else {
+   ehca_dbg(&shca->ib_device,
+"Got non completion event");
+   parse_identifier(shca, eqe_value);
+   }
+}

-   if ((shca->hw_level >= 2) && eqe)
-   int_state = 1;
-   else
-   int_state = 0;
+void ehca_process_eq(struct ehca_shca *shca, int is_irq)
+{
+   struct ehca_eq *eq = &shca->eq;
+   struct ehca_eqe_cache_entry *eqe_cache = eq->eqe_cache;
+   u64 eqe_value;
+   unsigned long flags;
+   int eqe_cnt, i;
+   int eq_empty = 0;

-   while ((int_state == 1) || eqe) {
-   while (eqe) {
-   u64 eqe_value = eqe->entry;
-
-   ehca_dbg(&shca->ib_device,
-"eqe_value=%lx", eqe_value);
-
-   

[PATCH 2.6.21-rc1 2/5] ehca: fix race condition/locking issues in scaling code

2007-02-14 Thread Hoang-Nam Nguyen
Hi,
this patch fixes a race condition in find_next_online_cpu() and some
other locking issues in the scaling code.
Thanks
Nam


Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
---


 ehca_irq.c |   68 +
 1 files changed, 33 insertions(+), 35 deletions(-)


diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c 
infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-14 
14:16:45.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-14 
14:16:35.0 +0100
@@ -544,28 +544,30 @@ void ehca_tasklet_eq(unsigned long data)

 static inline int find_next_online_cpu(struct ehca_comp_pool* pool)
 {
-   unsigned long flags_last_cpu;
+   int cpu;
+   unsigned long flags;

+   WARN_ON_ONCE(!in_interrupt());
if (ehca_debug_level)
ehca_dmp(&cpu_online_map, sizeof(cpumask_t), "");

-   spin_lock_irqsave(&pool->last_cpu_lock, flags_last_cpu);
-   pool->last_cpu = next_cpu(pool->last_cpu, cpu_online_map);
-   if (pool->last_cpu == NR_CPUS)
-   pool->last_cpu = first_cpu(cpu_online_map);
-   spin_unlock_irqrestore(&pool->last_cpu_lock, flags_last_cpu);
+   spin_lock_irqsave(&pool->last_cpu_lock, flags);
+   cpu = next_cpu(pool->last_cpu, cpu_online_map);
+   if (cpu == NR_CPUS)
+   cpu = first_cpu(cpu_online_map);
+   pool->last_cpu = cpu;
+   spin_unlock_irqrestore(&pool->last_cpu_lock, flags);

-   return pool->last_cpu;
+   return cpu;
 }

 static void __queue_comp_task(struct ehca_cq *__cq,
  struct ehca_cpu_comp_task *cct)
 {
-   unsigned long flags_cct;
-   unsigned long flags_cq;
+   unsigned long flags;

-   spin_lock_irqsave(&cct->task_lock, flags_cct);
-   spin_lock_irqsave(&__cq->task_lock, flags_cq);
+   spin_lock_irqsave(&cct->task_lock, flags);
+   spin_lock(&__cq->task_lock);

if (__cq->nr_callbacks == 0) {
__cq->nr_callbacks++;
@@ -576,8 +578,8 @@ static void __queue_comp_task(struct ehc
else
__cq->nr_callbacks++;

-   spin_unlock_irqrestore(&__cq->task_lock, flags_cq);
-   spin_unlock_irqrestore(&cct->task_lock, flags_cct);
+   spin_unlock(&__cq->task_lock);
+   spin_unlock_irqrestore(&cct->task_lock, flags);
 }

 static void queue_comp_task(struct ehca_cq *__cq)
@@ -588,69 +590,69 @@ static void queue_comp_task(struct ehca_

cpu = get_cpu();
cpu_id = find_next_online_cpu(pool);
-
BUG_ON(!cpu_online(cpu_id));

cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu_id);
+   BUG_ON(!cct);

if (cct->cq_jobs > 0) {
cpu_id = find_next_online_cpu(pool);
cct = per_cpu_ptr(pool->cpu_comp_tasks, cpu_id);
+   BUG_ON(!cct);
}

__queue_comp_task(__cq, cct);
-
-   put_cpu();
-
-   return;
 }

 static void run_comp_task(struct ehca_cpu_comp_task* cct)
 {
struct ehca_cq *cq;
-   unsigned long flags_cct;
-   unsigned long flags_cq;
+   unsigned long flags;

-   spin_lock_irqsave(&cct->task_lock, flags_cct);
+   spin_lock_irqsave(&cct->task_lock, flags);

while (!list_empty(&cct->cq_list)) {
cq = list_entry(cct->cq_list.next, struct ehca_cq, entry);
-   spin_unlock_irqrestore(&cct->task_lock, flags_cct);
+   spin_unlock_irqrestore(&cct->task_lock, flags);
comp_event_callback(cq);
-   spin_lock_irqsave(&cct->task_lock, flags_cct);
+   spin_lock_irqsave(&cct->task_lock, flags);

-   spin_lock_irqsave(&cq->task_lock, flags_cq);
+   spin_lock(&cq->task_lock);
cq->nr_callbacks--;
if (cq->nr_callbacks == 0) {
list_del_init(cct->cq_list.next);
cct->cq_jobs--;
}
-   spin_unlock_irqrestore(&cq->task_lock, flags_cq);
-
+   spin_unlock(&cq->task_lock);
}

-   spin_unlock_irqrestore(&cct->task_lock, flags_cct);
-
-   return;
+   spin_unlock_irqrestore(&cct->task_lock, flags);
 }

 static int comp_task(void *__cct)
 {
struct ehca_cpu_comp_task* cct = __cct;
+   int cql_empty;
DECLARE_WAITQUEUE(wait, current);

set_current_state(TASK_INTERRUPTIBLE);
while(!kthread_should_stop()) {
add_wait_queue(&cct->wait_queue, &wait);

-   if (list_empty(&cct->cq_list))
+   spin_lock_irq(&cct->task_lock);
+   cql_empty = list_empty(&cct->cq_list);
+   spin_unlock_irq(&cct->task_lock);
+   if (cql_empty)
schedule();
else
__set_current_state(TASK_RUNNING);

remove_wait_queue(&cct->wait_queue, &wait);

-   if (!list_empty(&cct->cq_list))
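
The race in find_next_online_cpu() that this patch addresses comes from returning pool->last_cpu after the lock has been released, so a concurrent caller may have advanced the field in between. A userspace sketch of the same pattern, using a pthread mutex in place of the spinlock and a simple modulo in place of next_cpu()/first_cpu(), might look like this; struct pool, NR_SLOTS and both function names are stand-ins chosen for illustration, not driver code.

```c
#include <assert.h>
#include <pthread.h>

#define NR_SLOTS 4  /* hypothetical stand-in for the set of online CPUs */

struct pool {
    pthread_mutex_t lock;
    int last;       /* slot handed out most recently */
};

/* Racy variant: re-reads pool->last after the lock has been dropped. */
static int next_slot_racy(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    p->last = (p->last + 1) % NR_SLOTS;
    pthread_mutex_unlock(&p->lock);
    return p->last;  /* another thread may already have advanced it */
}

/* Fixed variant: compute into a local under the lock, return the local. */
static int next_slot(struct pool *p)
{
    int slot;

    pthread_mutex_lock(&p->lock);
    slot = (p->last + 1) % NR_SLOTS;
    p->last = slot;
    pthread_mutex_unlock(&p->lock);
    return slot;
}
```

Single-threaded, both variants return the same sequence; the difference only shows up when two callers interleave between the unlock and the return, which is why the fixed version never touches the shared field outside the critical section.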
+   

[PATCH 2.6.21-rc1 4/5] ehca: replace yield() by wait_for_completion()

2007-02-14 Thread Hoang-Nam Nguyen
Hi,
this patch removes yield() and uses wait_for_completion() to wait for
running completion handlers to finish before destroying the associated
completion queue.
Thanks
Nam


Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
---


 ehca_classes.h |3 +++
 ehca_cq.c  |3 ++-
 ehca_irq.c |6 +-
 3 files changed, 10 insertions(+), 2 deletions(-)


diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_classes.h 
infiniband_work/drivers/infiniband/hw/ehca/ehca_classes.h
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_classes.h   2007-02-14 
13:52:49.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_classes.h   2007-02-14 
13:52:06.0 +0100
@@ -52,6 +52,8 @@ struct ehca_mw;
 struct ehca_pd;
 struct ehca_av;

+#include <linux/completion.h>
+
 #include <rdma/ib_verbs.h>
 #include <rdma/ib_user_verbs.h>

@@ -154,6 +156,7 @@ struct ehca_cq {
struct hlist_head qp_hashtab[QP_HASHTAB_LEN];
struct list_head entry;
u32 nr_callbacks;
+   struct completion zero_callbacks;
spinlock_t task_lock;
u32 ownpid;
/* mmap counter for resources mapped into user space */
diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_cq.c 
infiniband_work/drivers/infiniband/hw/ehca/ehca_cq.c
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_cq.c2007-02-14 
13:52:49.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_cq.c2007-02-14 
13:52:06.0 +0100
@@ -147,6 +147,7 @@ struct ib_cq *ehca_create_cq(struct ib_d
spin_lock_init(&my_cq->spinlock);
spin_lock_init(&my_cq->cb_lock);
spin_lock_init(&my_cq->task_lock);
+   init_completion(&my_cq->zero_callbacks);
my_cq->ownpid = current->tgid;

cq = &my_cq->ib_cq;
@@ -332,7 +333,7 @@ int ehca_destroy_cq(struct ib_cq *cq)
spin_lock_irqsave(&ehca_cq_idr_lock, flags);
while (my_cq->nr_callbacks) {
spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
-   yield();
+   wait_for_completion(&my_cq->zero_callbacks);
spin_lock_irqsave(&ehca_cq_idr_lock, flags);
}

diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c 
infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-14 
13:52:49.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-14 
13:52:06.0 +0100
@@ -605,6 +605,7 @@ static void run_comp_task(struct ehca_cp
spin_lock_irqsave(&cct->task_lock, flags);

while (!list_empty(&cct->cq_list)) {
+   int is_complete = 0;
cq = list_entry(cct->cq_list.next, struct ehca_cq, entry);
spin_unlock_irqrestore(&cct->task_lock, flags);
comp_event_callback(cq);
@@ -612,11 +613,14 @@ static void run_comp_task(struct ehca_cp

spin_lock(&cq->task_lock);
cq->nr_callbacks--;
-   if (cq->nr_callbacks == 0) {
+   is_complete = (cq->nr_callbacks == 0);
+   if (is_complete) {
list_del_init(cct->cq_list.next);
cct->cq_jobs--;
}
spin_unlock(&cq->task_lock);
+   if (is_complete) /* wake up waiting destroy_cq() */
+   complete(&cq->zero_callbacks);
}

spin_unlock_irqrestore(&cct->task_lock, flags);


[PATCH 2.6.21-rc1 3/5] ehca: allow en/disabling scaling code via module parameter

2007-02-14 Thread Hoang-Nam Nguyen
Hi,
here is a patch for ehca that allows users to enable or disable the
scaling code when loading the ib_ehca module.
Thanks
Nam


Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
---


 Kconfig|8 
 ehca_classes.h |1 +
 ehca_irq.c |   47 +--
 ehca_main.c|4 
 4 files changed, 26 insertions(+), 34 deletions(-)


diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/Kconfig 
infiniband_work/drivers/infiniband/hw/ehca/Kconfig
--- infiniband_orig/drivers/infiniband/hw/ehca/Kconfig  2007-02-14 
14:18:16.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/Kconfig  2007-02-14 
14:20:52.0 +0100
@@ -7,11 +7,3 @@ config INFINIBAND_EHCA
To compile the driver as a module, choose M here. The module
will be called ib_ehca.

-config INFINIBAND_EHCA_SCALING
-   bool "Scaling support (EXPERIMENTAL)"
-   depends on IBMEBUS && INFINIBAND_EHCA && HOTPLUG_CPU && EXPERIMENTAL
-   default y
-   ---help---
-   eHCA scaling support schedules the CQ callbacks to different CPUs.
-
-   To enable this feature choose Y here.
diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_classes.h 
infiniband_work/drivers/infiniband/hw/ehca/ehca_classes.h
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_classes.h   2007-02-14 
14:18:16.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_classes.h   2007-02-14 
14:20:17.0 +0100
@@ -277,6 +277,7 @@ extern struct idr ehca_cq_idr;
 extern int ehca_static_rate;
 extern int ehca_port_act_time;
 extern int ehca_use_hp_mr;
+extern int ehca_scaling_code;

 struct ipzu_queue_resp {
u32 qe_size;  /* queue entry size */
diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c 
infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-14 
14:18:16.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_irq.c   2007-02-14 
14:20:17.0 +0100
@@ -63,15 +63,11 @@
 #define ERROR_DATA_LENGTH  EHCA_BMASK_IBM(52,63)
 #define ERROR_DATA_TYPEEHCA_BMASK_IBM(0,7)

-#ifdef CONFIG_INFINIBAND_EHCA_SCALING
-
 static void queue_comp_task(struct ehca_cq *__cq);

 static struct ehca_comp_pool* pool;
 static struct notifier_block comp_pool_callback_nb;

-#endif
-
 static inline void comp_event_callback(struct ehca_cq *cq)
 {
if (!cq->ib_cq.comp_handler)
@@ -423,13 +419,13 @@ static inline void process_eqe(struct eh
return;
}
reset_eq_pending(cq);
-#ifdef CONFIG_INFINIBAND_EHCA_SCALING
-   queue_comp_task(cq);
-   spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
-#else
-   spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
-   comp_event_callback(cq);
-#endif
+   if (ehca_scaling_code) {
+   queue_comp_task(cq);
+   spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
+   } else {
+   spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
+   comp_event_callback(cq);
+   }
} else {
ehca_dbg(&shca->ib_device,
 "Got non completion event");
@@ -508,13 +504,12 @@ void ehca_process_eq(struct ehca_shca *s
/* call completion handler for cached eqes */
for (i = 0; i < eqe_cnt; i++)
if (eq->eqe_cache[i].cq) {
-#ifdef CONFIG_INFINIBAND_EHCA_SCALING
-   spin_lock(&ehca_cq_idr_lock);
-   queue_comp_task(eq->eqe_cache[i].cq);
-   spin_unlock(&ehca_cq_idr_lock);
-#else
-   comp_event_callback(eq->eqe_cache[i].cq);
-#endif
+   if (ehca_scaling_code) {
+   spin_lock(&ehca_cq_idr_lock);
+   queue_comp_task(eq->eqe_cache[i].cq);
+   spin_unlock(&ehca_cq_idr_lock);
+   } else
+   comp_event_callback(eq->eqe_cache[i].cq);
} else {
ehca_dbg(&shca->ib_device, "Got non completion event");
parse_identifier(shca, eq->eqe_cache[i].eqe->entry);
@@ -540,8 +535,6 @@ void ehca_tasklet_eq(unsigned long data)
ehca_process_eq((struct ehca_shca*)data, 1);
 }

-#ifdef CONFIG_INFINIBAND_EHCA_SCALING
-
 static inline int find_next_online_cpu(struct ehca_comp_pool* pool)
 {
int cpu;
@@ -764,14 +757,14 @@ static int comp_pool_callback(struct not
return NOTIFY_OK;
 }

-#endif
-
 int ehca_create_comp_pool(void)
 {
-#ifdef CONFIG_INFINIBAND_EHCA_SCALING
int cpu;
struct task_struct *task;

+   if (!ehca_scaling_code)
+   return 0;
+
pool = kzalloc(sizeof(struct ehca_comp_pool), GFP_KERNEL);
if (pool == NULL)
return -ENOMEM;
@@ -796,16 +789,19 @@ int 

[PATCH 2.6.21-rc1 5/5] ehca: query_port() returns LINK_UP instead UNKNOWN

2007-02-14 Thread Hoang-Nam Nguyen
Hi,
this patch makes ehca_query_port() report the port's phys state as LINK_UP.
On pSeries, ehca actually represents a logical HCA, whose phys/link state is
always LINK_UP.
Thanks
Nam


Signed-off-by: Hoang-Nam Nguyen [EMAIL PROTECTED]
---


 ehca_hca.c |3 +++
 1 files changed, 3 insertions(+)


diff -Nurp infiniband_orig/drivers/infiniband/hw/ehca/ehca_hca.c 
infiniband_work/drivers/infiniband/hw/ehca/ehca_hca.c
--- infiniband_orig/drivers/infiniband/hw/ehca/ehca_hca.c   2007-02-14 
13:11:45.0 +0100
+++ infiniband_work/drivers/infiniband/hw/ehca/ehca_hca.c   2007-02-14 
12:53:52.0 +0100
@@ -162,6 +162,9 @@ int ehca_query_port(struct ib_device *ib
props->active_width= IB_WIDTH_12X;
props->active_speed= 0x1;

+   /* at the moment (logical) link state is always LINK_UP */
+   props->phys_state  = 0x5;
+
 query_port1:
ehca_free_fw_ctrlblock(rblock);
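
The literal 0x5 assigned to phys_state in the hunk above is the LinkUp value of the InfiniBand PortPhysicalState field. As a reading aid, here is a small lookup sketch; the enum and function names are made up for illustration and are not identifiers from the driver or the IB headers.

```c
#include <assert.h>
#include <string.h>

/* PortPhysicalState values; enum names here are hypothetical. */
enum phys_state {
    PHYS_SLEEP                = 1,
    PHYS_POLLING              = 2,
    PHYS_DISABLED             = 3,
    PHYS_PORT_CONFIG_TRAINING = 4,
    PHYS_LINK_UP              = 5,
    PHYS_LINK_ERROR_RECOVERY  = 6,
    PHYS_PHY_TEST             = 7
};

/* Map a raw phys_state value to a printable name. */
static const char *phys_state_name(unsigned int state)
{
    switch (state) {
    case PHYS_SLEEP:                return "Sleep";
    case PHYS_POLLING:              return "Polling";
    case PHYS_DISABLED:             return "Disabled";
    case PHYS_PORT_CONFIG_TRAINING: return "PortConfigurationTraining";
    case PHYS_LINK_UP:              return "LinkUp";
    case PHYS_LINK_ERROR_RECOVERY:  return "LinkErrorRecovery";
    case PHYS_PHY_TEST:             return "PhyTest";
    default:                        return "Unknown";
    }
}
```

With this mapping, phys_state_name(0x5) yields "LinkUp", while an unset value of 0 falls through to "Unknown", which matches the before and after behaviour the patch description mentions.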



Re: [PATCH 2.6.21-rc1 1/5] ehca: reworked irq handler to avoid/reduce missed irq events

2007-02-14 Thread Christoph Hellwig
On Wed, Feb 14, 2007 at 05:40:47PM +0100, Hoang-Nam Nguyen wrote:
> Hi,
> here is a patch for ehca with the reworked irq handler.
> Thanks
> Nam

This looks okay to me (and sorry for not replying earlier to your private
mail).


Re: [PATCH 2.6.21-rc1 1/5] ehca: reworked irq handler to avoid/reduce missed irq events

2007-02-14 Thread Roland Dreier
Looks fine but this patch at least has serious whitespace
damage... please resend a fixed version.

 - R.


Re: [PATCH 2.6.21-rc1 4/5] ehca: replace yield() by wait_for_completion()

2007-02-14 Thread Christoph Hellwig
> @@ -332,7 +333,7 @@ int ehca_destroy_cq(struct ib_cq *cq)
>  spin_lock_irqsave(&ehca_cq_idr_lock, flags);
>  while (my_cq->nr_callbacks) {
>  spin_unlock_irqrestore(&ehca_cq_idr_lock, flags);
>  -   yield();
>  +   wait_for_completion(&my_cq->zero_callbacks);
>  spin_lock_irqsave(&ehca_cq_idr_lock, flags);
>  }

A while loop around wait_for_completion() doesn't make all that much sense.
I suspect a simple

    if (my_cq->nr_callbacks)
        wait_for_completion(&my_cq->zero_callbacks);

is what you need.
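
To illustrate why an outer loop adds little, here is a userspace model of a completion built on a pthread mutex and condition variable. It is only a sketch of the semantics (a counter of pending complete() calls that a waiter consumes), not the kernel implementation, and it reuses the kernel's function names purely for readability.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct completion (an assumption). */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    unsigned int done;   /* complete() calls not yet consumed by a waiter */
};

static void init_completion(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = 0;
}

/* Record one completion and wake one waiter. */
static void complete(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Sleep until a completion is available, then consume it. */
static void wait_for_completion(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)            /* the retry loop already lives here */
        pthread_cond_wait(&c->cond, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}
```

In this model wait_for_completion() does not return until complete() has been called, so a caller that only needs "wait until the last callback has finished" can test the counter once and then wait once, assuming complete() is issued exactly when the callback count drops to zero.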



Re: [PATCH 2.6.21-rc1 4/5] ehca: replace yield() by wait_for_completion()

2007-02-14 Thread Roland Dreier
I agree with Christoph -- the use of wait_for_completion() in a loop
makes no sense.  When you send a new copy of this patch without
whitespace damage, please fix that up too...

