Re: [PATCH 04/22] block: Abstract out bvec iterator

2013-08-13 Thread Ed L Cashin
On Tue, Aug 13, 2013 at 11:51:58AM -0700, Kent Overstreet wrote:
> On Tue, Aug 13, 2013 at 10:03:04AM -0400, Ed Cashin wrote:
> > On Aug 9, 2013, Ed Cashin wrote:
> > > On Aug 8, 2013, at 9:05 PM, Kent Overstreet wrote:
> > > ...
> > > > It's in the for-jens branch now.
> > > 
> > > 
> > > Just examining the patches, I like the way it cleans up the aoe code.  I
> > > had a question about a new BUG added by the for-jens branch in the
> > > read-response handling path of the aoe driver.
> > 
> > The aoe driver in linux-bcache/for-jens commit 4c36c973a8f45 is
> > passing my tests.
> > 
> > Here is a patch against that branch illustrating my suggestion for
> > handling bad target responses gracefully.
> 
> Thanks - shall I just fold that into the aoe immutable bvec patch?

Yes, that would be good, thanks.

Unfortunately, the way I usually send patches to vger didn't work
this time.  It looks like the MTA didn't retry after the
greylisting used SMTP temporary failures.  So I'm trying a
different way to send and including the same patch for the
benefit of the Cc list.

commit 2c39f50b1ee02e2ac07fd072a883a91713da53cc
Author: Ed Cashin <ecas...@coraid.com>
Date:   Tue Aug 13 10:50:28 2013 -0400

aoe: bad AoE responses fail I/O without BUG

Instead of having a BUG when the AoE target does something wrong,
just fail the I/O and log the problem with rate limiting.

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index cacd48e..b9916a6 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -1096,7 +1096,6 @@ bvcpy(struct sk_buff *skb, struct bio *bio, struct bvec_iter iter, long cnt)
int soff = 0;
struct bio_vec bv;
 
-   BUG_ON(cnt > iter.bi_size);
iter.bi_size = cnt;
 
__bio_for_each_segment(bv, bio, iter, iter) {
@@ -1196,6 +1195,14 @@ noskb:   if (buf)
clear_bit(BIO_UPTODATE, &buf->bio->bi_flags);
break;
}
+   if (n > f->iter.bi_size) {
+   pr_err_ratelimited("%s e%ld.%d.  bytes=%ld need=%u\n",
+   "aoe: too-large data size in read from",
+   (long) d->aoemajor, d->aoeminor,
+   n, f->iter.bi_size);
+   clear_bit(BIO_UPTODATE, &buf->bio->bi_flags);
+   break;
+   }
bvcpy(skb, f->buf->bio, f->iter, n);
case ATA_CMD_PIO_WRITE:
case ATA_CMD_PIO_WRITE_EXT:

-- 
  Ed
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
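
The approach in the patch above (reject an oversized response and fail
the I/O instead of hitting a BUG) can be modeled outside the kernel.
What follows is a minimal userspace sketch, not the driver's code: the
struct read_frame type and the copy_read_payload() function are
hypothetical names, fprintf() stands in for pr_err_ratelimited(), and
rate limiting is not modeled at all.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for per-frame state; not the driver's struct. */
struct read_frame {
	unsigned char *dst;	/* where the next payload bytes go */
	size_t remaining;	/* bytes the request still expects */
	int uptodate;		/* cleared when the I/O must fail */
};

/*
 * Copy n payload bytes into the frame's buffer.  If the response
 * carries more data than the request has room for, log it, mark the
 * I/O as failed, and return -1 rather than aborting.
 */
int copy_read_payload(struct read_frame *f,
		      const unsigned char *payload, size_t n)
{
	if (n > f->remaining) {
		fprintf(stderr, "too-large data size: bytes=%zu need=%zu\n",
			n, f->remaining);
		f->uptodate = 0;
		return -1;
	}
	memcpy(f->dst, payload, n);
	f->dst += n;
	f->remaining -= n;
	return 0;
}
```

The point of the design is that a misbehaving remote target costs one
failed request and one log line, while the host keeps running.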


Re: [2.6 patch] #if 0 aoedev_isbusy()

2008-02-13 Thread Ed L. Cashin
On Thu, Feb 14, 2008 at 12:05:37AM +0200, Adrian Bunk wrote:
> On Wed, Feb 13, 2008 at 11:03:35PM +0100, Jan Engelhardt wrote:
> > 
> > On Feb 13 2008 23:30, Adrian Bunk wrote:
> > >
> > >This patch #if 0's the no longer used aoedev_isbusy().
> > 
> > Why not just remove it? (It can be resurrected from earlier
> > revisions should it be needed again.)
> 
> I've switched to doing #if 0 instead since this addresses the
> "It might be needed in the future." answer I otherwise sometimes
> got.
> 
> But if a maintainer wants to have such a function deleted instead that's 
> fine with me.

Hello.  That function can go away.  I have some changes pending, but
none of them use aoedev_isbusy.  Thanks for Cc-ing me.

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [PATCH] Documentation: Add 00-INDEX file for AoE

2008-01-14 Thread Ed L. Cashin
On Sun, Jan 13, 2008 at 02:44:48AM +0100, Jesper Juhl wrote:
> Documentation/aoe/ is missing a 00-INDEX file. Add one.

Thanks.  I think that it would help to clarify that using udev is the
norm, and that the mkdevs and mkshelf scripts are just examples to show
how you could do it manually.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>

diff --git a/Documentation/aoe/00-INDEX b/Documentation/aoe/00-INDEX
index 8188b22..d087df6 100644
--- a/Documentation/aoe/00-INDEX
+++ b/Documentation/aoe/00-INDEX
@@ -5,9 +5,10 @@ aoe.txt
 autoload.sh
- script for making the AoE driver autoload via /etc/modprobe.conf.
 mkdevs.sh
-   - script for creating required AoE device nodes.
+   - script for creating required AoE device nodes without udev.
 mkshelf.sh
-   - script for making one shelf's worth of block device nodes.
+   - script for making one shelf's worth of block device nodes
+ without udev.
 status.sh
- script to collate and present sysfs information about AoE storage.
 todo.txt
diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh
index 97374aa..5870a6c 100644
--- a/Documentation/aoe/mkdevs.sh
+++ b/Documentation/aoe/mkdevs.sh
@@ -1,4 +1,7 @@
 #!/bin/sh
+# This example script shows how device nodes for interacting with the
+# aoe driver could be created in the absence of udev.  Using udev is
+# preferable to creating them manually.
 
 n_shelves=${n_shelves:-10}
 n_partitions=${n_partitions:-16}
diff --git a/Documentation/aoe/mkshelf.sh b/Documentation/aoe/mkshelf.sh
index 3261581..7f412f0 100644
--- a/Documentation/aoe/mkshelf.sh
+++ b/Documentation/aoe/mkshelf.sh
@@ -1,4 +1,7 @@
 #! /bin/sh
+# This example script shows how device nodes for interacting with the
+# aoe driver could be created in the absence of udev.  Using udev is
+# preferable to creating them manually.
 
 if test "$#" != "2"; then
echo "Usage: sh `basename $0` {dir} {shelfaddress}" 1>&2


-- 
  Ed L Cashin <[EMAIL PROTECTED]>


[PATCH] aoe: document the behavior of /dev/etherd/err

2007-12-26 Thread Ed L. Cashin

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 2620073..871f284 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -33,6 +33,10 @@ struct ErrMsg {
char *msg;
 };
 
+/* A ring buffer of error messages, to be read through
+ * "/dev/etherd/err".  When no messages are present,
+ * readers will block waiting for messages to appear.
+ */
 static struct ErrMsg emsgs[NMSG];
 static int emsgs_head_idx, emsgs_tail_idx;
 static struct semaphore emsgs_sema;
-- 
1.5.3.4



[PATCH] aoe: initialize locking structures before registering char devices

2007-12-26 Thread Ed L. Cashin
This patch was made against 2.6.24-rc6-mm1.

In March 2007, Alexey Dobriyan suggested this change, which
eliminates a race after register_chrdev has been called but
the locking primitives protecting the error messages ring
buffer have not yet been initialized.

The initialization could happen at compile time, but that
would leave aoe as the only user of __DECLARE_SEMAPHORE_GENERIC.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index e8e60e7..2620073 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -259,13 +259,13 @@ aoechr_init(void)
 {
int n, i;
 
+   sema_init(&emsgs_sema, 0);
+   spin_lock_init(&emsgs_lock);
n = register_chrdev(AOE_MAJOR, "aoechr", &aoe_fops);
if (n < 0) { 
printk(KERN_ERR "aoe: can't register char device\n");
return n;
}
-   sema_init(&emsgs_sema, 0);
-   spin_lock_init(&emsgs_lock);
aoe_class = class_create(THIS_MODULE, "aoe");
if (IS_ERR(aoe_class)) {
unregister_chrdev(AOE_MAJOR, "aoechr");
-- 
1.5.3.4
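
The ordering problem this patch addresses can be shown with a toy
model.  The sketch below is only an illustration under stated
assumptions: struct err_queue, err_open(), register_handler() and the
two init functions are hypothetical names, and "registration" is
modeled as calling the handler immediately, which is the worst case a
real racing opener could produce.

```c
/* Hypothetical names throughout; this only models the ordering. */
struct err_queue {
	int lock_ready;	/* stands in for spin_lock_init() having run */
	int sema_ready;	/* stands in for sema_init() having run */
};

typedef int (*open_fn)(struct err_queue *q);

/* A handler that must never see uninitialized locking state. */
int err_open(struct err_queue *q)
{
	return (q->lock_ready && q->sema_ready) ? 0 : -1;
}

/*
 * Toy registration: the handler may run as soon as it is registered,
 * which is modeled here by calling it right away.
 */
int register_handler(open_fn fn, struct err_queue *q)
{
	return fn(q);
}

/* Initialize first, then register: the handler sees ready state. */
int queue_init_safe(struct err_queue *q)
{
	q->lock_ready = 1;
	q->sema_ready = 1;
	return register_handler(err_open, q);
}

/* The racy order: register first, initialize afterwards. */
int queue_init_racy(struct err_queue *q)
{
	int ret = register_handler(err_open, q);

	q->lock_ready = 1;
	q->sema_ready = 1;
	return ret;
}
```

In the model, queue_init_safe() returns 0 and queue_init_racy()
returns -1, because only the first order guarantees the handler never
observes uninitialized state.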



Re: [PATCH 09/13] remove race between use and initialization of locks

2007-12-26 Thread Ed L. Cashin
On Fri, Dec 21, 2007 at 10:00:40PM -0800, Andrew Morton wrote:
> On Thu, 20 Dec 2007 17:15:57 -0500 "Ed L. Cashin" <[EMAIL PROTECTED]> wrote:
...
> > +static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0);
...
> > -   sema_init(&emsgs_sema, 0);
> > -   spin_lock_init(&emsgs_lock);
> > aoe_class = class_create(THIS_MODULE, "aoe");
> > if (IS_ERR(aoe_class)) {
> > unregister_chrdev(AOE_MAJOR, "aoechr");
> 
> I think it would be better to go back to initialising emsgs_lock at runtime
> rather than fattening the exported semaphore API like this.

I don't think there is anything wrong with having a complete set of
initialization routines for a semaphore, but it's certainly easy
enough to go back to Alexey Dobriyan's original suggestion, which was
to simply move the initialization calls before register_chrdev.

I will follow this email with a patch that does that.

> emsgs_sema is a weird-looking thing.  There really should be some comments
> in there because it is unobvious what the code is attempting to do.
> 
> What is the code attempting to do?

There is a ring buffer of error messages.  Userland processes can read
these error messages by reading /dev/etherd/err, blocking if there are
no error messages to read yet.

The emsgs_sema semaphore is used to manage the reader(s) waiting for
error messages.  When there are sleepers waiting, "up" is used to wake
one up when a new error message is produced.  A reader gets a single
message, not just some text with a mixture of different errors.

If I do,

  cat /dev/etherd/err > /my/log/file

... then I can hit control-c or send a SIGTERM to stop it.

> It appears to me that nblocked_emsgs_readers gets incorrectly
> decremented if the down_interruptible() got interrupted, btw.

The counter will be incremented again if the process goes back to
sleep waiting for an error message, but the process might be getting
killed.  The counter is really just for sleeping readers.

-- 
  Ed L Cashin <[EMAIL PROTECTED]>
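
The behavior described above (a ring buffer of whole messages, with
sleeping readers counted and woken one at a time) can be modeled in a
single thread.  This is a sketch under assumptions, not the code in
aoechr.c: struct err_ring, ring_put() and ring_get() are hypothetical
names, a NULL return stands in for the point where the driver would
sleep in down_interruptible(), and the wakeups counter stands in for
calls to "up".  What the driver does when the ring is full is not
covered here; this sketch simply refuses the new message.

```c
#include <stddef.h>

enum { NMSG_MODEL = 4 };	/* hypothetical size for this sketch */

struct err_ring {
	const char *msgs[NMSG_MODEL];
	int head;	/* next slot to read */
	int tail;	/* next slot to write */
	int count;	/* unread messages */
	int nblocked;	/* readers that would be sleeping */
	int wakeups;	/* models "up" calls on the semaphore */
};

/* Producer: store one message; wake one sleeper if any are waiting. */
int ring_put(struct err_ring *r, const char *msg)
{
	if (r->count == NMSG_MODEL)
		return -1;	/* full: this sketch refuses the message */
	r->msgs[r->tail] = msg;
	r->tail = (r->tail + 1) % NMSG_MODEL;
	r->count++;
	if (r->nblocked > 0) {
		r->nblocked--;
		r->wakeups++;
	}
	return 0;
}

/*
 * Reader: return exactly one whole message, or NULL where the real
 * driver would block.  A NULL return also records a sleeper.
 */
const char *ring_get(struct err_ring *r)
{
	const char *msg;

	if (r->count == 0) {
		r->nblocked++;
		return NULL;
	}
	msg = r->msgs[r->head];
	r->head = (r->head + 1) % NMSG_MODEL;
	r->count--;
	return msg;
}
```

Each successful ring_get() hands back one complete message, which
matches the point made above that a reader never receives text mixed
from several errors.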


[PATCH 13/13] update copyright date

2007-12-20 Thread Ed L. Cashin
Update the year in the copyright notices.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h |2 +-
 drivers/block/aoe/aoeblk.c  |2 +-
 drivers/block/aoe/aoechr.c  |2 +-
 drivers/block/aoe/aoecmd.c  |2 +-
 drivers/block/aoe/aoedev.c  |2 +-
 drivers/block/aoe/aoemain.c |2 +-
 drivers/block/aoe/aoenet.c  |2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 67ef4d7..280e71e 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 #define VERSION "47"
 #define AOE_MAJOR 152
 #define DEVICE_NAME "aoe"
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 25c6760..0c39782 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoeblk.c
  * block device routines
diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 670bba6..ef49e4b 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoechr.c
  * AoE character device driver
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 1e37cf6..44beb17 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoecmd.c
  * Filesystem request handling methods
diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index 839a964..d146c4e 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoedev.c
  * AoE device utility functions; maintains device list.
diff --git a/drivers/block/aoe/aoemain.c b/drivers/block/aoe/aoemain.c
index a04b7d6..7b15a5e 100644
--- a/drivers/block/aoe/aoemain.c
+++ b/drivers/block/aoe/aoemain.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoemain.c
  * Module initialization routines, discover timer
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index ada4a06..8460ef7 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoenet.c
  * Ethernet portion of AoE driver
-- 
1.5.3.4



[PATCH 12/13] make error messages more specific

2007-12-20 Thread Ed L. Cashin
Andrew Morton pointed out that the "too many targets" message in patch
2 could be printed for failing GFP_ATOMIC allocations.  This patch
makes the messages more specific.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |   15 +++
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index bcea36c..1e37cf6 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -957,15 +957,17 @@ addtgt(struct aoedev *d, char *addr, ulong nframes)
for (; tt < te && *tt; tt++)
;
 
-   if (tt == te)
+   if (tt == te) {
+   printk(KERN_INFO
+   "aoe: device addtgt failure; too many targets\n");
return NULL;
-
+   }
t = kcalloc(1, sizeof *t, GFP_ATOMIC);
-   if (!t)
-   return NULL;
f = kcalloc(nframes, sizeof *f, GFP_ATOMIC);
-   if (!f) {
+   if (!t || !f) {
+   kfree(f);
kfree(t);
+   printk(KERN_INFO "aoe: cannot allocate memory to add target\n");
return NULL;
}
 
@@ -1029,9 +1031,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
if (!t) {
t = addtgt(d, h->src, n);
if (!t) {
-   printk(KERN_INFO
-   "aoe: device addtgt failure; "
-   "too many targets?\n");
spin_unlock_irqrestore(&d->lock, flags);
return;
}
-- 
1.5.3.4
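
The allocation pattern in the addtgt hunk (attempt both allocations,
then check and clean up once) carries over to userspace C, where
free(NULL) is a no-op just as kfree(NULL) is in the kernel.  A minimal
sketch, assuming hypothetical struct target and struct frame types and
using calloc()/free() in place of kcalloc()/kfree():

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical types for illustration; not the driver's structs. */
struct frame {
	int tag;
};

struct target {
	struct frame *frames;
	size_t nframes;
};

/*
 * Allocate a target and its frame array.  Both allocations are
 * attempted before either result is checked, so a single cleanup
 * path and a single message cover both failures.
 */
struct target *target_alloc(size_t nframes)
{
	struct target *t = calloc(1, sizeof *t);
	struct frame *f = calloc(nframes, sizeof *f);

	if (!t || !f) {
		free(f);	/* free(NULL) is defined to do nothing */
		free(t);
		fprintf(stderr, "cannot allocate memory to add target\n");
		return NULL;
	}
	t->frames = f;
	t->nframes = nframes;
	return t;
}

/* Release a target obtained from target_alloc(); NULL is accepted. */
void target_free(struct target *t)
{
	if (t) {
		free(t->frames);
		free(t);
	}
}
```

With the capacity check and the allocation check reporting separately,
a log reader can tell "too many targets" apart from memory pressure.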



[PATCH 10/13] add module parameter for users who need more outstanding I/O

2007-12-20 Thread Ed L. Cashin
An AoE target provides an estimate of the number of outstanding
commands that the AoE initiator can send before getting a response.
The aoe_maxout parameter provides a way to set an even lower limit.
It will not allow a user to use more outstanding commands than the
target permits.  If a user discovers a problem with a large setting,
this parameter provides a way for us to work with them to debug the
problem.  We expect to improve the dynamic window sizing algorithm and
drop this parameter.  For the time being, it is a debugging aid.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 7a96183..e92d885 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -18,6 +18,11 @@ static int aoe_deadsecs = 60 * 3;
 module_param(aoe_deadsecs, int, 0644);
 MODULE_PARM_DESC(aoe_deadsecs, "After aoe_deadsecs seconds, give up and fail dev.");
 
+static int aoe_maxout = 16;
+module_param(aoe_maxout, int, 0644);
+MODULE_PARM_DESC(aoe_maxout,
+   "Only aoe_maxout outstanding packets for every MAC on eX.Y.");
+
 static struct sk_buff *
 new_skb(ulong len)
 {
@@ -984,7 +989,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
struct aoeif *ifp;
ulong flags, sysminor, aoemajor;
struct sk_buff *sl;
-   enum { MAXFRAMES = 16 };
u16 n;
 
h = (struct aoe_hdr *) skb_mac_header(skb);
@@ -1009,8 +1013,8 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
}
 
n = be16_to_cpu(ch->bufcnt);
-   if (n > MAXFRAMES)  /* keep it reasonable */
-   n = MAXFRAMES;
+   if (n > aoe_maxout) /* keep it reasonable */
+   n = aoe_maxout;
 
d = aoedev_by_sysminor_m(sysminor);
if (d == NULL) {
-- 
1.5.3.4
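
The effect of the parameter is a simple clamp: the tunable can lower
the number of outstanding commands but can never raise it above what
the target advertises.  A small sketch of that rule, where
clamp_outstanding() and its parameter names are hypothetical and not
taken from the driver:

```c
#include <stdint.h>

/*
 * Return the number of outstanding commands to allow: the target's
 * advertised count, capped by a locally configured limit.
 */
uint16_t clamp_outstanding(uint16_t advertised, uint16_t limit)
{
	if (advertised > limit)
		return limit;
	return advertised;
}
```

For example, with a limit of 16 a target advertising 32 is held to 16,
while a target advertising 8 stays at 8 even if the limit is raised to
64.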



[PATCH 11/13] the aoeminor doesn't need a long format

2007-12-20 Thread Ed L. Cashin
The aoedev aoeminor member doesn't need a long format.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoeblk.c |7 ---
 drivers/block/aoe/aoecmd.c |5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index deea536..25c6760 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -202,7 +202,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio)
spin_lock_irqsave(&d->lock, flags);
 
if ((d->flags & DEVFL_UP) == 0) {
-   printk(KERN_INFO "aoe: device %ld.%ld is not up\n",
+   printk(KERN_INFO "aoe: device %ld.%d is not up\n",
d->aoemajor, d->aoeminor);
spin_unlock_irqrestore(&d->lock, flags);
mempool_free(buf, d->bufpool);
@@ -255,14 +255,15 @@ aoeblk_gdalloc(void *vp)
 
gd = alloc_disk(AOE_PARTITIONS);
if (gd == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n",
+   printk(KERN_ERR
+   "aoe: cannot allocate disk structure for %ld.%d\n",
d->aoemajor, d->aoeminor);
goto err;
}
 
d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache);
if (d->bufpool == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n",
+   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n",
d->aoemajor, d->aoeminor);
goto err_disk;
}
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index e92d885..bcea36c 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -697,7 +697,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
}
 
if (d->ssize != ssize)
-   printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
+   printk(KERN_INFO
+   "aoe: %012llx e%ld.%d v%04x has %llu sectors\n",
mac_addr(t->addr),
d->aoemajor, d->aoeminor,
d->fw_ver, (long long)ssize);
@@ -822,7 +823,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
 
if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */
printk(KERN_ERR
-   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n",
+   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n",
ahout->cmdstat, ahin->cmdstat,
d->aoemajor, d->aoeminor);
if (buf)
-- 
1.5.3.4
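
The rule behind this change is that each conversion specifier has to
match the promoted type of its argument.  Here is a userspace sketch
of the "e<major>.<minor>" form, assuming a long major and an int
minor; struct dev_addr and format_dev() are hypothetical and the field
types are chosen for the example, not copied from aoe.h:

```c
#include <stdio.h>

/* Hypothetical address pair; field types chosen for the example. */
struct dev_addr {
	long major;	/* printed with %ld */
	int minor;	/* printed with %d */
};

/*
 * Format the address as "e<major>.<minor>".  Returns what snprintf()
 * returns: the length the full string needs, excluding the NUL.
 */
int format_dev(char *buf, size_t len, const struct dev_addr *d)
{
	return snprintf(buf, len, "e%ld.%d", d->major, d->minor);
}
```

Passing an int where %ld is expected is undefined behavior in C, so
keeping the specifier and the member type in step is more than a
cosmetic cleanup.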



[PATCH 08/13] only install new AoE device once

2007-12-20 Thread Ed L. Cashin
An aoe driver user who had about 70 AoE targets found that he was
hitting a BUG in sysfs_create_file because the aoe driver was trying
to tell the kernel about an AoE device more than once.  Each AoE
device was reachable by several local network interfaces, and multiple
ATA device identify responses were returning from that single device.

This patch eliminates a race condition so that aoe always informs the
block layer of a new AoE device once in the presence of multiple
incoming ATA device identify responses.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index b49e06e..7a96183 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -698,6 +698,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
d->fw_ver, (long long)ssize);
d->ssize = ssize;
d->geo.start = 0;
+   if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE))
+   return;
if (d->gd != NULL) {
d->gd->capacity = ssize;
d->flags |= DEVFL_NEWSIZE;
-- 
1.5.3.4
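
The guard added here can be modeled as "schedule the follow-up work
only if none is already pending".  The sketch below is a simplified
illustration, not the driver's logic in full: struct model_dev, the
FL_* flags and identify_complete() are hypothetical names, and the
scheduled counter stands in for whatever deferred work eventually
tells the block layer about the device.

```c
/* Hypothetical flags mirroring the roles of the DEVFL_* bits. */
enum {
	FL_GDALLOC = 1 << 0,	/* disk allocation already pending */
	FL_NEWSIZE = 1 << 1	/* size update already pending */
};

struct model_dev {
	unsigned int flags;
	int has_disk;		/* nonzero once a disk exists */
	unsigned long long ssize;
	int scheduled;		/* count of work items scheduled */
};

/*
 * Handle one identify response.  The size is always recorded, but
 * follow-up work is scheduled only when none is pending, so several
 * responses for one device lead to a single installation.
 */
void identify_complete(struct model_dev *d, unsigned long long ssize)
{
	d->ssize = ssize;
	if (d->flags & (FL_GDALLOC | FL_NEWSIZE))
		return;		/* already pending: do not schedule again */
	if (d->has_disk)
		d->flags |= FL_NEWSIZE;
	else
		d->flags |= FL_GDALLOC;
	d->scheduled++;
}
```

Calling identify_complete() twice on a fresh device leaves scheduled
at 1, which is the property the patch is after.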



[PATCH 09/13] remove race between use and initialization of locks

2007-12-20 Thread Ed L. Cashin
Alexey Dobriyan noticed a race in the initialization of the dynamic
locks in ...

  Message-ID: <[EMAIL PROTECTED]>

Andrew Morton commented that these locks should be initialized at
compile time, so this patch does that.
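
For readers more at home in user space, the same idea can be sketched
with POSIX threads: a lock that is initialized statically has no
window in which it can be used before an init call has run.  This is
an analogy under that assumption, not kernel code; nmsgs and
emsgs_add() are hypothetical.

```c
#include <pthread.h>

/* Initialized at compile time, so there is no init call to race against. */
static pthread_mutex_t emsgs_lock = PTHREAD_MUTEX_INITIALIZER;
static int nmsgs;	/* hypothetical counter protected by emsgs_lock */

/* Record one message and return the running total. */
static int emsgs_add(void)
{
	int n;

	pthread_mutex_lock(&emsgs_lock);
	n = ++nmsgs;
	pthread_mutex_unlock(&emsgs_lock);
	return n;
}
```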

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 1bc85aa..670bba6 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -35,8 +35,8 @@ struct ErrMsg {
 
 static struct ErrMsg emsgs[NMSG];
 static int emsgs_head_idx, emsgs_tail_idx;
-static struct semaphore emsgs_sema;
-static spinlock_t emsgs_lock;
+static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0);
+static DEFINE_SPINLOCK(emsgs_lock);
 static int nblocked_emsgs_readers;
 static struct class *aoe_class;
 static struct aoe_chardev chardevs[] = {
@@ -264,8 +264,6 @@ aoechr_init(void)
printk(KERN_ERR "aoe: can't register char device\n");
return n;
}
-   sema_init(&emsgs_sema, 0);
-   spin_lock_init(&emsgs_lock);
aoe_class = class_create(THIS_MODULE, "aoe");
if (IS_ERR(aoe_class)) {
unregister_chrdev(AOE_MAJOR, "aoechr");
-- 
1.5.3.4


[PATCH 06/13] user can ask driver to forget previously detected devices

2007-12-20 Thread Ed L. Cashin
When an AoE device is detected, the kernel is informed, and a new
block device is created.  If the device is unused, the block device
corresponding to a remote device that is no longer available may be
removed from the system by telling the aoe driver to "flush" its list
of devices.

Without this patch, software like GPFS and LVM may attempt to read
from AoE devices that were discovered earlier but are no longer
present, blocking until the I/O attempt times out.
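
A rough user-space sketch of what such a flush amounts to: walk the
device list and unlink every entry that is not in use.  The struct,
its fields, and the meaning of the "all" argument (also drop devices
that are still up, as long as nobody has them open) are assumptions
made for illustration, not the driver's actual code.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct aoedev. */
struct dev_sketch {
	struct dev_sketch *next;
	int nopen;	/* open count; nonzero means the device is in use */
	int up;		/* nonzero while the remote device is reachable */
};

/*
 * Unlink unused devices from the list at *head and return them as a
 * separate chain so the caller can free them afterwards.
 */
static struct dev_sketch *flush_unused(struct dev_sketch **head, int all)
{
	struct dev_sketch *removed = NULL, *d, **dd;

	for (dd = head; (d = *dd) != NULL; ) {
		if (d->nopen > 0 || (!all && d->up)) {
			dd = &d->next;	/* keep this one */
			continue;
		}
		*dd = d->next;		/* unlink from the live list */
		d->next = removed;	/* push onto the removed chain */
		removed = d;
	}
	return removed;
}
```

Collecting the removed entries on their own chain lets the expensive
teardown happen after the list walk is done.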

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 Documentation/aoe/mkdevs.sh |2 +
 Documentation/aoe/udev.txt  |1 +
 drivers/block/aoe/aoe.h |1 +
 drivers/block/aoe/aoechr.c  |5 ++
 drivers/block/aoe/aoedev.c  |   87 +-
 5 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh
index 97374aa..44c0ab7 100644
--- a/Documentation/aoe/mkdevs.sh
+++ b/Documentation/aoe/mkdevs.sh
@@ -29,6 +29,8 @@ rm -f $dir/interfaces
 mknod -m 0200 $dir/interfaces c $MAJOR 4
 rm -f $dir/revalidate
 mknod -m 0200 $dir/revalidate c $MAJOR 5
+rm -f $dir/flush
+mknod -m 0200 $dir/flush c $MAJOR 6
 
 export n_partitions
 mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'`
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index 17e76c4..8686e78 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -20,6 +20,7 @@ SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220
 SUBSYSTEM=="aoe", KERNEL=="err",   NAME="etherd/%k", GROUP="disk", MODE="0440"
 SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220"
 SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="flush", NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
 KERNEL=="etherd*",   NAME="%k", GROUP="disk"
diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index aecaac3..2248ab2 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -191,6 +191,7 @@ struct aoedev *aoedev_by_aoeaddr(int maj, int min);
 struct aoedev *aoedev_by_sysminor_m(ulong sysminor);
 void aoedev_downdev(struct aoedev *d);
 int aoedev_isbusy(struct aoedev *d);
+int aoedev_flush(const char __user *str, size_t size);
 
 int aoenet_init(void);
 void aoenet_exit(void);
diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index f112466..1bc85aa 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -15,6 +15,7 @@ enum {
MINOR_DISCOVER,
MINOR_INTERFACES,
MINOR_REVALIDATE,
+   MINOR_FLUSH,
MSGSZ = 2048,
NMSG = 100, /* message backlog to retain */
 };
@@ -43,6 +44,7 @@ static struct aoe_chardev chardevs[] = {
{ MINOR_DISCOVER, "discover" },
{ MINOR_INTERFACES, "interfaces" },
{ MINOR_REVALIDATE, "revalidate" },
+   { MINOR_FLUSH, "flush" },
 };
 
 static int
@@ -158,6 +160,9 @@ aoechr_write(struct file *filp, const char __user *buf, 
size_t cnt, loff_t *offp
break;
case MINOR_REVALIDATE:
ret = revalidate(buf, cnt);
+   break;
+   case MINOR_FLUSH:
+   ret = aoedev_flush(buf, cnt);
}
if (ret == 0)
ret = cnt;
diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index a4d625a..e26f6f4 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -9,6 +9,10 @@
 #include 
 #include "aoe.h"
 
+static void dummy_timer(ulong);
+static void aoedev_freedev(struct aoedev *);
+static void freetgt(struct aoetgt *t);
+
 static struct aoedev *devlist;
 static spinlock_t devlist_lock;
 
@@ -108,6 +112,70 @@ aoedev_downdev(struct aoedev *d)
d->flags &= ~DEVFL_UP;
 }
 
+static void
+aoedev_freedev(struct aoedev *d)
+{
+   struct aoetgt **t, **e;
+
+   if (d->gd) {
+   aoedisk_rm_sysfs(d);
+   del_gendisk(d->gd);
+   put_disk(d->gd);
+   }
+   t = d->targets;
+   e = t + NTARGETS;
+   for (; t < e && *t; t++)
+   freetgt(*t);
+   if (d->bufpool)
+   mempool_destroy(d->bufpool);
+   kfree(d);
+}
+
+int
+aoedev_flush(const char __user *str, size_t cnt)
+{
+   ulong flags;
+   struct aoedev *d, **dd;
+   struct aoedev *rmd = NULL;
+   char buf[16];
+   int all = 0;
+
+   if (cnt >= 3) {
+   if (cnt > sizeof buf)
+   cnt = sizeof buf;
+   if (copy_fr

[PATCH 07/13] dynamically allocate a capped number of skbs when necessary

2007-12-20 Thread Ed L. Cashin
What this Patch Does

  Even before this recent series of 12 patches to 2.6.22-rc4, the aoe
  driver was reusing a small set of skbs that were allocated once and
  were only used for outbound AoE commands.

  The network layer cannot be allowed to put_page on the data that is
  still associated with a bio we haven't returned to the block layer,
  so the aoe driver (even before the patch under discussion) is still
  the owner of skbs that have been handed to the network layer for
  transmission.  We need to keep track of these skbs so that we can
  free them, but by tracking them, we can also easily re-use them.

  The new patch was a response to the behavior of certain network
  drivers.  We cannot reuse an skb that the network driver still has
  in its transmit ring.  Network drivers can defer transmit ring
  cleanup and then use the state in the skb to determine how many data
  segments to clean up in its transmit ring.  The tg3 driver is one
  driver that behaves in this way.

  When the network driver defers cleanup of its transmit ring, the aoe
  driver can find itself in a situation where it would like to send an
  AoE command, and the AoE target is ready for more work, but the
  network driver still has all of the pre-allocated skbs.  In that
  case, the new patch just calls alloc_skb, as you'd expect.

  We don't want to get carried away, though.  We try not to do
  excessive allocation in the write path, so we cap the number of skbs
  we dynamically allocate.

  Probably calling it a "dynamic pool" is misleading.  We were already
  trying to use a small fixed-size set of pre-allocated skbs before
  this patch, and this patch just provides a little headroom (with a
  ceiling, though) to accommodate network drivers that hang onto skbs,
  by allocating when needed.  The d->skbpool_hd list of allocated skbs
  is necessary so that we can free them later.

  We didn't notice the need for this headroom until AoE targets got
  fast enough.
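
  The policy described above can be sketched in user space as
  follows.  This is an illustration under stated assumptions, not the
  driver code: buf_sketch stands in for an skb, in_flight stands in
  for "the network driver still holds this buffer", and pool_get()
  and pool_put() are hypothetical names.  Only NSKBPOOLMAX = 128 is
  taken from the patch.

```c
#include <stdlib.h>

enum { NSKBPOOLMAX = 128 };	/* cap on dynamically allocated buffers */

struct buf_sketch {
	struct buf_sketch *next;
	int in_flight;		/* nonzero while the lower layer still holds it */
};

struct pool_sketch {
	struct buf_sketch *hd, *tl;	/* FIFO of buffers we must track */
	int n;				/* buffers allocated so far */
};

/* Reuse the oldest tracked buffer if it is idle, else allocate up to the cap. */
static struct buf_sketch *pool_get(struct pool_sketch *p)
{
	struct buf_sketch *b = p->hd;

	if (b != NULL && !b->in_flight) {
		p->hd = b->next;
		if (p->hd == NULL)
			p->tl = NULL;
		b->next = NULL;
		return b;
	}
	if (p->n < NSKBPOOLMAX) {
		b = calloc(1, sizeof *b);
		if (b != NULL)
			p->n++;
		return b;
	}
	return NULL;	/* at the ceiling: caller has to wait */
}

/* Hand a buffer back for tracking; it goes to the tail of the FIFO. */
static void pool_put(struct pool_sketch *p, struct buf_sketch *b)
{
	b->next = NULL;
	if (p->tl != NULL)
		p->tl->next = b;
	else
		p->hd = b;
	p->tl = b;
}
```

  Checking only the head of the FIFO keeps the lookup constant time;
  the oldest buffer is the one most likely to have been released.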

Alternatives

  If the network layer never did a put_page on the pages in the bios
  we get from the block layer, then it would be possible for us to
  hand skbs to the network layer and forget about them, allowing the
  network layer to free skbs itself (and thereby calling our own
  skb->destructor callback function if we needed that).  In that case
  we could get rid of the pre-allocated skbs and also the
  d->skbpool_hd, instead just calling alloc_skb every time we wanted
  to transmit a packet.  The slab allocator would effectively maintain
  the list of skbs.

  Besides a loss of CPU cache locality, the main concern with that
  approach is the danger that it would increase the likelihood of
  deadlock when VM is trying to free pages by writing dirty data from
  the page cache through the aoe driver out to persistent storage on
  an AoE device.  Right now we have a situation where we have
  pre-allocation that corresponds to how much we use, which seems
  ideal.

  Of course, there's still the separate issue of receiving the packets
  that tell us that a write has successfully completed on the AoE
  target.  When memory is low and VM is using AoE to flush dirty data
  to free up pages, it would be perfect if there were a way for us to
  register a fast callback that could recognize write command
  completion responses.  But I don't think the current problems with
  the receive side of the situation are a justification for
  exacerbating the problem on the transmit side.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|5 ++
 drivers/block/aoe/aoecmd.c |  117 +++-
 drivers/block/aoe/aoedev.c |   52 +---
 3 files changed, 133 insertions(+), 41 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 2248ab2..67ef4d7 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -89,6 +89,7 @@ enum {
MIN_BUFS = 16,
NTARGETS = 8,
NAOEIFS = 8,
+   NSKBPOOLMAX = 128,
 
TIMERTICK = HZ / 10,
MINTIMER = HZ >> 2,
@@ -138,6 +139,7 @@ struct aoetgt {
u16 useme;
ulong lastwadj; /* last window adjustment */
int wpkts, rpkts;
+   int dataref;
 };
 
 struct aoedev {
@@ -159,6 +161,9 @@ struct aoedev {
spinlock_t lock;
struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
struct sk_buff *sendq_tl;
+   struct sk_buff *skbpool_hd;
+   struct sk_buff *skbpool_tl;
+   int nskbpool;
mempool_t *bufpool; /* for deadlock-free Buf allocation */
struct list_head bufq;  /* queue of bios to work on */
struct buf *inprocess;  /* the one we're currently working on */
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 1be5150..b49e06e 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -106,45 +106,104 @@ ifrotate(struct aoetgt 

[PATCH 04/13] clean up udev configuration example

2007-12-20 Thread Ed L. Cashin
This patch adds a known default location for the udev configuration
file and uses the more recent "==" syntax for SUBSYSTEM and KERNEL.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 Documentation/aoe/udev-install.sh |5 -
 Documentation/aoe/udev.txt|   15 ---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh
index 6449911..15e86f5 100644
--- a/Documentation/aoe/udev-install.sh
+++ b/Documentation/aoe/udev-install.sh
@@ -23,7 +23,10 @@ fi
 # /etc/udev/rules.d
 #
 rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`"
-if test -z "$rules_d" || test ! -d "$rules_d"; then
+if test -z "$rules_d" ; then
+   rules_d=/etc/udev/rules.d
+fi
+if test ! -d "$rules_d"; then
echo "$me Error: cannot find udev rules directory" 1>&2
exit 1
 fi
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index a7ed1dc..17e76c4 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -1,6 +1,7 @@
 # These rules tell udev what device nodes to create for aoe support.
-# They may be installed along the following lines (adjusted to what
-# you see on your system).
+# They may be installed along the following lines.  Check the section
+# 8 udev manpage to see whether your udev supports SUBSYSTEM, and
+# whether it uses one or two equal signs for SUBSYSTEM and KERNEL.
 # 
 #   [EMAIL PROTECTED] ~$ su
 #   Password:
@@ -15,10 +16,10 @@
 #  
 
 # aoe char devices
-SUBSYSTEM="aoe", KERNEL="discover",NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440"
-SUBSYSTEM="aoe", KERNEL="interfaces",  NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="revalidate",  NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="discover",  NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="err",   NAME="etherd/%k", GROUP="disk", MODE="0440"
+SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
-KERNEL="etherd*",   NAME="%k", GROUP="disk"
+KERNEL=="etherd*",   NAME="%k", GROUP="disk"
-- 
1.5.3.4


[PATCH 05/13] eliminate goto and improve readability

2007-12-20 Thread Ed L. Cashin
Adam Richter suggested eliminating this goto.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |   69 +--
 1 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 03c7f4a..f112466 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -194,52 +194,51 @@ aoechr_read(struct file *filp, char __user *buf, size_t cnt, loff_t *off)
ulong flags;
 
n = (unsigned long) filp->private_data;
-   switch (n) {
-   case MINOR_ERR:
-   spin_lock_irqsave(&emsgs_lock, flags);
-loop:
-   em = emsgs + emsgs_head_idx;
-   if ((em->flags & EMFL_VALID) == 0) {
-   if (filp->f_flags & O_NDELAY) {
-   spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -EAGAIN;
-   }
-   nblocked_emsgs_readers++;
+   if (n != MINOR_ERR)
+   return -EFAULT;
+
+   spin_lock_irqsave(&emsgs_lock, flags);
 
+   for (;;) {
+   em = emsgs + emsgs_head_idx;
+   if ((em->flags & EMFL_VALID) != 0)
+   break;
+   if (filp->f_flags & O_NDELAY) {
spin_unlock_irqrestore(&emsgs_lock, flags);
+   return -EAGAIN;
+   }
+   nblocked_emsgs_readers++;
+
+   spin_unlock_irqrestore(&emsgs_lock, flags);
 
-   n = down_interruptible(&emsgs_sema);
+   n = down_interruptible(&emsgs_sema);
 
-   spin_lock_irqsave(&emsgs_lock, flags);
+   spin_lock_irqsave(&emsgs_lock, flags);
 
-   nblocked_emsgs_readers--;
+   nblocked_emsgs_readers--;
 
-   if (n) {
-   spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -ERESTARTSYS;
-   }
-   goto loop;
-   }
-   if (em->len > cnt) {
+   if (n) {
spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -EAGAIN;
+   return -ERESTARTSYS;
}
-   mp = em->msg;
-   len = em->len;
-   em->msg = NULL;
-   em->flags &= ~EMFL_VALID;
+   }
+   if (em->len > cnt) {
+   spin_unlock_irqrestore(&emsgs_lock, flags);
+   return -EAGAIN;
+   }
+   mp = em->msg;
+   len = em->len;
+   em->msg = NULL;
+   em->flags &= ~EMFL_VALID;
 
-   emsgs_head_idx++;
-   emsgs_head_idx %= ARRAY_SIZE(emsgs);
+   emsgs_head_idx++;
+   emsgs_head_idx %= ARRAY_SIZE(emsgs);
 
-   spin_unlock_irqrestore(&emsgs_lock, flags);
+   spin_unlock_irqrestore(&emsgs_lock, flags);
 
-   n = copy_to_user(buf, mp, len);
-   kfree(mp);
-   return n == 0 ? len : -EFAULT;
-   default:
-   return -EFAULT;
-   }
+   n = copy_to_user(buf, mp, len);
+   kfree(mp);
+   return n == 0 ? len : -EFAULT;
 }
 
 static const struct file_operations aoe_fops = {
-- 
1.5.3.4


[PATCH 03/13] mac_addr: avoid 64-bit arch compiler warnings

2007-12-20 Thread Ed L. Cashin
By returning unsigned long long, mac_addr does not generate compiler
warnings on 64-bit architectures.
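
Some background, as I understand it: on some 64-bit architectures u64
is a plain unsigned long, which does not match the %llx format, and
that is why the callers needed casts.  The sketch below shows the
idea in user space.  It builds the value with shifts instead of the
driver's memcpy into a __be64, and mac_to_str() is a hypothetical
helper added only for the example.

```c
#include <stdio.h>

/* Pack a 6-byte MAC address into the low 48 bits, first byte most significant. */
static unsigned long long mac_addr(const unsigned char addr[6])
{
	unsigned long long n = 0;
	int i;

	for (i = 0; i < 6; i++)
		n = (n << 8) | addr[i];
	return n;
}

/* Format the address the way the driver's messages do; no cast is needed. */
static int mac_to_str(char *out, size_t len, const unsigned char addr[6])
{
	return snprintf(out, len, "%012llx", mac_addr(addr));
}
```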

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|2 +-
 drivers/block/aoe/aoeblk.c |3 +--
 drivers/block/aoe/aoecmd.c |   10 +-
 drivers/block/aoe/aoenet.c |4 ++--
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 87df18b..aecaac3 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -198,4 +198,4 @@ void aoenet_xmit(struct sk_buff *);
 int is_aoe_netif(struct net_device *ifp);
 int set_aoe_iflist(const char __user *str, size_t size);
 
-u64 mac_addr(char addr[6]);
+unsigned long long mac_addr(char addr[6]);
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index c2649c9..deea536 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -37,8 +37,7 @@ static ssize_t aoedisk_show_mac(struct device *dev,
 
if (t == NULL)
return snprintf(page, PAGE_SIZE, "none\n");
-   return snprintf(page, PAGE_SIZE, "%012llx\n",
-   (unsigned long long)mac_addr(t->addr));
+   return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr));
 }
 static ssize_t aoedisk_show_netif(struct device *dev,
  struct device_attribute *attr, char *page)
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 5e7daa1..1be5150 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -309,7 +309,8 @@ resend(struct aoedev *d, struct aoetgt *t, struct frame *f)
"%15s e%ld.%d [EMAIL PROTECTED] newtag=%08x "
"s=%012llx d=%012llx nout=%d\n",
"retransmit", d->aoemajor, d->aoeminor, f->tag, jiffies, n,
-   mac_addr(h->src), mac_addr(h->dst), t->nout);
+   mac_addr(h->src),
+   mac_addr(h->dst), t->nout);
aoechr_error(buf);
 
f->tag = n;
@@ -633,7 +634,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
 
if (d->ssize != ssize)
printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
-   (unsigned long long)mac_addr(t->addr),
+   mac_addr(t->addr),
d->aoemajor, d->aoeminor,
d->fw_ver, (long long)ssize);
d->ssize = ssize;
@@ -727,8 +728,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
t = gettgt(d, hin->src);
if (t == NULL) {
printk(KERN_INFO "aoe: can't find target e%ld.%d:%012llx\n",
-   d->aoemajor, d->aoeminor,
-   (unsigned long long) mac_addr(hin->src));
+   d->aoemajor, d->aoeminor, mac_addr(hin->src));
spin_unlock_irqrestore(&d->lock, flags);
return;
}
@@ -1003,7 +1003,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
"aoe: e%ld.%d: setting %d%s%s:%012llx\n",
d->aoemajor, d->aoeminor, n,
" byte data frames on ", ifp->nd->name,
-   (unsigned long long) mac_addr(t->addr));
+   mac_addr(t->addr));
ifp->maxbcnt = n;
}
}
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index 7a38a45..ada4a06 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -83,7 +83,7 @@ set_aoe_iflist(const char __user *user_str, size_t size)
return 0;
 }
 
-u64
+unsigned long long
 mac_addr(char addr[6])
 {
__be64 n = 0;
@@ -91,7 +91,7 @@ mac_addr(char addr[6])
 
memcpy(p + 2, addr, 6); /* (sizeof addr != 6) */
 
-   return __be64_to_cpu(n);
+   return (unsigned long long) __be64_to_cpu(n);
 }
 
 void
-- 
1.5.3.4


[PATCH 02/13] handle multiple network paths to AoE device

2007-12-20 Thread Ed L. Cashin
A remote AoE device is something that can process ATA commands and is
identified by an AoE shelf number and an AoE slot number.  Such a
device might have more than one network interface, and it might be
reachable by more than one local network interface.  This patch tracks
the network paths available to each AoE device, allowing them to be
used more efficiently.
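
One way to picture the per-target path tracking is a small array of
local interfaces with a "current" pointer that is advanced round
robin.  The sketch below is a guess at that shape for illustration,
not the code in this patch: the _sketch structs, the string standing
in for struct net_device, and if_rotate() are all hypothetical, and
it assumes slots are filled contiguously from index 0.  Only
NAOEIFS = 8 comes from aoe.h.

```c
#include <stddef.h>

enum { NAOEIFS = 8 };	/* local interfaces tracked per target */

struct aoeif_sketch {
	const char *nd;	/* stand-in for struct net_device *; NULL = unused slot */
};

struct aoetgt_sketch {
	struct aoeif_sketch ifs[NAOEIFS];
	struct aoeif_sketch *ifp;	/* current interface in use */
};

/*
 * Advance to the next usable interface, wrapping to slot 0 at the end
 * of the array or at the first unused slot.  Returns NULL when the
 * target has no interface at all.
 */
static struct aoeif_sketch *if_rotate(struct aoetgt_sketch *t)
{
	struct aoeif_sketch *ifp = t->ifp + 1;

	if (ifp >= &t->ifs[NAOEIFS] || ifp->nd == NULL)
		ifp = t->ifs;
	if (ifp->nd == NULL)
		return NULL;
	t->ifp = ifp;
	return ifp;
}
```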

Andrew Morton asked about the call to msleep_interruptible in the
revalidate function.  Yes, if a signal is pending, then
msleep_interruptible will not return 0.  That means we will not loop
but will call aoenet_xmit with a NULL skb, which is a noop.  If the
system is too low on memory or the aoe driver is too low on frames,
then the user can hit control-C to interrupt the attempt to do a
revalidate.  I have added a comment to the code summarizing that.

Andrew Morton asked whether the allocation performed inside addtgt
could use a more relaxed allocation like GFP_KERNEL, but addtgt is
called when the aoedev lock has been locked with spin_lock_irqsave.
It would be nice to allocate the memory under fewer restrictions, but
targets are only added when the device is being discovered, and if the
target can't be added right now, we can try again in a minute when
the next AoE config query broadcast goes out.

Andrew Morton pointed out that the "too many targets" message could be
printed for failing GFP_ATOMIC allocations.  The last patch in this
series makes the messages more specific.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|   57 +++--
 drivers/block/aoe/aoeblk.c |   62 -
 drivers/block/aoe/aoechr.c |   17 +-
 drivers/block/aoe/aoecmd.c |  675 ++--
 drivers/block/aoe/aoedev.c |  168 +--
 drivers/block/aoe/aoenet.c |9 +-
 6 files changed, 653 insertions(+), 335 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 4d0543a..87df18b 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -76,10 +76,8 @@ enum {
DEVFL_EXT = (1<<2), /* device accepts lba48 commands */
DEVFL_CLOSEWAIT = (1<<3), /* device is waiting for all closes to revalidate */
DEVFL_GDALLOC = (1<<4), /* need to alloc gendisk */
-   DEVFL_PAUSE = (1<<5),
+   DEVFL_KICKME = (1<<5),  /* slow polling network card catch */
DEVFL_NEWSIZE = (1<<6), /* need to update dev size in block layer */
-   DEVFL_MAXBCNT = (1<<7), /* d->maxbcnt is not changeable */
-   DEVFL_KICKME = (1<<8),
 
BUFFL_FAIL = 1,
 };
@@ -88,17 +86,24 @@ enum {
DEFAULTBCNT = 2 * 512,  /* 2 sectors */
NPERSHELF = 16, /* number of slots per shelf address */
FREETAG = -1,
-   MIN_BUFS = 8,
+   MIN_BUFS = 16,
+   NTARGETS = 8,
+   NAOEIFS = 8,
+
+   TIMERTICK = HZ / 10,
+   MINTIMER = HZ >> 2,
+   MAXTIMER = HZ << 1,
+   HELPWAIT = 20,
 };
 
 struct buf {
struct list_head bufs;
-   ulong start_time;   /* for disk stats */
+   ulong stime;/* for disk stats */
ulong flags;
ulong nframesout;
-   char *bufaddr;
ulong resid;
ulong bv_resid;
+   ulong bv_off;
sector_t sector;
struct bio *bio;
struct bio_vec *bv;
@@ -114,19 +119,37 @@ struct frame {
struct sk_buff *skb;
 };
 
+struct aoeif {
+   struct net_device *nd;
+   unsigned char lost;
+   unsigned char lostjumbo;
+   ushort maxbcnt;
+};
+
+struct aoetgt {
+   unsigned char addr[6];
+   ushort nframes;
+   struct frame *frames;
+   struct aoeif ifs[NAOEIFS];
+   struct aoeif *ifp;  /* current aoeif in use */
+   ushort nout;
+   ushort maxout;
+   u16 lasttag;/* last tag sent */
+   u16 useme;
+   ulong lastwadj; /* last window adjustment */
+   int wpkts, rpkts;
+};
+
 struct aoedev {
struct aoedev *next;
-   unsigned char addr[6];  /* remote mac addr */
-   ushort flags;
ulong sysminor;
ulong aoemajor;
-   ulong aoeminor;
+   u16 aoeminor;
+   u16 flags;
u16 nopen;  /* (bd_openers isn't available without sleeping) */
-   u16 lasttag;/* last tag sent */
u16 rttavg; /* round trip average of requests/responses */
u16 mintimer;
u16 fw_ver; /* version of blade's firmware */
-   u16 maxbcnt;
struct work_struct work;/* disk create work struct */
struct gendisk *gd;
struct request_queue blkq;
@@ -134,15 +157,14 @@ struct aoedev {
sector_t ssize;
struct timer_list timer;
spinlock_t lock;
-   struct net_device *ifp; /* interface ed is attached to */
struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
struct sk_buff *sendq_tl;
mempool_t *bu

[PATCH 01/13] bring driver version number to 47

2007-12-20 Thread Ed L. Cashin
These patches were made against the 2.6.24-rc5-mm1 kernel.  They
were submitted earlier and have been modified to incorporate feedback
from the kernel development community.  I am resending them after
resolving conflicts with patches already in the mm tree.

The eleventh patch in the old series was obsoleted by a patch in the
2.6.24-rc5-mm1 tree, gregkh-driver-block-device.patch, and has been
omitted.  The last patch in this series is new but straightforward.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 07f02f8..4d0543a 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -1,5 +1,5 @@
 /* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
-#define VERSION "32"
+#define VERSION "47"
 #define AOE_MAJOR 152
 #define DEVICE_NAME "aoe"
 
-- 
1.5.3.4


[PATCH 03/13] mac_addr: avoid 64-bit arch compiler warnings

2007-12-20 Thread Ed L. Cashin
By returning unsigned long long, mac_addr does not generate compiler
warnings on 64-bit architectures.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|2 +-
 drivers/block/aoe/aoeblk.c |3 +--
 drivers/block/aoe/aoecmd.c |   10 +-
 drivers/block/aoe/aoenet.c |4 ++--
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 87df18b..aecaac3 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -198,4 +198,4 @@ void aoenet_xmit(struct sk_buff *);
 int is_aoe_netif(struct net_device *ifp);
 int set_aoe_iflist(const char __user *str, size_t size);
 
-u64 mac_addr(char addr[6]);
+unsigned long long mac_addr(char addr[6]);
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index c2649c9..deea536 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -37,8 +37,7 @@ static ssize_t aoedisk_show_mac(struct device *dev,
 
 	if (t == NULL)
 		return snprintf(page, PAGE_SIZE, "none\n");
-	return snprintf(page, PAGE_SIZE, "%012llx\n",
-			(unsigned long long)mac_addr(t->addr));
+	return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr));
 }
 static ssize_t aoedisk_show_netif(struct device *dev,
 				  struct device_attribute *attr, char *page)
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 5e7daa1..1be5150 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -309,7 +309,8 @@ resend(struct aoedev *d, struct aoetgt *t, struct frame *f)
 		"%15s e%ld.%d [EMAIL PROTECTED] newtag=%08x s=%012llx d=%012llx nout=%d\n",
 		"retransmit", d->aoemajor, d->aoeminor, f->tag, jiffies, n,
-		mac_addr(h->src), mac_addr(h->dst), t->nout);
+		mac_addr(h->src),
+		mac_addr(h->dst), t->nout);
 	aoechr_error(buf);
 
 	f->tag = n;
@@ -633,7 +634,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
 
 	if (d->ssize != ssize)
 		printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
-			(unsigned long long)mac_addr(t->addr),
+			mac_addr(t->addr),
 			d->aoemajor, d->aoeminor,
 			d->fw_ver, (long long)ssize);
 	d->ssize = ssize;
@@ -727,8 +728,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
 	t = gettgt(d, hin->src);
 	if (t == NULL) {
 		printk(KERN_INFO "aoe: can't find target e%ld.%d:%012llx\n",
-			d->aoemajor, d->aoeminor,
-			(unsigned long long) mac_addr(hin->src));
+			d->aoemajor, d->aoeminor, mac_addr(hin->src));
 		spin_unlock_irqrestore(&d->lock, flags);
 		return;
 	}
@@ -1003,7 +1003,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
 				"aoe: e%ld.%d: setting %d%s%s:%012llx\n",
 				d->aoemajor, d->aoeminor, n,
 				" byte data frames on ", ifp->nd->name,
-				(unsigned long long) mac_addr(t->addr));
+				mac_addr(t->addr));
 			ifp->maxbcnt = n;
 		}
 	}
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index 7a38a45..ada4a06 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -83,7 +83,7 @@ set_aoe_iflist(const char __user *user_str, size_t size)
return 0;
 }
 
-u64
+unsigned long long
 mac_addr(char addr[6])
 {
__be64 n = 0;
@@ -91,7 +91,7 @@ mac_addr(char addr[6])
 
memcpy(p + 2, addr, 6); /* (sizeof addr != 6) */
 
-   return __be64_to_cpu(n);
+   return (unsigned long long) __be64_to_cpu(n);
 }
 
 void
-- 
1.5.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 04/13] clean up udev configuration example

2007-12-20 Thread Ed L. Cashin
This patch adds a known default location for the udev configuration
file and uses the more recent == syntax for SUBSYSTEM and KERNEL.

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 Documentation/aoe/udev-install.sh |5 -
 Documentation/aoe/udev.txt|   15 ---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh
index 6449911..15e86f5 100644
--- a/Documentation/aoe/udev-install.sh
+++ b/Documentation/aoe/udev-install.sh
@@ -23,7 +23,10 @@ fi
 # /etc/udev/rules.d
 #
 rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`"
-if test -z "$rules_d" || test ! -d "$rules_d"; then
+if test -z "$rules_d" ; then
+	rules_d=/etc/udev/rules.d
+fi
+if test ! -d "$rules_d"; then
 	echo "$me Error: cannot find udev rules directory" 1>&2
 	exit 1
 fi
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index a7ed1dc..17e76c4 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -1,6 +1,7 @@
 # These rules tell udev what device nodes to create for aoe support.
-# They may be installed along the following lines (adjusted to what
-# you see on your system).
+# They may be installed along the following lines.  Check the section
+# 8 udev manpage to see whether your udev supports SUBSYSTEM, and
+# whether it uses one or two equal signs for SUBSYSTEM and KERNEL.
 # 
 #   [EMAIL PROTECTED] ~$ su
 #   Password:
@@ -15,10 +16,10 @@
 #  
 
 # aoe char devices
-SUBSYSTEM="aoe", KERNEL="discover",	NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="err",	NAME="etherd/%k", GROUP="disk", MODE="0440"
-SUBSYSTEM="aoe", KERNEL="interfaces",	NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="revalidate",	NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="discover",	NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="err",	NAME="etherd/%k", GROUP="disk", MODE="0440"
+SUBSYSTEM=="aoe", KERNEL=="interfaces",	NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="revalidate",	NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
-KERNEL="etherd*",       NAME="%k", GROUP="disk"
+KERNEL=="etherd*",       NAME="%k", GROUP="disk"
-- 
1.5.3.4



[PATCH 05/13] eliminate goto and improve readability

2007-12-20 Thread Ed L. Cashin
Adam Richter suggested eliminating this goto.

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoechr.c |   69 +--
 1 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 03c7f4a..f112466 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -194,52 +194,51 @@ aoechr_read(struct file *filp, char __user *buf, size_t cnt, loff_t *off)
 	ulong flags;
 
 	n = (unsigned long) filp->private_data;
-	switch (n) {
-	case MINOR_ERR:
-		spin_lock_irqsave(&emsgs_lock, flags);
-loop:
-		em = emsgs + emsgs_head_idx;
-		if ((em->flags & EMFL_VALID) == 0) {
-			if (filp->f_flags & O_NDELAY) {
-				spin_unlock_irqrestore(&emsgs_lock, flags);
-				return -EAGAIN;
-			}
-			nblocked_emsgs_readers++;
+	if (n != MINOR_ERR)
+		return -EFAULT;
+
+	spin_lock_irqsave(&emsgs_lock, flags);
 
+	for (;;) {
+		em = emsgs + emsgs_head_idx;
+		if ((em->flags & EMFL_VALID) != 0)
+			break;
+		if (filp->f_flags & O_NDELAY) {
 			spin_unlock_irqrestore(&emsgs_lock, flags);
+			return -EAGAIN;
+		}
+		nblocked_emsgs_readers++;
+
+		spin_unlock_irqrestore(&emsgs_lock, flags);
 
-			n = down_interruptible(&emsgs_sema);
+		n = down_interruptible(&emsgs_sema);
 
-			spin_lock_irqsave(&emsgs_lock, flags);
+		spin_lock_irqsave(&emsgs_lock, flags);
 
-			nblocked_emsgs_readers--;
+		nblocked_emsgs_readers--;
 
-			if (n) {
-				spin_unlock_irqrestore(&emsgs_lock, flags);
-				return -ERESTARTSYS;
-			}
-			goto loop;
-		}
-		if (em->len > cnt) {
+		if (n) {
 			spin_unlock_irqrestore(&emsgs_lock, flags);
-			return -EAGAIN;
+			return -ERESTARTSYS;
 		}
-		mp = em->msg;
-		len = em->len;
-		em->msg = NULL;
-		em->flags &= ~EMFL_VALID;
+	}
+	if (em->len > cnt) {
+		spin_unlock_irqrestore(&emsgs_lock, flags);
+		return -EAGAIN;
+	}
+	mp = em->msg;
+	len = em->len;
+	em->msg = NULL;
+	em->flags &= ~EMFL_VALID;
 
-		emsgs_head_idx++;
-		emsgs_head_idx %= ARRAY_SIZE(emsgs);
+	emsgs_head_idx++;
+	emsgs_head_idx %= ARRAY_SIZE(emsgs);
 
-		spin_unlock_irqrestore(&emsgs_lock, flags);
+	spin_unlock_irqrestore(&emsgs_lock, flags);
 
-		n = copy_to_user(buf, mp, len);
-		kfree(mp);
-		return n == 0 ? len : -EFAULT;
-	default:
-		return -EFAULT;
-	}
+	n = copy_to_user(buf, mp, len);
+	kfree(mp);
+	return n == 0 ? len : -EFAULT;
 }
 
 static const struct file_operations aoe_fops = {
-- 
1.5.3.4



[PATCH 06/13] user can ask driver to forget previously detected devices

2007-12-20 Thread Ed L. Cashin
When an AoE device is detected, the kernel is informed, and a new
block device is created.  If the device is unused, the block device
corresponding to a remote device that is no longer available may be
removed from the system by telling the aoe driver to flush its list
of devices.

Without this patch, software like GPFS and LVM may attempt to read
from AoE devices that were discovered earlier but are no longer
present, blocking until the I/O attempt times out.
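
The flush decision described above can be sketched in user space as
follows.  This is only an illustration under stated assumptions: the flag
values, struct dev, keep() and flush() are hypothetical stand-ins, and the
step that moves a device onto a separate removal list is my reading of the
rmd variable in the patch rather than something the visible part of the
hunk confirms.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch; the driver defines its own. */
enum { DEVFL_UP = 1, DEVFL_GDALLOC = 2, DEVFL_NEWSIZE = 4 };

struct dev {
	struct dev *next;
	unsigned flags;
	int nopen;	/* number of openers */
};

/* A device is kept if it is up (unless "all" was requested), if disk
 * allocation or a resize is still pending, or if someone has it open. */
static int keep(const struct dev *d, int all)
{
	return (!all && (d->flags & DEVFL_UP))
		|| (d->flags & (DEVFL_GDALLOC | DEVFL_NEWSIZE))
		|| d->nopen;
}

/* Unlink every flushable device from *list and return them as a
 * separate list, so they could be freed after a list lock is dropped. */
struct dev *flush(struct dev **list, int all)
{
	struct dev *d, **dd, *rmd = NULL;

	assert(list != NULL);
	dd = list;
	while ((d = *dd)) {
		if (keep(d, all)) {
			dd = &d->next;
			continue;
		}
		*dd = d->next;	/* unlink from the device list */
		d->next = rmd;	/* push onto the removal list */
		rmd = d;
	}
	return rmd;
}
```

Walking the list through a pointer to the link (dd) lets the head and
interior nodes be unlinked with the same code.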

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 Documentation/aoe/mkdevs.sh |2 +
 Documentation/aoe/udev.txt  |1 +
 drivers/block/aoe/aoe.h |1 +
 drivers/block/aoe/aoechr.c  |5 ++
 drivers/block/aoe/aoedev.c  |   87 +-
 5 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh
index 97374aa..44c0ab7 100644
--- a/Documentation/aoe/mkdevs.sh
+++ b/Documentation/aoe/mkdevs.sh
@@ -29,6 +29,8 @@ rm -f $dir/interfaces
 mknod -m 0200 $dir/interfaces c $MAJOR 4
 rm -f $dir/revalidate
 mknod -m 0200 $dir/revalidate c $MAJOR 5
+rm -f $dir/flush
+mknod -m 0200 $dir/flush c $MAJOR 6
 
 export n_partitions
 mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'`
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index 17e76c4..8686e78 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -20,6 +20,7 @@ SUBSYSTEM=="aoe", KERNEL=="discover",	NAME="etherd/%k", GROUP="disk", MODE="0220"
 SUBSYSTEM=="aoe", KERNEL=="err",	NAME="etherd/%k", GROUP="disk", MODE="0440"
 SUBSYSTEM=="aoe", KERNEL=="interfaces",	NAME="etherd/%k", GROUP="disk", MODE="0220"
 SUBSYSTEM=="aoe", KERNEL=="revalidate",	NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="flush",	NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
 KERNEL=="etherd*",       NAME="%k", GROUP="disk"
diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index aecaac3..2248ab2 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -191,6 +191,7 @@ struct aoedev *aoedev_by_aoeaddr(int maj, int min);
 struct aoedev *aoedev_by_sysminor_m(ulong sysminor);
 void aoedev_downdev(struct aoedev *d);
 int aoedev_isbusy(struct aoedev *d);
+int aoedev_flush(const char __user *str, size_t size);
 
 int aoenet_init(void);
 void aoenet_exit(void);
diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index f112466..1bc85aa 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -15,6 +15,7 @@ enum {
MINOR_DISCOVER,
MINOR_INTERFACES,
MINOR_REVALIDATE,
+   MINOR_FLUSH,
MSGSZ = 2048,
NMSG = 100, /* message backlog to retain */
 };
@@ -43,6 +44,7 @@ static struct aoe_chardev chardevs[] = {
 	{ MINOR_DISCOVER, "discover" },
 	{ MINOR_INTERFACES, "interfaces" },
 	{ MINOR_REVALIDATE, "revalidate" },
+	{ MINOR_FLUSH, "flush" },
 };
 
 static int
@@ -158,6 +160,9 @@ aoechr_write(struct file *filp, const char __user *buf, size_t cnt, loff_t *offp
 		break;
 	case MINOR_REVALIDATE:
 		ret = revalidate(buf, cnt);
+		break;
+	case MINOR_FLUSH:
+		ret = aoedev_flush(buf, cnt);
 	}
 	if (ret == 0)
 		ret = cnt;
diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index a4d625a..e26f6f4 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -9,6 +9,10 @@
 #include <linux/netdevice.h>
 #include "aoe.h"
 
+static void dummy_timer(ulong);
+static void aoedev_freedev(struct aoedev *);
+static void freetgt(struct aoetgt *t);
+
 static struct aoedev *devlist;
 static spinlock_t devlist_lock;
 
@@ -108,6 +112,70 @@ aoedev_downdev(struct aoedev *d)
 	d->flags &= ~DEVFL_UP;
 }
 
+static void
+aoedev_freedev(struct aoedev *d)
+{
+	struct aoetgt **t, **e;
+
+	if (d->gd) {
+		aoedisk_rm_sysfs(d);
+		del_gendisk(d->gd);
+		put_disk(d->gd);
+	}
+	t = d->targets;
+	e = t + NTARGETS;
+	for (; t < e && *t; t++)
+		freetgt(*t);
+	if (d->bufpool)
+		mempool_destroy(d->bufpool);
+	kfree(d);
+}
+
+int
+aoedev_flush(const char __user *str, size_t cnt)
+{
+	ulong flags;
+	struct aoedev *d, **dd;
+	struct aoedev *rmd = NULL;
+	char buf[16];
+	int all = 0;
+
+	if (cnt >= 3) {
+		if (cnt > sizeof buf)
+			cnt = sizeof buf;
+		if (copy_from_user(buf, str, cnt))
+			return -EFAULT;
+		all = !strncmp(buf, "all", 3);
+	}
+
+	flush_scheduled_work();
+	spin_lock_irqsave(&devlist_lock, flags);
+	dd = &devlist;
+	while ((d = *dd)) {
+		spin_lock(&d->lock);
+		if ((!all && (d->flags & DEVFL_UP))
+		|| (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE))
+		|| d->nopen

[PATCH 07/13] dynamically allocate a capped number of skbs when necessary

2007-12-20 Thread Ed L. Cashin
What this Patch Does

  Even before this recent series of 12 patches to 2.6.22-rc4, the aoe
  driver was reusing a small set of skbs that were allocated once and
  were only used for outbound AoE commands.

  The network layer cannot be allowed to put_page on the data that is
  still associated with a bio we haven't returned to the block layer,
  so the aoe driver (even before the patch under discussion) is still
  the owner of skbs that have been handed to the network layer for
  transmission.  We need to keep track of these skbs so that we can
  free them, but by tracking them, we can also easily re-use them.

  The new patch was a response to the behavior of certain network
  drivers.  We cannot reuse an skb that the network driver still has
  in its transmit ring.  Network drivers can defer transmit ring
  cleanup and then use the state in the skb to determine how many data
  segments to clean up in its transmit ring.  The tg3 driver is one
  driver that behaves in this way.

  When the network driver defers cleanup of its transmit ring, the aoe
  driver can find itself in a situation where it would like to send an
  AoE command, and the AoE target is ready for more work, but the
  network driver still has all of the pre-allocated skbs.  In that
  case, the new patch just calls alloc_skb, as you'd expect.

  We don't want to get carried away, though.  We try not to do
  excessive allocation in the write path, so we cap the number of skbs
  we dynamically allocate.

  Probably calling it a dynamic pool is misleading.  We were already
  trying to use a small fixed-size set of pre-allocated skbs before
  this patch, and this patch just provides a little headroom (with a
  ceiling, though) to accommodate network drivers that hang onto skbs,
  by allocating when needed.  The d->skbpool_hd list of allocated skbs
  is necessary so that we can free them later.

  We didn't notice the need for this headroom until AoE targets got
  fast enough.
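
  The capped allocation described above might look roughly like the
  following user-space sketch.  It is an illustration under assumptions,
  not the patch's code: struct buf, struct pool, pool_put and pool_get
  are hypothetical names, and the in_use field stands in for whatever
  test the driver uses to tell that the network driver still holds an
  skb.  Only the NSKBPOOLMAX value comes from the patch.

```c
#include <assert.h>
#include <stdlib.h>

enum { NSKBPOOLMAX = 128 };	/* cap taken from the patch's aoe.h hunk */

struct buf {
	struct buf *next;
	int in_use;	/* stand-in for "the network driver still holds it" */
};

struct pool {
	struct buf *hd, *tl;	/* FIFO of buffers we own */
	int n;			/* buffers allocated so far */
};

/* Append a buffer to the tail of the pool so it can be found and
 * reused (or freed) later. */
void pool_put(struct pool *p, struct buf *b)
{
	b->next = NULL;
	if (p->hd == NULL)
		p->hd = b;
	else
		p->tl->next = b;
	p->tl = b;
}

/* Reuse the oldest buffer if it is free; otherwise allocate a new one,
 * but only while under the cap.  Returns NULL when the caller must wait. */
struct buf *pool_get(struct pool *p)
{
	struct buf *b;

	assert(p != NULL);
	b = p->hd;
	if (b && !b->in_use) {
		p->hd = b->next;
		if (p->hd == NULL)
			p->tl = NULL;
		return b;
	}
	if (p->n >= NSKBPOOLMAX)
		return NULL;
	b = calloc(1, sizeof *b);
	if (b)
		p->n++;
	return b;
}
```

  Checking only the head of the FIFO keeps the common case cheap: the
  oldest buffer is the one most likely to have been released.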

Alternatives

  If the network layer never did a put_page on the pages in the bios
  we get from the block layer, then it would be possible for us to
  hand skbs to the network layer and forget about them, allowing the
  network layer to free skbs itself (and thereby calling our own
  skb->destructor callback function if we needed that).  In that case
  we could get rid of the pre-allocated skbs and also the
  d->skbpool_hd, instead just calling alloc_skb every time we wanted
  to transmit a packet.  The slab allocator would effectively maintain
  the list of skbs.

  Besides a loss of CPU cache locality, the main concern with that
  approach is the danger that it would increase the likelihood of
  deadlock when VM is trying to free pages by writing dirty data from
  the page cache through the aoe driver out to persistent storage on
  an AoE device.  Right now we have a situation where we have
  pre-allocation that corresponds to how much we use, which seems
  ideal.

  Of course, there's still the separate issue of receiving the packets
  that tell us that a write has successfully completed on the AoE
  target.  When memory is low and VM is using AoE to flush dirty data
  to free up pages, it would be perfect if there were a way for us to
  register a fast callback that could recognize write command
  completion responses.  But I don't think the current problems with
  the receive side of the situation are a justification for
  exacerbating the problem on the transmit side.

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoe.h|5 ++
 drivers/block/aoe/aoecmd.c |  117 +++-
 drivers/block/aoe/aoedev.c |   52 +---
 3 files changed, 133 insertions(+), 41 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 2248ab2..67ef4d7 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -89,6 +89,7 @@ enum {
MIN_BUFS = 16,
NTARGETS = 8,
NAOEIFS = 8,
+   NSKBPOOLMAX = 128,
 
TIMERTICK = HZ / 10,
 	MINTIMER = HZ >> 2,
@@ -138,6 +139,7 @@ struct aoetgt {
u16 useme;
ulong lastwadj; /* last window adjustment */
int wpkts, rpkts;
+   int dataref;
 };
 
 struct aoedev {
@@ -159,6 +161,9 @@ struct aoedev {
spinlock_t lock;
struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
struct sk_buff *sendq_tl;
+   struct sk_buff *skbpool_hd;
+   struct sk_buff *skbpool_tl;
+   int nskbpool;
mempool_t *bufpool; /* for deadlock-free Buf allocation */
struct list_head bufq;  /* queue of bios to work on */
struct buf *inprocess;  /* the one we're currently working on */
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 1be5150..b49e06e 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -106,45 +106,104 @@ ifrotate(struct aoetgt *t)
}
 }
 
+static void
+skb_pool_put(struct

[PATCH 08/13] only install new AoE device once

2007-12-20 Thread Ed L. Cashin
An aoe driver user who had about 70 AoE targets found that he was
hitting a BUG in sysfs_create_file because the aoe driver was trying
to tell the kernel about an AoE device more than once.  Each AoE
device was reachable by several local network interfaces, and multiple
ATA device identify responses were returning from that single device.

This patch eliminates a race condition so that aoe informs the block
layer of a new AoE device only once, even when multiple ATA device
identify responses arrive.

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoecmd.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index b49e06e..7a96183 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -698,6 +698,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
 			d->fw_ver, (long long)ssize);
 	d->ssize = ssize;
 	d->geo.start = 0;
+	if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE))
+		return;
 	if (d->gd != NULL) {
 		d->gd->capacity = ssize;
 		d->flags |= DEVFL_NEWSIZE;
-- 
1.5.3.4



[PATCH 09/13] remove race between use and initialization of locks

2007-12-20 Thread Ed L. Cashin
Alexey Dobriyan noticed a race in the initialization of the dynamic
locks in ...

  Message-ID: [EMAIL PROTECTED]

Andrew Morton commented that these locks should be initialized at
compile time, so this patch does that.
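
As a rough user-space analogue of compile-time lock initialization, the
sketch below uses C11's atomic_flag with ATOMIC_FLAG_INIT in place of the
kernel's DEFINE_SPINLOCK.  The names msgs_lock, nmsgs, lock, unlock and
msgs_add are hypothetical; the point is only that a statically
initialized lock has no window in which it exists but is not yet set up.

```c
#include <assert.h>
#include <stdatomic.h>

/* Initialized at compile time, so no code path can observe the lock
 * before it is ready. */
static atomic_flag msgs_lock = ATOMIC_FLAG_INIT;
static int nmsgs;

static void lock(void)
{
	while (atomic_flag_test_and_set(&msgs_lock))
		;	/* spin until the holder clears the flag */
}

static void unlock(void)
{
	atomic_flag_clear(&msgs_lock);
}

/* Callable from any code path, including one that runs before the
 * program's own init routine. */
int msgs_add(void)
{
	int n;

	lock();
	n = ++nmsgs;
	unlock();
	assert(n > 0);
	return n;
}
```

With a runtime initializer, a caller that arrived before the init function
would operate on an uninitialized lock; the static form removes that
ordering requirement.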

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoechr.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 1bc85aa..670bba6 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -35,8 +35,8 @@ struct ErrMsg {
 
 static struct ErrMsg emsgs[NMSG];
 static int emsgs_head_idx, emsgs_tail_idx;
-static struct semaphore emsgs_sema;
-static spinlock_t emsgs_lock;
+static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0);
+static DEFINE_SPINLOCK(emsgs_lock);
 static int nblocked_emsgs_readers;
 static struct class *aoe_class;
 static struct aoe_chardev chardevs[] = {
@@ -264,8 +264,6 @@ aoechr_init(void)
 		printk(KERN_ERR "aoe: can't register char device\n");
 		return n;
 	}
-	sema_init(&emsgs_sema, 0);
-	spin_lock_init(&emsgs_lock);
 	aoe_class = class_create(THIS_MODULE, "aoe");
 	if (IS_ERR(aoe_class)) {
 		unregister_chrdev(AOE_MAJOR, "aoechr");
-- 
1.5.3.4



[PATCH 10/13] add module parameter for users who need more outstanding I/O

2007-12-20 Thread Ed L. Cashin
An AoE target provides an estimate of the number of outstanding
commands that the AoE initiator can send before getting a response.
The aoe_maxout parameter provides a way to set an even lower limit.
It will not allow a user to use more outstanding commands than the
target permits.  If a user discovers a problem with a large setting,
this parameter provides a way for us to work with them to debug the
problem.  We expect to improve the dynamic window sizing algorithm and
drop this parameter.  For the time being, it is a debugging aid.
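
The clamping behavior described above amounts to taking the smaller of
the target's advertised buffer count and aoe_maxout.  A minimal sketch,
with window_size as a hypothetical function name and the default of 16
taken from the patch:

```c
#include <assert.h>

static int aoe_maxout = 16;	/* module parameter default from the patch */

/* Return the number of outstanding frames to allow for a target that
 * advertises bufcnt: never more than the target permits, and never
 * more than the aoe_maxout ceiling. */
unsigned window_size(unsigned bufcnt)
{
	assert(aoe_maxout >= 0);
	return bufcnt > (unsigned) aoe_maxout ? (unsigned) aoe_maxout : bufcnt;
}
```

Raising aoe_maxout above what a target advertises has no effect, which
matches the statement that the parameter cannot exceed the target's limit.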

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoecmd.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 7a96183..e92d885 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -18,6 +18,11 @@ static int aoe_deadsecs = 60 * 3;
 module_param(aoe_deadsecs, int, 0644);
 MODULE_PARM_DESC(aoe_deadsecs, "After aoe_deadsecs seconds, give up and fail dev.");
 
+static int aoe_maxout = 16;
+module_param(aoe_maxout, int, 0644);
+MODULE_PARM_DESC(aoe_maxout,
+	"Only aoe_maxout outstanding packets for every MAC on eX.Y.");
+
 static struct sk_buff *
 new_skb(ulong len)
 {
@@ -984,7 +989,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
 	struct aoeif *ifp;
 	ulong flags, sysminor, aoemajor;
 	struct sk_buff *sl;
-	enum { MAXFRAMES = 16 };
 	u16 n;
 
 	h = (struct aoe_hdr *) skb_mac_header(skb);
@@ -1009,8 +1013,8 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
 	}
 
 	n = be16_to_cpu(ch->bufcnt);
-	if (n > MAXFRAMES)	/* keep it reasonable */
-		n = MAXFRAMES;
+	if (n > aoe_maxout)	/* keep it reasonable */
+		n = aoe_maxout;
 
 	d = aoedev_by_sysminor_m(sysminor);
 	if (d == NULL) {
-- 
1.5.3.4



[PATCH 11/13] the aoeminor doesn't need a long format

2007-12-20 Thread Ed L. Cashin
The aoedev aoeminor member doesn't need a long format.
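
A small user-space sketch of the format change, assuming a 16-bit minor
as in the earlier struct change.  The u16 typedef and devname are
hypothetical stand-ins for this example.  A 16-bit argument is promoted
to int when passed through a variable argument list, so %d is the
matching conversion, while %ld would expect a long.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned short u16;	/* stand-in for the kernel's u16 */

/* Format a device name in the eMAJOR.MINOR style the driver's messages
 * use.  The minor is promoted to int in the call, so %d matches it. */
int devname(char *out, size_t len, long major, u16 minor)
{
	assert(out != NULL);
	return snprintf(out, len, "e%ld.%d", major, minor);
}
```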

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoeblk.c |7 ---
 drivers/block/aoe/aoecmd.c |5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index deea536..25c6760 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -202,7 +202,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio)
 	spin_lock_irqsave(&d->lock, flags);
 
 	if ((d->flags & DEVFL_UP) == 0) {
-		printk(KERN_INFO "aoe: device %ld.%ld is not up\n",
+		printk(KERN_INFO "aoe: device %ld.%d is not up\n",
 			d->aoemajor, d->aoeminor);
 		spin_unlock_irqrestore(&d->lock, flags);
 		mempool_free(buf, d->bufpool);
@@ -255,14 +255,15 @@ aoeblk_gdalloc(void *vp)
 
 	gd = alloc_disk(AOE_PARTITIONS);
 	if (gd == NULL) {
-		printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n",
+		printk(KERN_ERR
+			"aoe: cannot allocate disk structure for %ld.%d\n",
 			d->aoemajor, d->aoeminor);
 		goto err;
 	}
 
 	d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache);
 	if (d->bufpool == NULL) {
-		printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n",
+		printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n",
 			d->aoemajor, d->aoeminor);
 		goto err_disk;
 	}
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index e92d885..bcea36c 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -697,7 +697,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
 	}
 
 	if (d->ssize != ssize)
-		printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
+		printk(KERN_INFO
+			"aoe: %012llx e%ld.%d v%04x has %llu sectors\n",
 			mac_addr(t->addr),
 			d->aoemajor, d->aoeminor,
 			d->fw_ver, (long long)ssize);
@@ -822,7 +823,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
 
 	if (ahin->cmdstat & 0xa9) {	/* these bits cleared on success */
 		printk(KERN_ERR
-			"aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n",
+			"aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n",
 			ahout->cmdstat, ahin->cmdstat,
 			d->aoemajor, d->aoeminor);
 		if (buf)
-- 
1.5.3.4



[PATCH 12/13] make error messages more specific

2007-12-20 Thread Ed L. Cashin
Andrew Morton pointed out that the "too many targets" message in patch
2 could be printed for failing GFP_ATOMIC allocations.  This patch
makes the messages more specific.
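
The allocation pattern this patch adopts can be sketched in user space as
below, with calloc and free standing in for kcalloc and kfree.  struct
frame, struct target and target_new are hypothetical names for this
example.  Both allocations are attempted before either is tested, and
because free(NULL) is a no-op, a single error path releases whichever one
succeeded and reports the real cause.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct frame { int tag; };
struct target { struct frame *frames; unsigned long nframes; };

/* Allocate a target and its frame array; on failure, clean up both
 * and print a message that names the actual problem. */
struct target *target_new(unsigned long nframes)
{
	struct target *t;
	struct frame *f;

	assert(nframes > 0);
	t = calloc(1, sizeof *t);
	f = calloc(nframes, sizeof *f);
	if (!t || !f) {
		free(f);
		free(t);
		fprintf(stderr, "cannot allocate memory to add target\n");
		return NULL;
	}
	t->frames = f;
	t->nframes = nframes;
	return t;
}
```

A caller that gets NULL back no longer needs to guess at the reason, since
the function that knows the cause has already logged it.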

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoecmd.c |   15 +++
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index bcea36c..1e37cf6 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -957,15 +957,17 @@ addtgt(struct aoedev *d, char *addr, ulong nframes)
 	for (; tt < te && *tt; tt++)
 		;
 
-	if (tt == te)
+	if (tt == te) {
+		printk(KERN_INFO
+			"aoe: device addtgt failure; too many targets\n");
 		return NULL;
-
+	}
 	t = kcalloc(1, sizeof *t, GFP_ATOMIC);
-	if (!t)
-		return NULL;
 	f = kcalloc(nframes, sizeof *f, GFP_ATOMIC);
-	if (!f) {
+	if (!t || !f) {
+		kfree(f);
 		kfree(t);
+		printk(KERN_INFO "aoe: cannot allocate memory to add target\n");
 		return NULL;
 	}
 
@@ -1029,9 +1031,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
 		if (!t) {
 			t = addtgt(d, h->src, n);
 			if (!t) {
-				printk(KERN_INFO
-					"aoe: device addtgt failure; "
-					"too many targets?\n");
 				spin_unlock_irqrestore(&d->lock, flags);
 				return;
 			}
-- 
1.5.3.4



[PATCH 13/13] update copyright date

2007-12-20 Thread Ed L. Cashin
Update the year in the copyright notices.

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoe.h |2 +-
 drivers/block/aoe/aoeblk.c  |2 +-
 drivers/block/aoe/aoechr.c  |2 +-
 drivers/block/aoe/aoecmd.c  |2 +-
 drivers/block/aoe/aoedev.c  |2 +-
 drivers/block/aoe/aoemain.c |2 +-
 drivers/block/aoe/aoenet.c  |2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 67ef4d7..280e71e 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 #define VERSION "47"
 #define AOE_MAJOR 152
 #define DEVICE_NAME "aoe"
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 25c6760..0c39782 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoeblk.c
  * block device routines
diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 670bba6..ef49e4b 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoechr.c
  * AoE character device driver
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 1e37cf6..44beb17 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoecmd.c
  * Filesystem request handling methods
diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index 839a964..d146c4e 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoedev.c
  * AoE device utility functions; maintains device list.
diff --git a/drivers/block/aoe/aoemain.c b/drivers/block/aoe/aoemain.c
index a04b7d6..7b15a5e 100644
--- a/drivers/block/aoe/aoemain.c
+++ b/drivers/block/aoe/aoemain.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoemain.c
  * Module initialization routines, discover timer
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index ada4a06..8460ef7 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -1,4 +1,4 @@
-/* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
+/* Copyright (c) 2007 Coraid, Inc.  See COPYING for GPL terms. */
 /*
  * aoenet.c
  * Ethernet portion of AoE driver
-- 
1.5.3.4



Re: [Patch 3/8] Enhanced partition statistics: aoe fix

2007-12-13 Thread Ed L. Cashin
On Thu, Dec 13, 2007 at 05:17:45PM +0100, Jerome Marchand wrote:
> Updates the enhanced partition statistics in ATA over Ethernet driver (not 
> tested).

Acked-by: Ed L. Cashin <[EMAIL PROTECTED]>

> Signed-off-by: Jerome Marchand <[EMAIL PROTECTED]>
> ---
>  aoecmd.c |8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> diff -urNp -X linux-2.6/Documentation/dontdiff 
> linux-2.6.orig/drivers/block/aoe/aoecmd.c linux-2.6/drivers/block/aoe/aoecmd.c
> --- linux-2.6.orig/drivers/block/aoe/aoecmd.c 2007-12-04 17:37:13.0 
> +0100
> +++ linux-2.6/drivers/block/aoe/aoecmd.c  2007-12-05 13:45:10.0 
> +0100
> @@ -648,10 +648,10 @@ aoecmd_ata_rsp(struct sk_buff *skb)
>   struct gendisk *disk = d->gd;
>   const int rw = bio_data_dir(buf->bio);
>  
> - disk_stat_inc(disk, ios[rw]);
> - disk_stat_add(disk, ticks[rw], duration);
> - disk_stat_add(disk, sectors[rw], n_sect);
> - disk_stat_add(disk, io_ticks, duration);
> + all_stat_inc(disk, ios[rw], buf->sector);
> + all_stat_add(disk, ticks[rw], duration, buf->sector);
> + all_stat_add(disk, sectors[rw], n_sect, buf->sector);
> + all_stat_add(disk, io_ticks, duration, buf->sector);
>   n = (buf->flags & BUFFL_FAIL) ? -EIO : 0;
>   bio_endio(buf->bio, n);
>   mempool_free(buf, d->bufpool);

-- 
  Ed L Cashin <[EMAIL PROTECTED]>



Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

2007-12-10 Thread Ed L. Cashin
On Sat, Dec 08, 2007 at 07:50:15PM -0800, Andrew Morton wrote:
> On Sat, 8 Dec 2007 16:59:30 -0600 "Jon Nelson" <[EMAIL PROTECTED]> wrote:
> 
> > I can confirm that 2.6.24rc4 with the (second) patch works fine.
> 
> OK, thanks.
> 
> We haven't heard back from Ed yet.   I'll sit on this for a few more days.

Sorry, yes, the second patch works well for me in testing.  I had some
initial concerns that were unfounded.

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


[PATCH 13/13] make error messages more specific

2007-12-07 Thread Ed L. Cashin
Andrew Morton pointed out that the "too many targets" message in patch
2 could be printed for failing GFP_ATOMIC allocations.  This patch
makes the messages more specific.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |   15 +++
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index bcea36c..1e37cf6 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -957,15 +957,17 @@ addtgt(struct aoedev *d, char *addr, ulong nframes)
for (; tt < te && *tt; tt++)
;
 
-   if (tt == te)
+   if (tt == te) {
+   printk(KERN_INFO
+   "aoe: device addtgt failure; too many targets\n");
return NULL;
-
+   }
t = kcalloc(1, sizeof *t, GFP_ATOMIC);
-   if (!t)
-   return NULL;
f = kcalloc(nframes, sizeof *f, GFP_ATOMIC);
-   if (!f) {
+   if (!t || !f) {
+   kfree(f);
kfree(t);
+   printk(KERN_INFO "aoe: cannot allocate memory to add target\n");
return NULL;
}
 
@@ -1029,9 +1031,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
if (!t) {
t = addtgt(d, h->src, n);
if (!t) {
-   printk(KERN_INFO
-   "aoe: device addtgt failure; "
-   "too many targets?\n");
spin_unlock_irqrestore(&d->lock, flags);
return;
}
-- 
1.5.3.4
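
The combined allocation check in the addtgt hunk above can be sketched in
userspace C. This is only an illustration, assuming calloc/free as stand-ins
for kcalloc/kfree; the names target_new and target_free and both struct
layouts are hypothetical and are not taken from the driver.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the driver's frame and target structures. */
struct frame {
	int tag;
};

struct target {
	struct frame *frames;
	size_t nframes;
};

/*
 * Allocate both objects first, then test them together.  free(NULL) is
 * a no-op (as kfree(NULL) is in the kernel), so one error path can
 * release whichever allocation succeeded and report one specific message.
 */
static struct target *target_new(size_t nframes)
{
	struct target *t = calloc(1, sizeof *t);
	struct frame *f = calloc(nframes, sizeof *f);

	if (!t || !f) {
		free(f);
		free(t);
		fprintf(stderr, "cannot allocate memory to add target\n");
		return NULL;
	}
	t->frames = f;
	t->nframes = nframes;
	return t;
}

static void target_free(struct target *t)
{
	if (t) {
		free(t->frames);
		free(t);
	}
}
```

Reporting the failure where it is detected lets the caller stay silent, which
is why the patch drops the generic "too many targets?" message from
aoecmd_cfg_rsp.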



[PATCH 12/13] the aoeminor doesn't need a long format

2007-12-07 Thread Ed L. Cashin
The aoedev aoeminor member doesn't need a long format.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoeblk.c |7 ---
 drivers/block/aoe/aoecmd.c |5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 98ab170..b78a8ef 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -203,7 +203,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio)
spin_lock_irqsave(&d->lock, flags);
 
if ((d->flags & DEVFL_UP) == 0) {
-   printk(KERN_INFO "aoe: device %ld.%ld is not up\n",
+   printk(KERN_INFO "aoe: device %ld.%d is not up\n",
d->aoemajor, d->aoeminor);
spin_unlock_irqrestore(&d->lock, flags);
mempool_free(buf, d->bufpool);
@@ -256,14 +256,15 @@ aoeblk_gdalloc(void *vp)
 
gd = alloc_disk(AOE_PARTITIONS);
if (gd == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n",
+   printk(KERN_ERR
+   "aoe: cannot allocate disk structure for %ld.%d\n",
d->aoemajor, d->aoeminor);
goto err;
}
 
d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache);
if (d->bufpool == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n",
+   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n",
d->aoemajor, d->aoeminor);
goto err_disk;
}
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index e92d885..bcea36c 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -697,7 +697,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
}
 
if (d->ssize != ssize)
-   printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
+   printk(KERN_INFO
+   "aoe: %012llx e%ld.%d v%04x has %llu sectors\n",
mac_addr(t->addr),
d->aoemajor, d->aoeminor,
d->fw_ver, (long long)ssize);
@@ -822,7 +823,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
 
if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */
printk(KERN_ERR
-   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n",
+   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n",
ahout->cmdstat, ahin->cmdstat,
d->aoemajor, d->aoeminor);
if (buf)
-- 
1.5.3.4
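
The format-string change above comes down to matching each conversion to the
promoted type of its argument. A minimal sketch, assuming the major number is
a long and the minor is a 16-bit value; the types are an assumption made for
illustration and format_dev is a hypothetical helper, not driver code.

```c
#include <stdio.h>

/*
 * A 16-bit argument is promoted to int when passed through a variadic
 * call, so "%d" is the matching conversion; "%ld" would expect a long.
 * The long major number keeps its "%ld".
 */
static int format_dev(char *buf, size_t len, long major, unsigned short minor)
{
	return snprintf(buf, len, "e%ld.%d", major, minor);
}
```

With a mismatched conversion the behavior is undefined, and on 64-bit targets
a "%ld" applied to an int-sized argument can print garbage.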



[PATCH 11/13] remove extra space in prototypes for consistency

2007-12-07 Thread Ed L. Cashin
Remove extra space in prototypes for consistency.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoeblk.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 7168d3d..98ab170 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -15,7 +15,7 @@
 
 static struct kmem_cache *buf_pool_cache;
 
-static ssize_t aoedisk_show_state(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_state(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
 
@@ -26,7 +26,7 @@ static ssize_t aoedisk_show_state(struct gendisk * disk, char *page)
(d->nopen && !(d->flags & DEVFL_UP)) ? ",closewait" : "");
/* I'd rather see nopen exported so we can ditch closewait */
 }
-static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_mac(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
struct aoetgt *t = d->targets[0];
@@ -35,7 +35,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page)
return snprintf(page, PAGE_SIZE, "none\n");
return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr));
 }
-static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_netif(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
struct net_device *nds[8], **nd, **nnd, **ne;
@@ -71,7 +71,7 @@ static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page)
return p-page;
 }
 /* firmware version */
-static ssize_t aoedisk_show_fwver(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_fwver(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
 
-- 
1.5.3.4



[PATCH 10/13] add module parameter for users who need more outstanding I/O

2007-12-07 Thread Ed L. Cashin
An AoE target provides an estimate of the number of outstanding
commands that the AoE initiator can send before getting a response.
The aoe_maxout parameter provides a way to set an even lower limit.
It will not allow a user to use more outstanding commands than the
target permits.  If a user discovers a problem with a large setting,
this parameter provides a way for us to work with them to debug the
problem.  We expect to improve the dynamic window sizing algorithm and
drop this parameter.  For the time being, it is a debugging aid.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 7a96183..e92d885 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -18,6 +18,11 @@ static int aoe_deadsecs = 60 * 3;
 module_param(aoe_deadsecs, int, 0644);
 MODULE_PARM_DESC(aoe_deadsecs, "After aoe_deadsecs seconds, give up and fail dev.");
 
+static int aoe_maxout = 16;
+module_param(aoe_maxout, int, 0644);
+MODULE_PARM_DESC(aoe_maxout,
+   "Only aoe_maxout outstanding packets for every MAC on eX.Y.");
+
 static struct sk_buff *
 new_skb(ulong len)
 {
@@ -984,7 +989,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
struct aoeif *ifp;
ulong flags, sysminor, aoemajor;
struct sk_buff *sl;
-   enum { MAXFRAMES = 16 };
u16 n;
 
h = (struct aoe_hdr *) skb_mac_header(skb);
@@ -1009,8 +1013,8 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
}
 
n = be16_to_cpu(ch->bufcnt);
-   if (n > MAXFRAMES)  /* keep it reasonable */
-   n = MAXFRAMES;
+   if (n > aoe_maxout) /* keep it reasonable */
+   n = aoe_maxout;
 
d = aoedev_by_sysminor_m(sysminor);
if (d == NULL) {
-- 
1.5.3.4
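
The cap applied in aoecmd_cfg_rsp above can be sketched as a small standalone
function. This is an illustration only: clamp_bufcnt is a hypothetical name,
and the variable here is a plain static standing in for the module parameter.

```c
/* Stand-in for the aoe_maxout module parameter (default 16 in the patch). */
static int aoe_maxout = 16;

/*
 * The target advertises how many commands may be outstanding (bufcnt).
 * The local setting can only lower that number, never raise it.
 * This sketch assumes aoe_maxout is positive and does not validate it.
 */
static unsigned short clamp_bufcnt(unsigned short bufcnt)
{
	if (bufcnt > aoe_maxout)	/* keep it reasonable */
		bufcnt = aoe_maxout;
	return bufcnt;
}
```

Because the comparison only ever reduces the value, a user cannot exceed what
the target permits, which matches the description in the commit message.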



[PATCH 09/13] remove race between use and initialization of locks

2007-12-07 Thread Ed L. Cashin
Alexey Dobriyan noticed a race in the initialization of the dynamic
locks in ...

  Message-ID: <[EMAIL PROTECTED]>

Andrew Morton commented that these locks should be initialized at
compile time, so this patch does that.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 166f54f..0ce9bda 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -35,8 +35,8 @@ struct ErrMsg {
 
 static struct ErrMsg emsgs[NMSG];
 static int emsgs_head_idx, emsgs_tail_idx;
-static struct semaphore emsgs_sema;
-static spinlock_t emsgs_lock;
+static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0);
+static DEFINE_SPINLOCK(emsgs_lock);
 static int nblocked_emsgs_readers;
 static struct class *aoe_class;
 static struct aoe_chardev chardevs[] = {
@@ -264,8 +264,6 @@ aoechr_init(void)
printk(KERN_ERR "aoe: can't register char device\n");
return n;
}
-   sema_init(&emsgs_sema, 0);
-   spin_lock_init(&emsgs_lock);
aoe_class = class_create(THIS_MODULE, "aoe");
if (IS_ERR(aoe_class)) {
unregister_chrdev(AOE_MAJOR, "aoechr");
-- 
1.5.3.4
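
The idea behind compile-time lock initialization can be shown with a userspace
analogue. This sketch swaps in a POSIX mutex with PTHREAD_MUTEX_INITIALIZER in
place of the kernel's DEFINE_SPINLOCK; emsgs_count and emsgs_add are
hypothetical names used only for illustration.

```c
#include <pthread.h>

/*
 * The lock is valid from program start, so no code path can reach it
 * before an init function has run.  That is the property the patch
 * gets from DEFINE_SPINLOCK instead of calling spin_lock_init() late.
 */
static pthread_mutex_t emsgs_lock = PTHREAD_MUTEX_INITIALIZER;
static int emsgs_count;

/* Increment the shared counter under the lock and return the new value. */
static int emsgs_add(void)
{
	int n;

	pthread_mutex_lock(&emsgs_lock);
	n = ++emsgs_count;
	pthread_mutex_unlock(&emsgs_lock);
	return n;
}
```

With runtime initialization, any caller that runs before the init call would
operate on an uninitialized lock; static initialization removes that window.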



[PATCH 08/13] only install new AoE device once

2007-12-07 Thread Ed L. Cashin
An aoe driver user who had about 70 AoE targets found that he was
hitting a BUG in sysfs_create_file because the aoe driver was trying
to tell the kernel about an AoE device more than once.  Each AoE
device was reachable by several local network interfaces, and multiple
ATA device identify responses were returning from that single device.

This patch eliminates a race condition so that aoe informs the block
layer of a new AoE device exactly once, even when multiple ATA device
identify responses arrive.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index b49e06e..7a96183 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -698,6 +698,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
d->fw_ver, (long long)ssize);
d->ssize = ssize;
d->geo.start = 0;
+   if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE))
+   return;
if (d->gd != NULL) {
d->gd->capacity = ssize;
d->flags |= DEVFL_NEWSIZE;
-- 
1.5.3.4
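
The guard added above makes the notification step idempotent. A minimal
single-threaded sketch follows; the flag values are the ones aoe.h uses in
this series, while struct dev, id_complete, and the notifications counter are
hypothetical. The branch that sets DEVFL_GDALLOC is an assumption, since the
hunk above does not show it.

```c
enum {
	DEVFL_GDALLOC = 1 << 4,	/* need to alloc gendisk */
	DEVFL_NEWSIZE = 1 << 6,	/* need to update dev size in block layer */
};

/* Hypothetical, simplified stand-in for struct aoedev. */
struct dev {
	unsigned flags;
	int has_disk;			/* stands in for d->gd != NULL */
	unsigned long long ssize;
	int notifications;		/* how often follow-up work was queued */
};

/* Handle one identify response; queue follow-up work at most once. */
static void id_complete(struct dev *d, unsigned long long ssize)
{
	d->ssize = ssize;
	if (d->flags & (DEVFL_GDALLOC | DEVFL_NEWSIZE))
		return;			/* work is already pending */
	d->flags |= d->has_disk ? DEVFL_NEWSIZE : DEVFL_GDALLOC;
	d->notifications++;
}
```

The sketch has no locking; the check only closes the race in the driver if it
runs under whatever lock already serializes identify responses there.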



[PATCH 07/13] dynamically allocate a capped number of skbs when necessary

2007-12-07 Thread Ed L. Cashin
What this Patch Does

  Even before this recent series of 12 patches to 2.6.22-rc4, the aoe
  driver was reusing a small set of skbs that were allocated once and
  were only used for outbound AoE commands.

  The network layer cannot be allowed to put_page on the data that is
  still associated with a bio we haven't returned to the block layer,
  so the aoe driver (even before the patch under discussion) is still
  the owner of skbs that have been handed to the network layer for
  transmission.  We need to keep track of these skbs so that we can
  free them, but by tracking them, we can also easily re-use them.

  The new patch was a response to the behavior of certain network
  drivers.  We cannot reuse an skb that the network driver still has
  in its transmit ring.  Network drivers can defer transmit ring
  cleanup and then use the state in the skb to determine how many data
  segments to clean up in its transmit ring.  The tg3 driver is one
  driver that behaves in this way.

  When the network driver defers cleanup of its transmit ring, the aoe
  driver can find itself in a situation where it would like to send an
  AoE command, and the AoE target is ready for more work, but the
  network driver still has all of the pre-allocated skbs.  In that
  case, the new patch just calls alloc_skb, as you'd expect.

  We don't want to get carried away, though.  We try not to do
  excessive allocation in the write path, so we cap the number of skbs
  we dynamically allocate.

  Probably calling it a "dynamic pool" is misleading.  We were already
  trying to use a small fixed-size set of pre-allocated skbs before
  this patch, and this patch just provides a little headroom (with a
  ceiling, though) to accommodate network drivers that hang onto skbs,
  by allocating when needed.  The d->skbpool_hd list of allocated skbs
  is necessary so that we can free them later.

  We didn't notice the need for this headroom until AoE targets got
  fast enough.
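
The capped-allocation behavior described above can be sketched in userspace C.
This is an illustration under stated assumptions, not the driver's code:
struct buf, struct pool, and the pool_* functions are hypothetical, and the
busy flag stands in for "the network driver still holds this skb". Only
NSKBPOOLMAX = 128 comes from the patch.

```c
#include <stdlib.h>

enum { NSKBPOOLMAX = 128 };	/* ceiling on dynamically allocated buffers */

struct buf {
	struct buf *next;
	int busy;		/* stand-in for "still in the transmit ring" */
};

struct pool {
	struct buf *head;	/* every buffer we own, so we can free it later */
	int n;
};

/* Reuse an idle buffer, allocate a new one while under the cap, else NULL. */
static struct buf *pool_get(struct pool *p)
{
	struct buf *b;

	for (b = p->head; b; b = b->next) {
		if (!b->busy) {
			b->busy = 1;
			return b;
		}
	}
	if (p->n >= NSKBPOOLMAX)
		return NULL;	/* caller must wait for a buffer to come back */
	b = calloc(1, sizeof *b);
	if (!b)
		return NULL;
	b->busy = 1;
	b->next = p->head;
	p->head = b;
	p->n++;
	return b;
}

/* Mark a buffer idle again so that pool_get can hand it out. */
static void pool_put(struct buf *b)
{
	b->busy = 0;
}

/* Free every buffer on the tracking list. */
static void pool_destroy(struct pool *p)
{
	while (p->head) {
		struct buf *b = p->head;

		p->head = b->next;
		free(b);
		p->n--;
	}
}
```

Keeping every allocated buffer on one list serves both purposes named in the
text: the buffers can be freed later, and idle ones can be reused cheaply.
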

Alternatives

  If the network layer never did a put_page on the pages in the bios
  we get from the block layer, then it would be possible for us to
  hand skbs to the network layer and forget about them, allowing the
  network layer to free skbs itself (and thereby calling our own
  skb->destructor callback function if we needed that).  In that case
  we could get rid of the pre-allocated skbs and also the
  d->skbpool_hd, instead just calling alloc_skb every time we wanted
  to transmit a packet.  The slab allocator would effectively maintain
  the list of skbs.

  Besides a loss of CPU cache locality, the main concern with that
  approach is the danger that it would increase the likelihood of
  deadlock when VM is trying to free pages by writing dirty data from
  the page cache through the aoe driver out to persistent storage on
  an AoE device.  Right now we have a situation where we have
  pre-allocation that corresponds to how much we use, which seems
  ideal.

  Of course, there's still the separate issue of receiving the packets
  that tell us that a write has successfully completed on the AoE
  target.  When memory is low and VM is using AoE to flush dirty data
  to free up pages, it would be perfect if there were a way for us to
  register a fast callback that could recognize write command
  completion responses.  But I don't think the current problems with
  the receive side of the situation are a justification for
  exacerbating the problem on the transmit side.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|5 ++
 drivers/block/aoe/aoecmd.c |  117 +++-
 drivers/block/aoe/aoedev.c |   52 +---
 3 files changed, 133 insertions(+), 41 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 2248ab2..67ef4d7 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -89,6 +89,7 @@ enum {
MIN_BUFS = 16,
NTARGETS = 8,
NAOEIFS = 8,
+   NSKBPOOLMAX = 128,
 
TIMERTICK = HZ / 10,
MINTIMER = HZ >> 2,
@@ -138,6 +139,7 @@ struct aoetgt {
u16 useme;
ulong lastwadj; /* last window adjustment */
int wpkts, rpkts;
+   int dataref;
 };
 
 struct aoedev {
@@ -159,6 +161,9 @@ struct aoedev {
spinlock_t lock;
struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
struct sk_buff *sendq_tl;
+   struct sk_buff *skbpool_hd;
+   struct sk_buff *skbpool_tl;
+   int nskbpool;
mempool_t *bufpool; /* for deadlock-free Buf allocation */
struct list_head bufq;  /* queue of bios to work on */
struct buf *inprocess;  /* the one we're currently working on */
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 1be5150..b49e06e 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -106,45 +106,104 @@ ifrotate(struct aoetgt 

[PATCH 06/13] user can ask driver to forget previously detected devices

2007-12-07 Thread Ed L. Cashin
When an AoE device is detected, the kernel is informed, and a new
block device is created.  If the device is unused, the block device
corresponding to a remote device that is no longer available may be
removed from the system by telling the aoe driver to "flush" its list
of devices.

Without this patch, software like GPFS and LVM may attempt to read
from AoE devices that were discovered earlier but are no longer
present, blocking until the I/O attempt times out.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 Documentation/aoe/mkdevs.sh |2 +
 Documentation/aoe/udev.txt  |1 +
 drivers/block/aoe/aoe.h |1 +
 drivers/block/aoe/aoechr.c  |5 ++
 drivers/block/aoe/aoedev.c  |   87 +-
 5 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh
index 97374aa..44c0ab7 100644
--- a/Documentation/aoe/mkdevs.sh
+++ b/Documentation/aoe/mkdevs.sh
@@ -29,6 +29,8 @@ rm -f $dir/interfaces
 mknod -m 0200 $dir/interfaces c $MAJOR 4
 rm -f $dir/revalidate
 mknod -m 0200 $dir/revalidate c $MAJOR 5
+rm -f $dir/flush
+mknod -m 0200 $dir/flush c $MAJOR 6
 
 export n_partitions
 mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'`
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index 17e76c4..8686e78 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -20,6 +20,7 @@ SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220
 SUBSYSTEM=="aoe", KERNEL=="err",   NAME="etherd/%k", GROUP="disk", MODE="0440"
 SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220"
 SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="flush", NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
 KERNEL=="etherd*",   NAME="%k", GROUP="disk"
diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index aecaac3..2248ab2 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -191,6 +191,7 @@ struct aoedev *aoedev_by_aoeaddr(int maj, int min);
 struct aoedev *aoedev_by_sysminor_m(ulong sysminor);
 void aoedev_downdev(struct aoedev *d);
 int aoedev_isbusy(struct aoedev *d);
+int aoedev_flush(const char __user *str, size_t size);
 
 int aoenet_init(void);
 void aoenet_exit(void);
diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 4a3889d..166f54f 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -15,6 +15,7 @@ enum {
MINOR_DISCOVER,
MINOR_INTERFACES,
MINOR_REVALIDATE,
+   MINOR_FLUSH,
MSGSZ = 2048,
NMSG = 100, /* message backlog to retain */
 };
@@ -43,6 +44,7 @@ static struct aoe_chardev chardevs[] = {
{ MINOR_DISCOVER, "discover" },
{ MINOR_INTERFACES, "interfaces" },
{ MINOR_REVALIDATE, "revalidate" },
+   { MINOR_FLUSH, "flush" },
 };
 
 static int
@@ -158,6 +160,9 @@ aoechr_write(struct file *filp, const char __user *buf, size_t cnt, loff_t *offp
break;
case MINOR_REVALIDATE:
ret = revalidate(buf, cnt);
+   break;
+   case MINOR_FLUSH:
+   ret = aoedev_flush(buf, cnt);
}
if (ret == 0)
ret = cnt;
diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index a4d625a..e26f6f4 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -9,6 +9,10 @@
 #include 
 #include "aoe.h"
 
+static void dummy_timer(ulong);
+static void aoedev_freedev(struct aoedev *);
+static void freetgt(struct aoetgt *t);
+
 static struct aoedev *devlist;
 static spinlock_t devlist_lock;
 
@@ -108,6 +112,70 @@ aoedev_downdev(struct aoedev *d)
d->flags &= ~DEVFL_UP;
 }
 
+static void
+aoedev_freedev(struct aoedev *d)
+{
+   struct aoetgt **t, **e;
+
+   if (d->gd) {
+   aoedisk_rm_sysfs(d);
+   del_gendisk(d->gd);
+   put_disk(d->gd);
+   }
+   t = d->targets;
+   e = t + NTARGETS;
+   for (; t < e && *t; t++)
+   freetgt(*t);
+   if (d->bufpool)
+   mempool_destroy(d->bufpool);
+   kfree(d);
+}
+
+int
+aoedev_flush(const char __user *str, size_t cnt)
+{
+   ulong flags;
+   struct aoedev *d, **dd;
+   struct aoedev *rmd = NULL;
+   char buf[16];
+   int all = 0;
+
+   if (cnt >= 3) {
+   if (cnt > sizeof buf)
+   cnt = sizeof buf;
+   if (copy_fr

[PATCH 05/13] eliminate goto and improve readability

2007-12-07 Thread Ed L. Cashin
Adam Richter suggested eliminating this goto.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |   69 +--
 1 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 1a5c4b5..4a3889d 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -194,52 +194,51 @@ aoechr_read(struct file *filp, char __user *buf, size_t cnt, loff_t *off)
ulong flags;
 
n = (unsigned long) filp->private_data;
-   switch (n) {
-   case MINOR_ERR:
-   spin_lock_irqsave(&emsgs_lock, flags);
-loop:
-   em = emsgs + emsgs_head_idx;
-   if ((em->flags & EMFL_VALID) == 0) {
-   if (filp->f_flags & O_NDELAY) {
-   spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -EAGAIN;
-   }
-   nblocked_emsgs_readers++;
+   if (n != MINOR_ERR)
+   return -EFAULT;
+
+   spin_lock_irqsave(&emsgs_lock, flags);
 
+   for (;;) {
+   em = emsgs + emsgs_head_idx;
+   if ((em->flags & EMFL_VALID) != 0)
+   break;
+   if (filp->f_flags & O_NDELAY) {
spin_unlock_irqrestore(&emsgs_lock, flags);
+   return -EAGAIN;
+   }
+   nblocked_emsgs_readers++;
+
+   spin_unlock_irqrestore(&emsgs_lock, flags);
 
-   n = down_interruptible(&emsgs_sema);
+   n = down_interruptible(&emsgs_sema);
 
-   spin_lock_irqsave(&emsgs_lock, flags);
+   spin_lock_irqsave(&emsgs_lock, flags);
 
-   nblocked_emsgs_readers--;
+   nblocked_emsgs_readers--;
 
-   if (n) {
-   spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -ERESTARTSYS;
-   }
-   goto loop;
-   }
-   if (em->len > cnt) {
+   if (n) {
spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -EAGAIN;
+   return -ERESTARTSYS;
}
-   mp = em->msg;
-   len = em->len;
-   em->msg = NULL;
-   em->flags &= ~EMFL_VALID;
+   }
+   if (em->len > cnt) {
+   spin_unlock_irqrestore(&emsgs_lock, flags);
+   return -EAGAIN;
+   }
+   mp = em->msg;
+   len = em->len;
+   em->msg = NULL;
+   em->flags &= ~EMFL_VALID;
 
-   emsgs_head_idx++;
-   emsgs_head_idx %= ARRAY_SIZE(emsgs);
+   emsgs_head_idx++;
+   emsgs_head_idx %= ARRAY_SIZE(emsgs);
 
-   spin_unlock_irqrestore(&emsgs_lock, flags);
+   spin_unlock_irqrestore(&emsgs_lock, flags);
 
-   n = copy_to_user(buf, mp, len);
-   kfree(mp);
-   return n == 0 ? len : -EFAULT;
-   default:
-   return -EFAULT;
-   }
+   n = copy_to_user(buf, mp, len);
+   kfree(mp);
+   return n == 0 ? len : -EFAULT;
 }
 
 static const struct file_operations aoe_fops = {
-- 
1.5.3.4



[PATCH 04/13] clean up udev configuration example

2007-12-07 Thread Ed L. Cashin
This patch adds a known default location for the udev configuration
file and uses the more recent "==" syntax for SUBSYSTEM and KERNEL.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 Documentation/aoe/udev-install.sh |5 -
 Documentation/aoe/udev.txt|   15 ---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh
index 6449911..15e86f5 100644
--- a/Documentation/aoe/udev-install.sh
+++ b/Documentation/aoe/udev-install.sh
@@ -23,7 +23,10 @@ fi
 # /etc/udev/rules.d
 #
 rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`"
-if test -z "$rules_d" || test ! -d "$rules_d"; then
+if test -z "$rules_d" ; then
+   rules_d=/etc/udev/rules.d
+fi
+if test ! -d "$rules_d"; then
echo "$me Error: cannot find udev rules directory" 1>&2
exit 1
 fi
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index a7ed1dc..17e76c4 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -1,6 +1,7 @@
 # These rules tell udev what device nodes to create for aoe support.
-# They may be installed along the following lines (adjusted to what
-# you see on your system).
+# They may be installed along the following lines.  Check the section
+# 8 udev manpage to see whether your udev supports SUBSYSTEM, and
+# whether it uses one or two equal signs for SUBSYSTEM and KERNEL.
 # 
 #   [EMAIL PROTECTED] ~$ su
 #   Password:
@@ -15,10 +16,10 @@
 #  
 
 # aoe char devices
-SUBSYSTEM="aoe", KERNEL="discover",NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440"
-SUBSYSTEM="aoe", KERNEL="interfaces",  NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="revalidate",  NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="discover",  NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="err",   NAME="etherd/%k", GROUP="disk", MODE="0440"
+SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
-KERNEL="etherd*",   NAME="%k", GROUP="disk"
+KERNEL=="etherd*",   NAME="%k", GROUP="disk"
-- 
1.5.3.4



[PATCH 03/13] mac_addr: avoid 64-bit arch compiler warnings

2007-12-07 Thread Ed L. Cashin
By returning unsigned long long, mac_addr does not generate compiler
warnings on 64-bit architectures.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|2 +-
 drivers/block/aoe/aoeblk.c |3 +--
 drivers/block/aoe/aoecmd.c |   10 +-
 drivers/block/aoe/aoenet.c |4 ++--
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 87df18b..aecaac3 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -198,4 +198,4 @@ void aoenet_xmit(struct sk_buff *);
 int is_aoe_netif(struct net_device *ifp);
 int set_aoe_iflist(const char __user *str, size_t size);
 
-u64 mac_addr(char addr[6]);
+unsigned long long mac_addr(char addr[6]);
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index e10a7f3..7168d3d 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -33,8 +33,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page)
 
if (t == NULL)
return snprintf(page, PAGE_SIZE, "none\n");
-   return snprintf(page, PAGE_SIZE, "%012llx\n",
-   (unsigned long long)mac_addr(t->addr));
+   return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr));
 }
 static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page)
 {
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 5e7daa1..1be5150 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -309,7 +309,8 @@ resend(struct aoedev *d, struct aoetgt *t, struct frame *f)
"%15s e%ld.%d [EMAIL PROTECTED] newtag=%08x "
"s=%012llx d=%012llx nout=%d\n",
"retransmit", d->aoemajor, d->aoeminor, f->tag, jiffies, n,
-   mac_addr(h->src), mac_addr(h->dst), t->nout);
+   mac_addr(h->src),
+   mac_addr(h->dst), t->nout);
aoechr_error(buf);
 
f->tag = n;
@@ -633,7 +634,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
 
if (d->ssize != ssize)
printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
-   (unsigned long long)mac_addr(t->addr),
+   mac_addr(t->addr),
d->aoemajor, d->aoeminor,
d->fw_ver, (long long)ssize);
d->ssize = ssize;
@@ -727,8 +728,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
t = gettgt(d, hin->src);
if (t == NULL) {
printk(KERN_INFO "aoe: can't find target e%ld.%d:%012llx\n",
-   d->aoemajor, d->aoeminor,
-   (unsigned long long) mac_addr(hin->src));
+   d->aoemajor, d->aoeminor, mac_addr(hin->src));
spin_unlock_irqrestore(&d->lock, flags);
return;
}
@@ -1003,7 +1003,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
"aoe: e%ld.%d: setting %d%s%s:%012llx\n",
d->aoemajor, d->aoeminor, n,
" byte data frames on ", ifp->nd->name,
-   (unsigned long long) mac_addr(t->addr));
+   mac_addr(t->addr));
ifp->maxbcnt = n;
}
}
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index 7a38a45..ada4a06 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -83,7 +83,7 @@ set_aoe_iflist(const char __user *user_str, size_t size)
return 0;
 }
 
-u64
+unsigned long long
 mac_addr(char addr[6])
 {
__be64 n = 0;
@@ -91,7 +91,7 @@ mac_addr(char addr[6])
 
memcpy(p + 2, addr, 6); /* (sizeof addr != 6) */
 
-   return __be64_to_cpu(n);
+   return (unsigned long long) __be64_to_cpu(n);
 }
 
 void
-- 
1.5.3.4
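
Why the return type matters for the "%012llx" conversions above can be shown
with a portable userspace sketch. It swaps the driver's memcpy plus
__be64_to_cpu for a shift loop, takes unsigned char rather than char, and adds
a hypothetical format_mac helper; all three are assumptions for illustration.

```c
#include <stdio.h>

/*
 * Pack a 6-byte MAC address into the low 48 bits of an integer.
 * Returning unsigned long long means the result matches "%llx" on
 * every architecture, so callers need no cast.
 */
static unsigned long long mac_addr(const unsigned char addr[6])
{
	unsigned long long n = 0;
	int i;

	for (i = 0; i < 6; i++)
		n = (n << 8) | addr[i];
	return n;
}

/* Format the address as 12 zero-padded hex digits. */
static int format_mac(char *buf, size_t len, const unsigned char addr[6])
{
	return snprintf(buf, len, "%012llx", mac_addr(addr));
}
```

On architectures where u64 is defined as unsigned long, passing a u64 to
"%llx" draws a format warning, which is the warning this patch avoids by
changing the return type.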



[PATCH 02/13] handle multiple network paths to AoE device

2007-12-07 Thread Ed L. Cashin
A remote AoE device is something that can process ATA commands and is
identified by an AoE shelf number and an AoE slot number.  Such a
device might have more than one network interface, and it might be
reachable by more than one local network interface.  This patch tracks
the network paths available to each AoE device, allowing them to be
used more efficiently.

Andrew Morton asked about the call to msleep_interruptible in the
revalidate function.  Yes, if a signal is pending, then
msleep_interruptible will not return 0.  That means we will not loop
but will call aoenet_xmit with a NULL skb, which is a noop.  If the
system is too low on memory or the aoe driver is too low on frames,
then the user can hit control-C to interrupt the attempt to do a
revalidate.  I have added a comment to the code summarizing that.

Andrew Morton asked whether the allocation performed inside addtgt
could use a more relaxed allocation like GFP_KERNEL, but addtgt is
called when the aoedev lock has been locked with spin_lock_irqsave.
It would be nice to allocate the memory under fewer restrictions, but
targets are only added when the device is being discovered, and if the
target can't be added right now, we can try again in a minute when
the next AoE config query broadcast goes out.

Andrew Morton pointed out that the "too many targets" message could be
printed for failing GFP_ATOMIC allocations.  The last patch in this
series makes the messages more specific.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|   57 +++--
 drivers/block/aoe/aoeblk.c |   62 -
 drivers/block/aoe/aoechr.c |   17 +-
 drivers/block/aoe/aoecmd.c |  675 ++--
 drivers/block/aoe/aoedev.c |  168 +--
 drivers/block/aoe/aoenet.c |9 +-
 6 files changed, 653 insertions(+), 335 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 4d0543a..87df18b 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -76,10 +76,8 @@ enum {
DEVFL_EXT = (1<<2), /* device accepts lba48 commands */
DEVFL_CLOSEWAIT = (1<<3), /* device is waiting for all closes to 
revalidate */
DEVFL_GDALLOC = (1<<4), /* need to alloc gendisk */
-   DEVFL_PAUSE = (1<<5),
+   DEVFL_KICKME = (1<<5),  /* slow polling network card catch */
DEVFL_NEWSIZE = (1<<6), /* need to update dev size in block layer */
-   DEVFL_MAXBCNT = (1<<7), /* d->maxbcnt is not changeable */
-   DEVFL_KICKME = (1<<8),
 
BUFFL_FAIL = 1,
 };
@@ -88,17 +86,24 @@ enum {
DEFAULTBCNT = 2 * 512,  /* 2 sectors */
NPERSHELF = 16, /* number of slots per shelf address */
FREETAG = -1,
-   MIN_BUFS = 8,
+   MIN_BUFS = 16,
+   NTARGETS = 8,
+   NAOEIFS = 8,
+
+   TIMERTICK = HZ / 10,
+   MINTIMER = HZ >> 2,
+   MAXTIMER = HZ << 1,
+   HELPWAIT = 20,
 };
 
 struct buf {
struct list_head bufs;
-   ulong start_time;   /* for disk stats */
+   ulong stime;/* for disk stats */
ulong flags;
ulong nframesout;
-   char *bufaddr;
ulong resid;
ulong bv_resid;
+   ulong bv_off;
sector_t sector;
struct bio *bio;
struct bio_vec *bv;
@@ -114,19 +119,37 @@ struct frame {
struct sk_buff *skb;
 };
 
+struct aoeif {
+   struct net_device *nd;
+   unsigned char lost;
+   unsigned char lostjumbo;
+   ushort maxbcnt;
+};
+
+struct aoetgt {
+   unsigned char addr[6];
+   ushort nframes;
+   struct frame *frames;
+   struct aoeif ifs[NAOEIFS];
+   struct aoeif *ifp;  /* current aoeif in use */
+   ushort nout;
+   ushort maxout;
+   u16 lasttag;/* last tag sent */
+   u16 useme;
+   ulong lastwadj; /* last window adjustment */
+   int wpkts, rpkts;
+};
+
 struct aoedev {
struct aoedev *next;
-   unsigned char addr[6];  /* remote mac addr */
-   ushort flags;
ulong sysminor;
ulong aoemajor;
-   ulong aoeminor;
+   u16 aoeminor;
+   u16 flags;
u16 nopen;  /* (bd_openers isn't available without 
sleeping) */
-   u16 lasttag;/* last tag sent */
u16 rttavg; /* round trip average of requests/responses */
u16 mintimer;
u16 fw_ver; /* version of blade's firmware */
-   u16 maxbcnt;
struct work_struct work;/* disk create work struct */
struct gendisk *gd;
struct request_queue blkq;
@@ -134,15 +157,14 @@ struct aoedev {
sector_t ssize;
struct timer_list timer;
spinlock_t lock;
-   struct net_device *ifp; /* interface ed is attached to */
struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
struct sk_buff *sendq_tl;
mempool_t *bu

[PATCH 01/13] bring driver version number to 47

2007-12-07 Thread Ed L. Cashin
These patches were made against the 2.6.23-rc4 kernel with the
aoe-properly-initialise-the-request_queues-backing_dev_info patch
(currently in mm) applied.  They were submitted earlier and have been
modified to incorporate feedback from the kernel development
community.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 07f02f8..4d0543a 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -1,5 +1,5 @@
 /* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
-#define VERSION "32"
+#define VERSION "47"
 #define AOE_MAJOR 152
 #define DEVICE_NAME "aoe"
 
-- 
1.5.3.4



[PATCH 11/13] remove extra space in prototypes for consistency

2007-12-07 Thread Ed L. Cashin
Remove extra space in prototypes for consistency.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoeblk.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 7168d3d..98ab170 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -15,7 +15,7 @@
 
 static struct kmem_cache *buf_pool_cache;
 
-static ssize_t aoedisk_show_state(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_state(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
 
@@ -26,7 +26,7 @@ static ssize_t aoedisk_show_state(struct gendisk * disk, char 
*page)
(d->nopen && !(d->flags & DEVFL_UP)) ? ",closewait" : "");
/* I'd rather see nopen exported so we can ditch closewait */
 }
-static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_mac(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
struct aoetgt *t = d->targets[0];
@@ -35,7 +35,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char 
*page)
return snprintf(page, PAGE_SIZE, "none\n");
return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr));
 }
-static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_netif(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
struct net_device *nds[8], **nd, **nnd, **ne;
@@ -71,7 +71,7 @@ static ssize_t aoedisk_show_netif(struct gendisk * disk, char 
*page)
return p-page;
 }
 /* firmware version */
-static ssize_t aoedisk_show_fwver(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_fwver(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
 
-- 
1.5.3.4



[PATCH 10/13] add module parameter for users who need more outstanding I/O

2007-12-07 Thread Ed L. Cashin
An AoE target provides an estimate of the number of outstanding
commands that the AoE initiator can send before getting a response.
The aoe_maxout parameter provides a way to set an even lower limit.
It will not allow a user to use more outstanding commands than the
target permits.  If a user discovers a problem with a large setting,
this parameter provides a way for us to work with them to debug the
problem.  We expect to improve the dynamic window sizing algorithm and
drop this parameter.  For the time being, it is a debugging aid.
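
The limit described above can be sketched in user-space C.  This is
an illustration only, not the driver code: effective_window() and its
parameter names are hypothetical.  Only the rule itself, that the
parameter can lower the target's advertised count but never raise it,
comes from this patch.

```c
/*
 * Sketch only: effective_window() is a hypothetical helper.
 * advertised: buffer count reported by the AoE target
 * maxout:     value of the aoe_maxout module parameter
 */
unsigned int effective_window(unsigned int advertised, unsigned int maxout)
{
	/* The parameter can lower the limit but never raise it. */
	return advertised > maxout ? maxout : advertised;
}
```

With the default of 16, a target advertising 64 buffers would be
limited to 16 outstanding commands, while a target advertising 8 would
stay at 8.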

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 7a96183..e92d885 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -18,6 +18,11 @@ static int aoe_deadsecs = 60 * 3;
 module_param(aoe_deadsecs, int, 0644);
 MODULE_PARM_DESC(aoe_deadsecs, "After aoe_deadsecs seconds, give up and fail dev.");
 
+static int aoe_maxout = 16;
+module_param(aoe_maxout, int, 0644);
+MODULE_PARM_DESC(aoe_maxout,
+   "Only aoe_maxout outstanding packets for every MAC on eX.Y.");
+
 static struct sk_buff *
 new_skb(ulong len)
 {
@@ -984,7 +989,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
struct aoeif *ifp;
ulong flags, sysminor, aoemajor;
struct sk_buff *sl;
-   enum { MAXFRAMES = 16 };
u16 n;
 
h = (struct aoe_hdr *) skb_mac_header(skb);
@@ -1009,8 +1013,8 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
}
 
n = be16_to_cpu(ch->bufcnt);
-   if (n > MAXFRAMES)  /* keep it reasonable */
-   n = MAXFRAMES;
+   if (n > aoe_maxout) /* keep it reasonable */
+   n = aoe_maxout;
 
d = aoedev_by_sysminor_m(sysminor);
if (d == NULL) {
-- 
1.5.3.4



[PATCH 09/13] remove race between use and initialization of locks

2007-12-07 Thread Ed L. Cashin
Alexey Dobriyan noticed a race in the initialization of the dynamic
locks in ...

  Message-ID: [EMAIL PROTECTED]

Andrew Morton commented that these locks should be initialized at
compile time, so this patch does that.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 166f54f..0ce9bda 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -35,8 +35,8 @@ struct ErrMsg {
 
 static struct ErrMsg emsgs[NMSG];
 static int emsgs_head_idx, emsgs_tail_idx;
-static struct semaphore emsgs_sema;
-static spinlock_t emsgs_lock;
+static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0);
+static DEFINE_SPINLOCK(emsgs_lock);
 static int nblocked_emsgs_readers;
 static struct class *aoe_class;
 static struct aoe_chardev chardevs[] = {
@@ -264,8 +264,6 @@ aoechr_init(void)
printk(KERN_ERR "aoe: can't register char device\n");
return n;
}
-   sema_init(&emsgs_sema, 0);
-   spin_lock_init(&emsgs_lock);
aoe_class = class_create(THIS_MODULE, "aoe");
if (IS_ERR(aoe_class)) {
unregister_chrdev(AOE_MAJOR, "aoechr");
-- 
1.5.3.4



[PATCH 08/13] only install new AoE device once

2007-12-07 Thread Ed L. Cashin
An aoe driver user who had about 70 AoE targets found that he was
hitting a BUG in sysfs_create_file because the aoe driver was trying
to tell the kernel about an AoE device more than once.  Each AoE
device was reachable by several local network interfaces, and multiple
ATA device identify responses were returning from that single device.

This patch eliminates a race condition so that aoe always informs the
block layer of a new AoE device once in the presence of multiple
incoming ATA device identify responses.
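
The guard can be sketched in user-space C as follows.  This is a
simplified, single-threaded illustration: struct dev_sketch,
identify_done() and the scheduled counter are hypothetical, and the
else branch is an assumption about code outside the hunk below.  The
flag values are the ones defined in aoe.h.  The real driver runs this
logic with the aoedev lock held.

```c
enum {
	DEVFL_GDALLOC = 1 << 4,	/* need to alloc gendisk */
	DEVFL_NEWSIZE = 1 << 6,	/* need to update dev size in block layer */
};

/* Hypothetical stand-in for the parts of struct aoedev used here. */
struct dev_sketch {
	unsigned int flags;
	int has_disk;	/* stands in for d->gd != NULL */
	int scheduled;	/* counts how often work was queued */
};

/* Called once per ATA identify response; safe to call repeatedly. */
void identify_done(struct dev_sketch *d)
{
	if (d->flags & (DEVFL_GDALLOC | DEVFL_NEWSIZE))
		return;	/* work is already pending, do not queue it again */
	if (d->has_disk)
		d->flags |= DEVFL_NEWSIZE;
	else
		d->flags |= DEVFL_GDALLOC;
	d->scheduled++;
}
```

However many identify responses arrive before the pending work runs
and clears the flag, the work is queued only once.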

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index b49e06e..7a96183 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -698,6 +698,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned 
char *id)
d->fw_ver, (long long)ssize);
d->ssize = ssize;
d->geo.start = 0;
+   if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE))
+   return;
if (d->gd != NULL) {
d->gd->capacity = ssize;
d->flags |= DEVFL_NEWSIZE;
-- 
1.5.3.4



[PATCH 05/13] eliminate goto and improve readability

2007-12-07 Thread Ed L. Cashin
Adam Richter suggested eliminating this goto.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoechr.c |   69 +--
 1 files changed, 34 insertions(+), 35 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 1a5c4b5..4a3889d 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -194,52 +194,51 @@ aoechr_read(struct file *filp, char __user *buf, size_t 
cnt, loff_t *off)
ulong flags;
 
n = (unsigned long) filp->private_data;
-   switch (n) {
-   case MINOR_ERR:
-   spin_lock_irqsave(&emsgs_lock, flags);
-loop:
-   em = emsgs + emsgs_head_idx;
-   if ((em->flags & EMFL_VALID) == 0) {
-   if (filp->f_flags & O_NDELAY) {
-   spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -EAGAIN;
-   }
-   nblocked_emsgs_readers++;
+   if (n != MINOR_ERR)
+   return -EFAULT;
+
+   spin_lock_irqsave(&emsgs_lock, flags);
 
+   for (;;) {
+   em = emsgs + emsgs_head_idx;
+   if ((em->flags & EMFL_VALID) != 0)
+   break;
+   if (filp->f_flags & O_NDELAY) {
spin_unlock_irqrestore(&emsgs_lock, flags);
+   return -EAGAIN;
+   }
+   nblocked_emsgs_readers++;
+
+   spin_unlock_irqrestore(&emsgs_lock, flags);
 
-   n = down_interruptible(&emsgs_sema);
+   n = down_interruptible(&emsgs_sema);
 
-   spin_lock_irqsave(&emsgs_lock, flags);
+   spin_lock_irqsave(&emsgs_lock, flags);
 
-   nblocked_emsgs_readers--;
+   nblocked_emsgs_readers--;
 
-   if (n) {
-   spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -ERESTARTSYS;
-   }
-   goto loop;
-   }
-   if (em->len > cnt) {
+   if (n) {
spin_unlock_irqrestore(&emsgs_lock, flags);
-   return -EAGAIN;
+   return -ERESTARTSYS;
}
-   mp = em->msg;
-   len = em->len;
-   em->msg = NULL;
-   em->flags &= ~EMFL_VALID;
+   }
+   if (em->len > cnt) {
+   spin_unlock_irqrestore(&emsgs_lock, flags);
+   return -EAGAIN;
+   }
+   mp = em->msg;
+   len = em->len;
+   em->msg = NULL;
+   em->flags &= ~EMFL_VALID;
 
-   emsgs_head_idx++;
-   emsgs_head_idx %= ARRAY_SIZE(emsgs);
+   emsgs_head_idx++;
+   emsgs_head_idx %= ARRAY_SIZE(emsgs);
 
-   spin_unlock_irqrestore(&emsgs_lock, flags);
+   spin_unlock_irqrestore(&emsgs_lock, flags);
 
-   n = copy_to_user(buf, mp, len);
-   kfree(mp);
-   return n == 0 ? len : -EFAULT;
-   default:
-   return -EFAULT;
-   }
+   n = copy_to_user(buf, mp, len);
+   kfree(mp);
+   return n == 0 ? len : -EFAULT;
 }
 
 static const struct file_operations aoe_fops = {
-- 
1.5.3.4



[PATCH 04/13] clean up udev configuration example

2007-12-07 Thread Ed L. Cashin
This patch adds a known default location for the udev configuration
file and uses the more recent == syntax for SUBSYSTEM and KERNEL.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 Documentation/aoe/udev-install.sh |5 -
 Documentation/aoe/udev.txt|   15 ---
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/Documentation/aoe/udev-install.sh 
b/Documentation/aoe/udev-install.sh
index 6449911..15e86f5 100644
--- a/Documentation/aoe/udev-install.sh
+++ b/Documentation/aoe/udev-install.sh
@@ -23,7 +23,10 @@ fi
 # /etc/udev/rules.d
 #
 rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`"
-if test -z "$rules_d" || test ! -d "$rules_d"; then
+if test -z "$rules_d" ; then
+   rules_d=/etc/udev/rules.d
+fi
+if test ! -d "$rules_d"; then
echo "$me Error: cannot find udev rules directory" 1>&2
exit 1
 fi
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index a7ed1dc..17e76c4 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -1,6 +1,7 @@
 # These rules tell udev what device nodes to create for aoe support.
-# They may be installed along the following lines (adjusted to what
-# you see on your system).
+# They may be installed along the following lines.  Check the section
+# 8 udev manpage to see whether your udev supports SUBSYSTEM, and
+# whether it uses one or two equal signs for SUBSYSTEM and KERNEL.
 # 
 #   [EMAIL PROTECTED] ~$ su
 #   Password:
@@ -15,10 +16,10 @@
 #  
 
 # aoe char devices
-SUBSYSTEM="aoe", KERNEL="discover",    NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="err",         NAME="etherd/%k", GROUP="disk", MODE="0440"
-SUBSYSTEM="aoe", KERNEL="interfaces",  NAME="etherd/%k", GROUP="disk", MODE="0220"
-SUBSYSTEM="aoe", KERNEL="revalidate",  NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="discover",   NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="err",        NAME="etherd/%k", GROUP="disk", MODE="0440"
+SUBSYSTEM=="aoe", KERNEL=="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="revalidate", NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
-KERNEL="etherd*",   NAME="%k", GROUP="disk"
+KERNEL=="etherd*",   NAME="%k", GROUP="disk"
-- 
1.5.3.4



[PATCH 03/13] mac_addr: avoid 64-bit arch compiler warnings

2007-12-07 Thread Ed L. Cashin
By returning unsigned long long, mac_addr does not generate compiler
warnings on 64-bit architectures.
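
The reason this helps can be sketched in user-space C.  The %012llx
format expects an unsigned long long argument.  The warnings
presumably arise where u64 is a typedef for unsigned long, so the
argument type does not match the format even though the width does.
The helper names below are hypothetical, and the byte-by-byte packing
is a portable stand-in for the driver's memcpy plus __be64_to_cpu, not
the driver's code.

```c
#include <stddef.h>
#include <stdio.h>

/* Pack a 6-byte MAC address into the low 48 bits of the result. */
unsigned long long mac_addr_sketch(const unsigned char addr[6])
{
	unsigned long long n = 0;
	int i;

	for (i = 0; i < 6; i++)
		n = (n << 8) | addr[i];	/* most significant byte first */
	return n;
}

/* The return type matches %llx exactly, so no cast is needed. */
int format_mac(char *buf, size_t len, const unsigned char addr[6])
{
	return snprintf(buf, len, "%012llx", mac_addr_sketch(addr));
}
```

Because the function itself returns unsigned long long, every caller
can drop its cast, which is what the hunks below do.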

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h|2 +-
 drivers/block/aoe/aoeblk.c |3 +--
 drivers/block/aoe/aoecmd.c |   10 +-
 drivers/block/aoe/aoenet.c |4 ++--
 4 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 87df18b..aecaac3 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -198,4 +198,4 @@ void aoenet_xmit(struct sk_buff *);
 int is_aoe_netif(struct net_device *ifp);
 int set_aoe_iflist(const char __user *str, size_t size);
 
-u64 mac_addr(char addr[6]);
+unsigned long long mac_addr(char addr[6]);
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index e10a7f3..7168d3d 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -33,8 +33,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char 
*page)
 
if (t == NULL)
return snprintf(page, PAGE_SIZE, "none\n");
-   return snprintf(page, PAGE_SIZE, "%012llx\n",
-   (unsigned long long)mac_addr(t->addr));
+   return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr));
 }
 static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page)
 {
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 5e7daa1..1be5150 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -309,7 +309,8 @@ resend(struct aoedev *d, struct aoetgt *t, struct frame *f)
"%15s e%ld.%d [EMAIL PROTECTED] newtag=%08x "
"s=%012llx d=%012llx nout=%d\n",
"retransmit", d->aoemajor, d->aoeminor, f->tag, jiffies, n,
-   mac_addr(h->src), mac_addr(h->dst), t->nout);
+   mac_addr(h->src),
+   mac_addr(h->dst), t->nout);
aoechr_error(buf);
 
f->tag = n;
@@ -633,7 +634,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned 
char *id)
 
if (d->ssize != ssize)
printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
-   (unsigned long long)mac_addr(t->addr),
+   mac_addr(t->addr),
d->aoemajor, d->aoeminor,
d->fw_ver, (long long)ssize);
d->ssize = ssize;
@@ -727,8 +728,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
t = gettgt(d, hin->src);
if (t == NULL) {
printk(KERN_INFO "aoe: can't find target e%ld.%d:%012llx\n",
-   d->aoemajor, d->aoeminor,
-   (unsigned long long) mac_addr(hin->src));
+   d->aoemajor, d->aoeminor, mac_addr(hin->src));
spin_unlock_irqrestore(&d->lock, flags);
return;
}
@@ -1003,7 +1003,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
"aoe: e%ld.%d: setting %d%s%s:%012llx\n",
d->aoemajor, d->aoeminor, n,
" byte data frames on ", ifp->nd->name,
-   (unsigned long long) mac_addr(t->addr));
+   mac_addr(t->addr));
ifp->maxbcnt = n;
}
}
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index 7a38a45..ada4a06 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -83,7 +83,7 @@ set_aoe_iflist(const char __user *user_str, size_t size)
return 0;
 }
 
-u64
+unsigned long long
 mac_addr(char addr[6])
 {
__be64 n = 0;
@@ -91,7 +91,7 @@ mac_addr(char addr[6])
 
memcpy(p + 2, addr, 6); /* (sizeof addr != 6) */
 
-   return __be64_to_cpu(n);
+   return (unsigned long long) __be64_to_cpu(n);
 }
 
 void
-- 
1.5.3.4



[PATCH 13/13] make error messages more specific

2007-12-07 Thread Ed L. Cashin
Andrew Morton pointed out that the "too many targets" message in patch
2 could be printed for failing GFP_ATOMIC allocations.  This patch
makes the messages more specific.
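
The allocation check in this patch follows a common C pattern:
attempt both allocations, then test them together and release
whatever succeeded.  A user-space sketch, assuming calloc and free in
place of kcalloc and kfree, with hypothetical type and function names:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct frame and struct aoetgt. */
struct frame_sketch {
	int tag;
	void *skb;
};

struct tgt_sketch {
	struct frame_sketch *frames;
	size_t nframes;
};

struct tgt_sketch *tgt_alloc(size_t nframes)
{
	struct tgt_sketch *t = calloc(1, sizeof *t);
	struct frame_sketch *f = calloc(nframes, sizeof *f);

	if (!t || !f) {
		free(f);	/* free(NULL) is a no-op, so no extra checks */
		free(t);
		fprintf(stderr, "cannot allocate memory to add target\n");
		return NULL;
	}
	t->frames = f;
	t->nframes = nframes;
	return t;
}
```

Reporting the failure at the point where it happens is what lets the
"too many targets" and out-of-memory cases produce different
messages, instead of one message in the caller covering both.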

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoecmd.c |   15 +++
 1 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index bcea36c..1e37cf6 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -957,15 +957,17 @@ addtgt(struct aoedev *d, char *addr, ulong nframes)
for (; tt < te && *tt; tt++)
;
 
-   if (tt == te)
+   if (tt == te) {
+   printk(KERN_INFO
+   "aoe: device addtgt failure; too many targets\n");
return NULL;
-
+   }
t = kcalloc(1, sizeof *t, GFP_ATOMIC);
-   if (!t)
-   return NULL;
f = kcalloc(nframes, sizeof *f, GFP_ATOMIC);
-   if (!f) {
+   if (!t || !f) {
+   kfree(f);
kfree(t);
+   printk(KERN_INFO "aoe: cannot allocate memory to add target\n");
return NULL;
}
 
@@ -1029,9 +1031,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
if (!t) {
t = addtgt(d, h->src, n);
if (!t) {
-   printk(KERN_INFO
-   "aoe: device addtgt failure; "
-   "too many targets?\n");
spin_unlock_irqrestore(&d->lock, flags);
return;
}
-- 
1.5.3.4



[PATCH 12/13] the aoeminor doesn't need a long format

2007-12-07 Thread Ed L. Cashin
The aoedev aoeminor member doesn't need a long format.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoeblk.c |7 ---
 drivers/block/aoe/aoecmd.c |5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 98ab170..b78a8ef 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -203,7 +203,7 @@ aoeblk_make_request(struct request_queue *q, struct bio 
*bio)
spin_lock_irqsave(&d->lock, flags);
 
if ((d->flags & DEVFL_UP) == 0) {
-   printk(KERN_INFO "aoe: device %ld.%ld is not up\n",
+   printk(KERN_INFO "aoe: device %ld.%d is not up\n",
d->aoemajor, d->aoeminor);
spin_unlock_irqrestore(&d->lock, flags);
mempool_free(buf, d->bufpool);
@@ -256,14 +256,15 @@ aoeblk_gdalloc(void *vp)
 
gd = alloc_disk(AOE_PARTITIONS);
if (gd == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n",
+   printk(KERN_ERR
+   "aoe: cannot allocate disk structure for %ld.%d\n",
d->aoemajor, d->aoeminor);
goto err;
}
 
d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache);
if (d->bufpool == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n",
+   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n",
d->aoemajor, d->aoeminor);
goto err_disk;
}
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index e92d885..bcea36c 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -697,7 +697,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned 
char *id)
}
 
if (d->ssize != ssize)
-   printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
+   printk(KERN_INFO
+   "aoe: %012llx e%ld.%d v%04x has %llu sectors\n",
mac_addr(t->addr),
d->aoemajor, d->aoeminor,
d->fw_ver, (long long)ssize);
@@ -822,7 +823,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
 
if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */
printk(KERN_ERR
-   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n",
+   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n",
ahout->cmdstat, ahin->cmdstat,
d->aoemajor, d->aoeminor);
if (buf)
-- 
1.5.3.4



[PATCH 07/13] dynamically allocate a capped number of skbs when necessary

2007-12-07 Thread Ed L. Cashin
What this Patch Does

  Even before this recent series of 12 patches to 2.6.22-rc4, the aoe
  driver was reusing a small set of skbs that were allocated once and
  were only used for outbound AoE commands.

  The network layer cannot be allowed to put_page on the data that is
  still associated with a bio we haven't returned to the block layer,
  so the aoe driver (even before the patch under discussion) is still
  the owner of skbs that have been handed to the network layer for
  transmission.  We need to keep track of these skbs so that we can
  free them, but by tracking them, we can also easily re-use them.

  The new patch was a response to the behavior of certain network
  drivers.  We cannot reuse an skb that the network driver still has
  in its transmit ring.  Network drivers can defer transmit ring
  cleanup and then use the state in the skb to determine how many data
  segments to clean up in its transmit ring.  The tg3 driver is one
  driver that behaves in this way.

  When the network driver defers cleanup of its transmit ring, the aoe
  driver can find itself in a situation where it would like to send an
  AoE command, and the AoE target is ready for more work, but the
  network driver still has all of the pre-allocated skbs.  In that
  case, the new patch just calls alloc_skb, as you'd expect.

  We don't want to get carried away, though.  We try not to do
  excessive allocation in the write path, so we cap the number of skbs
  we dynamically allocate.

  Probably calling it a dynamic pool is misleading.  We were already
  trying to use a small fixed-size set of pre-allocated skbs before
  this patch, and this patch just provides a little headroom (with a
  ceiling, though) to accommodate network drivers that hang onto skbs,
  by allocating when needed.  The d->skbpool_hd list of allocated skbs
  is necessary so that we can free them later.

  We didn't notice the need for this headroom until AoE targets got
  fast enough.
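
  The reuse-or-allocate policy described above can be sketched in
  user-space C.  This is a simplified illustration, not the driver
  code: struct pbuf, struct pool, pool_get() and pool_put() are
  hypothetical names, the busy flag stands in for whatever state tells
  the driver that the network layer still holds an skb, and calloc
  stands in for alloc_skb.

```c
#include <stdlib.h>

struct pbuf {
	struct pbuf *next;
	int busy;	/* nonzero while the "network driver" still holds it */
};

struct pool {
	struct pbuf *head, *tail;
	int nalloc;	/* buffers allocated so far */
	int cap;	/* ceiling on allocations, like NSKBPOOLMAX */
};

/* Append a buffer to the pool so it can be reused or freed later. */
void pool_put(struct pool *p, struct pbuf *b)
{
	b->next = NULL;
	if (p->tail)
		p->tail->next = b;
	else
		p->head = b;
	p->tail = b;
}

/* Reuse the oldest buffer if it is idle; otherwise allocate, up to the cap. */
struct pbuf *pool_get(struct pool *p)
{
	struct pbuf *b = p->head;

	if (b && !b->busy) {
		p->head = b->next;
		if (!p->head)
			p->tail = NULL;
		b->next = NULL;
		return b;
	}
	if (p->nalloc >= p->cap)
		return NULL;	/* at the ceiling: the caller must wait */
	b = calloc(1, sizeof *b);
	if (b)
		p->nalloc++;
	return b;
}
```

  Checking only the oldest buffer keeps the lookup cheap.  If it is
  still held, newer ones are likely to be held as well, so the sketch
  falls through to allocation or to waiting.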

Alternatives

  If the network layer never did a put_page on the pages in the bios
  we get from the block layer, then it would be possible for us to
  hand skbs to the network layer and forget about them, allowing the
  network layer to free skbs itself (and thereby calling our own
  skb->destructor callback function if we needed that).  In that case
  we could get rid of the pre-allocated skbs and also the
  d->skbpool_hd, instead just calling alloc_skb every time we wanted
  to transmit a packet.  The slab allocator would effectively maintain
  the list of skbs.

  Besides a loss of CPU cache locality, the main concern with that
  approach is the danger that it would increase the likelihood of
  deadlock when VM is trying to free pages by writing dirty data from
  the page cache through the aoe driver out to persistent storage on
  an AoE device.  Right now we have a situation where we have
  pre-allocation that corresponds to how much we use, which seems
  ideal.

  Of course, there's still the separate issue of receiving the packets
  that tell us that a write has successfully completed on the AoE
  target.  When memory is low and VM is using AoE to flush dirty data
  to free up pages, it would be perfect if there were a way for us to
  register a fast callback that could recognize write command
  completion responses.  But I don't think the current problems with
  the receive side of the situation are a justification for
  exacerbating the problem on the transmit side.

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoe.h|5 ++
 drivers/block/aoe/aoecmd.c |  117 +++-
 drivers/block/aoe/aoedev.c |   52 +---
 3 files changed, 133 insertions(+), 41 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 2248ab2..67ef4d7 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -89,6 +89,7 @@ enum {
MIN_BUFS = 16,
NTARGETS = 8,
NAOEIFS = 8,
+   NSKBPOOLMAX = 128,
 
TIMERTICK = HZ / 10,
MINTIMER = HZ >> 2,
@@ -138,6 +139,7 @@ struct aoetgt {
u16 useme;
ulong lastwadj; /* last window adjustment */
int wpkts, rpkts;
+   int dataref;
 };
 
 struct aoedev {
@@ -159,6 +161,9 @@ struct aoedev {
spinlock_t lock;
struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
struct sk_buff *sendq_tl;
+   struct sk_buff *skbpool_hd;
+   struct sk_buff *skbpool_tl;
+   int nskbpool;
mempool_t *bufpool; /* for deadlock-free Buf allocation */
struct list_head bufq;  /* queue of bios to work on */
struct buf *inprocess;  /* the one we're currently working on */
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 1be5150..b49e06e 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -106,45 +106,104 @@ ifrotate(struct aoetgt *t)
}
 }
 
+static void
+skb_pool_put(struct

[PATCH 01/13] bring driver version number to 47

2007-12-07 Thread Ed L. Cashin
These patches were made against the 2.6.23-rc4 kernel with the
aoe-properly-initialise-the-request_queues-backing_dev_info patch
(currently in mm) applied.  They were submitted earlier and have been
modified to incorporate feedback from the kernel development
community.

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 drivers/block/aoe/aoe.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 07f02f8..4d0543a 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -1,5 +1,5 @@
 /* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
-#define VERSION 32
+#define VERSION 47
 #define AOE_MAJOR 152
 #define DEVICE_NAME "aoe"
 
-- 
1.5.3.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 06/13] user can ask driver to forget previously detected devices

2007-12-07 Thread Ed L. Cashin
When an AoE device is detected, the kernel is informed, and a new
block device is created.  If the device is unused, the block device
corresponding to a remote device that is no longer available may be
removed from the system by telling the aoe driver to flush its list
of devices.

Without this patch, software like GPFS and LVM may attempt to read
from AoE devices that were discovered earlier but are no longer
present, blocking until the I/O attempt times out.
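
As a usage illustration, a small userspace helper for asking the
driver to flush might look like the sketch below.  The function name
aoe_flush is made up, and the /dev/etherd/flush path in the usage
note is an assumption based on the mkdevs.sh and udev.txt hunks in
this patch; the node may live elsewhere on a given system.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write arg (for example "all") to the flush character device at path.
 * Returns 0 on success and -1 on any error.
 */
static int aoe_flush(const char *path, const char *arg)
{
	size_t len = strlen(arg);
	ssize_t n;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;
	n = write(fd, arg, len);
	if (close(fd) < 0)
		return -1;
	return (n == (ssize_t)len) ? 0 : -1;
}
```

A caller would use something like aoe_flush("/dev/etherd/flush",
"all").  Going by the aoedev_flush hunk below, writing "all" appears
to widen the flush to devices that are still marked up, while busy
devices are skipped either way.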

Signed-off-by: Ed L. Cashin [EMAIL PROTECTED]
---
 Documentation/aoe/mkdevs.sh |2 +
 Documentation/aoe/udev.txt  |1 +
 drivers/block/aoe/aoe.h |1 +
 drivers/block/aoe/aoechr.c  |5 ++
 drivers/block/aoe/aoedev.c  |   87 +-
 5 files changed, 77 insertions(+), 19 deletions(-)

diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh
index 97374aa..44c0ab7 100644
--- a/Documentation/aoe/mkdevs.sh
+++ b/Documentation/aoe/mkdevs.sh
@@ -29,6 +29,8 @@ rm -f $dir/interfaces
 mknod -m 0200 $dir/interfaces c $MAJOR 4
 rm -f $dir/revalidate
 mknod -m 0200 $dir/revalidate c $MAJOR 5
+rm -f $dir/flush
+mknod -m 0200 $dir/flush c $MAJOR 6
 
 export n_partitions
 mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'`
diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt
index 17e76c4..8686e78 100644
--- a/Documentation/aoe/udev.txt
+++ b/Documentation/aoe/udev.txt
@@ -20,6 +20,7 @@ SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220"
 SUBSYSTEM=="aoe", KERNEL=="err",        NAME="etherd/%k", GROUP="disk", MODE="0440"
 SUBSYSTEM=="aoe", KERNEL=="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220"
 SUBSYSTEM=="aoe", KERNEL=="revalidate", NAME="etherd/%k", GROUP="disk", MODE="0220"
+SUBSYSTEM=="aoe", KERNEL=="flush",      NAME="etherd/%k", GROUP="disk", MODE="0220"
 
 # aoe block devices 
 KERNEL=="etherd*",   NAME="%k", GROUP="disk"
diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index aecaac3..2248ab2 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -191,6 +191,7 @@ struct aoedev *aoedev_by_aoeaddr(int maj, int min);
 struct aoedev *aoedev_by_sysminor_m(ulong sysminor);
 void aoedev_downdev(struct aoedev *d);
 int aoedev_isbusy(struct aoedev *d);
+int aoedev_flush(const char __user *str, size_t size);
 
 int aoenet_init(void);
 void aoenet_exit(void);
diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 4a3889d..166f54f 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -15,6 +15,7 @@ enum {
MINOR_DISCOVER,
MINOR_INTERFACES,
MINOR_REVALIDATE,
+   MINOR_FLUSH,
MSGSZ = 2048,
NMSG = 100, /* message backlog to retain */
 };
@@ -43,6 +44,7 @@ static struct aoe_chardev chardevs[] = {
 	{ MINOR_DISCOVER, "discover" },
 	{ MINOR_INTERFACES, "interfaces" },
 	{ MINOR_REVALIDATE, "revalidate" },
+	{ MINOR_FLUSH, "flush" },
 };
 
 static int
@@ -158,6 +160,9 @@ aoechr_write(struct file *filp, const char __user *buf, size_t cnt, loff_t *offp
break;
case MINOR_REVALIDATE:
ret = revalidate(buf, cnt);
+   break;
+   case MINOR_FLUSH:
+   ret = aoedev_flush(buf, cnt);
}
if (ret == 0)
ret = cnt;
diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
index a4d625a..e26f6f4 100644
--- a/drivers/block/aoe/aoedev.c
+++ b/drivers/block/aoe/aoedev.c
@@ -9,6 +9,10 @@
 #include <linux/netdevice.h>
 #include "aoe.h"
 
+static void dummy_timer(ulong);
+static void aoedev_freedev(struct aoedev *);
+static void freetgt(struct aoetgt *t);
+
 static struct aoedev *devlist;
 static spinlock_t devlist_lock;
 
@@ -108,6 +112,70 @@ aoedev_downdev(struct aoedev *d)
 	d->flags &= ~DEVFL_UP;
 }
 
+static void
+aoedev_freedev(struct aoedev *d)
+{
+   struct aoetgt **t, **e;
+
+	if (d->gd) {
+		aoedisk_rm_sysfs(d);
+		del_gendisk(d->gd);
+		put_disk(d->gd);
+	}
+	t = d->targets;
+	e = t + NTARGETS;
+	for (; t < e && *t; t++)
+		freetgt(*t);
+	if (d->bufpool)
+		mempool_destroy(d->bufpool);
+   kfree(d);
+}
+
+int
+aoedev_flush(const char __user *str, size_t cnt)
+{
+   ulong flags;
+   struct aoedev *d, **dd;
+   struct aoedev *rmd = NULL;
+   char buf[16];
+   int all = 0;
+
+	if (cnt >= 3) {
+		if (cnt > sizeof buf)
+			cnt = sizeof buf;
+		if (copy_from_user(buf, str, cnt))
+			return -EFAULT;
+		all = !strncmp(buf, "all", 3);
+	}
+
+	flush_scheduled_work();
+	spin_lock_irqsave(&devlist_lock, flags);
+	dd = &devlist;
+	while ((d = *dd)) {
+		spin_lock(&d->lock);
+		if ((!all && (d->flags & DEVFL_UP))
+		|| (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE))
+		|| d->nopen

Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

2007-12-03 Thread Ed L. Cashin
On Mon, Dec 03, 2007 at 03:13:49PM -0800, Andrew Morton wrote:
> On Mon, 3 Dec 2007 14:47:22 -0800
> Andrew Morton <[EMAIL PROTECTED]> wrote:
> 
> > Does this fix?
> 
> Slightly more elaborate version

Yes, this patch does eliminate the problem.  Without it, no write can
complete, and with it I have seen many writes complete without any
trouble.

Thank you for looking into this.  I will look more closely at this
patch tomorrow.

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

2007-12-03 Thread Ed L. Cashin
On Mon, Dec 03, 2007 at 02:47:22PM -0800, Andrew Morton wrote:
> On Mon, 3 Dec 2007 16:38:37 -0500
> "Ed L. Cashin" <[EMAIL PROTECTED]> wrote:
...
> > It appears that the fbc->counters pointer is NULL.
> 
> Does this fix?
> 
> --- a/drivers/block/aoe/aoeblk.c~a
> +++ a/drivers/block/aoe/aoeblk.c
> @@ -6,6 +6,7 @@
>  
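>  #include <linux/hdreg.h>
>  #include <linux/blkdev.h>
> +#include <linux/backing-dev.h>
>  #include <linux/fs.h>
>  #include <linux/ioctl.h>
>  #include <linux/genhd.h>
> @@ -228,6 +229,7 @@ aoeblk_gdalloc(void *vp)
>  
>  	spin_lock_irqsave(&d->lock, flags);
>  	blk_queue_make_request(&d->blkq, aoeblk_make_request);
> +	bdi_init(&d->blkq.backing_dev_info);
>  	gd->major = AOE_MAJOR;
>  	gd->first_minor = d->sysminor * AOE_PARTITIONS;
>  	gd->fops = &aoe_bdops;
> _
> 
> 
> <wonders whether blk_queue_make_request() should be running bdi_init()?>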

No, the behavior doesn't change with this patch applied.

Meanwhile I have started a git bisect, and hopefully that will turn up
a specific patch before I hit an unbootable kernel or get my machine
in a state where it won't boot.

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

2007-12-03 Thread Ed L. Cashin
On Mon, Dec 03, 2007 at 04:00:05PM -0500, Ed L. Cashin wrote:
...
> I'll keep looking at this, but at a glance it looks like the cpu
> number is valid, because I don't trip a BUG_ON when I make the change
> below (the badval variable is noise, sorry).
> 
>   --- lx/lib/percpu_counter.c.20071130 2007-12-03 15:43:19.000000000 -0500
>   +++ lx/lib/percpu_counter.c 2007-12-03 15:47:38.000000000 -0500
>   @@ -33,7 +33,9 @@ void __percpu_counter_add(struct percpu_
>   s64 count;
>   s32 *pcount;
>   int cpu = get_cpu();
>   +   u64 badval = 0xULL;
>
>   +   BUG_ON(!cpu_possible(cpu));
>   pcount = per_cpu_ptr(fbc->counters, cpu);
>   count = *pcount + amount;
>   if (count >= batch || count <= -batch) {

It appears that the fbc->counters pointer is NULL.  I added the line,

BUG_ON(!fbc->counters);

... (on line 39 in my percpu_counter.c), and it results in the trace
below.  It looks like when it's NULL, percpu_ptr passes it to
__percpu_disguise, which makes it all ones and then tries to
dereference 0xffffffffffffffff to access the "ptrs" member of the
struct percpu_data.
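
A minimal userspace sketch of that disguise step, assuming the
disguise is a plain bitwise complement of the pointer (which matches
the "all ones" result described above).  The function name disguise
is made up for illustration and is not the kernel macro itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the pointer disguise: flip every bit so that
 * leak checkers do not see a plain pointer to the per-cpu data.
 */
static uintptr_t disguise(const void *p)
{
	return ~(uintptr_t)p;
}

/* Undo the disguise; applying the complement twice is the identity. */
static void *undisguise(uintptr_t v)
{
	return (void *)~v;
}
```

With this model, disguise(NULL) is an all-ones value, so any code
that undoes nothing and dereferences the result directly faults at
the top of the address space, which is consistent with the oops seen
before the BUG_ON was added.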

[ cut here ]
kernel BUG at lib/percpu_counter.c:39!
invalid opcode:  [1] SMP 
CPU 0 
Modules linked in: aoe
Pid: 3354, comm: bash Not tainted 2.6.24-rc3-47dbg #10
RIP: 0010:[]  [] 
__percpu_counter_add+0x2a/0x8f
RSP: 0018:810075031aa8  EFLAGS: 00010046
RAX:  RBX: 81007fd19bd8 RCX: 
RDX: 0010 RSI: 0001 RDI: 
RBP: 810075031ac8 R08: 81007cc077b0 R09: 802ae5ee
R10: 810075031aa8 R11: 8100750318e8 R12: 81007c81c380
R13: 810073ce8250 R14: 0200 R15: 8100755016b0
FS:  2b3e5c052db0() GS:8078b000() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 2b7f44fb64e0 CR3: 7c4b1000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process bash (pid: 3354, threadinfo 81007503, task 81007b4da040)
Stack:  810075031ac8 81007fd19bd8 81007c81c380 
 810075031af8 802ae682 100075031ae8 8100755016b0
 0200 81007fd19bd8 810075031b18 802ae75c
Call Trace:
 [] __set_page_dirty+0xdc/0x121
 [] mark_buffer_dirty+0x95/0x99
 [] __block_commit_write+0x72/0xac
 [] block_write_end+0x4f/0x5b
 [] blkdev_write_end+0x1b/0x38
 [] generic_file_buffered_write+0x1c0/0x648
 [] current_fs_time+0x22/0x29
 [] __generic_file_aio_write_nolock+0x358/0x3c2
 [] filemap_fault+0x1c4/0x320
 [] unlock_page+0x2d/0x31
 [] generic_file_aio_write_nolock+0x3b/0x8d
 [] do_sync_write+0xe2/0x126
 [] autoremove_wake_function+0x0/0x38
 [] do_page_fault+0x3f8/0x7bb
 [] fd_install+0x5f/0x68
 [] vfs_write+0xae/0x137
 [] sys_write+0x47/0x70
 [] system_call+0x7e/0x83


Code: 0f 0b eb fe 0f a3 3d 7e 08 4f 00 19 c0 85 c0 75 04 0f 0b eb 
RIP  [] __percpu_counter_add+0x2a/0x8f
 RSP 
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
INFO: lockdep is turned off.

Call Trace:
 [] debug_show_held_locks+0x1b/0x24
 [] __might_sleep+0xc7/0xc9
 [] down_read+0x1d/0x4a
 [] exit_mm+0x34/0xf7
 [] do_exit+0x247/0x75b
 [] kernel_math_error+0x0/0x7e
 [] do_trap+0x101/0x110
 [] do_invalid_op+0x91/0x9a
 [] __percpu_counter_add+0x2a/0x8f
 [] :aoe:aoeblk_make_request+0x1c3/0x1d0
 [] io_schedule+0x28/0x34
 [] error_exit+0x0/0x9a
 [] __set_page_dirty+0x48/0x121
 [] __percpu_counter_add+0x2a/0x8f
 [] __set_page_dirty+0xdc/0x121
 [] mark_buffer_dirty+0x95/0x99
 [] __block_commit_write+0x72/0xac
 [] block_write_end+0x4f/0x5b
 [] blkdev_write_end+0x1b/0x38
 [] generic_file_buffered_write+0x1c0/0x648
 [] current_fs_time+0x22/0x29
 [] __generic_file_aio_write_nolock+0x358/0x3c2
 [] filemap_fault+0x1c4/0x320
 [] unlock_page+0x2d/0x31
 [] generic_file_aio_write_nolock+0x3b/0x8d
 [] do_sync_write+0xe2/0x126
 [] autoremove_wake_function+0x0/0x38
 [] do_page_fault+0x3f8/0x7bb
 [] fd_install+0x5f/0x68
 [] vfs_write+0xae/0x137
 [] sys_write+0x47/0x70
 [] system_call+0x7e/0x83



-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

2007-12-03 Thread Ed L. Cashin
On Mon, Dec 03, 2007 at 11:34:59AM -0800, Andrew Morton wrote:
...
> Strange.  It _looks_ like we've somehow caused smp_processor_id() to return
> a not-possible CPU number.
...
> Could you debug this a bit please?  Find out which CPU number
> __percpu_counter_add() is using, for a start?  I'd do:
...
> Alternatively, just do
> 
>   if (!cpu_possible(cpu))
>   printk(...)
> 
> in __percpu_counter_add().  Then you can proceed to work through the
> various operations which smp_processor_id() does and find out where it went
> wrong: print out %fs, mainly.
> 
> If the cpu number is valid then perhaps something scribbled on the cpu's
> per-cpu memory.

I'll keep looking at this, but at a glance it looks like the cpu
number is valid, because I don't trip a BUG_ON when I make the change
below (the badval variable is noise, sorry).

  --- lx/lib/percpu_counter.c.20071130 2007-12-03 15:43:19.000000000 -0500
  +++ lx/lib/percpu_counter.c 2007-12-03 15:47:38.000000000 -0500
  @@ -33,7 +33,9 @@ void __percpu_counter_add(struct percpu_
  s64 count;
  s32 *pcount;
  int cpu = get_cpu();
  +   u64 badval = 0xULL;
   
  +   BUG_ON(!cpu_possible(cpu));
  pcount = per_cpu_ptr(fbc->counters, cpu);
  count = *pcount + amount;
  if (count >= batch || count <= -batch) {

The trace is,

Unable to handle kernel paging request at  RIP: 
 [] __percpu_counter_add+0x35/0x7f
PGD 203067 PUD 204067 PMD 0 
Oops:  [1] SMP 
CPU 0 
Modules linked in: aoe
Pid: 2777, comm: bash Not tainted 2.6.24-rc3-47dbg #9
RIP: 0010:[]  [] 
__percpu_counter_add+0x35/0x7f
RSP: 0018:810078d19aa8  EFLAGS: 00010086
RAX:  RBX: 81007fc71950 RCX: 0010
RDX:  RSI: 0001 RDI: 810078d9a250
RBP: 810078d19ac8 R08: 81007cc077b0 R09: 802ae5ee
R10: 810078d19aa8 R11: 810078cb59d8 R12: 81007c81c380
R13: 810078d9a250 R14: 0200 R15: 81007805f830
FS:  2b341db94db0() GS:8078b000() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 7b415000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process bash (pid: 2777, threadinfo 810078d18000, task 810078d160c0)
Stack:  810078d19ac8 81007fc71950 81007c81c380 
 810078d19af8 802ae682 100078d19ae8 81007805f830
 0200 81007fc71950 810078d19b18 802ae75c
Call Trace:
 [] __set_page_dirty+0xdc/0x121
 [] mark_buffer_dirty+0x95/0x99
 [] __block_commit_write+0x72/0xac
 [] block_write_end+0x4f/0x5b
 [] blkdev_write_end+0x1b/0x38
 [] generic_file_buffered_write+0x1c0/0x648
 [] current_fs_time+0x22/0x29
 [] __generic_file_aio_write_nolock+0x358/0x3c2
 [] filemap_fault+0x1c4/0x320
 [] unlock_page+0x2d/0x31
 [] generic_file_aio_write_nolock+0x3b/0x8d
 [] do_sync_write+0xe2/0x126
 [] autoremove_wake_function+0x0/0x38
 [] do_page_fault+0x3f8/0x7bb
 [] fd_install+0x5f/0x68
 [] vfs_write+0xae/0x137
 [] sys_write+0x47/0x70
 [] system_call+0x7e/0x83


Code: 4c 8b 24 d0 49 63 04 24 48 8d 1c 30 48 63 c1 48 39 c3 7d 0a 
RIP  [] __percpu_counter_add+0x35/0x7f
 RSP 
CR2: 
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
INFO: lockdep is turned off.

Call Trace:
 [] debug_show_held_locks+0x1b/0x24
 [] __might_sleep+0xc7/0xc9
 [] down_read+0x1d/0x4a
 [] exit_mm+0x34/0xf7
 [] do_exit+0x247/0x75b
 [] do_page_fault+0x6c7/0x7bb
 [] thread_return+0x42/0x86
 [] :aoe:aoeblk_make_request+0x1c3/0x1d0
 [] error_exit+0x0/0x9a
 [] __set_page_dirty+0x48/0x121
 [] __percpu_counter_add+0x35/0x7f
 [] __set_page_dirty+0xdc/0x121
 [] mark_buffer_dirty+0x95/0x99
 [] __block_commit_write+0x72/0xac
 [] block_write_end+0x4f/0x5b
 [] blkdev_write_end+0x1b/0x38
 [] generic_file_buffered_write+0x1c0/0x648
 [] current_fs_time+0x22/0x29
 [] __generic_file_aio_write_nolock+0x358/0x3c2
 [] filemap_fault+0x1c4/0x320
 [] unlock_page+0x2d/0x31
 [] generic_file_aio_write_nolock+0x3b/0x8d
 [] do_sync_write+0xe2/0x126
 [] autoremove_wake_function+0x0/0x38
 [] do_page_fault+0x3f8/0x7bb
 [] fd_install+0x5f/0x68
 [] vfs_write+0xae/0x137
 [] sys_write+0x47/0x70
 [] system_call+0x7e/0x83



-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

2007-12-03 Thread Ed L. Cashin
On Sat, Dec 01, 2007 at 12:23:02PM -0800, Andrew Morton wrote:
> (switched to email - please respond via emailed reply-to-all, not via the
> bugzilla web interface)
> 
> On Sat,  1 Dec 2007 11:54:11 -0800 (PST) [EMAIL PROTECTED] wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=9482
...
> Damn that's odd.  General Protection Fault in
> __set_page_dirty->__percpu_counter_add().  No sign of AOE in the trace.
> 
> I assume that it is repeatable and that it doesn't occur with mkfs on
> regular local disk drives?

I am encountering this same problem during testing of some patches I
would like to send to the LKML, applied to 2.6.24-rc3, and I can trip
this problem with just,

  echo > /dev/etherd/e7.0

... at which point I get the trace below.  (I had added a couple of
checks for 0xffffffffffffffff pointers to __percpu_counter_add.)  I
haven't been able to check the unpatched aoe driver, but it looks the
same.

Unable to handle kernel paging request at  RIP: 
 [] __percpu_counter_add+0x24/0x6d
PGD 203067 PUD 204067 PMD 0 
Oops:  [1] SMP 
CPU 0 
Modules linked in: aoe
Pid: 2860, comm: bash Not tainted 2.6.24-rc3-47dbg #5
RIP: 0010:[]  [] 
__percpu_counter_add+0x24/0x6d
RSP: 0018:81007a0fbaa8  EFLAGS: 00010092
RAX:  RBX: 81007fcc48e0 RCX: 0010
RDX:  RSI: 0001 RDI: 81007ace7240
RBP: 81007a0fbac8 R08: 81007cc077b0 R09: 802ae5ee
R10: 81007a0fbaa8 R11: 810077dd99d8 R12: 81007ace7240
R13:  R14: 0200 R15: 810078473bb0
FS:  2ba601c5cdb0() GS:8078b000() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 77c31000 CR4: 06e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process bash (pid: 2860, threadinfo 81007a0fa000, task 81007bf48040)
Stack:  81007a0fbac8 81007fcc48e0 81007c81c380 
 81007a0fbaf8 802ae682 10007a0fbae8 810078473bb0
 0200 81007fcc48e0 81007a0fbb18 802ae75c
Call Trace:
 [] __set_page_dirty+0xdc/0x121
 [] mark_buffer_dirty+0x95/0x99
 [] __block_commit_write+0x72/0xac
 [] block_write_end+0x4f/0x5b
 [] blkdev_write_end+0x1b/0x38
 [] generic_file_buffered_write+0x1c0/0x648
 [] current_fs_time+0x22/0x29
 [] __generic_file_aio_write_nolock+0x358/0x3c2
 [] filemap_fault+0x1c4/0x320
 [] unlock_page+0x2d/0x31
 [] generic_file_aio_write_nolock+0x3b/0x8d
 [] do_sync_write+0xe2/0x126
 [] autoremove_wake_function+0x0/0x38
 [] do_page_fault+0x3f8/0x7bb
 [] fd_install+0x5f/0x68
 [] vfs_write+0xae/0x137
 [] sys_write+0x47/0x70
 [] system_call+0x7e/0x83


Code: 4c 8b 2c d0 49 63 45 00 48 8d 1c 30 48 63 c1 48 39 c3 7d 0a 
RIP  [] __percpu_counter_add+0x24/0x6d
 RSP 
CR2: 
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
INFO: lockdep is turned off.

Call Trace:
 [] debug_show_held_locks+0x1b/0x24
 [] __might_sleep+0xc7/0xc9
 [] down_read+0x1d/0x4a
 [] exit_mm+0x34/0xf7
 [] do_exit+0x247/0x75b
 [] do_page_fault+0x6c7/0x7bb
 [] thread_return+0x42/0x86
 [] :aoe:aoeblk_make_request+0x1e8/0x1f5
 [] error_exit+0x0/0x9a
 [] __set_page_dirty+0x48/0x121
 [] __percpu_counter_add+0x24/0x6d
 [] __set_page_dirty+0xdc/0x121
 [] mark_buffer_dirty+0x95/0x99
 [] __block_commit_write+0x72/0xac
 [] block_write_end+0x4f/0x5b
 [] blkdev_write_end+0x1b/0x38
 [] generic_file_buffered_write+0x1c0/0x648
 [] current_fs_time+0x22/0x29
 [] __generic_file_aio_write_nolock+0x358/0x3c2
 [] filemap_fault+0x1c4/0x320
 [] unlock_page+0x2d/0x31
 [] generic_file_aio_write_nolock+0x3b/0x8d
 [] do_sync_write+0xe2/0x126
 [] autoremove_wake_function+0x0/0x38
 [] do_page_fault+0x3f8/0x7bb
 [] fd_install+0x5f/0x68
 [] vfs_write+0xae/0x137
 [] sys_write+0x47/0x70
 [] system_call+0x7e/0x83



-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)

2007-12-03 Thread Ed L. Cashin
On Mon, Dec 03, 2007 at 02:47:22PM -0800, Andrew Morton wrote:
> On Mon, 3 Dec 2007 16:38:37 -0500
> "Ed L. Cashin" <[EMAIL PROTECTED]> wrote:
...
> > It appears that the fbc->counters pointer is NULL.
> 
> Does this fix?
> 
> --- a/drivers/block/aoe/aoeblk.c~a
> +++ a/drivers/block/aoe/aoeblk.c
> @@ -6,6 +6,7 @@
>  
>  #include <linux/hdreg.h>
>  #include <linux/blkdev.h>
> +#include <linux/backing-dev.h>
>  #include <linux/fs.h>
>  #include <linux/ioctl.h>
>  #include <linux/genhd.h>
> @@ -228,6 +229,7 @@ aoeblk_gdalloc(void *vp)
>  
>  	spin_lock_irqsave(&d->lock, flags);
>  	blk_queue_make_request(&d->blkq, aoeblk_make_request);
> +	bdi_init(&d->blkq.backing_dev_info);
>  	gd->major = AOE_MAJOR;
>  	gd->first_minor = d->sysminor * AOE_PARTITIONS;
>  	gd->fops = &aoe_bdops;
> _
> 
> 
> wonders whether blk_queue_make_request() should be running bdi_init()?

No, the behavior doesn't change with this patch applied.

Meanwhile I have started a git bisect, and hopefully that will turn up
a specific patch before I hit an unbootable kernel or get my machine
in a state where it won't boot.

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [PATCH] [73/2many] MAINTAINERS - ATA OVER ETHERNET DRIVER

2007-08-13 Thread Ed L. Cashin
On Sun, Aug 12, 2007 at 11:23:41PM -0700, [EMAIL PROTECTED] wrote:
> Add file pattern to MAINTAINER entry
> 
> Signed-off-by: Joe Perches <[EMAIL PROTECTED]>
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a165698..b8bb108 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -740,6 +740,8 @@ P:Ed L. Cashin
>  M:   [EMAIL PROTECTED]
>  W:   http://www.coraid.com/support/linux
>  S:   Supported
> +F:   Documentation/aoe/
> +F:   drivers/block/aoe/
>  
>  ATL1 ETHERNET DRIVER
>  P:   Jay Cliburn

That looks fine to me.

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


[PATCH] aoe: remove unecessary wrapper function

2007-08-06 Thread Ed L. Cashin
We can just use skb_mac_header now, and we don't need
a wrapper function to perform the cast.  Instead of
requiring the reader to check aoe.h to look up what an
aoe_hdr function does, I'd rather do without it.
---
 drivers/block/aoe/aoe.h|9 -
 drivers/block/aoe/aoecmd.c |   14 +++---
 drivers/block/aoe/aoenet.c |2 +-
 3 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index ba07f76..07f02f8 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -48,15 +48,6 @@ struct aoe_hdr {
__be32 tag;
 };
 
-#ifdef __KERNEL__
-#include <linux/skbuff.h>
-
-static inline struct aoe_hdr *aoe_hdr(const struct sk_buff *skb)
-{
-   return (struct aoe_hdr *)skb_mac_header(skb);
-}
-#endif
-
 struct aoe_atahdr {
unsigned char aflags;
unsigned char errfeat;
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 01fbdd3..8893a26 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -119,7 +119,7 @@ aoecmd_ata_rw(struct aoedev *d, struct frame *f)
 
/* initialize the headers & frame */
skb = f->skb;
-   h = aoe_hdr(skb);
+   h = (struct aoe_hdr *) skb_mac_header(skb);
ah = (struct aoe_atahdr *) (h+1);
skb_put(skb, sizeof *h + sizeof *ah);
memset(h, 0, skb->len);
@@ -208,7 +208,7 @@ aoecmd_cfg_pkts(ushort aoemajor, unsigned char aoeminor, struct sk_buff **tail)
skb->dev = ifp;
if (sl_tail == NULL)
sl_tail = skb;
-   h = aoe_hdr(skb);
+   h = (struct aoe_hdr *) skb_mac_header(skb);
memset(h, 0, sizeof *h + sizeof *ch);
 
memset(h->dst, 0xff, sizeof h->dst);
@@ -303,7 +303,7 @@ rexmit(struct aoedev *d, struct frame *f)
aoechr_error(buf);
 
skb = f->skb;
-   h = aoe_hdr(skb);
+   h = (struct aoe_hdr *) skb_mac_header(skb);
ah = (struct aoe_atahdr *) (h+1);
f->tag = n;
h->tag = cpu_to_be32(n);
@@ -532,7 +532,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
char ebuf[128];
u16 aoemajor;
 
-   hin = aoe_hdr(skb);
+   hin = (struct aoe_hdr *) skb_mac_header(skb);
aoemajor = be16_to_cpu(get_unaligned(&hin->major));
d = aoedev_by_aoeaddr(aoemajor, hin->minor);
if (d == NULL) {
@@ -564,7 +564,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
calc_rttavg(d, tsince(f->tag));
 
ahin = (struct aoe_atahdr *) (hin+1);
-   hout = aoe_hdr(f->skb);
+   hout = (struct aoe_hdr *) skb_mac_header(f->skb);
ahout = (struct aoe_atahdr *) (hout+1);
buf = f->buf;
 
@@ -698,7 +698,7 @@ aoecmd_ata_id(struct aoedev *d)
 
/* initialize the headers & frame */
skb = f->skb;
-   h = aoe_hdr(skb);
+   h = (struct aoe_hdr *) skb_mac_header(skb);
ah = (struct aoe_atahdr *) (h+1);
skb_put(skb, sizeof *h + sizeof *ah);
memset(h, 0, skb->len);
@@ -729,7 +729,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb)
enum { MAXFRAMES = 16 };
u16 n;
 
-   h = aoe_hdr(skb);
+   h = (struct aoe_hdr *) skb_mac_header(skb);
ch = (struct aoe_cfghdr *) (h+1);
 
/*
diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c
index f9ddfda..d54bf3a 100644
--- a/drivers/block/aoe/aoenet.c
+++ b/drivers/block/aoe/aoenet.c
@@ -123,7 +123,7 @@ aoenet_rcv(struct sk_buff *skb, struct net_device *ifp, struct packet_type *pt,
goto exit;
skb_push(skb, ETH_HLEN);/* (1) */
 
-   h = aoe_hdr(skb);
+   h = (struct aoe_hdr *) skb_mac_header(skb);
n = be32_to_cpu(get_unaligned(&h->tag));
if ((h->verfl & AOEFL_RSP) == 0 || (n & 1<<31))
goto exit;
-- 
1.5.2.1
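
For readers comparing the two styles in the patch above, here is a
minimal userspace sketch.  It is only an illustration: struct fake_skb
and fake_mac_header() are hypothetical stand-ins for struct sk_buff and
skb_mac_header(), and the aoe_hdr layout is abbreviated rather than
copied from aoe.h.  It shows that the typed wrapper and the open-coded
cast produce the same pointer, so the choice is about readability at
the call site.

```c
#include <stddef.h>

/* Abbreviated, hypothetical header layout; not the definition in aoe.h. */
struct aoe_hdr {
	unsigned char dst[6];
	unsigned char src[6];
	unsigned short type;
	unsigned char verfl;
};

/* Hypothetical stand-in for struct sk_buff. */
struct fake_skb {
	unsigned char *mac_header;	/* start of the link-layer header */
};

/* Stand-in for skb_mac_header(): returns the raw header pointer. */
static unsigned char *fake_mac_header(const struct fake_skb *skb)
{
	return skb->mac_header;
}

/* The style being removed: a typed wrapper that hides the cast. */
static struct aoe_hdr *aoe_hdr_wrapped(const struct fake_skb *skb)
{
	return (struct aoe_hdr *)fake_mac_header(skb);
}

/* The style the patch adopts: the cast is visible where it is used. */
static struct aoe_hdr *aoe_hdr_direct(const struct fake_skb *skb)
{
	struct aoe_hdr *h;

	h = (struct aoe_hdr *) fake_mac_header(skb);
	return h;
}
```

Pointing mac_header at a real struct aoe_hdr object and calling both
functions on the same fake_skb should return identical pointers.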



Re: ATA over ethernet swapping and obfuscated code

2007-07-31 Thread Ed L. Cashin
On Tue, Jul 31, 2007 at 05:29:24PM +0200, Pavel Machek wrote:
...
> Is the protocol documented somewhere? aoe.txt only points at
> HOWTO... aha, protocol is linked from wikipedia.
> http://www.coraid.com/documents/AoEr10.txt ... perhaps that should be
> linked from aoe.txt, too?

Perhaps.  Most people reading the aoe.txt file won't need to refer to
the protocol itself, though.

> Hmm, aoe protocol is really trivial. Perhaps netpoll/netconsole
> infrastructure could be used to create driver good enough for
> swapping? (Ok, it would not necessarily perform too well, but... we'd
> simply wait for the reply synchronously. It should be pretty simple).

I think that in general you still need a way to receive write
confirmations without allocating memory, and the driver can't provide
that mechanism.  The problem is that when memory is scarce, writes of
dirty data must be able to complete, but because memory is scarce,
there might not be enough to receive and process write-response
packets, and the driver has no way of affecting the situation.  That's
why I think a callback could work: The network layer could allow
storage drivers to register a callback that recognizes write
responses.

Usually the callback would not be used, but if free pages became so
scarce that network receives could not take place in a normal fashion,
the (zero or few) registered callbacks would be used to quickly
determine whether each packet was a write response.  The distinction
is important, because write responses can result in the freeing of
pages.

When a storage driver's callback identified a write response, then a
reserved skb could be used to process the receive without allocating
memory.  During the memory crunch packets that were not write
responses would be dropped just as they are already, but dirty pages
would be flushed.  The mechanism would only take effect when free
pages were scarce.
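
To make the idea a little more concrete, here is a rough userspace
sketch of the mechanism described above.  Nothing like this exists in
the network layer as far as this message is concerned; every name here
(wr_classifier, register_classifier, rx_low_memory, rx_release, the
reserved buffer) is hypothetical, and a plain byte array stands in for
the reserved skb.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLASSIFIERS 4
#define RESERVED_LEN 64

/* Hypothetical callback: true if pkt looks like a write response. */
typedef bool (*wr_classifier)(const unsigned char *pkt, size_t len);

static wr_classifier classifiers[MAX_CLASSIFIERS];
static size_t nclassifiers;

/* One preallocated buffer standing in for the reserved skb. */
static unsigned char reserved_buf[RESERVED_LEN];
static bool reserved_busy;

/* Storage drivers would register here; fails when the table is full. */
static bool register_classifier(wr_classifier fn)
{
	if (nclassifiers == MAX_CLASSIFIERS)
		return false;
	classifiers[nclassifiers++] = fn;
	return true;
}

/*
 * Receive path used only when normal allocation has failed.  Returns
 * the reserved buffer holding a copy of the packet if some classifier
 * recognized it as a write response, or NULL if the packet is dropped.
 */
static unsigned char *rx_low_memory(const unsigned char *pkt, size_t len)
{
	size_t i;

	if (reserved_busy || len > RESERVED_LEN)
		return NULL;
	for (i = 0; i < nclassifiers; i++) {
		if (classifiers[i](pkt, len)) {
			memcpy(reserved_buf, pkt, len);
			reserved_busy = true;
			return reserved_buf;
		}
	}
	return NULL;
}

/* Called once the write response has been processed. */
static void rx_release(void)
{
	reserved_busy = false;
}

/* Toy classifier: pretends a leading 'W' marks a write response. */
static bool toy_is_write_response(const unsigned char *pkt, size_t len)
{
	return len > 0 && pkt[0] == 'W';
}
```

With toy_is_write_response registered, a packet starting with 'W' gets
the reserved buffer, anything else is dropped, and a second write
response is dropped until rx_release() runs.  A real design would need
more than one reserved buffer and a classifier that costs almost
nothing per packet.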

It is easy to chat, though.  Maybe someday I will test and submit a
patch that implements this mechanism, but I'm hoping that somebody
beats me to it.  :)

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: ATA over ethernet swapping and obfuscated code

2007-07-31 Thread Ed L. Cashin
On Tue, Jul 31, 2007 at 03:58:31PM +0200, Pavel Machek wrote:
> Hi!
> 
> I wanted to know if it is possible/okay to swap over AOE... 
> 
> According to
> http://www.coraid.com/support/linux/EtherDrive-2.6-HOWTO-5.html#ss5.20
> .. it runs OOM even during normal use, so I guess swapping over it is
> no-no?

It can be done (e.g., to create virtual memory for running xfs_check
on a diskless machine as a temporary measure), but it probably won't
be a good idea until there is a mechanism that allows write responses
to be (quickly recognized and then) received without allocating memory
when there are no free pages.

I think if we could register a very fast function to recognize write
responses, which would be called only when free memory was very low,
and then use a pre-allocated receive skb for receiving write
responses, then we'd be OK, and the common case wouldn't be affected.

> Can I build both client and server for these using free software?

Yes.  A popular free target is the vblade (aoetools.sourceforge.net),
and there are others.  The most popular free software initiator is the
aoe driver in Linux.

> In the process, I looked at the aoe code, and parts of it look like
> obfuscated C contest. The use of switch() as an if was particularly
> creative; I'm not even sure if I translated it properly... can you
> take a look?

I recently submitted a set of patches, and Andrew Morton asked me to
avoid the switch statement you are talking about, so thanks for the
patch, but that code is going to be patched soon anyway.

More below.

> (Patch is 
> 
> Signed-off-by: Pavel Machek <[EMAIL PROTECTED]>
> 
> but I did not even compile test it)
>
> diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c
> index 05a9719..38ba35d 100644
> --- a/drivers/block/aoe/aoedev.c
> +++ b/drivers/block/aoe/aoedev.c
> @@ -64,29 +64,26 @@ aoedev_newdev(ulong nframes)
>  
>   d = kzalloc(sizeof *d, GFP_ATOMIC);
>   f = kcalloc(nframes, sizeof *f, GFP_ATOMIC);
> - switch (!d || !f) {
> - case 0:
> - d->nframes = nframes;
> - d->frames = f;
> - e = f + nframes;
> - for (; f<e; f++) {
> - f->tag = FREETAG;
> - f->skb = new_skb(ETH_ZLEN);
> - if (!f->skb)
> - break;
> - }
> - if (f == e)
> - break;
> + if (!d || !f) {
> + kfree(f);
> + kfree(d);
> + return NULL;
> + }
> +
> + d->nframes = nframes;
> + d->frames = f;
> + e = f + nframes;
> + for (; f<e; f++) {
> + f->tag = FREETAG;
> + f->skb = new_skb(ETH_ZLEN);
> + if (!f->skb)
> + break;
> + }
> + if (f != e) {
>   while (f > d->frames) {
>   f--;
>   dev_kfree_skb(f->skb);
>   }
> - default:
> - if (f)
> - kfree(f);
> - if (d)
> - kfree(d);
> - return NULL;
>   }
>   INIT_WORK(&d->work, aoecmd_sleepwork);
>   spin_lock_init(&d->lock);
> 
> 
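
As an aside on the quoted rewrite: the removed "default:" branch was
what freed f and d and returned NULL after the skbs were unwound, so
the translated version, as quoted, appears to fall through to
INIT_WORK when an skb allocation fails.  The following is a hedged
userspace sketch of the allocate-then-unwind pattern with that return
kept.  The names struct toy_dev, toy_newdev, new_buf, free_buf,
fail_after and live_bufs are made up for illustration, and malloc
stands in for new_skb.

```c
#include <stdlib.h>

struct toy_frame {
	int tag;
	void *skb;		/* stand-in for struct sk_buff * */
};

struct toy_dev {
	size_t nframes;
	struct toy_frame *frames;
};

/* Hypothetical failure injection: -1 never fails, N fails after N. */
static int fail_after = -1;
static int live_bufs;		/* buffers currently allocated */

static void *new_buf(void)
{
	void *p;

	if (fail_after == 0)
		return NULL;
	if (fail_after > 0)
		fail_after--;
	p = malloc(64);
	if (p)
		live_bufs++;
	return p;
}

static void free_buf(void *p)
{
	live_bufs--;
	free(p);
}

static struct toy_dev *toy_newdev(size_t nframes)
{
	struct toy_dev *d = calloc(1, sizeof *d);
	struct toy_frame *f = calloc(nframes, sizeof *f);
	struct toy_frame *e;

	if (!d || !f)
		goto err_free;

	d->nframes = nframes;
	d->frames = f;
	e = f + nframes;
	for (; f < e; f++) {
		f->tag = -1;
		f->skb = new_buf();
		if (!f->skb)
			break;
	}
	if (f == e)
		return d;

	/* Partial failure: release the buffers allocated so far. */
	while (f > d->frames) {
		f--;
		free_buf(f->skb);
	}
err_free:
	free(f);	/* f is back at the array start, or NULL */
	free(d);
	return NULL;
}
```

After the unwind loop f again points at the start of the array, so the
shared error label can free both allocations without a second pointer.
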
> aoedev_by_sysminor_m() returns with spinlock held in error case; I
> guess that's bad.
> 
> struct aoedev *
> aoedev_by_sysminor_m(ulong sysminor, ulong bufcnt)
> {
>   struct aoedev *d;
>   ulong flags;
> 
>   spin_lock_irqsave(&devlist_lock, flags);
> 
>   for (d=devlist; d; d=d->next)
>   if (d->sysminor == sysminor)
>   break;
> 
>   if (d == NULL) {
>   d = aoedev_newdev(bufcnt);
>   if (d == NULL) {
>   spin_unlock_irqrestore(&devlist_lock, flags);
>   printk(KERN_INFO "aoe: aoedev_newdev failure.\n");
>   return NULL;
> ~~~~~~~~~ here

I don't see what you mean.  There's an unlock two lines before the
return.

>   }
>   d->sysminor = sysminor;
>   d->aoemajor = AOEMAJOR(sysminor);
>   d->aoeminor = AOEMINOR(sysminor);
>   }
> 
>   spin_unlock_irqrestore(&devlist_lock, flags);
>   return d;
> }
> 
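
The locking point can be checked mechanically with a small userspace
sketch.  This is an illustration under assumptions, not the driver
code: toy_lock/toy_unlock just count, so a test can confirm the lock
is dropped on every return path, and struct mdev, mdev_new,
mdev_by_sysminor and alloc_should_fail are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy lock: a counter, so tests can check it is balanced. */
static int lock_depth;

static void toy_lock(void)
{
	lock_depth++;
}

static void toy_unlock(void)
{
	lock_depth--;
}

struct mdev {
	unsigned long sysminor;
	struct mdev *next;
};

static struct mdev *devlist;
static int alloc_should_fail;	/* hypothetical failure switch */

static struct mdev *mdev_new(void)
{
	if (alloc_should_fail)
		return NULL;
	return calloc(1, sizeof(struct mdev));
}

/* Look up a device by minor number, creating it if needed. */
static struct mdev *mdev_by_sysminor(unsigned long sysminor)
{
	struct mdev *d;

	toy_lock();
	for (d = devlist; d; d = d->next)
		if (d->sysminor == sysminor)
			break;
	if (d == NULL) {
		d = mdev_new();
		if (d == NULL) {
			toy_unlock();	/* error path unlocks first */
			return NULL;
		}
		d->sysminor = sysminor;
		d->next = devlist;
		devlist = d;
	}
	toy_unlock();
	return d;
}
```

Whether the lookup succeeds, creates a device, or fails to allocate,
lock_depth is back at zero when the function returns, which is the
property in question for aoedev_by_sysminor_m().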

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


[PATCH] stacked ifs (was Re: [PATCH 02/12] handle multiple network paths to AoE device)

2007-07-16 Thread Ed L. Cashin
On Mon, Jul 02, 2007 at 09:29:49PM -0700, Andrew Morton wrote:
> On Tue, 26 Jun 2007 14:50:10 -0400 "Ed L. Cashin" <[EMAIL PROTECTED]> wrote:
...
> > +static struct frame *
> > +freeframe(struct aoedev *d)
> >  {
> > +   struct frame *f, *e;
> > +   struct aoetgt **t;
> > +   ulong n;
> > +
> > +   if (d->targets[0] == NULL) {/* shouldn't happen, but I'm paranoid */
> > +   printk(KERN_ERR "aoe: NULL TARGETS!\n");
> > +   return NULL;
> > +   }
> > +   t = d->targets;
> > +   do {
> > +   if (t != d->htgt)
> > +   if ((*t)->ifp->nd)
> > +   if ((*t)->nout < (*t)->maxout) {
> 
> ugh.  Do this:
> 
>   do {
>   if (t == d->htgt)
>   continue;
>   if (!(*t)->ifp->nd)
>   continue;
>   if ((*t)->nout >= (*t)->maxout)
>   continue;
>   
>   <stuff>
>   } while (++t ...)

Do you think the "stacked ifs" in the first version above could be
accepted as a convenient extension to the K&R-based conventions in
Documentation/CodingStyle?  Brian Kernighan (the "K" in "K&R") and
Ken Thompson were among the UNIX hackers who produced the UNIX v7
files that have stacked ifs:

  namei.c, dump.c, iostat.c, od.c, sa.c, vplot.c, refer/what2.c,
  sed/sed1.c, tbl/t8.c, chess/{agen, play, stdin}.c

Certainly, Linux isn't UNIX, but stacked ifs needn't be treated as
foreign.  They're in the K&R tradition that Documentation/CodingStyle
is based on.

Stacked ifs are functionally equivalent to a single if with its
conditionals joined by "&&".  Once that is kept in mind, they are not at
all difficult to recognize and understand.  And they have some
advantages over the single if that uses "&&".

When editing code, it is easier to remove conditionals that are no
longer needed, or to arrange conditionals in a different order.
Conditional expressions stand out clearly.

When stacked ifs are in use, the resulting patches can be easier to
read because fewer lines need to change (compared to splitting on the
&&), and also simply because the text is more regular when it comes to
parentheses and ampersands.  There is less distracting noise.

Of course, my primary motivation is for us to be able to contribute
aoe driver patches that use this style, and that would be fantastic,
but I don't think it is unrealistic to say that the adoption of this
style would benefit others, helping to make patches easier to review
in some cases.

Understanding it only takes an understanding of C itself, so the only
"new" change would be a slight and justifiable loosening of the
indentation policy.

Signed-off-by: "Ed L. Cashin" <[EMAIL PROTECTED]>

---
 Documentation/CodingStyle |   21 +
 1 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle
index a667eb1..bb2bb57 100644
--- a/Documentation/CodingStyle
+++ b/Documentation/CodingStyle
@@ -94,6 +94,27 @@ void fun(int a, int b, int c)
next_statement;
 }
 
+One way to break a long condition in an if statement is to use stacked
+ifs.  The following code extracts are equivalent, but the version with
+stacked ifs is easier to read and edit, and it generates more specific
+patches.
+
+   /* version one: stacked ifs */
+   if (condition1)
+   if (condition2)
+   if (condition3)
+   if (condition4)
+   first_statement;
+   else
+   next_statement;
+
+   /* version two: logical and */
+   if (condition1 && condition2 && condition3 && condition4)
+   first_statement;
+   else
+   next_statement;
+
+
Chapter 3: Placing Braces and Spaces
 
 The other issue that always comes up in C styling is the placement of
-- 
1.5.2.1
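
The equivalence claimed in the message above can be checked with a
small sketch.  The function names are made up for illustration.
Without an else, the stacked form and the "&&" form behave the same
for every input.  One caveat worth keeping in mind: in C an else pairs
with the nearest unmatched if, so once an else is attached, the
stacked form runs the else branch only when the last condition is
false and the earlier ones are true.  The braces in stacked_else()
below spell out how a brace-less stacked if with a trailing else is
parsed.

```c
#include <stdbool.h>

/* Stacked form: each condition on its own line, no else. */
static int stacked(bool c1, bool c2, bool c3)
{
	int hits = 0;

	if (c1)
	if (c2)
	if (c3)
		hits++;
	return hits;
}

/* Joined form: the same conditions combined with &&. */
static int joined(bool c1, bool c2, bool c3)
{
	int hits = 0;

	if (c1 && c2 && c3)
		hits++;
	return hits;
}

/*
 * How C parses "if (c1) if (c2) A; else B;": the else belongs to the
 * inner if.  Braces are added only to make the pairing explicit.
 */
static int stacked_else(bool c1, bool c2)
{
	if (c1) {
		if (c2)
			return 1;
		else
			return 2;
	}
	return 0;	/* reached only when c1 is false */
}

/* The && form: the else covers every case where the test fails. */
static int joined_else(bool c1, bool c2)
{
	if (c1 && c2)
		return 1;
	else
		return 2;
}
```

For c1 false, stacked_else() takes neither branch while joined_else()
takes the else branch, so the two CodingStyle extracts agree only when
the else is left out or the pairing is made explicit with braces.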


-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [PATCH 02/12] handle multiple network paths to AoE device

2007-07-11 Thread Ed L. Cashin
On Mon, Jul 02, 2007 at 09:29:49PM -0700, Andrew Morton wrote:
> On Tue, 26 Jun 2007 14:50:10 -0400 "Ed L. Cashin" <[EMAIL PROTECTED]> wrote:
...
> > +loop:
> > +   skb = aoecmd_ata_id(d);
> > spin_unlock_irqrestore(&d->lock, flags);
> > +   if (!skb && !msleep_interruptible(200)) {
> > +   spin_lock_irqsave(&d->lock, flags);
> > +   goto loop;
> > +   }
> > +   aoenet_xmit(skb);
> > aoecmd_cfg(major, minor);
> > -
> > return 0;
> >  }
> 
> interruptible sleep?  Does this code work as intended when there's a signal
> pending?  (Maybe that's what the interruptible sleep is for: don't know,
> and am not inclined to work it out given the (low) level of comments in
> here and the (lower) level of changelogging).

Yes, if a signal is pending, then msleep_interruptible will not return
0.  That means we will not loop but will call aoenet_xmit with a NULL
skb, which is a noop.  So if the system is too low on memory or the
aoe driver is too low on frames, then the user can hit control-C to
interrupt the attempt to do a revalidate.

I will add a comment to that effect in the resubmitted patch.
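
In the meantime, here is a userspace sketch of the control flow just
described, with the locking left out.  The hooks are hypothetical:
get_frame_fn stands in for aoecmd_ata_id(), sleep_fn for
msleep_interruptible() (which returns 0 after a full sleep and the
remaining time if a signal cut it short), and toy_xmit for
aoenet_xmit() treating a NULL skb as a no-op.

```c
#include <stddef.h>

typedef void *(*get_frame_fn)(void);
typedef unsigned int (*sleep_fn)(unsigned int msecs);

static int xmit_count;

/* A NULL buffer is a no-op, as described for aoenet_xmit(). */
static void toy_xmit(void *skb)
{
	if (skb)
		xmit_count++;
}

/*
 * Retry until a frame is available or the sleep is interrupted.
 * Returns the number of attempts made.
 */
static int toy_revalidate(get_frame_fn get_frame, sleep_fn sleep_ms)
{
	void *skb;
	int tries = 0;

	for (;;) {
		tries++;
		skb = get_frame();
		if (skb || sleep_ms(200) != 0)
			break;	/* got a frame, or a signal arrived */
	}
	toy_xmit(skb);
	return tries;
}

/* Test hooks: fail the first frames_until_ok calls, then succeed. */
static int frames_until_ok;

static void *toy_get_frame(void)
{
	static int token;

	return frames_until_ok-- > 0 ? NULL : &token;
}

/* Sleep that always completes. */
static unsigned int sleep_ok(unsigned int msecs)
{
	(void)msecs;
	return 0;
}

/* Sleep that is interrupted at once: all the time remains. */
static unsigned int sleep_interrupted(unsigned int msecs)
{
	return msecs;
}
```

With sleep_ok the loop keeps trying until a frame appears.  With
sleep_interrupted it gives up after the first failed attempt and
transmits nothing, which matches the control-C behavior above.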

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


Re: [PATCH 07/12] use a dynamic pool of sk_buffs to keep up with fast targets

2007-07-06 Thread Ed L. Cashin
Hi.  I will address the style issues and other things that Andrew
Morton pointed out---Thanks again for the feedback.

As far as the skb pool goes, I'm afraid my comment is misleading.

What this Patch Does 

  Even before this recent series of 12 patches to 2.6.22-rc4, the aoe
  driver was reusing a small set of skbs that were allocated once and
  were only used for outbound AoE commands.
  
  The network layer cannot be allowed to put_page on the data that is
  still associated with a bio we haven't returned to the block layer,
  so the aoe driver (even before the patch under discussion) is still
  the owner of skbs that have been handed to the network layer for
  transmission.  We need to keep track of these skbs so that we can
  free them, but by tracking them, we can also easily re-use them.
  
  The new patch was a response to the behavior of certain network
  drivers.  We cannot reuse an skb that the network driver still has
  in its transmit ring.  Network drivers can defer transmit ring
  cleanup and then use the state in the skb to determine how many data
  segments to clean up in its transmit ring.  The tg3 driver is one
  driver that behaves in this way.
  
  When the network driver defers cleanup of its transmit ring, the aoe
  driver can find itself in a situation where it would like to send an
  AoE command, and the AoE target is ready for more work, but the
  network driver still has all of the pre-allocated skbs.  In that
  case, the new patch just calls alloc_skb, as you'd expect.
  
  We don't want to get carried away, though.  We try not to do
  excessive allocation in the write path, so we cap the number of skbs
  we dynamically allocate.
  
  Probably calling it a "dynamic pool" is misleading.  We were already
  trying to use a small fixed-size set of pre-allocated skbs before
  this patch, and this patch just provides a little headroom (with a
  ceiling, though) to accommodate network drivers that hang onto skbs,
  by allocating when needed.  The d->skbpool_hd list of allocated skbs
  is necessary so that we can free them later.
  
  We didn't notice the need for this headroom until AoE targets got
  fast enough, but the comment summarizing this patch still wasn't
  very good.  So, when I resubmit this patch, I will use a different
  comment:
  
dynamically allocate a capped number of skbs when necessary
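
  The allocation policy described above can be sketched in userspace C
  as follows.  This is only an illustration under assumptions: the
  names (struct buf, pool_get, NPREALLOC, DYN_CAP) and the sizes are
  hypothetical, plain structs stand in for skbs, and a "busy" flag
  stands in for "the network driver still has it in its transmit
  ring".

```c
#include <assert.h>
#include <stdlib.h>

enum { NPREALLOC = 4, DYN_CAP = 2 };	/* hypothetical sizes */

struct buf {
	int busy;		/* nonzero while the network driver still holds it */
	struct buf *next;	/* links dynamically allocated bufs so they can be freed */
};

struct pool {
	struct buf fixed[NPREALLOC];	/* allocated once, reused */
	struct buf *dyn_head;		/* list of extra bufs, capped at DYN_CAP */
	int ndyn;
};

/* Prefer a free pre-allocated buf, then a free extra buf, and only then
 * allocate a new one, as long as the cap has not been reached. */
static struct buf *pool_get(struct pool *p)
{
	struct buf *b;
	int i;

	for (i = 0; i < NPREALLOC; i++) {
		if (!p->fixed[i].busy) {
			p->fixed[i].busy = 1;
			return &p->fixed[i];
		}
	}
	for (b = p->dyn_head; b != NULL; b = b->next) {
		if (!b->busy) {
			b->busy = 1;
			return b;
		}
	}
	if (p->ndyn >= DYN_CAP)
		return NULL;		/* ceiling reached: caller must wait */
	b = calloc(1, sizeof *b);
	if (b == NULL)
		return NULL;
	b->busy = 1;
	b->next = p->dyn_head;
	p->dyn_head = b;
	p->ndyn++;
	return b;
}

/* Called when the network driver is finished with the buf. */
static void pool_put(struct buf *b)
{
	assert(b->busy);
	b->busy = 0;
}

/* Free the extra bufs; this is why they are kept on a list. */
static void pool_destroy(struct pool *p)
{
	while (p->dyn_head != NULL) {
		struct buf *b = p->dyn_head;

		p->dyn_head = b->next;
		free(b);
	}
	p->ndyn = 0;
}
```

  With these sizes, six requests succeed while nothing has been
  returned, the seventh fails, and a returned pre-allocated buf is
  handed out again before any further allocation is attempted.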


Alternatives

  If the network layer never did a put_page on the pages in the bios
  we get from the block layer, then it would be possible for us to
  hand skbs to the network layer and forget about them, allowing the
  network layer to free skbs itself (and thereby calling our own
  skb->destructor callback function if we needed that).  In that case
  we could get rid of the pre-allocated skbs and also the
  d->skbpool_hd, instead just calling alloc_skb every time we wanted
  to transmit a packet.  The slab allocator would effectively maintain
  the list of skbs.
  
  Besides a loss of CPU cache locality, the main concern with that
  approach is the danger that it would increase the likelihood of
  deadlock when VM is trying to free pages by writing dirty data from
  the page cache through the aoe driver out to persistent storage on
  an AoE device.  Right now we have a situation where we have
  pre-allocation that corresponds to how much we use, which seems
  ideal.
  
  Of course, there's still the separate issue of receiving the packets
  that tell us that a write has successfully completed on the AoE
  target.  When memory is low and VM is using AoE to flush dirty data
  to free up pages, it would be perfect if there were a way for us to
  register a fast callback that could recognize write command
  completion responses.  But I don't think the current problems with
  the receive side of the situation are a justification for
  exacerbating the problem on the transmit side.

-- 
  Support - http://www.coraid.com/support/howto.html
  Ed L Cashin <[EMAIL PROTECTED]>


[PATCH 1/1] docs: static initialization of spinlocks is OK

2007-07-03 Thread Ed L. Cashin
Static initialization of spinlocks is preferable to dynamic
initialization when it is practical.  This patch updates documentation
for consistency with comments in spinlock_types.h.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
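
(Illustration only, not part of the patch.)  The difference between
static and dynamic initialization can be shown with a userspace
analogy built on C11 atomics.  The toy_* names below are hypothetical
and merely mirror the shape of DEFINE_SPINLOCK() and spin_lock_init();
they are not the kernel macros.

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy lock type: an assumption for this sketch, not the kernel spinlock_t. */
typedef struct {
	atomic_flag held;
} toy_spinlock_t;

/* Static initialization: the lock is usable before any code runs. */
#define TOY_SPIN_LOCK_UNLOCKED	{ ATOMIC_FLAG_INIT }
#define DEFINE_TOY_SPINLOCK(x)	toy_spinlock_t x = TOY_SPIN_LOCK_UNLOCKED

static DEFINE_TOY_SPINLOCK(static_lock);

/* Dynamic initialization: the lock must be set up before first use. */
static toy_spinlock_t dynamic_lock;

static void toy_spin_lock_init(toy_spinlock_t *l)
{
	atomic_flag_clear(&l->held);
}

/* Returns 1 if the lock was acquired, 0 if it was already held. */
static int toy_spin_trylock(toy_spinlock_t *l)
{
	return !atomic_flag_test_and_set(&l->held);
}

static void toy_spin_unlock(toy_spinlock_t *l)
{
	atomic_flag_clear(&l->held);
}

/* Plays the role of xxx_init() in the documentation example. */
static int toy_init(void)
{
	toy_spin_lock_init(&dynamic_lock);
	return 0;
}
```

static_lock needs no init call at all, which is what makes static
initialization attractive when it is practical; dynamic_lock is only
safe to use after toy_init() has run.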
 Documentation/spinlocks.txt |   20 +++-
 1 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/Documentation/spinlocks.txt b/Documentation/spinlocks.txt
index a661d68..471e753 100644
--- a/Documentation/spinlocks.txt
+++ b/Documentation/spinlocks.txt
@@ -1,7 +1,12 @@
-UPDATE March 21 2005 Amit Gud <[EMAIL PROTECTED]>
+SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED defeat lockdep state tracking and
+are hence deprecated.
 
-Macros SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated and will be
-removed soon. So for any new code dynamic initialization should be used:
+Please use DEFINE_SPINLOCK()/DEFINE_RWLOCK() or
+__SPIN_LOCK_UNLOCKED()/__RW_LOCK_UNLOCKED() as appropriate for static
+initialization.
+
+Dynamic initialization, when necessary, may be performed as
+demonstrated below.
 
spinlock_t xxx_lock;
rwlock_t xxx_rw_lock;
@@ -15,12 +20,9 @@ removed soon. So for any new code dynamic initialization should be used:
 
module_init(xxx_init);
 
-Reasons for deprecation
-  - it hurts automatic lock validators
-  - it becomes intrusive for the realtime preemption patches
-
-Following discussion is still valid, however, with the dynamic initialization
-of spinlocks instead of static.
+The following discussion is still valid, however, with the dynamic
+initialization of spinlocks or with DEFINE_SPINLOCK, etc., used
+instead of SPIN_LOCK_UNLOCKED.
 
 ---
 
-- 
1.5.2.1


Re: [PATCH 12/12] the aoeminor doesn't need a long format

2007-06-26 Thread Ed L. Cashin
On Tue, Jun 26, 2007 at 12:51:07PM -0700, Randy Dunlap wrote:
> On Tue, 26 Jun 2007 14:50:12 -0400 Ed L. Cashin wrote:
> 
> > The aoedev aoeminor member doesn't need a long format.
> 
> Was there a patch that changed aoeminor to an int?
> Last I see is:
>   ulong aoeminor;
> in linux-2.6.22-rc6/drivers/block/aoe/aoe.h.
> 
> If it's still ulong, you shouldn't change the printk format.

Yes, thanks for checking.  Patch 2 of 12 did change it to a u16.
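
For illustration, here is a small userspace sketch of why the shorter
format is right once the member is 16 bits wide.  The struct and
function names are hypothetical; uint16_t stands in for the kernel's
u16, and snprintf stands in for printk.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cut-down device record, assuming a 16-bit minor. */
struct toy_dev {
	unsigned long aoemajor;
	uint16_t aoeminor;
};

/* Format the device address as "e<major>.<minor>". */
static int format_addr(char *out, size_t len, const struct toy_dev *d)
{
	/*
	 * A uint16_t argument is promoted to int when passed through
	 * "...", so "%d" matches it.  "%ld" would expect a long and
	 * would be a format mismatch.
	 */
	return snprintf(out, len, "e%ld.%d", (long)d->aoemajor, d->aoeminor);
}
```

A device with major 7 and minor 3 formats as "e7.3".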

-- 
  Ed L Cashin <[EMAIL PROTECTED]>


[PATCH 02/12] handle multiple network paths to AoE device

2007-06-26 Thread Ed L. Cashin
Handle multiple network paths to AoE device.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
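
(Illustration only, not part of the patch.)  The patch gives each
target an array of interfaces plus a pointer to the one currently in
use.  The portion of the patch shown here does not include the logic
that chooses an interface, so the rotation policy below is purely an
assumption, and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { NAOEIFS = 8 };	/* same slot count as in the patch */

struct toy_if {
	const char *name;	/* NULL marks an unused slot */
};

struct toy_tgt {
	struct toy_if ifs[NAOEIFS];
	struct toy_if *ifp;	/* interface currently in use, or NULL */
};

/* Advance t->ifp to the next populated slot, wrapping around.
 * Returns the chosen interface, or NULL if no slot is populated. */
static struct toy_if *next_if(struct toy_tgt *t)
{
	size_t start = t->ifp ? (size_t)(t->ifp - t->ifs) : NAOEIFS - 1;
	size_t i;

	for (i = 1; i <= NAOEIFS; i++) {
		struct toy_if *c = &t->ifs[(start + i) % NAOEIFS];

		if (c->name != NULL) {
			t->ifp = c;
			return c;
		}
	}
	return NULL;
}
```

With two populated slots, successive calls alternate between them, so
a target reachable over two network paths would use both.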
 drivers/block/aoe/aoe.h|   58 +++--
 drivers/block/aoe/aoeblk.c |   63 -
 drivers/block/aoe/aoechr.c |   14 +-
 drivers/block/aoe/aoecmd.c |  660 +--
 drivers/block/aoe/aoedev.c |  163 +--
 drivers/block/aoe/aoenet.c |4 +-
 6 files changed, 630 insertions(+), 332 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 2ce5ce9..069f04c 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -85,10 +85,8 @@ enum {
DEVFL_EXT = (1<<2), /* device accepts lba48 commands */
DEVFL_CLOSEWAIT = (1<<3), /* device is waiting for all closes to revalidate */
DEVFL_GDALLOC = (1<<4), /* need to alloc gendisk */
-   DEVFL_PAUSE = (1<<5),
+   DEVFL_KICKME = (1<<5),  /* slow polling network card catch */
DEVFL_NEWSIZE = (1<<6), /* need to update dev size in block layer */
-   DEVFL_MAXBCNT = (1<<7), /* d->maxbcnt is not changeable */
-   DEVFL_KICKME = (1<<8),
 
BUFFL_FAIL = 1,
 };
@@ -97,17 +95,24 @@ enum {
DEFAULTBCNT = 2 * 512,  /* 2 sectors */
NPERSHELF = 16, /* number of slots per shelf address */
FREETAG = -1,
-   MIN_BUFS = 8,
+   MIN_BUFS = 16,
+   NTARGETS = 8,
+   NAOEIFS = 8,
+
+   TIMERTICK = HZ / 10,
+   MINTIMER = HZ >> 2,
+   MAXTIMER = HZ << 1,
+   HELPWAIT = 20,
 };
 
 struct buf {
struct list_head bufs;
-   ulong start_time;   /* for disk stats */
+   ulong stime;/* for disk stats */
ulong flags;
ulong nframesout;
-   char *bufaddr;
ulong resid;
ulong bv_resid;
+   ulong bv_off;
sector_t sector;
struct bio *bio;
struct bio_vec *bv;
@@ -123,19 +128,37 @@ struct frame {
struct sk_buff *skb;
 };
 
+struct aoeif {
+   struct net_device *nd;
+   unsigned char lost;
+   unsigned char lostjumbo;
+   ushort maxbcnt;
+};
+
+struct aoetgt {
+   unsigned char addr[6];
+   ushort nframes;
+   struct frame *frames;
+   struct aoeif ifs[NAOEIFS];
+   struct aoeif *ifp;  /* current aoeif in use */
+   ushort nout;
+   ushort maxout;
+   u16 lasttag;/* last tag sent */
+   u16 useme;
+   ulong lastwadj; /* last window adjustment */
+int wpkts, rpkts;
+};
+
 struct aoedev {
struct aoedev *next;
-   unsigned char addr[6];  /* remote mac addr */
-   ushort flags;
ulong sysminor;
ulong aoemajor;
-   ulong aoeminor;
+   u16 aoeminor;
+   u16 flags;
u16 nopen;  /* (bd_openers isn't available without sleeping) */
-   u16 lasttag;/* last tag sent */
u16 rttavg; /* round trip average of requests/responses */
u16 mintimer;
u16 fw_ver; /* version of blade's firmware */
-   u16 maxbcnt;
struct work_struct work;/* disk create work struct */
struct gendisk *gd;
request_queue_t blkq;
@@ -143,15 +166,15 @@ struct aoedev {
sector_t ssize;
struct timer_list timer;
spinlock_t lock;
-   struct net_device *ifp; /* interface ed is attached to */
struct sk_buff *sendq_hd; /* packets needing to be sent, list head */
struct sk_buff *sendq_tl;
mempool_t *bufpool; /* for deadlock-free Buf allocation */
struct list_head bufq;  /* queue of bios to work on */
struct buf *inprocess;  /* the one we're currently working on */
-   ushort lostjumbo;
-   ushort nframes; /* number of frames below */
-   struct frame *frames;
+   struct aoetgt *targets[NTARGETS];
+   struct aoetgt **tgt;/* target in use when working */
+   struct aoetgt **htgt;   /* target needing rexmit assistance */
+//int ios[64];
 };
 
 
@@ -169,12 +192,13 @@ void aoecmd_cfg(ushort aoemajor, unsigned char aoeminor);
 void aoecmd_ata_rsp(struct sk_buff *);
 void aoecmd_cfg_rsp(struct sk_buff *);
 void aoecmd_sleepwork(struct work_struct *);
-struct sk_buff *new_skb(ulong);
+void aoecmd_cleanslate(struct aoedev *);
+struct sk_buff *aoecmd_ata_id(struct aoedev *);
 
 int aoedev_init(void);
 void aoedev_exit(void);
 struct aoedev *aoedev_by_aoeaddr(int maj, int min);
-struct aoedev *aoedev_by_sysminor_m(ulong sysminor, ulong bufcnt);
+struct aoedev *aoedev_by_sysminor_m(ulong sysminor);
 void aoedev_downdev(struct aoedev *d);
 int aoedev_isbusy(struct aoedev *d);
 
diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index 478489c..f6773ab 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -21,22 +21,55 @@ static ssize_t aoedisk_show_state(struct gendisk * disk, char *page)
return snprintf(page, PAGE_SIZE,

[PATCH 01/12] bring driver version number to 47

2007-06-26 Thread Ed L. Cashin
These patches were made against kernel 2.6.22-rc4.

Bring driver version number to 47.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoe.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 1d84668..2ce5ce9 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -1,5 +1,5 @@
 /* Copyright (c) 2006 Coraid, Inc.  See COPYING for GPL terms. */
-#define VERSION "32"
+#define VERSION "47"
 #define AOE_MAJOR 152
 #define DEVICE_NAME "aoe"
 
-- 
1.5.2.1



[PATCH 12/12] the aoeminor doesn't need a long format

2007-06-26 Thread Ed L. Cashin
The aoedev aoeminor member doesn't need a long format.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoeblk.c |6 +++---
 drivers/block/aoe/aoecmd.c |4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index ccadd9a..e216fe0 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -203,7 +203,7 @@ aoeblk_make_request(request_queue_t *q, struct bio *bio)
spin_lock_irqsave(&d->lock, flags);
 
if ((d->flags & DEVFL_UP) == 0) {
-   printk(KERN_INFO "aoe: device %ld.%ld is not up\n",
+   printk(KERN_INFO "aoe: device %ld.%d is not up\n",
d->aoemajor, d->aoeminor);
spin_unlock_irqrestore(&d->lock, flags);
mempool_free(buf, d->bufpool);
@@ -256,7 +256,7 @@ aoeblk_gdalloc(void *vp)
 
gd = alloc_disk(AOE_PARTITIONS);
if (gd == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n",
+   printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%d\n",
d->aoemajor, d->aoeminor);
spin_lock_irqsave(&d->lock, flags);
d->flags &= ~DEVFL_GDALLOC;
@@ -266,7 +266,7 @@ aoeblk_gdalloc(void *vp)
 
d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache);
if (d->bufpool == NULL) {
-   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n",
+   printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n",
d->aoemajor, d->aoeminor);
put_disk(gd);
spin_lock_irqsave(&d->lock, flags);
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 9de0024..dfb1184 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -680,7 +680,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id)
}
 
if (d->ssize != ssize)
-   printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n",
+   printk(KERN_INFO "aoe: %012llx e%ld.%d v%04x has %llu sectors\n",
mac_addr(t->addr),
d->aoemajor, d->aoeminor,
d->fw_ver, (long long)ssize);
@@ -805,7 +805,7 @@ aoecmd_ata_rsp(struct sk_buff *skb)
 
if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */
printk(KERN_ERR
-   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n",
+   "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n",
ahout->cmdstat, ahin->cmdstat,
d->aoemajor, d->aoeminor);
if (buf)
-- 
1.5.2.1



[PATCH 09/12] remove race between use and initialization of locks

2007-06-26 Thread Ed L. Cashin
This change was originally submitted by Alexey Dobriyan in an email
with ...

  Message-ID: <[EMAIL PROTECTED]>

and the comment,

  Some drivers do register_chrdev() before lock or semaphore used in
  corresponding file_operations is initialized.

Andrew Morton commented that these locks should be initialized at
compile time, but Alexey Dobriyan pointed out that the Documentation
tells us to use dynamic initialization whenever possible, and then the
discussion petered out.

  http://preview.tinyurl.com/2pxq6p

I believe we made these locks dynamic because of the notice in
Documentation/spinlocks.txt, which says that static initializers are
deprecated:

  UPDATE March 21 2005 Amit Gud <[EMAIL PROTECTED]>

  Macros SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated and will be
  removed soon. So for any new code dynamic initialization should be used:
...

In any case, the patch below makes the code correct and in keeping
with the existing documentation.  If the existing docs are wrong, I'd
be happy to follow up with a patch that corrects them and makes these
aoechr.c locks static.

Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]>
Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
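
(Illustration only, not part of the patch.)  The ordering problem can
be shown with a small userspace sketch.  All names are hypothetical:
toy_register() stands in for register_chrdev() and simulates a caller
opening the device the instant registration completes, which is the
window the patch closes.

```c
#include <assert.h>

/* Toy lock that remembers whether it was ever initialized. */
struct toy_lock {
	int initialized;
	int held;
};

static void toy_lock_init(struct toy_lock *l)
{
	l->initialized = 1;
	l->held = 0;
}

/* Returns 0 on success, -1 if the lock was used before initialization. */
static int toy_lock_acquire(struct toy_lock *l)
{
	if (!l->initialized)
		return -1;
	l->held = 1;
	return 0;
}

typedef int (*open_fn)(void);

static struct toy_lock msg_lock;
static int last_open_result;

/* Stand-in for a file_operations handler that takes the lock. */
static int toy_open(void)
{
	int r = toy_lock_acquire(&msg_lock);

	if (r == 0)
		msg_lock.held = 0;
	return r;
}

/* Stand-in for register_chrdev(): once registered, the handler can be
 * called at any time, so call it immediately to model the worst case. */
static int toy_register(open_fn fn)
{
	last_open_result = fn();
	return 0;
}

/* Old order: register first, initialize afterwards. */
static int init_racy(void)
{
	msg_lock.initialized = 0;
	toy_register(toy_open);
	toy_lock_init(&msg_lock);
	return last_open_result;
}

/* New order: initialize first, then register. */
static int init_fixed(void)
{
	msg_lock.initialized = 0;
	toy_lock_init(&msg_lock);
	toy_register(toy_open);
	return last_open_result;
}
```

In this model init_racy() reports that the handler saw an
uninitialized lock, while init_fixed() does not.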
 drivers/block/aoe/aoechr.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c
index 10b38a7..2b4f873 100644
--- a/drivers/block/aoe/aoechr.c
+++ b/drivers/block/aoe/aoechr.c
@@ -256,13 +256,13 @@ aoechr_init(void)
 {
int n, i;
 
+   sema_init(&emsgs_sema, 0);
+   spin_lock_init(&emsgs_lock);
n = register_chrdev(AOE_MAJOR, "aoechr", &aoe_fops);
if (n < 0) { 
printk(KERN_ERR "aoe: can't register char device\n");
return n;
}
-   sema_init(&emsgs_sema, 0);
-   spin_lock_init(&emsgs_lock);
aoe_class = class_create(THIS_MODULE, "aoe");
if (IS_ERR(aoe_class)) {
unregister_chrdev(AOE_MAJOR, "aoechr");
-- 
1.5.2.1



[PATCH 11/12] remove extra space in prototypes for consistency

2007-06-26 Thread Ed L. Cashin
Remove extra space in prototypes for consistency.

Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]>
---
 drivers/block/aoe/aoeblk.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c
index c9cf576..ccadd9a 100644
--- a/drivers/block/aoe/aoeblk.c
+++ b/drivers/block/aoe/aoeblk.c
@@ -14,7 +14,7 @@
 
 static struct kmem_cache *buf_pool_cache;
 
-static ssize_t aoedisk_show_state(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_state(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
 
@@ -25,7 +25,7 @@ static ssize_t aoedisk_show_state(struct gendisk * disk, char *page)
(d->nopen && !(d->flags & DEVFL_UP)) ? ",closewait" : "");
/* I'd rather see nopen exported so we can ditch closewait */
 }
-static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_mac(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
struct aoetgt *t = d->targets[0];
@@ -34,7 +34,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page)
return snprintf(page, PAGE_SIZE, "none\n");
return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr));
 }
-static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_netif(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
struct net_device *nds[8], **nd, **nnd, **ne;
@@ -71,7 +71,7 @@ static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page)
return p-page;
 }
 /* firmware version */
-static ssize_t aoedisk_show_fwver(struct gendisk * disk, char *page)
+static ssize_t aoedisk_show_fwver(struct gendisk *disk, char *page)
 {
struct aoedev *d = disk->private_data;
 
-- 
1.5.2.1


