Re: [PATCH 04/22] block: Abstract out bvec iterator
On Tue, Aug 13, 2013 at 11:51:58AM -0700, Kent Overstreet wrote: > On Tue, Aug 13, 2013 at 10:03:04AM -0400, Ed Cashin wrote: > > On Aug 9, 2013, Ed Cashin wrote: > > > On Aug 8, 2013, at 9:05 PM, Kent Overstreet wrote: > > > ... > > > > It's in the for-jens branch now. > > > > > > > > > Just examining the patches, I like the way it cleans up the aoe code. I > > > had a question about a new BUG added by the for-jens branch the > > > read-response handling path of the aoe driver. > > > > The aoe driver in linux-bcache/for-jens commit 4c36c973a8f45 is > > passing my tests. > > > > Here is a patch against that branch illustrating my suggestion for > > handling bad target responses gracefully. > > Thanks - shall I just fold that into the aoe immutable bvec patch? Yes, that would be good, thanks. Unfortunately, the way I usually send patches to vger didn't work this time. It looks like the MTA didn't retry after the greylisting used SMTP temporary failures. So I'm trying a different way to send and including the same patch for the benefit of the Cc list. commit 2c39f50b1ee02e2ac07fd072a883a91713da53cc Author: Ed Cashin Date: Tue Aug 13 10:50:28 2013 -0400 aoe: bad AoE responses fail I/O without BUG Instead of having a BUG when the AoE target does something wrong, just fail the I/O and log the problem with rate limiting. diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index cacd48e..b9916a6 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -1096,7 +1096,6 @@ bvcpy(struct sk_buff *skb, struct bio *bio, struct bvec_iter iter, long cnt) int soff = 0; struct bio_vec bv; - BUG_ON(cnt > iter.bi_size); iter.bi_size = cnt; __bio_for_each_segment(bv, bio, iter, iter) { @@ -1196,6 +1195,14 @@ noskb: if (buf) clear_bit(BIO_UPTODATE, &buf->bio->bi_flags); break; } + if (n > f->iter.bi_size) { + pr_err_ratelimited("%s e%ld.%d. bytes=%ld need=%u\n", + "aoe: too-large data size in read from", + (long) d->aoemajor, d->aoeminor, + n, f->iter.bi_size); + clear_bit(BIO_UPTODATE, &buf->bio->bi_flags); + break; + } bvcpy(skb, f->buf->bio, f->iter, n); case ATA_CMD_PIO_WRITE: case ATA_CMD_PIO_WRITE_EXT: -- Ed -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] #if 0 aoedev_isbusy()
On Thu, Feb 14, 2008 at 12:05:37AM +0200, Adrian Bunk wrote: > On Wed, Feb 13, 2008 at 11:03:35PM +0100, Jan Engelhardt wrote: > > > > On Feb 13 2008 23:30, Adrian Bunk wrote: > > > > > >This patch #if 0's the no longer used aoedev_isbusy(). > > > > Why not just remove it? (It can be resurrected from earlier > > revisions should it be needed again.) > > I've switched to doing #if 0 instead since this addresses the > "It might be needed in the future." answer I otherwise sometimes > got. > > But if a maintainer wants to have such a function deleted instead that's > fine with me. Hello. That function can go away. I have some changes pending, but none of them use aoedev_isbusy. Thanks for Cc-ing me. -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Documentation: Add 00-INDEX file for AoE
On Sun, Jan 13, 2008 at 02:44:48AM +0100, Jesper Juhl wrote: > Documentation/aoe/ is missing a 00-INDEX file. Add one. Thanks. I think that it would help to clarify that using udev is the norm, and that the mkdev and mkshelf scripts are just examples to show how you could do it manually. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff --git a/Documentation/aoe/00-INDEX b/Documentation/aoe/00-INDEX index 8188b22..d087df6 100644 --- a/Documentation/aoe/00-INDEX +++ b/Documentation/aoe/00-INDEX @@ -5,9 +5,10 @@ aoe.txt autoload.sh - script for making the AoE driver autoload via /etc/modprobe.conf. mkdevs.sh - - script for creating required AoE device nodes. + - script for creating required AoE device nodes without udev. mkshelf.sh - - script for making one shelf's worth of block device nodes. + - script for making one shelf's worth of block device nodes + without udev. status.sh - script to collate and present sysfs information about AoE storage. todo.txt diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh index 97374aa..5870a6c 100644 --- a/Documentation/aoe/mkdevs.sh +++ b/Documentation/aoe/mkdevs.sh @@ -1,4 +1,7 @@ #!/bin/sh +# This example script shows how device nodes for interacting with the +# aoe driver could be created in the absence of udev. Using udev is +# preferable to creating them manually. n_shelves=${n_shelves:-10} n_partitions=${n_partitions:-16} diff --git a/Documentation/aoe/mkshelf.sh b/Documentation/aoe/mkshelf.sh index 3261581..7f412f0 100644 --- a/Documentation/aoe/mkshelf.sh +++ b/Documentation/aoe/mkshelf.sh @@ -1,4 +1,7 @@ #! /bin/sh +# This example script shows how device nodes for interacting with the +# aoe driver could be created in the absence of udev. Using udev is +# preferable to creating them manually. if test "$#" != "2"; then echo "Usage: sh `basename $0` {dir} {shelfaddress}" 1>&2 -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] aoe: document the behavior of /dev/etherd/err
Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c |4 1 files changed, 4 insertions(+), 0 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 2620073..871f284 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -33,6 +33,10 @@ struct ErrMsg { char *msg; }; +/* A ring buffer of error messages, to be read through + * "/dev/etherd/err". When no messages are present, + * readers will block waiting for messages to appear. + */ static struct ErrMsg emsgs[NMSG]; static int emsgs_head_idx, emsgs_tail_idx; static struct semaphore emsgs_sema; -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] aoe: initialize locking structures before registering char devices
This patch was made against 2.6.24-rc6-mm1. In March 2007, Alexey Dobriyan suggested this change, which eliminates a race after register_chardev has been called but the locking primitives protecting the error messages ring buffer have not yet been initialized. The initialization could happen at compile time, but that would leave aoe as the only user of __DECLARE_SEMAPHORE_GENERIC. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index e8e60e7..2620073 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -259,13 +259,13 @@ aoechr_init(void) { int n, i; + sema_init(&emsgs_sema, 0); + spin_lock_init(&emsgs_lock); n = register_chrdev(AOE_MAJOR, "aoechr", &aoe_fops); if (n < 0) { printk(KERN_ERR "aoe: can't register char device\n"); return n; } - sema_init(&emsgs_sema, 0); - spin_lock_init(&emsgs_lock); aoe_class = class_create(THIS_MODULE, "aoe"); if (IS_ERR(aoe_class)) { unregister_chrdev(AOE_MAJOR, "aoechr"); -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 09/13] remove race between use and initialization of locks
On Fri, Dec 21, 2007 at 10:00:40PM -0800, Andrew Morton wrote: > On Thu, 20 Dec 2007 17:15:57 -0500 "Ed L. Cashin" <[EMAIL PROTECTED]> wrote: ... > > +static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0); ... > > - sema_init(&emsgs_sema, 0); > > - spin_lock_init(&emsgs_lock); > > aoe_class = class_create(THIS_MODULE, "aoe"); > > if (IS_ERR(aoe_class)) { > > unregister_chrdev(AOE_MAJOR, "aoechr"); > > I think it would be better to go back to initialising emsgs_lock at runtime > rather than fattening the exported semaphore API like this. I don't think there is anything wrong with having a complete set of initialization routines for a semaphore, but it's certainly easy enough to go back to Alexey Dobriyan's original suggestion, which was to simply move the initialization calls before register_chardev. I will follow this email with a patch that does that. > emssgs_sema is a weird-looking thing. There really should be some comments > in there because it is unobvious what the code is attempting to do. > > What is the code attempting to do? There is a ring buffer of error messages. Userland processes can read these error messages by reading /dev/etherd/err, blocking if there are no error messages to read yet. The emsgs_sema semaphore is used to manage the reader(s) waiting for error messages. When there are sleepers waiting, "up" is used to wake one up when a new error message is produced. A reader gets a single message, not just some text with a mixture of different errors. If I do, cat /dev/etherd/err > /my/log/file ... then I can hit control-c or send a SIGTERM to stop it. > It appears to me that nblocked_emsgs_readers gets incorrectly > decremented if the down_interruptible() got interrupted, btw. The counter will be incremented again if the process goes back to sleep waiting for an error message, but the process might be getting killed. The counter is really just for sleeping readers. -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 13/13] update copyright date
Update the year in the copyright notices. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h |2 +- drivers/block/aoe/aoeblk.c |2 +- drivers/block/aoe/aoechr.c |2 +- drivers/block/aoe/aoecmd.c |2 +- drivers/block/aoe/aoedev.c |2 +- drivers/block/aoe/aoemain.c |2 +- drivers/block/aoe/aoenet.c |2 +- 7 files changed, 7 insertions(+), 7 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 67ef4d7..280e71e 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -1,4 +1,4 @@ -/* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ +/* Copyright (c) 2007 Coraid, Inc. See COPYING for GPL terms. */ #define VERSION "47" #define AOE_MAJOR 152 #define DEVICE_NAME "aoe" diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index 25c6760..0c39782 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ +/* Copyright (c) 2007 Coraid, Inc. See COPYING for GPL terms. */ /* * aoeblk.c * block device routines diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 670bba6..ef49e4b 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ +/* Copyright (c) 2007 Coraid, Inc. See COPYING for GPL terms. */ /* * aoechr.c * AoE character device driver diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 1e37cf6..44beb17 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ +/* Copyright (c) 2007 Coraid, Inc. See COPYING for GPL terms. */ /* * aoecmd.c * Filesystem request handling methods diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c index 839a964..d146c4e 100644 --- a/drivers/block/aoe/aoedev.c +++ b/drivers/block/aoe/aoedev.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ +/* Copyright (c) 2007 Coraid, Inc. See COPYING for GPL terms. */ /* * aoedev.c * AoE device utility functions; maintains device list. diff --git a/drivers/block/aoe/aoemain.c b/drivers/block/aoe/aoemain.c index a04b7d6..7b15a5e 100644 --- a/drivers/block/aoe/aoemain.c +++ b/drivers/block/aoe/aoemain.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ +/* Copyright (c) 2007 Coraid, Inc. See COPYING for GPL terms. */ /* * aoemain.c * Module initialization routines, discover timer diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c index ada4a06..8460ef7 100644 --- a/drivers/block/aoe/aoenet.c +++ b/drivers/block/aoe/aoenet.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ +/* Copyright (c) 2007 Coraid, Inc. See COPYING for GPL terms. */ /* * aoenet.c * Ethernet portion of AoE driver -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 12/13] make error messages more specific
Andrew Morton pointed out that the "too many targets" message in patch 2 could be printed for failing GFP_ATOMIC allocations. This patch makes the messages more specific. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c | 15 +++ 1 files changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index bcea36c..1e37cf6 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -957,15 +957,17 @@ addtgt(struct aoedev *d, char *addr, ulong nframes) for (; tt < te && *tt; tt++) ; - if (tt == te) + if (tt == te) { + printk(KERN_INFO + "aoe: device addtgt failure; too many targets\n"); return NULL; - + } t = kcalloc(1, sizeof *t, GFP_ATOMIC); - if (!t) - return NULL; f = kcalloc(nframes, sizeof *f, GFP_ATOMIC); - if (!f) { + if (!t || !f) { + kfree(f); kfree(t); + printk(KERN_INFO "aoe: cannot allocate memory to add target\n"); return NULL; } @@ -1029,9 +1031,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb) if (!t) { t = addtgt(d, h->src, n); if (!t) { - printk(KERN_INFO - "aoe: device addtgt failure; " - "too many targets?\n"); spin_unlock_irqrestore(&d->lock, flags); return; } -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 10/13] add module parameter for users who need more outstanding I/O
An AoE target provides an estimate of the number of outstanding commands that the AoE initiator can send before getting a response. The aoe_maxout parameter provides a way to set an even lower limit. It will not allow a user to use more outstanding commands than the target permits. If a user discovers a problem with a large setting, this parameter provides a way for us to work with them to debug the problem. We expect to improve the dynamic window sizing algorithm and drop this parameter. For the time being, it is a debugging aid. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 7a96183..e92d885 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -18,6 +18,11 @@ static int aoe_deadsecs = 60 * 3; module_param(aoe_deadsecs, int, 0644); MODULE_PARM_DESC(aoe_deadsecs, "After aoe_deadsecs seconds, give up and fail dev."); +static int aoe_maxout = 16; +module_param(aoe_maxout, int, 0644); +MODULE_PARM_DESC(aoe_maxout, + "Only aoe_maxout outstanding packets for every MAC on eX.Y."); + static struct sk_buff * new_skb(ulong len) { @@ -984,7 +989,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb) struct aoeif *ifp; ulong flags, sysminor, aoemajor; struct sk_buff *sl; - enum { MAXFRAMES = 16 }; u16 n; h = (struct aoe_hdr *) skb_mac_header(skb); @@ -1009,8 +1013,8 @@ aoecmd_cfg_rsp(struct sk_buff *skb) } n = be16_to_cpu(ch->bufcnt); - if (n > MAXFRAMES) /* keep it reasonable */ - n = MAXFRAMES; + if (n > aoe_maxout) /* keep it reasonable */ + n = aoe_maxout; d = aoedev_by_sysminor_m(sysminor); if (d == NULL) { -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 11/13] the aoeminor doesn't need a long format
The aoedev aoeminor member doesn't need a long format. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoeblk.c |7 --- drivers/block/aoe/aoecmd.c |5 +++-- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index deea536..25c6760 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -202,7 +202,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio) spin_lock_irqsave(&d->lock, flags); if ((d->flags & DEVFL_UP) == 0) { - printk(KERN_INFO "aoe: device %ld.%ld is not up\n", + printk(KERN_INFO "aoe: device %ld.%d is not up\n", d->aoemajor, d->aoeminor); spin_unlock_irqrestore(&d->lock, flags); mempool_free(buf, d->bufpool); @@ -255,14 +255,15 @@ aoeblk_gdalloc(void *vp) gd = alloc_disk(AOE_PARTITIONS); if (gd == NULL) { - printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n", + printk(KERN_ERR + "aoe: cannot allocate disk structure for %ld.%d\n", d->aoemajor, d->aoeminor); goto err; } d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache); if (d->bufpool == NULL) { - printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n", + printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n", d->aoemajor, d->aoeminor); goto err_disk; } diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index e92d885..bcea36c 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -697,7 +697,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) } if (d->ssize != ssize) - printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n", + printk(KERN_INFO + "aoe: %012llx e%ld.%d v%04x has %llu sectors\n", mac_addr(t->addr), d->aoemajor, d->aoeminor, d->fw_ver, (long long)ssize); @@ -822,7 +823,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */ printk(KERN_ERR - "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n", + "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n", ahout->cmdstat, ahin->cmdstat, d->aoemajor, d->aoeminor); if (buf) -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 08/13] only install new AoE device once
An aoe driver user who had about 70 AoE targets found that he was hitting a BUG in sysfs_create_file because the aoe driver was trying to tell the kernel about an AoE device more than once. Each AoE device was reachable by several local network interfaces, and multiple ATA device indentify responses were returning from that single device. This patch eliminates a race condition so that aoe always informs the block layer of a new AoE device once in the presence of multiple incoming ATA device identify responses. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index b49e06e..7a96183 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -698,6 +698,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) d->fw_ver, (long long)ssize); d->ssize = ssize; d->geo.start = 0; + if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE)) + return; if (d->gd != NULL) { d->gd->capacity = ssize; d->flags |= DEVFL_NEWSIZE; -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 09/13] remove race between use and initialization of locks
Alexey Dobriyan noticed a race in the initialization of the dynamic locks in ... Message-ID: <[EMAIL PROTECTED]> Andrew Morton commented that these locks should be initialized at compile time, so this patch does that. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 1bc85aa..670bba6 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -35,8 +35,8 @@ struct ErrMsg { static struct ErrMsg emsgs[NMSG]; static int emsgs_head_idx, emsgs_tail_idx; -static struct semaphore emsgs_sema; -static spinlock_t emsgs_lock; +static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0); +static DEFINE_SPINLOCK(emsgs_lock); static int nblocked_emsgs_readers; static struct class *aoe_class; static struct aoe_chardev chardevs[] = { @@ -264,8 +264,6 @@ aoechr_init(void) printk(KERN_ERR "aoe: can't register char device\n"); return n; } - sema_init(&emsgs_sema, 0); - spin_lock_init(&emsgs_lock); aoe_class = class_create(THIS_MODULE, "aoe"); if (IS_ERR(aoe_class)) { unregister_chrdev(AOE_MAJOR, "aoechr"); -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 06/13] user can ask driver to forget previously detected devices
When an AoE device is detected, the kernel is informed, and a new block device is created. If the device is unused, the block device corresponding to remote device that is no longer available may be removed from the system by telling the aoe driver to "flush" its list of devices. Without this patch, software like GPFS and LVM may attempt to read from AoE devices that were discovered earlier but are no longer present, blocking until the I/O attempt times out. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- Documentation/aoe/mkdevs.sh |2 + Documentation/aoe/udev.txt |1 + drivers/block/aoe/aoe.h |1 + drivers/block/aoe/aoechr.c |5 ++ drivers/block/aoe/aoedev.c | 87 +- 5 files changed, 77 insertions(+), 19 deletions(-) diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh index 97374aa..44c0ab7 100644 --- a/Documentation/aoe/mkdevs.sh +++ b/Documentation/aoe/mkdevs.sh @@ -29,6 +29,8 @@ rm -f $dir/interfaces mknod -m 0200 $dir/interfaces c $MAJOR 4 rm -f $dir/revalidate mknod -m 0200 $dir/revalidate c $MAJOR 5 +rm -f $dir/flush +mknod -m 0200 $dir/flush c $MAJOR 6 export n_partitions mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'` diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt index 17e76c4..8686e78 100644 --- a/Documentation/aoe/udev.txt +++ b/Documentation/aoe/udev.txt @@ -20,6 +20,7 @@ SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220 SUBSYSTEM=="aoe", KERNEL=="err", NAME="etherd/%k", GROUP="disk", MODE="0440" SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220" SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="flush", NAME="etherd/%k", GROUP="disk", MODE="0220" # aoe block devices KERNEL=="etherd*", NAME="%k", GROUP="disk" diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index aecaac3..2248ab2 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -191,6 +191,7 @@ struct aoedev *aoedev_by_aoeaddr(int maj, int min); struct aoedev *aoedev_by_sysminor_m(ulong sysminor); void aoedev_downdev(struct aoedev *d); int aoedev_isbusy(struct aoedev *d); +int aoedev_flush(const char __user *str, size_t size); int aoenet_init(void); void aoenet_exit(void); diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index f112466..1bc85aa 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -15,6 +15,7 @@ enum { MINOR_DISCOVER, MINOR_INTERFACES, MINOR_REVALIDATE, + MINOR_FLUSH, MSGSZ = 2048, NMSG = 100, /* message backlog to retain */ }; @@ -43,6 +44,7 @@ static struct aoe_chardev chardevs[] = { { MINOR_DISCOVER, "discover" }, { MINOR_INTERFACES, "interfaces" }, { MINOR_REVALIDATE, "revalidate" }, + { MINOR_FLUSH, "flush" }, }; static int @@ -158,6 +160,9 @@ aoechr_write(struct file *filp, const char __user *buf, size_t cnt, loff_t *offp break; case MINOR_REVALIDATE: ret = revalidate(buf, cnt); + break; + case MINOR_FLUSH: + ret = aoedev_flush(buf, cnt); } if (ret == 0) ret = cnt; diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c index a4d625a..e26f6f4 100644 --- a/drivers/block/aoe/aoedev.c +++ b/drivers/block/aoe/aoedev.c @@ -9,6 +9,10 @@ #include #include "aoe.h" +static void dummy_timer(ulong); +static void aoedev_freedev(struct aoedev *); +static void freetgt(struct aoetgt *t); + static struct aoedev *devlist; static spinlock_t devlist_lock; @@ -108,6 +112,70 @@ aoedev_downdev(struct aoedev *d) d->flags &= ~DEVFL_UP; } +static void +aoedev_freedev(struct aoedev *d) +{ + struct aoetgt **t, **e; + + if (d->gd) { + aoedisk_rm_sysfs(d); + del_gendisk(d->gd); + put_disk(d->gd); + } + t = d->targets; + e = t + NTARGETS; + for (; t < e && *t; t++) + freetgt(*t); + if (d->bufpool) + mempool_destroy(d->bufpool); + kfree(d); +} + +int +aoedev_flush(const char __user *str, size_t cnt) +{ + ulong flags; + struct aoedev *d, **dd; + struct aoedev *rmd = NULL; + char buf[16]; + int all = 0; + + if (cnt >= 3) { + if (cnt > sizeof buf) + cnt = sizeof buf; +
[PATCH 07/13] dynamically allocate a capped number of skbs when necessary
What this Patch Does Even before this recent series of 12 patches to 2.6.22-rc4, the aoe driver was reusing a small set of skbs that were allocated once and were only used for outbound AoE commands. The network layer cannot be allowed to put_page on the data that is still associated with a bio we haven't returned to the block layer, so the aoe driver (even before the patch under discussion) is still the owner of skbs that have been handed to the network layer for transmission. We need to keep track of these skbs so that we can free them, but by tracking them, we can also easily re-use them. The new patch was a response to the behavior of certain network drivers. We cannot reuse an skb that the network driver still has in its transmit ring. Network drivers can defer transmit ring cleanup and then use the state in the skb to determine how many data segments to clean up in its transmit ring. The tg3 driver is one driver that behaves in this way. When the network driver defers cleanup of its transmit ring, the aoe driver can find itself in a situation where it would like to send an AoE command, and the AoE target is ready for more work, but the network driver still has all of the pre-allocated skbs. In that case, the new patch just calls alloc_skb, as you'd expect. We don't want to get carried away, though. We try not to do excessive allocation in the write path, so we cap the number of skbs we dynamically allocate. Probably calling it a "dynamic pool" is misleading. We were already trying to use a small fixed-size set of pre-allocated skbs before this patch, and this patch just provides a little headroom (with a ceiling, though) to accomodate network drivers that hang onto skbs, by allocating when needed. The d->skbpool_hd list of allocated skbs is necessary so that we can free them later. We didn't notice the need for this headroom until AoE targets got fast enough. Alternatives If the network layer never did a put_page on the pages in the bio's we get from the block layer, then it would be possible for us to hand skbs to the network layer and forget about them, allowing the network layer to free skbs itself (and thereby calling our own skb->destructor callback function if we needed that). In that case we could get rid of the pre-allocated skbs and also the d->skbpool_hd, instead just calling alloc_skb every time we wanted to transmit a packet. The slab allocator would effectively maintain the list of skbs. Besides a loss of CPU cache locality, the main concern with that approach the danger that it would increase the likelihood of deadlock when VM is trying to free pages by writing dirty data from the page cache through the aoe driver out to persistent storage on an AoE device. Right now we have a situation where we have pre-allocation that corresponds to how much we use, which seems ideal. Of course, there's still the separate issue of receiving the packets that tell us that a write has successfully completed on the AoE target. When memory is low and VM is using AoE to flush dirty data to free up pages, it would be perfect if there were a way for us to register a fast callback that could recognize write command completion responses. But I don't think the current problems with the receive side of the situation are a justification for exacerbating the problem on the transmit side. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h|5 ++ drivers/block/aoe/aoecmd.c | 117 +++- drivers/block/aoe/aoedev.c | 52 +--- 3 files changed, 133 insertions(+), 41 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 2248ab2..67ef4d7 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -89,6 +89,7 @@ enum { MIN_BUFS = 16, NTARGETS = 8, NAOEIFS = 8, + NSKBPOOLMAX = 128, TIMERTICK = HZ / 10, MINTIMER = HZ >> 2, @@ -138,6 +139,7 @@ struct aoetgt { u16 useme; ulong lastwadj; /* last window adjustment */ int wpkts, rpkts; + int dataref; }; struct aoedev { @@ -159,6 +161,9 @@ struct aoedev { spinlock_t lock; struct sk_buff *sendq_hd; /* packets needing to be sent, list head */ struct sk_buff *sendq_tl; + struct sk_buff *skbpool_hd; + struct sk_buff *skbpool_tl; + int nskbpool; mempool_t *bufpool; /* for deadlock-free Buf allocation */ struct list_head bufq; /* queue of bios to work on */ struct buf *inprocess; /* the one we're currently working on */ diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 1be5150..b49e06e 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -10
[PATCH 04/13] clean up udev configuration example
This patch adds a known default location for the udev configuration file and uses the more recent "==" syntax for SUBSYSTEM and KERNEL. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- Documentation/aoe/udev-install.sh |5 - Documentation/aoe/udev.txt| 15 --- 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh index 6449911..15e86f5 100644 --- a/Documentation/aoe/udev-install.sh +++ b/Documentation/aoe/udev-install.sh @@ -23,7 +23,10 @@ fi # /etc/udev/rules.d # rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`" -if test -z "$rules_d" || test ! -d "$rules_d"; then +if test -z "$rules_d" ; then + rules_d=/etc/udev/rules.d +fi +if test ! -d "$rules_d"; then echo "$me Error: cannot find udev rules directory" 1>&2 exit 1 fi diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt index a7ed1dc..17e76c4 100644 --- a/Documentation/aoe/udev.txt +++ b/Documentation/aoe/udev.txt @@ -1,6 +1,7 @@ # These rules tell udev what device nodes to create for aoe support. -# They may be installed along the following lines (adjusted to what -# you see on your system). +# They may be installed along the following lines. Check the section +# 8 udev manpage to see whether your udev supports SUBSYSTEM, and +# whether it uses one or two equal signs for SUBSYSTEM and KERNEL. # # [EMAIL PROTECTED] ~$ su # Password: @@ -15,10 +16,10 @@ # # aoe char devices -SUBSYSTEM="aoe", KERNEL="discover",NAME="etherd/%k", GROUP="disk", MODE="0220" -SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440" -SUBSYSTEM="aoe", KERNEL="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220" -SUBSYSTEM="aoe", KERNEL="revalidate", NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="err", NAME="etherd/%k", GROUP="disk", MODE="0440" +SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220" # aoe block devices -KERNEL="etherd*", NAME="%k", GROUP="disk" +KERNEL=="etherd*", NAME="%k", GROUP="disk" -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 05/13] eliminate goto and improve readability
Adam Richter suggested eliminating this goto. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c | 69 +-- 1 files changed, 34 insertions(+), 35 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 03c7f4a..f112466 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -194,52 +194,51 @@ aoechr_read(struct file *filp, char __user *buf, size_t cnt, loff_t *off) ulong flags; n = (unsigned long) filp->private_data; - switch (n) { - case MINOR_ERR: - spin_lock_irqsave(&emsgs_lock, flags); -loop: - em = emsgs + emsgs_head_idx; - if ((em->flags & EMFL_VALID) == 0) { - if (filp->f_flags & O_NDELAY) { - spin_unlock_irqrestore(&emsgs_lock, flags); - return -EAGAIN; - } - nblocked_emsgs_readers++; + if (n != MINOR_ERR) + return -EFAULT; + + spin_lock_irqsave(&emsgs_lock, flags); + for (;;) { + em = emsgs + emsgs_head_idx; + if ((em->flags & EMFL_VALID) != 0) + break; + if (filp->f_flags & O_NDELAY) { spin_unlock_irqrestore(&emsgs_lock, flags); + return -EAGAIN; + } + nblocked_emsgs_readers++; + + spin_unlock_irqrestore(&emsgs_lock, flags); - n = down_interruptible(&emsgs_sema); + n = down_interruptible(&emsgs_sema); - spin_lock_irqsave(&emsgs_lock, flags); + spin_lock_irqsave(&emsgs_lock, flags); - nblocked_emsgs_readers--; + nblocked_emsgs_readers--; - if (n) { - spin_unlock_irqrestore(&emsgs_lock, flags); - return -ERESTARTSYS; - } - goto loop; - } - if (em->len > cnt) { + if (n) { spin_unlock_irqrestore(&emsgs_lock, flags); - return -EAGAIN; + return -ERESTARTSYS; } - mp = em->msg; - len = em->len; - em->msg = NULL; - em->flags &= ~EMFL_VALID; + } + if (em->len > cnt) { + spin_unlock_irqrestore(&emsgs_lock, flags); + return -EAGAIN; + } + mp = em->msg; + len = em->len; + em->msg = NULL; + em->flags &= ~EMFL_VALID; - emsgs_head_idx++; - emsgs_head_idx %= ARRAY_SIZE(emsgs); + emsgs_head_idx++; + emsgs_head_idx %= ARRAY_SIZE(emsgs); - spin_unlock_irqrestore(&emsgs_lock, flags); + spin_unlock_irqrestore(&emsgs_lock, flags); - n = copy_to_user(buf, mp, len); - kfree(mp); - return n == 0 ? len : -EFAULT; - default: - return -EFAULT; - } + n = copy_to_user(buf, mp, len); + kfree(mp); + return n == 0 ? len : -EFAULT; } static const struct file_operations aoe_fops = { -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 03/13] mac_addr: avoid 64-bit arch compiler warnings
By returning unsigned long long, mac_addr does not generate compiler warnings on 64-bit architectures. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h|2 +- drivers/block/aoe/aoeblk.c |3 +-- drivers/block/aoe/aoecmd.c | 10 +- drivers/block/aoe/aoenet.c |4 ++-- 4 files changed, 9 insertions(+), 10 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 87df18b..aecaac3 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -198,4 +198,4 @@ void aoenet_xmit(struct sk_buff *); int is_aoe_netif(struct net_device *ifp); int set_aoe_iflist(const char __user *str, size_t size); -u64 mac_addr(char addr[6]); +unsigned long long mac_addr(char addr[6]); diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index c2649c9..deea536 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -37,8 +37,7 @@ static ssize_t aoedisk_show_mac(struct device *dev, if (t == NULL) return snprintf(page, PAGE_SIZE, "none\n"); - return snprintf(page, PAGE_SIZE, "%012llx\n", - (unsigned long long)mac_addr(t->addr)); + return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr)); } static ssize_t aoedisk_show_netif(struct device *dev, struct device_attribute *attr, char *page) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 5e7daa1..1be5150 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -309,7 +309,8 @@ resend(struct aoedev *d, struct aoetgt *t, struct frame *f) "%15s e%ld.%d [EMAIL PROTECTED] newtag=%08x " "s=%012llx d=%012llx nout=%d\n", "retransmit", d->aoemajor, d->aoeminor, f->tag, jiffies, n, - mac_addr(h->src), mac_addr(h->dst), t->nout); + mac_addr(h->src), + mac_addr(h->dst), t->nout); aoechr_error(buf); f->tag = n; @@ -633,7 +634,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) if (d->ssize != ssize) printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n", - (unsigned long long)mac_addr(t->addr), + mac_addr(t->addr), d->aoemajor, d->aoeminor, d->fw_ver, (long long)ssize); d->ssize = ssize; @@ -727,8 +728,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) t = gettgt(d, hin->src); if (t == NULL) { printk(KERN_INFO "aoe: can't find target e%ld.%d:%012llx\n", - d->aoemajor, d->aoeminor, - (unsigned long long) mac_addr(hin->src)); + d->aoemajor, d->aoeminor, mac_addr(hin->src)); spin_unlock_irqrestore(&d->lock, flags); return; } @@ -1003,7 +1003,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb) "aoe: e%ld.%d: setting %d%s%s:%012llx\n", d->aoemajor, d->aoeminor, n, " byte data frames on ", ifp->nd->name, - (unsigned long long) mac_addr(t->addr)); + mac_addr(t->addr)); ifp->maxbcnt = n; } } diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c index 7a38a45..ada4a06 100644 --- a/drivers/block/aoe/aoenet.c +++ b/drivers/block/aoe/aoenet.c @@ -83,7 +83,7 @@ set_aoe_iflist(const char __user *user_str, size_t size) return 0; } -u64 +unsigned long long mac_addr(char addr[6]) { __be64 n = 0; @@ -91,7 +91,7 @@ mac_addr(char addr[6]) memcpy(p + 2, addr, 6); /* (sizeof addr != 6) */ - return __be64_to_cpu(n); + return (unsigned long long) __be64_to_cpu(n); } void -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 02/13] handle multiple network paths to AoE device
A remote AoE device is something can process ATA commands and is identified by an AoE shelf number and an AoE slot number. Such a device might have more than one network interface, and it might be reachable by more than one local network interface. This patch tracks the available network paths available to each AoE device, allowing them to be used more efficiently. Andrew Morton asked about the call to msleep_interruptible in the revalidate function. Yes, if a signal is pending, then msleep_interruptible will not return 0. That means we will not loop but will call aoenet_xmit with a NULL skb, which is a noop. If the system is too low on memory or the aoe driver is too low on frames, then the user can hit control-C to interrupt the attempt to do a revalidate. I have added a comment to the code summarizing that. Andrew Morton asked whether the allocation performed inside addtgt could use a more relaxed allocation like GFP_KERNEL, but addtgt is called when the aoedev lock has been locked with spin_lock_irqsave. It would be nice to allocate the memory under fewer restrictions, but targets are only added when the device is being discovered, and if the target can't be added right now, we can try again in a minute when then next AoE config query broadcast goes out. Andrew Morton pointed out that the "too many targets" message could be printed for failing GFP_ATOMIC allocations. The last patch in this series makes the messages more specific. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h| 57 +++-- drivers/block/aoe/aoeblk.c | 62 - drivers/block/aoe/aoechr.c | 17 +- drivers/block/aoe/aoecmd.c | 675 ++-- drivers/block/aoe/aoedev.c | 168 +-- drivers/block/aoe/aoenet.c |9 +- 6 files changed, 653 insertions(+), 335 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 4d0543a..87df18b 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -76,10 +76,8 @@ enum { DEVFL_EXT = (1<<2), /* device accepts lba48 commands */ DEVFL_CLOSEWAIT = (1<<3), /* device is waiting for all closes to revalidate */ DEVFL_GDALLOC = (1<<4), /* need to alloc gendisk */ - DEVFL_PAUSE = (1<<5), + DEVFL_KICKME = (1<<5), /* slow polling network card catch */ DEVFL_NEWSIZE = (1<<6), /* need to update dev size in block layer */ - DEVFL_MAXBCNT = (1<<7), /* d->maxbcnt is not changeable */ - DEVFL_KICKME = (1<<8), BUFFL_FAIL = 1, }; @@ -88,17 +86,24 @@ enum { DEFAULTBCNT = 2 * 512, /* 2 sectors */ NPERSHELF = 16, /* number of slots per shelf address */ FREETAG = -1, - MIN_BUFS = 8, + MIN_BUFS = 16, + NTARGETS = 8, + NAOEIFS = 8, + + TIMERTICK = HZ / 10, + MINTIMER = HZ >> 2, + MAXTIMER = HZ << 1, + HELPWAIT = 20, }; struct buf { struct list_head bufs; - ulong start_time; /* for disk stats */ + ulong stime;/* for disk stats */ ulong flags; ulong nframesout; - char *bufaddr; ulong resid; ulong bv_resid; + ulong bv_off; sector_t sector; struct bio *bio; struct bio_vec *bv; @@ -114,19 +119,37 @@ struct frame { struct sk_buff *skb; }; +struct aoeif { + struct net_device *nd; + unsigned char lost; + unsigned char lostjumbo; + ushort maxbcnt; +}; + +struct aoetgt { + unsigned char addr[6]; + ushort nframes; + struct frame *frames; + struct aoeif ifs[NAOEIFS]; + struct aoeif *ifp; /* current aoeif in use */ + ushort nout; + ushort maxout; + u16 lasttag;/* last tag sent */ + u16 useme; + ulong lastwadj; /* last window adjustment */ + int wpkts, rpkts; +}; + struct aoedev { struct aoedev *next; - unsigned char addr[6]; /* remote mac addr */ - ushort flags; ulong sysminor; ulong aoemajor; - ulong aoeminor; + u16 aoeminor; + u16 flags; u16 nopen; /* (bd_openers isn't available without sleeping) */ - u16 lasttag;/* last tag sent */ u16 rttavg; /* round trip average of requests/responses */ u16 mintimer; u16 fw_ver; /* version of blade's firmware */ - u16 maxbcnt; struct work_struct work;/* disk create work struct */ struct gendisk *gd; struct request_queue blkq; @@ -134,15 +157,14 @@ struct aoedev { sector_t ssize; struct timer_list timer; spinlock_t lock; - struct net_device *ifp; /* interface ed is attached to */ struct sk_buff *sendq_hd; /* packets needing to be sent, list head */ struct sk_buff *sendq_tl;
[PATCH 01/13] bring driver version number to 47
These patches were made against kernel 2.6.24-rc5-mm1 kernel. They were submitted earlier and have been modified to incorporate feedback from the kernel development community. I am resending them after resolving conflicts with patches already in the mm tree. The eleventh patch in the old series was obsoleted by a patch in the 2.6.24-rc5-mm1 tree, gregkh-driver-block-device.patch, and has been omitted. The last patch in this series is new but straightforward. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 07f02f8..4d0543a 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -1,5 +1,5 @@ /* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ -#define VERSION "32" +#define VERSION "47" #define AOE_MAJOR 152 #define DEVICE_NAME "aoe" -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Patch 3/8] Enhanced partition statistics: aoe fix
On Thu, Dec 13, 2007 at 05:17:45PM +0100, Jerome Marchand wrote: > Updates the enhanced partition statistics in ATA over Ethernet driver (not > tested). Acked-by: Ed L. Cashin <[EMAIL PROTECTED]> > Signed-off-by: Jerome Marchand <[EMAIL PROTECTED]> > --- > aoecmd.c |8 > 1 file changed, 4 insertions(+), 4 deletions(-) > diff -urNp -X linux-2.6/Documentation/dontdiff > linux-2.6.orig/drivers/block/aoe/aoecmd.c linux-2.6/drivers/block/aoe/aoecmd.c > --- linux-2.6.orig/drivers/block/aoe/aoecmd.c 2007-12-04 17:37:13.0 > +0100 > +++ linux-2.6/drivers/block/aoe/aoecmd.c 2007-12-05 13:45:10.0 > +0100 > @@ -648,10 +648,10 @@ aoecmd_ata_rsp(struct sk_buff *skb) > struct gendisk *disk = d->gd; > const int rw = bio_data_dir(buf->bio); > > - disk_stat_inc(disk, ios[rw]); > - disk_stat_add(disk, ticks[rw], duration); > - disk_stat_add(disk, sectors[rw], n_sect); > - disk_stat_add(disk, io_ticks, duration); > + all_stat_inc(disk, ios[rw], buf->sector); > + all_stat_add(disk, ticks[rw], duration, buf->sector); > + all_stat_add(disk, sectors[rw], n_sect, buf->sector); > + all_stat_add(disk, io_ticks, duration, buf->sector); > n = (buf->flags & BUFFL_FAIL) ? -EIO : 0; > bio_endio(buf->bio, n); > mempool_free(buf, d->bufpool); -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)
On Sat, Dec 08, 2007 at 07:50:15PM -0800, Andrew Morton wrote: > On Sat, 8 Dec 2007 16:59:30 -0600 "Jon Nelson" <[EMAIL PROTECTED]> wrote: > > > I can confirm that 2.6.24rc4 with the (second) patch works fine. > > OK, thanks. > > We haven't heard back from Ed yet. I'll sit on this for a few more days. Sorry, yes, the second patch works well for me in testing. I had some initial concerns that were unfounded. -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 13/13] make error messages more specific
Andrew Morton pointed out that the "too many targets" message in patch 2 could be printed for failing GFP_ATOMIC allocations. This patch makes the messages more specific. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c | 15 +++ 1 files changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index bcea36c..1e37cf6 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -957,15 +957,17 @@ addtgt(struct aoedev *d, char *addr, ulong nframes) for (; tt < te && *tt; tt++) ; - if (tt == te) + if (tt == te) { + printk(KERN_INFO + "aoe: device addtgt failure; too many targets\n"); return NULL; - + } t = kcalloc(1, sizeof *t, GFP_ATOMIC); - if (!t) - return NULL; f = kcalloc(nframes, sizeof *f, GFP_ATOMIC); - if (!f) { + if (!t || !f) { + kfree(f); kfree(t); + printk(KERN_INFO "aoe: cannot allocate memory to add target\n"); return NULL; } @@ -1029,9 +1031,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb) if (!t) { t = addtgt(d, h->src, n); if (!t) { - printk(KERN_INFO - "aoe: device addtgt failure; " - "too many targets?\n"); spin_unlock_irqrestore(&d->lock, flags); return; } -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 12/13] the aoeminor doesn't need a long format
The aoedev aoeminor member doesn't need a long format. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoeblk.c |7 --- drivers/block/aoe/aoecmd.c |5 +++-- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index 98ab170..b78a8ef 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -203,7 +203,7 @@ aoeblk_make_request(struct request_queue *q, struct bio *bio) spin_lock_irqsave(&d->lock, flags); if ((d->flags & DEVFL_UP) == 0) { - printk(KERN_INFO "aoe: device %ld.%ld is not up\n", + printk(KERN_INFO "aoe: device %ld.%d is not up\n", d->aoemajor, d->aoeminor); spin_unlock_irqrestore(&d->lock, flags); mempool_free(buf, d->bufpool); @@ -256,14 +256,15 @@ aoeblk_gdalloc(void *vp) gd = alloc_disk(AOE_PARTITIONS); if (gd == NULL) { - printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n", + printk(KERN_ERR + "aoe: cannot allocate disk structure for %ld.%d\n", d->aoemajor, d->aoeminor); goto err; } d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache); if (d->bufpool == NULL) { - printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n", + printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n", d->aoemajor, d->aoeminor); goto err_disk; } diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index e92d885..bcea36c 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -697,7 +697,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) } if (d->ssize != ssize) - printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n", + printk(KERN_INFO + "aoe: %012llx e%ld.%d v%04x has %llu sectors\n", mac_addr(t->addr), d->aoemajor, d->aoeminor, d->fw_ver, (long long)ssize); @@ -822,7 +823,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */ printk(KERN_ERR - "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n", + "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n", ahout->cmdstat, ahin->cmdstat, d->aoemajor, d->aoeminor); if (buf) -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 11/13] remove extra space in prototypes for consistency
Remove extra space in prototypes for consistency. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoeblk.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index 7168d3d..98ab170 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -15,7 +15,7 @@ static struct kmem_cache *buf_pool_cache; -static ssize_t aoedisk_show_state(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_state(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; @@ -26,7 +26,7 @@ static ssize_t aoedisk_show_state(struct gendisk * disk, char *page) (d->nopen && !(d->flags & DEVFL_UP)) ? ",closewait" : ""); /* I'd rather see nopen exported so we can ditch closewait */ } -static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_mac(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; struct aoetgt *t = d->targets[0]; @@ -35,7 +35,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page) return snprintf(page, PAGE_SIZE, "none\n"); return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr)); } -static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_netif(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; struct net_device *nds[8], **nd, **nnd, **ne; @@ -71,7 +71,7 @@ static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page) return p-page; } /* firmware version */ -static ssize_t aoedisk_show_fwver(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_fwver(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 10/13] add module parameter for users who need more outstanding I/O
An AoE target provides an estimate of the number of outstanding commands that the AoE initiator can send before getting a response. The aoe_maxout parameter provides a way to set an even lower limit. It will not allow a user to use more outstanding commands than the target permits. If a user discovers a problem with a large setting, this parameter provides a way for us to work with them to debug the problem. We expect to improve the dynamic window sizing algorithm and drop this parameter. For the time being, it is a debugging aid. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 7a96183..e92d885 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -18,6 +18,11 @@ static int aoe_deadsecs = 60 * 3; module_param(aoe_deadsecs, int, 0644); MODULE_PARM_DESC(aoe_deadsecs, "After aoe_deadsecs seconds, give up and fail dev."); +static int aoe_maxout = 16; +module_param(aoe_maxout, int, 0644); +MODULE_PARM_DESC(aoe_maxout, + "Only aoe_maxout outstanding packets for every MAC on eX.Y."); + static struct sk_buff * new_skb(ulong len) { @@ -984,7 +989,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb) struct aoeif *ifp; ulong flags, sysminor, aoemajor; struct sk_buff *sl; - enum { MAXFRAMES = 16 }; u16 n; h = (struct aoe_hdr *) skb_mac_header(skb); @@ -1009,8 +1013,8 @@ aoecmd_cfg_rsp(struct sk_buff *skb) } n = be16_to_cpu(ch->bufcnt); - if (n > MAXFRAMES) /* keep it reasonable */ - n = MAXFRAMES; + if (n > aoe_maxout) /* keep it reasonable */ + n = aoe_maxout; d = aoedev_by_sysminor_m(sysminor); if (d == NULL) { -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 09/13] remove race between use and initialization of locks
Alexey Dobriyan noticed a race in the initialization of the dynamic locks in ... Message-ID: <[EMAIL PROTECTED]> Andrew Morton commented that these locks should be initialized at compile time, so this patch does that. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 166f54f..0ce9bda 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -35,8 +35,8 @@ struct ErrMsg { static struct ErrMsg emsgs[NMSG]; static int emsgs_head_idx, emsgs_tail_idx; -static struct semaphore emsgs_sema; -static spinlock_t emsgs_lock; +static __DECLARE_SEMAPHORE_GENERIC(emsgs_sema, 0); +static DEFINE_SPINLOCK(emsgs_lock); static int nblocked_emsgs_readers; static struct class *aoe_class; static struct aoe_chardev chardevs[] = { @@ -264,8 +264,6 @@ aoechr_init(void) printk(KERN_ERR "aoe: can't register char device\n"); return n; } - sema_init(&emsgs_sema, 0); - spin_lock_init(&emsgs_lock); aoe_class = class_create(THIS_MODULE, "aoe"); if (IS_ERR(aoe_class)) { unregister_chrdev(AOE_MAJOR, "aoechr"); -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 08/13] only install new AoE device once
An aoe driver user who had about 70 AoE targets found that he was hitting a BUG in sysfs_create_file because the aoe driver was trying to tell the kernel about an AoE device more than once. Each AoE device was reachable by several local network interfaces, and multiple ATA device indentify responses were returning from that single device. This patch eliminates a race condition so that aoe always informs the block layer of a new AoE device once in the presence of multiple incoming ATA device identify responses. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index b49e06e..7a96183 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -698,6 +698,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) d->fw_ver, (long long)ssize); d->ssize = ssize; d->geo.start = 0; + if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE)) + return; if (d->gd != NULL) { d->gd->capacity = ssize; d->flags |= DEVFL_NEWSIZE; -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 07/13] dynamically allocate a capped number of skbs when necessary
What this Patch Does Even before this recent series of 12 patches to 2.6.22-rc4, the aoe driver was reusing a small set of skbs that were allocated once and were only used for outbound AoE commands. The network layer cannot be allowed to put_page on the data that is still associated with a bio we haven't returned to the block layer, so the aoe driver (even before the patch under discussion) is still the owner of skbs that have been handed to the network layer for transmission. We need to keep track of these skbs so that we can free them, but by tracking them, we can also easily re-use them. The new patch was a response to the behavior of certain network drivers. We cannot reuse an skb that the network driver still has in its transmit ring. Network drivers can defer transmit ring cleanup and then use the state in the skb to determine how many data segments to clean up in its transmit ring. The tg3 driver is one driver that behaves in this way. When the network driver defers cleanup of its transmit ring, the aoe driver can find itself in a situation where it would like to send an AoE command, and the AoE target is ready for more work, but the network driver still has all of the pre-allocated skbs. In that case, the new patch just calls alloc_skb, as you'd expect. We don't want to get carried away, though. We try not to do excessive allocation in the write path, so we cap the number of skbs we dynamically allocate. Probably calling it a "dynamic pool" is misleading. We were already trying to use a small fixed-size set of pre-allocated skbs before this patch, and this patch just provides a little headroom (with a ceiling, though) to accomodate network drivers that hang onto skbs, by allocating when needed. The d->skbpool_hd list of allocated skbs is necessary so that we can free them later. We didn't notice the need for this headroom until AoE targets got fast enough. Alternatives If the network layer never did a put_page on the pages in the bio's we get from the block layer, then it would be possible for us to hand skbs to the network layer and forget about them, allowing the network layer to free skbs itself (and thereby calling our own skb->destructor callback function if we needed that). In that case we could get rid of the pre-allocated skbs and also the d->skbpool_hd, instead just calling alloc_skb every time we wanted to transmit a packet. The slab allocator would effectively maintain the list of skbs. Besides a loss of CPU cache locality, the main concern with that approach the danger that it would increase the likelihood of deadlock when VM is trying to free pages by writing dirty data from the page cache through the aoe driver out to persistent storage on an AoE device. Right now we have a situation where we have pre-allocation that corresponds to how much we use, which seems ideal. Of course, there's still the separate issue of receiving the packets that tell us that a write has successfully completed on the AoE target. When memory is low and VM is using AoE to flush dirty data to free up pages, it would be perfect if there were a way for us to register a fast callback that could recognize write command completion responses. But I don't think the current problems with the receive side of the situation are a justification for exacerbating the problem on the transmit side. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h|5 ++ drivers/block/aoe/aoecmd.c | 117 +++- drivers/block/aoe/aoedev.c | 52 +--- 3 files changed, 133 insertions(+), 41 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 2248ab2..67ef4d7 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -89,6 +89,7 @@ enum { MIN_BUFS = 16, NTARGETS = 8, NAOEIFS = 8, + NSKBPOOLMAX = 128, TIMERTICK = HZ / 10, MINTIMER = HZ >> 2, @@ -138,6 +139,7 @@ struct aoetgt { u16 useme; ulong lastwadj; /* last window adjustment */ int wpkts, rpkts; + int dataref; }; struct aoedev { @@ -159,6 +161,9 @@ struct aoedev { spinlock_t lock; struct sk_buff *sendq_hd; /* packets needing to be sent, list head */ struct sk_buff *sendq_tl; + struct sk_buff *skbpool_hd; + struct sk_buff *skbpool_tl; + int nskbpool; mempool_t *bufpool; /* for deadlock-free Buf allocation */ struct list_head bufq; /* queue of bios to work on */ struct buf *inprocess; /* the one we're currently working on */ diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 1be5150..b49e06e 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -10
[PATCH 06/13] user can ask driver to forget previously detected devices
When an AoE device is detected, the kernel is informed, and a new block device is created. If the device is unused, the block device corresponding to remote device that is no longer available may be removed from the system by telling the aoe driver to "flush" its list of devices. Without this patch, software like GPFS and LVM may attempt to read from AoE devices that were discovered earlier but are no longer present, blocking until the I/O attempt times out. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- Documentation/aoe/mkdevs.sh |2 + Documentation/aoe/udev.txt |1 + drivers/block/aoe/aoe.h |1 + drivers/block/aoe/aoechr.c |5 ++ drivers/block/aoe/aoedev.c | 87 +- 5 files changed, 77 insertions(+), 19 deletions(-) diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh index 97374aa..44c0ab7 100644 --- a/Documentation/aoe/mkdevs.sh +++ b/Documentation/aoe/mkdevs.sh @@ -29,6 +29,8 @@ rm -f $dir/interfaces mknod -m 0200 $dir/interfaces c $MAJOR 4 rm -f $dir/revalidate mknod -m 0200 $dir/revalidate c $MAJOR 5 +rm -f $dir/flush +mknod -m 0200 $dir/flush c $MAJOR 6 export n_partitions mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'` diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt index 17e76c4..8686e78 100644 --- a/Documentation/aoe/udev.txt +++ b/Documentation/aoe/udev.txt @@ -20,6 +20,7 @@ SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220 SUBSYSTEM=="aoe", KERNEL=="err", NAME="etherd/%k", GROUP="disk", MODE="0440" SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220" SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="flush", NAME="etherd/%k", GROUP="disk", MODE="0220" # aoe block devices KERNEL=="etherd*", NAME="%k", GROUP="disk" diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index aecaac3..2248ab2 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -191,6 +191,7 @@ struct aoedev *aoedev_by_aoeaddr(int maj, int min); struct aoedev *aoedev_by_sysminor_m(ulong sysminor); void aoedev_downdev(struct aoedev *d); int aoedev_isbusy(struct aoedev *d); +int aoedev_flush(const char __user *str, size_t size); int aoenet_init(void); void aoenet_exit(void); diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 4a3889d..166f54f 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -15,6 +15,7 @@ enum { MINOR_DISCOVER, MINOR_INTERFACES, MINOR_REVALIDATE, + MINOR_FLUSH, MSGSZ = 2048, NMSG = 100, /* message backlog to retain */ }; @@ -43,6 +44,7 @@ static struct aoe_chardev chardevs[] = { { MINOR_DISCOVER, "discover" }, { MINOR_INTERFACES, "interfaces" }, { MINOR_REVALIDATE, "revalidate" }, + { MINOR_FLUSH, "flush" }, }; static int @@ -158,6 +160,9 @@ aoechr_write(struct file *filp, const char __user *buf, size_t cnt, loff_t *offp break; case MINOR_REVALIDATE: ret = revalidate(buf, cnt); + break; + case MINOR_FLUSH: + ret = aoedev_flush(buf, cnt); } if (ret == 0) ret = cnt; diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c index a4d625a..e26f6f4 100644 --- a/drivers/block/aoe/aoedev.c +++ b/drivers/block/aoe/aoedev.c @@ -9,6 +9,10 @@ #include #include "aoe.h" +static void dummy_timer(ulong); +static void aoedev_freedev(struct aoedev *); +static void freetgt(struct aoetgt *t); + static struct aoedev *devlist; static spinlock_t devlist_lock; @@ -108,6 +112,70 @@ aoedev_downdev(struct aoedev *d) d->flags &= ~DEVFL_UP; } +static void +aoedev_freedev(struct aoedev *d) +{ + struct aoetgt **t, **e; + + if (d->gd) { + aoedisk_rm_sysfs(d); + del_gendisk(d->gd); + put_disk(d->gd); + } + t = d->targets; + e = t + NTARGETS; + for (; t < e && *t; t++) + freetgt(*t); + if (d->bufpool) + mempool_destroy(d->bufpool); + kfree(d); +} + +int +aoedev_flush(const char __user *str, size_t cnt) +{ + ulong flags; + struct aoedev *d, **dd; + struct aoedev *rmd = NULL; + char buf[16]; + int all = 0; + + if (cnt >= 3) { + if (cnt > sizeof buf) + cnt = sizeof buf; +
[PATCH 05/13] eliminate goto and improve readability
Adam Richter suggested eliminating this goto. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c | 69 +-- 1 files changed, 34 insertions(+), 35 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 1a5c4b5..4a3889d 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -194,52 +194,51 @@ aoechr_read(struct file *filp, char __user *buf, size_t cnt, loff_t *off) ulong flags; n = (unsigned long) filp->private_data; - switch (n) { - case MINOR_ERR: - spin_lock_irqsave(&emsgs_lock, flags); -loop: - em = emsgs + emsgs_head_idx; - if ((em->flags & EMFL_VALID) == 0) { - if (filp->f_flags & O_NDELAY) { - spin_unlock_irqrestore(&emsgs_lock, flags); - return -EAGAIN; - } - nblocked_emsgs_readers++; + if (n != MINOR_ERR) + return -EFAULT; + + spin_lock_irqsave(&emsgs_lock, flags); + for (;;) { + em = emsgs + emsgs_head_idx; + if ((em->flags & EMFL_VALID) != 0) + break; + if (filp->f_flags & O_NDELAY) { spin_unlock_irqrestore(&emsgs_lock, flags); + return -EAGAIN; + } + nblocked_emsgs_readers++; + + spin_unlock_irqrestore(&emsgs_lock, flags); - n = down_interruptible(&emsgs_sema); + n = down_interruptible(&emsgs_sema); - spin_lock_irqsave(&emsgs_lock, flags); + spin_lock_irqsave(&emsgs_lock, flags); - nblocked_emsgs_readers--; + nblocked_emsgs_readers--; - if (n) { - spin_unlock_irqrestore(&emsgs_lock, flags); - return -ERESTARTSYS; - } - goto loop; - } - if (em->len > cnt) { + if (n) { spin_unlock_irqrestore(&emsgs_lock, flags); - return -EAGAIN; + return -ERESTARTSYS; } - mp = em->msg; - len = em->len; - em->msg = NULL; - em->flags &= ~EMFL_VALID; + } + if (em->len > cnt) { + spin_unlock_irqrestore(&emsgs_lock, flags); + return -EAGAIN; + } + mp = em->msg; + len = em->len; + em->msg = NULL; + em->flags &= ~EMFL_VALID; - emsgs_head_idx++; - emsgs_head_idx %= ARRAY_SIZE(emsgs); + emsgs_head_idx++; + emsgs_head_idx %= ARRAY_SIZE(emsgs); - spin_unlock_irqrestore(&emsgs_lock, flags); + spin_unlock_irqrestore(&emsgs_lock, flags); - n = copy_to_user(buf, mp, len); - kfree(mp); - return n == 0 ? len : -EFAULT; - default: - return -EFAULT; - } + n = copy_to_user(buf, mp, len); + kfree(mp); + return n == 0 ? len : -EFAULT; } static const struct file_operations aoe_fops = { -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 04/13] clean up udev configuration example
This patch adds a known default location for the udev configuration file and uses the more recent "==" syntax for SUBSYSTEM and KERNEL. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- Documentation/aoe/udev-install.sh |5 - Documentation/aoe/udev.txt| 15 --- 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh index 6449911..15e86f5 100644 --- a/Documentation/aoe/udev-install.sh +++ b/Documentation/aoe/udev-install.sh @@ -23,7 +23,10 @@ fi # /etc/udev/rules.d # rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`" -if test -z "$rules_d" || test ! -d "$rules_d"; then +if test -z "$rules_d" ; then + rules_d=/etc/udev/rules.d +fi +if test ! -d "$rules_d"; then echo "$me Error: cannot find udev rules directory" 1>&2 exit 1 fi diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt index a7ed1dc..17e76c4 100644 --- a/Documentation/aoe/udev.txt +++ b/Documentation/aoe/udev.txt @@ -1,6 +1,7 @@ # These rules tell udev what device nodes to create for aoe support. -# They may be installed along the following lines (adjusted to what -# you see on your system). +# They may be installed along the following lines. Check the section +# 8 udev manpage to see whether your udev supports SUBSYSTEM, and +# whether it uses one or two equal signs for SUBSYSTEM and KERNEL. # # [EMAIL PROTECTED] ~$ su # Password: @@ -15,10 +16,10 @@ # # aoe char devices -SUBSYSTEM="aoe", KERNEL="discover",NAME="etherd/%k", GROUP="disk", MODE="0220" -SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440" -SUBSYSTEM="aoe", KERNEL="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220" -SUBSYSTEM="aoe", KERNEL="revalidate", NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="err", NAME="etherd/%k", GROUP="disk", MODE="0440" +SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220" # aoe block devices -KERNEL="etherd*", NAME="%k", GROUP="disk" +KERNEL=="etherd*", NAME="%k", GROUP="disk" -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 03/13] mac_addr: avoid 64-bit arch compiler warnings
By returning unsigned long long, mac_addr does not generate compiler warnings on 64-bit architectures. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h|2 +- drivers/block/aoe/aoeblk.c |3 +-- drivers/block/aoe/aoecmd.c | 10 +- drivers/block/aoe/aoenet.c |4 ++-- 4 files changed, 9 insertions(+), 10 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 87df18b..aecaac3 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -198,4 +198,4 @@ void aoenet_xmit(struct sk_buff *); int is_aoe_netif(struct net_device *ifp); int set_aoe_iflist(const char __user *str, size_t size); -u64 mac_addr(char addr[6]); +unsigned long long mac_addr(char addr[6]); diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index e10a7f3..7168d3d 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -33,8 +33,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page) if (t == NULL) return snprintf(page, PAGE_SIZE, "none\n"); - return snprintf(page, PAGE_SIZE, "%012llx\n", - (unsigned long long)mac_addr(t->addr)); + return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr)); } static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page) { diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 5e7daa1..1be5150 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -309,7 +309,8 @@ resend(struct aoedev *d, struct aoetgt *t, struct frame *f) "%15s e%ld.%d [EMAIL PROTECTED] newtag=%08x " "s=%012llx d=%012llx nout=%d\n", "retransmit", d->aoemajor, d->aoeminor, f->tag, jiffies, n, - mac_addr(h->src), mac_addr(h->dst), t->nout); + mac_addr(h->src), + mac_addr(h->dst), t->nout); aoechr_error(buf); f->tag = n; @@ -633,7 +634,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) if (d->ssize != ssize) printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n", - (unsigned long long)mac_addr(t->addr), + mac_addr(t->addr), d->aoemajor, d->aoeminor, d->fw_ver, (long long)ssize); d->ssize = ssize; @@ -727,8 +728,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) t = gettgt(d, hin->src); if (t == NULL) { printk(KERN_INFO "aoe: can't find target e%ld.%d:%012llx\n", - d->aoemajor, d->aoeminor, - (unsigned long long) mac_addr(hin->src)); + d->aoemajor, d->aoeminor, mac_addr(hin->src)); spin_unlock_irqrestore(&d->lock, flags); return; } @@ -1003,7 +1003,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb) "aoe: e%ld.%d: setting %d%s%s:%012llx\n", d->aoemajor, d->aoeminor, n, " byte data frames on ", ifp->nd->name, - (unsigned long long) mac_addr(t->addr)); + mac_addr(t->addr)); ifp->maxbcnt = n; } } diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c index 7a38a45..ada4a06 100644 --- a/drivers/block/aoe/aoenet.c +++ b/drivers/block/aoe/aoenet.c @@ -83,7 +83,7 @@ set_aoe_iflist(const char __user *user_str, size_t size) return 0; } -u64 +unsigned long long mac_addr(char addr[6]) { __be64 n = 0; @@ -91,7 +91,7 @@ mac_addr(char addr[6]) memcpy(p + 2, addr, 6); /* (sizeof addr != 6) */ - return __be64_to_cpu(n); + return (unsigned long long) __be64_to_cpu(n); } void -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 02/13] handle multiple network paths to AoE device
A remote AoE device is something can process ATA commands and is identified by an AoE shelf number and an AoE slot number. Such a device might have more than one network interface, and it might be reachable by more than one local network interface. This patch tracks the available network paths available to each AoE device, allowing them to be used more efficiently. Andrew Morton asked about the call to msleep_interruptible in the revalidate function. Yes, if a signal is pending, then msleep_interruptible will not return 0. That means we will not loop but will call aoenet_xmit with a NULL skb, which is a noop. If the system is too low on memory or the aoe driver is too low on frames, then the user can hit control-C to interrupt the attempt to do a revalidate. I have added a comment to the code summarizing that. Andrew Morton asked whether the allocation performed inside addtgt could use a more relaxed allocation like GFP_KERNEL, but addtgt is called when the aoedev lock has been locked with spin_lock_irqsave. It would be nice to allocate the memory under fewer restrictions, but targets are only added when the device is being discovered, and if the target can't be added right now, we can try again in a minute when then next AoE config query broadcast goes out. Andrew Morton pointed out that the "too many targets" message could be printed for failing GFP_ATOMIC allocations. The last patch in this series makes the messages more specific. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h| 57 +++-- drivers/block/aoe/aoeblk.c | 62 - drivers/block/aoe/aoechr.c | 17 +- drivers/block/aoe/aoecmd.c | 675 ++-- drivers/block/aoe/aoedev.c | 168 +-- drivers/block/aoe/aoenet.c |9 +- 6 files changed, 653 insertions(+), 335 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 4d0543a..87df18b 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -76,10 +76,8 @@ enum { DEVFL_EXT = (1<<2), /* device accepts lba48 commands */ DEVFL_CLOSEWAIT = (1<<3), /* device is waiting for all closes to revalidate */ DEVFL_GDALLOC = (1<<4), /* need to alloc gendisk */ - DEVFL_PAUSE = (1<<5), + DEVFL_KICKME = (1<<5), /* slow polling network card catch */ DEVFL_NEWSIZE = (1<<6), /* need to update dev size in block layer */ - DEVFL_MAXBCNT = (1<<7), /* d->maxbcnt is not changeable */ - DEVFL_KICKME = (1<<8), BUFFL_FAIL = 1, }; @@ -88,17 +86,24 @@ enum { DEFAULTBCNT = 2 * 512, /* 2 sectors */ NPERSHELF = 16, /* number of slots per shelf address */ FREETAG = -1, - MIN_BUFS = 8, + MIN_BUFS = 16, + NTARGETS = 8, + NAOEIFS = 8, + + TIMERTICK = HZ / 10, + MINTIMER = HZ >> 2, + MAXTIMER = HZ << 1, + HELPWAIT = 20, }; struct buf { struct list_head bufs; - ulong start_time; /* for disk stats */ + ulong stime;/* for disk stats */ ulong flags; ulong nframesout; - char *bufaddr; ulong resid; ulong bv_resid; + ulong bv_off; sector_t sector; struct bio *bio; struct bio_vec *bv; @@ -114,19 +119,37 @@ struct frame { struct sk_buff *skb; }; +struct aoeif { + struct net_device *nd; + unsigned char lost; + unsigned char lostjumbo; + ushort maxbcnt; +}; + +struct aoetgt { + unsigned char addr[6]; + ushort nframes; + struct frame *frames; + struct aoeif ifs[NAOEIFS]; + struct aoeif *ifp; /* current aoeif in use */ + ushort nout; + ushort maxout; + u16 lasttag;/* last tag sent */ + u16 useme; + ulong lastwadj; /* last window adjustment */ + int wpkts, rpkts; +}; + struct aoedev { struct aoedev *next; - unsigned char addr[6]; /* remote mac addr */ - ushort flags; ulong sysminor; ulong aoemajor; - ulong aoeminor; + u16 aoeminor; + u16 flags; u16 nopen; /* (bd_openers isn't available without sleeping) */ - u16 lasttag;/* last tag sent */ u16 rttavg; /* round trip average of requests/responses */ u16 mintimer; u16 fw_ver; /* version of blade's firmware */ - u16 maxbcnt; struct work_struct work;/* disk create work struct */ struct gendisk *gd; struct request_queue blkq; @@ -134,15 +157,14 @@ struct aoedev { sector_t ssize; struct timer_list timer; spinlock_t lock; - struct net_device *ifp; /* interface ed is attached to */ struct sk_buff *sendq_hd; /* packets needing to be sent, list head */ struct sk_buff *sendq_tl;
[PATCH 01/13] bring driver version number to 47
These patches were made against kernel 2.6.23-rc4 kernel with the aoe-properly-initialise-the-request_queues-backing_dev_info patch (currently in mm) applied. They were submitted earlier and have been modified to incorporate feedback from the kernel development community. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 07f02f8..4d0543a 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -1,5 +1,5 @@ /* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ -#define VERSION "32" +#define VERSION "47" #define AOE_MAJOR 152 #define DEVICE_NAME "aoe" -- 1.5.3.4 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)
On Mon, Dec 03, 2007 at 03:13:49PM -0800, Andrew Morton wrote: > On Mon, 3 Dec 2007 14:47:22 -0800 > Andrew Morton <[EMAIL PROTECTED]> wrote: > > > Does this fix? > > Slightly more elaborate version Yes, this patch does eliminate the problem. Without it, no write can complete, and with it I have seen many writes complete without any trouble. Thank you for looking into this. I will look more closely at this patch tomorrow. -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)
On Mon, Dec 03, 2007 at 02:47:22PM -0800, Andrew Morton wrote: > On Mon, 3 Dec 2007 16:38:37 -0500 > "Ed L. Cashin" <[EMAIL PROTECTED]> wrote: ... > > It appears that the fbc->counters pointer is NULL. > > Does this fix? > > --- a/drivers/block/aoe/aoeblk.c~a > +++ a/drivers/block/aoe/aoeblk.c > @@ -6,6 +6,7 @@ > > #include > #include > +#include > #include > #include > #include > @@ -228,6 +229,7 @@ aoeblk_gdalloc(void *vp) > > spin_lock_irqsave(&d->lock, flags); > blk_queue_make_request(&d->blkq, aoeblk_make_request); > + bdi_init(&d->blkq.backing_dev_info); > gd->major = AOE_MAJOR; > gd->first_minor = d->sysminor * AOE_PARTITIONS; > gd->fops = &aoe_bdops; > _ > > > No, the behavior doesn't change with this patch applied. Meanwhile I have started a git bisect, and hopefully that will turn up a specific patch before I hit an unbootable kernel or get my machine in a state where it won't boot. -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)
On Mon, Dec 03, 2007 at 04:00:05PM -0500, Ed L. Cashin wrote: ... > I'll keep looking at this, but at a glance it looks like the cpu > number is valid, because I don't trip a BUG_ON when I make the change > below (the badval variable is noise, sorry). > > --- lx/lib/percpu_counter.c.200711302007-12-03 15:43:19.0 -0500 > +++ lx/lib/percpu_counter.c 2007-12-03 15:47:38.0 -0500 > @@ -33,7 +33,9 @@ void __percpu_counter_add(struct percpu_ > s64 count; > s32 *pcount; > int cpu = get_cpu(); > + u64 badval = 0xULL; > > + BUG_ON(!cpu_possible(cpu)); > pcount = per_cpu_ptr(fbc->counters, cpu); > count = *pcount + amount; > if (count >= batch || count <= -batch) { It appears that the fbc->counters pointer is NULL. I added the line, BUG_ON(!fbc->counters); ... (on line 39 in my percpu_counter.c), and it results in the trace below. It looks like when it's NULL, percpu_ptr passes it to __percpu_disguise, which makes it all ones and then tries to dereference 0x to access to the "ptrs" member of the struct percpu_data. [ cut here ] kernel BUG at lib/percpu_counter.c:39! invalid opcode: [1] SMP CPU 0 Modules linked in: aoe Pid: 3354, comm: bash Not tainted 2.6.24-rc3-47dbg #10 RIP: 0010:[] [] __percpu_counter_add+0x2a/0x8f RSP: 0018:810075031aa8 EFLAGS: 00010046 RAX: RBX: 81007fd19bd8 RCX: RDX: 0010 RSI: 0001 RDI: RBP: 810075031ac8 R08: 81007cc077b0 R09: 802ae5ee R10: 810075031aa8 R11: 8100750318e8 R12: 81007c81c380 R13: 810073ce8250 R14: 0200 R15: 8100755016b0 FS: 2b3e5c052db0() GS:8078b000() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: 2b7f44fb64e0 CR3: 7c4b1000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process bash (pid: 3354, threadinfo 81007503, task 81007b4da040) Stack: 810075031ac8 81007fd19bd8 81007c81c380 810075031af8 802ae682 100075031ae8 8100755016b0 0200 81007fd19bd8 810075031b18 802ae75c Call Trace: [] __set_page_dirty+0xdc/0x121 [] mark_buffer_dirty+0x95/0x99 [] __block_commit_write+0x72/0xac [] block_write_end+0x4f/0x5b [] blkdev_write_end+0x1b/0x38 [] generic_file_buffered_write+0x1c0/0x648 [] current_fs_time+0x22/0x29 [] __generic_file_aio_write_nolock+0x358/0x3c2 [] filemap_fault+0x1c4/0x320 [] unlock_page+0x2d/0x31 [] generic_file_aio_write_nolock+0x3b/0x8d [] do_sync_write+0xe2/0x126 [] autoremove_wake_function+0x0/0x38 [] do_page_fault+0x3f8/0x7bb [] fd_install+0x5f/0x68 [] vfs_write+0xae/0x137 [] sys_write+0x47/0x70 [] system_call+0x7e/0x83 Code: 0f 0b eb fe 0f a3 3d 7e 08 4f 00 19 c0 85 c0 75 04 0f 0b eb RIP [] __percpu_counter_add+0x2a/0x8f RSP BUG: sleeping function called from invalid context at kernel/rwsem.c:20 in_atomic():0, irqs_disabled():1 INFO: lockdep is turned off. Call Trace: [] debug_show_held_locks+0x1b/0x24 [] __might_sleep+0xc7/0xc9 [] down_read+0x1d/0x4a [] exit_mm+0x34/0xf7 [] do_exit+0x247/0x75b [] kernel_math_error+0x0/0x7e [] do_trap+0x101/0x110 [] do_invalid_op+0x91/0x9a [] __percpu_counter_add+0x2a/0x8f [] :aoe:aoeblk_make_request+0x1c3/0x1d0 [] io_schedule+0x28/0x34 [] error_exit+0x0/0x9a [] __set_page_dirty+0x48/0x121 [] __percpu_counter_add+0x2a/0x8f [] __set_page_dirty+0xdc/0x121 [] mark_buffer_dirty+0x95/0x99 [] __block_commit_write+0x72/0xac [] block_write_end+0x4f/0x5b [] blkdev_write_end+0x1b/0x38 [] generic_file_buffered_write+0x1c0/0x648 [] current_fs_time+0x22/0x29 [] __generic_file_aio_write_nolock+0x358/0x3c2 [] filemap_fault+0x1c4/0x320 [] unlock_page+0x2d/0x31 [] generic_file_aio_write_nolock+0x3b/0x8d [] do_sync_write+0xe2/0x126 [] autoremove_wake_function+0x0/0x38 [] do_page_fault+0x3f8/0x7bb [] fd_install+0x5f/0x68 [] vfs_write+0xae/0x137 [] sys_write+0x47/0x70 [] system_call+0x7e/0x83 -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)
On Mon, Dec 03, 2007 at 11:34:59AM -0800, Andrew Morton wrote: ... > Strange. It _looks_ like we've somehow caused smp_processor_id() to return > a not-possible CPU number. ... > Could you debug this a bit please? Find out which CPU number > __percpu_counter_add() is using, for a start? I'd do: ... > Alternatively, just do > > if (!cpu_possible(cpu)) > printk(...) > > in __percpu_counter_add(). Then you can proceed to work through the > various operations which smp_processor_id() does and find out where it went > wrong: print out %fs, mainly. > > If the cpu number is valid then perhaps something scribbled on the cpu's > per-cpu memory. I'll keep looking at this, but at a glance it looks like the cpu number is valid, because I don't trip a BUG_ON when I make the change below (the badval variable is noise, sorry). --- lx/lib/percpu_counter.c.200711302007-12-03 15:43:19.0 -0500 +++ lx/lib/percpu_counter.c 2007-12-03 15:47:38.0 -0500 @@ -33,7 +33,9 @@ void __percpu_counter_add(struct percpu_ s64 count; s32 *pcount; int cpu = get_cpu(); + u64 badval = 0xULL; + BUG_ON(!cpu_possible(cpu)); pcount = per_cpu_ptr(fbc->counters, cpu); count = *pcount + amount; if (count >= batch || count <= -batch) { The trace is, Unable to handle kernel paging request at RIP: [] __percpu_counter_add+0x35/0x7f PGD 203067 PUD 204067 PMD 0 Oops: [1] SMP CPU 0 Modules linked in: aoe Pid: 2777, comm: bash Not tainted 2.6.24-rc3-47dbg #9 RIP: 0010:[] [] __percpu_counter_add+0x35/0x7f RSP: 0018:810078d19aa8 EFLAGS: 00010086 RAX: RBX: 81007fc71950 RCX: 0010 RDX: RSI: 0001 RDI: 810078d9a250 RBP: 810078d19ac8 R08: 81007cc077b0 R09: 802ae5ee R10: 810078d19aa8 R11: 810078cb59d8 R12: 81007c81c380 R13: 810078d9a250 R14: 0200 R15: 81007805f830 FS: 2b341db94db0() GS:8078b000() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: CR3: 7b415000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process bash (pid: 2777, threadinfo 810078d18000, task 810078d160c0) Stack: 810078d19ac8 81007fc71950 81007c81c380 810078d19af8 802ae682 100078d19ae8 81007805f830 0200 81007fc71950 810078d19b18 802ae75c Call Trace: [] __set_page_dirty+0xdc/0x121 [] mark_buffer_dirty+0x95/0x99 [] __block_commit_write+0x72/0xac [] block_write_end+0x4f/0x5b [] blkdev_write_end+0x1b/0x38 [] generic_file_buffered_write+0x1c0/0x648 [] current_fs_time+0x22/0x29 [] __generic_file_aio_write_nolock+0x358/0x3c2 [] filemap_fault+0x1c4/0x320 [] unlock_page+0x2d/0x31 [] generic_file_aio_write_nolock+0x3b/0x8d [] do_sync_write+0xe2/0x126 [] autoremove_wake_function+0x0/0x38 [] do_page_fault+0x3f8/0x7bb [] fd_install+0x5f/0x68 [] vfs_write+0xae/0x137 [] sys_write+0x47/0x70 [] system_call+0x7e/0x83 Code: 4c 8b 24 d0 49 63 04 24 48 8d 1c 30 48 63 c1 48 39 c3 7d 0a RIP [] __percpu_counter_add+0x35/0x7f RSP CR2: BUG: sleeping function called from invalid context at kernel/rwsem.c:20 in_atomic():0, irqs_disabled():1 INFO: lockdep is turned off. Call Trace: [] debug_show_held_locks+0x1b/0x24 [] __might_sleep+0xc7/0xc9 [] down_read+0x1d/0x4a [] exit_mm+0x34/0xf7 [] do_exit+0x247/0x75b [] do_page_fault+0x6c7/0x7bb [] thread_return+0x42/0x86 [] :aoe:aoeblk_make_request+0x1c3/0x1d0 [] error_exit+0x0/0x9a [] __set_page_dirty+0x48/0x121 [] __percpu_counter_add+0x35/0x7f [] __set_page_dirty+0xdc/0x121 [] mark_buffer_dirty+0x95/0x99 [] __block_commit_write+0x72/0xac [] block_write_end+0x4f/0x5b [] blkdev_write_end+0x1b/0x38 [] generic_file_buffered_write+0x1c0/0x648 [] current_fs_time+0x22/0x29 [] __generic_file_aio_write_nolock+0x358/0x3c2 [] filemap_fault+0x1c4/0x320 [] unlock_page+0x2d/0x31 [] generic_file_aio_write_nolock+0x3b/0x8d [] do_sync_write+0xe2/0x126 [] autoremove_wake_function+0x0/0x38 [] do_page_fault+0x3f8/0x7bb [] fd_install+0x5f/0x68 [] vfs_write+0xae/0x137 [] sys_write+0x47/0x70 [] system_call+0x7e/0x83 -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Bugme-new] [Bug 9482] New: kernel GPF in 2.6.24 (g09f345da)
On Sat, Dec 01, 2007 at 12:23:02PM -0800, Andrew Morton wrote: > (switched to email - please respond via emailed reply-to-all, not via the > bugzilla web interface) > > On Sat, 1 Dec 2007 11:54:11 -0800 (PST) [EMAIL PROTECTED] wrote: > > > http://bugzilla.kernel.org/show_bug.cgi?id=9482 ... > Damn that's odd. General Protection Fault in > __set_page_dirty->__percpu_counter_add(). No sign of AOE in the trace. > > I assume that it is repeatable and that it doesn't occur with mkfs on > regular local disk drives? I am encountering this same problem during testing of some patches I would like to send to the LKML, applied to 2.6.24-rc3, and I can trip this problem with just, echo > /dev/etherd/e7.0 ... at which point I get the trace below. (I had added a couple of checks for 0x pointers to __percpu_counter_add.) I haven't been able to check the unpatched aoe driver, but it looks the same. Unable to handle kernel paging request at RIP: [] __percpu_counter_add+0x24/0x6d PGD 203067 PUD 204067 PMD 0 Oops: [1] SMP CPU 0 Modules linked in: aoe Pid: 2860, comm: bash Not tainted 2.6.24-rc3-47dbg #5 RIP: 0010:[] [] __percpu_counter_add+0x24/0x6d RSP: 0018:81007a0fbaa8 EFLAGS: 00010092 RAX: RBX: 81007fcc48e0 RCX: 0010 RDX: RSI: 0001 RDI: 81007ace7240 RBP: 81007a0fbac8 R08: 81007cc077b0 R09: 802ae5ee R10: 81007a0fbaa8 R11: 810077dd99d8 R12: 81007ace7240 R13: R14: 0200 R15: 810078473bb0 FS: 2ba601c5cdb0() GS:8078b000() knlGS: CS: 0010 DS: ES: CR0: 8005003b CR2: CR3: 77c31000 CR4: 06e0 DR0: DR1: DR2: DR3: DR6: 0ff0 DR7: 0400 Process bash (pid: 2860, threadinfo 81007a0fa000, task 81007bf48040) Stack: 81007a0fbac8 81007fcc48e0 81007c81c380 81007a0fbaf8 802ae682 10007a0fbae8 810078473bb0 0200 81007fcc48e0 81007a0fbb18 802ae75c Call Trace: [] __set_page_dirty+0xdc/0x121 [] mark_buffer_dirty+0x95/0x99 [] __block_commit_write+0x72/0xac [] block_write_end+0x4f/0x5b [] blkdev_write_end+0x1b/0x38 [] generic_file_buffered_write+0x1c0/0x648 [] current_fs_time+0x22/0x29 [] __generic_file_aio_write_nolock+0x358/0x3c2 [] filemap_fault+0x1c4/0x320 [] unlock_page+0x2d/0x31 [] generic_file_aio_write_nolock+0x3b/0x8d [] do_sync_write+0xe2/0x126 [] autoremove_wake_function+0x0/0x38 [] do_page_fault+0x3f8/0x7bb [] fd_install+0x5f/0x68 [] vfs_write+0xae/0x137 [] sys_write+0x47/0x70 [] system_call+0x7e/0x83 Code: 4c 8b 2c d0 49 63 45 00 48 8d 1c 30 48 63 c1 48 39 c3 7d 0a RIP [] __percpu_counter_add+0x24/0x6d RSP CR2: BUG: sleeping function called from invalid context at kernel/rwsem.c:20 in_atomic():0, irqs_disabled():1 INFO: lockdep is turned off. Call Trace: [] debug_show_held_locks+0x1b/0x24 [] __might_sleep+0xc7/0xc9 [] down_read+0x1d/0x4a [] exit_mm+0x34/0xf7 [] do_exit+0x247/0x75b [] do_page_fault+0x6c7/0x7bb [] thread_return+0x42/0x86 [] :aoe:aoeblk_make_request+0x1e8/0x1f5 [] error_exit+0x0/0x9a [] __set_page_dirty+0x48/0x121 [] __percpu_counter_add+0x24/0x6d [] __set_page_dirty+0xdc/0x121 [] mark_buffer_dirty+0x95/0x99 [] __block_commit_write+0x72/0xac [] block_write_end+0x4f/0x5b [] blkdev_write_end+0x1b/0x38 [] generic_file_buffered_write+0x1c0/0x648 [] current_fs_time+0x22/0x29 [] __generic_file_aio_write_nolock+0x358/0x3c2 [] filemap_fault+0x1c4/0x320 [] unlock_page+0x2d/0x31 [] generic_file_aio_write_nolock+0x3b/0x8d [] do_sync_write+0xe2/0x126 [] autoremove_wake_function+0x0/0x38 [] do_page_fault+0x3f8/0x7bb [] fd_install+0x5f/0x68 [] vfs_write+0xae/0x137 [] sys_write+0x47/0x70 [] system_call+0x7e/0x83 -- Ed L Cashin <[EMAIL PROTECTED]> -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] [73/2many] MAINTAINERS - ATA OVER ETHERNET DRIVER
On Sun, Aug 12, 2007 at 11:23:41PM -0700, [EMAIL PROTECTED] wrote: > Add file pattern to MAINTAINER entry > > Signed-off-by: Joe Perches <[EMAIL PROTECTED]> > > diff --git a/MAINTAINERS b/MAINTAINERS > index a165698..b8bb108 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -740,6 +740,8 @@ P:Ed L. Cashin > M: [EMAIL PROTECTED] > W: http://www.coraid.com/support/linux > S: Supported > +F: Documentation/aoe/ > +F: drivers/block/aoe/ > > ATL1 ETHERNET DRIVER > P: Jay Cliburn That looks fine to me. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] aoe: remove unecessary wrapper function
We can just use skb_mac_header now, and we don't need a wrapper function to perform the cast. Instead of requiring the reader to check aoe.h to look up what an aoe_hdr function does, I'd rather do without it. --- drivers/block/aoe/aoe.h|9 - drivers/block/aoe/aoecmd.c | 14 +++--- drivers/block/aoe/aoenet.c |2 +- 3 files changed, 8 insertions(+), 17 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index ba07f76..07f02f8 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -48,15 +48,6 @@ struct aoe_hdr { __be32 tag; }; -#ifdef __KERNEL__ -#include - -static inline struct aoe_hdr *aoe_hdr(const struct sk_buff *skb) -{ - return (struct aoe_hdr *)skb_mac_header(skb); -} -#endif - struct aoe_atahdr { unsigned char aflags; unsigned char errfeat; diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 01fbdd3..8893a26 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -119,7 +119,7 @@ aoecmd_ata_rw(struct aoedev *d, struct frame *f) /* initialize the headers & frame */ skb = f->skb; - h = aoe_hdr(skb); + h = (struct aoe_hdr *) skb_mac_header(skb); ah = (struct aoe_atahdr *) (h+1); skb_put(skb, sizeof *h + sizeof *ah); memset(h, 0, skb->len); @@ -208,7 +208,7 @@ aoecmd_cfg_pkts(ushort aoemajor, unsigned char aoeminor, struct sk_buff **tail) skb->dev = ifp; if (sl_tail == NULL) sl_tail = skb; - h = aoe_hdr(skb); + h = (struct aoe_hdr *) skb_mac_header(skb); memset(h, 0, sizeof *h + sizeof *ch); memset(h->dst, 0xff, sizeof h->dst); @@ -303,7 +303,7 @@ rexmit(struct aoedev *d, struct frame *f) aoechr_error(buf); skb = f->skb; - h = aoe_hdr(skb); + h = (struct aoe_hdr *) skb_mac_header(skb); ah = (struct aoe_atahdr *) (h+1); f->tag = n; h->tag = cpu_to_be32(n); @@ -532,7 +532,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) char ebuf[128]; u16 aoemajor; - hin = aoe_hdr(skb); + hin = (struct aoe_hdr *) skb_mac_header(skb); aoemajor = be16_to_cpu(get_unaligned(&hin->major)); d = aoedev_by_aoeaddr(aoemajor, hin->minor); if (d == NULL) { @@ -564,7 +564,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) calc_rttavg(d, tsince(f->tag)); ahin = (struct aoe_atahdr *) (hin+1); - hout = aoe_hdr(f->skb); + hout = (struct aoe_hdr *) skb_mac_header(f->skb); ahout = (struct aoe_atahdr *) (hout+1); buf = f->buf; @@ -698,7 +698,7 @@ aoecmd_ata_id(struct aoedev *d) /* initialize the headers & frame */ skb = f->skb; - h = aoe_hdr(skb); + h = (struct aoe_hdr *) skb_mac_header(skb); ah = (struct aoe_atahdr *) (h+1); skb_put(skb, sizeof *h + sizeof *ah); memset(h, 0, skb->len); @@ -729,7 +729,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb) enum { MAXFRAMES = 16 }; u16 n; - h = aoe_hdr(skb); + h = (struct aoe_hdr *) skb_mac_header(skb); ch = (struct aoe_cfghdr *) (h+1); /* diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c index f9ddfda..d54bf3a 100644 --- a/drivers/block/aoe/aoenet.c +++ b/drivers/block/aoe/aoenet.c @@ -123,7 +123,7 @@ aoenet_rcv(struct sk_buff *skb, struct net_device *ifp, struct packet_type *pt, goto exit; skb_push(skb, ETH_HLEN);/* (1) */ - h = aoe_hdr(skb); + h = (struct aoe_hdr *) skb_mac_header(skb); n = be32_to_cpu(get_unaligned(&h->tag)); if ((h->verfl & AOEFL_RSP) == 0 || (n & 1<<31)) goto exit; -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: ATA over ethernet swapping and obfuscated code
On Tue, Jul 31, 2007 at 05:29:24PM +0200, Pavel Machek wrote: ... > Is the protocol documented somewhere? aoe.txt only points at > HOWTO... aha, protocol is linked from wikipedia. > http://www.coraid.com/documents/AoEr10.txt ... perhaps that should be > linked from aoe.txt, too? Perhaps. Most people reading the aoe.txt file won't need to refer to the protocol itself, though. > Hmm, aoe protocol is really trivial. Perhaps netpoll/netconsole > infrastructure could be used to create driver good enough for > swapping? (Ok, it would not neccessarily perform too well, but... we'd > simply wait for the reply synchronously. It should be pretty simple). I think that in general you still need a way to receive write confirmations without allocating memory, and the driver can't provide that mechanism. The problem is that when memory is scarce, writes of dirty data must be able to complete, but because memory is scarce, there might not be enough to receive and process packets write-reponse packets, and the driver has no way of affecting the situation. That's why I think a callback could work: The network layer could allow storage drivers to register a callback that recognizes write responses. Usually the callback would not be used, but if free pages became so scarce that network receives could not take place in a normal fashion, the (zero or few) registered callbacks would be used to quickly determine whether each packet was a write response. The distinction is important, because write responses can result in the freeing of pages. When a storage driver's callback identified a write response, then a reserved skb could be used to process the receive without allocating memory. During the memory crunch packets that were not write responses would be dropped just as they are already, but dirty pages would be flushed. The mechanism would only take effect when free pages were scarce. It is easy to chat, though. Maybe someday I will test and submit a patch that implements this mechanism, but I'm hoping that somebody beats me to it. :) -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: ATA over ethernet swapping and obfuscated code
On Tue, Jul 31, 2007 at 03:58:31PM +0200, Pavel Machek wrote: > Hi! > > I wanted to know if it is possible/okay to swap over AOE... > > According to > http://www.coraid.com/support/linux/EtherDrive-2.6-HOWTO-5.html#ss5.20 > .. it runs OOM even during normal use, so I guess swapping over it is > no-no? It can be done (e.g., to create virtual memory for running xfs_check on a diskless machine as a temporary measure), but it probably won't be a good idea until there is a mechanism that allows write responses to be (quickly recognized and then) received without allocating memory when there are no free pages. I think if we could register a very fast function to recognize write responses, which would be called only when free memory was very low, and then use a pre-allocated receive skb for receiving write responses, then we'd be OK, and the common case wouldn't be affected. > Can I build both client and server for these using free software? Yes. A popular free target is the vblade (aoetools.sourceforge.net), and there are others. The most popular free software initiator is the aoe driver in Linux. > In the process, I looked at the aoe code, and parts of it look like > obfuscated C contest. The use of switch() as an if was particulary > creative; I'm not even sure if I translated it properly... can you > take a look? I recently submitted a set of patches, and Andrew Morton asked me to avoid the switch statement you are talking about, so thanks for the patch, but that code is going to be patched soon anyway. More below. > (Patch is > > Signed-off-by: Pavel Machek <[EMAIL PROTECTED]> > > but I did not even compile test it) > > diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c > index 05a9719..38ba35d 100644 > --- a/drivers/block/aoe/aoedev.c > +++ b/drivers/block/aoe/aoedev.c > @@ -64,29 +64,26 @@ aoedev_newdev(ulong nframes) > > d = kzalloc(sizeof *d, GFP_ATOMIC); > f = kcalloc(nframes, sizeof *f, GFP_ATOMIC); > - switch (!d || !f) { > - case 0: > - d->nframes = nframes; > - d->frames = f; > - e = f + nframes; > - for (; f - f->tag = FREETAG; > - f->skb = new_skb(ETH_ZLEN); > - if (!f->skb) > - break; > - } > - if (f == e) > - break; > + if (!d || !f) { > + kfree(f); > + kfree(d); > + return NULL; > + } > + > + d->nframes = nframes; > + d->frames = f; > + e = f + nframes; > + for (; f + f->tag = FREETAG; > + f->skb = new_skb(ETH_ZLEN); > + if (!f->skb) > + break; > + } > + if (f != e) { > while (f > d->frames) { > f--; > dev_kfree_skb(f->skb); > } > - default: > - if (f) > - kfree(f); > - if (d) > - kfree(d); > - return NULL; > } > INIT_WORK(&d->work, aoecmd_sleepwork); > spin_lock_init(&d->lock); > > > aoedev_by_sysminor_m() returns with spinlock held in error case; I > guess that's bad. > > struct aoedev * > aoedev_by_sysminor_m(ulong sysminor, ulong bufcnt) > { > struct aoedev *d; > ulong flags; > > spin_lock_irqsave(&devlist_lock, flags); > > for (d=devlist; d; d=d->next) > if (d->sysminor == sysminor) > break; > > if (d == NULL) { > d = aoedev_newdev(bufcnt); > if (d == NULL) { > spin_unlock_irqrestore(&devlist_lock, flags); > printk(KERN_INFO "aoe: aoedev_newdev > failure.\n"); > return NULL; > ~~~~~~~~~ here I don't see what you mean. There's an unlock two lines before the return. > } > d->sysminor = sysminor; > d->aoemajor = AOEMAJOR(sysminor); > d->aoeminor = AOEMINOR(sysminor); > } > > spin_unlock_irqrestore(&devlist_lock, flags); > return d; > } > -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] stacked ifs (was Re: [PATCH 02/12] handle multiple network paths to AoE device)
On Mon, Jul 02, 2007 at 09:29:49PM -0700, Andrew Morton wrote: > On Tue, 26 Jun 2007 14:50:10 -0400 "Ed L. Cashin" <[EMAIL PROTECTED]> wrote: ... > > +static struct frame * > > +freeframe(struct aoedev *d) > > { > > + struct frame *f, *e; > > + struct aoetgt **t; > > + ulong n; > > + > > + if (d->targets[0] == NULL) {/* shouldn't happen, but I'm paranoid */ > > + printk(KERN_ERR "aoe: NULL TARGETS!\n"); > > + return NULL; > > + } > > + t = d->targets; > > + do { > > + if (t != d->htgt) > > + if ((*t)->ifp->nd) > > + if ((*t)->nout < (*t)->maxout) { > > ugh. Do this: > > do { > if (t == d->htgt) > continue; > if (!(*t)->ifp->nd) > continue; > if ((*t)->nout >= (*t)->maxout) > continue; > > > } while (++t ...) Do you think the "stacked ifs" in the first version above could be accepted as a convenient extension to the K&R-based conventions in Documentation/CodingStyle? Brian Kerhnighan (the "K" in "K&R") and Ken Thompson, were among the UNIX hackers who produced the UNIX v7 files that have stacked ifs: namei.c, dump.c, iostat.c od.c sa.c, vplot.c, refer/what2.c, sed/sed1.c, tbl/t8.c, chess/{agen, play, stdin}.c Certainly, Linux isn't UNIX, but stacked ifs needn't be treated as foreign. They're in the K&R tradition that Documentation/CodingStyle is based on. Stacked ifs are functionally equivalent to a single if with its conditionals joined by "&&". Once that is retained, they are not at all difficult to recognize and understand. And they have some advantages over the single if that uses "&&". When editing code, it is easier to remove conditionals that are no longer needed, or to arrange conditionals in a different order. Conditional expressions stand out clearly. When stacked ifs are in use, the resulting patches can be easier to read because fewer lines need to change (compared to splitting on the &&), and also simply because the text is more regular when it comes to parentheses and ampersands. There is less distracting noise. Of course, my primary motivation is for us to be able to contribute aoe driver patches that use this style, and that would be fantastic, but I don't think it is unrealistic to say that the adoption of this style would benefit others, helping to make patches easier to review in some cases. Understanding it only takes an understanding of C itself, so the only "new" change would be a slight and justifiable loosening of the indentation policy. Signed-off-by: "Ed L. Cashin" <[EMAIL PROTECTED]> --- Documentation/CodingStyle | 21 + 1 files changed, 21 insertions(+), 0 deletions(-) diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle index a667eb1..bb2bb57 100644 --- a/Documentation/CodingStyle +++ b/Documentation/CodingStyle @@ -94,6 +94,27 @@ void fun(int a, int b, int c) next_statement; } +One way to break a long condition in an if statement is to use stacked +ifs. The following code extracts are equivalent, but the version with +stacked ifs is easier to read and edit, and it generates more specific +patches. + + /* version one: stacked ifs */ + if (condition) + if (condition2) + if (condition3) + if (condition4) + first_statement; + else + next_statement; + + /* version two: logical and */ + if (condition1 && condition2 && condition3 && condition4) + first_statement; + else + next_statement; + + Chapter 3: Placing Braces and Spaces The other issue that always comes up in C styling is the placement of -- 1.5.2.1 -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 02/12] handle multiple network paths to AoE device
On Mon, Jul 02, 2007 at 09:29:49PM -0700, Andrew Morton wrote: > On Tue, 26 Jun 2007 14:50:10 -0400 "Ed L. Cashin" <[EMAIL PROTECTED]> wrote: ... > > +loop: > > + skb = aoecmd_ata_id(d); > > spin_unlock_irqrestore(&d->lock, flags); > > + if (!skb && !msleep_interruptible(200)) { > > + spin_lock_irqsave(&d->lock, flags); > > + goto loop; > > + } > > + aoenet_xmit(skb); > > aoecmd_cfg(major, minor); > > - > > return 0; > > } > > interruptible sleep? Does this code work as intended when there's a signal > pending? (Maybe that's what the interruptible sleep is for: don't know, > and am not inclined to work it out given the (low) level of comments in > here and the (lower) level of changelogging). Yes, if a signal is pending, then msleep_interruptible will not return 0. That means we will not loop but will call aoenet_xmit with a NULL skb, which is a noop. So if the system is too low on memory or the aoe driver is too low on frames, then the user can hit control-C to interrupt the attempt to do a revalidate. I will add a comment to that effect in the resubmitted patch. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 07/12] use a dynamic pool of sk_buffs to keep up with fast targets
Hi. I will address the style issues and other things that Andrew Morton pointed out---Thanks again for the feedback. As far as the skb pool goes, I'm afraid my comment is misleading. What this Patch Does Even before this recent series of 12 patches to 2.6.22-rc4, the aoe driver was reusing a small set of skbs that were allocated once and were only used for outbound AoE commands. The network layer cannot be allowed to put_page on the data that is still associated with a bio we haven't returned to the block layer, so the aoe driver (even before the patch under discussion) is still the owner of skbs that have been handed to the network layer for transmission. We need to keep track of these skbs so that we can free them, but by tracking them, we can also easily re-use them. The new patch was a response to the behavior of certain network drivers. We cannot reuse an skb that the network driver still has in its transmit ring. Network drivers can defer transmit ring cleanup and then use the state in the skb to determine how many data segments to clean up in its transmit ring. The tg3 driver is one driver that behaves in this way. When the network driver defers cleanup of its transmit ring, the aoe driver can find itself in a situation where it would like to send an AoE command, and the AoE target is ready for more work, but the network driver still has all of the pre-allocated skbs. In that case, the new patch just calls alloc_skb, as you'd expect. We don't want to get carried away, though. We try not to do excessive allocation in the write path, so we cap the number of skbs we dynamically allocate. Probably calling it a "dynamic pool" is misleading. We were already trying to use a small fixed-size set of pre-allocated skbs before this patch, and this patch just provides a little headroom (with a ceiling, though) to accomodate network drivers that hang onto skbs, by allocating when needed. The d->skbpool_hd list of allocated skbs is necessary so that we can free them later. We didn't notice the need for this headroom until AoE targets got fast enough, but the comment summarizing this patch still wasn't very good. So, when I resubmit this patch, I will use a different comment: dynamically allocate a capped number of skbs when necessary Alternatives If the network layer never did a put_page on the pages in the bio's we get from the block layer, then it would be possible for us to hand skbs to the network layer and forget about them, allowing the network layer to free skbs itself (and thereby calling our own skb->destructor callback function if we needed that). In that case we could get rid of the pre-allocated skbs and also the d->skbpool_hd, instead just calling alloc_skb every time we wanted to transmit a packet. The slab allocator would effectively maintain the list of skbs. Besides a loss of CPU cache locality, the main concern with that approach the danger that it would increase the likelihood of deadlock when VM is trying to free pages by writing dirty data from the page cache through the aoe driver out to persistent storage on an AoE device. Right now we have a situation where we have pre-allocation that corresponds to how much we use, which seems ideal. Of course, there's still the separate issue of receiving the packets that tell us that a write has successfully completed on the AoE target. When memory is low and VM is using AoE to flush dirty data to free up pages, it would be perfect if there were a way for us to register a fast callback that could recognize write command completion responses. But I don't think the current problems with the receive side of the situation are a justification for exacerbating the problem on the transmit side. -- Support - http://www.coraid.com/support/howto.html Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/1] docs: static initialization of spinlocks is OK
Static initialization of spinlocks is preferable to dynamic initialization when it is practical. This patch updates documentation for consistency with comments in spinlock_types.h. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- Documentation/spinlocks.txt | 20 +++- 1 files changed, 11 insertions(+), 9 deletions(-) diff --git a/Documentation/spinlocks.txt b/Documentation/spinlocks.txt index a661d68..471e753 100644 --- a/Documentation/spinlocks.txt +++ b/Documentation/spinlocks.txt @@ -1,7 +1,12 @@ -UPDATE March 21 2005 Amit Gud <[EMAIL PROTECTED]> +SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED defeat lockdep state tracking and +are hence deprecated. -Macros SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated and will be -removed soon. So for any new code dynamic initialization should be used: +Please use DEFINE_SPINLOCK()/DEFINE_RWLOCK() or +__SPIN_LOCK_UNLOCKED()/__RW_LOCK_UNLOCKED() as appropriate for static +initialization. + +Dynamic initialization, when necessary, may be performed as +demonstrated below. spinlock_t xxx_lock; rwlock_t xxx_rw_lock; @@ -15,12 +20,9 @@ removed soon. So for any new code dynamic initialization should be used: module_init(xxx_init); -Reasons for deprecation - - it hurts automatic lock validators - - it becomes intrusive for the realtime preemption patches - -Following discussion is still valid, however, with the dynamic initialization -of spinlocks instead of static. +The following discussion is still valid, however, with the dynamic +initialization of spinlocks or with DEFINE_SPINLOCK, etc., used +instead of SPIN_LOCK_UNLOCKED. --- -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 12/12] the aoeminor doesn't need a long format
On Tue, Jun 26, 2007 at 12:51:07PM -0700, Randy Dunlap wrote: > On Tue, 26 Jun 2007 14:50:12 -0400 Ed L. Cashin wrote: > > > The aoedev aoeminor member doesn't need a long format. > > Was there a patch that changed aoeminor to an int? > Last I see is: > ulong aoeminor; > in linux-2.6.22-rc6/drivers/block/aoe/aoe.h. > > If it's still ulong, you shouldn't change the printk format. Yes, thanks for checking. Patch 2 of 12 did change it to a u16. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 02/12] handle multiple network paths to AoE device
Handle multiple network paths to AoE device. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h| 58 +++-- drivers/block/aoe/aoeblk.c | 63 - drivers/block/aoe/aoechr.c | 14 +- drivers/block/aoe/aoecmd.c | 660 +-- drivers/block/aoe/aoedev.c | 163 +-- drivers/block/aoe/aoenet.c |4 +- 6 files changed, 630 insertions(+), 332 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 2ce5ce9..069f04c 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -85,10 +85,8 @@ enum { DEVFL_EXT = (1<<2), /* device accepts lba48 commands */ DEVFL_CLOSEWAIT = (1<<3), /* device is waiting for all closes to revalidate */ DEVFL_GDALLOC = (1<<4), /* need to alloc gendisk */ - DEVFL_PAUSE = (1<<5), + DEVFL_KICKME = (1<<5), /* slow polling network card catch */ DEVFL_NEWSIZE = (1<<6), /* need to update dev size in block layer */ - DEVFL_MAXBCNT = (1<<7), /* d->maxbcnt is not changeable */ - DEVFL_KICKME = (1<<8), BUFFL_FAIL = 1, }; @@ -97,17 +95,24 @@ enum { DEFAULTBCNT = 2 * 512, /* 2 sectors */ NPERSHELF = 16, /* number of slots per shelf address */ FREETAG = -1, - MIN_BUFS = 8, + MIN_BUFS = 16, + NTARGETS = 8, + NAOEIFS = 8, + + TIMERTICK = HZ / 10, + MINTIMER = HZ >> 2, + MAXTIMER = HZ << 1, + HELPWAIT = 20, }; struct buf { struct list_head bufs; - ulong start_time; /* for disk stats */ + ulong stime;/* for disk stats */ ulong flags; ulong nframesout; - char *bufaddr; ulong resid; ulong bv_resid; + ulong bv_off; sector_t sector; struct bio *bio; struct bio_vec *bv; @@ -123,19 +128,37 @@ struct frame { struct sk_buff *skb; }; +struct aoeif { + struct net_device *nd; + unsigned char lost; + unsigned char lostjumbo; + ushort maxbcnt; +}; + +struct aoetgt { + unsigned char addr[6]; + ushort nframes; + struct frame *frames; + struct aoeif ifs[NAOEIFS]; + struct aoeif *ifp; /* current aoeif in use */ + ushort nout; + ushort maxout; + u16 lasttag;/* last tag sent */ + u16 useme; + ulong lastwadj; /* last window adjustment */ +int wpkts, rpkts; +}; + struct aoedev { struct aoedev *next; - unsigned char addr[6]; /* remote mac addr */ - ushort flags; ulong sysminor; ulong aoemajor; - ulong aoeminor; + u16 aoeminor; + u16 flags; u16 nopen; /* (bd_openers isn't available without sleeping) */ - u16 lasttag;/* last tag sent */ u16 rttavg; /* round trip average of requests/responses */ u16 mintimer; u16 fw_ver; /* version of blade's firmware */ - u16 maxbcnt; struct work_struct work;/* disk create work struct */ struct gendisk *gd; request_queue_t blkq; @@ -143,15 +166,15 @@ struct aoedev { sector_t ssize; struct timer_list timer; spinlock_t lock; - struct net_device *ifp; /* interface ed is attached to */ struct sk_buff *sendq_hd; /* packets needing to be sent, list head */ struct sk_buff *sendq_tl; mempool_t *bufpool; /* for deadlock-free Buf allocation */ struct list_head bufq; /* queue of bios to work on */ struct buf *inprocess; /* the one we're currently working on */ - ushort lostjumbo; - ushort nframes; /* number of frames below */ - struct frame *frames; + struct aoetgt *targets[NTARGETS]; + struct aoetgt **tgt;/* target in use when working */ + struct aoetgt **htgt; /* target needing rexmit assistance */ +//int ios[64]; }; @@ -169,12 +192,13 @@ void aoecmd_cfg(ushort aoemajor, unsigned char aoeminor); void aoecmd_ata_rsp(struct sk_buff *); void aoecmd_cfg_rsp(struct sk_buff *); void aoecmd_sleepwork(struct work_struct *); -struct sk_buff *new_skb(ulong); +void aoecmd_cleanslate(struct aoedev *); +struct sk_buff *aoecmd_ata_id(struct aoedev *); int aoedev_init(void); void aoedev_exit(void); struct aoedev *aoedev_by_aoeaddr(int maj, int min); -struct aoedev *aoedev_by_sysminor_m(ulong sysminor, ulong bufcnt); +struct aoedev *aoedev_by_sysminor_m(ulong sysminor); void aoedev_downdev(struct aoedev *d); int aoedev_isbusy(struct aoedev *d); diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index 478489c..f6773ab 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -21,22 +21,55 @@ static ssize_t aoedisk_show_state(struct gendisk * disk, char *page) return snprintf(page, PAGE_SIZE,
[PATCH 01/12] bring driver version number to 47
These patches were made against kernel 2.6.22-rc4. Bring driver version number to 47. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 1d84668..2ce5ce9 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -1,5 +1,5 @@ /* Copyright (c) 2006 Coraid, Inc. See COPYING for GPL terms. */ -#define VERSION "32" +#define VERSION "47" #define AOE_MAJOR 152 #define DEVICE_NAME "aoe" -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 12/12] the aoeminor doesn't need a long format
The aoedev aoeminor member doesn't need a long format. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoeblk.c |6 +++--- drivers/block/aoe/aoecmd.c |4 ++-- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index ccadd9a..e216fe0 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -203,7 +203,7 @@ aoeblk_make_request(request_queue_t *q, struct bio *bio) spin_lock_irqsave(&d->lock, flags); if ((d->flags & DEVFL_UP) == 0) { - printk(KERN_INFO "aoe: device %ld.%ld is not up\n", + printk(KERN_INFO "aoe: device %ld.%d is not up\n", d->aoemajor, d->aoeminor); spin_unlock_irqrestore(&d->lock, flags); mempool_free(buf, d->bufpool); @@ -256,7 +256,7 @@ aoeblk_gdalloc(void *vp) gd = alloc_disk(AOE_PARTITIONS); if (gd == NULL) { - printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%ld\n", + printk(KERN_ERR "aoe: cannot allocate disk structure for %ld.%d\n", d->aoemajor, d->aoeminor); spin_lock_irqsave(&d->lock, flags); d->flags &= ~DEVFL_GDALLOC; @@ -266,7 +266,7 @@ aoeblk_gdalloc(void *vp) d->bufpool = mempool_create_slab_pool(MIN_BUFS, buf_pool_cache); if (d->bufpool == NULL) { - printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%ld\n", + printk(KERN_ERR "aoe: cannot allocate bufpool for %ld.%d\n", d->aoemajor, d->aoeminor); put_disk(gd); spin_lock_irqsave(&d->lock, flags); diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 9de0024..dfb1184 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -680,7 +680,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) } if (d->ssize != ssize) - printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n", + printk(KERN_INFO "aoe: %012llx e%ld.%d v%04x has %llu sectors\n", mac_addr(t->addr), d->aoemajor, d->aoeminor, d->fw_ver, (long long)ssize); @@ -805,7 +805,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */ printk(KERN_ERR - "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%ld\n", + "aoe: ata error cmd=%2.2Xh stat=%2.2Xh from e%ld.%d\n", ahout->cmdstat, ahin->cmdstat, d->aoemajor, d->aoeminor); if (buf) -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 09/12] remove race between use and initialization of locks
This change was originally submitted by Alexey Dobriyan in an email with ... Message-ID: <[EMAIL PROTECTED]> and the comment, Some drivers do register_chrdev() before lock or semaphore used in corresponding file_operations is initialized. Andrew Morton commented that these locks should be initialized at compile time, but Alexey Debriyan pointed out that the Documentation tells us to use dynamic initialization whenever possible, and then the discussion petered out. http://preview.tinyurl.com/2pxq6p I believe we made these locks dynamic because of the notice in Documentation/spinlocks.txt, which says that static initializers are deprecated: UPDATE March 21 2005 Amit Gud <[EMAIL PROTECTED]> Macros SPIN_LOCK_UNLOCKED and RW_LOCK_UNLOCKED are deprecated and will be removed soon. So for any new code dynamic initialization should be used: ... In any case, the patch below makes the code correct and in keeping with the existing documentation. If the existing docs are wrong, I'd be happy to follow up with a patch that corrects them and makes these aoechr.c locks static. Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]> Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 10b38a7..2b4f873 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -256,13 +256,13 @@ aoechr_init(void) { int n, i; + sema_init(&emsgs_sema, 0); + spin_lock_init(&emsgs_lock); n = register_chrdev(AOE_MAJOR, "aoechr", &aoe_fops); if (n < 0) { printk(KERN_ERR "aoe: can't register char device\n"); return n; } - sema_init(&emsgs_sema, 0); - spin_lock_init(&emsgs_lock); aoe_class = class_create(THIS_MODULE, "aoe"); if (IS_ERR(aoe_class)) { unregister_chrdev(AOE_MAJOR, "aoechr"); -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 11/12] remove extra space in prototypes for consistency
Remove extra space in prototypes for consistency. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoeblk.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index c9cf576..ccadd9a 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -14,7 +14,7 @@ static struct kmem_cache *buf_pool_cache; -static ssize_t aoedisk_show_state(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_state(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; @@ -25,7 +25,7 @@ static ssize_t aoedisk_show_state(struct gendisk * disk, char *page) (d->nopen && !(d->flags & DEVFL_UP)) ? ",closewait" : ""); /* I'd rather see nopen exported so we can ditch closewait */ } -static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_mac(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; struct aoetgt *t = d->targets[0]; @@ -34,7 +34,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page) return snprintf(page, PAGE_SIZE, "none\n"); return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr)); } -static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_netif(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; struct net_device *nds[8], **nd, **nnd, **ne; @@ -71,7 +71,7 @@ static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page) return p-page; } /* firmware version */ -static ssize_t aoedisk_show_fwver(struct gendisk * disk, char *page) +static ssize_t aoedisk_show_fwver(struct gendisk *disk, char *page) { struct aoedev *d = disk->private_data; -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 10/12] add module parameter for users who need more outstanding I/O
Add module parameter for users who need more outstanding I/O. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c | 10 +++--- 1 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 8d6540b..9de0024 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -17,6 +17,11 @@ static int aoe_deadsecs = 60 * 3; module_param(aoe_deadsecs, int, 0644); MODULE_PARM_DESC(aoe_deadsecs, "After aoe_deadsecs seconds, give up and fail dev."); +static int aoe_maxout = 16; +module_param(aoe_maxout, int, 0644); +MODULE_PARM_DESC(aoe_maxout, + "Only aoe_maxout outstanding packets for every MAC on eX.Y."); + static struct sk_buff * new_skb(ulong len) { @@ -967,7 +972,6 @@ aoecmd_cfg_rsp(struct sk_buff *skb) struct aoeif *ifp; ulong flags, sysminor, aoemajor; struct sk_buff *sl; - enum { MAXFRAMES = 16 }; u16 n; h = aoe_hdr(skb); @@ -992,8 +996,8 @@ aoecmd_cfg_rsp(struct sk_buff *skb) } n = be16_to_cpu(ch->bufcnt); - if (n > MAXFRAMES) /* keep it reasonable */ - n = MAXFRAMES; + if (n > aoe_maxout) /* keep it reasonable */ + n = aoe_maxout; d = aoedev_by_sysminor_m(sysminor); if (d == NULL) { -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 08/12] only schedule work once
Only schedule work once. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoecmd.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 89df9de..8d6540b 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -681,6 +681,8 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) d->fw_ver, (long long)ssize); d->ssize = ssize; d->geo.start = 0; + if (d->flags & (DEVFL_GDALLOC|DEVFL_NEWSIZE)) + return; if (d->gd != NULL) { d->gd->capacity = ssize; d->flags |= DEVFL_NEWSIZE; -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 06/12] user can ask driver to forget previously detected devices
When an AoE device is detected, the kernel is informed, and a new block device is created. If the device is unused, the block device corresponding to remote device that is no longer available may be removed from the system by telling the aoe driver to "flush" its list of devices. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- Documentation/aoe/mkdevs.sh |2 + Documentation/aoe/udev.txt |1 + drivers/block/aoe/aoe.h |1 + drivers/block/aoe/aoechr.c |5 ++ drivers/block/aoe/aoedev.c | 87 +- 5 files changed, 77 insertions(+), 19 deletions(-) diff --git a/Documentation/aoe/mkdevs.sh b/Documentation/aoe/mkdevs.sh index 97374aa..44c0ab7 100644 --- a/Documentation/aoe/mkdevs.sh +++ b/Documentation/aoe/mkdevs.sh @@ -29,6 +29,8 @@ rm -f $dir/interfaces mknod -m 0200 $dir/interfaces c $MAJOR 4 rm -f $dir/revalidate mknod -m 0200 $dir/revalidate c $MAJOR 5 +rm -f $dir/flush +mknod -m 0200 $dir/flush c $MAJOR 6 export n_partitions mkshelf=`echo $0 | sed 's!mkdevs!mkshelf!'` diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt index 17e76c4..8686e78 100644 --- a/Documentation/aoe/udev.txt +++ b/Documentation/aoe/udev.txt @@ -20,6 +20,7 @@ SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220 SUBSYSTEM=="aoe", KERNEL=="err", NAME="etherd/%k", GROUP="disk", MODE="0440" SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220" SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="flush", NAME="etherd/%k", GROUP="disk", MODE="0220" # aoe block devices KERNEL=="etherd*", NAME="%k", GROUP="disk" diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index f085465..7fa86dd 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -201,6 +201,7 @@ struct aoedev *aoedev_by_aoeaddr(int maj, int min); struct aoedev *aoedev_by_sysminor_m(ulong sysminor); void aoedev_downdev(struct aoedev *d); int aoedev_isbusy(struct aoedev *d); +int aoedev_flush(const char __user *str, size_t size); int aoenet_init(void); void aoenet_exit(void); diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 511f995..10b38a7 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -15,6 +15,7 @@ enum { MINOR_DISCOVER, MINOR_INTERFACES, MINOR_REVALIDATE, + MINOR_FLUSH, MSGSZ = 2048, NMSG = 100, /* message backlog to retain */ }; @@ -43,6 +44,7 @@ static struct aoe_chardev chardevs[] = { { MINOR_DISCOVER, "discover" }, { MINOR_INTERFACES, "interfaces" }, { MINOR_REVALIDATE, "revalidate" }, + { MINOR_FLUSH, "flush" }, }; static int @@ -155,6 +157,9 @@ aoechr_write(struct file *filp, const char __user *buf, size_t cnt, loff_t *offp break; case MINOR_REVALIDATE: ret = revalidate(buf, cnt); + break; + case MINOR_FLUSH: + ret = aoedev_flush(buf, cnt); } if (ret == 0) ret = cnt; diff --git a/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c index 04c75b4..8659796 100644 --- a/drivers/block/aoe/aoedev.c +++ b/drivers/block/aoe/aoedev.c @@ -9,6 +9,10 @@ #include #include "aoe.h" +static void dummy_timer(ulong); +static void aoedev_freedev(struct aoedev *); +static void freetgt(struct aoetgt *t); + static struct aoedev *devlist; static spinlock_t devlist_lock; @@ -108,6 +112,70 @@ aoedev_downdev(struct aoedev *d) d->flags &= ~DEVFL_UP; } +static void +aoedev_freedev(struct aoedev *d) +{ + struct aoetgt **t, **e; + + if (d->gd) { + aoedisk_rm_sysfs(d); + del_gendisk(d->gd); + put_disk(d->gd); + } + t = d->targets; + e = t + NTARGETS; + for (; tbufpool) + mempool_destroy(d->bufpool); + kfree(d); +} + +int +aoedev_flush(const char __user *str, size_t cnt) +{ + ulong flags; + struct aoedev *d, **dd; + struct aoedev *rmd = NULL; + char buf[16]; + int all = 0; + + if (cnt >= 3) { + if (cnt > sizeof buf) + cnt = sizeof buf; + if (copy_from_user(buf, str, cnt)) + return -EFAULT; + all = !strncmp(buf, "all", 3); + } + + flush_scheduled_work(); + spin_lock_irqsave(&devlist_lock, flags); +
[PATCH 07/12] use a dynamic pool of sk_buffs to keep up with fast targets
Use a dynamic pool of sk_buffs to keep up with fast targets. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h|5 ++ drivers/block/aoe/aoecmd.c | 129 +--- drivers/block/aoe/aoedev.c | 51 +++--- 3 files changed, 134 insertions(+), 51 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 7fa86dd..55c2f08 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -98,6 +98,7 @@ enum { MIN_BUFS = 16, NTARGETS = 8, NAOEIFS = 8, + NSKBPOOLMAX = 128, TIMERTICK = HZ / 10, MINTIMER = HZ >> 2, @@ -147,6 +148,7 @@ struct aoetgt { u16 useme; ulong lastwadj; /* last window adjustment */ int wpkts, rpkts; +int dataref; }; struct aoedev { @@ -168,6 +170,9 @@ struct aoedev { spinlock_t lock; struct sk_buff *sendq_hd; /* packets needing to be sent, list head */ struct sk_buff *sendq_tl; + struct sk_buff *skbpool_hd; + struct sk_buff *skbpool_tl; + int nskbpool; mempool_t *bufpool; /* for deadlock-free Buf allocation */ struct list_head bufq; /* queue of bios to work on */ struct buf *inprocess; /* the one we're currently working on */ diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 62ba58c..89df9de 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -105,43 +105,102 @@ ifrotate(struct aoetgt *t) } } +static void +skb_pool_put(struct aoedev *d, struct sk_buff *skb) +{ + if (!d->skbpool_hd) + d->skbpool_hd = skb; + else + d->skbpool_tl->next = skb; + d->skbpool_tl = skb; +} + +static struct sk_buff * +skb_pool_get(struct aoedev *d) +{ + struct sk_buff *skb; + + skb = d->skbpool_hd; + if (skb) + if (atomic_read(&skb_shinfo(skb)->dataref) == 1) { + d->skbpool_hd = skb->next; + skb->next = NULL; + return skb; + } + if (d->nskbpool < NSKBPOOLMAX) + if ((skb = new_skb(ETH_ZLEN))) { + d->nskbpool++; + return skb; + } + return NULL; +} + +/* freeframe is where we do our load balancing so it's a little hairy. */ static struct frame * freeframe(struct aoedev *d) { - struct frame *f, *e; + struct frame *f, *e, *rf; struct aoetgt **t; - ulong n; + struct sk_buff *skb; if (d->targets[0] == NULL) {/* shouldn't happen, but I'm paranoid */ printk(KERN_ERR "aoe: NULL TARGETS!\n"); return NULL; } - t = d->targets; - do { + t = d->tgt; + t++; + if (t >= &d->targets[NTARGETS] || !*t) + t = d->targets; + for (;;) { + if ((*t)->nout < (*t)->maxout) if (t != d->htgt) - if ((*t)->ifp->nd) - if ((*t)->nout < (*t)->maxout) { - n = (*t)->nframes; + if ((*t)->ifp->nd) { + rf = NULL; f = (*t)->frames; - e = f + n; + e = f + (*t)->nframes; for (; ftag != FREETAG) continue; - if (atomic_read(&skb_shinfo(f->skb)->dataref) != 1) { - n--; + skb = f->skb; + if (!skb) + if (!(f->skb = skb = new_skb(ETH_ZLEN))) + continue; + if (atomic_read(&skb_shinfo(skb)->dataref) != 1) { + if (!rf) + rf = f; continue; } - skb_shinfo(f->skb)->nr_frags = f->skb->data_len = 0; - skb_trim(f->skb, 0); +gotone:skb_shinfo(skb)->nr_frags = skb->data_len = 0; + skb_trim(skb, 0); d->tgt = t; ifrotate(*t); return f; } - if (n == 0) /* slow polling network card */ + /* Work can be done, but the network layer is + holding our precious packets. Try to grab + one from the pool. */ + f = rf; + if (f == NULL) {/* more paranoia
[PATCH 03/12] mac_addr: avoid 64-bit arch compiler warnings
By returning unsigned long long, mac_addr does not generate compiler warnings on 64-bit architectures. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoe.h|2 +- drivers/block/aoe/aoeblk.c |3 +-- drivers/block/aoe/aoecmd.c | 10 +- drivers/block/aoe/aoenet.c |4 ++-- 4 files changed, 9 insertions(+), 10 deletions(-) diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h index 069f04c..f085465 100644 --- a/drivers/block/aoe/aoe.h +++ b/drivers/block/aoe/aoe.h @@ -208,4 +208,4 @@ void aoenet_xmit(struct sk_buff *); int is_aoe_netif(struct net_device *ifp); int set_aoe_iflist(const char __user *str, size_t size); -u64 mac_addr(char addr[6]); +unsigned long long mac_addr(char addr[6]); diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c index f6773ab..c9cf576 100644 --- a/drivers/block/aoe/aoeblk.c +++ b/drivers/block/aoe/aoeblk.c @@ -32,8 +32,7 @@ static ssize_t aoedisk_show_mac(struct gendisk * disk, char *page) if (t == NULL) return snprintf(page, PAGE_SIZE, "none\n"); - return snprintf(page, PAGE_SIZE, "%012llx\n", - (unsigned long long)mac_addr(t->addr)); + return snprintf(page, PAGE_SIZE, "%012llx\n", mac_addr(t->addr)); } static ssize_t aoedisk_show_netif(struct gendisk * disk, char *page) { diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c index 8a3a973..62ba58c 100644 --- a/drivers/block/aoe/aoecmd.c +++ b/drivers/block/aoe/aoecmd.c @@ -305,7 +305,8 @@ resend(struct aoedev *d, struct aoetgt *t, struct frame *f) snprintf(buf, sizeof buf, "%15s e%ld.%d [EMAIL PROTECTED] newtag=%08x s=%012llx d=%012llx nout=%d\n", "retransmit", d->aoemajor, d->aoeminor, f->tag, jiffies, n, - mac_addr(h->src), mac_addr(h->dst), t->nout); + mac_addr(h->src), + mac_addr(h->dst), t->nout); aoechr_error(buf); f->tag = n; @@ -616,7 +617,7 @@ ataid_complete(struct aoedev *d, struct aoetgt *t, unsigned char *id) if (d->ssize != ssize) printk(KERN_INFO "aoe: %012llx e%lu.%lu v%04x has %llu sectors\n", - (unsigned long long)mac_addr(t->addr), + mac_addr(t->addr), d->aoemajor, d->aoeminor, d->fw_ver, (long long)ssize); d->ssize = ssize; @@ -710,8 +711,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) t = gettgt(d, hin->src); if (t == NULL) { printk(KERN_INFO "aoe: can't find target e%ld.%d:%012llx\n", - d->aoemajor, d->aoeminor, - (unsigned long long) mac_addr(hin->src)); + d->aoemajor, d->aoeminor, mac_addr(hin->src)); spin_unlock_irqrestore(&d->lock, flags); return; } @@ -988,7 +988,7 @@ aoecmd_cfg_rsp(struct sk_buff *skb) printk(KERN_INFO "aoe: e%ld.%d: setting %d byte data frames on %s:%012llx\n", d->aoemajor, d->aoeminor, n, ifp->nd->name, - (unsigned long long) mac_addr(t->addr)); + mac_addr(t->addr)); ifp->maxbcnt = n; } } diff --git a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c index f335099..d2c1a47 100644 --- a/drivers/block/aoe/aoenet.c +++ b/drivers/block/aoe/aoenet.c @@ -82,7 +82,7 @@ set_aoe_iflist(const char __user *user_str, size_t size) return 0; } -u64 +unsigned long long mac_addr(char addr[6]) { __be64 n = 0; @@ -90,7 +90,7 @@ mac_addr(char addr[6]) memcpy(p + 2, addr, 6); /* (sizeof addr != 6) */ - return __be64_to_cpu(n); + return (unsigned long long) __be64_to_cpu(n); } void -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 04/12] clean up udev configuration example
This patch adds a known default location for the udev configuration file and uses the more recent "==" syntax for SUBSYSTEM and KERNEL. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- Documentation/aoe/udev-install.sh |5 - Documentation/aoe/udev.txt| 15 --- 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh index 6449911..15e86f5 100644 --- a/Documentation/aoe/udev-install.sh +++ b/Documentation/aoe/udev-install.sh @@ -23,7 +23,10 @@ fi # /etc/udev/rules.d # rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`" -if test -z "$rules_d" || test ! -d "$rules_d"; then +if test -z "$rules_d" ; then + rules_d=/etc/udev/rules.d +fi +if test ! -d "$rules_d"; then echo "$me Error: cannot find udev rules directory" 1>&2 exit 1 fi diff --git a/Documentation/aoe/udev.txt b/Documentation/aoe/udev.txt index a7ed1dc..17e76c4 100644 --- a/Documentation/aoe/udev.txt +++ b/Documentation/aoe/udev.txt @@ -1,6 +1,7 @@ # These rules tell udev what device nodes to create for aoe support. -# They may be installed along the following lines (adjusted to what -# you see on your system). +# They may be installed along the following lines. Check the section +# 8 udev manpage to see whether your udev supports SUBSYSTEM, and +# whether it uses one or two equal signs for SUBSYSTEM and KERNEL. # # [EMAIL PROTECTED] ~$ su # Password: @@ -15,10 +16,10 @@ # # aoe char devices -SUBSYSTEM="aoe", KERNEL="discover",NAME="etherd/%k", GROUP="disk", MODE="0220" -SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440" -SUBSYSTEM="aoe", KERNEL="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220" -SUBSYSTEM="aoe", KERNEL="revalidate", NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="discover", NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="err", NAME="etherd/%k", GROUP="disk", MODE="0440" +SUBSYSTEM=="aoe", KERNEL=="interfaces",NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM=="aoe", KERNEL=="revalidate",NAME="etherd/%k", GROUP="disk", MODE="0220" # aoe block devices -KERNEL="etherd*", NAME="%k", GROUP="disk" +KERNEL=="etherd*", NAME="%k", GROUP="disk" -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 05/12] eliminate goto and improve readability
Adam Richter suggested eliminating this goto. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- drivers/block/aoe/aoechr.c | 69 +-- 1 files changed, 34 insertions(+), 35 deletions(-) diff --git a/drivers/block/aoe/aoechr.c b/drivers/block/aoe/aoechr.c index 9026c44..511f995 100644 --- a/drivers/block/aoe/aoechr.c +++ b/drivers/block/aoe/aoechr.c @@ -191,52 +191,51 @@ aoechr_read(struct file *filp, char __user *buf, size_t cnt, loff_t *off) ulong flags; n = (unsigned long) filp->private_data; - switch (n) { - case MINOR_ERR: - spin_lock_irqsave(&emsgs_lock, flags); -loop: - em = emsgs + emsgs_head_idx; - if ((em->flags & EMFL_VALID) == 0) { - if (filp->f_flags & O_NDELAY) { - spin_unlock_irqrestore(&emsgs_lock, flags); - return -EAGAIN; - } - nblocked_emsgs_readers++; + if (n != MINOR_ERR) + return -EFAULT; + + spin_lock_irqsave(&emsgs_lock, flags); + for (;;) { + em = emsgs + emsgs_head_idx; + if ((em->flags & EMFL_VALID) != 0) + break; + if (filp->f_flags & O_NDELAY) { spin_unlock_irqrestore(&emsgs_lock, flags); + return -EAGAIN; + } + nblocked_emsgs_readers++; + + spin_unlock_irqrestore(&emsgs_lock, flags); - n = down_interruptible(&emsgs_sema); + n = down_interruptible(&emsgs_sema); - spin_lock_irqsave(&emsgs_lock, flags); + spin_lock_irqsave(&emsgs_lock, flags); - nblocked_emsgs_readers--; + nblocked_emsgs_readers--; - if (n) { - spin_unlock_irqrestore(&emsgs_lock, flags); - return -ERESTARTSYS; - } - goto loop; - } - if (em->len > cnt) { + if (n) { spin_unlock_irqrestore(&emsgs_lock, flags); - return -EAGAIN; + return -ERESTARTSYS; } - mp = em->msg; - len = em->len; - em->msg = NULL; - em->flags &= ~EMFL_VALID; + } + if (em->len > cnt) { + spin_unlock_irqrestore(&emsgs_lock, flags); + return -EAGAIN; + } + mp = em->msg; + len = em->len; + em->msg = NULL; + em->flags &= ~EMFL_VALID; - emsgs_head_idx++; - emsgs_head_idx %= ARRAY_SIZE(emsgs); + emsgs_head_idx++; + emsgs_head_idx %= ARRAY_SIZE(emsgs); - spin_unlock_irqrestore(&emsgs_lock, flags); + spin_unlock_irqrestore(&emsgs_lock, flags); - n = copy_to_user(buf, mp, len); - kfree(mp); - return n == 0 ? len : -EFAULT; - default: - return -EFAULT; - } + n = copy_to_user(buf, mp, len); + kfree(mp); + return n == 0 ? len : -EFAULT; } static const struct file_operations aoe_fops = { -- 1.5.2.1 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
PATCH 2.6.21-rc1 aoe: handle zero _count pages in bios
This patch works around a problem discussed here and on the XFS mailing list in January. http://lkml.org/lkml/2007/1/19/56 To summarize the issue: If XFS (or any other creator of bios) gives the aoe driver a bio with pages that have a zero page _count, and then the aoe driver hands the page to the network layer in an sk_buff's frags, and if the network card does not support the scatter gather feature, then the network layer will eventually try to put_page on the page, and the kernel winds up panicing. There is a disconnect between the assumptions of the bio creator (that pages don't need to have a non-zero _count), and the assumptions of the network layer (where it's assumed that pages will always have a positive count). There was no response, though, to a call in January for ideas about resolving the disconnect. So to work around the issue, the simple patch below increments the page _count before handing it to the network layer and decrements it after the network layer is done with the page. This patch eliminates panics for XFS on aoe users who lack scatter gather support in their network cards. It's regrettable that _count is manipulated directly, because Andrew Morton changed the page "count" member to a _count to prevent exactly this kind of direct manipulation of the data. There does not appear to be a "right" way to increment and decrement the count, however, inside a driver without unwanted side effects. The closest candidates are in mm/internal.h and are presumably intended to be used exclusively by mm/*.c. Signed-off-by: "Ed L. Cashin" <[EMAIL PROTECTED]> diff -upr -X linux-2.6.21-rc1.dontdiff linux-2.6.21-rc1.orig/drivers/block/aoe/aoe.h linux-2.6.21-rc1/drivers/block/aoe/aoe.h --- linux-2.6.21-rc1.orig/drivers/block/aoe/aoe.h 2007-02-27 14:11:06.249132000 -0500 +++ linux-2.6.21-rc1/drivers/block/aoe/aoe.h2007-02-27 17:43:22.037069000 -0500 @@ -150,6 +150,7 @@ int aoeblk_init(void); void aoeblk_exit(void); void aoeblk_gdalloc(void *); void aoedisk_rm_sysfs(struct aoedev *d); +void aoe_bio_done(struct bio *bio, unsigned int bytes_done, int error); int aoechr_init(void); void aoechr_exit(void); diff -upr -X linux-2.6.21-rc1.dontdiff linux-2.6.21-rc1.orig/drivers/block/aoe/aoeblk.c linux-2.6.21-rc1/drivers/block/aoe/aoeblk.c --- linux-2.6.21-rc1.orig/drivers/block/aoe/aoeblk.c2007-02-27 14:11:06.253132000 -0500 +++ linux-2.6.21-rc1/drivers/block/aoe/aoeblk.c 2007-02-27 17:43:22.037069000 -0500 @@ -14,6 +14,29 @@ static struct kmem_cache *buf_pool_cache; +/* workaround for XFS and bios with zero pageref pages in general */ +void +aoe_bio_done(struct bio *bio, unsigned int bytes_done, int error) +{ + struct bio_vec *bv; + int i; + + bio_for_each_segment(bv, bio, i) + atomic_dec(&bv->bv_page->_count); + + bio_endio(bio, bytes_done, error); +} + +static void +bio_refpages(struct bio *bio) +{ + struct bio_vec *bv; + int i; + + bio_for_each_segment(bv, bio, i) + atomic_inc(&bv->bv_page->_count); +} + static ssize_t aoedisk_show_state(struct gendisk * disk, char *page) { struct aoedev *d = disk->private_data; @@ -147,6 +170,7 @@ aoeblk_make_request(request_queue_t *q, buf->bio = bio; buf->resid = bio->bi_size; buf->sector = bio->bi_sector; + bio_refpages(bio); buf->bv = &bio->bi_io_vec[bio->bi_idx]; WARN_ON(buf->bv->bv_len == 0); buf->bv_resid = buf->bv->bv_len; @@ -159,7 +183,7 @@ aoeblk_make_request(request_queue_t *q, d->aoemajor, d->aoeminor); spin_unlock_irqrestore(&d->lock, flags); mempool_free(buf, d->bufpool); - bio_endio(bio, bio->bi_size, -ENXIO); + aoe_bio_done(bio, bio->bi_size, -ENXIO); return 0; } diff -upr -X linux-2.6.21-rc1.dontdiff linux-2.6.21-rc1.orig/drivers/block/aoe/aoecmd.c linux-2.6.21-rc1/drivers/block/aoe/aoecmd.c --- linux-2.6.21-rc1.orig/drivers/block/aoe/aoecmd.c2007-02-27 14:11:06.253132000 -0500 +++ linux-2.6.21-rc1/drivers/block/aoe/aoecmd.c 2007-02-27 17:43:22.037069000 -0500 @@ -649,7 +649,7 @@ aoecmd_ata_rsp(struct sk_buff *skb) disk_stat_add(disk, sectors[rw], n_sect); disk_stat_add(disk, io_ticks, duration); n = (buf->flags & BUFFL_FAIL) ? -EIO : 0; - bio_endio(buf->bio, buf->bio->bi_size, n); + aoe_bio_done(buf->bio, buf->bio->bi_size, n); mempool_free(buf, d->bufpool); } } diff -upr -X linux-2.6.21-rc1.dontdiff linux-2.6.21-rc1.orig/drivers/block/aoe/aoedev.c linux-2.6.21-rc1/drivers/block/aoe/aoedev.c --- linux-2.6.21-rc1.o
Re: Re: bio pages with zero page reference count
On Mon, Dec 18, 2006 at 10:53:43PM +, Christoph Hellwig wrote: > On Mon, Dec 18, 2006 at 05:21:09PM -0500, Ed L. Cashin wrote: ... > > If anyone has a better reference, I'd like to see it. > > I searched around a little bit and found these: > > > http://groups.google.at/group/open-iscsi/browse_frm/thread/17fbe253cf1f69dd/f26cf19b0fee9147?tvc=1&q=kmalloc+iscsi+%22christoph+hellwig%22&hl=de#f26cf19b0fee9147 > http://www.ussg.iu.edu/hypermail/linux/kernel/0408.3/0061.html > > But that's not the conclusion I was looking for. So it sounds like you've been advocating a general discussion of this issue for a few years now. To summarize the issue: 1) users of the block layer assume that it's fine to associate pages that have a zero reference count with a bio before requesting I/O, 2) intermediaries like iscsi, aoe, and drbd, associate the pages with the frags of skbuffs, but 3) when the network layer has to linearize the skbuff for a network device that doesn't support scatter gather, it winds up doing a get_page and put_page on each page in the frags, despite the fact that the page reference count on each may already be zero. The network layer is assuming that it's OK to do use these operations on any page in the frags. Maybe the discussion is slow to start because too many parts of the kernel are involved. Here are a couple of specific questions. Maybe they'll help get the ball rolling. 1) What are the disadvantages of making the network layer *not* to assume it's correct to use get/put_page on the frags when it linearizes an sk_buff? For example, the network layer could omit the get/put_page when the page reference count is zero. 2) What are the disadvantages of having one part of the kernel (e.g., XFS) reference a page before handing it off to another part of the kernel, e.g., in a bio? This change would require multiple parts of the kernel to change behavior, but it seems conceptually cleaner, since the reference count would reflect the reality that the page does have an owner (XFS or whoever). I don't know how practical the implementation would be. 3) It seems messy to handle this is in each of the individual intermediary drivers that sit between the block and network layers, but if that really is the place to do it, then is there a problem with simply incrementing the page reference counts upon getting a bio from the block layer, and later decrementing them before giving them back with bio_endio? bio_for_each_segment(bv, bio, i) atomic_inc(&bv->bv_page->_count); ... [and later] bio_for_each_segment(bv, bio, i) atomic_dec(&bv->bv_page->_count); bio_endio(bio, bytes_done, error); That seems to eliminate problems aoe users have with XFS on AoE devices that are accessible via network devices that don't support scatter gather, but is it the right fix? Andrew Morton changed "count" to "_count" to stop folks from directly manipulating the page struct member, but I don't see any get/put_page type operations that fit what the aoe driver has to do in this case. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
bio pages with zero page reference count
(This email is a followup to "Re: [PATCH 2.6.19.1] fix aoe without scatter-gather [Bug 7662]".) On Mon, Dec 18, 2006 at 12:53:00PM -0500, Ed L. Cashin wrote: ... > This patch eliminates the offset data on cards that don't support > scatter-gather or have had scatter-gather turned off. There remains > an unrelated issue that I'll address in a separate email. After fixing the problem with the skb headers, we noticed that there were still problems when scatter gather wasn't in use. XFS was giving us bios that had pages with a reference count of zero. The aoe driver sets up the skb with the frags pointing to the pages, and when scatter gather isn't supported and __pskb_pull_tail gets involved, put_page is called after the data is copied from the pages. That causes problems because of the zero page reference count. It seems like it would always be incorrect for one part of the kernel to give pages with a zero reference count to another part of the kernel, so this seems like a bug in XFS. Christoph Hellwig, though, points out, > It's a kmalloced page. The same can happen with ext3 aswell, but > only when doing log recovery. The last time this came up (vs > iscsi) the conclusion was that the driver needs to handle this > case. In attempting to find the conversation he was referencing, I only found this: Subject: tcp_sendpage and page allocation lifetime vs. iscsi Date: 2005-04-25 17:02:59 GMT http://article.gmane.org/gmane.linux.kernel/298377 If anyone has a better reference, I'd like to see it. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.19.1] fix aoe without scatter-gather [Bug 7662]
The patch below fixes a bug that only appears when AoE goes over a network card that does not support scatter-gather. The headers in the linear part of the skb appeared to be larger than they really were, resulting in data that was offset by 24 bytes. This patch eliminates the offset data on cards that don't support scatter-gather or have had scatter-gather turned off. There remains an unrelated issue that I'll address in a separate email. Signed-off-by: "Ed L. Cashin" <[EMAIL PROTECTED]> diff -uprN linux-2.6.19.orig/drivers/block/aoe/aoecmd.c linux-2.6.19.mod/drivers/block/aoe/aoecmd.c --- linux-2.6.19.orig/drivers/block/aoe/aoecmd.c2006-12-11 18:15:42.322711000 -0500 +++ linux-2.6.19.mod/drivers/block/aoe/aoecmd.c 2006-12-12 17:12:59.307200500 -0500 @@ -30,8 +30,6 @@ new_skb(ulong len) skb->nh.raw = skb->mac.raw = skb->data; skb->protocol = __constant_htons(ETH_P_AOE); skb->priority = 0; - skb_put(skb, len); - memset(skb->head, 0, len); skb->next = skb->prev = NULL; /* tell the network layer not to perform IP checksums @@ -122,8 +120,8 @@ aoecmd_ata_rw(struct aoedev *d, struct f skb = f->skb; h = (struct aoe_hdr *) skb->mac.raw; ah = (struct aoe_atahdr *) (h+1); - skb->len = sizeof *h + sizeof *ah; - memset(h, 0, ETH_ZLEN); + skb_put(skb, sizeof *h + sizeof *ah); + memset(h, 0, skb->len); f->tag = aoehdr_atainit(d, h); f->waited = 0; f->buf = buf; @@ -149,7 +147,6 @@ aoecmd_ata_rw(struct aoedev *d, struct f skb->len += bcnt; skb->data_len = bcnt; } else { - skb->len = ETH_ZLEN; writebit = 0; } @@ -206,6 +203,7 @@ aoecmd_cfg_pkts(ushort aoemajor, unsigne printk(KERN_INFO "aoe: skb alloc failure\n"); continue; } + skb_put(skb, sizeof *h + sizeof *ch); skb->dev = ifp; if (sl_tail == NULL) sl_tail = skb; @@ -243,6 +241,7 @@ freeframe(struct aoedev *d) continue; if (atomic_read(&skb_shinfo(f->skb)->dataref) == 1) { skb_shinfo(f->skb)->nr_frags = f->skb->data_len = 0; + skb_trim(f->skb, 0); return f; } n++; @@ -698,8 +697,8 @@ aoecmd_ata_id(struct aoedev *d) skb = f->skb; h = (struct aoe_hdr *) skb->mac.raw; ah = (struct aoe_atahdr *) (h+1); - skb->len = ETH_ZLEN; - memset(h, 0, ETH_ZLEN); + skb_put(skb, sizeof *h + sizeof *ah); + memset(h, 0, skb->len); f->tag = aoehdr_atainit(d, h); f->waited = 0; -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] aoe: Add forgotten NULL at end of attribute list in aoeblk.c
On Mon, Nov 13, 2006 at 09:15:20AM +0100, Dennis Stosberg wrote: > This caused the system to stall when the aoe module was loaded. The > error was introduced in commit 4ca5224f3ea4779054d96e885ca9b3980801ce13 Boy, I've been totally spoiled by the gitweb at kernel.org. It's been unavailable lately. Anyway, thanks for the fix. It looks good to me if it looks good to Greg---after a google search it appears that your patch is a followup to his consolidation of the attributes. > Signed-off-by: Dennis Stosberg <[EMAIL PROTECTED]> > --- > > The log of the caused error can be found at > http://stosberg.net/tmp/aoe_trace.txt > > drivers/block/aoe/aoeblk.c |1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c > index d433f27..aa25f8b 100644 > --- a/drivers/block/aoe/aoeblk.c > +++ b/drivers/block/aoe/aoeblk.c > @@ -68,6 +68,7 @@ static struct attribute *aoe_attrs[] = { > &disk_attr_mac.attr, > &disk_attr_netif.attr, > &disk_attr_fwver.attr, > + NULL > }; > > static const struct attribute_group attr_group = { > -- > 1.4.3.3 -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: aoe fails on sparc64
Ed L Cashin <[EMAIL PROTECTED]> writes: ... > Let's take this discussion off the lkml, because I doubt there's a > problem with the aoe driver in the kernel, and I can easily follow up > to the lkml with a synopsis if it turns out I'm wrong. It looks like I was probably wrong. I need to do some debugging, but the only sparc64 machine here at hand is in use. If anybody would be up for running 2.6.13 on a sparc64 host and running tests with a patched aoe driver, please let me know. A test would look something like this, using an x86 host and a sparc64 host on the same LAN. x86$ dd if=/dev/zero of=/tmp/0x1234567 bs=1k count=1 seek=19088742 x86$ vblade 0 1 eth0 /tmp/0x1234567 sparc64$ rmmod aoe sparc64$ cd ~/linux-2.6.13 sparc64$ patch -p1 < aoe.diff sparc64$ make && make modules_install sparc64$ modprobe aoe I'd email you patches, and you'd email me the printk messages that show up in the logs. Such help would be much appreciated. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: aoe fails on sparc64
Jim MacBaine <[EMAIL PROTECTED]> writes: > On 9/1/05, Ed L Cashin <[EMAIL PROTECTED]> wrote: > >> The aoe driver looks OK, but it turns out there's a byte swapping bug >> in the vblade that could be related if he's running the vblade on a >> big endian host (even though he said it was an x86 host), but I >> haven't heard back from the original poster yet. > > It is in fact a x86_64 kernel, but with a mostly x86 userland. Vblade > is pure x86 code. > >> The vblade bug was the omission of swapping the bytes in each short. >> The fix below shows what I mean: > > Unfortunately it doesn't fix anything here. The client still reports > the same wrong size as before. The dmesg output is identical, too. Let's take this discussion off the lkml, because I doubt there's a problem with the aoe driver in the kernel, and I can easily follow up to the lkml with a synopsis if it turns out I'm wrong. Jim MacBaine, I'm going to ask for more details in a separate email. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: aoe fails on sparc64
"David S. Miller" <[EMAIL PROTECTED]> writes: > From: Ed L Cashin <[EMAIL PROTECTED]> ... >> OK. 67553994410557440 is 61440 byte swapped in 64 bits, and 30MB is >> 61440 sectors, so this should be a simple byte order fix. > > More strangely, the upper and lower 32-bit words are swapped. > The bytes within each 32-bit word are swapped correctly. > > So the calculation maybe should be something like: > > __le32 *p = (__le32 *) &id[100 << 1]; > u32 high32 = le32_to_cpup(p); > u32 low32 = le32_to_cpup(p + 1); > > ssize = (((u64)high32 << 32) | (u64) low32); > > But that doesn't make any sense, and even ide_fix_driveid() in > drivers/ide/ide-iops.c does a le64_to_cpu() for this value: > > id->lba_capacity_2 = __le64_to_cpu(id->lba_capacity_2); > > I wonder if this is some artifact of how AOE devices encode > this field when sending it to the client. Well, an EtherDrive blade just copies the ATA identify response data into a network packet without looking at it. The vblade, though, has to set the lba_capacity and lba_capacity_2 fields itself. The aoe driver looks OK, but it turns out there's a byte swapping bug in the vblade that could be related if he's running the vblade on a big endian host (even though he said it was an x86 host), but I haven't heard back from the original poster yet. The vblade bug was the omission of swapping the bytes in each short. The fix below shows what I mean: diff -urNp a-exp/ata.c b-exp/ata.c --- a-exp/ata.c 2005-09-01 10:19:11.0 -0400 +++ b-exp/ata.c 2005-09-01 10:19:12.0 -0400 @@ -55,24 +55,29 @@ setfld(ushort *a, int idx, int len, char } static void -setlba28(ushort *p, vlong lba) +setlba28(ushort *ident, vlong lba) { - p += 60; - *p++ = lba & 0x; - *p = lba >> 16 & 0x0fff; + uchar *cp; + + cp = (uchar *) &ident[60]; + *cp++ = lba; + *cp++ = lba >>= 8; + *cp++ = lba >>= 8; + *cp++ = (lba >>= 8) & 0xf; } static void -setlba48(ushort *p, vlong lba) +setlba48(ushort *ident, vlong lba) { - p += 100; - *p++ = lba; - lba >>= 16; - *p++ = lba; - lba >>= 16; - *p++ = lba; - lba >>= 16; - *p = lba; + uchar *cp; + + cp = (uchar *) &ident[100]; + *cp++ = lba; + *cp++ = lba >>= 8; + *cp++ = lba >>= 8; + *cp++ = lba >>= 8; + *cp++ = lba >>= 8; + *cp++ = lba >>= 8; } void -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: aoe fails on sparc64
Jim MacBaine <[EMAIL PROTECTED]> writes: > Hello, > > Using aoe on a sparc64 system gives strange results: > ... > The log says: > > Aug 31 15:18:49 sunny kernel: devfs_mk_dir: invalid argument.<6> > etherd/e0.0: unknown partition table > Aug 31 15:18:49 sunny kernel: aoe: 0011d8xx e0.0 v4000 has > 67553994410557440 > sectors OK. 67553994410557440 is 61440 byte swapped in 64 bits, and 30MB is 61440 sectors, so this should be a simple byte order fix. > The system is an Sun Ultra 5, running 2.6.12.5/sparc64 compiled with > gcc-3.4.2. e0.0 is exported on a x86 system using vblade-5, and has a > size of 30 MB. Thanks for the report. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.13-rc6] aoe [2/2]: update driver version number to twelve
Update driver version number to twelve. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> Index: 2.6.13-rc6-aoe/drivers/block/aoe/aoe.h === --- 2.6.13-rc6-aoe.orig/drivers/block/aoe/aoe.h 2005-08-19 11:57:04.0 -0400 +++ 2.6.13-rc6-aoe/drivers/block/aoe/aoe.h 2005-08-19 11:57:05.0 -0400 @@ -1,5 +1,5 @@ /* Copyright (c) 2004 Coraid, Inc. See COPYING for GPL terms. */ -#define VERSION "10" +#define VERSION "12" #define AOE_MAJOR 152 #define DEVICE_NAME "aoe" -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.13-rc6] aoe [1/2]: support 16 AoE slot addresses per AoE shelf
Change the number of supported AoE slot addresses per AoE shelf address to 16. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> Index: 2.6.13-rc6-aoe/Documentation/aoe/mkshelf.sh === --- 2.6.13-rc6-aoe.orig/Documentation/aoe/mkshelf.sh2005-08-19 11:15:21.0 -0400 +++ 2.6.13-rc6-aoe/Documentation/aoe/mkshelf.sh 2005-08-19 11:57:04.0 -0400 @@ -8,13 +8,15 @@ n_partitions=${n_partitions:-16} dir=$1 shelf=$2 +nslots=16 +maxslot=`echo $nslots 1 - p | dc` MAJOR=152 set -e -minor=`echo 10 \* $shelf \* $n_partitions | bc` +minor=`echo $nslots \* $shelf \* $n_partitions | bc` endp=`echo $n_partitions - 1 | bc` -for slot in `seq 0 9`; do +for slot in `seq 0 $maxslot`; do for part in `seq 0 $endp`; do name=e$shelf.$slot test "$part" != "0" && name=${name}p$part Index: 2.6.13-rc6-aoe/drivers/block/aoe/aoe.h === --- 2.6.13-rc6-aoe.orig/drivers/block/aoe/aoe.h 2005-08-19 11:15:21.0 -0400 +++ 2.6.13-rc6-aoe/drivers/block/aoe/aoe.h 2005-08-19 11:57:04.0 -0400 @@ -7,12 +7,12 @@ * default is 16, which is 15 partitions plus the whole disk */ #ifndef AOE_PARTITIONS -#define AOE_PARTITIONS 16 +#define AOE_PARTITIONS (16) #endif -#define SYSMINOR(aoemajor, aoeminor) ((aoemajor) * 10 + (aoeminor)) -#define AOEMAJOR(sysminor) ((sysminor) / 10) -#define AOEMINOR(sysminor) ((sysminor) % 10) +#define SYSMINOR(aoemajor, aoeminor) ((aoemajor) * NPERSHELF + (aoeminor)) +#define AOEMAJOR(sysminor) ((sysminor) / NPERSHELF) +#define AOEMINOR(sysminor) ((sysminor) % NPERSHELF) #define WHITESPACE " \t\v\f\n" enum { @@ -83,7 +83,7 @@ enum { MAXATADATA = 1024, - NPERSHELF = 10, + NPERSHELF = 16, /* number of slots per shelf address */ FREETAG = -1, MIN_BUFS = 8, }; -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
/sys/module (was Re: [PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration)
Greg KH <[EMAIL PROTECTED]> writes: ... > It's not for things other than modules, it's filling a real need that > you yourself just pointed out. Namely, we need to be able to have > access to module paramaters in a consistant place, no matter if the > driver is built into the kernel or not. > > Man, you try to be nice to people... :) It wasn't a complaint --- like I said, I'm pleased! I just wanted to serve as a datum: one guy was surprised. I wanted to put the interfaces list configuration into sysfs, but I didn't really know how or where to put it, so I procrastinated. Then I created the module parameter and was pleased to see it show up in sysfs. I had read about module parameters before, but I had forgotten about that feature, or maybe it's newer than the docs I read. Thanks for working so hard to make sysfs useful. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration
Greg KH <[EMAIL PROTECTED]> writes: > On Thu, Apr 21, 2005 at 09:36:17AM -0400, Ed L Cashin wrote: >> "Bodo Eggert <[EMAIL PROTECTED]>" <[EMAIL PROTECTED]> writes: >> >> > Ed L Cashin <[EMAIL PROTECTED]> wrote: >> > >> >> +++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400 >> > >> >> + When the aoe driver is a module, use >> > >> > Is there any reason for this inconsistent behaviour? >> >> Yes, the /sys/module/aoe area is only present when the aoe driver is a >> module. > > Not true, have you looked in /sys/module lately? :) > >> It would be nicer if there were a sysfs area where I could >> put this file regardless of whether the driver is a module or built >> into the kernel. > > That's the place for it. It will be there if the driver is built as a > module or into the kernel. Wow! Well, that's very convenient for driver writers, so I'm pleased, and I can update the docs. It surprises me, though, to find out that /sys/module is for things other than modules. The correction below follows and depends on patch 1 of the six. fix docs: built-in driver can use files in /sys/module Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -urNp a-exp/linux/Documentation/aoe/aoe.txt b-exp/linux/Documentation/aoe/aoe.txt --- a-exp/linux/Documentation/aoe/aoe.txt 2005-04-21 11:25:48.0 -0400 +++ b-exp/linux/Documentation/aoe/aoe.txt 2005-04-21 11:25:49.0 -0400 @@ -102,12 +102,11 @@ USING SYSFS e4.8eth1 up e4.9eth1 up - When the aoe driver is a module, use - /sys/module/aoe/parameters/aoe_iflist instead of - /dev/etherd/interfaces to limit AoE traffic to the network - interfaces in the given whitespace-separated list. Unlike the old - character device, the sysfs entry can be read from as well as - written to. + Use /sys/module/aoe/parameters/aoe_iflist (or better, the driver + option discussed below) instead of /dev/etherd/interfaces to limit + AoE traffic to the network interfaces in the given + whitespace-separated list. Unlike the old character device, the + sysfs entry can be read from as well as written to. It's helpful to trigger discovery after setting the list of allowed interfaces. The aoetools package provides an aoe-discover script -- Ed L Cashin <[EMAIL PROTECTED]>
Re: [PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration
"Bodo Eggert <[EMAIL PROTECTED]>" <[EMAIL PROTECTED]> writes: > Ed L Cashin <[EMAIL PROTECTED]> wrote: > >> +++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400 > >> + When the aoe driver is a module, use > > Is there any reason for this inconsistent behaviour? Yes, the /sys/module/aoe area is only present when the aoe driver is a module. It would be nicer if there were a sysfs area where I could put this file regardless of whether the driver is a module or built into the kernel. I could probably create one, but I got the file in /sys/module/aoe/parameters for free when I used module_param_string. >> + /sys/module/aoe/parameters/aoe_iflist instead of > ^^^ > > Why does the module name need to be part of the attribute? > That's redundant. That's redundant. Yes. That's true. Redundancy isn't always bad, though, and using the "aoe_" prefix lets the kernel parameter for the built-in aoe driver be the same as the parameter for the modular driver. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.12-rc2] aoe [5/6]: add firmware version to info in sysfs
"Randy.Dunlap" <[EMAIL PROTECTED]> writes: ... > so something like 'firmware-version' would be appreciated > (for the sysfs filename). Fair enough. This patch follows and depends on the fifth patch of the six. use a more explicit filename for sysfs firmware version info Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -urNp a-exp/linux/drivers/block/aoe/aoeblk.c b-exp/linux/drivers/block/aoe/aoeblk.c --- a-exp/linux/drivers/block/aoe/aoeblk.c 2005-04-20 15:09:13.0 -0400 +++ b-exp/linux/drivers/block/aoe/aoeblk.c 2005-04-20 15:09:13.0 -0400 @@ -58,7 +58,7 @@ static struct disk_attribute disk_attr_n .show = aoedisk_show_netif }; static struct disk_attribute disk_attr_fwver = { - .attr = {.name = "fwver", .mode = S_IRUGO }, + .attr = {.name = "firmware-version", .mode = S_IRUGO }, .show = aoedisk_show_fwver }; @@ -76,7 +76,7 @@ aoedisk_rm_sysfs(struct aoedev *d) sysfs_remove_link(&d->gd->kobj, "state"); sysfs_remove_link(&d->gd->kobj, "mac"); sysfs_remove_link(&d->gd->kobj, "netif"); - sysfs_remove_link(&d->gd->kobj, "fwver"); + sysfs_remove_link(&d->gd->kobj, "firmware-version"); } static int -- Ed L Cashin <[EMAIL PROTECTED]>
Re: [PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration
"Randy.Dunlap" <[EMAIL PROTECTED]> writes: > On Wed, 20 Apr 2005 13:02:12 -0400 Ed L Cashin wrote: > > Just a nit/typo: > > | +modprobe aoe_iflist="eth1 eth3" > > | static char aoe_iflist[IFLISTSZ]; > | +module_param_string(aoe_iflist, aoe_iflist, IFLISTSZ, 0600); > | +MODULE_PARM_DESC(aoe_iflist, " aoe_iflist=\"dev1 [dev2 ...]\n"); > > No leading space (" aoe_iflist=") and put a trailing \" in it: > > +MODULE_PARM_DESC(aoe_iflist, "aoe_iflist=\"dev1 [dev2 ...]\"\n"); Thanks for catching that. improve allowed interfaces configuration Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt --- a/Documentation/aoe/aoe.txt 2005-04-20 11:40:55.0 -0400 +++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400 @@ -33,6 +33,9 @@ USING DEVICE NODES "cat /dev/etherd/err" blocks, waiting for error diagnostic output, like any retransmitted packets. + The /dev/etherd/interfaces special file is obsoleted by the + aoe_iflist boot option and module option (and its sysfs entry + described in the next section). "echo eth2 eth4 > /dev/etherd/interfaces" tells the aoe driver to limit ATA over Ethernet traffic to eth2 and eth4. AoE traffic from untrusted networks should be ignored as a matter of security. @@ -89,3 +92,24 @@ USING SYSFS e4.7eth1 up e4.8eth1 up e4.9eth1 up + + When the aoe driver is a module, use + /sys/module/aoe/parameters/aoe_iflist instead of + /dev/etherd/interfaces to limit AoE traffic to the network + interfaces in the given whitespace-separated list. Unlike the old + character device, the sysfs entry can be read from as well as + written to. + + It's helpful to trigger discovery after setting the list of allowed + interfaces. If your distro provides an aoe-discover script, you can + use that. Otherwise, you can directly use the /dev/etherd/discover + file described above. + +DRIVER OPTIONS + + There is a boot option for the built-in aoe driver and a + corresponding module parameter, aoe_iflist. Without this option, + all network interfaces may be used for ATA over Ethernet. Here is a + usage example for the module parameter. + +modprobe aoe_iflist="eth1 eth3" diff -uprN a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c --- a/drivers/block/aoe/aoenet.c2005-04-20 11:41:18.0 -0400 +++ b/drivers/block/aoe/aoenet.c2005-04-20 11:42:20.0 -0400 @@ -7,6 +7,7 @@ #include #include #include +#include #include "aoe.h" #define NECODES 5 @@ -26,6 +27,19 @@ enum { }; static char aoe_iflist[IFLISTSZ]; +module_param_string(aoe_iflist, aoe_iflist, IFLISTSZ, 0600); +MODULE_PARM_DESC(aoe_iflist, "aoe_iflist=\"dev1 [dev2 ...]\"\n"); + +#ifndef MODULE +static int __init aoe_iflist_setup(char *str) +{ + strncpy(aoe_iflist, str, IFLISTSZ); + aoe_iflist[IFLISTSZ - 1] = '\0'; + return 1; +} + +__setup("aoe_iflist=", aoe_iflist_setup); +#endif int is_aoe_netif(struct net_device *ifp) @@ -36,7 +50,8 @@ is_aoe_netif(struct net_device *ifp) if (aoe_iflist[0] == '\0') return 1; - for (p = aoe_iflist; *p; p = q + strspn(q, WHITESPACE)) { + p = aoe_iflist + strspn(aoe_iflist, WHITESPACE); + for (; *p; p = q + strspn(q, WHITESPACE)) { q = p + strcspn(p, WHITESPACE); if (q != p) len = q - p; -- Ed L Cashin <[EMAIL PROTECTED]>
[PATCH 2.6.12-rc2] aoe [4/6]: allow multiple aoe devices to have the same mac
allow multiple aoe devices to have the same mac Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -u b/drivers/block/aoe/aoedev.c b/drivers/block/aoe/aoedev.c --- b/drivers/block/aoe/aoedev.c2005-04-20 11:42:18.0 -0400 +++ b/drivers/block/aoe/aoedev.c2005-04-20 11:42:22.0 -0400 @@ -109,25 +109,22 @@ spin_lock_irqsave(&devlist_lock, flags); for (d=devlist; d; d=d->next) - if (d->sysminor == sysminor - || memcmp(d->addr, addr, sizeof d->addr) == 0) + if (d->sysminor == sysminor) break; if (d == NULL && (d = aoedev_newdev(bufcnt)) == NULL) { spin_unlock_irqrestore(&devlist_lock, flags); printk(KERN_INFO "aoe: aoedev_set: aoedev_newdev failure.\n"); return NULL; - } + } /* if newdev, (d->flags & DEVFL_UP) == 0 for below */ spin_unlock_irqrestore(&devlist_lock, flags); spin_lock_irqsave(&d->lock, flags); d->ifp = ifp; - - if (d->sysminor != sysminor - || (d->flags & DEVFL_UP) == 0) { + memcpy(d->addr, addr, sizeof d->addr); + if ((d->flags & DEVFL_UP) == 0) { aoedev_downdev(d); /* flushes outstanding frames */ - memcpy(d->addr, addr, sizeof d->addr); d->sysminor = sysminor; d->aoemajor = AOEMAJOR(sysminor); d->aoeminor = AOEMINOR(sysminor); -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.12-rc2] aoe [6/6]: update version number to 10
update version number to 10 Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- b/drivers/block/aoe/aoe.h 2005-04-20 11:42:19.0 -0400 +++ b/drivers/block/aoe/aoe.h 2005-04-20 11:42:22.0 -0400 @@ -1,5 +1,5 @@ /* Copyright (c) 2004 Coraid, Inc. See COPYING for GPL terms. */ -#define VERSION "6" +#define VERSION "10" #define AOE_MAJOR 152 #define DEVICE_NAME "aoe" -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.12-rc2] aoe [5/6]: add firmware version to info in sysfs
add firmware version to info in sysfs Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c --- a/drivers/block/aoe/aoeblk.c2005-04-20 11:41:18.0 -0400 +++ b/drivers/block/aoe/aoeblk.c2005-04-20 11:42:23.0 -0400 @@ -37,6 +37,13 @@ static ssize_t aoedisk_show_netif(struct return snprintf(page, PAGE_SIZE, "%s\n", d->ifp->name); } +/* firmware version */ +static ssize_t aoedisk_show_fwver(struct gendisk * disk, char *page) +{ + struct aoedev *d = disk->private_data; + + return snprintf(page, PAGE_SIZE, "0x%04x\n", (unsigned int) d->fw_ver); +} static struct disk_attribute disk_attr_state = { .attr = {.name = "state", .mode = S_IRUGO }, @@ -50,6 +57,10 @@ static struct disk_attribute disk_attr_n .attr = {.name = "netif", .mode = S_IRUGO }, .show = aoedisk_show_netif }; +static struct disk_attribute disk_attr_fwver = { + .attr = {.name = "fwver", .mode = S_IRUGO }, + .show = aoedisk_show_fwver +}; static void aoedisk_add_sysfs(struct aoedev *d) @@ -57,6 +68,7 @@ aoedisk_add_sysfs(struct aoedev *d) sysfs_create_file(&d->gd->kobj, &disk_attr_state.attr); sysfs_create_file(&d->gd->kobj, &disk_attr_mac.attr); sysfs_create_file(&d->gd->kobj, &disk_attr_netif.attr); + sysfs_create_file(&d->gd->kobj, &disk_attr_fwver.attr); } void aoedisk_rm_sysfs(struct aoedev *d) @@ -64,6 +76,7 @@ aoedisk_rm_sysfs(struct aoedev *d) sysfs_remove_link(&d->gd->kobj, "state"); sysfs_remove_link(&d->gd->kobj, "mac"); sysfs_remove_link(&d->gd->kobj, "netif"); + sysfs_remove_link(&d->gd->kobj, "fwver"); } static int -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.12-rc2] aoe [3/6]: update the documentation to mention aoetools
update the documentation to mention aoetools Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt --- a/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400 +++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:21.0 -0400 @@ -4,6 +4,16 @@ The EtherDrive (R) HOWTO for users of 2. It has many tips and hints! +The aoetools are userland programs that are designed to work with this +driver. The aoetools are on sourceforge. + + http://aoetools.sourceforge.net/ + +The scripts in this Documentation/aoe directory are intended to +document the use of the driver and are not necessary if you install +the aoetools. + + CREATING DEVICE NODES Users of udev should find the block device nodes created @@ -33,19 +43,17 @@ USING DEVICE NODES "cat /dev/etherd/err" blocks, waiting for error diagnostic output, like any retransmitted packets. - The /dev/etherd/interfaces special file is obsoleted by the - aoe_iflist boot option and module option (and its sysfs entry - described in the next section). "echo eth2 eth4 > /dev/etherd/interfaces" tells the aoe driver to limit ATA over Ethernet traffic to eth2 and eth4. AoE traffic from - untrusted networks should be ignored as a matter of security. + untrusted networks should be ignored as a matter of security. See + also the aoe_iflist driver option described below. "echo > /dev/etherd/discover" tells the driver to find out what AoE devices are available. These character devices may disappear and be replaced by sysfs - counterparts, so distribution maintainers are encouraged to create - scripts that use these devices. + counterparts. Using the commands in aoetools insulates users from + these implementation details. The block devices are named like this: @@ -69,7 +77,8 @@ USING SYSFS through which we are communicating with the remote AoE device. There is a script in this directory that formats this information - in a convenient way. + in a convenient way. Users with aoetools can use the aoe-stat + command. [EMAIL PROTECTED] root# sh Documentation/aoe/status.sh e10.0eth3 up @@ -101,9 +110,9 @@ USING SYSFS written to. It's helpful to trigger discovery after setting the list of allowed - interfaces. If your distro provides an aoe-discover script, you can - use that. Otherwise, you can directly use the /dev/etherd/discover - file described above. + interfaces. The aoetools package provides an aoe-discover script + for this purpose. You can also directly use the + /dev/etherd/discover special file described above. DRIVER OPTIONS -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.12-rc2] aoe [2/6]: aoe-stat should work for built-in as well as module
aoe-stat should work for built-in as well as module Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN a/Documentation/aoe/status.sh b/Documentation/aoe/status.sh --- a/Documentation/aoe/status.sh 2005-04-20 11:40:55.0 -0400 +++ b/Documentation/aoe/status.sh 2005-04-20 11:42:20.0 -0400 @@ -14,10 +14,6 @@ test ! -d "$sysd/block" && { echo "$me Error: sysfs is not mounted" 1>&2 exit 1 } -test -z "`lsmod | grep '^aoe'`" && { - echo "$me Error: aoe module is not loaded" 1>&2 - exit 1 -} for d in `ls -d $sysd/block/etherd* 2>/dev/null | grep -v p` end; do # maybe ls comes up empty, so we use "end" -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.12-rc2] aoe [1/6]: improve allowed interfaces configuration
improve allowed interfaces configuration Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN a/Documentation/aoe/aoe.txt b/Documentation/aoe/aoe.txt --- a/Documentation/aoe/aoe.txt 2005-04-20 11:40:55.0 -0400 +++ b/Documentation/aoe/aoe.txt 2005-04-20 11:42:20.0 -0400 @@ -33,6 +33,9 @@ USING DEVICE NODES "cat /dev/etherd/err" blocks, waiting for error diagnostic output, like any retransmitted packets. + The /dev/etherd/interfaces special file is obsoleted by the + aoe_iflist boot option and module option (and its sysfs entry + described in the next section). "echo eth2 eth4 > /dev/etherd/interfaces" tells the aoe driver to limit ATA over Ethernet traffic to eth2 and eth4. AoE traffic from untrusted networks should be ignored as a matter of security. @@ -89,3 +92,24 @@ USING SYSFS e4.7eth1 up e4.8eth1 up e4.9eth1 up + + When the aoe driver is a module, use + /sys/module/aoe/parameters/aoe_iflist instead of + /dev/etherd/interfaces to limit AoE traffic to the network + interfaces in the given whitespace-separated list. Unlike the old + character device, the sysfs entry can be read from as well as + written to. + + It's helpful to trigger discovery after setting the list of allowed + interfaces. If your distro provides an aoe-discover script, you can + use that. Otherwise, you can directly use the /dev/etherd/discover + file described above. + +DRIVER OPTIONS + + There is a boot option for the built-in aoe driver and a + corresponding module parameter, aoe_iflist. Without this option, + all network interfaces may be used for ATA over Ethernet. Here is a + usage example for the module parameter. + +modprobe aoe_iflist="eth1 eth3" diff -uprN a/drivers/block/aoe/aoenet.c b/drivers/block/aoe/aoenet.c --- a/drivers/block/aoe/aoenet.c2005-04-20 11:41:18.0 -0400 +++ b/drivers/block/aoe/aoenet.c2005-04-20 11:42:20.0 -0400 @@ -7,6 +7,7 @@ #include #include #include +#include #include "aoe.h" #define NECODES 5 @@ -26,6 +27,19 @@ enum { }; static char aoe_iflist[IFLISTSZ]; +module_param_string(aoe_iflist, aoe_iflist, IFLISTSZ, 0600); +MODULE_PARM_DESC(aoe_iflist, " aoe_iflist=\"dev1 [dev2 ...]\n"); + +#ifndef MODULE +static int __init aoe_iflist_setup(char *str) +{ + strncpy(aoe_iflist, str, IFLISTSZ); + aoe_iflist[IFLISTSZ - 1] = '\0'; + return 1; +} + +__setup("aoe_iflist=", aoe_iflist_setup); +#endif int is_aoe_netif(struct net_device *ifp) @@ -36,7 +50,8 @@ is_aoe_netif(struct net_device *ifp) if (aoe_iflist[0] == '\0') return 1; - for (p = aoe_iflist; *p; p = q + strspn(q, WHITESPACE)) { + p = aoe_iflist + strspn(aoe_iflist, WHITESPACE); + for (; *p; p = q + strspn(q, WHITESPACE)) { q = p + strcspn(p, WHITESPACE); if (q != p) len = q - p; -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: AOE and large filesystems?
Dan Stromberg <[EMAIL PROTECTED]> writes: > Some questions for the list: > > 1) Is anyone on the list using AOE in production? I don't know of any AoE users that read the lkml. Except me, of course. > 2) Is anyone on the list using AOE in combination with md and/or LVM2? Yes, most AoE users use md. Many use LVM2, but a couple have had trouble with striped volume groups. > 3) Is anyone on the list using AOE on a 64 bit platform? People are using AoE on 64 bit platforms and not reading the lkml. :) -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.11] aoe [7/12]: support configuration of AOE_PARTITIONS from Kconfig
Greg KH <[EMAIL PROTECTED]> writes: > On Thu, Apr 07, 2005 at 02:56:39PM -0400, Ed L Cashin wrote: ... >> Just aoe-AOE_PARTITIONS.patch, the seventh of the twelve, should be >> dropped. > > Ok, dropped. > >> Then later I'll send a batch of patches that will include a change to >> make aoe disks non-partitionable by default. > > That's fine. Mind if I forward the other aoe patches in that directory > to Linus soon? Go ahead. I have a new batch of patches to send, but it looks like I might not get to it for a few days. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.11] aoe [7/12]: support configuration of AOE_PARTITIONS from Kconfig
Greg KH <[EMAIL PROTECTED]> writes: ... > So, which one of the aoe patches listed at: > > http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/driver/ > do you want me to drop? This one: > > http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/driver/aoe-AOE_PARTITIONS.patch > ? > Or some other one too? Just aoe-AOE_PARTITIONS.patch, the seventh of the twelve, should be dropped. Then later I'll send a batch of patches that will include a change to make aoe disks non-partitionable by default. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.11] aoe [7/12]: support configuration of AOE_PARTITIONS from Kconfig
Christoph Hellwig <[EMAIL PROTECTED]> writes: > On Tue, Mar 29, 2005 at 11:48:48AM -0500, Ed L Cashin wrote: >> I don't know if it matters now that we have udev. When udev manages >> the device nodes it all just works, > > But most peopel still don't use udev. > >> If you're saying that it's bad in principal, then that's another >> story. If that's what you mean, then it's a Linux policy issue, and >> to follow convention I'd think that we'd need another major number. >> That would be like the partitionable md devices, etc. > > Yes, it's a policy issue. We don't do this weird config option anywhere > else. A couple support calls later, I think I've come around to your point of view. This patch isn't needed and may cause confusion. Few aoe users really use partitions on their aoe disks, so I can make the aoe driver have one minor number per disk as the default to avoid the most common problems people encounter. Then, aoe users who really need to partition their network disks can use the partitionable md driver to "wrap" the aoe disk, like this: mdadm -B -l linear --force -n 1 --auto=mdp /dev/md_p0 /dev/etherd/e7.0 fdisk /dev/md_p0 -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.11] aoe [7/12]: support configuration of AOE_PARTITIONS from Kconfig
Christoph Hellwig <[EMAIL PROTECTED]> writes: > On Tue, Mar 29, 2005 at 11:06:16AM -0500, Ed L Cashin wrote: >> > >> > NACK. this changes devices nodes based on a compile-time option. >> >> I'm not sure I follow. This configuration option sets the number of >> partitions per device in the driver. It doesn't create device nodes. >> >> If the user has udev, then the device nodes are created correctly (on >> Fedora Core 3), so that if the driver is configured with 1 partition >> per device, the minor numbers for the disks are low. >> >> The folks I've talked to who aren't using udev but want one partition >> per device already know that they have to re-create their device >> files. > > It changes a kernel ABI, so people that have different config options > set can't use the same userland. It's a really big no-go. I don't know if it matters now that we have udev. When udev manages the device nodes it all just works, so there's practically not much of an issue. The UUIDs in Software RAID and LVM make this even less of an issue. Besides, if they aren't using udev and are using a custom kernel *and* using this configure option, then they're changing the ABI on purpose for a practical benefit. If you're saying that it's bad in principal, then that's another story. If that's what you mean, then it's a Linux policy issue, and to follow convention I'd think that we'd need another major number. That would be like the partitionable md devices, etc. To me, the latter seems like the uglier way to go, and it would be more permanent, while the configure option seems ugly but convenient for this transitional stage, and could go away when high minor numbers are well supported on the systems currently in use. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.11] aoe [7/12]: support configuration of AOE_PARTITIONS from Kconfig
Christoph Hellwig <[EMAIL PROTECTED]> writes: > On Thu, Mar 24, 2005 at 07:21:28AM -0800, [EMAIL PROTECTED] wrote: >> >> support configuration of AOE_PARTITIONS from Kconfig >> >> Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> >> >> diff -uprN a/drivers/block/Kconfig b/drivers/block/Kconfig >> --- a/drivers/block/Kconfig 2005-03-07 17:37:58.0 -0500 >> +++ b/drivers/block/Kconfig 2005-03-10 12:19:54.0 -0500 >> @@ -506,4 +506,19 @@ config ATA_OVER_ETH >> This driver provides Support for ATA over Ethernet block >> devices like the Coraid EtherDrive (R) Storage Blade. >> >> +config AOE_PARTITIONS >> +int "Partitions per AoE device" if ATA_OVER_ETH >> +default "16" >> +help >> + The default is to support 16 partitions per aoe device. Some >> + systems lack good support for devices with large minor >> + numbers. >> + >> + Such systems will be able to use more aoe disks when >> + AOE_PARTITIONS is set to one, but you won't be able to >> + partition the disks, and you must make sure your device >> + nodes are created to work with the value you select. >> + >> + If unsure, use 16. >> + > > NACK. this changes devices nodes based on a compile-time option. I'm not sure I follow. This configuration option sets the number of partitions per device in the driver. It doesn't create device nodes. If the user has udev, then the device nodes are created correctly (on Fedora Core 3), so that if the driver is configured with 1 partition per device, the minor numbers for the disks are low. The folks I've talked to who aren't using udev but want one partition per device already know that they have to re-create their device files. > Just tell people to update their userland to a 2.6-copatible > version. Even if the glibc, coreutils, etc., get it right, some programs try to parse the device node bits themselves and fail to find all the minor number bits. Making this configurable makes it possible for a debian sarge user or a Slackware 10 user to run a 2.6.11 kernel and use up to 256 disks. Even a Fedora Core 3 user has an mdadm that balks at minor numbers like 1120. By using one partition per device, I can use FC3's mdadm and have everything work. The AoE users started doing this themselves, but this configuration option allows non-C-programmers to do the same. It's helpful during the transitional period and should be removed when the userland software that people are actually running has caught up. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 2.6.11] aoe [4/12]: handle distros that have a udev rules file instead of dir
handle distros that have a udev rules file instead of dir Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh --- a/Documentation/aoe/udev-install.sh 2005-03-10 11:59:55.0 -0500 +++ b/Documentation/aoe/udev-install.sh 2005-03-10 12:19:18.0 -0500 @@ -23,4 +23,8 @@ fi # /etc/udev/rules.d # rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`" -test "$rules_d" && sh -xc "cp `dirname $0`/udev.txt $rules_d/60-aoe.rules" +if test -z "$rules_d" || test ! -d "$rules_d"; then + echo "$me Error: cannot find udev rules directory" 1>&2 + exit 1 +fi +sh -xc "cp `dirname $0`/udev.txt $rules_d/60-aoe.rules" -- Ed L. Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2.6.11] aoe [1/12]: remove too-low cap on minor number
Greg KH <[EMAIL PROTECTED]> writes: > I've applied 11 of these 12 patches (the one from Randy was already > included) to my trees. Those haven't gone to the lkml yet. I'll post them through gmane. (The original postings never made it to the list, and I can't get in touch with the lkml admins, but gmane should work.) -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] aoe: fix printk warning (sparc64)
Randy Dunlap writes: > aoeblk: mac_addr() returns u64, coerce to unsigned long long to printk it: > (sparc64 build warning) > > drivers/block/aoe/aoeblk.c:245: warning: long long unsigned int format, u64 > arg (arg 2) > drivers/block/aoe/aoeblk.c:31: warning: long long unsigned int format, u64 > arg (arg 4) > > cross-compile results: > https://www.osdl.org/plm-cgi/plm?module=patch_info&patch_id=4239 Thanks. I have a few patches I'll be sending along soon (today, I hope) in a batch, including Alexey Dobriyan's recent sparse cleanups, and I'll include this one among them. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] aoe: fix abuse of arrays and sparse warnings
Alexey Dobriyan <[EMAIL PROTECTED]> writes: > Signed-off-by: Alexey Dobriyan <[EMAIL PROTECTED]> Hi. Thanks for the patch. Have you tested it? If you don't have any ATA over Ethernet hardware, you can using the alpha vblade program for testing. (Run it on a system in the broadcast domain of the host running your patched aoe driver, and vblade will export any file, e.g., /dev/loop0, as block storage.) It's at http://sf.net/projects/aoetools I was trying to determine what sparse warnings you see, so I got sparse from bk://sparse.bkbits.net/sparse and ran it. Your patch cuts down significantly on the complaints, but there are some that persist. Maybe you're using an older version of sparse? [EMAIL PROTECTED] linux-2.6.11-rc4-bk9$ make C=1 CHK include/linux/version.h make[1]: `arch/i386/kernel/asm-offsets.s' is up to date. CHK include/asm-i386/asm_offsets.h CHK include/linux/compile.h CHK usr/initramfs_list CHECK drivers/block/aoe/aoeblk.c CC [M] drivers/block/aoe/aoeblk.o CHECK drivers/block/aoe/aoechr.c drivers/block/aoe/aoechr.c:236:24: warning: symbol 'aoe_fops' was not declared. Should it be static? This change has been made already, so I'll check whether I've pushed the change up. I have a couple of things I haven't submitted yet. CC [M] drivers/block/aoe/aoechr.o CHECK drivers/block/aoe/aoecmd.c drivers/block/aoe/aoecmd.c:27:17: warning: incorrect type in assignment (different base types) drivers/block/aoe/aoecmd.c:27:17:expected unsigned short [unsigned] protocol drivers/block/aoe/aoecmd.c:27:17:got restricted unsigned short [usertype] [force] CC [M] drivers/block/aoe/aoecmd.o CHECK drivers/block/aoe/aoedev.c CC [M] drivers/block/aoe/aoedev.o CHECK drivers/block/aoe/aoemain.c CC [M] drivers/block/aoe/aoemain.o CHECK drivers/block/aoe/aoenet.c drivers/block/aoe/aoenet.c:156:10: warning: incorrect type in initializer (different base types) drivers/block/aoe/aoenet.c:156:10:expected unsigned short [unsigned] type drivers/block/aoe/aoenet.c:156:10:got restricted unsigned short [usertype] [force] CC [M] drivers/block/aoe/aoenet.o LD [M] drivers/block/aoe/aoe.o Kernel: arch/i386/boot/bzImage is ready Building modules, stage 2. MODPOST CC drivers/block/aoe/aoe.mod.o LD [M] drivers/block/aoe/aoe.ko [EMAIL PROTECTED] linux-2.6.11-rc4-bk9$ The "array abuse" is something that I'm not all that enthusiastic about changing, since it's mostly a style issue, and last time I changed it the way your patch does, the original author of the patch changed it back. But you've figured out how to make sparse happy, and for that I'm grateful! :) -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [2.6 patch] drivers/block/aoe/aoechr.c cleanups
Adrian Bunk <[EMAIL PROTECTED]> writes: > This patch contains the following cleanups: > - make the needlessly global struct aoe_fops static > - #if 0 the unused global function aoechr_hdump Thanks for the patch. The original patch leaves the prototype for aoechr_hdump in aoe.h, but since this function is just for debugging, it seems better to just take both prototype and definition out. remove aoechr_hdump make aoe_fops static Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -urNp aa/drivers/block/aoe/aoe.h bb/drivers/block/aoe/aoe.h --- aa/drivers/block/aoe/aoe.h 2005-01-31 09:20:53.0 -0500 +++ bb/drivers/block/aoe/aoe.h 2005-01-31 09:21:01.0 -0500 @@ -143,7 +143,6 @@ void aoedisk_rm_sysfs(struct aoedev *d); int aoechr_init(void); void aoechr_exit(void); void aoechr_error(char *); -void aoechr_hdump(char *, int len); void aoecmd_work(struct aoedev *d); void aoecmd_cfg(ushort, unsigned char); diff -urNp aa/drivers/block/aoe/aoechr.c bb/drivers/block/aoe/aoechr.c --- aa/drivers/block/aoe/aoechr.c 2005-01-31 09:20:53.0 -0500 +++ bb/drivers/block/aoe/aoechr.c 2005-01-31 09:21:01.0 -0500 @@ -99,41 +99,6 @@ bail:spin_unlock_irqrestore(&emsgs_loc up(&emsgs_sema); } -#define PERLINE 16 -void -aoechr_hdump(char *buf, int n) -{ - int bufsiz; - char *fbuf; - int linelen; - char *p, *e, *fp; - - bufsiz = n * 3; /* 2 hex digits and a space */ - bufsiz += n / PERLINE + 1; /* the newline characters */ - bufsiz += 1;/* the final '\0' */ - - fbuf = kmalloc(bufsiz, GFP_ATOMIC); - if (!fbuf) { - printk(KERN_INFO - "%s: cannot allocate memory\n", - __FUNCTION__); - return; - } - - for (p = buf; n <= 0;) { - linelen = n > PERLINE ? PERLINE : n; - n -= linelen; - - fp = fbuf; - for (e=p+linelen; p -- Ed L Cashin <[EMAIL PROTECTED]>
[PATCH block-2.6] aoe status.sh: handle sysfs not in /etc/mtab
Suse 9.1 Pro doesn't put /sys in /etc/mtab. This patch makes the example aoe status.sh script work when sysfs is mounted but `mount` doesn't mention sysfs. aoe status.sh: handle sysfs not in /etc/mtab Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- a/Documentation/aoe/status.sh +++ b/Documentation/aoe/status.sh @@ -4,10 +4,13 @@ set -e format="%8s\t%8s\t%8s\n" me=`basename $0` +sysd=${sysfs_dir:-/sys} # printf "$format" device mac netif state -test -z "`mount | grep sysfs`" && { +# Suse 9.1 Pro doesn't put /sys in /etc/mtab +#test -z "`mount | grep sysfs`" && { +test ! -d "$sysd/block" && { echo "$me Error: sysfs is not mounted" 1>&2 exit 1 } @@ -16,7 +19,7 @@ exit 1 } -for d in `ls -d /sys/block/etherd* 2>/dev/null | grep -v p` end; do +for d in `ls -d $sysd/block/etherd* 2>/dev/null | grep -v p` end; do # maybe ls comes up empty, so we use "end" test $d = end && continue -- Ed L Cashin <[EMAIL PROTECTED]>
[PATCH block-2.6] aoe: fail IO on disk errors
This patch makes disk errors fail the IO instead of getting logged and ignored. Fail IO on disk errors Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN block-2.6-aa/drivers/block/aoe/aoecmd.c block-2.6-bb/drivers/block/aoe/aoecmd.c --- block-2.6-aa/drivers/block/aoe/aoecmd.c 2005-01-25 08:07:33.0 -0500 +++ block-2.6-bb/drivers/block/aoe/aoecmd.c 2005-01-25 10:00:45.0 -0500 @@ -416,7 +416,9 @@ aoecmd_ata_rsp(struct sk_buff *skb) if (ahin->cmdstat & 0xa9) { /* these bits cleared on success */ printk(KERN_CRIT "aoe: aoecmd_ata_rsp: ata error cmd=%2.2Xh " - "stat=%2.2Xh\n", ahout->cmdstat, ahin->cmdstat); + "stat=%2.2Xh from e%ld.%ld\n", + ahout->cmdstat, ahin->cmdstat, + d->aoemajor, d->aoeminor); if (buf) buf->flags |= BUFFL_FAIL; } else { @@ -458,8 +460,8 @@ aoecmd_ata_rsp(struct sk_buff *skb) if (buf) { buf->nframesout -= 1; if (buf->nframesout == 0 && buf->resid == 0) { - n = !(buf->flags & BUFFL_FAIL); - bio_endio(buf->bio, buf->bio->bi_size, 0); + n = (buf->flags & BUFFL_FAIL) ? -EIO : 0; + bio_endio(buf->bio, buf->bio->bi_size, n); mempool_free(buf, d->bufpool); } } -- Ed L Cashin <[EMAIL PROTECTED]>
Re: [PATCH] aoe: add documentation for udev users
Bodo Eggert <[EMAIL PROTECTED]> writes: > Ed L Cashin <[EMAIL PROTECTED]> wrote: > >> +if test -z "$conf"; then >> +conf="`find /etc -type f -name udev.conf 2> /dev/null`" >> +fi >> +if test -z "$conf" || test ! -r $conf; then >> +echo "$me Error: could not find readable udev.conf in /etc" 1>&2 >> +exit 1 >> +fi > > This will fail and print > --- > bash: test: etc/udev.conf: binary operator expected > --- > if there is more than one udev.conf. > > Fix: Always put quotes around variables. Thanks. With the changes below, it still will complain if it finds more than one udev.conf, but only if /etc/udev/udev.conf doesn't exist. Quote all shell variables, and use /etc/udev/udev.conf if available. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN a/Documentation/aoe/udev-install.sh b/Documentation/aoe/udev-install.sh --- a/Documentation/aoe/udev-install.sh 2005-01-20 09:14:58.0 -0500 +++ b/Documentation/aoe/udev-install.sh 2005-01-20 09:13:38.0 -0500 @@ -8,11 +8,15 @@ me="`basename $0`" # (or environment can specify where to find udev.conf) # if test -z "$conf"; then - conf="`find /etc -type f -name udev.conf 2> /dev/null`" -fi -if test -z "$conf" || test ! -r $conf; then - echo "$me Error: could not find readable udev.conf in /etc" 1>&2 - exit 1 + if test -r /etc/udev/udev.conf; then + conf=/etc/udev/udev.conf + else + conf="`find /etc -type f -name udev.conf 2> /dev/null`" + if test -z "$conf" || test ! -r "$conf"; then + echo "$me Error: no udev.conf found" 1>&2 + exit 1 + fi + fi fi # find the directory where udev rules are stored, often -- Ed L Cashin <[EMAIL PROTECTED]>
Re: [PATCH] AOE: fix up the block device registration so that it actually works now.
Greg KH <[EMAIL PROTECTED]> writes: > On Wed, Jan 19, 2005 at 09:08:14AM -0500, Ed L Cashin wrote: >> > diff -Nru a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c >> > --- a/drivers/block/aoe/aoeblk.c 2005-01-18 16:06:57 -08:00 >> > +++ b/drivers/block/aoe/aoeblk.c 2005-01-18 16:06:57 -08:00 >> > @@ -249,6 +249,7 @@ >> > aoeblk_exit(void) >> > { >> >kmem_cache_destroy(buf_pool_cache); >> > + unregister_blkdev(AOE_MAJOR, DEVICE_NAME); >> > } >> >> The unregister_blkdev should already be in aoemain.c:aoe_exit. > > Why? You do a register_blockdev() in this file, so if something fails, > you should also unregister here. No, the register_blkdev that you see in aoeblk.c is the artifact of a snafu I made in patch submission. I submitted a small patch yesterday (ID below) that corrects the snafu and makes block-2.6 the same as the driver I have. Message-ID: <[EMAIL PROTECTED]> In the current aoe driver, register_blkdev is only in aoemain.c:aoe_init, and that register_blkdev is the last step in the initialization sequence. If register_blkdev fails, then I don't unregister_blkdev, because presumably I shouldn't undo something that wasn't done. > The big problem is you were trying to > register the same major twice :( That's because two snippets of my recent fixes got orphaned (my fault) instead of getting included in the submitted patches, so instead of my patches moving register_blkdev it got duplicated. The patch I sent yesterday corrects the problem and brings block-2.6 back into accordance with what I've got. Sorry for the confusion. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] aoe: add documentation for udev users
Hi. This patch was generated against block-2.6 but should apply easily in other trees, since it touches only documentation. add documentation for udev users Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> --- block-2.6-export-b/Documentation/aoe/aoe.txt2005-01-19 14:29:15.0 -0500 +++ linux/Documentation/aoe/aoe.txt 2005-01-19 15:57:37.0 -0500 @@ -6,9 +6,16 @@ The EtherDrive (R) HOWTO for users of 2. CREATING DEVICE NODES - Users of udev should find device nodes created automatically. Two - scripts are provided in Documentation/aoe as examples of static - device node creation for using the aoe driver. + Users of udev should find the block device nodes created + automatically, but to create all the necessary device nodes, use the + udev configuration rules provided in udev.txt (in this directory). + + There is a udev-install.sh script that shows how to install these + rules on your system. + + If you are not using udev, two scripts are provided in + Documentation/aoe as examples of static device node creation for + using the aoe driver. rm -rf /dev/etherd sh Documentation/aoe/mkdevs.sh /dev/etherd --- block-2.6-export-b/Documentation/aoe/udev-install.sh1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/aoe/udev-install.sh 2005-01-19 15:57:37.0 -0500 @@ -0,0 +1,22 @@ +# install the aoe-specific udev rules from udev.txt into +# the system's udev configuration +# + +me="`basename $0`" + +# find udev.conf, often /etc/udev/udev.conf +# (or environment can specify where to find udev.conf) +# +if test -z "$conf"; then + conf="`find /etc -type f -name udev.conf 2> /dev/null`" +fi +if test -z "$conf" || test ! -r $conf; then + echo "$me Error: could not find readable udev.conf in /etc" 1>&2 + exit 1 +fi + +# find the directory where udev rules are stored, often +# /etc/udev/rules.d +# +rules_d="`sed -n '/^udev_rules=/{ s!udev_rules=!!; s!\"!!g; p; }' $conf`" +test "$rules_d" && sh -xc "cp `dirname $0`/udev.txt $rules_d/60-aoe.rules" --- block-2.6-export-b/Documentation/aoe/udev.txt 1969-12-31 19:00:00.0 -0500 +++ linux/Documentation/aoe/udev.txt2005-01-19 15:57:37.0 -0500 @@ -0,0 +1,23 @@ +# These rules tell udev what device nodes to create for aoe support. +# They may be installed along the following lines (adjusted to what +# you see on your system). +# +# [EMAIL PROTECTED] ~$ su +# Password: +# bash# find /etc -type f -name udev.conf +# /etc/udev/udev.conf +# bash# grep udev_rules= /etc/udev/udev.conf +# udev_rules="/etc/udev/rules.d/" +# bash# ls /etc/udev/rules.d/ +# 10-wacom.rules 50-udev.rules +# bash# cp /path/to/linux-2.6.xx/Documentation/aoe/udev.txt \ +# /etc/udev/rules.d/60-aoe.rules +# + +# aoe char devices +SUBSYSTEM="aoe", KERNEL="discover",NAME="etherd/%k", GROUP="disk", MODE="0220" +SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440" +SUBSYSTEM="aoe", KERNEL="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220" + +# aoe block devices +KERNEL="etherd*", NAME="%k", GROUP="disk" -- Ed L Cashin <[EMAIL PROTECTED]>
Re: [PATCH] AOE: fix up the block device registration so that it actually works now.
Greg KH <[EMAIL PROTECTED]> writes: > Ed, I need the following patch against the latest -bk tree in order to > get the aoe code to load and work properly. Does it look good to you? I hadn't submitted all my changes correctly, sorry. Here's a patch against block-2.6 that rectifies the omission. Remove allow aoeblk_exit to be called from __init code, and move register_blkdev into aoe_init. Signed-off-by: Ed L. Cashin <[EMAIL PROTECTED]> diff -uprN block-2.6-export-a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c --- block-2.6-export-a/drivers/block/aoe/aoeblk.c 2005-01-19 14:29:31.0 -0500 +++ patch-block-2.6-20050119-export/linux/drivers/block/aoe/aoeblk.c 2005-01-19 15:21:53.0 -0500 @@ -245,7 +252,7 @@ aoeblk_gdalloc(void *vp) d->fw_ver, (long long)d->ssize); } -void __exit +void aoeblk_exit(void) { kmem_cache_destroy(buf_pool_cache); @@ -254,19 +261,12 @@ aoeblk_exit(void) int __init aoeblk_init(void) { - int n; - buf_pool_cache = kmem_cache_create("aoe_bufs", sizeof(struct buf), 0, 0, NULL, NULL); if (buf_pool_cache == NULL) return -ENOMEM; - n = register_blkdev(AOE_MAJOR, DEVICE_NAME); - if (n < 0) { - printk(KERN_ERR "aoe: aoeblk_init: can't register major\n"); - return n; - } return 0; } -- Ed L Cashin <[EMAIL PROTECTED]>
Re: [PATCH] AOE: fix up the block device registration so that it actually works now.
Greg KH <[EMAIL PROTECTED]> writes: > Ed, I need the following patch against the latest -bk tree in order to > get the aoe code to load and work properly. Does it look good to you? I'm having trouble seeing what's in -bk. I have a clone of bk://linux.bkbits.net/linux-2.5, but when I "bk pull" there it says "Nothing to pull." And my clone doesn't have all the aoe patches I've seen get pushed to -bk. ... > - > > AOE: fix up the block device registration so that it actually works now. > > Signed-off-by: Greg Kroah-Hartman <[EMAIL PROTECTED]> > > diff -Nru a/drivers/block/aoe/aoeblk.c b/drivers/block/aoe/aoeblk.c > --- a/drivers/block/aoe/aoeblk.c 2005-01-18 16:06:57 -08:00 > +++ b/drivers/block/aoe/aoeblk.c 2005-01-18 16:06:57 -08:00 > @@ -249,6 +249,7 @@ > aoeblk_exit(void) > { > kmem_cache_destroy(buf_pool_cache); > + unregister_blkdev(AOE_MAJOR, DEVICE_NAME); > } The unregister_blkdev should already be in aoemain.c:aoe_exit. static void aoe_exit(void) { discover_timer(TKILL); aoenet_exit(); unregister_blkdev(AOE_MAJOR, DEVICE_NAME); aoechr_exit(); aoedev_exit(); aoeblk_exit(); /* free cache after de-allocating bufs */ } > int __init > diff -Nru a/drivers/block/aoe/aoemain.c b/drivers/block/aoe/aoemain.c > --- a/drivers/block/aoe/aoemain.c 2005-01-18 16:06:57 -08:00 > +++ b/drivers/block/aoe/aoemain.c 2005-01-18 16:06:57 -08:00 > @@ -82,11 +82,6 @@ > ret = aoenet_init(); > if (ret) > goto net_fail; > - ret = register_blkdev(AOE_MAJOR, DEVICE_NAME); > - if (ret < 0) { > - printk(KERN_ERR "aoe: aoeblk_init: can't register major\n"); > - goto blkreg_fail; > - } > > printk(KERN_INFO > "aoe: aoe_init: AoE v2.6-%s initialised.\n", Hmm. I'll try to send a patch against usb, since I can pull from there. -- Ed L Cashin <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/