Re: [PATCH, RFC] MAINTAINERS: update OSD entries
On Wed, May 3, 2017 at 1:08 PM, Boaz Harrosh wrote: > > On 05/02/2017 02:33 PM, Jeff Layton wrote: > > On Tue, 2017-05-02 at 09:57 +0200, Christoph Hellwig wrote: > >> The open-osd domain doesn't exist anymore, and mails to the list lead > >> to really annoying bounced that repeat every day. > >> > >> Also the primarydata address for Benny bounces, and while I have a new > >> one for him he doesn't seem to be maintaining the OSD code any more. > >> > >> Which beggs the question: should we really leave the Supported status > >> in MAINTAINERS given that the code is barely maintained? > >> > >> Signed-off-by: Christoph Hellwig > >> --- > >> MAINTAINERS | 4 > >> 1 file changed, 4 deletions(-) > >> > >> diff --git a/MAINTAINERS b/MAINTAINERS > >> index 1bb06c5f7716..28dd83a1d9e2 100644 > >> --- a/MAINTAINERS > >> +++ b/MAINTAINERS > >> @@ -9418,10 +9418,6 @@ F:drivers/net/wireless/intersil/orinoco/ > >> > >> OSD LIBRARY and FILESYSTEM > >> M: Boaz Harrosh > >> -M: Benny Halevy > >> -L: osd-...@open-osd.org > >> -W: http://open-osd.org > >> -T: git git://git.open-osd.org/open-osd.git > >> S: Maintained > >> F: drivers/scsi/osd/ > >> F: include/scsi/osd_* > > > > Hah, you beat me to it! I was going to spin up a patch for this today. > > > > Acked-by: Jeff Layton > > Acked-by: Boaz Harrosh Acked-by: Benny Halevy > > > >
Re: [PATCH] MAINTAINERS: Update email address for bhalevy
On 02/13/2014 07:59 PM, Boaz Harrosh wrote: > On 02/10/2014 11:57 AM, Benny Halevy wrote: >> Tonian is now Primary Data. >> > > Benny hi > > Do you need a push from my tree? Yes please. Makes most sense. Benny > > Boaz > >> Signed-off-by: Benny Halevy >> --- >> MAINTAINERS | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/MAINTAINERS b/MAINTAINERS >> index 6a6e4ac..b42174d 100644 >> --- a/MAINTAINERS >> +++ b/MAINTAINERS >> @@ -6336,7 +6336,7 @@ F: drivers/net/wireless/orinoco/ >> >> OSD LIBRARY and FILESYSTEM >> M: Boaz Harrosh >> -M: Benny Halevy >> +M: Benny Halevy >> L: osd-...@open-osd.org >> W: http://open-osd.org >> T: git git://git.open-osd.org/open-osd.git >> > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] MAINTAINERS: Update email address for bhalevy
Tonian is now Primary Data. Signed-off-by: Benny Halevy --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 6a6e4ac..b42174d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6336,7 +6336,7 @@ F:drivers/net/wireless/orinoco/ OSD LIBRARY and FILESYSTEM M: Boaz Harrosh -M: Benny Halevy +M: Benny Halevy L: osd-...@open-osd.org W: http://open-osd.org T: git git://git.open-osd.org/open-osd.git -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] MAINTAINERS: Update email address for Benny Halevy
Update my email address to new company name. Cc: Boaz Harrosh Signed-off-by: Benny Halevy --- MAINTAINERS | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/MAINTAINERS b/MAINTAINERS index 229c66b..0a754ea 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -6125,7 +6125,7 @@ F:drivers/net/wireless/orinoco/ OSD LIBRARY and FILESYSTEM M: Boaz Harrosh -M: Benny Halevy +M: Benny Halevy L: osd-...@open-osd.org W: http://open-osd.org T: git git://git.open-osd.org/open-osd.git -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] Revert "IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs"
Diabolical ;-) Thanks for the pointer! Benny On Feb. 26, 2008, 11:39 -0800, Matthew Wilcox <[EMAIL PROTECTED]> wrote: > On Tue, Feb 26, 2008 at 11:23:01AM -0800, Benny Halevy wrote: >> Pete, the subject says "PATCH 1/2" but I didn't see any follow-up message >> for PATCH 2/2. Just wondering :) > > I think the problem's on your end ... I got it and so did marc: > http://marc.info/?l=linux-scsi&m=120405067313933&w=2 > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/2] Revert "IB/fmr_pool: ib_fmr_pool_flush() should flush all dirty FMRs"
Pete, the subject says "PATCH 1/2" but I didn't see any follow-up message for PATCH 2/2. Just wondering :) Benny On Feb. 26, 2008, 10:27 -0800, Pete Wyckoff <[EMAIL PROTECTED]> wrote: > This reverts commit a3cd7d9070be417a21905c997ee32d756d999b38. > > The original commit breaks iSER reliably, making it complain: > > iser: iser_reg_page_vec:ib_fmr_pool_map_phys failed: -11 > > The FMR cleanup thread runs ib_fmr_batch_release() as dirty > entries build up. This commit causes clean but used FMR > entries also to be purged. During that process, another thread > can see that there are no free FMRs and fail, even though > there should always have been enough available. > > Signed-off-by: Pete Wyckoff <[EMAIL PROTECTED]> > --- > drivers/infiniband/core/fmr_pool.c | 21 ++--- > 1 files changed, 6 insertions(+), 15 deletions(-) > > diff --git a/drivers/infiniband/core/fmr_pool.c > b/drivers/infiniband/core/fmr_pool.c > index 7f00347..4044fdf 100644 > --- a/drivers/infiniband/core/fmr_pool.c > +++ b/drivers/infiniband/core/fmr_pool.c > @@ -139,7 +139,7 @@ static inline struct ib_pool_fmr > *ib_fmr_cache_lookup(struct ib_fmr_pool *pool, > static void ib_fmr_batch_release(struct ib_fmr_pool *pool) > { > int ret; > - struct ib_pool_fmr *fmr, *next; > + struct ib_pool_fmr *fmr; > LIST_HEAD(unmap_list); > LIST_HEAD(fmr_list); > > @@ -158,20 +158,6 @@ static void ib_fmr_batch_release(struct ib_fmr_pool > *pool) > #endif > } > > - /* > - * The free_list may hold FMRs that have been put there > - * because they haven't reached the max_remap count. > - * Invalidate their mapping as well. > - */ > - list_for_each_entry_safe(fmr, next, &pool->free_list, list) { > - if (fmr->remap_count == 0) > - continue; > - hlist_del_init(&fmr->cache_node); > - fmr->remap_count = 0; > - list_add_tail(&fmr->fmr->list, &fmr_list); > - list_move(&fmr->list, &unmap_list); > - } > - > list_splice(&pool->dirty_list, &unmap_list); > INIT_LIST_HEAD(&pool->dirty_list); > pool->dirty_len = 0; > @@ -384,6 +370,11 @@ void ib_destroy_fmr_pool(struct ib_fmr_pool *pool) > > i = 0; > list_for_each_entry_safe(fmr, tmp, &pool->free_list, list) { > + if (fmr->remap_count) { > + INIT_LIST_HEAD(&fmr_list); > + list_add_tail(&fmr->fmr->list, &fmr_list); > + ib_unmap_fmr(&fmr_list); > + } > ib_dealloc_fmr(fmr->fmr); > list_del(&fmr->list); > kfree(fmr); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Tabs, spaces, indent and 80 character lines
On Feb. 25, 2008, 13:40 -0800, Richard Knutsson <[EMAIL PROTECTED]> wrote: > Benny Halevy wrote: >> On Feb. 24, 2008, 7:40 -0800, Richard Knutsson <[EMAIL PROTECTED]> wrote: >> >>> Krzysztof Halasa wrote: >>> >>>> Richard Knutsson <[EMAIL PROTECTED]> writes: >>>> >>>> >>>> >>>>> Why hinder a developer who prefer >>>>> 2, 4, 6 or any other != 8 width? >>>>> >>>>> >>>> I guess we could use tabs only at the line start, for indentation >>>> only. Rather hard to implement, most text editors can't do that yet. >>>> >>>> >>> You mean for split lines? Hopefully there won't be that many, so there >>> is just to delete the tabs it added and replace it with spaces. >>> >> IMO, tabs SHOULD be used for syntactic indentation and spaces for >> decoration purpose only. I.e. a line should start with a number of tabs >> equal to its nesting level and after that only spaces should be used. >> for example, the following code >> >> for (i = 0; i < n; i++) printk("a very long format string", some, >> parameters); >> >> should be formatted like this: >> >> for (i = 0; i < n; i++) >> printk("a very long format string", >>some, parameters); >> >> this will show exactly right regardless of your editor's tab expansion >> setting >> as long as you use fixed-width fonts - where the screen width of the space >> character >> is equal to all other characters. Once you start using tabs instead of >> spaces >> to push text right so it appears exactly below some other text on the line >> above >> you make a dependency on *your* editor's tab expansion policy and that's not >> very >> considerate for folks who prefer a different one. >> > Don't know what to say more then: Yup! :) > > But the CodeStyle-document and checkpatch.pl does not agree with that. > I know :( If there's enough interest I can take a stab at seeing what it'd take to implement such a check in checkpatch.pl Benny -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Change a WARN message in checkpatch
On Feb. 24, 2008, 19:29 -0800, Andy Whitcroft <[EMAIL PROTECTED]> wrote: > On Sun, Feb 24, 2008 at 07:14:13PM +0100, Paolo Ciarrocchi wrote: >> On Sun, Feb 24, 2008 at 4:18 PM, Benny Halevy <[EMAIL PROTECTED]> wrote: >> [...] >>> How about: >>> - WARN("no space between function name and >>> open parenthesis '('\n" . $herecurr); >>> + WARN("there should be no space between >>> function name and open parenthesis '('\n" . $herecurr); >> I originally suggested: >> + WARN("don't put a space between >> function name and open parenthesis '('\n" . $herecurr); >> but I really prefer your version. >> >>> The original phrasing can be interpreted like "there is no space between >>> ..." and the correct >>> interpretation should be "there should be no space between ..." >> Indeed. > > As we want the messages to be as short as possible, I am leaning towards > standardising on: > > spaces prohibited > spaces required Sounds good to me. Benny > > -apw -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Tabs, spaces, indent and 80 character lines
On Feb. 24, 2008, 7:40 -0800, Richard Knutsson <[EMAIL PROTECTED]> wrote: > Krzysztof Halasa wrote: >> Richard Knutsson <[EMAIL PROTECTED]> writes: >> >> >>> Why hinder a developer who prefer >>> 2, 4, 6 or any other != 8 width? >>> >> I guess we could use tabs only at the line start, for indentation >> only. Rather hard to implement, most text editors can't do that yet. >> > You mean for split lines? Hopefully there won't be that many, so there > is just to delete the tabs it added and replace it with spaces. IMO, tabs SHOULD be used for syntactic indentation and spaces for decoration purpose only. I.e. a line should start with a number of tabs equal to its nesting level and after that only spaces should be used. for example, the following code for (i = 0; i < n; i++) printk("a very long format string", some, parameters); should be formatted like this: for (i = 0; i < n; i++) printk("a very long format string", some, parameters); this will show exactly right regardless of your editor's tab expansion setting as long as you use fixed-width fonts - where the screen width of the space character is equal to all other characters. Once you start using tabs instead of spaces to push text right so it appears exactly below some other text on the line above you make a dependency on *your* editor's tab expansion policy and that's not very considerate for folks who prefer a different one. >> >>> By only using tabs as indents, and >>> changing the CodeStyle to be something like "maximum 80 >>> characters-wide lines, with a tab-setting of 8 spaces", >>> >> This changes nothing. >> > Exactly! But then we can remove the "we use 8 wide tabs in the kernel" > in CodeStyle. >> >>> that is >>> possible + easier to write code-checkers [2]. >>> >> I doubt it. >> > Easier to write code-checkers? OK, maybe not. Just that I got hit by > this problem at a time when I wrote a simple checker (don't remember its > purpose). >> >>> Or are we really that concerned about the disk-space? ;) >>> >> Unpacked sources will be much bigger with not tabs, sure. >> > Without no tabs at all, you mean? Don't want to think about that > scenario, but with this suggestion, I would estimate maybe 0,5 - 1% bigger. > > Thanks for your input > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Change a WARN message in checkpatch
On Feb. 23, 2008, 5:38 -0800, "Paolo Ciarrocchi" <[EMAIL PROTECTED]> wrote: > On Mon, Jan 28, 2008 at 4:13 PM, Paolo Ciarrocchi > <[EMAIL PROTECTED]> wrote: >> On Jan 28, 2008 3:56 PM, Andy Whitcroft <[EMAIL PROTECTED]> wrote: >> > On Mon, Jan 14, 2008 at 11:29:13PM +0100, Paolo Ciarrocchi wrote: >> > > Hi Andy, >> > > When I started using checkpatch I was confused by the following WARN >> message: >> > > >> > > no space between function name and open parenthesis >> > > >> > > I thought the problem was that a space was missing while the truth is >> the opposite. >> > > >> > > How about the following patch? >> > >> > I can see how that language would be confusing. Your patch makes the >> > english clearer, but somehow seems clumsy. There must be a better way >> > to say this. I will think on it and see what I can come up with. >> >> Fair enought, I'm not English mother tongue and I'm sure you can come >> up with a better >> sentence :-) > > AFAICS the problem is still present. > > Ciao, How about: - WARN("no space between function name and open parenthesis '('\n" . $herecurr); + WARN("there should be no space between function name and open parenthesis '('\n" . $herecurr); The original phrasing can be interpreted like "there is no space between ..." and the correct interpretation should be "there should be no space between ..." Benny -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: linux-next: first tree
On Feb. 14, 2008, 20:29 +0200, "Yinghai Lu" <[EMAIL PROTECTED]> wrote: > On Thu, Feb 14, 2008 at 5:35 AM, Stephen Rothwell <[EMAIL PROTECTED]> wrote: >> Hi all, >> >> I have created the first cut of the linux-next tree at >> git://git.kernel.org/pub/scm/linux/kernel/git/sfr/linux-next.git. >> >> Things to know about this tree: >> >> It has two branches - master and stable. Stable is currently just Linus' >> tree and will never rebase. Master will rebase on an almost daily basis >> (maybe slower at the start). > > can you make stable rebase too? The point is that if you rebase it it's no longer stable. Benny > > so we make git-bisect working by fold in some obvious bug fix. > > or that is linux-stable tree? > > YH > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Announce: Linux-next (Or Andrew's dream :-))
On Feb. 13, 2008, 19:52 +0200, "J. Bruce Fields" <[EMAIL PROTECTED]> wrote: > On Tue, Feb 12, 2008 at 09:43:10PM -0800, Linus Torvalds wrote: >> So just the fact that the right commit gets blamed when somebody does a >> "git bisect" is I think a big issue. It's just fundamentally more fair to >> everybody. And it means that the people who push their work to me can >> really choose to stand behind it, knowing that whatever happens, their >> work won't get diluted by bad luck or others' incompetence. >> >> And no, maybe most people don't feel things like that matters. But I do >> think it's important. > > The obvious advantage to rebasing in this case is that the blame > (misplaced though it may be), at least lands on a commit that made a > single small change, likely making the problem easier to diagnose. > > (As opposed to the case of a large merge, where all you may know is that > somewhere in the hundreds of commits done on one side of the merge there > was a conflict with the hundreds of commits on the other side.) > > I think a lot of people would see rebasing as an acceptable tradeof that > gives up a small amount of accuracy in assigning blame to individuals in > return for a large increase in ability to debug problems. > > I suppose one response to that would be that it's important that people > learn how to work in parallel, that failures to do so are particularly > important failures in the process, and that it's therefore worth it to > make sure that such failures are always identified specifically as merge > failures. > > It would be nice if merges, like patches, were broken up into somewhat > smaller units. There's an understandable desire to wait to the last > minute to actually commit to one's commits, but a willingness to do so a > little earlier might avoid some of the problems that seem to come from > having a lot of large merges happen all at once. > > --b. One idea that I thought about when debating rebase vs. merge (and it's far far from being fully baked) is versioned commits. The gist of it is that patches are assigned an hash identifier like today when they are first committed into the tree, but, and this is the main change: if they mutate, e.g. by a rebase, or even git commit --amend, their version is bumped up rather than SHA changed. This way all versions of the commit would be accessible and addressable using their respective SHA *and* version. I think that this can help keep the tree's history in a more intuitive way (since the patches' base identifier won't change, just its version number), and you get a bonus of seeing each commit's history, who changed it, and what was the change. Benny -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Announce: Linux-next (Or Andrew's dream :-))
On Feb. 12, 2008, 20:36 +0200, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > On Tue, 12 Feb 2008, Benny Halevy wrote: >> IMHO, this base tree should typically be based off of linus' tree >> and kept rebased on top of it. This way you get the mainline fixes >> through the integration base tree. > > Hell no! > > No rebasing! If people rebase, then it's useless as a base. Linus, what you're saying is pretty extreme. Our experience with pnfs development taught us that rebase was actually the only method that worked on the long run because it allows us to maintain a *clean* series of patchsets over an extended period of time (order of two years). In the past, when we merged your tree into ours rather than using git-remote update and rebase we ended up with a cluttered mess that buried all the resolved merge conflicts in merge commits and maintaining it became an increasingly harder nightmare. Nowadays, I rebase the linux-pnfs git tree almost on a daily basis, keeping tags on historical refs (to be cleaned up when a branch stabilizes) and the other contributors simply use git-remote update to refresh their tracking branches and rebase their stuff onto the new heads. It took the team a couple weeks to get used to this methodology but it really works great for us now. It seems to me that merging works well one (your :) way when patches are merged upstream and become immutable at this point, while on the other way, keeping a patchset fresh and comprehensible is feasible only with rebase when there are occasional conflicts with stuff coming down from upstream. The domino- effect caused by rebase is manageable and not worse than merge if people just keep track of the original bases of their development branch before updating their tracking branches. Maybe the missing link is a tool to help do rebasing more easily. I wrote my own tool for that: git-rebase-tree (available on git://git.bhalevy.com/git-tools.git) and using it makes the mechanics of rebasing our patchsets almost trivial. I really encourage folks to try it out and give me feedback about it. Benny > > That base tree needs to be something people can *depend* on. It contains > the API changes, and not anything else. Otherwise I will never ever pull > the resulting mess, and you all end up with tons of extra work. > > Just say *no* to rebasing. > > Rebasing is fine for maintaining *your* own patch-set, ie it is an > alternative to using quilt. But it is absolutely not acceptable for > *anythign* else. So what's the solution for maintaining a series of patchsets that depend on one another? > > In particular, people who rebase other peoples trees should just be shot > (*). It's simply not acceptable behaviour. It screws up the sign-off > procedure, it screws up the people whose code was merged, and it's just > WRONG. > > Linus > > (*) The exception being if there is something seriously wrong with the > tree. I think I've had trees which I just refused to pull, and while most > of the time I just say "I refuse to pull", early on in git development I > actually ended up fixing some of those trees up because my refusal was due > to people mis-using git in the first place. So I have actually effectively > rebased a maintainer tree at least once. But I still think it is seriously > screwed up. > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Announce: Linux-next (Or Andrew's dream :-))
On Feb. 12, 2008, 19:41 +0200, James Bottomley <[EMAIL PROTECTED]> wrote: > On Tue, 2008-02-12 at 09:09 -0800, Linus Torvalds wrote: >> (a) create a base tree with _just_ that fundamental infrastructure change, >> and make sure that base branch is so obviously good that there is no >> question about merging it. > > The problem is how do we use a? Usually we need to track your -rc tree > as our fixes go in ... some of which affect our development trees. > > If we stick with (a) as the base, we don't get to pull in the fixes in > your tree. If we use your tree we have to pull in (a) creating n > different merge points for the n different upstream trees.. IMHO, this base tree should typically be based off of linus' tree and kept rebased on top of it. This way you get the mainline fixes through the integration base tree. Benny -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Announce: Linux-next (Or Andrew's dream :-))
On Feb. 12, 2008, 18:36 +0200, Jeff Garzik <[EMAIL PROTECTED]> wrote: > David Miller wrote: >> This is why, with the networking, we've just tossed all of the network >> driver stuff in there too. I can rebase freely, remove changesets, >> rework them, etc. and this causes a very low amount of pain for Jeff >> Garzik and John Linville. > > > s/very low/not low/ > > Rebasing is always a pain, and John and I both agreed the other day that > you do it too often. > > I've complained about this before, too... but figured this was just > another thing I was getting ignored on, and so life moved on. But don't > try to sell rebasing as "low pain". > > Rebasing makes the history all nice and pretty, but by totalling > flattening the history, trashing all the commit ids (and rewriting > associated metadata), you create obvious downstream problems. FWIW, when I rebase branches in my tree that others depend on I keep tags on the old heads for reference. Once the work-in-progress is done (e.g. tree pulled upstream) the reference tags can be cleaned up and the tree can be pruned. Benny > > Rebasing is low impact only if you don't have git downstream people. > Otherwise, you're just treating it as a useful quilt clone, really. > > Jeff > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Announce: Linux-next (Or Andrew's dream :-))
On Feb. 12, 2008, 17:07 +0200, James Bottomley <[EMAIL PROTECTED]> wrote: > On Mon, 2008-02-11 at 21:53 -0800, Greg KH wrote: >>> this is why you need specific trees for just the API change, and these >>> need to EXPLICITLY go first before EVERYTHING ELSE. Yes this needs a >>> bit of coordination, but it's the only way. >> Even then, it will not work. >> >> Again, Roland isn't going to want to always pull in my driver tree just >> to build his tree. He wants to, and needs to do his own development >> effort. >> >> But when we merge them together, there would be problems. >> >> So, you can't just "drop" the IB tree. >> You can't just "drip" my tree. >> >> Where do you "fix this up" at? I can send a patch for the IB tree, but >> Roland can't put it in his tree, and I can't put it in my tree, it needs >> to go _after_ both of our trees. > > Actually, we had exactly this issue with the SCSI bidirectional patches: > They depended on the sg_table patches in block. The solution I adopted > was two merge trees: One to go in immediately with no dependencies > (scsi-misc-2.6) and the other based on the pieces of block (so it would > compile and apply) to go in mid way through the merge round after block > (scsi-bidi-2.6). What I did was keep rebasing the bidi tree until I > could see there was nothing other than a plane base before merging it. My take on this, in retrospect, is that the code should probably have been developed in one branch off of one of the trees, or maybe even better in a third tree. The point is that the structure of git trees followed the organizational structure rather than the problem at hand and if contributions coming from different trees depend on each other, staging branches should be created in the integration tree for working out the conflicts and when ready, the integrated branches should be pushed towards the tree's trunk. > > Of course, this only worked because Jens has a git tree ... it would > have been a lot harder (but not impossible) if I'd had entangled patches > from a quilt tree. > > So I've already proven that the split tree solution is viable, if not > pretty. The bidi tree had to be rebased an awful lot as the block trees > changed and rebased. Unfortunately, git isn't very good at this, I > eventually had to keep a base and a top reference and just try to cherry > pick this series into the new constructed block tree. But it can be > done... I developed git-rebase-tree (http://git.bhalevy.com/git/gitweb.cgi?p=git-tools.git;a=blob_plain;f=git-rebase-tree;hb=HEAD) for exactly this reason - frequently rebasing sub-branches in the linux-pnfs tree and my experience so far is that it really speeds things up for me and Boaz. Please feel invited to try it out, I think it's very to close to be mature enough for general availability (I know, I know, it'd be perfect if this functionality would be merged into git-rebase :) > >> That's what -mm has been able to handle so far, and that needs to also >> work with -next. > > Actually, we never successfully got block and bidi via -mm. > > James > > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: bug in checkpatch (on pointers to typedefs?)
On Feb. 11, 2008, 20:42 +0200, Andy Whitcroft <[EMAIL PROTECTED]> wrote: > On Mon, Feb 11, 2008 at 06:58:08PM +0200, Benny Halevy wrote: >> OK, but the return type doesn't have to be in the patched line, it could be >> in >> a synchronization line or even missing if the function has a long multi-line >> argument >> list. > > Ok, I guess thats fair criticism. Could you check out the current > checkpatch-next (0.14-8-g3737366 or later -9, -10 etc), and see if > that works. It seems to on the simple examples you sent me :). Confirmed with 0.14-8-g3737366. Thanks! Benny Oh, and I really liked the fact that you print the patch file name in the summary line of each patch checked rather than "Your patch" :) > > Thanks. > > -apw -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] scsi_error: Fix language abuse.
Seriously, can't you just add a disclaimer to the README file? In http://lkml.org/lkml/2008/2/9/29, Luben Tuikov made an interesting point that in many cases "illegal" refers to a valid value that violates the specification, so the term "invalid" may be technically incorrect. Benny On Feb. 11, 2008, 18:07 +0200, "linux-os (Dick Johnson)" <[EMAIL PROTECTED]> wrote: > On Fri, 8 Feb 2008, Mark Hounschell wrote: > >> linux-os (Dick Johnson) wrote: >>> The correct word should be "invalid," in spite of >>> the fact that the SCSI committee used invalid syntax. >>> >>> Alan is right. There is nothing illegal in the kernel >>> and if there is, it must be removed as soon as it >>> is discovered! >>> >> il·le·gal (-lgl) >> adj. >> 1. Prohibited by law. >> *2. Prohibited by official rules: an illegal pass in football. >> *3. Unacceptable to or not performable by a computer: an illegal operation. >> >> Mark > > Many early computer programmers including myself, while writing > error messages in early software, did not understand that computer > programmers do not make law so we called bad operations "illegal." > > Once you are called to testify in a court of law, about some > message your code wrote to the screen just before a factory > burned down, you start to be much more careful about the syntax. > > I advise that, regardless of the rewrite of dictionaries and, > in fact, the rewrite of history, whenever possible we use > the correct message syntax. Dictionaries pick up "common usage" > in spite of the fact that it is wrong. See "irregardless" and > other abortions which now exist in the dictionary. > > > Cheers, > Dick Johnson > Penguin : Linux version 2.6.22.1 on an i686 machine (5588.28 BogoMips). > My book : http://www.AbominableFirebug.com/ > _ > > > The information transmitted in this message is confidential and may be > privileged. Any review, retransmission, dissemination, or other use of this > information by persons or entities other than the intended recipient is > prohibited. If you are not the intended recipient, please notify Analogic > Corporation immediately - by replying to this message or by sending an email > to [EMAIL PROTECTED] - and destroy all copies of this information, including > any attachments, without reading or disclosing them. > > Thank you. > - > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: bug in checkpatch (on pointers to typedefs?)
On Feb. 11, 2008, 18:40 +0200, Andy Whitcroft <[EMAIL PROTECTED]> wrote: > On Mon, Feb 11, 2008 at 06:05:48PM +0200, Benny Halevy wrote: >> I saw this too with checkpatch.pl version 0.12 >> It seems like checkpatch.pl knows only about types derived >> from @typeList by build_types. >> >> Example below... >> >> Benny >> >> $ cat <> Signed-off-by: [EMAIL PROTECTED] >> --- >> diff a/f.c b/f.c >> --- a/f.c >> +++ b/f.c >> @@ -1,0 +1,2 @@ >> +foo(int a, my_uint32 *); >> +bar(int a, my_uint32_t *); > > But that isn't actually syntactically correct code is it? You have types > as parameters like a function declaration, but no return type. So there > is no hint to checkpatch that this is a function declaration and therefore > the parameters are not expected to be types, nor are they checked as such. > > The following diff is clean on the latest version of checkpatch: > > Signed-off-by: [EMAIL PROTECTED] > --- > diff a/f.c b/f.c > --- a/f.c > +++ b/f.c > @@ -1,0 +1,2 @@ > +void foo(int a, my_uint32 *); > +int bar(int a, my_uint32_t *); > EOF OK, but the return type doesn't have to be in the patched line, it could be in a synchronization line or even missing if the function has a long multi-line argument list. > > Could you try out the version of checkpatch at the URL below on the real > patch you are using to test, and let me know if it works. There are > a number of improvements to type tracking in the face of ifdef's and > the like. If it doesn't could I have the hunk which fails: > > > http://www.kernel.org/pub/linux/kernel/people/apw/checkpatch/checkpatch.pl-next Your modified patch passes with version 0.12 as well as checkpatch.pl-next However, the following patch that has the return type in the synchronization lines still produces the same error: $ ./checkpatch.pl-next - Signed-off-by: [EMAIL PROTECTED] --- diff a/f.c b/f.c --- a/f.c +++ b/f.c @@ -1,2 +1,4 @@ int +foo(int a, my_uint32 *); int +bar(int a, my_uint32_t *); ERROR: need consistent spacing around '*' (ctx:WxB) #8: FILE: f.c:2: +foo(int a, my_uint32 *); ^ total: 1 errors, 0 warnings, 4 lines checked > > -apw > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: bug in checkpatch (on pointers to typedefs?)
I saw this too with checkpatch.pl version 0.12 It seems like checkpatch.pl knows only about types derived from @typeList by build_types. Example below... Benny $ cat < wrote: > On Sun, Feb 10, 2008 at 03:33:02PM +0100, Marcin Slusarz wrote: >> Hi >> >> Checkpatch in current mainline outputs following errors: >> >> $ ./scripts/checkpatch.pl -q --file ./fs/udf/misc.c >> ERROR: need consistent spacing around '*' (ctx:WxV) >> #205: FILE: fs/udf/misc.c:205: >> + tag *tag_p; >> ^ >> >> $ ./scripts/checkpatch.pl -q --file ./fs/udf/unicode.c >> ERROR: need consistent spacing around '*' (ctx:WxV) >> #48: FILE: fs/udf/unicode.c:48: >> +int udf_build_ustr(struct ustr *dest, dstring *ptr, int size) >>^ >> (...) >> >> I think the code is ok. > > Yep the code is clearly correct. Can I have the whole patch fragment > you got these against so I can figure out why we are unable to detect > these two as types, I would expect us to have done so. Also what > version of checkpatch is this? There is a version string at the top. > > -apw > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Integration of SCST in the mainstream Linux kernel
On Feb. 06, 2008, 14:16 +0200, "Bart Van Assche" <[EMAIL PROTECTED]> wrote: > On Feb 5, 2008 6:01 PM, Erez Zilber <[EMAIL PROTECTED]> wrote: >> Using such large values for FirstBurstLength will give you poor >> performance numbers for WRITE commands (with iSER). FirstBurstLength >> means how much data should you send as unsolicited data (i.e. without >> RDMA). It means that your WRITE commands were sent without RDMA. > > Sorry, but I'm afraid you got this wrong. When the iSER transport is > used instead of TCP, all data is sent via RDMA, including unsolicited > data. If you have look at the iSER implementation in the Linux kernel > (source files under drivers/infiniband/ulp/iser), you will see that > all data is transferred via RDMA and not via TCP/IP. Regardless of what the current implementation is, the behavior you (Bart) describe seems to disagree with http://www.ietf.org/rfc/rfc5046.txt. Benny > > Bart Van Assche. > - -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] update checkpatch.pl to version 0.14
On Feb. 06, 2008, 16:43 +0200, Andy Whitcroft <[EMAIL PROTECTED]> wrote: > On Tue, Feb 05, 2008 at 09:26:47PM +0200, Benny Halevy wrote: >> Was version 0.13 NACKed? > > It was picked up by Andrew for -mm according to my email. I do not > think there has been an -mm release since then however. This patch is > relative to that version. > > -apw OK. I was afraid it was since Andrew had some comments about white spacing in checkpatch.pl itself IIRC. Benny -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] update checkpatch.pl to version 0.14
Was version 0.13 NACKed? Benny -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re:
am kara wrote: > hello, > > If kernel does kmap_atomic(temporary kernel mapping) > on behalf of a process by a cpu, does the process will > continue to run and no other process can be scheduled > to switch it off?(till kunmap_atomic is done) Effectively, kmap_atomic implementations call pagefault_disable and that in turn is equivalent to preempt_disable() so the answer to your question seems to be "yes". Benny -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] update checkpatch.pl to version 0.13
On Jan. 18, 2008, 13:37 +0200, Andy Whitcroft <[EMAIL PROTECTED]> wrote: > On Thu, Jan 17, 2008 at 11:19:23AM -0800, Andrew Morton wrote: >> On Thu, 17 Jan 2008 16:23:51 - Andy Whitcroft <[EMAIL PROTECTED]> wrote: >> >>> This version brings a large number of fixes which have built up over >>> the Christmas period. Mostly these are fixes for false positives, both >>> through improvments to unary checks and possible type detection. It >>> also brings new checks for while location and CVS keywords. >> heh. Doctor, heal thyself. > > Heh, yeah I was feeling pressure to push out the update and forgot to > check it. Spanner. > > I have fixed the three lines which have random tabs on them. Its > something I do in vi which is adding them, one day I will figure out > what the heck it is I do. autoindent set by any chance? try :set noai If you you like it better this way you can configure it in your vi{,m}rc Benny > > -apw > > --- > clean up some space violations in checkpatch.pl > > Seems that something I do in vi leaves lines with multiple tabs on > them lying about. Clean these up before edit things more. > > Signed-off-by: Andy Whitcroft <[EMAIL PROTECTED]> > --- > checkpatch.pl |6 +++--- > 1 files changed, 3 insertions(+), 3 deletions(-) > > diff --git a/checkpatch.pl b/checkpatch.pl > index 07ba401..a2b4c41 100755 > --- a/checkpatch.pl > +++ b/checkpatch.pl > @@ -341,7 +341,7 @@ sub sanitise_line { > my $clean = 'X' x length($1); > $res =~ s@(#\s*(?:error|warning)\s+)[EMAIL PROTECTED]@; > } > - > + > return $res; > } > > @@ -947,7 +947,7 @@ sub process { > if ($realcnt) { > # Ignore goto labels. > if ($line =~ /$Ident:\*$/) { > - > + > # Ignore functions being called > } elsif ($line =~ /^.\s*$Ident\s*\(/) { > > @@ -1190,7 +1190,7 @@ sub process { > > # Ignore those directives where spaces _are_ permitted. > if ($name =~ > /^(?:if|for|while|switch|return|volatile|__volatile__|__attribute__|format|__extension__|Copyright|case)$/) > { > - > + > # cpp #define statements have non-optional spaces, ie > # if there is a space between the name and the open > # parenthesis it is simply not a parameter group. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Checkpatch.pl failure
On Jan. 14, 2008, 17:48 +0200, James Bottomley <[EMAIL PROTECTED]> wrote: > This error: > > ERROR: no space before that close parenthesis ')' > #501: FILE: drivers/scsi/dpt_i2o.c:2299: > + if (dev_status == 0x02 /*CHECK_CONDITION*/) { > > Is definitely wrong. I think it's stripped the comments so now the if > looks to have a space before the bracket, but stylistically the > complaint it has errored out for is wrong. I've seen similar complaints as well and I agree with James that they seem bogus. I think that the comment should be treated as part of the grammar and not just stripped out and then you can even add checks about allowed spacing before and after the comment. > > James -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] teach checkpatch.pl about list_for_each
On Jan. 03, 2008, 14:30 +0200, Arnaldo Carvalho de Melo <[EMAIL PROTECTED]> wrote: > Em Thu, Jan 03, 2008 at 12:26:10PM +, Christoph Hellwig escreveu: >> On Thu, Jan 03, 2008 at 11:10:58AM +, Andy Whitcroft wrote: >>> We have had some stabs at changing this, but no consensus was reached on >>> whether it was a "for" or a "function". My memory is of there being >>> slightly more "without a space" tenders than the other and so it has not >>> been changed. This thread also seems so far to have not really >>> generated a concensus. So I would tend to leave things as they are. >>> >>> A third option might be to accept either on *for_each* constructs. >>> That might tend to lead to divergance. Difficult. However, also see my >>> later comments on "style guide". >> Pretty much all core code uses list_for_each_entry( so new code should >> follow that example. > > Agreed, CodingStyle is not about mindless consistency such as "for (" is > the right thing, so "list_for_each (" is consistent with it, it is about > codifying practice contributors got used to over the years. > Why mindless? Coding style is also about giving the coding language logic a graphical representation. Following a convention that flow control keywords such as "if", "for", or "while" are distinguished from function calls by use of a space after the keyword really helps the code readability regardless of how people used to code it in the past... The for_each_* macros are clearly not function calls but rather translate to for () flow control constructs hence they should follow the same convention. FWIW, I think that changing the existing convention is worth it in this case. Benny > - Arnaldo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] checkpatch.pl: recognize the #elif preprocessor directive
checkpatch.pl does not recognize #elif as a preprocessor directive causing it to print bogus errors for, e.g.: ERROR: need consistent spacing around '&' (ctx:WxV) when the operator is not recognized as unary in this context. for example: void foo(void) { int x, y, z; void *p[1] = { #if defined(X) &x #elif defined(Y) &y #else &z #endif }; } Signed-off-by: Benny Halevy <[EMAIL PROTECTED]> --- scripts/checkpatch.pl |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index 579f50f..9911c17 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -534,7 +534,7 @@ sub annotate_values { $preprocessor = 1; $paren_type[$paren] = 'N'; - } elsif ($cur =~ /^(#\s*(?:ifdef|ifndef|if|else|endif))/o) { + } elsif ($cur =~ /^(#\s*(?:ifdef|ifndef|if|else|elif|endif))/o) { print "PRE($1)\n" if ($debug); $preprocessor = 1; $type = 'N'; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: void* arithmnetic
On Nov. 29, 2007, 3:19 +0200, "Ming Lei" <[EMAIL PROTECTED]> wrote: > 2007/11/29, Jan Engelhardt <[EMAIL PROTECTED]>: >> On Nov 29 2007 01:05, J.A. Magallón wrote: >>> Since begin of the ages the build of the nvidia driver says things like >>> this: >>> >> Explicitly adding -Wpointer-arith to ones own Makefile is like >> admitting the code might be problematic. :-> >> >> >> I think sizeof(void *) == 1 is taken as granted as sizeof(int) >= 4 >> these days. Sigh. > sizeof(void *) == 4, sizeof(void)==1, :) well, sizeof(void *) == sizeof(unsigned long) maybe :) >> - >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to [EMAIL PROTECTED] >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ >> > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.24-rc2 XFS nfsd hang
I wonder if this is a similar hang to what Christian was seeing here: http://lkml.org/lkml/2007/11/13/319 Benny On Nov. 14, 2007, 9:04 +0200, Chris Wedgwood <[EMAIL PROTECTED]> wrote: > With 2.6.24-rc2 (amd64) I sometimes (usually but perhaps not always) > see a hang when accessing some NFS exported XFS filesystems. Local > access to these filesystems ahead of time works without problems. > > This does not occur with 2.6.23.1. The filesystem does not appear to > be corrupt. > > > The call chain for the wedged process is: > > [ 1462.911256] nfsd D 80547840 4760 2966 2 > [ 1462.911283] 81010414d4d0 0046 > 81010414d610 > [ 1462.911322] 810104cbc6e0 81010414d480 80746dc0 > 80746dc0 > [ 1462.911360] 80744020 80746dc0 81010129c140 > 8101000ad100 > [ 1462.911391] Call Trace: > [ 1462.911417] [] __down+0xe9/0x101 > [ 1462.911437] [] default_wake_function+0x0/0xe > [ 1462.911458] [] __down_failed+0x35/0x3a > [ 1462.911480] [] _xfs_buf_find+0x84/0x24d > [ 1462.911501] [] _xfs_buf_find+0x193/0x24d > [ 1462.911522] [] xfs_buf_lock+0x43/0x45 > [ 1462.911543] [] _xfs_buf_find+0x1ba/0x24d > [ 1462.911564] [] xfs_buf_get_flags+0x5a/0x14b > [ 1462.911586] [] xfs_buf_read_flags+0x12/0x86 > [ 1462.911607] [] xfs_trans_read_buf+0x4c/0x2cf > [ 1462.911629] [] xfs_da_do_buf+0x41b/0x65b > [ 1462.911652] [] xfs_da_read_buf+0x24/0x29 > [ 1462.911673] [] xfs_dir2_block_lookup_int+0x4d/0x1ab > [ 1462.911694] [] xfs_dir2_block_lookup_int+0x4d/0x1ab > [ 1462.911717] [] xfs_dir2_block_lookup+0x15/0x8e > [ 1462.911738] [] xfs_dir_lookup+0xd2/0x12c > [ 1462.911761] [] submit_bio+0x10d/0x114 > [ 1462.911781] [] xfs_dir_lookup_int+0x2c/0xc5 > [ 1462.911802] [] lockdep_init_map+0x90/0x495 > [ 1462.911823] [] xfs_lookup+0x44/0x6f > [ 1462.911843] [] xfs_vn_lookup+0x29/0x60 > [ 1462.915246] [] __lookup_hash+0xe5/0x109 > [ 1462.915267] [] lookup_one_len+0x41/0x4e > [ 1462.915289] [] compose_entry_fh+0xc1/0x117 > [ 1462.915311] [] encode_entry+0x17c/0x38b > [ 1462.915333] [] find_or_create_page+0x3f/0xc9 > [ 1462.915355] [] _xfs_buf_lookup_pages+0x2c1/0x2f6 > [ 1462.915377] [] _spin_unlock+0x1f/0x49 > [ 1462.915399] [] cache_alloc_refill+0x1ba/0x4b9 > [ 1462.915424] [] nfs3svc_encode_entry_plus+0x0/0x13 > [ 1462.915448] [] nfs3svc_encode_entry_plus+0x10/0x13 > [ 1462.915469] [] xfs_dir2_block_getdents+0x15b/0x1e2 > [ 1462.915491] [] nfs3svc_encode_entry_plus+0x0/0x13 > [ 1462.915514] [] nfs3svc_encode_entry_plus+0x0/0x13 > [ 1462.915534] [] xfs_readdir+0x91/0xb6 > [ 1462.915557] [] nfs3svc_encode_entry_plus+0x0/0x13 > [ 1462.915579] [] xfs_file_readdir+0x31/0x40 > [ 1462.915599] [] vfs_readdir+0x61/0x93 > [ 1462.915619] [] nfs3svc_encode_entry_plus+0x0/0x13 > [ 1462.915642] [] nfsd_readdir+0x6d/0xc5 > [ 1462.915663] [] nfsd3_proc_readdirplus+0x114/0x204 > [ 1462.915686] [] nfsd_dispatch+0xde/0x1b6 > [ 1462.915706] [] svc_process+0x3f8/0x717 > [ 1462.915729] [] nfsd+0x1a9/0x2c1 > [ 1462.915749] [] child_rip+0xa/0x12 > [ 1462.915769] [] __svc_create_thread+0xea/0x1eb > [ 1462.915792] [] nfsd+0x0/0x2c1 > [ 1462.915812] [] child_rip+0x0/0x12 > > Over time other processes pile up beind this. > > [ 1462.910728] nfsd D 5440 2965 2 > [ 1462.910769] 8101040cdd40 0046 0001 > 810103471900 > [ 1462.910812] 8101029a72c0 8101040cdcf0 80746dc0 > 80746dc0 > [ 1462.910852] 80744020 80746dc0 81010008e0c0 > 8101012a1040 > [ 1462.910882] Call Trace: > [ 1462.910909] [] nfsd_permission+0x95/0xeb > [ 1462.910931] [] vfs_readdir+0x46/0x93 > [ 1462.910950] [] mutex_lock_nested+0x165/0x27c > [ 1462.910971] [] _spin_unlock+0x1f/0x49 > [ 1462.910994] [] nfs3svc_encode_entry_plus+0x0/0x13 > [ 1462.911015] [] vfs_readdir+0x46/0x93 > [ 1462.911037] [] nfs3svc_encode_entry_plus+0x0/0x13 > [ 1462.911057] [] nfsd_readdir+0x6d/0xc5 > [ 1462.911079] [] nfsd3_proc_readdirplus+0x114/0x204 > [ 1462.911102] [] nfsd_dispatch+0xde/0x1b6 > [ 1462.911122] [] svc_process+0x3f8/0x717 > [ 1462.911143] [] nfsd+0x1a9/0x2c1 > [ 1462.911165] [] child_rip+0xa/0x12 > [ 1462.911184] [] __svc_create_thread+0xea/0x1eb > [ 1462.911206] [] nfsd+0x0/0x2c1 > [ 1462.911225] [] child_rip+0x0/0x12 > > > Any suggestions other than to bisect this? (Bisection might be > painful as it crosses the x86-merge.) > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read
Re: linux-nfs created at vger
Dave, this sounds like a good idea. How about cross posting this message also to [EMAIL PROTECTED] Benny On Nov. 12, 2007, 10:16 +0200, David Miller <[EMAIL PROTECTED]> wrote: > Because the sourceforge lists are a huge collection of spam and > subscriber-posting only, and someone reminded me of this recently, > I've decided to sort-of force the issue wrt. the NFS mailing lists by > putting up a [EMAIL PROTECTED] > > I'll let the masses decide whether to use it or not :-) > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Coding Style: indenting with tabs vs. spaces
On Nov. 08, 2007, 17:58 +0200, Chris Snook <[EMAIL PROTECTED]> wrote: > Benny Halevy wrote: >> Greetings, >> >> I would like to hear peoples opinion about the indentation convention >> described below that I personally found the most practical with >> several different editors. >> >> The gist of it is that tabs should be used for nesting, not for decoration. >> Indent your code with as many tabs as your nesting level, where all >> statements >> will begin, and from there on use space characters. >> The rational behind it is to be tab-width agnostic so regardless of your >> tab expansion setup, the code will look correct and will make sense. >> >> When you break a line and want the new line text to start below a specific >> point >> relative to the previous line (I consider that "decorating") then start the >> new >> line with the same number of tabs as the previous one and then just use space >> characters as their width is the same as any character in the previous line, >> (assuming fixed-width fonts of course). > > I find it meaningful to indent extended lines one extra tab stop, but beyond > that I agree it is just decoration. Yup, that's a valid convention, as long as there are no trailing spaces after that extra tab stop. Concatenating spaces to this one extra tab stop (as checkpatch allows for up to 7 spaces) for decoration works well just as long as everybody expand tabs the same way. Benny > > -- Chris > - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Coding Style: indenting with tabs vs. spaces
On Nov. 11, 2007, 11:23 +0200, James Courtier-Dutton <[EMAIL PROTECTED]> wrote: > DervishD wrote: >> Bonjour Xavier :) >> >> * Xavier Bestel <[EMAIL PROTECTED]> dixit: >> >>> Le samedi 10 novembre 2007 à 13:04 +0100, DervishD a écrit : >>> >>>> * Benny Halevy <[EMAIL PROTECTED]> dixit: >>>> >>>>> I would like to hear peoples opinion about the indentation convention >>>>> described below that I personally found the most practical with >>>>> several different editors. >>>>> >>>> While I respect you opinion about tabs, I find tab indentation the most >>>> evil thing ever invented. Even if done right (that is, not indenting >>>> using a mixture of spaces and tabs), the only advantage is that you save >>>> a few bytes. >>>> >>> Who cares ? >>> >> About the space saving? Not me, of course. It's just that I didn't see >> any other advantage. >> >> >>> The only advantage is that people can make tabs as big (or as small) >>> as they wish. Tabs become "logical indentation". So one's indentation >>> isn't forced on anotherone's editor. >>> >> The only way of having a sane indentation using tabs is to make sure >> that ALL indentation are tabs, not a mix of tabs and spaces (spaces, if >> any, should be at the end of indentation for aesthetical purposes, but >> should be removed without the logical indentation being lost). A good >> editor can ensure that all indentation are tabs and not a mix, but a >> good editor can adapt indentation to your likings when loading the file >> and save the file translating your favourite indentation back to spaces >> or whatever. >> >> If everybody used tabs correctly, indenting using tabs would be great, >> but IMHO indenting with spaces is much better. >> >> Raúl Núñez de Arenas Coronado >> >> > > Just use the linux > > ./scripts/checkpatch.pl --file > > It does all the indent checks for you before you submit a patch. > I.e. I checks that one has not mixed tabs with spaces etc. > So, any patches to the Linux kernel will have tabs used correctly. > Where is the problem? checkpatch allows to indent with any number of tabs and up to 7 spaces. This is consistent with Documentation/CodingStyle and therefore can be considered "correct". However, forcing everybody to the same tab expansion setup is too limiting, especially when working in several environments at the same time where some of them may not be the linux kernel. Using only spaces as DervishD suggested works around that using brute force by forcing the user to the author's preference which is legitimate but may not be the most productive way. I think that my proposal of using tabs as logical indents only (as Xav put it) and spaces for decorative alignment provides the best of both worlds. One can expand the tabs to any number of spaces as one likes and then the trailing spaces will work on any editor setup as long as you use fixed-width font. That's not considered "correct" as per checkpatch but works much better for me. Benny > > James > > > - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Coding Style: indenting with tabs vs. spaces
On Nov. 10, 2007, 14:27 +0200, Xavier Bestel <[EMAIL PROTECTED]> wrote: > Le samedi 10 novembre 2007 à 13:04 +0100, DervishD a écrit : >> Hi Benny :) >> >> * Benny Halevy <[EMAIL PROTECTED]> dixit: >>> I would like to hear peoples opinion about the indentation convention >>> described below that I personally found the most practical with >>> several different editors. >> While I respect you opinion about tabs, I find tab indentation the most >> evil thing ever invented. Even if done right (that is, not indenting >> using a mixture of spaces and tabs), the only advantage is that you save >> a few bytes. > > Who cares ? > The only advantage is that people can make tabs as big (or as small) as > they wish. Tabs become "logical indentation". So one's indentation isn't > forced on anotherone's editor. Right. That's exactly the point. I find it harder to read someone else's code if (s)he uses 2 space indentation. With tabs, when done right, I can expand to my personal preference. > > Xav > > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Coding Style: indenting with tabs vs. spaces
Greetings, I would like to hear peoples opinion about the indentation convention described below that I personally found the most practical with several different editors. The gist of it is that tabs should be used for nesting, not for decoration. Indent your code with as many tabs as your nesting level, where all statements will begin, and from there on use space characters. The rational behind it is to be tab-width agnostic so regardless of your tab expansion setup, the code will look correct and will make sense. When you break a line and want the new line text to start below a specific point relative to the previous line (I consider that "decorating") then start the new line with the same number of tabs as the previous one and then just use space characters as their width is the same as any character in the previous line, (assuming fixed-width fonts of course). For example: { if (very_long_expression && it_needs_to_be_broken_into_several_lines) return a_very_long_result + the_remainder_of_it_that_spilled_off + to_the_next_lines; return printk("just my %d cents\n", 2); } Thanks, Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] NFS: Stop sillyname renames and unmounts from racing
On Nov. 06, 2007, 7:06 +0200, Andrew Morton <[EMAIL PROTECTED]> wrote: > On Sat, 03 Nov 2007 07:09:25 -0400 Steve Dickson <[EMAIL PROTECTED]> wrote: > >> The following patch stops NFS sillyname renames and umounts from racing. > > (appropriate cc's added) > >> I have a test script does the following: >> 1) start nfs server >> 2) mount loopback >> 3) open file in background >> 4) remove file >> 5) stop nfs server >> 6) kill -9 process which has file open >> 7) restart nfs server >> 8) umount looback mount. >> >> After umount I got the "VFS: Busy inodes after unmount" message >> because the processing of the rename has not finished. >> >> Below is a patch that the uses the new silly_count mechanism to >> synchronize sillyname processing and umounts. The patch introduces a >> nfs_put_super() routine that waits until the nfsi->silly_count count >> is zero. >> >> A side-effect of finding and waiting for all the inode to >> find the sillyname processing, is I need to traverse >> the sb->s_inodes list in the supper block. To do that >> safely the inode_lock spin lock has to be held. So for >> modules to be able to "see" that lock I needed to >> EXPORT_SYMBOL_GPL() it. >> >> Any objections to exporting the inode_lock spin lock? >> If so, how should modules _safely_ access the s_inode list? >> >> steved. >> >> >> Author: Steve Dickson <[EMAIL PROTECTED]> >> Date: Wed Oct 31 12:19:26 2007 -0400 >> >> Close a unlink/sillyname rename and umount race by added a >> nfs_put_super routine that will run through all the inode >> currently on the super block, waiting for those that are >> in the middle of a sillyname rename or removal. >> >> This patch stop the infamous "VFS: Busy inodes after unmount... " >> warning during umounts. >> >> Signed-off-by: Steve Dickson <[EMAIL PROTECTED]> >> >> diff --git a/fs/inode.c b/fs/inode.c >> index ed35383..da9034a 100644 >> --- a/fs/inode.c >> +++ b/fs/inode.c >> @@ -81,6 +81,7 @@ static struct hlist_head *inode_hashtable __read_mostly; >>* the i_state of an inode while it is in use.. >>*/ >> DEFINE_SPINLOCK(inode_lock); >> +EXPORT_SYMBOL_GPL(inode_lock); > > That's going to make hch unhappy. > > Your email client is performing space-stuffing. > See http://mbligh.org/linuxdocs/Email/Clients/Thunderbird > >> static struct file_system_type nfs_fs_type = { >> .owner = THIS_MODULE, >> @@ -223,6 +225,7 @@ static const struct super_operations nfs_sops = { >> .alloc_inode= nfs_alloc_inode, >> .destroy_inode = nfs_destroy_inode, >> .write_inode= nfs_write_inode, >> +.put_super = nfs_put_super, >> .statfs = nfs_statfs, >> .clear_inode= nfs_clear_inode, >> .umount_begin = nfs_umount_begin, >> @@ -1767,6 +1770,30 @@ static void nfs4_kill_super(struct super_block *sb) >> nfs_free_server(server); >> } >> >> +void nfs_put_super(struct super_block *sb) > > This was (correctly) declared to be static. We should define it that way > too (I didn't know you could do this, actually). > >> +{ >> +struct inode *inode; >> +struct nfs_inode *nfsi; >> +/* >> + * Make sure there are no outstanding renames >> + */ >> +relock: >> +spin_lock(&inode_lock); >> +list_for_each_entry(inode, &sb->s_inodes, i_sb_list) { >> +nfsi = NFS_I(inode); >> +if (atomic_read(&nfsi->silly_count) > 0) { >> +/* Keep this inode around during the wait */ >> +atomic_inc(&inode->i_count); >> +spin_unlock(&inode_lock); >> +wait_event(nfsi->waitqueue, >> +atomic_read(&nfsi->silly_count) == 1); >> +iput(inode); >> +goto relock; >> +} >> +} >> +spin_unlock(&inode_lock); >> +} > > That's an O(n^2) search. If it is at all possible to hit a catastrophic > slowdown in here, you can bet that someone out there will indeed hit it in > real life. > > I'm too lazy to look, but we might need to check things like I_FREEING > and I_CLEAR before taking a ref on this inode. It'd be very nice if the silly renamed inodes (with silly_count > 1) were moved to a different list in the first pass, under the inode_lock, and then waited on until silly_count <= 1 in a second pass only on the filtered list. This will provide you with O(1). - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 09/10] Change table chaining layout
On Oct. 25, 2007, 17:40 +0200, Linus Torvalds <[EMAIL PROTECTED]> wrote: > > On Thu, 25 Oct 2007, Rusty Russell wrote: >> On Wednesday 24 October 2007 01:22:55 Linus Torvalds wrote: >>> Well, I'd personally actually prefer to *not* have the count be passed >>> down explicitly, because it's just too error prone. >> Well, the duplication is bad, but walking lists to find the length is >> inefficient so you pass around the length as well. > > Nobody should *ever* walk the list to find the length. Does anybody really > do that? Yes, we pass the thing down, but do people *need* it? > > [ Side note: some of the users of that length currently would seem to be > buggy in the presense of continuation entries, and seem to assume that > the "list" is just a contiguous array. In fatc, that's almost the only > valid use for the "count" thing, since any other use _has_ to walk it > entry by entry anyway, no? ] > > The thing is, nobody should care. You walk the list to fill things in, or > to write it out to some HW-specific DMA table, you should never care about > the length. However, you *do* care about the "where does it end" part: to > be able to detect overflows (which should never happen, but from a > debugging standpoint it needs to be detectable rather than just silently > use or corrupt memory). > > But if people really want/need the length, then we damn well should have a > "header" thing, not two independent "list + length" parameters. There are a number of indicators that need to be kept in sync, depending on the usage. The number of entries is set when it is allocated and is currently needed to free it up (note that in the sgtable "sketch" James proposed we saved the sg pool index in the sgtable header and used it to free each chunk to the right pool). The number of entries is then used in many places to scan the list, however, after the sg list is dma mapped, the dma mapping may be shorter than the original sg list when multiple pages are coalesced and there we need to defer to use the dma_length (plus the number of entries) to determine the end of the list. IMO I think that the byte count can be used authoritatively to scan the contents of the sg list either before or after dma mapping while the number of entries is relevant to walking the list in a context free manner (i.e. to go over all the entries that were allocated) > > Linus > - - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] 2.6.23-git18 Kernel oops in sg helpers
On Oct. 24, 2007, 10:50 +0200, Benny Halevy <[EMAIL PROTECTED]> wrote: > On Oct. 24, 2007, 10:32 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote: >> On Wed, Oct 24 2007, FUJITA Tomonori wrote: >>> On Tue, 23 Oct 2007 20:49:40 +0530 >>> Kamalesh Babulal <[EMAIL PROTECTED]> wrote: >>> >>>> Hi, >>>> >>>> Kernel oops is triggered while running fsx-linux test, followed by cpu >>>> softlock >>>> over the AMD box >>>> >>>> Unable to handle kernel NULL pointer dereference at 0018 RIP: >>>> [] gart_map_sg+0x26c/0x406 >>>> PGD 10185b067 PUD 10075b067 PMD 0 >>> Does this work? >>> >>> >>> diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c >>> index c56e9ee..ae7e016 100644 >>> --- a/arch/x86/kernel/pci-gart_64.c >>> +++ b/arch/x86/kernel/pci-gart_64.c >>> @@ -338,7 +338,6 @@ static int __dma_map_cont(struct scatterlist *start, >>> int nelems, >>> >>> BUG_ON(s != start && s->offset); >>> if (s == start) { >>> - *sout = *s; >>> sout->dma_address = iommu_bus_base; >>> sout->dma_address += iommu_page*PAGE_SIZE + s->offset; >>> sout->dma_length = s->length; >>> @@ -365,7 +364,7 @@ static inline int dma_map_cont(struct scatterlist >>> *start, int nelems, >>> { >>> if (!need) { >>> BUG_ON(nelems != 1); >>> - *sout = *start; >>> + sout->dma_address = start->dma_address; > > I don't see this could fix anything since "s" above and "start" here are still > dereferenced. Also, this makes sout->dma_address inconsistent with > sout->page_link > and with the end marker. OK, it took me a day to figure out why the fix is working :) The end of list marker was copied into sout and later, in line 432 sg_next(sgmap) returned NULL since sgmap became the last entry in the list (which is strangely correct in the dma mapped vector). 431:if (out < nents) { 432:sgmap = sg_next(sgmap); 433:sgmap->dma_length = 0; 434:} Alas, the dma mapping convention apparently requires dma_length == 0 as a terminator if the "compressed" list for dma mapping is shorter than the sg list. Although this change does not keep each sg->dma_address in sync with each sg->page_link, previously there was nothing to keep sg->length in sync with sg->dma_length so I actually think that keeping the dma mapping and the page mappings orthogonal and independent may be even better since the original sg list can still be reused safely even after dma mapping. > > Benny > >>> sout->dma_length = start->length; >>> return 0; >>> } >>> -- >>> 1.5.2.4 >> Care to write up a proper changelog? >> > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [BUG] 2.6.23-git18 Kernel oops in sg helpers
On Oct. 24, 2007, 10:32 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote: > On Wed, Oct 24 2007, FUJITA Tomonori wrote: >> On Tue, 23 Oct 2007 20:49:40 +0530 >> Kamalesh Babulal <[EMAIL PROTECTED]> wrote: >> >>> Hi, >>> >>> Kernel oops is triggered while running fsx-linux test, followed by cpu >>> softlock >>> over the AMD box >>> >>> Unable to handle kernel NULL pointer dereference at 0018 RIP: >>> [] gart_map_sg+0x26c/0x406 >>> PGD 10185b067 PUD 10075b067 PMD 0 >> Does this work? >> >> >> diff --git a/arch/x86/kernel/pci-gart_64.c b/arch/x86/kernel/pci-gart_64.c >> index c56e9ee..ae7e016 100644 >> --- a/arch/x86/kernel/pci-gart_64.c >> +++ b/arch/x86/kernel/pci-gart_64.c >> @@ -338,7 +338,6 @@ static int __dma_map_cont(struct scatterlist *start, int >> nelems, >> >> BUG_ON(s != start && s->offset); >> if (s == start) { >> -*sout = *s; >> sout->dma_address = iommu_bus_base; >> sout->dma_address += iommu_page*PAGE_SIZE + s->offset; >> sout->dma_length = s->length; >> @@ -365,7 +364,7 @@ static inline int dma_map_cont(struct scatterlist >> *start, int nelems, >> { >> if (!need) { >> BUG_ON(nelems != 1); >> -*sout = *start; >> +sout->dma_address = start->dma_address; I don't see this could fix anything since "s" above and "start" here are still dereferenced. Also, this makes sout->dma_address inconsistent with sout->page_link and with the end marker. Benny >> sout->dma_length = start->length; >> return 0; >> } >> -- >> 1.5.2.4 > > Care to write up a proper changelog? > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 09/10] Change table chaining layout
On Oct. 22, 2007, 22:16 +0200, Alan Cox <[EMAIL PROTECTED]> wrote: > On Mon, 22 Oct 2007 12:49:40 -0700 (PDT) > Linus Torvalds <[EMAIL PROTECTED]> wrote: > >> >> On Mon, 22 Oct 2007, Geert Uytterhoeven wrote: >>> Better safe than sorry... >>> >>> Is it possible that a chain entry pointer has bit 1 set on architectures >>> (e.g. m68k) where the natural alignment of 32-bit quantities is _2_ bytes, >>> not 4? >> Better make sure that such alignment never happens... But no, I don't >> think it will, since these things would generally always have to be >> allocated with an allocator, and the *allocator* won't return 2-byte >> aligned data structures. > > No - but a structure which has other objects in it before the object > being written out may well be 2 byte aligned on M68K and some of the > other externally 16bit platforms - ditto local dynamic objects. > > Why can't we just make the list one item longer than the entry count and > stick a NULL on the end of it like normal people ? Then you need one bit > which ought to be safe for everyone (and if the bit is a macro any CPU > warped enough to have byte alignment is surely going to have top bits > spare...) Alternatively, I proposed to check for end of list in sg_next by calling it with the next iterator value and number of list elements. We tried that patch here and it seems like a reasonable alternative. If folks are interested, I can send the full patch for review. > > Alan > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 08/10] [SG] Update arch/ to use sg helpers
It looks like it could be nice to define and use a helper for page_address(sg_page(sg)) (although 11 call sites could use it after this patch) #define sg_pgaddr(sg) page_address(sg_page(sg)) Note that mips sg_{un,}map_sg checked for page_address(sg->page) != 0 before calling __dma_sync(addr + sg->offset, sg->length, direction) and you changed it to addr = (unsigned long) sg_virt(sg) which takes sg->offset into account. That said I'm not sure if the original code was correct for the (page_address(sg->page) == 0 && sg->offset != 0) case... On Oct. 22, 2007, 20:11 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote: > diff --git a/arch/mips/mm/dma-default.c b/arch/mips/mm/dma-default.c > index 98b5e5b..b0b034c 100644 > --- a/arch/mips/mm/dma-default.c > +++ b/arch/mips/mm/dma-default.c > @@ -165,12 +165,11 @@ int dma_map_sg(struct device *dev, struct scatterlist > *sg, int nents, > for (i = 0; i < nents; i++, sg++) { > unsigned long addr; > > - addr = (unsigned long) page_address(sg->page); > + addr = (unsigned long) sg_virt(sg); > if (!plat_device_is_coherent(dev) && addr) > - __dma_sync(addr + sg->offset, sg->length, direction); > + __dma_sync(addr, sg->length, direction); > sg->dma_address = plat_map_dma_mem(dev, > -(void *)(addr + sg->offset), > -sg->length); > +(void *)addr, sg->length); > } > > return nents; > @@ -223,10 +222,9 @@ void dma_unmap_sg(struct device *dev, struct scatterlist > *sg, int nhwentries, > for (i = 0; i < nhwentries; i++, sg++) { > if (!plat_device_is_coherent(dev) && > direction != DMA_TO_DEVICE) { > - addr = (unsigned long) page_address(sg->page); > + addr = (unsigned long) sg_virt(sg); > if (addr) > - __dma_sync(addr + sg->offset, sg->length, > -direction); > + __dma_sync(addr, sg->length, direction); > } > plat_unmap_dma_mem(sg->dma_address); > } > diff --git a/arch/x86/kernel/pci-calgary_64.c > b/arch/x86/kernel/pci-calgary_64.c > index 5098f58..1a20fe3 100644 > --- a/arch/x86/kernel/pci-calgary_64.c > +++ b/arch/x86/kernel/pci-calgary_64.c > @@ -411,8 +411,10 @@ static int calgary_nontranslate_map_sg(struct device* > dev, > int i; > > for_each_sg(sg, s, nelems, i) { > - BUG_ON(!s->page); > - s->dma_address = virt_to_bus(page_address(s->page) +s->offset); > + struct page *p = sg_page(s); > + > + BUG_ON(!p); why not just BUG_ON(!sg_page(s))? > + s->dma_address = virt_to_bus(sg_virt(s)); > s->dma_length = s->length; > } > return nelems; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [bug] ata subsystem related crash with latest -git
On Oct. 18, 2007, 16:05 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote: > On Thu, Oct 18 2007, Jens Axboe wrote: >> On Thu, Oct 18 2007, Benny Halevy wrote: >>>>return sg; >>>> } >>>> @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct >>>> scatterlist *sgl, >>>>ret = sg; >>>> >>>> #endif >>>> +#ifdef CONFIG_DEBUG_SG >>>> + BUG_ON(sgl[0].sg_magic != SG_MAGIC); >>> can it also do BUG_ON(!sg_is_last(sg))? >> That would make sense, definitely. I'll add that. > > BUG_ON(!sg_is_last(ret)); > > it should be, not sg. That's what I merged. > right. of course. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [bug] ata subsystem related crash with latest -git
On Oct. 18, 2007, 15:32 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote: > static inline struct scatterlist *sg_next(struct scatterlist *sg) > { > - sg++; > - > - if (unlikely(sg_is_chain(sg))) > +#ifdef CONFIG_DEBUG_SG > + BUG_ON(sg->sg_magic != SG_MAGIC); > +#endif > + if (sg_is_last(sg)) > + sg = NULL; > + else if (sg_is_chain(sg)) > sg = sg_chain_ptr(sg); > + else > + sg++; > Jens, again, please correct me if I'm wrong, but when sg points at the entry right before a chain entry this implementation of sg_next will return a pointer to the chain entry here, which I believe it must not. > return sg; > } > here's how I think sg_next should be implemented: */ static inline struct scatterlist *sg_next(struct scatterlist *sg) { +#ifdef CONFIG_DEBUG_SG + BUG_ON(sg->sg_magic != SG_MAGIC); +#endif + if (sg_is_last(sg)) + return NULL; + sg++; if (unlikely(sg_is_chain(sg))) sg = sg_chain_ptr(sg); return sg; } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [bug] ata subsystem related crash with latest -git
On Oct. 18, 2007, 14:15 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote: > /** > * sg_next - return the next scatterlist entry in a list > @@ -43,10 +51,15 @@ static inline void sg_init_one(struct scatterlist *sg, > const void *buf, > */ > static inline struct scatterlist *sg_next(struct scatterlist *sg) > { > - sg++; > - > - if (unlikely(sg_is_chain(sg))) > +#ifdef CONFIG_DEBUG_SG > + BUG_ON(sg->sg_magic != SG_MAGIC); > +#endif > + if (sg_is_last(sg)) > + sg = NULL; > + else if (sg_is_chain(sg)) > sg = sg_chain_ptr(sg); > + else > + sg++; Hmm, sg_next is not supposed to return a pointer to the chain entry itself, but rather skip it. I think that the fix needs only check the "last" flag before incrementing sg. +#ifdef CONFIG_DEBUG_SG + BUG_ON(sg->sg_magic != SG_MAGIC); +#endif + if (sg_is_last(sg)) + return NULL; + sg++ if (unlikely(sg_is_chain(sg))) sg = sg_chain_ptr(sg); > > return sg; > } > @@ -83,6 +96,9 @@ static inline struct scatterlist *sg_last(struct > scatterlist *sgl, > ret = sg; > > #endif > +#ifdef CONFIG_DEBUG_SG > + BUG_ON(sgl[0].sg_magic != SG_MAGIC); can it also do BUG_ON(!sg_is_last(sg))? > +#endif > return ret; > } > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [bug] block subsystem related crash with latest -git
On Oct. 17, 2007, 20:22 +0200, Jens Axboe <[EMAIL PROTECTED]> wrote: > On Wed, Oct 17 2007, Linus Torvalds wrote: >> >> On Wed, 17 Oct 2007, Jens Axboe wrote: So avoiding the "sg_next()" on the last entry is pointless. >>> Yeah, I didn't quite understand why if sg was valid, why dereferencing >>> *(sg + 1)->page would crap out :/ >> Actually, I take that back. If 'sg' is the last entry in a *non*linked >> scatter-gather list (ie we don't use the last entry as a link, we actually >> use it as a real SG entry), then "sg_next(sg)" will indeed access past the >> end of the whole allocated array, and will access one past the end. >> >> And with page-alloc debugging, that *will* blow up. >> >> So I think your change to use "sg_next()" only when you actually need a >> next pointer is the correct one after all. > > Thanks, so I'm not totally crazy :-) > > Can you just pull: > > git://git.kernel.dk/data/git/linux-2.6-block.git for-linus > > then so we get those two pieces correct? Then the remaining issue seems > to be a new one that is biting Ingo elsewhere, at least we'll all be on > the same page then. > Jens, for_each_sg still calls sg_next on the last entry which will dereference a possibly bogus sg->page (for the sg_is_chain(sg) condition in sg_next) if the last entry is the last one on the page of unchained entry and sg+1 falls over into an uninitialized page. How about the following? (untested yet. sg.c included here as an example for usage out of scatterlist.h) diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 2dc7464..3a27e03 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -30,7 +30,7 @@ static inline void sg_init_one(struct scatterlist *sg, const void *buf, ((struct scatterlist *) ((unsigned long) (sg)->page & ~0x01)) /** - * sg_next - return the next scatterlist entry in a list + * sg_next_unsafe - return the next scatterlist entry in a list * @sg:The current sg entry * * Usually the next entry will be @sg@ + 1, but if this sg element is part @@ -41,7 +41,7 @@ static inline void sg_init_one(struct scatterlist *sg, const void *buf, * the current entry, this function will NOT return NULL for an end-of-list. * */ -static inline struct scatterlist *sg_next(struct scatterlist *sg) +static inline struct scatterlist *sg_next_unsafe(struct scatterlist *sg) { sg++; @@ -51,11 +51,27 @@ static inline struct scatterlist *sg_next(struct scatterlist *sg) return sg; } +/** + * sg_next - return the next scatterlist entry in a list + * @sg:The current sg entry + * @next: Index of next sg entry + * @nr:Number of sg entries in the list + * + * Note that the caller must ensure that there are further entries after + * the current entry, this function will NOT return NULL for an end-of-list. + * + */ +static inline struct scatterlist *sg_next(struct scatterlist *sg, + int next, int nr) +{ + return next < nr ? sg_next_unsafe(sg) : NULL; +} + /* * Loop over each sg element, following the pointer to a new list if necessary */ #define for_each_sg(sglist, sg, nr, __i) \ - for (__i = 0, sg = (sglist); __i < (nr); __i++, sg = sg_next(sg)) + for (__i = 0, sg = (sglist); sg; sg = sg_next(sg, ++__i, nr)) /** * sg_last - return the last scatterlist entry in a list diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c index 7238b2d..57cc1dd 100644 --- a/drivers/scsi/sg.c +++ b/drivers/scsi/sg.c @@ -1165,7 +1165,7 @@ sg_vma_nopage(struct vm_area_struct *vma, unsigned long addr, int *type) sg = rsv_schp->buffer; sa = vma->vm_start; for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end); -++k, sg = sg_next(sg)) { +sg = sg_next(sg, ++k, rsv_schp->k_use_sg)) { len = vma->vm_end - sa; len = (len < sg->length) ? len : sg->length; if (offset < len) { @@ -1209,7 +1209,7 @@ sg_mmap(struct file *filp, struct vm_area_struct *vma) sa = vma->vm_start; sg = rsv_schp->buffer; for (k = 0; (k < rsv_schp->k_use_sg) && (sa < vma->vm_end); -++k, sg = sg_next(sg)) { +sg = sg_next(sg, ++k, rsv_schp->k_use_sg)) { len = vma->vm_end - sa; len = (len < sg->length) ? len : sg->length; sa += len; @@ -1840,7 +1840,7 @@ sg_build_indirect(Sg_scatter_hold * schp, Sg_fd * sfp, int buff_size) } for (k = 0, sg = schp->buffer, rem_sz = blk_size; (rem_sz > 0) && (k < mx_sc_elems); -++k, rem_sz -= ret_sz, sg = sg_next(sg)) { +rem_sz -= ret_sz, sg = sg_next(sg, ++k, mx_sc_elems)) { num = (rem_sz > scatter_elem_sz_prev) ? scatter_elem_sz_prev : rem_sz; @@ -1913,7 +1913,7 @@ sg_write_xfer(Sg_request * srp) if (res)
Re: [PATCH] x86-64: pci-gart iommu sg chaining zeroes wrong sg.
On Sep 27, 2007, 18:46 +0200, FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > On Fri, 28 Sep 2007 01:38:27 +0900 > FUJITA Tomonori <[EMAIL PROTECTED]> wrote: > >> This patch is for Jens' block tree (sg chaining branch). >> >> I don't have the hardware but this looks like a bug. >> >> --- >> From: FUJITA Tomonori <[EMAIL PROTECTED]> >> Subject: [PATCH] x86-64: pci-gart iommu sg chaining zeroes a wrong sg's >> dma_length >> >> Needs to zero the end of the list. >> >> Signed-off-by: FUJITA Tomonori <[EMAIL PROTECTED]> >> --- >> arch/x86_64/kernel/pci-gart.c |3 ++- >> 1 files changed, 2 insertions(+), 1 deletions(-) >> >> diff --git a/arch/x86_64/kernel/pci-gart.c b/arch/x86_64/kernel/pci-gart.c >> index 27b7db4..a4151a7 100644 >> --- a/arch/x86_64/kernel/pci-gart.c >> +++ b/arch/x86_64/kernel/pci-gart.c >> @@ -425,9 +425,10 @@ int gart_map_sg(struct device *dev, struct scatterlist >> *sg, int nents, int dir) >> if (dma_map_cont(start_sg, i - start, sgmap, pages, need) < 0) >> goto error; >> out++; >> +sgmap = sg_next(sgmap); >> flush_gart(); >> if (out < nents) >> -ps->dma_length = 0; >> +sgmap->dma_length = 0; >> return out; > > Sorry, it should be: > > > diff --git a/arch/x86_64/kernel/pci-gart.c b/arch/x86_64/kernel/pci-gart.c > index 27b7db4..cfcc84e 100644 > --- a/arch/x86_64/kernel/pci-gart.c > +++ b/arch/x86_64/kernel/pci-gart.c > @@ -426,8 +426,10 @@ int gart_map_sg(struct device *dev, struct scatterlist > *sg, int nents, int dir) > goto error; > out++; > flush_gart(); > - if (out < nents) > - ps->dma_length = 0; > + if (out < nents) { > + sgmap = sg_next(sgmap); > + sgmap->dma_length = 0; > + } looks correct to me. ps points at the previous "scanned" sg entry while you want to zero out dma_length at the entry immediately following the last entry mapped (if (out < nents)) the original code before 62296749bd421904dace1e6b0fc3c4538aac7111 was: - if (out < nents) - sg[out].dma_length = 0; Benny > return out; > > error: > - > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: printk format "%4.4s"
On Sep 17, 2007, 2:57 +0200, shinkoi2005a <[EMAIL PROTECTED]> wrote: > Hi, all > > I have a question about printk format. > > Can printk format use "%4.4s"? Yes it can. The precision part of the format determines the max number of characters to copy from the string. The 4 byte signature array might not null terminated so you must use the .4 in the format otherwise vsnprintf might look over the buffer for the terminating '\0'. > This format is used following source. > > # > drivers/acpi/tables/tbinstal.c > ACPI_ERROR((AE_INFO, > "Table has invalid signature [%4.4s], must be > SSDT, PSDT or OEMx", > table_desc->pointer->signature)); > > ## > At least, my dmesg is buggy output like that.. Hmm, looks like you should believe the error messages rather than blaming the code for buggy output ;-) Like it says, the table seems corrupt, has invalid signature and checksum. > > ## > $ dmesg > (snip) > ACPI Warning (tbutils-0158): Incorrect checksum in table [ ^E礑 - 00, should > b > e F6 [20070126] > ACPI Error (tbinstal-0134): Table has invalid signature [ ^E礑, must be SSDT, > P > SDT or OEMx [20070126] > (snip) > ## > > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] Chaining sg lists for big IO commands v5
Jens Axboe wrote: > On Mon, May 21 2007, Jens Axboe wrote: >> On Fri, May 18 2007, Badari Pulavarty wrote: >>> On Fri, 2007-05-18 at 09:35 +0200, Jens Axboe wrote: On Thu, May 17 2007, Badari Pulavarty wrote: > On Thu, 2007-05-17 at 08:27 +0200, Jens Axboe wrote: >> On Wed, May 16 2007, Badari Pulavarty wrote: >>> On Tue, 2007-05-15 at 19:50 +0200, Jens Axboe wrote: On Tue, May 15 2007, Badari Pulavarty wrote: > On Tue, 2007-05-15 at 19:20 +0200, Jens Axboe wrote: >> On Tue, May 15 2007, Badari Pulavarty wrote: >>> On Fri, 2007-05-11 at 15:51 +0200, Jens Axboe wrote: Hi, Updated version of the patch - this time I'll just attach the patch file... >>> Missing scatterlist.h inclusions.. >>> >>> drivers/scsi/sym53c8xx_2/sym_glue.c: In function ???sym_scatter???: >>> drivers/scsi/sym53c8xx_2/sym_glue.c:385: warning: implicit >>> declaration >>> of function ???for_each_sg??? >>> drivers/scsi/sym53c8xx_2/sym_glue.c:385: error: expected ???;??? >>> before ???{??? >>> token >>> drivers/scsi/sym53c8xx_2/sym_glue.c:375: warning: unused variable >>> ???tp??? >>> make[3]: *** [drivers/scsi/sym53c8xx_2/sym_glue.o] Error 1 >>> >>> >>> drivers/scsi/qla2xxx/qla_iocb.c: In function >>> ???qla24xx_build_scsi_iocbs???: >>> drivers/scsi/qla2xxx/qla_iocb.c:678: warning: implicit declaration >>> of >>> function ???for_each_sg??? >>> drivers/scsi/qla2xxx/qla_iocb.c:678: error: expected ???;??? before >>> ???{??? >>> token >> Thanks, will fix those. What arch? I tested it here. > I am playing with them on ppc64. Ah ok, you need the updated patch series for ppc64 support. Builds fine here on ppc64. See the #sglist branch of the block repo: git://git.kernel.dk/data/git/linux-2.6-block.git I can mail you an updated patch, if you want. >>> >>> Here is the whole panic stack.. >> Thanks will fix that up, the IDE part is totally untested. Can you try >> and backout this patch and see if it boots? > I increased max_segments to 1024 on my qla2200 attached disks and > simple "dd" (direct read) resulted in following: > > elm3b29:/sys/block/sdd/queue # echo 1024 > max_segments > elm3b29:/sys/block/sdd/queue # cat max_hw_sectors_kb > max_sectors_kb > elm3b29:/mnt # dd iflag=direct if=./z of=/dev/null bs=512M > > Unable to handle kernel paging request at 1008 RIP: > [] __rmqueue+0x6f/0x120 Auch, that's a bug. I don't think the oom path has been tested yet, perhaps this is hitting it. Can you try with this debug patch, plus enable the slab debugging helpers (like poisoning)? diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c index 7456992..a479d1e 100644 --- a/drivers/scsi/scsi_lib.c +++ b/drivers/scsi/scsi_lib.c @@ -793,6 +793,7 @@ struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask) return ret; enomem: if (ret) { + printk(KERN_ERR "scsi: failed to allocate sg table\n"); /* * Free entries chained off ret. Since we were trying to * allocate another sglist, we know that all entries are of >>> Not much help. I get all kinds of weird panics.. This time I got (with >>> the above debug). >>> >>> general protection fault: [1] SMP >>> CPU 1 >>> Modules linked in: jfs sg sd_mod qla2xxx firmware_class >>> scsi_transport_fc scsi_mod vfat fat ipv6 thermal processor fan button >>> battery ac dm_mod floppy parport_pc lp parport >>> Pid: 56, comm: kblockd/1 Not tainted 2.6.22-rc1-sg #8 >>> RIP: 0010:[] [] kmem_cache_alloc >>> +0x36/0x70 >>> RSP: 0018:81017abbfc10 EFLAGS: 00010002 >>> RAX: RBX: 0082 RCX: 0664 >>> RDX: 81019ff2b8a0 RSI: 00011220 RDI: 8068d120 >>> RBP: 81017abbfc20 R08: 39f8 R09: >>> R10: 81019cbee700 R11: 0188 R12: 8101df2a64e0 >>> R13: 00011220 R14: 8101df2a6510 R15: 81017abbfc50 >>> FS: 2b505b027f20() GS:81018021f300() >>> knlGS:f7da26b0 >>> CS: 0010 DS: 0018 ES: 0018 CR0: 8005003b >>> CR2: 2b505b029008 CR3: 00019af73000 CR4: 06e0 >>> Process kblockd/1 (pid: 56, threadinfo 81017abbe000, task >>> 81017a571440) >>> Stack: 7abbfc30 81017abbfc30 >>> 8025d001 >>> 81017abbfcb0 8025d122 81017abbfc60 80219dc0 >>> 880e5da6 00ad 81017abbfcd0 8021a366 >>> Call Trace: >>> [] mempool_alloc_slab+0x11/0x20 >>> [] mempool_alloc+0x42/0x110 >>> [] flush_gart+0x40/0x50
Re: [PATCH 8/12] x86-64: update iommu/dma mapping functions to sg helpers
Jens Axboe wrote: > On Thu, May 10 2007, Benny Halevy wrote: >> @@ -411,12 +406,13 @@ int gart_map_sg(struct device *dev, struct scatterlist >> *sg, int nents, int dir) >> boundary and the new one doesn't have an offset. */ >> if (!iommu_merge || !nextneed || !need || s->offset || >> (ps->offset + ps->length) % PAGE_SIZE) { >> -if (dma_map_cont(sg, start, i, sg+out, pages, >> - need) < 0) >> +if (dma_map_cont(start_sg, i - start, sg+out, >> + pages, need) < 0) >> goto error; >> out++; >> pages = 0; >> -start = i; >> +start = i; >> +start_sg = s; >> } >> } >> >> @@ -424,7 +420,7 @@ int gart_map_sg(struct device *dev, struct scatterlist >> *sg, int nents, int dir) >> pages += to_pages(s->offset, s->length); >> ps = s; >> } >> -if (dma_map_cont(sg, start, i, sg+out, pages, need) < 0) >> +if (dma_map_cont(start_sg, i - start, sg+out, pages, need) < 0) >> goto error; >> out++; >> flush_gart(); > > Your patch is (very) buggy, the whole premise of doing chained sg > entries makes sg + int illegal! > You're right. I'll send a fix asap. Benny -- Benny Halevy Software Architect Tel/Fax: +972-3-647-8340 Mobile: +972-54-802-8340 [EMAIL PROTECTED] Panasas, Inc. The Leader in Parallel Storage www.panasas.com - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
synchronization in usb_serial_put
Hi Greg, I think there is a race between usb_serial_put() and usb_serial_get_by_index() (and get_free_serial()) with regards to handling the serial port refcount. usb_serial_get_by_index() gets a reference on the serial port under table_lock while return_serial releases all the returned ports from the table under the same lock. However, the table_lock is not taken around the call to kref_put, theoretically allowing to sneak in and grab a reference after kref_put has already determined that the reference count is zero (and before calling destroy_serial) causing use after free. How about this fix? >From b91b9cffd8bbb727c6480dfb18f79655806237e6 Mon Sep 17 00:00:00 2001 From: Benny Halevy <[EMAIL PROTECTED]> Date: Tue, 15 May 2007 10:41:31 +0300 Subject: [PATCH] fix usb_serial_put synchronization Signed-off-by: Benny Halevy <[EMAIL PROTECTED]> --- drivers/usb/serial/usb-serial.c |4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/usb/serial/usb-serial.c b/drivers/usb/serial/usb-serial.c index 87f3788..4e5b996 100644 --- a/drivers/usb/serial/usb-serial.c +++ b/drivers/usb/serial/usb-serial.c @@ -120,11 +120,9 @@ static void return_serial(struct usb_serial *serial) if (serial == NULL) return; - spin_lock(&table_lock); for (i = 0; i < serial->num_ports; ++i) { serial_table[serial->minor + i] = NULL; } - spin_unlock(&table_lock); } static void destroy_serial(struct kref *kref) @@ -172,7 +170,9 @@ static void destroy_serial(struct kref *kref) void usb_serial_put(struct usb_serial *serial) { + spin_lock(&table_lock); kref_put(&serial->kref, destroy_serial); + spin_unlock(&table_lock); } /* - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/12] crypto: don't pollute the global namespace with sg_next()
Jens Axboe wrote: > On Thu, May 10 2007, Benny Halevy wrote: >> Jens Axboe wrote: >>> It's a subsystem function, prefix it as such. >> Jens, Boaz and I talked about this over lunch. >> I wonder whether the crypto code must use your implementation >> instead of its own as it needs to over the sglist, e.g. for >> calculating iscsi (data) digest. > > The thought did cross my mind, and yes I think that would be a good > idea. The whole thing should probably just migrate to > lib/scattersomething.c > >> The crypto implementation of chained sglists in crypto/scatterwalk.h >> determines the chain link by !sg->length which will sorta work >> with your implementation, however the marker bit on page pointer must >> be cleared to use it. > > I don't like using sg->length, as that may be modified for legitimate > reason. That's why I chose to use the lsb bit of the page pointer. > >> Also, is it possible that after the original sglist has gone through >> dma_map_sg and entries were merged, some entries will have zero >> length? I'm not sure... If so, if the crypto implementation scans >> the sg list after it was dma mapped (maybe in a retry path) it >> may hit an entry that looks to it like a chaining link. This >> might be an existing bug and another reason for the crypto code >> to use your implementation. > > It's hard to say, depends heavily on the sub system or arch. Even if > using the pointer tagging mechanism seems a bit nasty, I think it's the > more resilient approach. > We're in agreement then :) I was trying to say that the methods should be compatible, otherwise bugs can happen, and that your scheme is better since it can handle sglists with zero length entries that aren't the last. A case that might be valid after dma mapping and merging. If indeed this case is possible, this seems to be the right time to converge to your scheme. Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Andi, you broke my laptop :-)
Joerg's patch works for me too. Thanks, Benny Andi Kleen wrote: > On Thu, May 10, 2007 at 03:35:56PM +0200, Joerg Roedel wrote: >> On Thu, May 10, 2007 at 03:01:44PM +0200, Andi Kleen wrote: >>> On Wed, May 09, 2007 at 12:56:16PM -0700, Pete Zaitcev wrote: >>>> Hi, Andi: >>>> >>>> The attached patch (actually, git show output) makes my Dell 1501 to hang >>>> on boot. Sorry, I have no clue why... The culprit is found with git bisect. >>>> But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% >>>> reproducible. >>> MK-36? Does it have SVM? >>> >>> Anyways we previously had issues with this being miscompiled, but >>> I thought the latest patch should have been ok. What compiler do you use? >>> >>> Can you send me a disassembly listing of arch/x86_64/kernel/time.o? >> I debugged this problem a bit and my compiler[1]interprets the =A >> constraint as %rax instead of %edx:%eax on x86_64 which causes the >> problem. The appended patch provides a workaround for this and fixed the >> hang on my machine. > > Hmm yes now I can reproduce it too. I didn't see any hangs so i suppose > my (and that of most -mm tester's) compiled binary always happened to have a > suitable value in edx > > Thanks for the patch. > > -Andi > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ -- Benny Halevy Panasas Inc. Accelerating Time to Results(TM) with Clustered Storage www.panasas.com [EMAIL PROTECTED] Tel/Fax: +972-3-647-8340 Mobile: +972-54-802-8340 - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 1/12] crypto: don't pollute the global namespace with sg_next()
Jens Axboe wrote: > It's a subsystem function, prefix it as such. Jens, Boaz and I talked about this over lunch. I wonder whether the crypto code must use your implementation instead of its own as it needs to over the sglist, e.g. for calculating iscsi (data) digest. The crypto implementation of chained sglists in crypto/scatterwalk.h determines the chain link by !sg->length which will sorta work with your implementation, however the marker bit on page pointer must be cleared to use it. Also, is it possible that after the original sglist has gone through dma_map_sg and entries were merged, some entries will have zero length? I'm not sure... If so, if the crypto implementation scans the sg list after it was dma mapped (maybe in a retry path) it may hit an entry that looks to it like a chaining link. This might be an existing bug and another reason for the crypto code to use your implementation. Thanks, Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 8/12] x86-64: update iommu/dma mapping functions to sg helpers
Jens Axboe wrote: > @@ -323,13 +324,17 @@ static int __dma_map_cont(struct scatterlist *sg, int > start, int stopat, > { > unsigned long iommu_start = alloc_iommu(pages); > unsigned long iommu_page = iommu_start; > - int i; > + struct scatterlist *s; > + int i, nelems; > > if (iommu_start == -1) > return -1; > + > + nelems = stopat - start; > + while (start--) > + sg = sg_next(sg); Ouch. This will suck if we merge many times in a long list. How about keeping and passing start as a struct scatterlist * rather than an index? (see attached example, (compiles, bu untested though) > > - for (i = start; i < stopat; i++) { > - struct scatterlist *s = &sg[i]; > + for_each_sg(sg, s, nelems, i) { > unsigned long pages, addr; > unsigned long phys_addr = s->dma_address; There seems to be a bug hiding here as now i is in [0, nelems) now, not [start, stopat) so "if (i == start)" below should turn into "if (i == 0)" this is fixed by comparing pointers instead way in the attached example. > > @@ -360,12 +365,14 @@ static inline int dma_map_cont(struct scatterlist *sg, > int start, int stopat, > struct scatterlist *sout, > unsigned long pages, int need) > { > - if (!need) { > + if (!need) { > BUG_ON(stopat - start != 1); > - *sout = sg[start]; > - sout->dma_length = sg[start].length; > + while (--start) > + sg = sg_next(sg); same efficiency issue as above. > + *sout = *sg; > + sout->dma_length = sg->length; > return 0; > - } > + } > return __dma_map_cont(sg, start, stopat, sout, pages); > } > diff --git a/arch/x86_64/kernel/pci-gart.c b/arch/x86_64/kernel/pci-gart.c index 8bc2ed7..48ce635 100644 --- a/arch/x86_64/kernel/pci-gart.c +++ b/arch/x86_64/kernel/pci-gart.c @@ -319,27 +319,23 @@ static int dma_map_sg_nonforce(struct device *dev, struct scatterlist *sg, } /* Map multiple scatterlist entries continuous into the first. */ -static int __dma_map_cont(struct scatterlist *sg, int start, int stopat, +static int __dma_map_cont(struct scatterlist *start, int nelems, struct scatterlist *sout, unsigned long pages) { unsigned long iommu_start = alloc_iommu(pages); unsigned long iommu_page = iommu_start; struct scatterlist *s; - int i, nelems; + int i; if (iommu_start == -1) return -1; - nelems = stopat - start; - while (start--) - sg = sg_next(sg); - - for_each_sg(sg, s, nelems, i) { + for_each_sg(start, s, nelems, i) { unsigned long pages, addr; unsigned long phys_addr = s->dma_address; - BUG_ON(i > start && s->offset); - if (i == start) { + BUG_ON(s != start && s->offset); + if (s == start) { *sout = *s; sout->dma_address = iommu_bus_base; sout->dma_address += iommu_page*PAGE_SIZE + s->offset; @@ -361,19 +357,17 @@ static int __dma_map_cont(struct scatterlist *sg, int start, int stopat, return 0; } -static inline int dma_map_cont(struct scatterlist *sg, int start, int stopat, +static inline int dma_map_cont(struct scatterlist *start, int nelems, struct scatterlist *sout, unsigned long pages, int need) { if (!need) { - BUG_ON(stopat - start != 1); - while (--start) - sg = sg_next(sg); - *sout = *sg; - sout->dma_length = sg->length; + BUG_ON(nelems != 1); + *sout = *start; + sout->dma_length = start->length; return 0; } - return __dma_map_cont(sg, start, stopat, sout, pages); + return __dma_map_cont(start, nelems, sout, pages); } /* @@ -387,7 +381,7 @@ int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents, int dir) int start; unsigned long pages = 0; int need = 0, nextneed; - struct scatterlist *s, *ps; + struct scatterlist *s, *ps, *start_sg; if (nents == 0) return 0; @@ -397,6 +391,7 @@ int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents, int dir) out = 0; start = 0; + start_sg = sg; ps = NULL; /* shut up gcc */ for_each_sg(sg, s, nents, i) { dma_addr_t addr = page_to_phys(s->page) + s->offset; @@ -411,12 +406,13 @@ int gart_map_sg(struct device *dev, struct scatterlist *sg, int nents, int dir) boundary and the new one doesn't have an offset. */
Re: [PATCH 7/13] i386 sg: add support for chaining scatterlists
Jens Axboe wrote: > +#define sg_is_chain(sg) ((unsigned long) (sg)->page & 0x01) > +#define sg_chain_ptr(sg) \ > + ((struct scatterlist *) ((unsigned long) (sg)->page & ~0x01)) > + > +/* > + * We overload the meaning of ->page for sg chaining. If the LSB is > + * set, the page member contains a pointer to the next sgtable. > + */ > +static inline struct scatterlist *sg_next(struct scatterlist *sg) > +{ > + if (sg_is_chain(sg)) > + return sg_chain_ptr(sg); > + > + return sg + 1; > +} Jens, should sg_next ever return the sg containing the chain link? If sg points at the entry right before the link entry, don't we want to skip it? E.g.: static inline struct scatterlist *sg_next(struct scatterlist *sg) { struct scatterlist *next; /* just in case, shouldn't really ever be here */ if (sg_is_chain(sg)) return sg_chain_ptr(sg); next = sg + 1; if (sg_is_chain(next)) return sg_chain_ptr(next); return next; } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Andi, you broke my laptop :-)
Andy, Pete, this patch also causes our test machines to hang hard during boot. x86_64 smp kernel, single cpu Athlon 64 machine, cpuinfo below. Benny $ cat /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 15 model : 79 model name : AMD Athlon(tm) 64 Processor 3800+ stepping: 2 cpu MHz : 2412.397 cache size : 512 KB fpu : yes fpu_exception : yes cpuid level : 1 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow up pni cx16 lahf_lm svm cr8_legacy bogomips: 4828.89 TLB size: 1024 4K pages clflush size: 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management: ts fid vid ttp tm stc Pete Zaitcev <[EMAIL PROTECTED]> wrote: > > Hi, Andi: > > The attached patch (actually, git show output) makes my Dell 1501 to hang > on boot. Sorry, I have no clue why... The culprit is found with git bisect. > But yes, it's an AMD MK-36. I use an x86_64 kernel. It is 100% reproducible. > > Cheers, > -- Pete > > commit c5bcb5635a03da3158f121ae20ccbbf72b4fc62a > Author: Andi Kleen <[EMAIL PROTECTED]> > Date: Wed May 2 19:27:21 2007 +0200 > > [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible > > RDTSCP is already synchronous and doesn't need an explicit CPUID. > This is a little faster and more importantly avoids VMEXITs on > Hypervisors. > > Original patch from Joerg Roedel, but reworked by AK > Also includes miscompilation fix by Eric Biederman > > Cc: "Joerg Roedel" <[EMAIL PROTECTED]> > > Signed-off-by: Andi Kleen <[EMAIL PROTECTED]> > > diff --git a/include/asm-i386/tsc.h b/include/asm-i386/tsc.h > index 0181f9d..3f3c1fa 100644 > --- a/include/asm-i386/tsc.h > +++ b/include/asm-i386/tsc.h > @@ -38,6 +38,15 @@ static __always_inline cycles_t get_cycles_sync(void) > unsigned eax; > > /* > + * Use RDTSCP if possible; it is guaranteed to be synchronous > + * and doesn't cause a VMEXIT on Hypervisors > + */ > + alternative_io(ASM_NOP3, ".byte 0x0f,0x01,0xf9", X86_FEATURE_RDTSCP, > + "=A" (ret), "0" (0ULL) : "ecx", "memory"); > + if (ret) > + return ret; > + > + /* >* Don't do an additional sync on CPUs where we know >* RDTSC is already synchronous: >*/ > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 1/6] bidi support: request dma_data_direction
Muli Ben-Yehuda wrote: > On Tue, Jan 23, 2007 at 03:45:00PM +0200, Benny Halevy wrote: > >>>> +static inline int dma_uni_dir(enum dma_data_direction dir) >>>> +{ >>>> + return (dir == DMA_TO_DEVICE) || (dir == DMA_FROM_DEVICE) || >>>> + (dir == DMA_NONE); >>>> +} >>> While this doesn't look very useful. Why is "DMA_NONE" a uni-dir? I >>> suggest replacing this with an open coded (dir != DMA_BIDIRECTIONAL). >> The idea was to be resilient to invalid values. (dir != DMA_BIDIRECTIONAL) >> is fine of course, but I'd add a BUG_ON such as (dir < 0 || dir > >> DMA_BIDIRECTIONAL) > > If DMA_NONE isn't actually allowed here, you can use valid_dma_direction(). DMA_NONE should be allowed as it is used by commands that do no I/O and these are handled on uni-directional path. BTW, the BUG_ON I suggested has a bug of course since (countering my intuition) DMA_BIDIRECTIONAL==0, so it should be BUG_ON(dir < 0 || dir > DMA_NONE) instead. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 1/6] bidi support: request dma_data_direction
Douglas Gilbert wrote: > Benny Halevy wrote: >> Douglas Gilbert wrote: > > Perhaps the right use of DMA_BIRECTIONAL needs to be > defined. > > Could it be used with a XDWRITE(10) SCSI command > defined in sbc3r07.pdf at http://www.t10.org ? I suspect > using two scatter gather lists would be a better approach. Exactly. This is a classic example of a bidirectional command and indeed two scatter-gather lists (that are mapped into two bio lists) are used. > >>>> - Introduce new blk_rq_init_unqueued_req() and use it in places ad-hoc >>>> requests were used and bzero'ed. >>> With a bi-directional transfer is it always unambiguous >>> which transfer occurs first (or could they occur at >>> the same time)? >> The bidi transfers can occur in any order and in parallel. > > Then it is not sufficient for modern SCSI transports in which > certain bidirectional commands (probably most) have a well > defined order. > > So DMA_BIDIRECTIONAL looks PCI specific and it may have > been a mistake to replace other subsystem's direction flags > with it. RDMA might be an interesting case. > I would say that it might make sense to define an equivalent for dma_data_direction at the block layer, for example: enum req_io_direction { REQ_IO_NONE = 0, REQ_IN_FROM_DEVICE = 1, REQ_OUT_TO_DEVICE = 2, REQ_BIDIRECTIONAL = 3, }; can be used in struct request and upper layers. Besides the fact that having separate I/O buffers for bidirectional transfers makes block I/O different from pci bidi I/O, this enum makes more sense "arithmetically" and has a much better meaning for the zero value. Today DMA_BIDIRECTIONAL is used in several places as the default and "invalid" value since no-one ever used it before. I'd rather have the value 0 mean REQ_IO_NONE (or REQ_IO_INVALID if we want such thing). Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 1/6] bidi support: request dma_data_direction
Muli Ben-Yehuda wrote: > On Mon, Jan 22, 2007 at 01:21:28AM +0200, Boaz Harrosh wrote: > >> - Introduce a new enum dma_data_direction data_dir member in struct request. >> and remove the RW bit from request->cmd_flag > > Some architecture use 'enum dma_data_direction' and some 'int > dma_data_direction'. The consensus was to move to int over > time. Please use 'int dma_data_direction'. That should be fine (although I'm not sure what made you go this way :) Please see my reply to Douglas, proposing an enum req_io_direction at the block layer and up which will provide a better enumeration for our use. >> >> +static inline int dma_write_dir(enum dma_data_direction dir) >> +{ >> +return (dir == DMA_TO_DEVICE) || (dir == DMA_BIDIRECTIONAL); >> +} > > "write" can mean "write to device" or "write to memory", depending on > context. Not exactly something which should be a generic > helper. Rename to 'dma_to_device(int dir)'? much better. thanks! > >> +static inline int dma_uni_dir(enum dma_data_direction dir) >> +{ >> +return (dir == DMA_TO_DEVICE) || (dir == DMA_FROM_DEVICE) || >> + (dir == DMA_NONE); >> +} > > While this doesn't look very useful. Why is "DMA_NONE" a uni-dir? I > suggest replacing this with an open coded (dir != DMA_BIDIRECTIONAL). The idea was to be resilient to invalid values. (dir != DMA_BIDIRECTIONAL) is fine of course, but I'd add a BUG_ON such as (dir < 0 || dir > DMA_BIDIRECTIONAL) Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 3/6] bidi support: bidirectional request
James Bottomley wrote: > On Mon, 2007-01-22 at 01:25 +0200, Boaz Harrosh wrote: >> - Instantiate another request_io_part in request for bidi_read. >> - Define & Implement new API for accessing bidi parts. >> - API to Build bidi requests and map to sglists. >> - Define new end_that_request_block() function to end a complete request. > > Actually, this approach looks to be a bit too narrow. You seem to be > predicating on the idea that the bidirectional will transfer in and out > of the same area. For some of the frame in/frame out stuff, we probably > need the read and write areas for the bidirectional request to be > separated. Hmm, our proposal introduces two separate and independent i/o areas. One used for uni-directional transfers and for bidi data output, the other is used for bidi data input so we're in agreement. Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC 1/6] bidi support: request dma_data_direction
Douglas Gilbert wrote: > Boaz Harrosh wrote: >> - Introduce a new enum dma_data_direction data_dir member in struct request. >> and remove the RW bit from request->cmd_flag >> - Add new API to query request direction. >> - Adjust existing API and implementation. >> - Cleanup wrong use of DMA_BIDIRECTIONAL >> - Introduce new blk_rq_init_unqueued_req() and use it in places ad-hoc >> requests were used and bzero'ed. > > With a bi-directional transfer is it always unambiguous > which transfer occurs first (or could they occur at > the same time)? The bidi transfers can occur in any order and in parallel. > > Doug Gilbert > - > To unsubscribe from this list: send the line "unsubscribe linux-scsi" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [nfsv4] RE: Finding hardlinks
Nicolas Williams wrote: > On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote: >> I agree that the way the client implements its cache is out of the protocol >> scope. But how do you interpret "correct behavior" in section 4.2.1? >> "Clients MUST use filehandle comparisons only to improve performance, not >> for correct behavior. All clients need to be prepared for situations in >> which it cannot be determined whether two filehandles denote the same object >> and in such cases, avoid making invalid assumptions which might cause >> incorrect behavior." >> Don't you consider data corruption due to cache inconsistency an incorrect >> behavior? > > If a file with multiple hardlinks appears to have multiple distinct > filehandles then a client like Trond's will treat it as multiple > distinct files (with the same hardlink count, and you won't be able to > find the other links to them -- oh well). Can this cause data > corruption? Yes, but only if there are applications that rely on the > different file names referencing the same file, and backup apps on the > client won't get the hardlinks right either. The case I'm discussing is multiple filehandles for the same name, not even for different hardlinks. This causes spurious EIO errors on the client when the filehandle changes and cache inconsistency when opening the file multiple times in parallel. > > What I don't understand is why getting the fileid is so hard -- always > GETATTR when you GETFH and you'll be fine. I'm guessing that's not as > difficult as it is to maintain a hash table of fileids. It's not difficult at all, just that the client can't rely on the fileids to be unique in both space and time because of server non-compliance (e.g. netapp's snapshots) and fileid reuse after delete. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [nfsv4] RE: Finding hardlinks
Trond Myklebust wrote: > On Thu, 2007-01-04 at 12:04 +0200, Benny Halevy wrote: >> I agree that the way the client implements its cache is out of the protocol >> scope. But how do you interpret "correct behavior" in section 4.2.1? >> "Clients MUST use filehandle comparisons only to improve performance, not >> for correct behavior. All clients need to be prepared for situations in >> which it cannot be determined whether two filehandles denote the same object >> and in such cases, avoid making invalid assumptions which might cause >> incorrect behavior." >> Don't you consider data corruption due to cache inconsistency an incorrect >> behavior? > > Exactly where do you see us violating the close-to-open cache > consistency guarantees? > I haven't seen that. What I did see is cache inconsistency when opening the same file with different file descriptors when the filehandle changes. My testing shows that at least fsync and close fail with EIO when the filehandle changed while there was dirty data in the cache and that's good. Still, not sharing the cache while the file is opened (even on a different file descriptors by the same process) seems impractical. Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [nfsv4] RE: Finding hardlinks
Trond Myklebust wrote: > On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote: >> I sincerely expect you or anybody else for this matter to try to provide >> feedback and object to the protocol specification in case they disagree >> with it (or think it's ambiguous or self contradicting) rather than ignoring >> it and implementing something else. I think we're shooting ourselves in the >> foot when doing so and it is in our common interest to strive to reach a >> realistic standard we can all comply with and interoperate with each other. > > You are reading the protocol wrong in this case. Obviously we interpret it differently and that by itself calls for considering clarification of the text :) > > While the protocol does allow the server to implement the behaviour that > you've been advocating, it in no way mandates it. Nor does it mandate > that the client should gather files with the same (fsid,fileid) and > cache them together. Those are issues to do with _implementation_, and > are thus beyond the scope of the IETF. > > In our case, the client will ignore the unique_handles attribute. It > will use filehandles as our inode cache identifier. It will not jump > through hoops to provide caching semantics that go beyond close-to-open > for servers that set unique_handles to "false". I agree that the way the client implements its cache is out of the protocol scope. But how do you interpret "correct behavior" in section 4.2.1? "Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients need to be prepared for situations in which it cannot be determined whether two filehandles denote the same object and in such cases, avoid making invalid assumptions which might cause incorrect behavior." Don't you consider data corruption due to cache inconsistency an incorrect behavior? Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [nfsv4] RE: Finding hardlinks
Trond Myklebust wrote: > On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote: >> Trond Myklebust wrote: >>> >>> On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: Mikulas Patocka wrote: > BTW. how does (or how should?) NFS client deal with cache coherency if > filehandles for the same file differ? > Trond can probably answer this better than me... As I read it, currently the nfs client matches both the fileid and the filehandle (in nfs_find_actor). This means that different filehandles for the same file would result in different inodes :(. Strictly following the nfs protocol, comparing only the fileid should be enough IF fileids are indeed unique within the filesystem. Comparing the filehandle works as a workaround when the exported filesystem (or the nfs server) violates that. From a user stand point I think that this should be configurable, probably per mount point. >>> Matching files by fileid instead of filehandle is a lot more trouble >>> since fileids may be reused after a file has been deleted. Every time >>> you look up a file, and get a new filehandle for the same fileid, you >>> would at the very least have to do another GETATTR using one of the >>> 'old' filehandles in order to ensure that the file is the same object as >>> the one you have cached. Then there is the issue of what to do when you >>> open(), read() or write() to the file: which filehandle do you use, are >>> the access permissions the same for all filehandles, ... >>> >>> All in all, much pain for little or no gain. >> See my answer to your previous reply. It seems like the current >> implementation is in violation of the nfs protocol and the extra pain >> is required. > > ...and we should care because...? > > Trond > Believe it or not, but server companies like Panasas try to follow the standard when designing and implementing their products while relying on client vendors to do the same. I sincerely expect you or anybody else for this matter to try to provide feedback and object to the protocol specification in case they disagree with it (or think it's ambiguous or self contradicting) rather than ignoring it and implementing something else. I think we're shooting ourselves in the foot when doing so and it is in our common interest to strive to reach a realistic standard we can all comply with and interoperate with each other. Benny - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Finding hardlinks
Arjan van de Ven wrote: >> It seems like the posix idea of unique doesn't >> hold water for modern file systems > > are you really sure? Well Jan's example was of Coda that uses 128-bit internal file ids. > and if so, why don't we fix *THAT* instead Hmm, sometimes you can't fix the world, especially if the filesystem is exported over NFS and has a problem with fitting its file IDs uniquely into a 64-bit identifier. > rather than adding racy > syscalls and such that just can't really be used right... > If the syscall is working on two pathnames I agree there might be a race that isn't different from calling lstat() on each of these names before opening them. But I'm not sure I see a race if you operate on two open file descriptors (compared to fstat()ing both of them) On the nfs side, if the client looked up two names (or opened them over nfsv4) and has two filehandles in hand, asking the server whether they refer to the same object isn't racy. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Finding hardlinks
Jeff Layton wrote: > Benny Halevy wrote: >> It seems like the posix idea of unique doesn't >> hold water for modern file systems and that creates real problems for >> backup apps which rely on that to detect hard links. >> > > Why not? Granted, many of the filesystems in the Linux kernel don't enforce > that > they have unique st_ino values, but I'm working on a set of patches to try > and > fix that. That's great and will surely help most file systems (apparently not Coda as Jan says they use 128 bit internal file identifiers). What about 32 bit architectures? Is ino_t going to be 64 bit there too? > >> Adding a vfs call to check for file equivalence seems like a good idea to me. >> A syscall exposing it to user mode apps can look like what you sketched >> above, >> and another variant of it can maybe take two paths and possibly a flags field >> (for e.g. don't follow symlinks). >> >> I'm cross-posting this also to [EMAIL PROTECTED] NFS has exactly the same >> problem >> with as fileid is 64 bit wide. Although the nfs client can >> determine that two filesystem objects are hard linked if they have the same >> filehandle but there are cases where two distinct filehandles can still >> refer to >> the same filesystem object. Letting the nfs client determine file >> equivalency >> based on filehandles will probably satisfy most users but if the exported >> fs supports the new call discussed above, exporting it over NFS makes a >> lot of sense to me... What do you guys think about adding such an operation >> to NFS? >> > > This sounds like a bug to me. It seems like we should have a one to one > correspondence of filehandle -> inode. In what situations would this not be > the > case? Well, the NFS protocol allows that [see rfc1813, p. 21: "If two file handles from the same server are equal, they must refer to the same file, but if they are not equal, no conclusions can be drawn."] As an example, some file systems encode hint information into the filehandle and the hints may change over time, another example is encoding parent information into the filehandle and then handles representing hard links to the same file from different directories will differ. > > -- Jeff > > - > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: Finding hardlinks
Mikulas Patocka wrote: >>> If user (or script) doesn't specify that flag, it doesn't help. I think >>> the best solution for these filesystems would be either to add new syscall >>> int is_hardlink(char *filename1, char *filename2) >>> (but I know adding syscall bloat may be objectionable) >> it's also the wrong api; the filenames may have been changed under you >> just as you return from this call, so it really is a >> "was_hardlink_at_some_point()" as you specify it. >> If you make it work on fd's.. it has a chance at least. > > Yes, but it doesn't matter --- if the tree changes under "cp -a" command, > no one guarantees you what you get. > int fis_hardlink(int handle1, int handle 2); > Is another possibility but it can't detect hardlinked symlinks. It seems like the posix idea of unique doesn't hold water for modern file systems and that creates real problems for backup apps which rely on that to detect hard links. Adding a vfs call to check for file equivalence seems like a good idea to me. A syscall exposing it to user mode apps can look like what you sketched above, and another variant of it can maybe take two paths and possibly a flags field (for e.g. don't follow symlinks). I'm cross-posting this also to [EMAIL PROTECTED] NFS has exactly the same problem with as fileid is 64 bit wide. Although the nfs client can determine that two filesystem objects are hard linked if they have the same filehandle but there are cases where two distinct filehandles can still refer to the same filesystem object. Letting the nfs client determine file equivalency based on filehandles will probably satisfy most users but if the exported fs supports the new call discussed above, exporting it over NFS makes a lot of sense to me... What do you guys think about adding such an operation to NFS? Benny > > Mikulas > - > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in > the body of a message to [EMAIL PROTECTED] > More majordomo info at http://vger.kernel.org/majordomo-info.html - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/