Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-03 Thread Jeff Layton

Brad Boyer wrote:


This sounds slightly petty to me. For example, generic_file_read() is
there just to make it easier to implement the read callback, but it
isn't required. In fact, I would think that any filesystem complex
enough to be worth making proprietary would not use it. However, that
doesn't seem to me to be a good argument for marking it GPL-only. The
functionality in question is easier to reimplement, but that doesn't
make it right to force it on people just because of a license choice.



Yes, most filesystems have their own scheme for managing i_ino 
assignment, so this is primarily for "pseudo-filesystems". Stuff like 
pipefs, sockfs, /proc, etc...


I'm certainly open to discussion though. Is there a compelling reason to 
open this up to proprietary software authors?


I don't think there is a compelling reason to open it up since the
functionality could be reimplemented if needed, but I also think
the only reason it is being marked GPL-only is the very common
attitude that there should not be any proprietary modules.

To be honest, I think it looks bad for someone associated with redhat
to be suggesting that life should be made more difficult for those
who write proprietary software on Linux. The support from commercial
software is a major reason for the success of the RHEL product line.
I can't imagine that this attitude will affect support from software
companies as long as there is a demand for software on Linux, but
it isn't exactly supportive.



I have no problem with someone writing, selling and supporting 
proprietary modules. Knock yourself out. I just don't see a reason why I 
should contribute code to such an effort.


Still though, this was coded in part on company time. I certainly don't 
want to go against Red Hat's policy in such a matter, so I'll do some 
due diligence internally as to how this should be done.


In the meantime, does anyone have objections or comments on this 
approach on technical grounds?


Thanks,
Jeff
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-03 Thread Al Boldi
Brad Boyer wrote:
> To be honest, I think it looks bad for someone associated with redhat
> to be suggesting that life should be made more difficult for those
> who write proprietary software on Linux. The support from commercial
> software is a major reason for the success of the RHEL product line.

The real reason for the success of the RHEL product line is that its been GPL 
from the beginning.  And commercial software saw it fit to leverage this 
GPL-pool, which is OK, but to then come around and say that "The support 
from commercial software is a major reason for the success of the RHEL 
product line" is trying to portray the situation up-side-down.

This does not mean that we shouldn't allow non-GPL linkage, on the contrary, 
I am even calling for a stable API for the benefit of everyone, but it's 
probably the closed-source market's arrogant behavior that forces 
GPL-developers to respond in kind.  Which is rather sad, if you think about 
it.


Thanks!

--
Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-03 Thread Al Boldi
Brad Boyer wrote:
 To be honest, I think it looks bad for someone associated with redhat
 to be suggesting that life should be made more difficult for those
 who write proprietary software on Linux. The support from commercial
 software is a major reason for the success of the RHEL product line.

The real reason for the success of the RHEL product line is that its been GPL 
from the beginning.  And commercial software saw it fit to leverage this 
GPL-pool, which is OK, but to then come around and say that The support 
from commercial software is a major reason for the success of the RHEL 
product line is trying to portray the situation up-side-down.

This does not mean that we shouldn't allow non-GPL linkage, on the contrary, 
I am even calling for a stable API for the benefit of everyone, but it's 
probably the closed-source market's arrogant behavior that forces 
GPL-developers to respond in kind.  Which is rather sad, if you think about 
it.


Thanks!

--
Al

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-03 Thread Jeff Layton

Brad Boyer wrote:


This sounds slightly petty to me. For example, generic_file_read() is
there just to make it easier to implement the read callback, but it
isn't required. In fact, I would think that any filesystem complex
enough to be worth making proprietary would not use it. However, that
doesn't seem to me to be a good argument for marking it GPL-only. The
functionality in question is easier to reimplement, but that doesn't
make it right to force it on people just because of a license choice.



Yes, most filesystems have their own scheme for managing i_ino 
assignment, so this is primarily for pseudo-filesystems. Stuff like 
pipefs, sockfs, /proc, etc...


I'm certainly open to discussion though. Is there a compelling reason to 
open this up to proprietary software authors?


I don't think there is a compelling reason to open it up since the
functionality could be reimplemented if needed, but I also think
the only reason it is being marked GPL-only is the very common
attitude that there should not be any proprietary modules.

To be honest, I think it looks bad for someone associated with redhat
to be suggesting that life should be made more difficult for those
who write proprietary software on Linux. The support from commercial
software is a major reason for the success of the RHEL product line.
I can't imagine that this attitude will affect support from software
companies as long as there is a demand for software on Linux, but
it isn't exactly supportive.



I have no problem with someone writing, selling and supporting 
proprietary modules. Knock yourself out. I just don't see a reason why I 
should contribute code to such an effort.


Still though, this was coded in part on company time. I certainly don't 
want to go against Red Hat's policy in such a matter, so I'll do some 
due diligence internally as to how this should be done.


In the meantime, does anyone have objections or comments on this 
approach on technical grounds?


Thanks,
Jeff
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-02 Thread Brad Boyer
On Sat, Dec 02, 2006 at 09:56:27PM -0500, Jeff Layton wrote:
> My main reasoning for doing this was that the structures involved are 
> per-superblock. There is virtually no reason that a filesystem would 
> ever need to touch these structures in another filesystem.

I don't think this is relevant to the issue. I wouldn't expect that
if you allow people to use this functionality that they would go
poking around in other filesystems. I would expect people to use it
on the filesystem they are writing.

> So, this is essentially a service to make it easy for filesystems to 
> implement i_ino uniqueness. I'm not terribly interested in making things 
> easier for proprietary filesystems, so I don't see a real reason to make 
> this available to them. They can always implement their own scheme to do 
> this.

This sounds slightly petty to me. For example, generic_file_read() is
there just to make it easier to implement the read callback, but it
isn't required. In fact, I would think that any filesystem complex
enough to be worth making proprietary would not use it. However, that
doesn't seem to me to be a good argument for marking it GPL-only. The
functionality in question is easier to reimplement, but that doesn't
make it right to force it on people just because of a license choice.

> I'm certainly open to discussion though. Is there a compelling reason to 
> open this up to proprietary software authors?

I don't think there is a compelling reason to open it up since the
functionality could be reimplemented if needed, but I also think
the only reason it is being marked GPL-only is the very common
attitude that there should not be any proprietary modules.

To be honest, I think it looks bad for someone associated with redhat
to be suggesting that life should be made more difficult for those
who write proprietary software on Linux. The support from commercial
software is a major reason for the success of the RHEL product line.
I can't imagine that this attitude will affect support from software
companies as long as there is a demand for software on Linux, but
it isn't exactly supportive.

Brad Boyer
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-02 Thread Jeff Layton

Brad Boyer wrote:

On Fri, Dec 01, 2006 at 12:21:36PM -0500, Jeff Layton wrote:

Here's an updated (but untested) patch based on your suggestions. I also went
ahead and made the exported symbols GPL-only since that seems like it would be
appropriate here. Any further thoughts on it?





This seems like exactly the sort of thing that should be a generic
service available to all filesystem implementors whether it's GPL or
not. The usual justification for GPL-only is that it's something
random modules shouldn't be touching anyway, but it's something that
some part of the tree which could be a module needs.


My main reasoning for doing this was that the structures involved are 
per-superblock. There is virtually no reason that a filesystem would 
ever need to touch these structures in another filesystem.


So, this is essentially a service to make it easy for filesystems to 
implement i_ino uniqueness. I'm not terribly interested in making things 
easier for proprietary filesystems, so I don't see a real reason to make 
this available to them. They can always implement their own scheme to do 
this.


I'm certainly open to discussion though. Is there a compelling reason to 
open this up to proprietary software authors?


-- Jeff

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-02 Thread Brad Boyer
On Fri, Dec 01, 2006 at 12:21:36PM -0500, Jeff Layton wrote:
> Here's an updated (but untested) patch based on your suggestions. I also went
> ahead and made the exported symbols GPL-only since that seems like it would be
> appropriate here. Any further thoughts on it?

I don't know that this is a good idea. I know this isn't likely to be
a popular statement, but I think that there should be at least some
minor justification to making a symbol GPL-only (it won't take much).
This seems like exactly the sort of thing that should be a generic
service available to all filesystem implementors whether it's GPL or
not. The usual justification for GPL-only is that it's something
random modules shouldn't be touching anyway, but it's something that
some part of the tree which could be a module needs.

Brad Boyer
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-02 Thread Brad Boyer
On Fri, Dec 01, 2006 at 12:21:36PM -0500, Jeff Layton wrote:
 Here's an updated (but untested) patch based on your suggestions. I also went
 ahead and made the exported symbols GPL-only since that seems like it would be
 appropriate here. Any further thoughts on it?

I don't know that this is a good idea. I know this isn't likely to be
a popular statement, but I think that there should be at least some
minor justification to making a symbol GPL-only (it won't take much).
This seems like exactly the sort of thing that should be a generic
service available to all filesystem implementors whether it's GPL or
not. The usual justification for GPL-only is that it's something
random modules shouldn't be touching anyway, but it's something that
some part of the tree which could be a module needs.

Brad Boyer
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-02 Thread Jeff Layton

Brad Boyer wrote:

On Fri, Dec 01, 2006 at 12:21:36PM -0500, Jeff Layton wrote:

Here's an updated (but untested) patch based on your suggestions. I also went
ahead and made the exported symbols GPL-only since that seems like it would be
appropriate here. Any further thoughts on it?





This seems like exactly the sort of thing that should be a generic
service available to all filesystem implementors whether it's GPL or
not. The usual justification for GPL-only is that it's something
random modules shouldn't be touching anyway, but it's something that
some part of the tree which could be a module needs.


My main reasoning for doing this was that the structures involved are 
per-superblock. There is virtually no reason that a filesystem would 
ever need to touch these structures in another filesystem.


So, this is essentially a service to make it easy for filesystems to 
implement i_ino uniqueness. I'm not terribly interested in making things 
easier for proprietary filesystems, so I don't see a real reason to make 
this available to them. They can always implement their own scheme to do 
this.


I'm certainly open to discussion though. Is there a compelling reason to 
open this up to proprietary software authors?


-- Jeff

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-02 Thread Brad Boyer
On Sat, Dec 02, 2006 at 09:56:27PM -0500, Jeff Layton wrote:
 My main reasoning for doing this was that the structures involved are 
 per-superblock. There is virtually no reason that a filesystem would 
 ever need to touch these structures in another filesystem.

I don't think this is relevant to the issue. I wouldn't expect that
if you allow people to use this functionality that they would go
poking around in other filesystems. I would expect people to use it
on the filesystem they are writing.

 So, this is essentially a service to make it easy for filesystems to 
 implement i_ino uniqueness. I'm not terribly interested in making things 
 easier for proprietary filesystems, so I don't see a real reason to make 
 this available to them. They can always implement their own scheme to do 
 this.

This sounds slightly petty to me. For example, generic_file_read() is
there just to make it easier to implement the read callback, but it
isn't required. In fact, I would think that any filesystem complex
enough to be worth making proprietary would not use it. However, that
doesn't seem to me to be a good argument for marking it GPL-only. The
functionality in question is easier to reimplement, but that doesn't
make it right to force it on people just because of a license choice.

 I'm certainly open to discussion though. Is there a compelling reason to 
 open this up to proprietary software authors?

I don't think there is a compelling reason to open it up since the
functionality could be reimplemented if needed, but I also think
the only reason it is being marked GPL-only is the very common
attitude that there should not be any proprietary modules.

To be honest, I think it looks bad for someone associated with redhat
to be suggesting that life should be made more difficult for those
who write proprietary software on Linux. The support from commercial
software is a major reason for the success of the RHEL product line.
I can't imagine that this attitude will affect support from software
companies as long as there is a demand for software on Linux, but
it isn't exactly supportive.

Brad Boyer
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr hashing)

2006-12-01 Thread Jeff Layton
Thanks again, Randy. Here's an updated and tested patch and description. 
This
one also makes sure that the root inode for the mount gets a unique 
i_ino value

as well. Let me know what you think...

--[snip]---

This patch is a proof of concept. It works, but I'd like to get some buyin
on the approach before I start doing the legwork to convert all of
the existing filesystems to use it. First, the problems:

1) on filesystems w/o permanent inode numbers, i_ino values can be
larger than 32 bits, which can cause problems for some 32 bit userspace
programs on a 64 bit kernel.

2) many filesystems call new_inode and assume that the i_ino values they
are given are unique. They are not guaranteed to be so, since the static
counter can wrap.

3) after allocating a new inode, some filesystems call iunique to try to
get a unique i_ino value, but they don't actually add their inodes to
the hashtable, and so they're still not guaranteed to be unique.

One way to fix this would be to just make sure they all use iunique and hash
their inodes, but that might slow down lookups for filesystems whose inodes
are not pinned in memory.

This patch is one way to correct these problems. This adds 2 new
functions, an iunique_register and iunique_unregister. Filesystems can call
iunique_register at inode creation time, and then at deletion time, we'll
automatically unregister them.

This patch also adds a new s_generation counter to the superblock.
Because i_ino's can be reused so quickly, we don't want NFS getting
confused when it happens. When iunique_register is called, we'll assign
the s_generation value to the i_generation, and then increment it to
help ensure that we get different filehandles.

There are some things that need to be cleaned up, of course:

- better error handling for the iunique calls

- recheck all the possible places where the inode should be unhashed

- convert other filesystems

- remove the static counter from new_inode and (maybe) eliminate iunique

The patch also converts pipefs to use the new scheme as an example. Al
Viro had expressed some concern with an earlier patch that this might
slow down pipe creation. I've done some testing and I think the impact
will be minimal. Timing a small program that creates and closes 100
million pipes in a loop:

patched:
-
real8m8.623s
user0m37.418s
sys 7m31.196s

unpatched:
--
real8m7.150s
user0m40.943s
sys 7m26.204s

As the number of pipes grows on the system this time may grow somewhat,
but it doesn't seem like it would be terrible.

Comments and suggestions appreciated.

Signed-off-by: Jeff Layton <[EMAIL PROTECTED]>


diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..252192a 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -288,6 +288,8 @@ static void dispose_list(struct list_hea
list_del_init(>i_sb_list);
spin_unlock(_lock);

+   iunique_unregister(inode);
+
wake_up_inode(inode);
destroy_inode(inode);
nr_disposed++;
@@ -706,6 +708,32 @@ retry:

 EXPORT_SYMBOL(iunique);

+int iunique_register(struct inode *inode, int max_reserved)
+{
+   int rv;
+
+   rv = idr_pre_get(>i_sb->s_inode_ids, GFP_KERNEL);
+   if (!rv)
+   return -ENOMEM;
+
+   spin_lock(>i_sb->s_inode_ids_lock);
+   rv = idr_get_new_above(>i_sb->s_inode_ids, inode,
+   max_reserved+1, (int *) >i_ino);
+   inode->i_generation = inode->i_sb->s_generation++;
+   spin_unlock(>i_sb->s_inode_ids_lock);
+   return rv;
+}
+EXPORT_SYMBOL_GPL(iunique_register);
+
+void iunique_unregister(struct inode *inode)
+{
+   spin_lock(>i_sb->s_inode_ids_lock);
+   if (idr_find(>i_sb->s_inode_ids, (int) inode->i_ino))
+   idr_remove(>i_sb->s_inode_ids, (int) inode->i_ino);
+   spin_unlock(>i_sb->s_inode_ids_lock);
+}
+EXPORT_SYMBOL_GPL(iunique_unregister);
+
 struct inode *igrab(struct inode *inode)
 {
spin_lock(_lock);
@@ -1025,6 +1053,7 @@ void generic_delete_inode(struct inode *
spin_lock(_lock);
hlist_del_init(>i_hash);
spin_unlock(_lock);
+   iunique_unregister(inode);
wake_up_inode(inode);
BUG_ON(inode->i_state != I_CLEAR);
destroy_inode(inode);
@@ -1057,6 +1086,7 @@ static void generic_forget_inode(struct
inode->i_state |= I_FREEING;
inodes_stat.nr_inodes--;
spin_unlock(_lock);
+   iunique_unregister(inode);
if (inode->i_data.nrpages)
truncate_inode_pages(>i_data, 0);
clear_inode(inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index b1626f2..74dbbf0 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -845,6 +845,9 @@ static struct inode * get_pipe_inode(voi
if (!inode)
goto fail_inode;

+   if (iunique_register(inode, 0))
+   goto fail_iput;
+
pipe = alloc_pipe_info(inode);
if (!pipe)
goto fail_iput;
@@ 

Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Randy Dunlap

Jeff Layton wrote:

On Fri, Dec 01, 2006 at 08:52:27AM -0800, Randy Dunlap wrote:

Thanks for having a look, Randy...


s/idr_/iunique_/


Doh! Can you tell I cut and pasted this email from earlier ones? :-)


- don't attempt to remove inodes with values <100

Please explain that one.  (May be obvious to some, but not to me.)


Actually, we probably don't need to do that now. My thought here was to add
a low range of i_ino numbers that could be used by the filesystem code without
needing to call iunique (in particular for things like the root inode in the
filesystem). It's probably best to not do this though and let the filesystem
handle it on its own.


Better to post patches inline (for review) rather than as attachments.


Here's an updated (but untested) patch based on your suggestions. I also went
ahead and made the exported symbols GPL-only since that seems like it would be
appropriate here. Any further thoughts on it?


Just needs new/updated patch description.
and one "typo" fixed.


diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..e45cec9 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -706,6 +708,32 @@ retry:
 
 EXPORT_SYMBOL(iunique);
 
+int iunique_register(struct inode *inode, int max_reserved)

+{
+   int rv;
+
+   rv = idr_pre_get(>i_sb->s_inode_ids, GFP_KERNEL);
+   if (! rv)


No space after !, just:
if (!rv)


+   return -ENOMEM;
+
+   spin_lock(>i_sb->s_inode_ids_lock);
+   rv = idr_get_new_above(>i_sb->s_inode_ids, inode,
+   max_reserved+1, (int *) >i_ino);
+   inode->i_generation = inode->i_sb->s_generation++;
+   spin_unlock(>i_sb->s_inode_ids_lock);
+   return rv;
+}


Thanks.
--
~Randy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Jeff Layton
On Fri, Dec 01, 2006 at 08:52:27AM -0800, Randy Dunlap wrote:

Thanks for having a look, Randy...

> s/idr_/iunique_/

Doh! Can you tell I cut and pasted this email from earlier ones? :-)

> > - don't attempt to remove inodes with values <100
> 
> Please explain that one.  (May be obvious to some, but not to me.)

Actually, we probably don't need to do that now. My thought here was to add
a low range of i_ino numbers that could be used by the filesystem code without
needing to call iunique (in particular for things like the root inode in the
filesystem). It's probably best to not do this though and let the filesystem
handle it on its own.

> Better to post patches inline (for review) rather than as attachments.

Here's an updated (but untested) patch based on your suggestions. I also went
ahead and made the exported symbols GPL-only since that seems like it would be
appropriate here. Any further thoughts on it?

Signed-off-by: Jeff Layton <[EMAIL PROTECTED]>

diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..e45cec9 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -288,6 +288,8 @@ static void dispose_list(struct list_hea
list_del_init(>i_sb_list);
spin_unlock(_lock);
 
+   iunique_unregister(inode);
+
wake_up_inode(inode);
destroy_inode(inode);
nr_disposed++;
@@ -706,6 +708,32 @@ retry:
 
 EXPORT_SYMBOL(iunique);
 
+int iunique_register(struct inode *inode, int max_reserved)
+{
+   int rv;
+
+   rv = idr_pre_get(>i_sb->s_inode_ids, GFP_KERNEL);
+   if (! rv)
+   return -ENOMEM;
+
+   spin_lock(>i_sb->s_inode_ids_lock);
+   rv = idr_get_new_above(>i_sb->s_inode_ids, inode,
+   max_reserved+1, (int *) >i_ino);
+   inode->i_generation = inode->i_sb->s_generation++;
+   spin_unlock(>i_sb->s_inode_ids_lock);
+   return rv;
+}
+EXPORT_SYMBOL_GPL(iunique_register);
+
+void iunique_unregister(struct inode *inode)
+{
+   spin_lock(>i_sb->s_inode_ids_lock);
+   if (idr_find(>i_sb->s_inode_ids, (int) inode->i_ino))
+   idr_remove(>i_sb->s_inode_ids, (int) inode->i_ino);
+   spin_unlock(>i_sb->s_inode_ids_lock);
+}
+EXPORT_SYMBOL_GPL(iunique_unregister);
+
 struct inode *igrab(struct inode *inode)
 {
spin_lock(_lock);
@@ -1025,6 +1053,7 @@ void generic_delete_inode(struct inode *
spin_lock(_lock);
hlist_del_init(>i_hash);
spin_unlock(_lock);
+   iunique_unregister(inode);
wake_up_inode(inode);
BUG_ON(inode->i_state != I_CLEAR);
destroy_inode(inode);
@@ -1057,6 +1086,7 @@ static void generic_forget_inode(struct 
inode->i_state |= I_FREEING;
inodes_stat.nr_inodes--;
spin_unlock(_lock);
+   iunique_unregister(inode);
if (inode->i_data.nrpages)
truncate_inode_pages(>i_data, 0);
clear_inode(inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index b1626f2..d74ae65 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -845,6 +845,9 @@ static struct inode * get_pipe_inode(voi
if (!inode)
goto fail_inode;
 
+   if (iunique_register(inode, 0))
+   goto fail_iput;
+
pipe = alloc_pipe_info(inode);
if (!pipe)
goto fail_iput;
diff --git a/fs/super.c b/fs/super.c
index 47e554c..d2dbdec 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -93,6 +93,8 @@ static struct super_block *alloc_super(s
s->s_qcop = sb_quotactl_ops;
s->s_op = _op;
s->s_time_gran = 10;
+   idr_init(>s_inode_ids);
+   spin_lock_init(>s_inode_ids_lock);
}
 out:
return s;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..3afb4a2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -278,6 +278,7 @@ #include 
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -961,6 +962,12 @@ #endif
/* Granularity of c/m/atime in ns.
   Cannot be worse than a second */
u32s_time_gran;
+
+   /* for fs's with dynamic i_ino values, track them with idr, and 
increment
+  the generation every time we register a new inode */
+   __u32   s_generation;
+   struct idr  s_inode_ids;
+   spinlock_t  s_inode_ids_lock;
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
@@ -1681,6 +1688,8 @@ extern void inode_init_once(struct inode
 extern void iput(struct inode *);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
+extern int iunique_register(struct inode *inode, int max_reserved);
+extern void iunique_unregister(struct inode *inode);
 extern int inode_needs_sync(struct inode *inode);
 extern void generic_delete_inode(struct inode *inode);
 extern void generic_drop_inode(struct inode *inode);
-
To unsubscribe from this list: send the line "unsubscribe 

Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Randy Dunlap
On Fri, 01 Dec 2006 09:48:36 -0500 Jeff Layton wrote:

> This patch is a proof of concept. It works, but still needs a bit of
> polish before it's ready for submission. First, the problems:
> 
> 
> This patch is a first step at correcting these problems. This adds 2 new
> functions, an idr_register and idr_unregister. Filesystems can call
> idr_register at inode creation time, and then at deletion time, we'll
> automatically unregister them.

s/idr_/iunique_/

> This patch also adds a new s_generation counter to the superblock.
> Because i_ino's can be reused so quickly, we don't want NFS getting
> confused when it happens. When iunique_register is called, we'll assign
> the s_generation value to the i_generation, and then increment it to
> help ensure that we get different filehandles.
> 
> There are some things that need to be cleaned up, of course:
> 
> - error handling for the idr calls
> 
> - recheck all the possible places where the inode should be unhashed
> 
> - don't attempt to remove inodes with values <100

Please explain that one.  (May be obvious to some, but not to me.)

> - convert other filesystems
> 
> - remove the static counter from new_inode and (maybe) eliminate iunique
> 
> Comments and suggestions appreciated.


Better to post patches inline (for review) rather than as attachments.

Some (mostly style) comments on the patch:

+   rv = idr_pre_get(>i_sb->s_inode_ids, GFP_KERNEL);
+   if (! rv)
+   return -ENOMEM;

if (!rv)

+   rv = idr_get_new_above(>i_sb->s_inode_ids, inode,
+   max_reserved+1, (int *) >i_ino);

max_reserved + 1,

+}
+
+EXPORT_SYMBOL(iunique_register);

No need for the extra blank line after the function closing
brace.  Just put the EXPORT_SYMBOL immediately on the next line.
(in multiple places)

@@ -1681,6 +1688,8 @@ extern void inode_init_once(struct inode
 extern void iput(struct inode *);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
+extern int iunique_register(struct inode *, int);
+extern void iunique_unregister(struct inode *);
 extern int inode_needs_sync(struct inode *inode);
 extern void generic_delete_inode(struct inode *inode);
 extern void generic_drop_inode(struct inode *inode);

Some of these have a parameter name, some don't.
Having a (meaningful) parameter name is strongly preferred.

---
~Randy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Jeff Layton

This patch is a proof of concept. It works, but still needs a bit of
polish before it's ready for submission. First, the problems:

1) on filesystems w/o permanent inode numbers, i_ino values can be
larger than 32 bits, which can cause problems for some 32 bit userspace
programs on a 64 bit kernel.

2) many filesystems call new_inode and assume that the i_ino values they
are given are unique. They are not guaranteed to be so, since the static
counter can wrap.

3) after allocating a new inode, some filesystems call iunique to try to
get a unique i_ino value, but they don't actually add their inodes to
the hashtable, so they're still not guaranteed to be unique.

This patch is a first step at correcting these problems. This adds 2 new
functions, an idr_register and idr_unregister. Filesystems can call
idr_register at inode creation time, and then at deletion time, we'll
automatically unregister them.

This patch also adds a new s_generation counter to the superblock.
Because i_ino's can be reused so quickly, we don't want NFS getting
confused when it happens. When iunique_register is called, we'll assign
the s_generation value to the i_generation, and then increment it to
help ensure that we get different filehandles.

There are some things that need to be cleaned up, of course:

- error handling for the idr calls

- recheck all the possible places where the inode should be unhashed

- don't attempt to remove inodes with values <100

- convert other filesystems

- remove the static counter from new_inode and (maybe) eliminate iunique

The patch also converts pipefs to use the new scheme as an example. Al
Viro had expressed some concern with an earlier patch that this might
slow down pipe creation. I've done some testing and I think the impact
will be minimal. Timing a small program that creates and closes 100
million pipes in a loop:

patched:
-
real8m8.623s
user0m37.418s
sys 7m31.196s

unpatched:
--
real8m7.150s
user0m40.943s
sys 7m26.204s

As the number of pipes grows on the system, this time may grow somewhat
but it doesn't seem like it would be terrible.

Comments and suggestions appreciated.

Signed-off-by: Jeff Layton <[EMAIL PROTECTED]>


diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..841e2fc 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -288,6 +288,8 @@ static void dispose_list(struct list_hea
 		list_del_init(>i_sb_list);
 		spin_unlock(_lock);
 
+		iunique_unregister(inode);
+
 		wake_up_inode(inode);
 		destroy_inode(inode);
 		nr_disposed++;
@@ -706,6 +708,34 @@ retry:
 
 EXPORT_SYMBOL(iunique);
 
+int iunique_register(struct inode *inode, int max_reserved)
+{
+	int rv;
+
+	rv = idr_pre_get(>i_sb->s_inode_ids, GFP_KERNEL);
+	if (! rv)
+		return -ENOMEM;
+
+	spin_lock(>i_sb->s_inode_ids_lock);
+	rv = idr_get_new_above(>i_sb->s_inode_ids, inode,
+		max_reserved+1, (int *) >i_ino);
+	inode->i_generation = inode->i_sb->s_generation++;
+	spin_unlock(>i_sb->s_inode_ids_lock);
+	return rv;
+}
+
+EXPORT_SYMBOL(iunique_register);
+
+void iunique_unregister(struct inode *inode)
+{
+	spin_lock(>i_sb->s_inode_ids_lock);
+	if (idr_find(>i_sb->s_inode_ids, (int) inode->i_ino))
+		idr_remove(>i_sb->s_inode_ids, (int) inode->i_ino);
+	spin_unlock(>i_sb->s_inode_ids_lock);
+}
+
+EXPORT_SYMBOL(iunique_unregister);
+
 struct inode *igrab(struct inode *inode)
 {
 	spin_lock(_lock);
@@ -1025,6 +1055,7 @@ void generic_delete_inode(struct inode *
 	spin_lock(_lock);
 	hlist_del_init(>i_hash);
 	spin_unlock(_lock);
+	iunique_unregister(inode);
 	wake_up_inode(inode);
 	BUG_ON(inode->i_state != I_CLEAR);
 	destroy_inode(inode);
@@ -1057,6 +1088,7 @@ static void generic_forget_inode(struct 
 	inode->i_state |= I_FREEING;
 	inodes_stat.nr_inodes--;
 	spin_unlock(_lock);
+	iunique_unregister(inode);
 	if (inode->i_data.nrpages)
 		truncate_inode_pages(>i_data, 0);
 	clear_inode(inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index b1626f2..d74ae65 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -845,6 +845,9 @@ static struct inode * get_pipe_inode(voi
 	if (!inode)
 		goto fail_inode;
 
+	if (iunique_register(inode, 0))
+		goto fail_iput;
+
 	pipe = alloc_pipe_info(inode);
 	if (!pipe)
 		goto fail_iput;
diff --git a/fs/super.c b/fs/super.c
index 47e554c..d2dbdec 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -93,6 +93,8 @@ static struct super_block *alloc_super(s
 		s->s_qcop = sb_quotactl_ops;
 		s->s_op = _op;
 		s->s_time_gran = 10;
+		idr_init(>s_inode_ids);
+		spin_lock_init(>s_inode_ids_lock);
 	}
 out:
 	return s;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..3ad12a6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -278,6 +278,7 @@ #include 
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -961,6 +962,12 @@ #endif
 	/* Granularity of c/m/atime in ns.
 	   Cannot be worse than a second */
 	u32		   s_time_gran;
+
+	/* for fs's with dynamic i_ino values, track them with idr, and increment
+	   the generation every 

[RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Jeff Layton

This patch is a proof of concept. It works, but still needs a bit of
polish before it's ready for submission. First, the problems:

1) on filesystems w/o permanent inode numbers, i_ino values can be
larger than 32 bits, which can cause problems for some 32 bit userspace
programs on a 64 bit kernel.

2) many filesystems call new_inode and assume that the i_ino values they
are given are unique. They are not guaranteed to be so, since the static
counter can wrap.

3) after allocating a new inode, some filesystems call iunique to try to
get a unique i_ino value, but they don't actually add their inodes to
the hashtable, so they're still not guaranteed to be unique.

This patch is a first step at correcting these problems. This adds 2 new
functions, an idr_register and idr_unregister. Filesystems can call
idr_register at inode creation time, and then at deletion time, we'll
automatically unregister them.

This patch also adds a new s_generation counter to the superblock.
Because i_ino's can be reused so quickly, we don't want NFS getting
confused when it happens. When iunique_register is called, we'll assign
the s_generation value to the i_generation, and then increment it to
help ensure that we get different filehandles.

There are some things that need to be cleaned up, of course:

- error handling for the idr calls

- recheck all the possible places where the inode should be unhashed

- don't attempt to remove inodes with values 100

- convert other filesystems

- remove the static counter from new_inode and (maybe) eliminate iunique

The patch also converts pipefs to use the new scheme as an example. Al
Viro had expressed some concern with an earlier patch that this might
slow down pipe creation. I've done some testing and I think the impact
will be minimal. Timing a small program that creates and closes 100
million pipes in a loop:

patched:
-
real8m8.623s
user0m37.418s
sys 7m31.196s

unpatched:
--
real8m7.150s
user0m40.943s
sys 7m26.204s

As the number of pipes grows on the system, this time may grow somewhat
but it doesn't seem like it would be terrible.

Comments and suggestions appreciated.

Signed-off-by: Jeff Layton [EMAIL PROTECTED]


diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..841e2fc 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -288,6 +288,8 @@ static void dispose_list(struct list_hea
 		list_del_init(inode-i_sb_list);
 		spin_unlock(inode_lock);
 
+		iunique_unregister(inode);
+
 		wake_up_inode(inode);
 		destroy_inode(inode);
 		nr_disposed++;
@@ -706,6 +708,34 @@ retry:
 
 EXPORT_SYMBOL(iunique);
 
+int iunique_register(struct inode *inode, int max_reserved)
+{
+	int rv;
+
+	rv = idr_pre_get(inode-i_sb-s_inode_ids, GFP_KERNEL);
+	if (! rv)
+		return -ENOMEM;
+
+	spin_lock(inode-i_sb-s_inode_ids_lock);
+	rv = idr_get_new_above(inode-i_sb-s_inode_ids, inode,
+		max_reserved+1, (int *) inode-i_ino);
+	inode-i_generation = inode-i_sb-s_generation++;
+	spin_unlock(inode-i_sb-s_inode_ids_lock);
+	return rv;
+}
+
+EXPORT_SYMBOL(iunique_register);
+
+void iunique_unregister(struct inode *inode)
+{
+	spin_lock(inode-i_sb-s_inode_ids_lock);
+	if (idr_find(inode-i_sb-s_inode_ids, (int) inode-i_ino))
+		idr_remove(inode-i_sb-s_inode_ids, (int) inode-i_ino);
+	spin_unlock(inode-i_sb-s_inode_ids_lock);
+}
+
+EXPORT_SYMBOL(iunique_unregister);
+
 struct inode *igrab(struct inode *inode)
 {
 	spin_lock(inode_lock);
@@ -1025,6 +1055,7 @@ void generic_delete_inode(struct inode *
 	spin_lock(inode_lock);
 	hlist_del_init(inode-i_hash);
 	spin_unlock(inode_lock);
+	iunique_unregister(inode);
 	wake_up_inode(inode);
 	BUG_ON(inode-i_state != I_CLEAR);
 	destroy_inode(inode);
@@ -1057,6 +1088,7 @@ static void generic_forget_inode(struct 
 	inode-i_state |= I_FREEING;
 	inodes_stat.nr_inodes--;
 	spin_unlock(inode_lock);
+	iunique_unregister(inode);
 	if (inode-i_data.nrpages)
 		truncate_inode_pages(inode-i_data, 0);
 	clear_inode(inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index b1626f2..d74ae65 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -845,6 +845,9 @@ static struct inode * get_pipe_inode(voi
 	if (!inode)
 		goto fail_inode;
 
+	if (iunique_register(inode, 0))
+		goto fail_iput;
+
 	pipe = alloc_pipe_info(inode);
 	if (!pipe)
 		goto fail_iput;
diff --git a/fs/super.c b/fs/super.c
index 47e554c..d2dbdec 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -93,6 +93,8 @@ static struct super_block *alloc_super(s
 		s-s_qcop = sb_quotactl_ops;
 		s-s_op = default_op;
 		s-s_time_gran = 10;
+		idr_init(s-s_inode_ids);
+		spin_lock_init(s-s_inode_ids_lock);
 	}
 out:
 	return s;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..3ad12a6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -278,6 +278,7 @@ #include linux/prio_tree.h
 #include linux/init.h
 #include linux/sched.h
 #include linux/mutex.h
+#include linux/idr.h
 
 #include asm/atomic.h
 #include asm/semaphore.h
@@ -961,6 +962,12 @@ #endif
 	/* Granularity of c/m/atime in ns.
 	 

Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Randy Dunlap
On Fri, 01 Dec 2006 09:48:36 -0500 Jeff Layton wrote:

 This patch is a proof of concept. It works, but still needs a bit of
 polish before it's ready for submission. First, the problems:
 
 
 This patch is a first step at correcting these problems. This adds 2 new
 functions, an idr_register and idr_unregister. Filesystems can call
 idr_register at inode creation time, and then at deletion time, we'll
 automatically unregister them.

s/idr_/iunique_/

 This patch also adds a new s_generation counter to the superblock.
 Because i_ino's can be reused so quickly, we don't want NFS getting
 confused when it happens. When iunique_register is called, we'll assign
 the s_generation value to the i_generation, and then increment it to
 help ensure that we get different filehandles.
 
 There are some things that need to be cleaned up, of course:
 
 - error handling for the idr calls
 
 - recheck all the possible places where the inode should be unhashed
 
 - don't attempt to remove inodes with values 100

Please explain that one.  (May be obvious to some, but not to me.)

 - convert other filesystems
 
 - remove the static counter from new_inode and (maybe) eliminate iunique
 
 Comments and suggestions appreciated.


Better to post patches inline (for review) rather than as attachments.

Some (mostly style) comments on the patch:

+   rv = idr_pre_get(inode-i_sb-s_inode_ids, GFP_KERNEL);
+   if (! rv)
+   return -ENOMEM;

if (!rv)

+   rv = idr_get_new_above(inode-i_sb-s_inode_ids, inode,
+   max_reserved+1, (int *) inode-i_ino);

max_reserved + 1,

+}
+
+EXPORT_SYMBOL(iunique_register);

No need for the extra blank line after the function closing
brace.  Just put the EXPORT_SYMBOL immediately on the next line.
(in multiple places)

@@ -1681,6 +1688,8 @@ extern void inode_init_once(struct inode
 extern void iput(struct inode *);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
+extern int iunique_register(struct inode *, int);
+extern void iunique_unregister(struct inode *);
 extern int inode_needs_sync(struct inode *inode);
 extern void generic_delete_inode(struct inode *inode);
 extern void generic_drop_inode(struct inode *inode);

Some of these have a parameter name, some don't.
Having a (meaningful) parameter name is strongly preferred.

---
~Randy
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Jeff Layton
On Fri, Dec 01, 2006 at 08:52:27AM -0800, Randy Dunlap wrote:

Thanks for having a look, Randy...

 s/idr_/iunique_/

Doh! Can you tell I cut and pasted this email from earlier ones? :-)

  - don't attempt to remove inodes with values 100
 
 Please explain that one.  (May be obvious to some, but not to me.)

Actually, we probably don't need to do that now. My thought here was to add
a low range of i_ino numbers that could be used by the filesystem code without
needing to call iunique (in particular for things like the root inode in the
filesystem). It's probably best to not do this though and let the filesystem
handle it on its own.

 Better to post patches inline (for review) rather than as attachments.

Here's an updated (but untested) patch based on your suggestions. I also went
ahead and made the exported symbols GPL-only since that seems like it would be
appropriate here. Any further thoughts on it?

Signed-off-by: Jeff Layton [EMAIL PROTECTED]

diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..e45cec9 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -288,6 +288,8 @@ static void dispose_list(struct list_hea
list_del_init(inode-i_sb_list);
spin_unlock(inode_lock);
 
+   iunique_unregister(inode);
+
wake_up_inode(inode);
destroy_inode(inode);
nr_disposed++;
@@ -706,6 +708,32 @@ retry:
 
 EXPORT_SYMBOL(iunique);
 
+int iunique_register(struct inode *inode, int max_reserved)
+{
+   int rv;
+
+   rv = idr_pre_get(inode-i_sb-s_inode_ids, GFP_KERNEL);
+   if (! rv)
+   return -ENOMEM;
+
+   spin_lock(inode-i_sb-s_inode_ids_lock);
+   rv = idr_get_new_above(inode-i_sb-s_inode_ids, inode,
+   max_reserved+1, (int *) inode-i_ino);
+   inode-i_generation = inode-i_sb-s_generation++;
+   spin_unlock(inode-i_sb-s_inode_ids_lock);
+   return rv;
+}
+EXPORT_SYMBOL_GPL(iunique_register);
+
+void iunique_unregister(struct inode *inode)
+{
+   spin_lock(inode-i_sb-s_inode_ids_lock);
+   if (idr_find(inode-i_sb-s_inode_ids, (int) inode-i_ino))
+   idr_remove(inode-i_sb-s_inode_ids, (int) inode-i_ino);
+   spin_unlock(inode-i_sb-s_inode_ids_lock);
+}
+EXPORT_SYMBOL_GPL(iunique_unregister);
+
 struct inode *igrab(struct inode *inode)
 {
spin_lock(inode_lock);
@@ -1025,6 +1053,7 @@ void generic_delete_inode(struct inode *
spin_lock(inode_lock);
hlist_del_init(inode-i_hash);
spin_unlock(inode_lock);
+   iunique_unregister(inode);
wake_up_inode(inode);
BUG_ON(inode-i_state != I_CLEAR);
destroy_inode(inode);
@@ -1057,6 +1086,7 @@ static void generic_forget_inode(struct 
inode-i_state |= I_FREEING;
inodes_stat.nr_inodes--;
spin_unlock(inode_lock);
+   iunique_unregister(inode);
if (inode-i_data.nrpages)
truncate_inode_pages(inode-i_data, 0);
clear_inode(inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index b1626f2..d74ae65 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -845,6 +845,9 @@ static struct inode * get_pipe_inode(voi
if (!inode)
goto fail_inode;
 
+   if (iunique_register(inode, 0))
+   goto fail_iput;
+
pipe = alloc_pipe_info(inode);
if (!pipe)
goto fail_iput;
diff --git a/fs/super.c b/fs/super.c
index 47e554c..d2dbdec 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -93,6 +93,8 @@ static struct super_block *alloc_super(s
s-s_qcop = sb_quotactl_ops;
s-s_op = default_op;
s-s_time_gran = 10;
+   idr_init(s-s_inode_ids);
+   spin_lock_init(s-s_inode_ids_lock);
}
 out:
return s;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 2fe6e3f..3afb4a2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -278,6 +278,7 @@ #include linux/prio_tree.h
 #include linux/init.h
 #include linux/sched.h
 #include linux/mutex.h
+#include linux/idr.h
 
 #include asm/atomic.h
 #include asm/semaphore.h
@@ -961,6 +962,12 @@ #endif
/* Granularity of c/m/atime in ns.
   Cannot be worse than a second */
u32s_time_gran;
+
+   /* for fs's with dynamic i_ino values, track them with idr, and 
increment
+  the generation every time we register a new inode */
+   __u32   s_generation;
+   struct idr  s_inode_ids;
+   spinlock_t  s_inode_ids_lock;
 };
 
 extern struct timespec current_fs_time(struct super_block *sb);
@@ -1681,6 +1688,8 @@ extern void inode_init_once(struct inode
 extern void iput(struct inode *);
 extern struct inode * igrab(struct inode *);
 extern ino_t iunique(struct super_block *, ino_t);
+extern int iunique_register(struct inode *inode, int max_reserved);
+extern void iunique_unregister(struct inode *inode);
 extern int inode_needs_sync(struct inode *inode);
 extern void 

Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr)

2006-12-01 Thread Randy Dunlap

Jeff Layton wrote:

On Fri, Dec 01, 2006 at 08:52:27AM -0800, Randy Dunlap wrote:

Thanks for having a look, Randy...


s/idr_/iunique_/


Doh! Can you tell I cut and pasted this email from earlier ones? :-)


- don't attempt to remove inodes with values 100

Please explain that one.  (May be obvious to some, but not to me.)


Actually, we probably don't need to do that now. My thought here was to add
a low range of i_ino numbers that could be used by the filesystem code without
needing to call iunique (in particular for things like the root inode in the
filesystem). It's probably best to not do this though and let the filesystem
handle it on its own.


Better to post patches inline (for review) rather than as attachments.


Here's an updated (but untested) patch based on your suggestions. I also went
ahead and made the exported symbols GPL-only since that seems like it would be
appropriate here. Any further thoughts on it?


Just needs new/updated patch description.
and one typo fixed.


diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..e45cec9 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -706,6 +708,32 @@ retry:
 
 EXPORT_SYMBOL(iunique);
 
+int iunique_register(struct inode *inode, int max_reserved)

+{
+   int rv;
+
+   rv = idr_pre_get(inode-i_sb-s_inode_ids, GFP_KERNEL);
+   if (! rv)


No space after !, just:
if (!rv)


+   return -ENOMEM;
+
+   spin_lock(inode-i_sb-s_inode_ids_lock);
+   rv = idr_get_new_above(inode-i_sb-s_inode_ids, inode,
+   max_reserved+1, (int *) inode-i_ino);
+   inode-i_generation = inode-i_sb-s_generation++;
+   spin_unlock(inode-i_sb-s_inode_ids_lock);
+   return rv;
+}


Thanks.
--
~Randy
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] ensure i_ino uniqueness in filesystems without permanent inode numbers (via idr hashing)

2006-12-01 Thread Jeff Layton
Thanks again, Randy. Here's an updated and tested patch and description. 
This
one also makes sure that the root inode for the mount gets a unique 
i_ino value

as well. Let me know what you think...

--[snip]---

This patch is a proof of concept. It works, but I'd like to get some buyin
on the approach before I start doing the legwork to convert all of
the existing filesystems to use it. First, the problems:

1) on filesystems w/o permanent inode numbers, i_ino values can be
larger than 32 bits, which can cause problems for some 32 bit userspace
programs on a 64 bit kernel.

2) many filesystems call new_inode and assume that the i_ino values they
are given are unique. They are not guaranteed to be so, since the static
counter can wrap.

3) after allocating a new inode, some filesystems call iunique to try to
get a unique i_ino value, but they don't actually add their inodes to
the hashtable, and so they're still not guaranteed to be unique.

One way to fix this would be to just make sure they all use iunique and hash
their inodes, but that might slow down lookups for filesystems whose inodes
are not pinned in memory.

This patch is one way to correct these problems. This adds 2 new
functions, an iunique_register and iunique_unregister. Filesystems can call
iunique_register at inode creation time, and then at deletion time, we'll
automatically unregister them.

This patch also adds a new s_generation counter to the superblock.
Because i_ino's can be reused so quickly, we don't want NFS getting
confused when it happens. When iunique_register is called, we'll assign
the s_generation value to the i_generation, and then increment it to
help ensure that we get different filehandles.

There are some things that need to be cleaned up, of course:

- better error handling for the iunique calls

- recheck all the possible places where the inode should be unhashed

- convert other filesystems

- remove the static counter from new_inode and (maybe) eliminate iunique

The patch also converts pipefs to use the new scheme as an example. Al
Viro had expressed some concern with an earlier patch that this might
slow down pipe creation. I've done some testing and I think the impact
will be minimal. Timing a small program that creates and closes 100
million pipes in a loop:

patched:
-
real8m8.623s
user0m37.418s
sys 7m31.196s

unpatched:
--
real8m7.150s
user0m40.943s
sys 7m26.204s

As the number of pipes grows on the system this time may grow somewhat,
but it doesn't seem like it would be terrible.

Comments and suggestions appreciated.

Signed-off-by: Jeff Layton [EMAIL PROTECTED]


diff --git a/fs/inode.c b/fs/inode.c
index 26cdb11..252192a 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -288,6 +288,8 @@ static void dispose_list(struct list_hea
list_del_init(inode-i_sb_list);
spin_unlock(inode_lock);

+   iunique_unregister(inode);
+
wake_up_inode(inode);
destroy_inode(inode);
nr_disposed++;
@@ -706,6 +708,32 @@ retry:

 EXPORT_SYMBOL(iunique);

+int iunique_register(struct inode *inode, int max_reserved)
+{
+   int rv;
+
+   rv = idr_pre_get(inode-i_sb-s_inode_ids, GFP_KERNEL);
+   if (!rv)
+   return -ENOMEM;
+
+   spin_lock(inode-i_sb-s_inode_ids_lock);
+   rv = idr_get_new_above(inode-i_sb-s_inode_ids, inode,
+   max_reserved+1, (int *) inode-i_ino);
+   inode-i_generation = inode-i_sb-s_generation++;
+   spin_unlock(inode-i_sb-s_inode_ids_lock);
+   return rv;
+}
+EXPORT_SYMBOL_GPL(iunique_register);
+
+void iunique_unregister(struct inode *inode)
+{
+   spin_lock(inode-i_sb-s_inode_ids_lock);
+   if (idr_find(inode-i_sb-s_inode_ids, (int) inode-i_ino))
+   idr_remove(inode-i_sb-s_inode_ids, (int) inode-i_ino);
+   spin_unlock(inode-i_sb-s_inode_ids_lock);
+}
+EXPORT_SYMBOL_GPL(iunique_unregister);
+
 struct inode *igrab(struct inode *inode)
 {
spin_lock(inode_lock);
@@ -1025,6 +1053,7 @@ void generic_delete_inode(struct inode *
spin_lock(inode_lock);
hlist_del_init(inode-i_hash);
spin_unlock(inode_lock);
+   iunique_unregister(inode);
wake_up_inode(inode);
BUG_ON(inode-i_state != I_CLEAR);
destroy_inode(inode);
@@ -1057,6 +1086,7 @@ static void generic_forget_inode(struct
inode-i_state |= I_FREEING;
inodes_stat.nr_inodes--;
spin_unlock(inode_lock);
+   iunique_unregister(inode);
if (inode-i_data.nrpages)
truncate_inode_pages(inode-i_data, 0);
clear_inode(inode);
diff --git a/fs/pipe.c b/fs/pipe.c
index b1626f2..74dbbf0 100644
--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -845,6 +845,9 @@ static struct inode * get_pipe_inode(voi
if (!inode)
goto fail_inode;

+   if (iunique_register(inode, 0))
+   goto fail_iput;
+
pipe =