Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-08 Thread Badari Pulavarty



Eric W. Biederman wrote:



At this point given that we actually have a small user space dependency
and the fact that after I have reviewed the code it looks harmless to
change the inode number of those inodes, in both cases they are just
anonymous inodes generated with new_inode, and anything that we wrap
is likely to be equally so.

So it looks to me like we need to do three things:
- Fix the inode number


Okay. its already done.



- Fix the name on the hugetlbfs dentry to hold the key

I don't see need for doing this for hugetlbfs inodes. Currently, they 
don't base their
name on "key" + basing on the "key" is kind of useless anyway (its not 
unique).




- Add a big fat comment that user space programs depend on this
 behavior of both the dentry name and the inode number.

I don't think, the user-space can depend on the dentry-name. It can only 
depend

on inode# to match shmid. (since key is not unique esp. for key=0x).

BTW, I agree that shmid is not unique even without namespaces as its 
based on

seq# and we wrap seq#.

Thanks,
Badari




-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-08 Thread Albert Cahalan

On 6/8/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:

"Albert Cahalan" <[EMAIL PROTECTED]> writes:
> On 6/7/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:



>> So it looks to me like we need to do three things:
>> - Fix the inode number
>> - Fix the name on the hugetlbfs dentry to hold the key
>> - Add a big fat comment that user space programs depend on this
>>   behavior of both the dentry name and the inode number.
>
> Assuming that this proposed fix goes in:
>
> Since the inode number is the shmid, and this is a number
> that the kernel randomly chooses AFAIK, there should be
> no need to have different shm segments sharing the same
> inode number.

Where we run into inode number confusion is that all of
these shm segments are actually files on a tmpfs filesystem
somewhere, and by making the inode number the shmid we loose
the tmpfs inode number.  So it is possible we get tmpfs inode
number conflicts.  However the inode number is not used for
anything, and the files are not visible in any other way except
as shm segments so it doesn't matter.


Eh, the kernel choses both shmid and tmpfs inode number.
You could set a high bit in one or the other.


There is another case with ipc namespaces where we ultimately need
to support duplicate shmids on the same machine (so migration
is a possibility).  However by and large the user space
processes with duplicate ids should be invisible to each other.


On the bright side, this only screws up people who get the
crazy idea that processes can be migrated.


> The situation with the key is a bit more disturbing, though
> we already hit that anyway when IPC_PRIVATE is used.
> (why anybody would NOT use IPC_PRIVATE is a mystery)
> So having the key in the name doesn't make things worse.

Having "SYSV" in the name appears mandatory.  Otherwise you
don't even know it is a shm file. Although I may be confused.


It's mandatory for a different reason: to satisfy parsers.

It is nearly useless for identifying shm files. Look what I can do:
   touch /SYSV
   touch '/SYSV (deleted)'

(so pmap creates a shm, looks for the address in /proc/self/maps,
determines the device major/minor in use, and then uses that)


Hmm.  Thinking about this I have just realized that we may want
to approach this a little differently.  Currently I am reusing
the dentry and inode structure that hugetlbfs and tmpfs return
me, and simply have a distinct struct file for each shm mapping.

There is a little more cost but it may actually make sense to have
a dentry and inode that is specific to shm.c so we can do whatever
we need to without adding requirements to the normal tmpfs or hugtlb
code.


Piggybacking on tmpfs has always seemed a bit dirty to me.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-08 Thread Albert Cahalan

On 6/8/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

Albert Cahalan [EMAIL PROTECTED] writes:
 On 6/7/07, Eric W. Biederman [EMAIL PROTECTED] wrote:



 So it looks to me like we need to do three things:
 - Fix the inode number
 - Fix the name on the hugetlbfs dentry to hold the key
 - Add a big fat comment that user space programs depend on this
   behavior of both the dentry name and the inode number.

 Assuming that this proposed fix goes in:

 Since the inode number is the shmid, and this is a number
 that the kernel randomly chooses AFAIK, there should be
 no need to have different shm segments sharing the same
 inode number.

Where we run into inode number confusion is that all of
these shm segments are actually files on a tmpfs filesystem
somewhere, and by making the inode number the shmid we loose
the tmpfs inode number.  So it is possible we get tmpfs inode
number conflicts.  However the inode number is not used for
anything, and the files are not visible in any other way except
as shm segments so it doesn't matter.


Eh, the kernel choses both shmid and tmpfs inode number.
You could set a high bit in one or the other.


There is another case with ipc namespaces where we ultimately need
to support duplicate shmids on the same machine (so migration
is a possibility).  However by and large the user space
processes with duplicate ids should be invisible to each other.


On the bright side, this only screws up people who get the
crazy idea that processes can be migrated.


 The situation with the key is a bit more disturbing, though
 we already hit that anyway when IPC_PRIVATE is used.
 (why anybody would NOT use IPC_PRIVATE is a mystery)
 So having the key in the name doesn't make things worse.

Having SYSV in the name appears mandatory.  Otherwise you
don't even know it is a shm file. Although I may be confused.


It's mandatory for a different reason: to satisfy parsers.

It is nearly useless for identifying shm files. Look what I can do:
   touch /SYSV
   touch '/SYSV (deleted)'

(so pmap creates a shm, looks for the address in /proc/self/maps,
determines the device major/minor in use, and then uses that)


Hmm.  Thinking about this I have just realized that we may want
to approach this a little differently.  Currently I am reusing
the dentry and inode structure that hugetlbfs and tmpfs return
me, and simply have a distinct struct file for each shm mapping.

There is a little more cost but it may actually make sense to have
a dentry and inode that is specific to shm.c so we can do whatever
we need to without adding requirements to the normal tmpfs or hugtlb
code.


Piggybacking on tmpfs has always seemed a bit dirty to me.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-08 Thread Badari Pulavarty



Eric W. Biederman wrote:



At this point given that we actually have a small user space dependency
and the fact that after I have reviewed the code it looks harmless to
change the inode number of those inodes, in both cases they are just
anonymous inodes generated with new_inode, and anything that we wrap
is likely to be equally so.

So it looks to me like we need to do three things:
- Fix the inode number


Okay. its already done.



- Fix the name on the hugetlbfs dentry to hold the key

I don't see need for doing this for hugetlbfs inodes. Currently, they 
don't base their
name on key + basing on the key is kind of useless anyway (its not 
unique).




- Add a big fat comment that user space programs depend on this
 behavior of both the dentry name and the inode number.

I don't think, the user-space can depend on the dentry-name. It can only 
depend

on inode# to match shmid. (since key is not unique esp. for key=0x).

BTW, I agree that shmid is not unique even without namespaces as its 
based on

seq# and we wrap seq#.

Thanks,
Badari




-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Eric W. Biederman
"Albert Cahalan" <[EMAIL PROTECTED]> writes:

> On 6/7/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:
>
>> So it looks to me like we need to do three things:
>> - Fix the inode number
>> - Fix the name on the hugetlbfs dentry to hold the key
>> - Add a big fat comment that user space programs depend on this
>>   behavior of both the dentry name and the inode number.
>
> Assuming that this proposed fix goes in:
>
> Since the inode number is the shmid, and this is a number
> that the kernel randomly chooses AFAIK, there should be
> no need to have different shm segments sharing the same
> inode number.

Where we run into inode number confusion is that all of
these shm segments are actually files on a tmpfs filesystem
somewhere, and by making the inode number the shmid we loose
the tmpfs inode number.  So it is possible we get tmpfs inode
number conflicts.  However the inode number is not used for
anything, and the files are not visible in any other way except
as shm segments so it doesn't matter.

There is another case with ipc namespaces where we ultimately need
to support duplicate shmids on the same machine (so migration
is a possibility).  However by and large the user space
processes with duplicate ids should be invisible to each other.

> The situation with the key is a bit more disturbing, though
> we already hit that anyway when IPC_PRIVATE is used.
> (why anybody would NOT use IPC_PRIVATE is a mystery)
> So having the key in the name doesn't make things worse.

Having "SYSV" in the name appears mandatory.  Otherwise you
don't even know it is a shm file. Although I may be confused.

> I have some concern about the device minor number.
> This should be the same for all shm mappings; I do not
> know if the behavior changed.

So I haven't changed anything here.  But I haven't really
looked either.

I don't have a clue if hugetlbfs files use the same device minor
number as tmpfs files.

Hmm.  Thinking about this I have just realized that we may want
to approach this a little differently.  Currently I am reusing
the dentry and inode structure that hugetlbfs and tmpfs return
me, and simply have a distinct struct file for each shm mapping.

There is a little more cost but it may actually make sense to have
a dentry and inode that is specific to shm.c so we can do whatever
we need to without adding requirements to the normal tmpfs or hugtlb
code.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Albert Cahalan

On 6/7/07, Eric W. Biederman <[EMAIL PROTECTED]> wrote:


So it looks to me like we need to do three things:
- Fix the inode number
- Fix the name on the hugetlbfs dentry to hold the key
- Add a big fat comment that user space programs depend on this
  behavior of both the dentry name and the inode number.


Assuming that this proposed fix goes in:

Since the inode number is the shmid, and this is a number
that the kernel randomly chooses AFAIK, there should be
no need to have different shm segments sharing the same
inode number.

The situation with the key is a bit more disturbing, though
we already hit that anyway when IPC_PRIVATE is used.
(why anybody would NOT use IPC_PRIVATE is a mystery)
So having the key in the name doesn't make things worse.

I have some concern about the device minor number.
This should be the same for all shm mappings; I do not
know if the behavior changed.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Eric W. Biederman
"Serge E. Hallyn" <[EMAIL PROTECTED]> writes:

> Ok, so IIUC the problem was that inode->i_ino was being set to the id,
> and the id can be the same for different things in two namespaces.

There is nothing preventing inode number collisions in this code even
without multiple namespaces, and even when it was functioning
correctly.  However as it does not seem possible to find these files
through normal filesystem operations that does not seem to be a problem.

> So aside from not using the id as inode->i_ino, an alternative is to use
> a separate superblock, spearate mqeueue fs, for each ipc ns.
>
> I haven't looked at that enough to see whether it's feasible, i.e. I 
> don't know what else mqueue fs is used for.  Eric, does that sound
> reasonable to you?

At this point given that we actually have a small user space dependency
and the fact that after I have reviewed the code it looks harmless to
change the inode number of those inodes, in both cases they are just
anonymous inodes generated with new_inode, and anything that we wrap
is likely to be equally so.

So it looks to me like we need to do three things:
- Fix the inode number
- Fix the name on the hugetlbfs dentry to hold the key
- Add a big fat comment that user space programs depend on this
  behavior of both the dentry name and the inode number.

So Badari it looks like your original patch plus a little bit is
what we need.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Badari Pulavarty



Serge E. Hallyn wrote:


Quoting Serge E. Hallyn ([EMAIL PROTECTED]):


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:


On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty <[EMAIL PROTECTED]> wrote:


On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:


On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:


BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from "key" or name. Isn't it ?


It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)


Nope. Currently "key" is part of the name (but its not unique).


Changing to "SYSVID%d" is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
"LET'S BREAK AN ABI!" in the subject of your email?


I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of "key" which is currently not unique).


BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)

If you strongly feel that "old" behaviour needs to be retained, 


yup, we should put it back.  The change was, afaik, accidental.


here is the patch I originally suggested.


Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?


Yes. Albert, please correct me if I am wrong.


It will, but could lead to two different inodes with the same i_ino,
right?


Only if we generate same ID in two different namespaces. Is it currently
possible ? 


Should be nothing stopping it.



(just to be more certain, a quick test showed I can get id 0 for
different keys, and different ids for the same key 0xff, in different
ipc namespaces)


Funny. I played with it and decided that it can happen :)

Thanks,
Badari



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Serge E. Hallyn ([EMAIL PROTECTED]):
> Quoting Badari Pulavarty ([EMAIL PROTECTED]):
> > On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
> > > Quoting Badari Pulavarty ([EMAIL PROTECTED]):
> > > > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > > > Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > > > 
> > > > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > > > On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > > > > > 
> > > > > > > > BTW, I agree with Eric that its would be nice to use shmid as 
> > > > > > > > part
> > > > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > > > possible for pmap to workout shmid from "key" or name. Isn't it 
> > > > > > > > ?
> > > > > > > 
> > > > > > > It is not at all nice.
> > > > > > > 
> > > > > > > 1. it's incompatible ABI breakage
> > > > > > > 2. where will you put the key then, in the inode? :-)
> > > > > > 
> > > > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > > > 
> > > > > > > 
> > > > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > > > is not some random sandbox to be playing in.
> > > > > > > 
> > > > > > > Before you go messing with it, note that the device number
> > > > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > > > That's how one knows that /SYSV is not just
> > > > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > > > (and no you can't fix that now; it's way too late)
> > > > > > > 
> > > > > > > Next time you feel like breaking an ABI, mind putting
> > > > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > > > 
> > > > > > I am not breaking ABI. Its already broken in the current
> > > > > > mainline. I am trying to fix it by putting back the ino#
> > > > > > as shmid. Eric had a suggestion that, instead of depending
> > > > > > on the inode# to be shmid, we could embed shmid into name
> > > > > > (instead of "key" which is currently not unique).
> > > > > > 
> > > > > > > BTW, I suspect this kind of thing also breaks:
> > > > > > > a. fuser, lsof, and other resource usage display tools
> > > > > > > b. various obscure emulators (similar to valgrind)
> > > > > > 
> > > > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > > > 
> > > > > yup, we should put it back.  The change was, afaik, accidental.
> > > > > 
> > > > > > here is the patch I originally suggested.
> > > > > 
> > > > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > > > Albert refers?
> > > > 
> > > > Yes. Albert, please correct me if I am wrong.
> > > 
> > > It will, but could lead to two different inodes with the same i_ino,
> > > right?
> > 
> > Only if we generate same ID in two different namespaces. Is it currently
> > possible ? 
> 
> Should be nothing stopping it.

(just to be more certain, a quick test showed I can get id 0 for
different keys, and different ids for the same key 0xff, in different
ipc namespaces)

-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Badari Pulavarty



Serge E. Hallyn wrote:


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:


On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty <[EMAIL PROTECTED]> wrote:


On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:


On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:


BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from "key" or name. Isn't it ?


It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)


Nope. Currently "key" is part of the name (but its not unique).


Changing to "SYSVID%d" is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
"LET'S BREAK AN ABI!" in the subject of your email?


I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of "key" which is currently not unique).


BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)

If you strongly feel that "old" behaviour needs to be retained, 


yup, we should put it back.  The change was, afaik, accidental.


here is the patch I originally suggested.


Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?


Yes. Albert, please correct me if I am wrong.


It will, but could lead to two different inodes with the same i_ino,
right?


Only if we generate same ID in two different namespaces. Is it currently
possible ? 



Should be nothing stopping it.

But like I say we never find the inode based on i_ino, and don't hash
the inode, so it might be ok.

Correct. We might end up with same shmid - which mean same inode# shows 
up in /proc/pid/maps.
If we don't unshare pid namespace or look from parent namespace - we 
will end up seeing same
shmid/inode# in different /proc/pid/maps, even though they are 
different. But I guess its okay..


Thanks,
Badari



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Serge E. Hallyn ([EMAIL PROTECTED]):
> Quoting Badari Pulavarty ([EMAIL PROTECTED]):
> > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > > > 
> > > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > > 
> > > > > It is not at all nice.
> > > > > 
> > > > > 1. it's incompatible ABI breakage
> > > > > 2. where will you put the key then, in the inode? :-)
> > > > 
> > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > 
> > > > > 
> > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > is not some random sandbox to be playing in.
> > > > > 
> > > > > Before you go messing with it, note that the device number
> > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > That's how one knows that /SYSV is not just
> > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > (and no you can't fix that now; it's way too late)
> > > > > 
> > > > > Next time you feel like breaking an ABI, mind putting
> > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > 
> > > > I am not breaking ABI. Its already broken in the current
> > > > mainline. I am trying to fix it by putting back the ino#
> > > > as shmid. Eric had a suggestion that, instead of depending
> > > > on the inode# to be shmid, we could embed shmid into name
> > > > (instead of "key" which is currently not unique).
> > > > 
> > > > > BTW, I suspect this kind of thing also breaks:
> > > > > a. fuser, lsof, and other resource usage display tools
> > > > > b. various obscure emulators (similar to valgrind)
> > > > 
> > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > 
> > > yup, we should put it back.  The change was, afaik, accidental.
> > > 
> > > > here is the patch I originally suggested.
> > > 
> > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > Albert refers?
> > 
> > Yes. Albert, please correct me if I am wrong.
> 
> It will, but could lead to two different inodes with the same i_ino,
> right?

Well I guess it's not *technically* a problem since these inodes are
never hashed.

-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Badari Pulavarty ([EMAIL PROTECTED]):
> On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
> > Quoting Badari Pulavarty ([EMAIL PROTECTED]):
> > > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > > Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > > On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > > > > 
> > > > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > > > 
> > > > > > It is not at all nice.
> > > > > > 
> > > > > > 1. it's incompatible ABI breakage
> > > > > > 2. where will you put the key then, in the inode? :-)
> > > > > 
> > > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > > 
> > > > > > 
> > > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > > is not some random sandbox to be playing in.
> > > > > > 
> > > > > > Before you go messing with it, note that the device number
> > > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > > That's how one knows that /SYSV is not just
> > > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > > (and no you can't fix that now; it's way too late)
> > > > > > 
> > > > > > Next time you feel like breaking an ABI, mind putting
> > > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > > 
> > > > > I am not breaking ABI. Its already broken in the current
> > > > > mainline. I am trying to fix it by putting back the ino#
> > > > > as shmid. Eric had a suggestion that, instead of depending
> > > > > on the inode# to be shmid, we could embed shmid into name
> > > > > (instead of "key" which is currently not unique).
> > > > > 
> > > > > > BTW, I suspect this kind of thing also breaks:
> > > > > > a. fuser, lsof, and other resource usage display tools
> > > > > > b. various obscure emulators (similar to valgrind)
> > > > > 
> > > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > > 
> > > > yup, we should put it back.  The change was, afaik, accidental.
> > > > 
> > > > > here is the patch I originally suggested.
> > > > 
> > > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > > Albert refers?
> > > 
> > > Yes. Albert, please correct me if I am wrong.
> > 
> > It will, but could lead to two different inodes with the same i_ino,
> > right?
> 
> Only if we generate same ID in two different namespaces. Is it currently
> possible ? 

Should be nothing stopping it.

But like I say we never find the inode based on i_ino, and don't hash
the inode, so it might be ok.

-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
> Quoting Badari Pulavarty ([EMAIL PROTECTED]):
> > On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > > On Thu, 07 Jun 2007 10:06:37 -0700
> > > Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > 
> > > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > > On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > > > 
> > > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > > of name instead of forcing to be as inode number. It should be
> > > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > > 
> > > > > It is not at all nice.
> > > > > 
> > > > > 1. it's incompatible ABI breakage
> > > > > 2. where will you put the key then, in the inode? :-)
> > > > 
> > > > Nope. Currently "key" is part of the name (but its not unique).
> > > > 
> > > > > 
> > > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > > is not some random sandbox to be playing in.
> > > > > 
> > > > > Before you go messing with it, note that the device number
> > > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > > That's how one knows that /SYSV is not just
> > > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > > (and no you can't fix that now; it's way too late)
> > > > > 
> > > > > Next time you feel like breaking an ABI, mind putting
> > > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > > 
> > > > I am not breaking ABI. Its already broken in the current
> > > > mainline. I am trying to fix it by putting back the ino#
> > > > as shmid. Eric had a suggestion that, instead of depending
> > > > on the inode# to be shmid, we could embed shmid into name
> > > > (instead of "key" which is currently not unique).
> > > > 
> > > > > BTW, I suspect this kind of thing also breaks:
> > > > > a. fuser, lsof, and other resource usage display tools
> > > > > b. various obscure emulators (similar to valgrind)
> > > > 
> > > > If you strongly feel that "old" behaviour needs to be retained, 
> > > 
> > > yup, we should put it back.  The change was, afaik, accidental.
> > > 
> > > > here is the patch I originally suggested.
> > > 
> > > Confused.  Will this one-liner fix all the userspace breakage to which
> > > Albert refers?
> > 
> > Yes. Albert, please correct me if I am wrong.
> 
> It will, but could lead to two different inodes with the same i_ino,
> right?

Only if we generate same ID in two different namespaces. Is it currently
possible ? 

Thanks,
Badari


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Badari Pulavarty ([EMAIL PROTECTED]):
> On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> > On Thu, 07 Jun 2007 10:06:37 -0700
> > Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > 
> > > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > > On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > > 
> > > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > > of name instead of forcing to be as inode number. It should be
> > > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > > 
> > > > It is not at all nice.
> > > > 
> > > > 1. it's incompatible ABI breakage
> > > > 2. where will you put the key then, in the inode? :-)
> > > 
> > > Nope. Currently "key" is part of the name (but its not unique).
> > > 
> > > > 
> > > > Changing to "SYSVID%d" is no good either. Look, people
> > > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > > is not some random sandbox to be playing in.
> > > > 
> > > > Before you go messing with it, note that the device number
> > > > also matters. (it's per-boot dynamic, but that's OK)
> > > > That's how one knows that /SYSV is not just
> > > > a regular file; sadly these didn't get a non-/ prefix.
> > > > (and no you can't fix that now; it's way too late)
> > > > 
> > > > Next time you feel like breaking an ABI, mind putting
> > > > "LET'S BREAK AN ABI!" in the subject of your email?
> > > 
> > > I am not breaking ABI. Its already broken in the current
> > > mainline. I am trying to fix it by putting back the ino#
> > > as shmid. Eric had a suggestion that, instead of depending
> > > on the inode# to be shmid, we could embed shmid into name
> > > (instead of "key" which is currently not unique).
> > > 
> > > > BTW, I suspect this kind of thing also breaks:
> > > > a. fuser, lsof, and other resource usage display tools
> > > > b. various obscure emulators (similar to valgrind)
> > > 
> > > If you strongly feel that "old" behaviour needs to be retained, 
> > 
> > yup, we should put it back.  The change was, afaik, accidental.
> > 
> > > here is the patch I originally suggested.
> > 
> > Confused.  Will this one-liner fix all the userspace breakage to which
> > Albert refers?
> 
> Yes. Albert, please correct me if I am wrong.

It will, but could lead to two different inodes with the same i_ino,
right?

thanks,
-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
> On Thu, 07 Jun 2007 10:06:37 -0700
> Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> 
> > On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > > On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > > 
> > > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > > of name instead of forcing to be as inode number. It should be
> > > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > > 
> > > It is not at all nice.
> > > 
> > > 1. it's incompatible ABI breakage
> > > 2. where will you put the key then, in the inode? :-)
> > 
> > Nope. Currently "key" is part of the name (but its not unique).
> > 
> > > 
> > > Changing to "SYSVID%d" is no good either. Look, people
> > > are ***parsing*** this stuff in /proc. The /proc filesystem
> > > is not some random sandbox to be playing in.
> > > 
> > > Before you go messing with it, note that the device number
> > > also matters. (it's per-boot dynamic, but that's OK)
> > > That's how one knows that /SYSV is not just
> > > a regular file; sadly these didn't get a non-/ prefix.
> > > (and no you can't fix that now; it's way too late)
> > > 
> > > Next time you feel like breaking an ABI, mind putting
> > > "LET'S BREAK AN ABI!" in the subject of your email?
> > 
> > I am not breaking ABI. Its already broken in the current
> > mainline. I am trying to fix it by putting back the ino#
> > as shmid. Eric had a suggestion that, instead of depending
> > on the inode# to be shmid, we could embed shmid into name
> > (instead of "key" which is currently not unique).
> > 
> > > BTW, I suspect this kind of thing also breaks:
> > > a. fuser, lsof, and other resource usage display tools
> > > b. various obscure emulators (similar to valgrind)
> > 
> > If you strongly feel that "old" behaviour needs to be retained, 
> 
> yup, we should put it back.  The change was, afaik, accidental.
> 
> > here is the patch I originally suggested.
> 
> Confused.  Will this one-liner fix all the userspace breakage to which
> Albert refers?

Yes. Albert, please correct me if I am wrong.

Thanks,
Badari


> > "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> > memory (shmid). It was useful in debugging, but its changed recently. 
> > This patch sets inode number to shared memory id to match /proc/pid/maps.
> > 
> > Signed-off-by: Badari Pulavarty <[EMAIL PROTECTED]>
> > 
> > Index: linux-2.6.22-rc4/ipc/shm.c
> > ===
> > --- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
> > +++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 08:23:57.0 -0700
> > @@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
> > shp->shm_nattch = 0;
> > shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
> > shp->shm_file = file;
> > +   file->f_dentry->d_inode->i_ino = shp->id;
> >  
> > ns->shm_tot += numpages;
> > shm_unlock(shp);
> > 
> > 

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Andrew Morton
On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty <[EMAIL PROTECTED]> wrote:

> On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> > On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> > 
> > > BTW, I agree with Eric that its would be nice to use shmid as part
> > > of name instead of forcing to be as inode number. It should be
> > > possible for pmap to workout shmid from "key" or name. Isn't it ?
> > 
> > It is not at all nice.
> > 
> > 1. it's incompatible ABI breakage
> > 2. where will you put the key then, in the inode? :-)
> 
> Nope. Currently "key" is part of the name (but its not unique).
> 
> > 
> > Changing to "SYSVID%d" is no good either. Look, people
> > are ***parsing*** this stuff in /proc. The /proc filesystem
> > is not some random sandbox to be playing in.
> > 
> > Before you go messing with it, note that the device number
> > also matters. (it's per-boot dynamic, but that's OK)
> > That's how one knows that /SYSV is not just
> > a regular file; sadly these didn't get a non-/ prefix.
> > (and no you can't fix that now; it's way too late)
> > 
> > Next time you feel like breaking an ABI, mind putting
> > "LET'S BREAK AN ABI!" in the subject of your email?
> 
> I am not breaking ABI. Its already broken in the current
> mainline. I am trying to fix it by putting back the ino#
> as shmid. Eric had a suggestion that, instead of depending
> on the inode# to be shmid, we could embed shmid into name
> (instead of "key" which is currently not unique).
> 
> > BTW, I suspect this kind of thing also breaks:
> > a. fuser, lsof, and other resource usage display tools
> > b. various obscure emulators (similar to valgrind)
> 
> If you strongly feel that "old" behaviour needs to be retained, 

yup, we should put it back.  The change was, afaik, accidental.

> here is the patch I originally suggested.

Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?

> Thanks,
> Badari
> 
> "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> memory (shmid). It was useful in debugging, but its changed recently. 
> This patch sets inode number to shared memory id to match /proc/pid/maps.
> 
> Signed-off-by: Badari Pulavarty <[EMAIL PROTECTED]>
> 
> Index: linux-2.6.22-rc4/ipc/shm.c
> ===
> --- linux-2.6.22-rc4.orig/ipc/shm.c   2007-06-04 17:57:25.0 -0700
> +++ linux-2.6.22-rc4/ipc/shm.c2007-06-06 08:23:57.0 -0700
> @@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
>   shp->shm_nattch = 0;
>   shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
>   shp->shm_file = file;
> + file->f_dentry->d_inode->i_ino = shp->id;
>  
>   ns->shm_tot += numpages;
>   shm_unlock(shp);
> 
> 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Albert Cahalan ([EMAIL PROTECTED]):
> On 6/6/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> >On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <[EMAIL PROTECTED]> 
> >wrote:
> >> Eric W. Biederman writes:
> >> > Badari Pulavarty <[EMAIL PROTECTED]> writes:
> >>
> >> >> Your recent cleanup to shm code, namely
> >> >>
> >> >> [PATCH] shm: make sysv ipc shared memory use stacked files
> >> >>
> >> >> took away one of the debugging feature for shm segments.
> >> >> Originally, shmid were forced to be the inode numbers and
> >> >> they show up in /proc/pid/maps for the process which mapped
> >> >> this shared memory segments (vma listing). That way, its easy
> >> >> to find out who all mapped this shared memory segment. Your
> >> >> patchset, took away the inode# setting. So, we can't easily
> >> >> match the shmem segments to /proc/pid/maps easily. (It was
> >> >> really useful in tracking down a customer problem recently).
> >> >> Is this done deliberately ? Anything wrong in setting this back ?
> >> >
> >> > Theoretically it makes the stacked file concept more brittle,
> >> > because it means the lower layers can't care about their inode
> >> > number.
> >> >
> >> > We do need something to tie these things together.
> >> >
> >> > So I suspect what makes most sense is to simply rename the
> >> > dentry SYSVID
> >>
> >> Please stop breaking things in /proc. The pmap command relys
> >> on the old behavior.
> >
> >What effect did this change have upon the pmap command?  Details, please.
> >
> >> It's time to revert.
> >
> >Probably true, but we'd need to understand what the impact was.
> 
> Very simply, pmap reports the shmid.
> 
> albert 0 ~$ pmap `pidof X` | egrep -2 shmid
> 3005  16384K rw-s-  /dev/fb0
> 3105152K rw---[ anon ]
> 31076000384K rw-s-[ shmid=0x3f428000 ]
> 310d6000384K rw-s-[ shmid=0x3f430001 ]
> 31136000384K rw-s-[ shmid=0x3f438002 ]
> 31196000384K rw-s-[ shmid=0x3f440003 ]
> 311f6000384K rw-s-[ shmid=0x3f448004 ]
> 31256000384K rw-s-[ shmid=0x3f450005 ]
> 312b6000384K rw-s-[ shmid=0x3f460006 ]
> 31316000384K rw-s-[ shmid=0x3f870007 ]
> 31491000140K r  /usr/share/fonts/type1/gsfonts/n021003l.pfb
> 3150e000   9496K rw---[ anon ]

Ok, so IIUC the problem was that inode->i_ino was being set to the id,
and the id can be the same for different things in two namespaces.

So aside from not using the id as inode->i_ino, an alternative is to use
a separate superblock, spearate mqeueue fs, for each ipc ns.

I haven't looked at that enough to see whether it's feasible, i.e. I 
don't know what else mqueue fs is used for.  Eric, does that sound
reasonable to you?

thanks,
-serge
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
> On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:
> 
> > BTW, I agree with Eric that its would be nice to use shmid as part
> > of name instead of forcing to be as inode number. It should be
> > possible for pmap to workout shmid from "key" or name. Isn't it ?
> 
> It is not at all nice.
> 
> 1. it's incompatible ABI breakage
> 2. where will you put the key then, in the inode? :-)

Nope. Currently "key" is part of the name (but its not unique).

> 
> Changing to "SYSVID%d" is no good either. Look, people
> are ***parsing*** this stuff in /proc. The /proc filesystem
> is not some random sandbox to be playing in.
> 
> Before you go messing with it, note that the device number
> also matters. (it's per-boot dynamic, but that's OK)
> That's how one knows that /SYSV is not just
> a regular file; sadly these didn't get a non-/ prefix.
> (and no you can't fix that now; it's way too late)
> 
> Next time you feel like breaking an ABI, mind putting
> "LET'S BREAK AN ABI!" in the subject of your email?

I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of "key" which is currently not unique).

> BTW, I suspect this kind of thing also breaks:
> a. fuser, lsof, and other resource usage display tools
> b. various obscure emulators (similar to valgrind)

If you strongly feel that "old" behaviour needs to be retained, 
here is the patch I originally suggested.

Thanks,
Badari

"ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
memory (shmid). It was useful in debugging, but its changed recently. 
This patch sets inode number to shared memory id to match /proc/pid/maps.

Signed-off-by: Badari Pulavarty <[EMAIL PROTECTED]>

Index: linux-2.6.22-rc4/ipc/shm.c
===
--- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
+++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 08:23:57.0 -0700
@@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
shp->shm_nattch = 0;
shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
shp->shm_file = file;
+   file->f_dentry->d_inode->i_ino = shp->id;
 
ns->shm_tot += numpages;
shm_unlock(shp);



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Albert Cahalan

On 6/7/07, Badari Pulavarty <[EMAIL PROTECTED]> wrote:


BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from "key" or name. Isn't it ?


It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)

Changing to "SYSVID%d" is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
"LET'S BREAK AN ABI!" in the subject of your email?

BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 00:53 -0400, Albert Cahalan wrote:
> On 6/6/07, Andrew Morton <[EMAIL PROTECTED]> wrote:
> > On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <[EMAIL PROTECTED]> 
> > wrote:
> > > Eric W. Biederman writes:
> > > > Badari Pulavarty <[EMAIL PROTECTED]> writes:
> > >
> > > >> Your recent cleanup to shm code, namely
> > > >>
> > > >> [PATCH] shm: make sysv ipc shared memory use stacked files
> > > >>
> > > >> took away one of the debugging feature for shm segments.
> > > >> Originally, shmid were forced to be the inode numbers and
> > > >> they show up in /proc/pid/maps for the process which mapped
> > > >> this shared memory segments (vma listing). That way, its easy
> > > >> to find out who all mapped this shared memory segment. Your
> > > >> patchset, took away the inode# setting. So, we can't easily
> > > >> match the shmem segments to /proc/pid/maps easily. (It was
> > > >> really useful in tracking down a customer problem recently).
> > > >> Is this done deliberately ? Anything wrong in setting this back ?
> > > >
> > > > Theoretically it makes the stacked file concept more brittle,
> > > > because it means the lower layers can't care about their inode
> > > > number.
> > > >
> > > > We do need something to tie these things together.
> > > >
> > > > So I suspect what makes most sense is to simply rename the
> > > > dentry SYSVID
> > >
> > > Please stop breaking things in /proc. The pmap command relys
> > > on the old behavior.
> >
> > What effect did this change have upon the pmap command?  Details, please.
> >
> > > It's time to revert.
> >
> > Probably true, but we'd need to understand what the impact was.
> 
> Very simply, pmap reports the shmid.
> 
> albert 0 ~$ pmap `pidof X` | egrep -2 shmid
> 3005  16384K rw-s-  /dev/fb0
> 3105152K rw---[ anon ]
> 31076000384K rw-s-[ shmid=0x3f428000 ]
> 310d6000384K rw-s-[ shmid=0x3f430001 ]
> 31136000384K rw-s-[ shmid=0x3f438002 ]
> 31196000384K rw-s-[ shmid=0x3f440003 ]
> 311f6000384K rw-s-[ shmid=0x3f448004 ]
> 31256000384K rw-s-[ shmid=0x3f450005 ]
> 312b6000384K rw-s-[ shmid=0x3f460006 ]
> 31316000384K rw-s-[ shmid=0x3f870007 ]
> 31491000140K r  /usr/share/fonts/type1/gsfonts/n021003l.pfb
> 3150e000   9496K rw---[ anon ]

pmap seems to get shmid from "ino#" field of /proc/pid/map.
Its already broken in current mainline.

But, the breakage is not due to namespaces or container effort :(
Its due to noble effort from Eric to clean up the shm code,
take out the hacks to handle hugetlbfs and make the code
more streamlined and readable.

If we really really want old behaviour, we need my one line
patch to force shmid as inode# :(

BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from "key" or name. Isn't it ?

Andrew/Linus, its up to you to figure out if its worth breaking.
Here is the patch to base dentry-name on shmid - so we don't
need to use ino# to identify shmid.

Thanks,
Badari

Instead of basing dentry name on the shm "key", base it on
"shmid" - so it shows up clearly in /proc/pid/maps. Earlier
we were forcing ino# to match shmid.

Signed-off-by: Badari Pulavarty <[EMAIL PROTECTED]>
Index: linux-2.6.22-rc4/ipc/shm.c
===
--- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
+++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 13:43:36.0 -0700
@@ -364,6 +364,14 @@ static int newseg (struct ipc_namespace 
return error;
}
 
+   error = -ENOSPC;
+   id = shm_addid(ns, shp);
+   if(id == -1)
+   goto no_id;
+
+   /* Build an id, so we can use it for filename */
+   shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
+
if (shmflg & SHM_HUGETLB) {
/* hugetlb_zero_setup takes care of mlock user accounting */
file = hugetlb_zero_setup(size);
@@ -377,34 +385,28 @@ static int newseg (struct ipc_namespace 
if  ((shmflg & SHM_NORESERVE) &&
sysctl_overcommit_memory != OVERCOMMIT_NEVER)
acctflag = 0;
-   sprintf (name, "SYSV%08x", key);
+   sprintf (name, "SYSVID%d", shp->id);
file = shmem_file_setup(name, size, acctflag);
}
error = PTR_ERR(file);
if (IS_ERR(file))
goto no_file;
 
-   error = -ENOSPC;
-   id = shm_addid(ns, shp);
-   if(id == -1) 
-   goto no_id;
-
shp->shm_cprid = current->tgid;
shp->shm_lprid = 0;
shp->shm_atim = shp->shm_dtim = 0;
shp->shm_ctim = get_seconds();
shp->shm_segsz = size;
shp->shm_nattch = 0;
-   shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
shp->shm_file = file;
 
ns->shm_tot += numpages;
 

Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Badari Pulavarty



Serge E. Hallyn wrote:


Quoting Serge E. Hallyn ([EMAIL PROTECTED]):


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:


On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty [EMAIL PROTECTED] wrote:


On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:


On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:


BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from key or name. Isn't it ?


It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)


Nope. Currently key is part of the name (but its not unique).


Changing to SYSVID%d is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
LET'S BREAK AN ABI! in the subject of your email?


I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of key which is currently not unique).


BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)

If you strongly feel that old behaviour needs to be retained, 


yup, we should put it back.  The change was, afaik, accidental.


here is the patch I originally suggested.


Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?


Yes. Albert, please correct me if I am wrong.


It will, but could lead to two different inodes with the same i_ino,
right?


Only if we generate same ID in two different namespaces. Is it currently
possible ? 


Should be nothing stopping it.



(just to be more certain, a quick test showed I can get id 0 for
different keys, and different ids for the same key 0xff, in different
ipc namespaces)


Funny. I played with it and decided that it can happen :)

Thanks,
Badari



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Eric W. Biederman
Serge E. Hallyn [EMAIL PROTECTED] writes:

 Ok, so IIUC the problem was that inode-i_ino was being set to the id,
 and the id can be the same for different things in two namespaces.

There is nothing preventing inode number collisions in this code even
without multiple namespaces, and even when it was functioning
correctly.  However as it does not seem possible to find these files
through normal filesystem operations that does not seem to be a problem.

 So aside from not using the id as inode-i_ino, an alternative is to use
 a separate superblock, spearate mqeueue fs, for each ipc ns.

 I haven't looked at that enough to see whether it's feasible, i.e. I 
 don't know what else mqueue fs is used for.  Eric, does that sound
 reasonable to you?

At this point given that we actually have a small user space dependency
and the fact that after I have reviewed the code it looks harmless to
change the inode number of those inodes, in both cases they are just
anonymous inodes generated with new_inode, and anything that we wrap
is likely to be equally so.

So it looks to me like we need to do three things:
- Fix the inode number
- Fix the name on the hugetlbfs dentry to hold the key
- Add a big fat comment that user space programs depend on this
  behavior of both the dentry name and the inode number.

So Badari it looks like your original patch plus a little bit is
what we need.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Albert Cahalan

On 6/7/07, Eric W. Biederman [EMAIL PROTECTED] wrote:


So it looks to me like we need to do three things:
- Fix the inode number
- Fix the name on the hugetlbfs dentry to hold the key
- Add a big fat comment that user space programs depend on this
  behavior of both the dentry name and the inode number.


Assuming that this proposed fix goes in:

Since the inode number is the shmid, and this is a number
that the kernel randomly chooses AFAIK, there should be
no need to have different shm segments sharing the same
inode number.

The situation with the key is a bit more disturbing, though
we already hit that anyway when IPC_PRIVATE is used.
(why anybody would NOT use IPC_PRIVATE is a mystery)
So having the key in the name doesn't make things worse.

I have some concern about the device minor number.
This should be the same for all shm mappings; I do not
know if the behavior changed.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Eric W. Biederman
Albert Cahalan [EMAIL PROTECTED] writes:

 On 6/7/07, Eric W. Biederman [EMAIL PROTECTED] wrote:

 So it looks to me like we need to do three things:
 - Fix the inode number
 - Fix the name on the hugetlbfs dentry to hold the key
 - Add a big fat comment that user space programs depend on this
   behavior of both the dentry name and the inode number.

 Assuming that this proposed fix goes in:

 Since the inode number is the shmid, and this is a number
 that the kernel randomly chooses AFAIK, there should be
 no need to have different shm segments sharing the same
 inode number.

Where we run into inode number confusion is that all of
these shm segments are actually files on a tmpfs filesystem
somewhere, and by making the inode number the shmid we loose
the tmpfs inode number.  So it is possible we get tmpfs inode
number conflicts.  However the inode number is not used for
anything, and the files are not visible in any other way except
as shm segments so it doesn't matter.

There is another case with ipc namespaces where we ultimately need
to support duplicate shmids on the same machine (so migration
is a possibility).  However by and large the user space
processes with duplicate ids should be invisible to each other.

 The situation with the key is a bit more disturbing, though
 we already hit that anyway when IPC_PRIVATE is used.
 (why anybody would NOT use IPC_PRIVATE is a mystery)
 So having the key in the name doesn't make things worse.

Having SYSV in the name appears mandatory.  Otherwise you
don't even know it is a shm file. Although I may be confused.

 I have some concern about the device minor number.
 This should be the same for all shm mappings; I do not
 know if the behavior changed.

So I haven't changed anything here.  But I haven't really
looked either.

I don't have a clue if hugetlbfs files use the same device minor
number as tmpfs files.

Hmm.  Thinking about this I have just realized that we may want
to approach this a little differently.  Currently I am reusing
the dentry and inode structure that hugetlbfs and tmpfs return
me, and simply have a distinct struct file for each shm mapping.

There is a little more cost but it may actually make sense to have
a dentry and inode that is specific to shm.c so we can do whatever
we need to without adding requirements to the normal tmpfs or hugtlb
code.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 00:53 -0400, Albert Cahalan wrote:
 On 6/6/07, Andrew Morton [EMAIL PROTECTED] wrote:
  On Wed, 6 Jun 2007 23:27:01 -0400 Albert Cahalan [EMAIL PROTECTED] 
  wrote:
   Eric W. Biederman writes:
Badari Pulavarty [EMAIL PROTECTED] writes:
  
Your recent cleanup to shm code, namely
   
[PATCH] shm: make sysv ipc shared memory use stacked files
   
took away one of the debugging feature for shm segments.
Originally, shmid were forced to be the inode numbers and
they show up in /proc/pid/maps for the process which mapped
this shared memory segments (vma listing). That way, its easy
to find out who all mapped this shared memory segment. Your
patchset, took away the inode# setting. So, we can't easily
match the shmem segments to /proc/pid/maps easily. (It was
really useful in tracking down a customer problem recently).
Is this done deliberately ? Anything wrong in setting this back ?
   
Theoretically it makes the stacked file concept more brittle,
because it means the lower layers can't care about their inode
number.
   
We do need something to tie these things together.
   
So I suspect what makes most sense is to simply rename the
dentry SYSVIDsegmentid
  
   Please stop breaking things in /proc. The pmap command relys
   on the old behavior.
 
  What effect did this change have upon the pmap command?  Details, please.
 
   It's time to revert.
 
  Probably true, but we'd need to understand what the impact was.
 
 Very simply, pmap reports the shmid.
 
 albert 0 ~$ pmap `pidof X` | egrep -2 shmid
 3005  16384K rw-s-  /dev/fb0
 3105152K rw---[ anon ]
 31076000384K rw-s-[ shmid=0x3f428000 ]
 310d6000384K rw-s-[ shmid=0x3f430001 ]
 31136000384K rw-s-[ shmid=0x3f438002 ]
 31196000384K rw-s-[ shmid=0x3f440003 ]
 311f6000384K rw-s-[ shmid=0x3f448004 ]
 31256000384K rw-s-[ shmid=0x3f450005 ]
 312b6000384K rw-s-[ shmid=0x3f460006 ]
 31316000384K rw-s-[ shmid=0x3f870007 ]
 31491000140K r  /usr/share/fonts/type1/gsfonts/n021003l.pfb
 3150e000   9496K rw---[ anon ]

pmap seems to get shmid from ino# field of /proc/pid/map.
Its already broken in current mainline.

But, the breakage is not due to namespaces or container effort :(
Its due to noble effort from Eric to clean up the shm code,
take out the hacks to handle hugetlbfs and make the code
more streamlined and readable.

If we really really want old behaviour, we need my one line
patch to force shmid as inode# :(

BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from key or name. Isn't it ?

Andrew/Linus, its up to you to figure out if its worth breaking.
Here is the patch to base dentry-name on shmid - so we don't
need to use ino# to identify shmid.

Thanks,
Badari

Instead of basing dentry name on the shm key, base it on
shmid - so it shows up clearly in /proc/pid/maps. Earlier
we were forcing ino# to match shmid.

Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]
Index: linux-2.6.22-rc4/ipc/shm.c
===
--- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
+++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 13:43:36.0 -0700
@@ -364,6 +364,14 @@ static int newseg (struct ipc_namespace 
return error;
}
 
+   error = -ENOSPC;
+   id = shm_addid(ns, shp);
+   if(id == -1)
+   goto no_id;
+
+   /* Build an id, so we can use it for filename */
+   shp-id = shm_buildid(ns, id, shp-shm_perm.seq);
+
if (shmflg  SHM_HUGETLB) {
/* hugetlb_zero_setup takes care of mlock user accounting */
file = hugetlb_zero_setup(size);
@@ -377,34 +385,28 @@ static int newseg (struct ipc_namespace 
if  ((shmflg  SHM_NORESERVE) 
sysctl_overcommit_memory != OVERCOMMIT_NEVER)
acctflag = 0;
-   sprintf (name, SYSV%08x, key);
+   sprintf (name, SYSVID%d, shp-id);
file = shmem_file_setup(name, size, acctflag);
}
error = PTR_ERR(file);
if (IS_ERR(file))
goto no_file;
 
-   error = -ENOSPC;
-   id = shm_addid(ns, shp);
-   if(id == -1) 
-   goto no_id;
-
shp-shm_cprid = current-tgid;
shp-shm_lprid = 0;
shp-shm_atim = shp-shm_dtim = 0;
shp-shm_ctim = get_seconds();
shp-shm_segsz = size;
shp-shm_nattch = 0;
-   shp-id = shm_buildid(ns, id, shp-shm_perm.seq);
shp-shm_file = file;
 
ns-shm_tot += numpages;
shm_unlock(shp);
return shp-id;
 
-no_id:
-   fput(file);
 no_file:
+   shm_rmid(ns, shp-id);
+no_id:
security_shm_free(shp);
ipc_rcu_putref(shp);

Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Albert Cahalan

On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:


BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from key or name. Isn't it ?


It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)

Changing to SYSVID%d is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
LET'S BREAK AN ABI! in the subject of your email?

BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
 On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:
 
  BTW, I agree with Eric that its would be nice to use shmid as part
  of name instead of forcing to be as inode number. It should be
  possible for pmap to workout shmid from key or name. Isn't it ?
 
 It is not at all nice.
 
 1. it's incompatible ABI breakage
 2. where will you put the key then, in the inode? :-)

Nope. Currently key is part of the name (but its not unique).

 
 Changing to SYSVID%d is no good either. Look, people
 are ***parsing*** this stuff in /proc. The /proc filesystem
 is not some random sandbox to be playing in.
 
 Before you go messing with it, note that the device number
 also matters. (it's per-boot dynamic, but that's OK)
 That's how one knows that /SYSV is not just
 a regular file; sadly these didn't get a non-/ prefix.
 (and no you can't fix that now; it's way too late)
 
 Next time you feel like breaking an ABI, mind putting
 LET'S BREAK AN ABI! in the subject of your email?

I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of key which is currently not unique).

 BTW, I suspect this kind of thing also breaks:
 a. fuser, lsof, and other resource usage display tools
 b. various obscure emulators (similar to valgrind)

If you strongly feel that old behaviour needs to be retained, 
here is the patch I originally suggested.

Thanks,
Badari

ino# in /proc/pid/maps used to match ipcs -m output for shared 
memory (shmid). It was useful in debugging, but its changed recently. 
This patch sets inode number to shared memory id to match /proc/pid/maps.

Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]

Index: linux-2.6.22-rc4/ipc/shm.c
===
--- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
+++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 08:23:57.0 -0700
@@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
shp-shm_nattch = 0;
shp-id = shm_buildid(ns, id, shp-shm_perm.seq);
shp-shm_file = file;
+   file-f_dentry-d_inode-i_ino = shp-id;
 
ns-shm_tot += numpages;
shm_unlock(shp);



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Albert Cahalan ([EMAIL PROTECTED]):
 On 6/6/07, Andrew Morton [EMAIL PROTECTED] wrote:
 On Wed, 6 Jun 2007 23:27:01 -0400 Albert Cahalan [EMAIL PROTECTED] 
 wrote:
  Eric W. Biederman writes:
   Badari Pulavarty [EMAIL PROTECTED] writes:
 
   Your recent cleanup to shm code, namely
  
   [PATCH] shm: make sysv ipc shared memory use stacked files
  
   took away one of the debugging feature for shm segments.
   Originally, shmid were forced to be the inode numbers and
   they show up in /proc/pid/maps for the process which mapped
   this shared memory segments (vma listing). That way, its easy
   to find out who all mapped this shared memory segment. Your
   patchset, took away the inode# setting. So, we can't easily
   match the shmem segments to /proc/pid/maps easily. (It was
   really useful in tracking down a customer problem recently).
   Is this done deliberately ? Anything wrong in setting this back ?
  
   Theoretically it makes the stacked file concept more brittle,
   because it means the lower layers can't care about their inode
   number.
  
   We do need something to tie these things together.
  
   So I suspect what makes most sense is to simply rename the
   dentry SYSVIDsegmentid
 
  Please stop breaking things in /proc. The pmap command relys
  on the old behavior.
 
 What effect did this change have upon the pmap command?  Details, please.
 
  It's time to revert.
 
 Probably true, but we'd need to understand what the impact was.
 
 Very simply, pmap reports the shmid.
 
 albert 0 ~$ pmap `pidof X` | egrep -2 shmid
 3005  16384K rw-s-  /dev/fb0
 3105152K rw---[ anon ]
 31076000384K rw-s-[ shmid=0x3f428000 ]
 310d6000384K rw-s-[ shmid=0x3f430001 ]
 31136000384K rw-s-[ shmid=0x3f438002 ]
 31196000384K rw-s-[ shmid=0x3f440003 ]
 311f6000384K rw-s-[ shmid=0x3f448004 ]
 31256000384K rw-s-[ shmid=0x3f450005 ]
 312b6000384K rw-s-[ shmid=0x3f460006 ]
 31316000384K rw-s-[ shmid=0x3f870007 ]
 31491000140K r  /usr/share/fonts/type1/gsfonts/n021003l.pfb
 3150e000   9496K rw---[ anon ]

Ok, so IIUC the problem was that inode-i_ino was being set to the id,
and the id can be the same for different things in two namespaces.

So aside from not using the id as inode-i_ino, an alternative is to use
a separate superblock, spearate mqeueue fs, for each ipc ns.

I haven't looked at that enough to see whether it's feasible, i.e. I 
don't know what else mqueue fs is used for.  Eric, does that sound
reasonable to you?

thanks,
-serge
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Andrew Morton
On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty [EMAIL PROTECTED] wrote:

 On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
  On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:
  
   BTW, I agree with Eric that its would be nice to use shmid as part
   of name instead of forcing to be as inode number. It should be
   possible for pmap to workout shmid from key or name. Isn't it ?
  
  It is not at all nice.
  
  1. it's incompatible ABI breakage
  2. where will you put the key then, in the inode? :-)
 
 Nope. Currently key is part of the name (but its not unique).
 
  
  Changing to SYSVID%d is no good either. Look, people
  are ***parsing*** this stuff in /proc. The /proc filesystem
  is not some random sandbox to be playing in.
  
  Before you go messing with it, note that the device number
  also matters. (it's per-boot dynamic, but that's OK)
  That's how one knows that /SYSV is not just
  a regular file; sadly these didn't get a non-/ prefix.
  (and no you can't fix that now; it's way too late)
  
  Next time you feel like breaking an ABI, mind putting
  LET'S BREAK AN ABI! in the subject of your email?
 
 I am not breaking ABI. Its already broken in the current
 mainline. I am trying to fix it by putting back the ino#
 as shmid. Eric had a suggestion that, instead of depending
 on the inode# to be shmid, we could embed shmid into name
 (instead of key which is currently not unique).
 
  BTW, I suspect this kind of thing also breaks:
  a. fuser, lsof, and other resource usage display tools
  b. various obscure emulators (similar to valgrind)
 
 If you strongly feel that old behaviour needs to be retained, 

yup, we should put it back.  The change was, afaik, accidental.

 here is the patch I originally suggested.

Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?

 Thanks,
 Badari
 
 ino# in /proc/pid/maps used to match ipcs -m output for shared 
 memory (shmid). It was useful in debugging, but its changed recently. 
 This patch sets inode number to shared memory id to match /proc/pid/maps.
 
 Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]
 
 Index: linux-2.6.22-rc4/ipc/shm.c
 ===
 --- linux-2.6.22-rc4.orig/ipc/shm.c   2007-06-04 17:57:25.0 -0700
 +++ linux-2.6.22-rc4/ipc/shm.c2007-06-06 08:23:57.0 -0700
 @@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
   shp-shm_nattch = 0;
   shp-id = shm_buildid(ns, id, shp-shm_perm.seq);
   shp-shm_file = file;
 + file-f_dentry-d_inode-i_ino = shp-id;
  
   ns-shm_tot += numpages;
   shm_unlock(shp);
 
 
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
 On Thu, 07 Jun 2007 10:06:37 -0700
 Badari Pulavarty [EMAIL PROTECTED] wrote:
 
  On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
   On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:
   
BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from key or name. Isn't it ?
   
   It is not at all nice.
   
   1. it's incompatible ABI breakage
   2. where will you put the key then, in the inode? :-)
  
  Nope. Currently key is part of the name (but its not unique).
  
   
   Changing to SYSVID%d is no good either. Look, people
   are ***parsing*** this stuff in /proc. The /proc filesystem
   is not some random sandbox to be playing in.
   
   Before you go messing with it, note that the device number
   also matters. (it's per-boot dynamic, but that's OK)
   That's how one knows that /SYSV is not just
   a regular file; sadly these didn't get a non-/ prefix.
   (and no you can't fix that now; it's way too late)
   
   Next time you feel like breaking an ABI, mind putting
   LET'S BREAK AN ABI! in the subject of your email?
  
  I am not breaking ABI. Its already broken in the current
  mainline. I am trying to fix it by putting back the ino#
  as shmid. Eric had a suggestion that, instead of depending
  on the inode# to be shmid, we could embed shmid into name
  (instead of key which is currently not unique).
  
   BTW, I suspect this kind of thing also breaks:
   a. fuser, lsof, and other resource usage display tools
   b. various obscure emulators (similar to valgrind)
  
  If you strongly feel that old behaviour needs to be retained, 
 
 yup, we should put it back.  The change was, afaik, accidental.
 
  here is the patch I originally suggested.
 
 Confused.  Will this one-liner fix all the userspace breakage to which
 Albert refers?

Yes. Albert, please correct me if I am wrong.

Thanks,
Badari


  ino# in /proc/pid/maps used to match ipcs -m output for shared 
  memory (shmid). It was useful in debugging, but its changed recently. 
  This patch sets inode number to shared memory id to match /proc/pid/maps.
  
  Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]
  
  Index: linux-2.6.22-rc4/ipc/shm.c
  ===
  --- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
  +++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 08:23:57.0 -0700
  @@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
  shp-shm_nattch = 0;
  shp-id = shm_buildid(ns, id, shp-shm_perm.seq);
  shp-shm_file = file;
  +   file-f_dentry-d_inode-i_ino = shp-id;
   
  ns-shm_tot += numpages;
  shm_unlock(shp);
  
  

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Badari Pulavarty ([EMAIL PROTECTED]):
 On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
  On Thu, 07 Jun 2007 10:06:37 -0700
  Badari Pulavarty [EMAIL PROTECTED] wrote:
  
   On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:

 BTW, I agree with Eric that its would be nice to use shmid as part
 of name instead of forcing to be as inode number. It should be
 possible for pmap to workout shmid from key or name. Isn't it ?

It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)
   
   Nope. Currently key is part of the name (but its not unique).
   

Changing to SYSVID%d is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
LET'S BREAK AN ABI! in the subject of your email?
   
   I am not breaking ABI. Its already broken in the current
   mainline. I am trying to fix it by putting back the ino#
   as shmid. Eric had a suggestion that, instead of depending
   on the inode# to be shmid, we could embed shmid into name
   (instead of key which is currently not unique).
   
BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)
   
   If you strongly feel that old behaviour needs to be retained, 
  
  yup, we should put it back.  The change was, afaik, accidental.
  
   here is the patch I originally suggested.
  
  Confused.  Will this one-liner fix all the userspace breakage to which
  Albert refers?
 
 Yes. Albert, please correct me if I am wrong.

It will, but could lead to two different inodes with the same i_ino,
right?

thanks,
-serge
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Badari Pulavarty
On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
 Quoting Badari Pulavarty ([EMAIL PROTECTED]):
  On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
   On Thu, 07 Jun 2007 10:06:37 -0700
   Badari Pulavarty [EMAIL PROTECTED] wrote:
   
On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
 On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:
 
  BTW, I agree with Eric that its would be nice to use shmid as part
  of name instead of forcing to be as inode number. It should be
  possible for pmap to workout shmid from key or name. Isn't it ?
 
 It is not at all nice.
 
 1. it's incompatible ABI breakage
 2. where will you put the key then, in the inode? :-)

Nope. Currently key is part of the name (but its not unique).

 
 Changing to SYSVID%d is no good either. Look, people
 are ***parsing*** this stuff in /proc. The /proc filesystem
 is not some random sandbox to be playing in.
 
 Before you go messing with it, note that the device number
 also matters. (it's per-boot dynamic, but that's OK)
 That's how one knows that /SYSV is not just
 a regular file; sadly these didn't get a non-/ prefix.
 (and no you can't fix that now; it's way too late)
 
 Next time you feel like breaking an ABI, mind putting
 LET'S BREAK AN ABI! in the subject of your email?

I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of key which is currently not unique).

 BTW, I suspect this kind of thing also breaks:
 a. fuser, lsof, and other resource usage display tools
 b. various obscure emulators (similar to valgrind)

If you strongly feel that old behaviour needs to be retained, 
   
   yup, we should put it back.  The change was, afaik, accidental.
   
here is the patch I originally suggested.
   
   Confused.  Will this one-liner fix all the userspace breakage to which
   Albert refers?
  
  Yes. Albert, please correct me if I am wrong.
 
 It will, but could lead to two different inodes with the same i_ino,
 right?

Only if we generate same ID in two different namespaces. Is it currently
possible ? 

Thanks,
Badari


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Badari Pulavarty ([EMAIL PROTECTED]):
 On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
  Quoting Badari Pulavarty ([EMAIL PROTECTED]):
   On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty [EMAIL PROTECTED] wrote:

 On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
  On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:
  
   BTW, I agree with Eric that its would be nice to use shmid as part
   of name instead of forcing to be as inode number. It should be
   possible for pmap to workout shmid from key or name. Isn't it ?
  
  It is not at all nice.
  
  1. it's incompatible ABI breakage
  2. where will you put the key then, in the inode? :-)
 
 Nope. Currently key is part of the name (but its not unique).
 
  
  Changing to SYSVID%d is no good either. Look, people
  are ***parsing*** this stuff in /proc. The /proc filesystem
  is not some random sandbox to be playing in.
  
  Before you go messing with it, note that the device number
  also matters. (it's per-boot dynamic, but that's OK)
  That's how one knows that /SYSV is not just
  a regular file; sadly these didn't get a non-/ prefix.
  (and no you can't fix that now; it's way too late)
  
  Next time you feel like breaking an ABI, mind putting
  LET'S BREAK AN ABI! in the subject of your email?
 
 I am not breaking ABI. Its already broken in the current
 mainline. I am trying to fix it by putting back the ino#
 as shmid. Eric had a suggestion that, instead of depending
 on the inode# to be shmid, we could embed shmid into name
 (instead of key which is currently not unique).
 
  BTW, I suspect this kind of thing also breaks:
  a. fuser, lsof, and other resource usage display tools
  b. various obscure emulators (similar to valgrind)
 
 If you strongly feel that old behaviour needs to be retained, 

yup, we should put it back.  The change was, afaik, accidental.

 here is the patch I originally suggested.

Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?
   
   Yes. Albert, please correct me if I am wrong.
  
  It will, but could lead to two different inodes with the same i_ino,
  right?
 
 Only if we generate same ID in two different namespaces. Is it currently
 possible ? 

Should be nothing stopping it.

But like I say we never find the inode based on i_ino, and don't hash
the inode, so it might be ok.

-serge
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Serge E. Hallyn ([EMAIL PROTECTED]):
 Quoting Badari Pulavarty ([EMAIL PROTECTED]):
  On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
   On Thu, 07 Jun 2007 10:06:37 -0700
   Badari Pulavarty [EMAIL PROTECTED] wrote:
   
On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
 On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:
 
  BTW, I agree with Eric that its would be nice to use shmid as part
  of name instead of forcing to be as inode number. It should be
  possible for pmap to workout shmid from key or name. Isn't it ?
 
 It is not at all nice.
 
 1. it's incompatible ABI breakage
 2. where will you put the key then, in the inode? :-)

Nope. Currently key is part of the name (but its not unique).

 
 Changing to SYSVID%d is no good either. Look, people
 are ***parsing*** this stuff in /proc. The /proc filesystem
 is not some random sandbox to be playing in.
 
 Before you go messing with it, note that the device number
 also matters. (it's per-boot dynamic, but that's OK)
 That's how one knows that /SYSV is not just
 a regular file; sadly these didn't get a non-/ prefix.
 (and no you can't fix that now; it's way too late)
 
 Next time you feel like breaking an ABI, mind putting
 LET'S BREAK AN ABI! in the subject of your email?

I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of key which is currently not unique).

 BTW, I suspect this kind of thing also breaks:
 a. fuser, lsof, and other resource usage display tools
 b. various obscure emulators (similar to valgrind)

If you strongly feel that old behaviour needs to be retained, 
   
   yup, we should put it back.  The change was, afaik, accidental.
   
here is the patch I originally suggested.
   
   Confused.  Will this one-liner fix all the userspace breakage to which
   Albert refers?
  
  Yes. Albert, please correct me if I am wrong.
 
 It will, but could lead to two different inodes with the same i_ino,
 right?

Well I guess it's not *technically* a problem since these inodes are
never hashed.

-serge
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Badari Pulavarty



Serge E. Hallyn wrote:


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:


Quoting Badari Pulavarty ([EMAIL PROTECTED]):


On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:


On Thu, 07 Jun 2007 10:06:37 -0700
Badari Pulavarty [EMAIL PROTECTED] wrote:


On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:


On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:


BTW, I agree with Eric that its would be nice to use shmid as part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from key or name. Isn't it ?


It is not at all nice.

1. it's incompatible ABI breakage
2. where will you put the key then, in the inode? :-)


Nope. Currently key is part of the name (but its not unique).


Changing to SYSVID%d is no good either. Look, people
are ***parsing*** this stuff in /proc. The /proc filesystem
is not some random sandbox to be playing in.

Before you go messing with it, note that the device number
also matters. (it's per-boot dynamic, but that's OK)
That's how one knows that /SYSV is not just
a regular file; sadly these didn't get a non-/ prefix.
(and no you can't fix that now; it's way too late)

Next time you feel like breaking an ABI, mind putting
LET'S BREAK AN ABI! in the subject of your email?


I am not breaking ABI. Its already broken in the current
mainline. I am trying to fix it by putting back the ino#
as shmid. Eric had a suggestion that, instead of depending
on the inode# to be shmid, we could embed shmid into name
(instead of key which is currently not unique).


BTW, I suspect this kind of thing also breaks:
a. fuser, lsof, and other resource usage display tools
b. various obscure emulators (similar to valgrind)

If you strongly feel that old behaviour needs to be retained, 


yup, we should put it back.  The change was, afaik, accidental.


here is the patch I originally suggested.


Confused.  Will this one-liner fix all the userspace breakage to which
Albert refers?


Yes. Albert, please correct me if I am wrong.


It will, but could lead to two different inodes with the same i_ino,
right?


Only if we generate same ID in two different namespaces. Is it currently
possible ? 



Should be nothing stopping it.

But like I say we never find the inode based on i_ino, and don't hash
the inode, so it might be ok.

Correct. We might end up with same shmid - which mean same inode# shows 
up in /proc/pid/maps.
If we don't unshare pid namespace or look from parent namespace - we 
will end up seeing same
shmid/inode# in different /proc/pid/maps, even though they are 
different. But I guess its okay..


Thanks,
Badari



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-07 Thread Serge E. Hallyn
Quoting Serge E. Hallyn ([EMAIL PROTECTED]):
 Quoting Badari Pulavarty ([EMAIL PROTECTED]):
  On Thu, 2007-06-07 at 15:37 -0500, Serge E. Hallyn wrote:
   Quoting Badari Pulavarty ([EMAIL PROTECTED]):
On Thu, 2007-06-07 at 12:48 -0700, Andrew Morton wrote:
 On Thu, 07 Jun 2007 10:06:37 -0700
 Badari Pulavarty [EMAIL PROTECTED] wrote:
 
  On Thu, 2007-06-07 at 12:43 -0400, Albert Cahalan wrote:
   On 6/7/07, Badari Pulavarty [EMAIL PROTECTED] wrote:
   
BTW, I agree with Eric that its would be nice to use shmid as 
part
of name instead of forcing to be as inode number. It should be
possible for pmap to workout shmid from key or name. Isn't it 
?
   
   It is not at all nice.
   
   1. it's incompatible ABI breakage
   2. where will you put the key then, in the inode? :-)
  
  Nope. Currently key is part of the name (but its not unique).
  
   
   Changing to SYSVID%d is no good either. Look, people
   are ***parsing*** this stuff in /proc. The /proc filesystem
   is not some random sandbox to be playing in.
   
   Before you go messing with it, note that the device number
   also matters. (it's per-boot dynamic, but that's OK)
   That's how one knows that /SYSV is not just
   a regular file; sadly these didn't get a non-/ prefix.
   (and no you can't fix that now; it's way too late)
   
   Next time you feel like breaking an ABI, mind putting
   LET'S BREAK AN ABI! in the subject of your email?
  
  I am not breaking ABI. Its already broken in the current
  mainline. I am trying to fix it by putting back the ino#
  as shmid. Eric had a suggestion that, instead of depending
  on the inode# to be shmid, we could embed shmid into name
  (instead of key which is currently not unique).
  
   BTW, I suspect this kind of thing also breaks:
   a. fuser, lsof, and other resource usage display tools
   b. various obscure emulators (similar to valgrind)
  
  If you strongly feel that old behaviour needs to be retained, 
 
 yup, we should put it back.  The change was, afaik, accidental.
 
  here is the patch I originally suggested.
 
 Confused.  Will this one-liner fix all the userspace breakage to which
 Albert refers?

Yes. Albert, please correct me if I am wrong.
   
   It will, but could lead to two different inodes with the same i_ino,
   right?
  
  Only if we generate same ID in two different namespaces. Is it currently
  possible ? 
 
 Should be nothing stopping it.

(just to be more certain, a quick test showed I can get id 0 for
different keys, and different ids for the same key 0xff, in different
ipc namespaces)

-serge
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-06 Thread Albert Cahalan

On 6/6/07, Andrew Morton <[EMAIL PROTECTED]> wrote:

On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <[EMAIL PROTECTED]> wrote:
> Eric W. Biederman writes:
> > Badari Pulavarty <[EMAIL PROTECTED]> writes:
>
> >> Your recent cleanup to shm code, namely
> >>
> >> [PATCH] shm: make sysv ipc shared memory use stacked files
> >>
> >> took away one of the debugging feature for shm segments.
> >> Originally, shmid were forced to be the inode numbers and
> >> they show up in /proc/pid/maps for the process which mapped
> >> this shared memory segments (vma listing). That way, its easy
> >> to find out who all mapped this shared memory segment. Your
> >> patchset, took away the inode# setting. So, we can't easily
> >> match the shmem segments to /proc/pid/maps easily. (It was
> >> really useful in tracking down a customer problem recently).
> >> Is this done deliberately ? Anything wrong in setting this back ?
> >
> > Theoretically it makes the stacked file concept more brittle,
> > because it means the lower layers can't care about their inode
> > number.
> >
> > We do need something to tie these things together.
> >
> > So I suspect what makes most sense is to simply rename the
> > dentry SYSVID
>
> Please stop breaking things in /proc. The pmap command relys
> on the old behavior.

What effect did this change have upon the pmap command?  Details, please.

> It's time to revert.

Probably true, but we'd need to understand what the impact was.


Very simply, pmap reports the shmid.

albert 0 ~$ pmap `pidof X` | egrep -2 shmid
3005  16384K rw-s-  /dev/fb0
3105152K rw---[ anon ]
31076000384K rw-s-[ shmid=0x3f428000 ]
310d6000384K rw-s-[ shmid=0x3f430001 ]
31136000384K rw-s-[ shmid=0x3f438002 ]
31196000384K rw-s-[ shmid=0x3f440003 ]
311f6000384K rw-s-[ shmid=0x3f448004 ]
31256000384K rw-s-[ shmid=0x3f450005 ]
312b6000384K rw-s-[ shmid=0x3f460006 ]
31316000384K rw-s-[ shmid=0x3f870007 ]
31491000140K r  /usr/share/fonts/type1/gsfonts/n021003l.pfb
3150e000   9496K rw---[ anon ]
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-06 Thread Andrew Morton
On Wed, 6 Jun 2007 23:27:01 -0400 "Albert Cahalan" <[EMAIL PROTECTED]> wrote:

> Eric W. Biederman writes:
> > Badari Pulavarty <[EMAIL PROTECTED]> writes:
> 
> >> Your recent cleanup to shm code, namely
> >>
> >> [PATCH] shm: make sysv ipc shared memory use stacked files
> >>
> >> took away one of the debugging feature for shm segments.
> >> Originally, shmid were forced to be the inode numbers and
> >> they show up in /proc/pid/maps for the process which mapped
> >> this shared memory segments (vma listing). That way, its easy
> >> to find out who all mapped this shared memory segment. Your
> >> patchset, took away the inode# setting. So, we can't easily
> >> match the shmem segments to /proc/pid/maps easily. (It was
> >> really useful in tracking down a customer problem recently).
> >> Is this done deliberately ? Anything wrong in setting this back ?
> >
> > Theoretically it makes the stacked file concept more brittle,
> > because it means the lower layers can't care about their inode
> > number.
> >
> > We do need something to tie these things together.
> >
> > So I suspect what makes most sense is to simply rename the
> > dentry SYSVID
> 
> Please stop breaking things in /proc. The pmap command relys
> on the old behavior.

What effect did this change have upon the pmap command?  Details, please.

> It's time to revert.

Probably true, but we'd need to understand what the impact was.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-06 Thread Albert Cahalan

Eric W. Biederman writes:

Badari Pulavarty <[EMAIL PROTECTED]> writes:



Your recent cleanup to shm code, namely

[PATCH] shm: make sysv ipc shared memory use stacked files

took away one of the debugging feature for shm segments.
Originally, shmid were forced to be the inode numbers and
they show up in /proc/pid/maps for the process which mapped
this shared memory segments (vma listing). That way, its easy
to find out who all mapped this shared memory segment. Your
patchset, took away the inode# setting. So, we can't easily
match the shmem segments to /proc/pid/maps easily. (It was
really useful in tracking down a customer problem recently).
Is this done deliberately ? Anything wrong in setting this back ?


Theoretically it makes the stacked file concept more brittle,
because it means the lower layers can't care about their inode
number.

We do need something to tie these things together.

So I suspect what makes most sense is to simply rename the
dentry SYSVID


Please stop breaking things in /proc. The pmap command relys
on the old behavior. It's time to revert. Put back the segment ID
where it belongs, and leave the key where it belongs too.

Containers are NOT worth breaking our ABIs left and right.
We don't need to leap off that bridge just because Solaris did,
unless you can explain why complexity and bloat are desirable.
We already have SE Linux, chroot, KVM, and several more!
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-06 Thread Eric W. Biederman
Badari Pulavarty <[EMAIL PROTECTED]> writes:
>
> -- Shared Memory Segments 
> keyshmid  owner  perms  bytes  nattch status
> 0x 884737 db2inst1  76733554432   13
> 0x 950275 db2fenc1  70123052288   13
>
> There is no unique way to identify them easily :(
>
> I guess, like you suggested, we can change the dentry name to use shmid
> instead of the portions of the "key" to make it unique. I think, I can 
> work out a patch for this.

Thanks.  That should be more robust.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-06 Thread Badari Pulavarty
On Wed, 2007-06-06 at 11:02 -0600, Eric W. Biederman wrote:
> Badari Pulavarty <[EMAIL PROTECTED]> writes:
> 
> > Hi Eric,
> >
> > Your recent cleanup to shm code, namely
> >
> > [PATCH] shm: make sysv ipc shared memory use stacked files
> >
> > took away one of the debugging feature for shm segments.
> > Originally, shmid were forced to be the inode numbers and
> > they show up in /proc/pid/maps for the process which mapped
> > this shared memory segments (vma listing). That way, its easy
> > to find out who all mapped this shared memory segment. Your
> > patchset, took away the inode# setting. So, we can't easily
> > match the shmem segments to /proc/pid/maps easily. (It was
> > really useful in tracking down a customer problem recently). 
> > Is this done deliberately ? Anything wrong in setting this back ?
> >
> > Comments ?
> >
> > Thanks,
> > Badari
> >
> > Without patch:
> > --
> >
> > # ipcs -m
> >
> > -- Shared Memory Segments 
> > keyshmid  owner  perms  bytes  nattch status
> > 0x 884737 db2inst1  76733554432   13
> >
> > # grep 884737 /proc/*/maps
> > #
> >
> > With patch:
> > ---
> >
> > # ipcs -m
> >
> > -- Shared Memory Segments 
> > keyshmid  owner  perms  bytes  nattch status
> > 0x 884737 db2inst1  76733554432   13
> >
> > # grep 884737 /proc/*/maps
> > /proc/0/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/1/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/2/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/3/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/4/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/5/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/6/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/7/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/8/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/11121/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/11122/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/11124/maps:4000389c000-4000589c000 rw-s  00:08 884737
> > /SYSV (deleted)
> > /proc/11575/maps:40006724000-40008724000 rw-s  00:08 884737
> > /SYSV (deleted)
> >
> >
> >
> > Here is the patch.
> >
> > "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> > memory (shmid). It was useful in debugging, but its changed recently. 
> > This patch sets inode number to shared memory id to match /proc/pid/maps.
> 
> Theoretically it makes the stacked file concept more brittle, because
> it means the lower layers can't care about their inode number.
> 
> We do need something to tie these things together.
> 
> So I suspect what makes most sense is to simply rename the dentry
> SYSVID

Yep. Currently, we use part of "key" as the dentry name. For example,

# ipcs

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x083d0d74 851968 db2inst1  76733554432   13

# grep 83d0d74 /proc/*/maps
/proc/0/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
/proc/1/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
/proc/2/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
/proc/3/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
..

The issue is with the ones with key = 0x000, like following:

# ipcs

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x 884737 db2inst1  76733554432   13
0x 950275 db2fenc1  70123052288   13

There is no unique way to identify them easily :(

I guess, like you suggested, we can change the dentry name to use shmid
instead of the portions of the "key" to make it unique. I think, I can 
work out a patch for this.


> 
> That should give you the necessary information while not doing something
> that is a long term maintenance problem.
> 
> Do you think you can cook up a patch to that effect?
> Otherwise I will see if I can.
> 
> In practice I'm not really against your change, but I would prefer
> to leave the code in a state where someone can reimplement hugetlbfs
> or shmfs and we simply don't care.

Thanks for your suggestion.

Thanks,
Badari


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL 

Re: [RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-06 Thread Eric W. Biederman
Badari Pulavarty <[EMAIL PROTECTED]> writes:

> Hi Eric,
>
> Your recent cleanup to shm code, namely
>
> [PATCH] shm: make sysv ipc shared memory use stacked files
>
> took away one of the debugging feature for shm segments.
> Originally, shmid were forced to be the inode numbers and
> they show up in /proc/pid/maps for the process which mapped
> this shared memory segments (vma listing). That way, its easy
> to find out who all mapped this shared memory segment. Your
> patchset, took away the inode# setting. So, we can't easily
> match the shmem segments to /proc/pid/maps easily. (It was
> really useful in tracking down a customer problem recently). 
> Is this done deliberately ? Anything wrong in setting this back ?
>
> Comments ?
>
> Thanks,
> Badari
>
> Without patch:
> --
>
> # ipcs -m
>
> -- Shared Memory Segments 
> keyshmid  owner  perms  bytes  nattch status
> 0x 884737 db2inst1  76733554432   13
>
> # grep 884737 /proc/*/maps
> #
>
> With patch:
> ---
>
> # ipcs -m
>
> -- Shared Memory Segments 
> keyshmid  owner  perms  bytes  nattch status
> 0x 884737 db2inst1  76733554432   13
>
> # grep 884737 /proc/*/maps
> /proc/0/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/1/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/2/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/3/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/4/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/5/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/6/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/7/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/8/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/11121/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/11122/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/11124/maps:4000389c000-4000589c000 rw-s  00:08 884737
> /SYSV (deleted)
> /proc/11575/maps:40006724000-40008724000 rw-s  00:08 884737
> /SYSV (deleted)
>
>
>
> Here is the patch.
>
> "ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
> memory (shmid). It was useful in debugging, but its changed recently. 
> This patch sets inode number to shared memory id to match /proc/pid/maps.

Theoretically it makes the stacked file concept more brittle, because
it means the lower layers can't care about their inode number.

We do need something to tie these things together.

So I suspect what makes most sense is to simply rename the dentry
SYSVID

That should give you the necessary information while not doing something
that is a long term maintenance problem.

Do you think you can cook up a patch to that effect?
Otherwise I will see if I can.

In practice I'm not really against your change, but I would prefer
to leave the code in a state where someone can reimplement hugetlbfs
or shmfs and we simply don't care.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH] /proc/pid/maps doesn't match "ipcs -m" shmid

2007-06-06 Thread Badari Pulavarty
Hi Eric,

Your recent cleanup to shm code, namely

[PATCH] shm: make sysv ipc shared memory use stacked files

took away one of the debugging feature for shm segments.
Originally, shmid were forced to be the inode numbers and
they show up in /proc/pid/maps for the process which mapped
this shared memory segments (vma listing). That way, its easy
to find out who all mapped this shared memory segment. Your
patchset, took away the inode# setting. So, we can't easily
match the shmem segments to /proc/pid/maps easily. (It was
really useful in tracking down a customer problem recently). 
Is this done deliberately ? Anything wrong in setting this back ?

Comments ?

Thanks,
Badari

Without patch:
--

# ipcs -m

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x 884737 db2inst1  76733554432   13

# grep 884737 /proc/*/maps
#

With patch:
---

# ipcs -m

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x 884737 db2inst1  76733554432   13

# grep 884737 /proc/*/maps
/proc/0/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/1/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/2/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/3/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/4/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/5/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/6/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/7/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/8/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11121/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11122/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11124/maps:4000389c000-4000589c000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11575/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)



Here is the patch.

"ino#" in /proc/pid/maps used to match "ipcs -m" output for shared 
memory (shmid). It was useful in debugging, but its changed recently. 
This patch sets inode number to shared memory id to match /proc/pid/maps.

Signed-off-by: Badari Pulavarty <[EMAIL PROTECTED]>

Index: linux-2.6.22-rc4/ipc/shm.c
===
--- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
+++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 08:23:57.0 -0700
@@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
shp->shm_nattch = 0;
shp->id = shm_buildid(ns, id, shp->shm_perm.seq);
shp->shm_file = file;
+   file->f_dentry->d_inode->i_ino = shp->id;
 
ns->shm_tot += numpages;
shm_unlock(shp);


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-06 Thread Badari Pulavarty
Hi Eric,

Your recent cleanup to shm code, namely

[PATCH] shm: make sysv ipc shared memory use stacked files

took away one of the debugging feature for shm segments.
Originally, shmid were forced to be the inode numbers and
they show up in /proc/pid/maps for the process which mapped
this shared memory segments (vma listing). That way, its easy
to find out who all mapped this shared memory segment. Your
patchset, took away the inode# setting. So, we can't easily
match the shmem segments to /proc/pid/maps easily. (It was
really useful in tracking down a customer problem recently). 
Is this done deliberately ? Anything wrong in setting this back ?

Comments ?

Thanks,
Badari

Without patch:
--

# ipcs -m

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x 884737 db2inst1  76733554432   13

# grep 884737 /proc/*/maps
#

With patch:
---

# ipcs -m

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x 884737 db2inst1  76733554432   13

# grep 884737 /proc/*/maps
/proc/0/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/1/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/2/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/3/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/4/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/5/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/6/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/7/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/8/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11121/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11122/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11124/maps:4000389c000-4000589c000 rw-s  00:08 884737 
/SYSV (deleted)
/proc/11575/maps:40006724000-40008724000 rw-s  00:08 884737 
/SYSV (deleted)



Here is the patch.

ino# in /proc/pid/maps used to match ipcs -m output for shared 
memory (shmid). It was useful in debugging, but its changed recently. 
This patch sets inode number to shared memory id to match /proc/pid/maps.

Signed-off-by: Badari Pulavarty [EMAIL PROTECTED]

Index: linux-2.6.22-rc4/ipc/shm.c
===
--- linux-2.6.22-rc4.orig/ipc/shm.c 2007-06-04 17:57:25.0 -0700
+++ linux-2.6.22-rc4/ipc/shm.c  2007-06-06 08:23:57.0 -0700
@@ -397,6 +397,7 @@ static int newseg (struct ipc_namespace 
shp-shm_nattch = 0;
shp-id = shm_buildid(ns, id, shp-shm_perm.seq);
shp-shm_file = file;
+   file-f_dentry-d_inode-i_ino = shp-id;
 
ns-shm_tot += numpages;
shm_unlock(shp);


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-06 Thread Eric W. Biederman
Badari Pulavarty [EMAIL PROTECTED] writes:

 Hi Eric,

 Your recent cleanup to shm code, namely

 [PATCH] shm: make sysv ipc shared memory use stacked files

 took away one of the debugging feature for shm segments.
 Originally, shmid were forced to be the inode numbers and
 they show up in /proc/pid/maps for the process which mapped
 this shared memory segments (vma listing). That way, its easy
 to find out who all mapped this shared memory segment. Your
 patchset, took away the inode# setting. So, we can't easily
 match the shmem segments to /proc/pid/maps easily. (It was
 really useful in tracking down a customer problem recently). 
 Is this done deliberately ? Anything wrong in setting this back ?

 Comments ?

 Thanks,
 Badari

 Without patch:
 --

 # ipcs -m

 -- Shared Memory Segments 
 keyshmid  owner  perms  bytes  nattch status
 0x 884737 db2inst1  76733554432   13

 # grep 884737 /proc/*/maps
 #

 With patch:
 ---

 # ipcs -m

 -- Shared Memory Segments 
 keyshmid  owner  perms  bytes  nattch status
 0x 884737 db2inst1  76733554432   13

 # grep 884737 /proc/*/maps
 /proc/0/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/1/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/2/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/3/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/4/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/5/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/6/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/7/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/8/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/11121/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/11122/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/11124/maps:4000389c000-4000589c000 rw-s  00:08 884737
 /SYSV (deleted)
 /proc/11575/maps:40006724000-40008724000 rw-s  00:08 884737
 /SYSV (deleted)



 Here is the patch.

 ino# in /proc/pid/maps used to match ipcs -m output for shared 
 memory (shmid). It was useful in debugging, but its changed recently. 
 This patch sets inode number to shared memory id to match /proc/pid/maps.

Theoretically it makes the stacked file concept more brittle, because
it means the lower layers can't care about their inode number.

We do need something to tie these things together.

So I suspect what makes most sense is to simply rename the dentry
SYSVIDsegmentid

That should give you the necessary information while not doing something
that is a long term maintenance problem.

Do you think you can cook up a patch to that effect?
Otherwise I will see if I can.

In practice I'm not really against your change, but I would prefer
to leave the code in a state where someone can reimplement hugetlbfs
or shmfs and we simply don't care.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-06 Thread Badari Pulavarty
On Wed, 2007-06-06 at 11:02 -0600, Eric W. Biederman wrote:
 Badari Pulavarty [EMAIL PROTECTED] writes:
 
  Hi Eric,
 
  Your recent cleanup to shm code, namely
 
  [PATCH] shm: make sysv ipc shared memory use stacked files
 
  took away one of the debugging feature for shm segments.
  Originally, shmid were forced to be the inode numbers and
  they show up in /proc/pid/maps for the process which mapped
  this shared memory segments (vma listing). That way, its easy
  to find out who all mapped this shared memory segment. Your
  patchset, took away the inode# setting. So, we can't easily
  match the shmem segments to /proc/pid/maps easily. (It was
  really useful in tracking down a customer problem recently). 
  Is this done deliberately ? Anything wrong in setting this back ?
 
  Comments ?
 
  Thanks,
  Badari
 
  Without patch:
  --
 
  # ipcs -m
 
  -- Shared Memory Segments 
  keyshmid  owner  perms  bytes  nattch status
  0x 884737 db2inst1  76733554432   13
 
  # grep 884737 /proc/*/maps
  #
 
  With patch:
  ---
 
  # ipcs -m
 
  -- Shared Memory Segments 
  keyshmid  owner  perms  bytes  nattch status
  0x 884737 db2inst1  76733554432   13
 
  # grep 884737 /proc/*/maps
  /proc/0/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/1/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/2/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/3/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/4/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/5/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/6/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/7/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/8/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/11121/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/11122/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/11124/maps:4000389c000-4000589c000 rw-s  00:08 884737
  /SYSV (deleted)
  /proc/11575/maps:40006724000-40008724000 rw-s  00:08 884737
  /SYSV (deleted)
 
 
 
  Here is the patch.
 
  ino# in /proc/pid/maps used to match ipcs -m output for shared 
  memory (shmid). It was useful in debugging, but its changed recently. 
  This patch sets inode number to shared memory id to match /proc/pid/maps.
 
 Theoretically it makes the stacked file concept more brittle, because
 it means the lower layers can't care about their inode number.
 
 We do need something to tie these things together.
 
 So I suspect what makes most sense is to simply rename the dentry
 SYSVIDsegmentid

Yep. Currently, we use part of key as the dentry name. For example,

# ipcs

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x083d0d74 851968 db2inst1  76733554432   13

# grep 83d0d74 /proc/*/maps
/proc/0/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
/proc/1/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
/proc/2/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
/proc/3/maps:40004724000-40006724000 rw-s  00:08 851968  
/SYSV083d0d74 (deleted)
..

The issue is with the ones with key = 0x000, like following:

# ipcs

-- Shared Memory Segments 
keyshmid  owner  perms  bytes  nattch status
0x 884737 db2inst1  76733554432   13
0x 950275 db2fenc1  70123052288   13

There is no unique way to identify them easily :(

I guess, like you suggested, we can change the dentry name to use shmid
instead of the portions of the key to make it unique. I think, I can 
work out a patch for this.


 
 That should give you the necessary information while not doing something
 that is a long term maintenance problem.
 
 Do you think you can cook up a patch to that effect?
 Otherwise I will see if I can.
 
 In practice I'm not really against your change, but I would prefer
 to leave the code in a state where someone can reimplement hugetlbfs
 or shmfs and we simply don't care.

Thanks for your suggestion.

Thanks,
Badari


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-06 Thread Eric W. Biederman
Badari Pulavarty [EMAIL PROTECTED] writes:

 -- Shared Memory Segments 
 keyshmid  owner  perms  bytes  nattch status
 0x 884737 db2inst1  76733554432   13
 0x 950275 db2fenc1  70123052288   13

 There is no unique way to identify them easily :(

 I guess, like you suggested, we can change the dentry name to use shmid
 instead of the portions of the key to make it unique. I think, I can 
 work out a patch for this.

Thanks.  That should be more robust.

Eric
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-06 Thread Albert Cahalan

Eric W. Biederman writes:

Badari Pulavarty [EMAIL PROTECTED] writes:



Your recent cleanup to shm code, namely

[PATCH] shm: make sysv ipc shared memory use stacked files

took away one of the debugging feature for shm segments.
Originally, shmid were forced to be the inode numbers and
they show up in /proc/pid/maps for the process which mapped
this shared memory segments (vma listing). That way, its easy
to find out who all mapped this shared memory segment. Your
patchset, took away the inode# setting. So, we can't easily
match the shmem segments to /proc/pid/maps easily. (It was
really useful in tracking down a customer problem recently).
Is this done deliberately ? Anything wrong in setting this back ?


Theoretically it makes the stacked file concept more brittle,
because it means the lower layers can't care about their inode
number.

We do need something to tie these things together.

So I suspect what makes most sense is to simply rename the
dentry SYSVIDsegmentid


Please stop breaking things in /proc. The pmap command relys
on the old behavior. It's time to revert. Put back the segment ID
where it belongs, and leave the key where it belongs too.

Containers are NOT worth breaking our ABIs left and right.
We don't need to leap off that bridge just because Solaris did,
unless you can explain why complexity and bloat are desirable.
We already have SE Linux, chroot, KVM, and several more!
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-06 Thread Andrew Morton
On Wed, 6 Jun 2007 23:27:01 -0400 Albert Cahalan [EMAIL PROTECTED] wrote:

 Eric W. Biederman writes:
  Badari Pulavarty [EMAIL PROTECTED] writes:
 
  Your recent cleanup to shm code, namely
 
  [PATCH] shm: make sysv ipc shared memory use stacked files
 
  took away one of the debugging feature for shm segments.
  Originally, shmid were forced to be the inode numbers and
  they show up in /proc/pid/maps for the process which mapped
  this shared memory segments (vma listing). That way, its easy
  to find out who all mapped this shared memory segment. Your
  patchset, took away the inode# setting. So, we can't easily
  match the shmem segments to /proc/pid/maps easily. (It was
  really useful in tracking down a customer problem recently).
  Is this done deliberately ? Anything wrong in setting this back ?
 
  Theoretically it makes the stacked file concept more brittle,
  because it means the lower layers can't care about their inode
  number.
 
  We do need something to tie these things together.
 
  So I suspect what makes most sense is to simply rename the
  dentry SYSVIDsegmentid
 
 Please stop breaking things in /proc. The pmap command relys
 on the old behavior.

What effect did this change have upon the pmap command?  Details, please.

 It's time to revert.

Probably true, but we'd need to understand what the impact was.
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] /proc/pid/maps doesn't match ipcs -m shmid

2007-06-06 Thread Albert Cahalan

On 6/6/07, Andrew Morton [EMAIL PROTECTED] wrote:

On Wed, 6 Jun 2007 23:27:01 -0400 Albert Cahalan [EMAIL PROTECTED] wrote:
 Eric W. Biederman writes:
  Badari Pulavarty [EMAIL PROTECTED] writes:

  Your recent cleanup to shm code, namely
 
  [PATCH] shm: make sysv ipc shared memory use stacked files
 
  took away one of the debugging feature for shm segments.
  Originally, shmid were forced to be the inode numbers and
  they show up in /proc/pid/maps for the process which mapped
  this shared memory segments (vma listing). That way, its easy
  to find out who all mapped this shared memory segment. Your
  patchset, took away the inode# setting. So, we can't easily
  match the shmem segments to /proc/pid/maps easily. (It was
  really useful in tracking down a customer problem recently).
  Is this done deliberately ? Anything wrong in setting this back ?
 
  Theoretically it makes the stacked file concept more brittle,
  because it means the lower layers can't care about their inode
  number.
 
  We do need something to tie these things together.
 
  So I suspect what makes most sense is to simply rename the
  dentry SYSVIDsegmentid

 Please stop breaking things in /proc. The pmap command relys
 on the old behavior.

What effect did this change have upon the pmap command?  Details, please.

 It's time to revert.

Probably true, but we'd need to understand what the impact was.


Very simply, pmap reports the shmid.

albert 0 ~$ pmap `pidof X` | egrep -2 shmid
3005  16384K rw-s-  /dev/fb0
3105152K rw---[ anon ]
31076000384K rw-s-[ shmid=0x3f428000 ]
310d6000384K rw-s-[ shmid=0x3f430001 ]
31136000384K rw-s-[ shmid=0x3f438002 ]
31196000384K rw-s-[ shmid=0x3f440003 ]
311f6000384K rw-s-[ shmid=0x3f448004 ]
31256000384K rw-s-[ shmid=0x3f450005 ]
312b6000384K rw-s-[ shmid=0x3f460006 ]
31316000384K rw-s-[ shmid=0x3f870007 ]
31491000140K r  /usr/share/fonts/type1/gsfonts/n021003l.pfb
3150e000   9496K rw---[ anon ]
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/