Re: Accessing file-offset info for fds in /proc?

2007-02-20 Thread Miklos Szeredi
> On Tue, 2007-02-20 at 02:31 -0500, Hank Leininger wrote:
> > Is there anything provided by the kernel that would let you see the
> > current offset of an existing filehandle?
> > 
> > Sometimes when processing a very large file (grepping a log, bzip2'ing
> > or gpg'ing a file, or whatever), I'd really like to know how far along
> > it is, because I'm impatient.  lsof has an -o flag to show offsets for
> > file descriptors it lists, but it appears that's not supported under
> > Linux.  It looks like all of the information lsof and fuser print about 
> > files in use, etc can be gotten from /proc/*/fd/* (and /proc/*/maps, but
> > I'm not really concerned with mmap'ed files, just positions on fds).
> > Sometimes I'll resort to strace -s4096'ing the process to see what chunk
> > of text it's currently reading, and try to guess from that.  Silly.
> > 
> > Has anybody ever developed a patch to implement this?  I realize this
> > could create a variety of information-leakage problems; the information
> > probably would need to be restricted, such as by the same rules as
> > dumpable.  Are there any horribly painful reasons why this couldn't be
> > done?
> 
> It shouldn't be too painful.  The code to populate /proc/*/fd/ has the
> file struct.  It just doesn't have a place to pass the offset to user-space
> since it's basically creating a symlink.  In proc_fd_link(), it has the
> file struct.  The offset is file->f_pos.
> 
> One could create something like /proc/*/fd_offsets, whose read method
> could list the file descriptor, path, and offset for each open file.

I have an old patch that does something like this.  Not much of it
applies now, but maybe it can be dusted off.

Miklos


This patch adds support for finding out the current file position of
an open file through the proc filesystem.  These new entries are
added:

  /proc/PID/fdinfo/FD/pos
  /proc/PID/task/TID/fdinfo/FD/pos
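
As a rough illustration of how a tool might consume these entries, here
is a userspace sketch in C.  It assumes this patch is applied, and it
also assumes that the pos file contains a single decimal number, which
the description above does not spell out.  The helper names
fdinfo_pos_path() and read_fd_pos() are made up for the example.

```c
#include <stdio.h>
#include <sys/types.h>

/* Build the path of the per-descriptor position file proposed by the
 * patch.  Returns the snprintf() result. */
static int fdinfo_pos_path(char *buf, size_t len, pid_t pid, int fd)
{
	return snprintf(buf, len, "/proc/%ld/fdinfo/%d/pos", (long)pid, fd);
}

/* Read the current position of descriptor fd in process pid.
 * ASSUMPTION: the file holds one decimal number.
 * Returns 0 on success, -1 if the file is missing or unparsable
 * (for instance on a kernel without the patch). */
static int read_fd_pos(pid_t pid, int fd, long long *pos)
{
	char path[64];
	FILE *f;
	int ok;

	fdinfo_pos_path(path, sizeof(path), pid, fd);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%lld", pos) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

Calling read_fd_pos(pid, 3, &pos) once a second and comparing pos with
the size of the input file would give the kind of progress report the
original question asks for.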

Various other (simpler) approaches are possible:

  a) return file position in st_size if lstat() is called on
     /proc/PID/fd/FD (suggested by lsof FAQ)

  b) list the open files and current positions in a single proc file
     (e.g. /proc/PID/fdpos)

I don't really like a) because it uses the file size information for
something else.  b) does not scale well to a large number of file
descriptors when the user only wants information for a single
descriptor.

The 'fdinfo' approach also has the advantage of being easily
extensible.  For example if the "autonomous mount trees" patch makes
it to the kernel, there would be a need to get mount tree information
for each open file.

The inode number assignment looks like this:

  procfile                  ino
  ------------------------  ----------------------------------
  /proc/PID/fd/FD           (PID << 16) + 0x8000 + FD
  /proc/PID/fdinfo/FD       (PID << 16) + 0xc000 + FD * 8
  /proc/PID/fdinfo/FD/pos   (PID << 16) + 0xc000 + FD * 8 + 1

It is obvious that if a process has a large number of file descriptors
(or just a large maximal fd) then inode numbers will clash.  This is
nothing new: the previous limit of 32768 file descriptors can easily
be exceeded with appropriate limits.  Now this goes down to 2048 fds
before clashes become possible.
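
The 2048 threshold can be checked with a few lines of userspace C.
This only restates the inode assignment above as arithmetic; the
constants are copied from the patch, while the function names are
invented for the example.

```c
#include <stdio.h>

#define PROC_TID_FD_DIR     0x8000UL	/* /proc/PID/fd/FD */
#define PROC_TID_FDINFO_DIR 0xc000UL	/* /proc/PID/fdinfo/FD */
#define PROC_TID_FDINFO_MUL 8UL
#define PROC_FDINFO_POS     1UL		/* /proc/PID/fdinfo/FD/pos */

/* Inode number of /proc/PID/fd/FD */
static unsigned long fd_ino(unsigned long pid, unsigned long fd)
{
	return (pid << 16) + PROC_TID_FD_DIR + fd;
}

/* Inode number of /proc/PID/fdinfo/FD */
static unsigned long fdinfo_ino(unsigned long pid, unsigned long fd)
{
	return (pid << 16) + PROC_TID_FDINFO_DIR + fd * PROC_TID_FDINFO_MUL;
}

/* Inode number of /proc/PID/fdinfo/FD/pos */
static unsigned long fdinfo_pos_ino(unsigned long pid, unsigned long fd)
{
	return fdinfo_ino(pid, fd) + PROC_FDINFO_POS;
}

/* Smallest FD whose fdinfo inode no longer fits in the 16 bits
 * below (PID << 16). */
static unsigned long first_overflowing_fdinfo_fd(void)
{
	return (0x10000UL - PROC_TID_FDINFO_DIR) / PROC_TID_FDINFO_MUL;
}
```

For PID 1, fdinfo_ino(1, 2047) is 0x1fff8 and its pos file is 0x1fff9,
both still below 0x20000.  fdinfo_ino(1, 2048) is exactly 0x20000, so
it has left the 16-bit window that belongs to PID 1.  Likewise
fd_ino(pid, 0x4000) equals fdinfo_ino(pid, 0), so the fd/ range now
runs into fdinfo/ at 16384 descriptors instead of ending at 32768.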

If there is an inode number clash, nothing bad happens; the inums just
won't be unique anymore.  I really can't think of a fundamentally
better way to solve this.  Suggestions welcome.

The patch is against 2.6.12-rc5, but applies to any kernel that is not
very old.

Index: linux/fs/proc/base.c
===
--- linux.orig/fs/proc/base.c   2005-08-19 14:47:36.0 +0200
+++ linux/fs/proc/base.c   2005-08-19 14:47:37.0 +0200
@@ -122,6 +122,7 @@ enum pid_directory_inos {
 #endif
 	PROC_TGID_OOM_SCORE,
 	PROC_TGID_OOM_ADJUST,
+	PROC_TGID_FDINFO,
 	PROC_TID_INO,
 	PROC_TID_STATUS,
 	PROC_TID_MEM,
@@ -160,11 +161,17 @@ enum pid_directory_inos {
 #endif
 	PROC_TID_OOM_SCORE,
 	PROC_TID_OOM_ADJUST,
+	PROC_TID_FDINFO,
 
 	/* Add new entries before this */
-	PROC_TID_FD_DIR = 0x8000,	/* 0x8000-0x */
+	PROC_TID_FD_DIR = 0x8000,	/* /proc/PID/fd/FD */
+	PROC_TID_FDINFO_DIR = 0xc000,	/* /proc/PID/fdinfo/FD */
+
+	PROC_FDINFO_POS = 1,		/* /proc/PID/fdinfo/FD/pos */
 };
 
+#define PROC_TID_FDINFO_MUL 8
+
 struct pid_entry {
 	int type;
 	int len;
@@ -177,6 +184,7 @@ struct pid_entry {
 static struct pid_entry tgid_base_stuff[] = {
 	E(PROC_TGID_TASK,      "task",    S_IFDIR|S_IRUGO|S_IXUGO),
 	E(PROC_TGID_FD,        "fd",      S_IFDIR|S_IRUSR|S_IXUSR),
+	E(PROC_TGID_FDINFO,    "fdinfo",  S_IFDIR|S_IRUSR|S_IXUSR),
 	E(PROC_TGID_ENVIRON,   "environ", S_IFREG|S_IRUSR),
 	E(PROC_TGID_AUXV,      "auxv",    S_IFREG|S_IRUSR),
 	E(PROC_TGID_STATUS,    "status",  S_IFREG|S_IRUGO),
@@ -217,6 +225,7 @@ static

Re: Accessing file-offset info for fds in /proc?

2007-02-20 Thread Dave Kleikamp
On Tue, 2007-02-20 at 02:31 -0500, Hank Leininger wrote:
> Is there anything provided by the kernel that would let you see the
> current offset of an existing filehandle?
> 
> Sometimes when processing a very large file (grepping a log, bzip2'ing
> or gpg'ing a file, or whatever), I'd really like to know how far along
> it is, because I'm impatient.  lsof has an -o flag to show offsets for
> file descriptors it lists, but it appears that's not supported under
> Linux.  It looks like all of the information lsof and fuser print about 
> files in use, etc can be gotten from /proc/*/fd/* (and /proc/*/maps, but
> I'm not really concerned with mmap'ed files, just positions on fds).
> Sometimes I'll resort to strace -s4096'ing the process to see what chunk
> of text it's currently reading, and try to guess from that.  Silly.
> 
> Has anybody ever developed a patch to implement this?  I realize this
> could create a variety of information-leakage problems; the information
> probably would need to be restricted, such as by the same rules as
> dumpable.  Are there any horribly painful reasons why this couldn't be
> done?

It shouldn't be too painful.  The code to populate /proc/*/fd/ has the
file struct.  It just doesn't have a place to pass the offset to user-space
since it's basically creating a symlink.  In proc_fd_link(), it has the
file struct.  The offset is file->f_pos.

One could create something like /proc/*/fd_offsets, whose read method
could list the file descriptor, path, and offset for each open file.
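
To make the suggested listing concrete, here is a userspace sketch that
prints descriptor, path, and offset for the calling process only.  It
walks /proc/self/fd and uses lseek(fd, 0, SEEK_CUR), which works only
on a process's own descriptors; a real fd_offsets file would have to be
generated by the kernel from file->f_pos.  The tab-separated layout and
the function names are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Current offset of one of our own descriptors, or -1 if the
 * descriptor is not seekable (pipe, tty, socket, ...). */
static long long current_offset(int fd)
{
	off_t pos = lseek(fd, 0, SEEK_CUR);

	return pos == (off_t)-1 ? -1 : (long long)pos;
}

/* Print "fd<TAB>path<TAB>offset" for every descriptor of the calling
 * process.  Returns the number of lines printed, or -1 if
 * /proc/self/fd cannot be opened. */
static int list_own_fd_offsets(FILE *out)
{
	DIR *dir = opendir("/proc/self/fd");
	struct dirent *de;
	int count = 0;

	if (!dir)
		return -1;
	while ((de = readdir(dir)) != NULL) {
		char link[64], path[4096];
		char *end;
		long fd = strtol(de->d_name, &end, 10);
		long long pos;
		ssize_t n;

		if (end == de->d_name || *end != '\0')
			continue;	/* skip "." and ".." */
		if (fd == dirfd(dir))
			continue;	/* skip the fd opendir() itself uses */
		snprintf(link, sizeof(link), "/proc/self/fd/%ld", fd);
		n = readlink(link, path, sizeof(path) - 1);
		if (n < 0)
			continue;
		path[n] = '\0';
		pos = current_offset((int)fd);
		if (pos < 0)
			fprintf(out, "%ld\t%s\t-\n", fd, path);
		else
			fprintf(out, "%ld\t%s\t%lld\n", fd, path, pos);
		count++;
	}
	closedir(dir);
	return count;
}
```

Calling list_own_fd_offsets(stdout) from a test program shows the kind
of output such a file could carry; unseekable descriptors are printed
with "-" in place of an offset.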

Shaggy
-- 
David Kleikamp
IBM Linux Technology Center

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Accessing file-offset info for fds in /proc?

2007-02-20 Thread Hank Leininger
Is there anything provided by the kernel that would let you see the
current offset of an existing filehandle?

Sometimes when processing a very large file (grepping a log, bzip2'ing
or gpg'ing a file, or whatever), I'd really like to know how far along
it is, because I'm impatient.  lsof has an -o flag to show offsets for
file descriptors it lists, but it appears that's not supported under
Linux.  It looks like all of the information lsof and fuser print about 
files in use, etc can be gotten from /proc/*/fd/* (and /proc/*/maps, but
I'm not really concerned with mmap'ed files, just positions on fds).
Sometimes I'll resort to strace -s4096'ing the process to see what chunk
of text it's currently reading, and try to guess from that.  Silly.

Has anybody ever developed a patch to implement this?  I realize this
could create a variety of information-leakage problems; the information
probably would need to be restricted, such as by the same rules as
dumpable.  Are there any horribly painful reasons why this couldn't be
done?

Thanks,

-- 

Hank Leininger <[EMAIL PROTECTED]>
F980 A584 5175 1996 DD7E  C47B 1A71 105C CB44 CBF8

