On Wed, Nov 28, 2007 at 01:56:49PM -0800, Brad Boyer wrote:
>
> I have a few comments. Some of them are inline with the code.
>
> The one generic thing is that the Sun documentation specifically
> says to check pathconf(_PC_MIN_HOLE_SIZE) on the filesystem to see
> if it supports this request. Have you looked into adding this to
> the Linux version of pathconf? I haven't tried to look at the latest
> glibc, but 2.3.2 doesn't appear to have it. If we do have it, this
> request should really go to the kernel to ask the individual
> filesystem. However, it looks like pathconf is entirely implemented
> in glibc. Should we add a system call or some other way to pass
> pathconf requests into the kernel? I know pathconf currently returns
> some inaccurate results in some cases due to this limitation. There
> are some existing ones like _PC_LINK_MAX that glibc tries to guess
> the correct result based on statfs and can get wrong in any
> non-standard setup.
>
Since it appears pathconf doesn't actually interface with the kernel I think
that its not really worth chasing down. I don't think every fs maintainer is
really going to want to chase down a glibc person in order to change what
pathconf returns for their particular fs.
>
> On Wed, Nov 28, 2007 at 03:02:07PM -0500, Josef Bacik wrote:
> > diff --git a/fs/read_write.c b/fs/read_write.c
> > index ea1f94c..cf61e1e 100644
> > --- a/fs/read_write.c
> > +++ b/fs/read_write.c
> > @@ -31,10 +31,32 @@ const struct file_operations generic_ro_fops = {
> >
> > EXPORT_SYMBOL(generic_ro_fops);
> >
> > +static loff_t generic_seek_hole_data(struct file *file, loff_t offset,
> > + int origin)
> > +{
> > + loff_t retval = -ENXIO;
> > + struct inode *inode = file->f_mapping->host;
> > + sector_t block, found_block;
> > + sector_t last_block = (i_size_read(inode) - 1) >> inode->i_blkbits;
> > + int want = (origin == SEEK_HOLE) ? 0 : 1;
> > +
> > + for (block = offset >> inode->i_blkbits; block <= last_block; block++) {
> > + found_block = bmap(inode, block);
> > +
> > + if (!!found_block == want) {
> > + retval = block << inode->i_blkbits;
> > + break;
> > + }
> > + }
> > +
> > + return retval;
> > +}
> > +
>
> It looks like this function can return an offset that is before the
> one passed in as an argument due to losing the lower bits. I think
> it needs to do somthing more like this in the loop initializer:
>
> block = (offset + (1 << inode->i_blkbits) - 1) >> inode->i_blkbits;
>
> By adding 1 block if it wasn't already on an even block, this assures
> us that it won't go backwards from the original offset while allowing
> someone to get a block that really does start exactly on the offset.
>
> > loff_t generic_file_llseek(struct file *file, loff_t offset, int origin)
> > {
> > long long retval;
> > struct inode *inode = file->f_mapping->host;
> > + loff_t (*fn)(struct file *, loff_t, int);
> >
> > mutex_lock(&inode->i_mutex);
> > switch (origin) {
> > @@ -43,15 +65,24 @@ loff_t generic_file_llseek(struct file *file, loff_t
> > offset, int origin)
> > break;
> > case SEEK_CUR:
> > offset += file->f_pos;
> > + break;
> > + case SEEK_HOLE:
> > + case SEEK_DATA:
> > + fn = generic_seek_hole_data;
> > + if (file->f_op->seek_hole_data)
> > + fn = file->f_op->seek_hole_data;
> > + offset = fn(file, offset, origin);
> > }
> > retval = -EINVAL;
> > if (offset>=0 && offset<=inode->i_sb->s_maxbytes) {
> > - if (offset != file->f_pos) {
> > + if (offset != file->f_pos && origin != SEEK_HOLE) {
>
> Why is SEEK_HOLE special in this case? The description of SEEK_HOLE
> and SEEK_DATA in the Sun document would imply to me that they should
> be handled the same.
>
This was my interpretation of the man page, however Joern looked at what solaris
does and it seems that SEEK_HOLE does update f_pos, so I will change this.
> > file->f_pos = offset;
> > file->f_version = 0;
> > }
> > retval = offset;
> > - }
> > + } else if (offset < 0)
> > + retval = offset;
> > +
> > mutex_unlock(&inode->i_mutex);
> > return retval;
Thanks much for your comments,
Josef
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html