Re: [RFC] [PATCH 1/1] Nanosecond timestamps

2007-02-06 Thread Johann Lombardi
On Fri, Feb 02, 2007 at 08:19:50PM +0530, Kalpak Shah wrote:
> Index: linux-2.6.19/fs/ext3/super.c
> ===
> --- linux-2.6.19.orig/fs/ext3/super.c
> +++ linux-2.6.19/fs/ext3/super.c
> @@ -1770,6 +1772,32 @@ static int ext3_fill_super (struct super
>  	}
>  
>  	ext3_setup_super (sb, es, sb->s_flags & MS_RDONLY);
> +
> +	/* determine the minimum size of new large inodes, if present */
> +	if (sbi->s_inode_size > EXT3_GOOD_OLD_INODE_SIZE) {
> +		EXT3_SB(sb)->s_want_extra_isize = sizeof(struct ext3_inode) - EXT3_GOOD_OLD_INODE_SIZE;

Maybe EXT3_SB(sb)-> could be replaced by sbi-> here and in the lines below.

> +		if (EXT3_HAS_RO_COMPAT_FEATURE(sb,
> +				EXT3_FEATURE_RO_COMPAT_EXTRA_ISIZE)) {
> +			if (EXT3_SB(sb)->s_want_extra_isize <
> +					le32_to_cpu(es->s_want_extra_isize))
   					^^
> +				EXT3_SB(sb)->s_want_extra_isize =
> +					le32_to_cpu(es->s_want_extra_isize);
   					^^
> +			if (EXT3_SB(sb)->s_want_extra_isize <
> +					le32_to_cpu(es->s_min_extra_isize))
   					^^
> +				EXT3_SB(sb)->s_want_extra_isize =
> +					le32_to_cpu(es->s_min_extra_isize);
   					^^
Since es->s_{min,want}_extra_isize are both __u16 (BTW, shouldn't it be
__le16?), I think you should use le16_to_cpu() instead of le32_to_cpu().
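
To illustrate why the conversion width matters, here is a minimal userspace
sketch. It is not kernel code: swab16(), swab32(), raw_on_big_endian() and
the two convert_* helpers are hypothetical stand-ins that model what
le16_to_cpu() and le32_to_cpu() do on a big-endian host, where both macros
swap bytes. On a little-endian host both are no-ops, which is why a wrong
width can go unnoticed there.

```c
#include <stdint.h>

/* Unconditional byte swaps, as a big-endian host performs them. */
static uint16_t swab16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

static uint32_t swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/* Raw value a big-endian CPU loads from a little-endian 16-bit disk field. */
static uint16_t raw_on_big_endian(uint16_t disk_value)
{
	return swab16(disk_value);
}

/* Correct width: swap the two bytes of the 16-bit field. */
static uint32_t convert_with_le16(uint16_t raw)
{
	return swab16(raw);
}

/* Wrong width: the field is promoted to 32 bits and all four bytes swap. */
static uint32_t convert_with_le32(uint16_t raw)
{
	return swab32((uint32_t)raw);
}
```

For an on-disk value of 28, the 16-bit conversion yields 28 again, while the
32-bit conversion yields 0x001C0000 on the modelled big-endian host.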

> +		}
> +	}
> +	/* Check if enough inode space is available */
> +	if (EXT3_GOOD_OLD_INODE_SIZE + EXT3_SB(sb)->s_want_extra_isize >
> +			sbi->s_inode_size) {
> +		EXT3_SB(sb)->s_want_extra_isize = sizeof(struct ext3_inode) - EXT3_GOOD_OLD_INODE_SIZE;
> +		printk(KERN_INFO "EXT3-fs: required extra inode space not"
> +			"available.\n");
> +	}

If the inode size is EXT3_GOOD_OLD_INODE_SIZE, sbi->s_want_extra_isize won't be
initialized. However, it should not be an issue because the ext3_sb_info
is set to zero in ext3_fill_super().
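
The selection logic in the hunk can be sketched as a pure function. This is
a userspace illustration only, assuming the comparisons in the hunk are `>`
for the two size checks and `<` for the two clamps. The function name
pick_want_extra_isize() and its parameters are hypothetical;
struct_inode_size stands in for sizeof(struct ext3_inode), whose value
depends on the patch and is not assumed here.

```c
#include <assert.h>
#include <stdint.h>

#define GOOD_OLD_INODE_SIZE 128u	/* value of EXT3_GOOD_OLD_INODE_SIZE */

static uint16_t pick_want_extra_isize(uint16_t inode_size,
				      uint16_t struct_inode_size,
				      int has_extra_isize_feature,
				      uint16_t sb_want, uint16_t sb_min)
{
	/* ext3_sb_info is zeroed, so 0 is the value seen if nothing is set. */
	uint16_t want = 0;

	assert(struct_inode_size >= GOOD_OLD_INODE_SIZE);

	if (inode_size > GOOD_OLD_INODE_SIZE) {
		want = (uint16_t)(struct_inode_size - GOOD_OLD_INODE_SIZE);
		if (has_extra_isize_feature) {
			/* Raise to the larger of the superblock hints. */
			if (want < sb_want)
				want = sb_want;
			if (want < sb_min)
				want = sb_min;
		}
	}
	/* Fall back if the inode cannot hold the requested extra space. */
	if (GOOD_OLD_INODE_SIZE + want > inode_size)
		want = (uint16_t)(struct_inode_size - GOOD_OLD_INODE_SIZE);
	return want;
}
```

With a 128-byte inode the first branch is skipped and the final check does
not fire, so the result stays 0, which matches the observation above.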

Johann
-
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC][PATCH 2/3] Move the file data to the new blocks

2007-02-06 Thread Andrew Morton
On Tue, 16 Jan 2007 21:05:20 +0900
[EMAIL PROTECTED] wrote:

> Move the blocks on the temporary inode to the original inode
> by a page.
> 1. Read the file data from the old blocks to the page
> 2. Move the block on the temporary inode to the original inode
> 3. Write the file data on the page into the new blocks
> 
> ...

> +
> +/**
> + * ext4_ext_replace_branches - replace original extents with new extents.
> + * @org_inode	Original inode
> + * @dest_inode	temporary inode
> + * @from_page	Page offset
> + * @count_page	Page count to be replaced
> + * @delete_start	block offset for deletion
> + *
> + * This function returns 0 if succeed, otherwise returns error value.
> + * Replace extents for blocks from from to from+count-1.
> + */
> +static int
> +ext4_ext_replace_branches(struct inode *org_inode, struct inode *dest_inode,
> +		pgoff_t from_page,  pgoff_t dest_from_page,
> +		pgoff_t count_page, unsigned long *delete_start)
> +{
> +	struct ext4_ext_path *org_path = NULL;
> +	struct ext4_ext_path *dest_path = NULL;
> +	struct ext4_extent   *oext, *dext;
> +	struct ext4_extent   tmp_ext;
> +	int	err = 0;
> +	int	depth;
> +	unsigned long from, count, dest_off, diff, replaced_count = 0;

These should be sector_t, shouldn't they?

> +	handle_t *handle = NULL;
> +	unsigned jnum;
> +
> +	from = from_page << (PAGE_CACHE_SHIFT - dest_inode->i_blkbits);

In which case one needs to be very careful to avoid overflows in
expressions such as this one.
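
A minimal userspace sketch of the overflow hazard follows. It assumes a
32-bit unsigned long, modelled here with uint32_t; the helper names
page_to_block_32() and page_to_block_64() are hypothetical, and the shift
argument stands for PAGE_CACHE_SHIFT - i_blkbits.

```c
#include <stdint.h>

/* Shift performed in 32 bits: high bits of the result are silently lost. */
static uint32_t page_to_block_32(uint32_t page_index, unsigned int shift)
{
	return page_index << shift;
}

/* Widen to 64 bits first, then shift: the full block number survives. */
static uint64_t page_to_block_64(uint32_t page_index, unsigned int shift)
{
	return (uint64_t)page_index << shift;
}
```

With a shift of 2 and a page index of 0x40000000, the 32-bit version returns
0 while the widened version returns 0x100000000. Storing the result in a
64-bit variable does not help if the shift itself is still evaluated in 32
bits.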

> +	wait_on_page_locked(page);
> +	lock_page(page);

The wait_on_page_locked() is unneeded here.


Re: [RFC][PATCH 2/3] Move the file data to the new blocks

2007-02-06 Thread Andrew Morton
On Mon, 5 Feb 2007 14:12:04 +0100
Jan Kara [EMAIL PROTECTED] wrote:

> > Move the blocks on the temporary inode to the original inode
> > by a page.
> > 1. Read the file data from the old blocks to the page
> > 2. Move the block on the temporary inode to the original inode
> > 3. Write the file data on the page into the new blocks
>   I have one thing - it's probably not good to use page cache for
> defragmentation.

Then it is no longer online defragmentation.  The issues with maintaining
correctness and coherency with ongoing VFS activity would be truly ghastly.

If we're worried about pagecache pollution then it would be better to control
that from userspace via fadvise().
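
As a rough illustration of the userspace approach, a defragmentation tool
could ask the kernel to drop a file's cached pages once it is done with the
file. This is a sketch under assumptions: the helper name
drop_cached_pages() is hypothetical, and the thread does not say how a real
tool would use the call. posix_fadvise() with POSIX_FADV_DONTNEED is only a
hint, and dirty pages are not dropped, hence the fdatasync() first.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Flush dirty data for fd, then hint that its cached pages are no longer
 * needed. Returns 0 on success or a positive errno value on failure.
 */
static int drop_cached_pages(int fd)
{
	if (fdatasync(fd) != 0)
		return errno;
	/* offset 0, len 0 means "the whole file"; returns an errno value. */
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```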


Re: [PATCH 0/1][RFC] mm: prepare_write positive return value

2007-02-06 Thread Andrew Morton
On Tue, 06 Feb 2007 11:33:46 +0300
Dmitriy Monakhov [EMAIL PROTECTED] wrote:

> Almost all read/write operations handle data in chunks (segments or pages),
> and the result has integral behaviour for the following scenario:
> for_each_chunk() {
> 	res = op();
> 	if (IS_ERROR(res))
> 		return progress ? progress : res;
> 	progress += res;
> }
> prepare_write may have integral behaviour in the case of blksize < pgsize:
> for example, we successfully allocated/read some blocks, but not all of them,
> and then some error happened. Currently we eliminate this progress by doing
> vmtruncate() after prepare_write has failed.
> It is good to have the ability to signal this progress. Interpret a positive
> prepare_write() return code as the number of bytes the fs is ready to handle
> at this moment. I've asked SAW, he thinks it is sane. This size is always
> less than PAGE_CACHE_SIZE, so it is less than AOP_TRUNCATED_PAGE too.
> 
> BTW: This approach was used in the OpenVZ 2.6.9 kernel in order to make FS
> with delayed allocation more correct, and it works well.
> I think not everybody will be happy about this, but let's discuss all the
> advantages and disadvantages of this change.

That seems to be a logical change.  We'd need to review all the callers and
callees to make sure that they handle this change correctly.
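
The "return progress if any, otherwise the error" pattern from the quoted
proposal can be sketched in plain C. This is a userspace illustration, not
the proposed kernel change: chunk_op_t, for_each_chunk() and op_fail_at()
are hypothetical names, and the 512-byte chunk size is arbitrary.

```c
#include <errno.h>
#include <stddef.h>

/* Per-chunk operation: bytes handled (> 0) or a negative errno value. */
typedef long (*chunk_op_t)(size_t index, void *ctx);

/* Run op on each chunk; on error, report partial progress if there is any. */
static long for_each_chunk(size_t nchunks, chunk_op_t op, void *ctx)
{
	long progress = 0;

	for (size_t i = 0; i < nchunks; i++) {
		long res = op(i, ctx);

		if (res < 0)
			return progress ? progress : res;
		progress += res;
	}
	return progress;
}

/* Example op: 512 bytes per chunk, -EIO from index *(size_t *)ctx onward. */
static long op_fail_at(size_t index, void *ctx)
{
	const size_t *fail_at = ctx;

	return index >= *fail_at ? -EIO : 512;
}
```

A caller that only tests for a negative result would treat a short positive
count as full success, which is the kind of case a review of callers would
need to catch.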

Your changes deviate quite a lot from standard kernel coding style.  Please fix
that.

Please cc linux-ext4@vger.kernel.org on the next version of these patches.  I'm
seriously running out of bandwidth over here and ext4 has other maintainers.

Thanks.


Testing ext4 persistent preallocation patches for 64 bit features

2007-02-06 Thread Amit K. Arora
I plan to test the persistent preallocation patches on a huge sparse
device, to know if >32 bit physical block numbers (up to 48 bit) behave as
expected. I have the following questions and would appreciate
suggestions:

a) What sparse device size should I use for testing?
Would a size of > 8TB (say, 100 TB) be enough?
The physical device (backing store device) I can have is up to 70GB.

b) How do I test allocation of >32 bit physical block numbers? I cannot
fill > 8TB, since the physical storage available to me is just
70GB.

c) Do I need to put some hack in the filesystem code for the above (to
allocate >32 bit physical block numbers)?
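
For sizing the sparse device, the arithmetic can be checked with a small
sketch. The helper names are hypothetical and the 4096-byte block size in
the note below is an assumption, not something stated in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Physical block number that holds a given byte offset on the device. */
static uint64_t block_of_offset(uint64_t byte_offset, uint32_t block_size)
{
	assert(block_size != 0);
	return byte_offset / block_size;
}

/* True if the block number does not fit in 32 bits. */
static int needs_more_than_32_bits(uint64_t block)
{
	return block > UINT32_MAX;
}

/* Smallest device size in bytes that contains at least one such block. */
static uint64_t min_device_bytes(uint32_t block_size)
{
	return ((UINT64_C(1) << 32) + 1) * (uint64_t)block_size;
}
```

With 4096-byte blocks, an offset of 8 TiB maps to block 2^31, which still
fits in 32 bits; block numbers exceed 32 bits only past 16 TiB, so a 100 TB
sparse device would cover that range.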

Any further ideas on how to test this will help. Thanks!

--
Regards,
Amit Arora