Re: Ext4 devel interlock meeting minutes (April 23, 2007)

2007-05-01 Thread Kalpak Shah
On Mon, 2007-04-30 at 16:36 +0530, Aneesh Kumar wrote:
 On 4/24/07, Avantika Mathur [EMAIL PROTECTED] wrote:
  Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes
 
  TESTING
  - extents testing
  - Discussed methods for testing extents on highly fragmented
  filesystems.
  - Jose will look into possible tests, including perhaps using the
  'aged' option in FFSB
  - Ted suggested creating a mount option that creates a bad block
  allocator, which jumps to a new block group every 8 blocks.  This
  would force a very large number of extents, and may be a good test for
  extents.
 
 
 What I am doing to create a large number of extents is:
 
 dd if=/dev/zero of=myfile count=10
 seek=20
 while [ 1 ]; do
         dd if=/dev/zero of=myfile count=10 seek=$seek
         seek=`expr $seek + 20`
 done
 
 

I have written a simple tool, bitmap_manip, with which you can
manipulate the number of free chunks and their sizes in a filesystem. It
uses libext2fs to set bits in the block bitmaps, thereby leaving the
desired free extents. I wrote it to test the allocator's performance.

It can be used as:
 ./bitmap_manip /dev/sda9 1MA 4 16K 1 12K 3 8K 4 4K 6
 
This will leave only one 16K chunk, three 12K chunks, and so on, free in
the filesystem. '1MA 4' will get us four free 1MB ALIGNED chunks.
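
If it helps, the resulting free-space layout can be cross-checked with
dumpe2fs from e2fsprogs, which prints the free block ranges per group
(device name as in the example above):

# show the per-group free block ranges left behind by bitmap_manip
dumpe2fs /dev/sda9 | grep "Free blocks:"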

It isn't very beautiful code since it was only used for testing, but
maybe it can help.

Thanks,
Kalpak.

 -aneesh
/* Manipulate block bitmap directly for mballoc testing */

/* USAGE:
 * ./bitmap_manip /dev/volmballoc/test 16K 1 12K 3 8K 4 4K 6
 * This will leave 1 16K chunk, 3 12K chunks, etc. in the filesystem specified.
 * Ideally give the inputs in ascending order. 
 * 1MA 4 will get us 4 1Mb ALIGNED chunks.	
 */

#include <stdio.h>
#include <ext2fs/ext2fs.h>
#include <ext2fs/ext2_types.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#define ONE_MB (1024 * 1024)
#define ONE_KB 1024

#define SETTING 0
#define FREEING 1

#define NO_ALIGN 0
#define ALIGN 1

struct chunk_arg {
	int chunk_size;
	int num_chunks;
	int align;
};

int main(int argc, char **argv)
{
	ext2_filsys fs;
	ext2fs_block_bitmap *map = NULL;
	int bg_num = 0, retval, arg_num, multiply, chunk_num;
	int i, start_blk, set_bit, test_bit, j;
	struct chunk_arg chunk[50];
	int free_blocks_req = 0, free_blocks_avail, num_of_chunks_req = 0, group;
	char str[10];
	float orig_avail_req, avail_req;
	int set_till_now, free_till_now, num_blks_to_set, num_blks_to_free, phase;
	int  current, align_flag = 0, align = 0, curr = 0;

	if (argc < 2) {
		printf("Please give name of a filesystem. Exiting...\n");
		return -1;
	}

	/* Even from user's perspective */
	if (argc & 0x01) {
		printf("This utility cannot have even number of arguments.\n");
		return -1;
	}

	if ((retval = ext2fs_open(argv[1], EXT2_FLAG_RW, 0, 0, unix_io_manager, &fs))) {
		com_err("ext2fs open:", retval, "while opening %s\n", argv[1]);
		return retval;
	}

	srand(1234567);
	chunk_num = 0;
	for (arg_num = 2; arg_num < argc; arg_num += 2, chunk_num++) {
		strcpy(str, argv[arg_num]);

		/* Check if we have to align */
		if (toupper(str[strlen(str) - 1 ]) == 'A') {
			chunk[chunk_num].align = ALIGN;
			str[strlen(str) - 1] = '\0';	
			align = 1;
		}
		else
			chunk[chunk_num].align = NO_ALIGN;
		if (toupper(str[strlen(str) - 1]) == 'K')
			multiply = ONE_KB;
		else if (toupper(str[strlen(str) - 1]) == 'M')
			multiply = ONE_MB;

		str[strlen(str) - 1] = '\0';
		/* Convert the requested chunk size into filesystem blocks */
		chunk[chunk_num].chunk_size = (strtod(str, NULL) * multiply) / fs->blocksize;
		chunk[chunk_num].num_chunks = strtod(argv[arg_num + 1], NULL);
			
		free_blocks_req += chunk[chunk_num].chunk_size * chunk[chunk_num].num_chunks;
		num_of_chunks_req += chunk[chunk_num].num_chunks;
	}

	ext2fs_read_block_bitmap(fs);
	map = &fs->block_map;

	start_blk = fs->super->s_first_data_block;
	free_blocks_avail = fs->super->s_free_blocks_count;

	orig_avail_req = (float)free_blocks_avail / free_blocks_req;
	current = 0;
	i = start_blk;
	
	num_blks_to_set = (orig_avail_req / 4) * chunk[current].chunk_size;
	num_blks_to_free = chunk[current].chunk_size;
	phase = SETTING;
	do {
		test_bit = i;
		if (!ext2fs_fast_test_block_bitmap(*map, test_bit)) {
			if (phase == SETTING) {
				if (chunk[current].align == ALIGN && chunk[current].num_chunks > 0) {
					if (align_flag == 0) {
						/* Mark blocks up to the next aligned chunk boundary */
						num_blks_to_set = (i / chunk[current].chunk_size + 1) *
							chunk[current].chunk_size - i;
						align_flag = 1;
					} else if (i % chunk[current].chunk_size == 0) {
						num_blks_to_set = 0;
						phase = FREEING;
					}
				}
				set_bit = i;
				ext2fs_mark_block_bitmap(*map, set_bit);
				group = (set_bit - fs->super->s_first_data_block) /
					fs->super->s_blocks_per_group;
				fs->group_desc[group].bg_free_blocks_count--;
				fs->super->s_free_blocks_count--;
				num_blks_to_set--;
				if (num_blks_to_set == 0) {
					phase = FREEING;

/* [ ... the rest of the program is truncated in the list archive ... ] */

Re: Ext4 devel interlock meeting minutes (April 23, 2007)

2007-04-30 Thread Aneesh Kumar

On 4/24/07, Avantika Mathur [EMAIL PROTECTED] wrote:

Ext4 Developer Interlock Call: 04/23/2007 Meeting Minutes

TESTING
- extents testing
- Discussed methods for testing extents on highly fragmented
filesystems.
- Jose will look into possible tests, including perhaps using the
'aged' option in FFSB
- Ted suggested creating a mount option that creates a bad block
allocator, which jumps to a new block group every 8 blocks.  This
would force a very large number of extents, and may be a good test for
extents.



What I am doing to create a large number of extents is:

dd if=/dev/zero of=myfile count=10
seek=20
while [ 1 ]; do
        dd if=/dev/zero of=myfile count=10 seek=$seek
        seek=`expr $seek + 20`
done
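
The number of extents this ends up creating can be checked with
filefrag from e2fsprogs, for example:

# list each extent of the test file and print the total extent count
filefrag -v myfile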


-aneesh


Re: Ext4 devel interlock meeting minutes (April 23, 2007)

2007-04-30 Thread Alex Tomas

Aneesh Kumar wrote:

What I am doing to create a large number of extents is:

dd if=/dev/zero of=myfile count=10
seek=20
while [ 1 ]; do
        dd if=/dev/zero of=myfile count=10 seek=$seek
        seek=`expr $seek + 20`
done


With AGGRESSIVE_TEST defined in include/linux/ext4_fs_extents.h you may
get many more extents and index blocks.
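
To see exactly what the define caps, one can simply grep for it in the
header mentioned above (path as in the kernel tree of that era):

# show the AGGRESSIVE_TEST hooks in the extents header
grep -n AGGRESSIVE_TEST include/linux/ext4_fs_extents.h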





Re: Ext4 devel interlock meeting minutes (April 23, 2007)

2007-04-24 Thread Alex Tomas

Avantika Mathur wrote:

TESTING
- extents testing
   - Discussed methods for testing extents on highly fragmented 
filesystems.
   - Jose will look into possible tests, including perhaps using the 
'aged' option in FFSB
   - Ted suggested creating a mount option that creates a bad block
allocator, which jumps to a new block group every 8 blocks.  This
would force a very large number of extents, and may be a good test for
extents.


There is an AGGRESSIVE_TEST define which limits the number of entries in the index/leaf blocks.


- Large file deletion
   - Valerie had recently tested large file deletion on ext3/4, but did
not see the performance gain expected with ext4 from the more compact
metadata when using extents.


any details?

thanks, Alex



Re: Ext4 devel interlock meeting minutes (April 23, 2007)

2007-04-24 Thread Valerie Clement

Alex Tomas wrote:


- Large file deletion
   - Valerie had recently tested large file deletion on ext3/4, but
did not see the performance gain expected with ext4 from the more
compact metadata when using extents.


any details?



OK, I found my mistake: there was a typo in my test script, and the
page cache was not flushed between the file creation and the deletion.
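
For reference, a minimal version of such a test (paths and sizes
illustrative; here the cache is flushed via /proc/sys/vm/drop_caches
rather than a remount) looks roughly like:

# create a 100GB file, push it out of the page cache, then time its removal
dd if=/dev/zero of=/mnt/test/bigfile bs=1M count=102400
sync
echo 3 > /proc/sys/vm/drop_caches
time rm /mnt/test/bigfile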


Here are the results I obtained with a 2.6.17-rc7 kernel when deleting a
100GB file:


ext3 : real 2m35.048s   user 0m0.000s   sys 0m6.424s
ext4 : real 0m11.160s   user 0m0.000s   sys 0m5.532s
xfs  : real 0m0.377s    user 0m0.004s   sys 0m0.004s

The performance gain with ext4 is much larger when running a good test...
Sorry for the wrong information,

   Valérie




Re: Ext4 devel interlock meeting minutes (April 23, 2007)

2007-04-24 Thread Alex Tomas

Valerie Clement wrote:
Here are the results I obtained with a 2.6.17-rc7 kernel when deleting a
100GB file:


ext3 : real 2m35.048s   user 0m0.000s   sys 0m6.424s
ext4 : real 0m11.160s   user 0m0.000s   sys 0m5.532s
xfs  : real 0m0.377s    user 0m0.004s   sys 0m0.004s


It would be very interesting to know how much I/O was done to remove the
file, and the actual fragmentation, in all the cases.

thanks, Alex




Re: Ext4 devel interlock meeting minutes (April 23, 2007)

2007-04-24 Thread Eric Sandeen
Avantika Mathur wrote:
 - large filesystem
 - We would like to perform more testing on large (16TB) filesystems
 - Currently, hardware limitations are preventing this testing.  We
 have tested 10TB RAID disks and 16TB loopback devices.  Avantika will
 look into creating very large sparse devices for testing.

I've been hacking up some [EMAIL PROTECTED] testing scripts to use sparse
devicemapper devices which make use of snapshots... loopback files don't
work for testing, at least not hosted on ext[234], because we still
can't do these large file offsets.

(Documentation/device-mapper/zero.txt in the kernel tree describes these
sparse dm devices)
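
Roughly along these lines, with sizes and device names purely for
illustration (a 16TB dm-zero target, with a small real partition as the
snapshot's COW store):

# 16TB of virtual zeros: 16 * 2^40 bytes expressed in 512-byte sectors
echo "0 34359738368 zero" | dmsetup create zerodev
# writable sparse device: a snapshot over the zero target, COW on /dev/sdb1
echo "0 34359738368 snapshot /dev/mapper/zerodev /dev/sdb1 p 128" | dmsetup create sparsedev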

Testing the whole range as a sparse snapshot can be slow, since
devicemapper has to do all the exception handling etc, and I think
essentially creates a fragmented block device.

I've been playing with something like this:

# 90% of the real device size is used for a real 1:1 mapping
# The other 10% is sparsely mapped out to add up to totalsize.
# i.e. -

#  [large sparse-ish device]
#
# +--~  ~-+
# | sparse| real  |
# +--~  ~-+
#
# | SPARSE_SIZE |- REAL_SIZE -|

# is mapped on top of:

#   [real block device]
#  ++
#  | sp |   real|
#  ++

and then marking the sparse range as full (maybe via lazy_bg, or other
methods).  You could then also put a dm-error target under the full
sections so that any IO that may stray there will fail.

This way you can direct the real IO to the 1:1 mapping portion of the
large dm device, and shouldn't get the snapshot slowdowns.
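
A rough sketch of that kind of table (sector counts and /dev/sdb1 purely
illustrative; dm-error under the "full" sparse range, and a 1:1 linear
mapping for the real range):

# first segment: error target, so any stray IO into the "full" range fails fast
# second segment: mapped 1:1 onto the start of the real device
cat > bigdev.table <<EOF
0 30000000000 error
30000000000 4000000000 linear /dev/sdb1 0
EOF
dmsetup create bigdev < bigdev.table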

Anyway, just something I've been playing with...

-eric