Re: [RFC] Ext3 online defrag

2006-11-15 Thread Takashi Sato
Hi Alex, Thank you for your information. I have sent the defragmentation patches for extent-based files on ext3, which use your multi-block allocation patches. I'd be happy if you have time to review my patches. "[RFC][PATCH 0/3] Extent base online defrag" http://marc.theaimsgroup.com/?l

Re: [RFC] Ext3 online defrag

2006-10-27 Thread Alex Tomas
interested, definitely. thanks, Alex > Eric Sandeen (ES) writes: ES> Thanks. XFS recently made similar scalability changes in this area, ES> see the 2006 OLS paper, if you're interested. ES> -Eric

Re: [RFC] Ext3 online defrag

2006-10-27 Thread Eric Sandeen
Alex Tomas wrote: Eric Sandeen (ES) writes: ES> Alex Tomas wrote: >> 3) scalable reservation >> required for delayed allocation to avoid -ENOSPC at flush time. >> current version uses per-sb spinlock. ES> Can you elaborate on this issue? Shouldn't delayed allocation ES> decrement free s

Re: [RFC] Ext3 online defrag

2006-10-27 Thread Alex Tomas
> Eric Sandeen (ES) writes: ES> Alex Tomas wrote: >> 3) scalable reservation >> required for delayed allocation to avoid -ENOSPC at flush time. >> current version uses per-sb spinlock. ES> Can you elaborate on this issue? Shouldn't delayed allocation ES> decrement free space immediatel

Re: [RFC] Ext3 online defrag

2006-10-27 Thread Eric Sandeen
Alex Tomas wrote: 3) scalable reservation required for delayed allocation to avoid -ENOSPC at flush time. current version uses per-sb spinlock. Can you elaborate on this issue? Shouldn't delayed allocation decrement free space immediately, and only the actual block location choice is de
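The reservation idea discussed above (claim free space when data is dirtied, choose the block location later at flush time) can be sketched roughly as follows. This is a userspace model only, not the mballoc code: `struct sb_space`, `reserve_blocks` and `commit_reserved` are hypothetical names, and the per-superblock lock mentioned in the thread is left out for brevity.

```c
#include <errno.h>

/* Hypothetical model of per-superblock space accounting (locking omitted). */
struct sb_space {
    long free_blocks;      /* blocks neither allocated nor reserved */
    long reserved_blocks;  /* promised to dirty data, location not yet chosen */
};

/* At write time: take the space out of the free pool immediately,
 * so that a later flush cannot run into -ENOSPC. */
static int reserve_blocks(struct sb_space *sb, long n)
{
    if (n < 0)
        return -EINVAL;
    if (sb->free_blocks < n)
        return -ENOSPC;
    sb->free_blocks -= n;
    sb->reserved_blocks += n;
    return 0;
}

/* At flush time: the allocator picks real block locations and the
 * reservation is consumed; the free count was already adjusted. */
static int commit_reserved(struct sb_space *sb, long n)
{
    if (n < 0 || sb->reserved_blocks < n)
        return -EINVAL;
    sb->reserved_blocks -= n;
    return 0;
}
```

In this model a write that cannot be backed by free space fails up front, which is the behaviour the question in the thread is asking about; how the real patch makes the counter scale is not shown in the excerpt.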

Re: [RFC] Ext3 online defrag

2006-10-27 Thread Alex Tomas
I've been reworking mballoc with a few new features: 1) in-core preallocation like existing reservation, but can preallocate a few pieces for a file 2) locality groups to maintain groups of related files and flush them together. say, two users are unpacking kernel. with delayed allocation

Re: [RFC] Ext3 online defrag

2006-10-27 Thread sho
Hi, > TT> On Mon, Oct 23, 2006 at 02:27:10PM +0200, Jan Kara wrote: > >> Hello, > >> > >> I've written a simple patch implementing ext3 ioctl for file > >> relocation. Basically you call ioctl on a file, give it list of blocks > >> and it relocates the file into given blocks (provided they are st

Re: [RFC] Ext3 online defrag

2006-10-26 Thread David Chinner
On Thu, Oct 26, 2006 at 01:37:22PM +0200, Jan Kara wrote: > > On Wed, Oct 25, 2006 at 01:00:52PM -0400, Jeff Garzik wrote: > > We don't need to expose anything filesystem specific to userspace to > > implement this. Online data movement (i.e. the defrag mechanism) > > becomes something like: > >

Re: [RFC] Ext3 online defrag

2006-10-26 Thread Jörn Engel
On Wed, 25 October 2006 14:41:18 -0400, Jeff Garzik wrote: > On Wed, Oct 25, 2006 at 08:36:56PM +0200, Jan Kara wrote: > > Yes, but there's a question of the interface to this operation. How to > > specify which indirect block I mean? Obviously we could introduce > > separate call for remapping i

Re: [RFC] Ext3 online defrag

2006-10-26 Thread Dave Kleikamp
On Thu, 2006-10-26 at 09:37 -0400, Theodore Tso wrote: > On Thu, Oct 26, 2006 at 04:36:48PM +1000, David Chinner wrote: > > > > Remember, I'm not just talking about defrag - I'm talking about > > > > an interface that is actually useful to apps that might care > > > > about how data is laid out on

Re: [RFC] Ext3 online defrag

2006-10-26 Thread Theodore Tso
On Thu, Oct 26, 2006 at 04:36:48PM +1000, David Chinner wrote: > > > Remember, I'm not just talking about defrag - I'm talking about > > > an interface that is actually useful to apps that might care > > > about how data is laid out on disk but the applications writers > > > don't know anything abo

Re: [RFC] Ext3 online defrag

2006-10-26 Thread Jan Kara
> On Wed, Oct 25, 2006 at 01:00:52PM -0400, Jeff Garzik wrote: > > On Wed, Oct 25, 2006 at 06:11:37PM +1000, David Chinner wrote: > > > On Wed, Oct 25, 2006 at 02:01:42AM -0400, Jeff Garzik wrote: > > > So how do you then get the generic interface to allocate blocks > > > specified by userspace rac

Re: [RFC] Ext3 online defrag

2006-10-26 Thread Andreas Dilger
On Oct 25, 2006 16:54 +0200, Jan Kara wrote: > I've just not yet decided how to handle indirect > blocks in case of relocation in the middle of the file. Should they be > relocated or shouldn't they? Probably they should be relocated at least > in case they are fully contained in relocated interva

Re: [RFC] Ext3 online defrag

2006-10-25 Thread David Chinner
On Wed, Oct 25, 2006 at 11:33:16PM -0400, Theodore Tso wrote: > On Thu, Oct 26, 2006 at 11:40:20AM +1000, David Chinner wrote: > > We don't need to expose anything filesystem specific to userspace to > > implement this. Online data movement (i.e. the defrag mechanism) > > becomes something like: >

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Theodore Tso
On Thu, Oct 26, 2006 at 11:40:20AM +1000, David Chinner wrote: > We don't need to expose anything filesystem specific to userspace to > implement this. Online data movement (i.e. the defrag mechanism) > becomes something like: > > do { > get_free_list(dst_fd, location, len, li

Re: [RFC] Ext3 online defrag

2006-10-25 Thread David Chinner
On Wed, Oct 25, 2006 at 01:00:52PM -0400, Jeff Garzik wrote: > On Wed, Oct 25, 2006 at 06:11:37PM +1000, David Chinner wrote: > > On Wed, Oct 25, 2006 at 02:01:42AM -0400, Jeff Garzik wrote: > > > On Wed, Oct 25, 2006 at 03:38:23PM +1000, David Chinner wrote: > > > > On Wed, Oct 25, 2006 at 12:48:4

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 08:36:56PM +0200, Jan Kara wrote: > Yes, but there's a question of the interface to this operation. How to > specify which indirect block I mean? Obviously we could introduce > separate call for remapping indirect blocks but I find this solution > kind of clumsy... Agreed

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jan Kara
> On Oct 23, 2006 18:03 +0200, Jan Kara wrote: > > Andreas Dilger wrote: > > > I would in fact go so far as to allow only a single extent to be specified > > > per call. This is to avoid the passing of any pointers as part of the > > > interface (hello ioctl police :-), and also makes the kernel

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 08:25:30PM +0200, Jan Kara wrote: > I see. So you mean that in our ext3meta filesystem we'd have a file > named "add_this_extent_to_inode" and a file "reloc_inode_interval" and > they'd be fed essentially the same info as the current ioctl interface and > do the same thing

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jan Kara
> On Wed, Oct 25, 2006 at 07:58:51PM +0200, Jan Kara wrote: > > I've briefly looked at this and this kind of interface has some > > appeal. On the other hand it's not obvious to me, how to implement in > > this interface *atomic* operation "copy data from file F to given set of > > blocks and rew

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 07:58:51PM +0200, Jan Kara wrote: > I've briefly looked at this and this kind of interface has some > appeal. On the other hand it's not obvious to me, how to implement in > this interface *atomic* operation "copy data from file F to given set of > blocks and rewrite point

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jan Kara
> On Wed, Oct 25, 2006 at 04:54:50PM +0200, Jan Kara wrote: > > Yes, this sounds feasible. We could split the defrag ioctl into two > > pieces (addition of given extent to a file and swapping of extents), which > > can have generic interface... > > An ioctl is UGLY. Agreed. > This was discus

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 04:54:50PM +0200, Jan Kara wrote: > Yes, this sounds feasible. We could split the defrag ioctl into two > pieces (addition of given extent to a file and swapping of extents), which > can have generic interface... An ioctl is UGLY. This was discussed years ago. Google f

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 06:11:37PM +1000, David Chinner wrote: > On Wed, Oct 25, 2006 at 02:01:42AM -0400, Jeff Garzik wrote: > > On Wed, Oct 25, 2006 at 03:38:23PM +1000, David Chinner wrote: > > > On Wed, Oct 25, 2006 at 12:48:44AM -0400, Jeff Garzik wrote: > > > So why are you arguing that an in

Re: [RFC] Ext3 online defrag

2006-10-25 Thread Jan Kara
> On Oct 24, 2006 15:44 -0400, Theodore Tso wrote: > > First of all, we would need a way of allowing userspace to specify > > which blocks should be used in the preallocation. > > Presumably it could do this in the same way it will be specifying > which blocks to relocate in the defragger - by pa

Re: [RFC] Ext3 online defrag

2006-10-25 Thread David Chinner
On Wed, Oct 25, 2006 at 02:01:42AM -0400, Jeff Garzik wrote: > On Wed, Oct 25, 2006 at 03:38:23PM +1000, David Chinner wrote: > > On Wed, Oct 25, 2006 at 12:48:44AM -0400, Jeff Garzik wrote: > > So why are you arguing that an interface is no good because it > > is fundamentally racy? ;) > > My poi

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 03:38:23PM +1000, David Chinner wrote: > On Wed, Oct 25, 2006 at 12:48:44AM -0400, Jeff Garzik wrote: > > On Wed, Oct 25, 2006 at 02:27:53PM +1000, David Chinner wrote: > > > But it is a race that is _easily_ handled, and applications only need to > > > implement one interface,

Re: [RFC] Ext3 online defrag

2006-10-24 Thread David Chinner
On Wed, Oct 25, 2006 at 12:48:44AM -0400, Jeff Garzik wrote: > On Wed, Oct 25, 2006 at 02:27:53PM +1000, David Chinner wrote: > > But it is a race that is _easily_ handled, and applications only need to > > implement one interface, not a different method for every > > filesystem that requires deep fi

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 02:27:53PM +1000, David Chinner wrote: > But it is a race that is _easily_ handled, and applications only need to > implement one interface, not a different method for every > filesystem that requires deep filesystem knowledge. > > Besides, you still have to handle the case w

Re: [RFC] Ext3 online defrag

2006-10-24 Thread David Chinner
On Tue, Oct 24, 2006 at 10:42:57PM -0400, Jeff Garzik wrote: > On Wed, Oct 25, 2006 at 12:30:02PM +1000, Barry Naujok wrote: > > Could we have a more abstract method for asking the filesystem where the > > free blocks are and then using the same block addressing to tell the > > fs where to allocat

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Jeff Garzik
On Wed, Oct 25, 2006 at 12:30:02PM +1000, Barry Naujok wrote: > Could we have a more abstract method for asking the filesystem where the > free blocks are and then using the same block addressing to tell the > fs where to allocate/move the file's data to? That's fundamentally racy, so you might a

RE: [RFC] Ext3 online defrag

2006-10-24 Thread Barry Naujok
On Wed, 25 Oct 2006 11:19 AM, David Chinner wrote: > On Tue, Oct 24, 2006 at 11:26:26AM -0500, Dave Kleikamp wrote: > > On Wed, 2006-10-25 at 02:01 +1000, David Chinner wrote: > > > On Tue, Oct 24, 2006 at 09:51:41AM -0500, Dave Kleikamp wrote: > > > The allocation interface needs to be able

Re: [RFC] Ext3 online defrag

2006-10-24 Thread David Chinner
On Tue, Oct 24, 2006 at 03:44:16PM -0400, Theodore Tso wrote: > On Tue, Oct 24, 2006 at 11:59:28PM +1000, David Chinner wrote: > > That's the wrong way to look at it. If you want the userspace > > process to specify a location, then you should preallocate it first > > before doing anything else. Th

Re: [RFC] Ext3 online defrag

2006-10-24 Thread David Chinner
On Tue, Oct 24, 2006 at 11:26:26AM -0500, Dave Kleikamp wrote: > On Wed, 2006-10-25 at 02:01 +1000, David Chinner wrote: > > On Tue, Oct 24, 2006 at 09:51:41AM -0500, Dave Kleikamp wrote: > > > On Tue, 2006-10-24 at 23:59 +1000, David Chinner wrote: > > > > That's the wrong way to look at it. If yo

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Andreas Dilger
On Oct 24, 2006 15:44 -0400, Theodore Tso wrote: > First of all, we would need a way of allowing userspace to specify > which blocks should be used in the preallocation. Presumably it could do this in the same way it will be specifying which blocks to relocate in the defragger - by passing an ext

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Russell Cattelan
On Tue, 2006-10-24 at 15:44 -0400, Theodore Tso wrote: > On Tue, Oct 24, 2006 at 11:59:28PM +1000, David Chinner wrote: > > That's the wrong way to look at it. If you want the userspace > > process to specify a location, then you should preallocate it first > > before doing anything else. There is

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Theodore Tso
On Tue, Oct 24, 2006 at 11:59:28PM +1000, David Chinner wrote: > That's the wrong way to look at it. If you want the userspace > process to specify a location, then you should preallocate it first > before doing anything else. There is no need to clutter a simple > data mover interface with all sor

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Dave Kleikamp
On Wed, 2006-10-25 at 02:01 +1000, David Chinner wrote: > On Tue, Oct 24, 2006 at 09:51:41AM -0500, Dave Kleikamp wrote: > > On Tue, 2006-10-24 at 23:59 +1000, David Chinner wrote: > > > That's the wrong way to look at it. If you want the userspace > > > process to specify a location, then you shou

Re: [RFC] Ext3 online defrag

2006-10-24 Thread David Chinner
On Tue, Oct 24, 2006 at 09:51:41AM -0500, Dave Kleikamp wrote: > On Tue, 2006-10-24 at 23:59 +1000, David Chinner wrote: > > On Tue, Oct 24, 2006 at 12:14:33AM -0400, Jeff Garzik wrote: > > > On Mon, Oct 23, 2006 at 06:31:40PM +0400, Alex Tomas wrote: > > > > isn't that a kernel responsibility to fi

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Eric Sandeen
David Chinner wrote: > The allocation interface, OTOH, is anything but simple and is really > a filesystem specific interface. Seems logical to me to separate > the two. And ext[234] preallocation would be a very nice feature in its own right. -Eric

Re: [RFC] Ext3 online defrag

2006-10-24 Thread Dave Kleikamp
On Tue, 2006-10-24 at 23:59 +1000, David Chinner wrote: > On Tue, Oct 24, 2006 at 12:14:33AM -0400, Jeff Garzik wrote: > > On Mon, Oct 23, 2006 at 06:31:40PM +0400, Alex Tomas wrote: > > > isn't that a kernel responsibility to find/allocate target blocks? > > > wouldn't it be better to specify desirabl

Re: [RFC] Ext3 online defrag

2006-10-24 Thread David Chinner
On Tue, Oct 24, 2006 at 12:14:33AM -0400, Jeff Garzik wrote: > On Mon, Oct 23, 2006 at 06:31:40PM +0400, Alex Tomas wrote: > > isn't that a kernel responsibility to find/allocate target blocks? > > wouldn't it be better to specify desirable target group and minimal > > acceptable chunk of free blocks?

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Jeff Garzik
On Mon, Oct 23, 2006 at 06:31:40PM +0400, Alex Tomas wrote: > isn't that a kernel responsibility to find/allocate target blocks? > wouldn't it be better to specify desirable target group and minimal > acceptable chunk of free blocks? The kernel doesn't have enough knowledge to know whether or not the

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Andreas Dilger
On Oct 23, 2006 18:03 +0200, Jan Kara wrote: > Andreas Dilger wrote: > > I would in fact go so far as to allow only a single extent to be specified > > per call. This is to avoid the passing of any pointers as part of the > > interface (hello ioctl police :-), and also makes the kernel code simpl

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Jan Kara
> On Oct 23, 2006 10:16 -0400, Theodore Tso wrote: > > As a suggestion, I would pass the inode number and inode generation > > number into the ext3_file_move_data array: > > > > struct ext3_file_move_data { > > int extents; > > struct ext3_reloc_extent __user *ext_array; > > }; > > > > T

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Andreas Dilger
On Oct 23, 2006 10:16 -0400, Theodore Tso wrote: > As a suggestion, I would pass the inode number and inode generation > number into the ext3_file_move_data array: > > struct ext3_file_move_data { > int extents; > struct ext3_reloc_extent __user *ext_array; > }; > > This will be much
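A rough sketch of what carrying the inode number and generation in the request might look like is given below. The excerpt is cut off before the rationale is stated, so the validation shown here (rejecting a request whose generation no longer matches the inode) is only one plausible use and is an assumption. All names (`struct file_move_data`, `struct reloc_extent`, `struct inode_model`, `check_move_target`) and field layouts are hypothetical, and the `__user` annotation is dropped because this is a userspace model.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent descriptor; the real layout is not shown in the thread. */
struct reloc_extent {
    uint64_t logical;   /* first logical block in the file */
    uint64_t physical;  /* requested target block on disk */
    uint32_t len;       /* number of blocks */
};

/* Request structure extended with inode number and generation (assumed fields). */
struct file_move_data {
    uint32_t ino;
    uint32_t generation;
    int extents;
    struct reloc_extent *ext_array;
};

/* Minimal stand-in for the in-kernel inode. */
struct inode_model {
    uint32_t ino;
    uint32_t generation;
};

/* Reject requests aimed at a different or since-reused inode. */
static int check_move_target(const struct inode_model *inode,
                             const struct file_move_data *req)
{
    if (req->ino != inode->ino || req->generation != inode->generation)
        return -ESTALE;
    if (req->extents <= 0 || req->ext_array == NULL)
        return -EINVAL;
    return 0;
}
```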

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Eric Sandeen
Alex Tomas wrote: Theodore Tso (TT) writes: TT> On Mon, Oct 23, 2006 at 02:27:10PM +0200, Jan Kara wrote: >> Hello, >> >> I've written a simple patch implementing ext3 ioctl for file >> relocation. Basically you call ioctl on a file, give it list of blocks >> and it relocates the file i

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Jan Kara
> On Oct 23, 2006 18:31 +0400, Alex Tomas wrote: > I would make this interface optionally allow the target extent to be > specified, but if target block == 0 then the kernel is free to do its > own allocation. That's a good idea! I'll change the handling so that if block==0 we just allocate bloc
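The "target block == 0 means the kernel chooses" convention proposed above could be modelled along these lines. This is an illustrative sketch, not code from the patch: `struct reloc_request`, `enum reloc_mode` and `reloc_mode_for` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical single-extent relocation request. */
struct reloc_request {
    uint64_t logical_start; /* first logical block to move */
    uint64_t target_block;  /* 0 = let the kernel allocate */
    uint32_t len;           /* number of blocks */
};

enum reloc_mode {
    RELOC_KERNEL_CHOOSES,   /* allocator picks the destination */
    RELOC_USER_TARGET       /* caller named the destination */
};

/* Decide which path a request takes based on the sentinel value. */
static enum reloc_mode reloc_mode_for(const struct reloc_request *req)
{
    return req->target_block == 0 ? RELOC_KERNEL_CHOOSES
                                  : RELOC_USER_TARGET;
}
```

Using 0 as the sentinel relies on block 0 never being a valid destination for file data, which is an assumption of this sketch.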

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Jan Kara
> > Theodore Tso (TT) writes: > > TT> On Mon, Oct 23, 2006 at 02:27:10PM +0200, Jan Kara wrote: > >> Hello, > >> > >> I've written a simple patch implementing ext3 ioctl for file > >> relocation. Basically you call ioctl on a file, give it list of blocks > >> and it relocates the file i

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Andreas Dilger
On Oct 23, 2006 18:31 +0400, Alex Tomas wrote: > isn't that a kernel responsibility to find/allocate target blocks? > wouldn't it be better to specify desirable target group and minimal > acceptable chunk of free blocks? In some cases this is useful (e.g. if file has small fragments after being writt

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Jan Kara
Hello, > > I've written a simple patch implementing ext3 ioctl for file > > relocation. Basically you call ioctl on a file, give it list of blocks > > and it relocates the file into given blocks (provided they are still > > free). The idea is to use it as a kernel part of ext3 online > > defra

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Alex Tomas
> Theodore Tso (TT) writes: TT> On Mon, Oct 23, 2006 at 02:27:10PM +0200, Jan Kara wrote: >> Hello, >> >> I've written a simple patch implementing ext3 ioctl for file >> relocation. Basically you call ioctl on a file, give it list of blocks >> and it relocates the file into given blocks

Re: [RFC] Ext3 online defrag

2006-10-23 Thread Theodore Tso
On Mon, Oct 23, 2006 at 02:27:10PM +0200, Jan Kara wrote: > Hello, > > I've written a simple patch implementing ext3 ioctl for file > relocation. Basically you call ioctl on a file, give it list of blocks > and it relocates the file into given blocks (provided they are still > free). The idea