Re: VFAT: slow fs corruption? [long]

2007-04-18 Thread Benjamin LaHaise
On Wed, Apr 18, 2007 at 07:58:40PM +0200, Albrecht Dreß wrote:
> - Are there known issues with VFAT in 2.6.11 which might lead to the  
> observed problems?  Were they fixed?
> - Is it possible to change the block size in ext2 to 16k (to match the SD  
> card's erase block size)?

Flash cards tend to be rather flaky given that they are low cost consumer 
grade commodities these days.  I would recommend getting a new card first 
and seeing if you can still replicate the problem.  Out of the box, I've 
had to replace 2 of 8 flash cards in the last 6 months when they showed 
similarly eerie data corruption when files disappeared.  Doing an md5sum 
on the device 2 times in a row and getting back different results is Not 
Good.
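
For what it's worth, the read-it-twice check can be sketched in a few lines
of C.  This is only an illustration: it uses a dependency-free FNV-1a hash
in place of md5sum, and the function names (checksum_file, reads_are_stable)
are made up for this example.  It also assumes the second pass really hits
the card; in practice the page cache would have to be bypassed or dropped
between passes for the comparison to mean anything.

```c
#include <stdint.h>
#include <stdio.h>

/* Hash a whole file or device node with 64-bit FNV-1a.
 * FNV-1a stands in for md5sum only to keep the sketch self-contained.
 * Returns 0 on success, -1 on open or read error. */
static int checksum_file(const char *path, uint64_t *out)
{
    unsigned char buf[65536];
    uint64_t h = 0xcbf29ce484222325ULL;     /* FNV-1a offset basis */
    size_t n;
    FILE *f = fopen(path, "rb");

    if (!f)
        return -1;
    while ((n = fread(buf, 1, sizeof(buf), f)) > 0) {
        for (size_t i = 0; i < n; i++) {
            h ^= buf[i];
            h *= 0x100000001b3ULL;          /* FNV-1a prime */
        }
    }
    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    *out = h;
    return 0;
}

/* Read the same path twice in a row and compare the two checksums.
 * Returns 1 if they agree, 0 if they differ, -1 on I/O error. */
static int reads_are_stable(const char *path)
{
    uint64_t first, second;

    if (checksum_file(path, &first) || checksum_file(path, &second))
        return -1;
    return first == second;
}
```

A result of 0 from reads_are_stable() on an otherwise idle device is the
"Not Good" case described above.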

-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] add file position info to proc

2007-03-09 Thread Benjamin LaHaise
On Fri, Mar 09, 2007 at 09:15:06AM -0600, Dave Kleikamp wrote:
> I think this information would be a little easier to access if there
> would be a single file per pid or thread containing something like:
> 
> handle  flags  pos   path
> 0       012    1234  /dev/pts/1
> 1       014    5678  /tmp/output
> etc.

That would not be a good idea, as not all users have the same permissions 
for viewing this information.  It's also quite against the design philosophy 
used elsewhere in /proc.

-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.


Re: [PATCH -mm 0/10][RFC] aio: make struct kiocb private

2007-01-17 Thread Benjamin LaHaise
On Mon, Jan 15, 2007 at 08:25:15PM -0800, Nate Diller wrote:
> the right thing to do from a design perspective.  Hopefully it enables
> a new architecture that can reduce context switches in I/O completion,
> and reduce overhead.  That's the real motive ;)

And it's a broken motive.  Context switches per se are not bad, as they 
make it possible to properly schedule code in a busy system (which is 
*very* important when realtime concerns come into play).  Have a look 
at how things were done in the 2.4 aio code to see how completion would 
get done with a non-retry method, typically in interrupt context.  I had 
code that did direct I/O rather differently by sharing code with the 
read/write code paths at some point, the catch being that it was pretty 
invasive, which meant that it never got merged with the changes to handle 
writeback pressure and other work that happened during 2.5.

That said, you can't make kiocb private without completely removing the 
ability of the rest of the kernel to complete an aio sanely from irq context.  
You need some form of i/o descriptor, and a kiocb is just that.  Adding more 
layering is just going to make things messier and slower for no real gain.

-ben
-- 
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.


Re: aio-stress throughput regressions from 2.6.11 to 2.6.12

2005-07-08 Thread Benjamin LaHaise
On Wed, Jul 06, 2005 at 10:00:56AM +0530, Suparna Bhattacharya wrote:
> Do you see the regression as well, or is it just me ?

Throughput for me on an ICH6 SATA system using O_DIRECT seems to remain 
pretty constant across 2.6.11-something-FC3 to 2.6.13-rc2 around 31MB/s.  
2.6.11 proper hangs for me, though.

-ben
-- 
"Time is what keeps everything from happening all at once." -- John Wheeler


Re: aio-stress throughput regressions from 2.6.11 to 2.6.12

2005-07-05 Thread Benjamin LaHaise
On Tue, Jul 05, 2005 at 10:00:24AM -0400, Chris Mason wrote:
> If it doesn't regress, I would suspect something in the aio core.  My first 
> attempts at the context switch reduction patches caused this kind of 
> regression.  There was too much latency in sending the events up to userland.

AIO will by its very nature have a higher rate of context switches unless 
the subsystem in question does its completion from the interrupt handler.  
There are a few optimizations in this area that we can do to improve things 
for some of the common usage models: when sleeping in read_events() we can 
do iocb runs in the sleeping task instead of switching to the aio helper 
daemon.  That might help for Oracle's usage model of aio, and it should 
also help aio-stress.

There are also other ways of reducing the overhead of the context switches.  
O_DIRECT operations could take note of the mm they are operating on and 
do their get_user_pages() on that mm without the tlb being flushed.

-ben
-- 
"Time is what keeps everything from happening all at once." -- John Wheeler


Re: [RFC] Add support for semaphore-like structure with support for asynchronous I/O

2005-04-15 Thread Benjamin LaHaise
On Fri, Apr 15, 2005 at 03:42:54PM -0700, Trond Myklebust wrote:
> AFAICS You grab the wait_queue_t lock once in down()/__mutex_lock() in
> order to try to take the lock (or queue the waiter if that fails), then
> once more in order to pass the mutex on to the next waiter on
> up()/mutex_unlock(). That is more or less the exact same thing I was
> doing with iosems using bog-standard waitqueues, and which Ben has
> adapted to his mutexes. What am I missing?

I didn't quite see that either.

What about the use of atomic operations on frv?  Are they more lightweight 
than a semaphore, making for a better fastpath?

-ben
-- 
"Time is what keeps everything from happening all at once." -- John Wheeler


Re: [RFC] Add support for semaphore-like structure with support for asynchronous I/O

2005-04-08 Thread Benjamin LaHaise
On Thu, Apr 07, 2005 at 12:43:02PM +0100, Christoph Hellwig wrote:
>  - switch all current semaphore users that don't need counting semaphores
>    over to use a mutex_t type.  For now it can map to struct semaphore.
>  - rip out all existing complicated struct semaphore implementations and
>    replace it with a portable C implementation.  There's not a lot of users
>    anyway.  Add a mutex_t implementation that allows sensible assembly hooks
>    for architectures instead of reimplementing all of it
>  - add more features to mutex_t where necessary

Oh dear, this is going to take a while.  In any case, here is a first 
step toward such a sequence of patches.  Located at 
http://www.kvack.org/~bcrl/patches/mutex-A0/ are the following patches:

00_mutex.diff     - Introduces the basic mutex abstraction on top
                    of the existing semaphore implementation.
01_i_sem.diff     - Converts all users of i_sem to use the mutex
                    abstraction.
10_new_mutex.diff - Replaces the semaphore mutex with a new mutex
                    derived from Trond's iosem patch.  Note that
                    this fixes a serious bug in iosems: see the
                    change in mutex_lock_wake_function that ignores
                    the return value of default_wake_function, as
                    on SMP a process might still be running while
                    we actually made progress.
sem-test.c        - A basic stress tester for the mutex / semaphore.

I'm still not convinced that introducing the mutex type is the best 
approach, especially given the history of the up()/down() implementation.

On the aio side of things, I introduced the owner field in the mutex (as 
opposed to the flag in Trond's iosem) for the next patch in the series to 
enable something like the following api:

	int mutex_lock_aio(struct mutex *lock, struct iocb *iocb);

	...generic_file_read
	{
		ret = mutex_lock_aio(&inode->i_sem, iocb);
		if (ret)
			return ret;	/* mutex_lock_aio can return -EIOCBQUEUED */
		...
		mutex_unlock(&inode->i_sem);
	}

mutex_lock_aio will attempt to take the lock if the iocb is not the owner, 
otherwise it returns immediately (ie ->owner == iocb).  This will allow for 
code paths that support aio to follow a fairly similar coding style to the 
synchronous io path.
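
To make the owner semantics concrete, here is a rough single-threaded
userspace model.  It is a sketch under stated assumptions, not the code
from the patches: struct iocb is a stand-in, the single waiter slot and
the hand-off in mutex_unlock() are guesses at how a retry might be
arranged, and there is no atomicity or real wait queue at all.

```c
#include <errno.h>
#include <stddef.h>

#ifndef EIOCBQUEUED
#define EIOCBQUEUED 529         /* kernel-internal value, not in userspace errno.h */
#endif

struct iocb { int id; };        /* stand-in for the real request descriptor */

struct mutex {
    struct iocb *owner;         /* NULL when the lock is free */
    struct iocb *waiter;        /* assumption: one slot instead of a wait queue */
};

/* Take the lock on behalf of iocb.  Returns 0 if iocb now owns (or already
 * owned) the lock, -EIOCBQUEUED if the request was queued behind the owner. */
static int mutex_lock_aio(struct mutex *lock, struct iocb *iocb)
{
    if (lock->owner == iocb)
        return 0;               /* retry path: ->owner == iocb, return at once */
    if (lock->owner == NULL) {
        lock->owner = iocb;     /* uncontended: take it */
        return 0;
    }
    if (lock->waiter != NULL && lock->waiter != iocb)
        return -EBUSY;          /* limit of this sketch: only one waiter */
    lock->waiter = iocb;        /* contended: queue, caller unwinds */
    return -EIOCBQUEUED;
}

/* Release the lock.  Assumption: ownership is handed straight to the queued
 * iocb, which is returned so the caller can kick its retry. */
static struct iocb *mutex_unlock(struct mutex *lock)
{
    struct iocb *next = lock->waiter;

    lock->waiter = NULL;
    lock->owner = next;
    return next;
}
```

With this model a queued request that gets retried after the hand-off
re-enters the same function, hits the ->owner == iocb case, and carries on
exactly where a synchronous caller would.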

More next week...

-ben
-- 
"Time is what keeps everything from happening all at once." -- John Wheeler


Re: [RFC] Add support for semaphore-like structure with support for asynchronous I/O

2005-04-05 Thread Benjamin LaHaise
On Mon, Apr 04, 2005 at 01:56:35PM -0400, Trond Myklebust wrote:
> IOW: the current semaphore implementations really all need to die, and
> be replaced by a single generic version to which it is actually
> practical to add new functionality.

I can see that goal, but I don't think introducing iosems is the right 
way to achieve it.  Instead (and I'll start tackling this), how about 
factoring out the existing semaphore implementations to use a common 
lib/semaphore.c, much like lib/rwsem.c?  The iosems can be used as a 
basis for the implementation, but we can avoid having to do a giant 
s/semaphore/iosem/g over the kernel tree.

> Failing that, it is _much_ easier to convert the generic code that needs
> to support aio to use a new locking implementation and then test that.
> It is not as if conversion to aio won't involve changes to the code in
> the area surrounding those locks anyway.

Quite true.  There's a lot more work to do in this area, and these common 
primitives are needed to make progress.  Someone at NetApp sent me an 
email yesterday asking about aio support in NFS, so there is some demand 
out there.  Cheers,

-ben
-- 
"Time is what keeps everything from happening all at once." -- John Wheeler