Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Miklos Szeredi
> Miklos Szeredi wrote:
>  This still does not address the situation where a file is 'permanently'
>  mmap'd, does it?
>  
> >>> So?  If application doesn't do msync, then the file times won't be
> >>> updated.  That's allowed by the standard, and so portable applications
> >>> will have to call msync.
> >>>   
> >> It is allowed, but it is clearly not useful behaviour. Nowhere is it set
> >> in stone that we should be implementing just the minimum allowed.
> >> 
> >
> > You're right.  In theory, at least.  But in practice I don't think
> > this matters.  Show me an application that writes to a shared mapping
> > then doesn't call either msync or munmap and doesn't even exit.
> >
> > If there were lot of these apps, then this bug would have been fixed
> > lots of years earlier.  In fact there are _very_ few apps writing to
> > shared mappings at all.
> >
> >   
> 
> Perhaps true, although I know of at least one customer of Red Hat who
> does have an application (or more) than uses mmap'd files and is
> suffering from this lack of appropriate semantics.  They are not
> getting files backed up which need to be.

Yeah, but I bet, that just updating the times on msync and munmap will
cure the problem.  Until you prove me wrong, this is a purely
theoretical argument.

> > Applications should be encouraged to call msync(MS_ASYNC) because:
> >
> >   - it's very fast (basically a no-op) on recent linux kernels
> >
> >   - it's the only portable way to guarantee, that the data you written
> > will _ever_ hit the disk.
> >
> > There's really no downside to using msync(MS_ASYNC) in your
> > application, so making an effort to support applications that don't do
> > this is stupid, IMO.
> 
> It may be a no-op on recent Linux kernels, but I don't think that it
> is a no-op on other systems.

Exactly.  Which is the reason applications _have_ to call msync on
these OSs if they want data to be synchronized with the file.

There are two cases:

 1) app don't cares about it's data being synchronized with the file

 2) app does care

In the first case there's no problem if the times aren't updated,
since the app didn't care.

In the second case the a portable application has to call msync, and
on linux it doesn't even lose any performance from this, so it should
be happy.

You are trying to provide for an app that requires 1) on non-linux and
2) on linux.  Does that make any sense?

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Peter Staubach

Miklos Szeredi wrote:

This still does not address the situation where a file is 'permanently'
mmap'd, does it?


So?  If application doesn't do msync, then the file times won't be
updated.  That's allowed by the standard, and so portable applications
will have to call msync.
  

It is allowed, but it is clearly not useful behaviour. Nowhere is it set
in stone that we should be implementing just the minimum allowed.



You're right.  In theory, at least.  But in practice I don't think
this matters.  Show me an application that writes to a shared mapping
then doesn't call either msync or munmap and doesn't even exit.

If there were lot of these apps, then this bug would have been fixed
lots of years earlier.  In fact there are _very_ few apps writing to
shared mappings at all.

  


Perhaps true, although I know of at least one customer of Red Hat who
does have an application (or more) than uses mmap'd files and is
suffering from this lack of appropriate semantics.  They are not
getting files backed up which need to be.


Applications should be encouraged to call msync(MS_ASYNC) because:

  - it's very fast (basically a no-op) on recent linux kernels

  - it's the only portable way to guarantee, that the data you written
will _ever_ hit the disk.

There's really no downside to using msync(MS_ASYNC) in your
application, so making an effort to support applications that don't do
this is stupid, IMO.


It may be a no-op on recent Linux kernels, but I don't think that it
is a no-op on other systems.

I do not wish to defend applications which don't use msync or some
such, but it seems that the appropriate semantics to support are
more than just depending upon the application to "do the right thing".

   Thanx...

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Miklos Szeredi
> > > This still does not address the situation where a file is 'permanently'
> > > mmap'd, does it?
> > 
> > So?  If application doesn't do msync, then the file times won't be
> > updated.  That's allowed by the standard, and so portable applications
> > will have to call msync.
> 
> It is allowed, but it is clearly not useful behaviour. Nowhere is it set
> in stone that we should be implementing just the minimum allowed.

You're right.  In theory, at least.  But in practice I don't think
this matters.  Show me an application that writes to a shared mapping
then doesn't call either msync or munmap and doesn't even exit.

If there were lot of these apps, then this bug would have been fixed
lots of years earlier.  In fact there are _very_ few apps writing to
shared mappings at all.

Applications should be encouraged to call msync(MS_ASYNC) because:

  - it's very fast (basically a no-op) on recent linux kernels

  - it's the only portable way to guarantee, that the data you written
will _ever_ hit the disk.

There's really no downside to using msync(MS_ASYNC) in your
application, so making an effort to support applications that don't do
this is stupid, IMO.

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Trond Myklebust
On Thu, 2007-02-22 at 21:48 +0100, Miklos Szeredi wrote:
> > This still does not address the situation where a file is 'permanently'
> > mmap'd, does it?
> 
> So?  If application doesn't do msync, then the file times won't be
> updated.  That's allowed by the standard, and so portable applications
> will have to call msync.

It is allowed, but it is clearly not useful behaviour. Nowhere is it set
in stone that we should be implementing just the minimum allowed.

Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Peter Staubach

Miklos Szeredi wrote:

__fput() will be called when there are no more references to 'file',
then it will update the time if the flag is set.  This applies to
regular files as well as devices.

  
  

I suspect that you will find that, for a block device, the wrong inode
gets updated.  That's where the bd_inode_update_time() portion of my
proposed patch came from.



How horrible :( I haven't noticed that part of the patch.  But I don't
think that's needed.  Updating the times through the file pointer
should be OK.  You have this problem because you use the inode which
comes from the blockdev pseudo-filesystem.

  


It was nasty, I certainly agree.  :-)


But I've moved the check from __fput to remove_vma() in the next
revision of the patch, which would give slightly nicer semantics, and
be equally conforming.
  

This still does not address the situation where a file is 'permanently'
mmap'd, does it?



So?  If application doesn't do msync, then the file times won't be
updated.  That's allowed by the standard, and so portable applications
will have to call msync.


Well, there allowable by the specification, and then there is expected and
reasonable.  It seems reasonable to me that the file times should be
updated _sometime_, even if the application does not take proactive action
to cause them to be updated.  Otherwise, it would be easy to end up with a
file, whose contents are updated and reside on stable storage, but whose
mtime never changes.  Part of the motivation behind starting this work was
to address the situation where an application modifies files using mmap
but then backup software would never see the need to backup those files.

It seems to me that if something like sync() causes the file contents to
be written to stable storage, then the file metadata should follow in a
not too distant fashion.

   Thanx...

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Peter Staubach

Miklos Szeredi wrote:

Take this example:

fd = open()
addr = mmap(.., fd)
write(fd, ...)
close(fd)
sleep(100)
msync(addr,...)
munmap(addr)

The file times will be updated in write(), but with your patch, the
bit in the mapping will also be set.

Then in msync() the file times will be updated again, which is wrong,
since the memory was _not_ modified through the mapping.
  

This is correct.  I have updated my proposed patch to include the clearing
of AS_MCTIME in the routine which updates the mtime field.



That doesn't really help.  Look at __generic_file_aio_write_nolock().
file_update_time() is called before the data is written, so after the
last write, there will be nothing to clear the flag.

And even if fixed this case by moving the file_update_time() call to
the end of the function, there's no guarantee, that some filesystem
won't do something exotic and call set_page_dirty() indenpendently of
write().  Good luck auditing all the set_page_dirty() calls ;)


Interesting.

No, I wouldn't want to get into worrying about set_page_dirty() calls.
Life is short and people have too much imagination...  :-)

   Thanx...

  ps

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Miklos Szeredi
> > __fput() will be called when there are no more references to 'file',
> > then it will update the time if the flag is set.  This applies to
> > regular files as well as devices.
> >
> >   
> 
> I suspect that you will find that, for a block device, the wrong inode
> gets updated.  That's where the bd_inode_update_time() portion of my
> proposed patch came from.

How horrible :( I haven't noticed that part of the patch.  But I don't
think that's needed.  Updating the times through the file pointer
should be OK.  You have this problem because you use the inode which
comes from the blockdev pseudo-filesystem.

> 
> > But I've moved the check from __fput to remove_vma() in the next
> > revision of the patch, which would give slightly nicer semantics, and
> > be equally conforming.
> 
> This still does not address the situation where a file is 'permanently'
> mmap'd, does it?

So?  If application doesn't do msync, then the file times won't be
updated.  That's allowed by the standard, and so portable applications
will have to call msync.

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Miklos Szeredi
> > Take this example:
> >
> > fd = open()
> > addr = mmap(.., fd)
> > write(fd, ...)
> > close(fd)
> > sleep(100)
> > msync(addr,...)
> > munmap(addr)
> >
> > The file times will be updated in write(), but with your patch, the
> > bit in the mapping will also be set.
> >
> > Then in msync() the file times will be updated again, which is wrong,
> > since the memory was _not_ modified through the mapping.
> 
> This is correct.  I have updated my proposed patch to include the clearing
> of AS_MCTIME in the routine which updates the mtime field.

That doesn't really help.  Look at __generic_file_aio_write_nolock().
file_update_time() is called before the data is written, so after the
last write, there will be nothing to clear the flag.

And even if fixed this case by moving the file_update_time() call to
the end of the function, there's no guarantee, that some filesystem
won't do something exotic and call set_page_dirty() indenpendently of
write().  Good luck auditing all the set_page_dirty() calls ;)

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Peter Staubach

Miklos Szeredi wrote:

Why is the flag checked in __fput()?



It's because of this bit in the standard:

If there is no such call and if the underlying file is modified
as a result of a write reference, then these fields shall be
marked for update at some time after the write reference.

It could be done in munmap/mremap, but it seemed more difficult to
track down all the places where the vma is removed.  But yes, that may
be a nicer solution.
  

It seems to me that, with this support, a file, which is mmap'd,
modified, but never msync'd or munmap'd, will never get its mtime
updated.  Or did I miss that?

I also don't see how an mmap'd block device will get its mtime
updated either.



__fput() will be called when there are no more references to 'file',
then it will update the time if the flag is set.  This applies to
regular files as well as devices.

  


I suspect that you will find that, for a block device, the wrong inode
gets updated.  That's where the bd_inode_update_time() portion of my
proposed patch came from.


But I've moved the check from __fput to remove_vma() in the next
revision of the patch, which would give slightly nicer semantics, and
be equally conforming.


This still does not address the situation where a file is 'permanently'
mmap'd, does it?

   Thanx...

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Peter Staubach

Miklos Szeredi wrote:

+int set_page_dirty_mapping(struct page *page);
  
  
  
  

This aspect of the design seems intrusive to me.  I didn't see a strong
reason to introduce new versions of many of the routines just to handle
these semantics.  What motivated this part of your design?  Why the new
_mapping versions of routines?




Because there's no way to know inside the set_page_dirty() functions
if the dirtying comes from a memory mapping or from a modification
through a normal write().  And they have different semantics, for
write() the modification times are updated immediately.
  
  

Perhaps I didn't understand what page_mapped() does, but it does seem to
have the right semantics as far as I could see.



The problems will start, when you have a file that is both mapped and
modified with write().  Then the dirying from the write() will set the
flag, and that will have undesirable consequences.
  

I don't think that I quite follow the logic.  The dirtying from write()
will set the flag, but then the mtime will get updated and the flag will
be cleared by the hook in file_update_time().  Right?



Take this example:

fd = open()
addr = mmap(.., fd)
write(fd, ...)
close(fd)
sleep(100)
msync(addr,...)
munmap(addr)

The file times will be updated in write(), but with your patch, the
bit in the mapping will also be set.

Then in msync() the file times will be updated again, which is wrong,
since the memory was _not_ modified through the mapping.


This is correct.  I have updated my proposed patch to include the clearing
of AS_MCTIME in the routine which updates the mtime field.  I haven't
reposted it yet until I complete testing of the new resulting system.  I
anticipate doing this later today.

   Thanx..

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Miklos Szeredi
> > +int set_page_dirty_mapping(struct page *page);
> >   
> >   
> >   
>  This aspect of the design seems intrusive to me.  I didn't see a strong
>  reason to introduce new versions of many of the routines just to handle
>  these semantics.  What motivated this part of your design?  Why the new
>  _mapping versions of routines?
>  
>  
> >>> Because there's no way to know inside the set_page_dirty() functions
> >>> if the dirtying comes from a memory mapping or from a modification
> >>> through a normal write().  And they have different semantics, for
> >>> write() the modification times are updated immediately.
> >>>   
> >> Perhaps I didn't understand what page_mapped() does, but it does seem to
> >> have the right semantics as far as I could see.
> >> 
> >
> > The problems will start, when you have a file that is both mapped and
> > modified with write().  Then the dirying from the write() will set the
> > flag, and that will have undesirable consequences.
> 
> I don't think that I quite follow the logic.  The dirtying from write()
> will set the flag, but then the mtime will get updated and the flag will
> be cleared by the hook in file_update_time().  Right?

Take this example:

fd = open()
addr = mmap(.., fd)
write(fd, ...)
close(fd)
sleep(100)
msync(addr,...)
munmap(addr)

The file times will be updated in write(), but with your patch, the
bit in the mapping will also be set.

Then in msync() the file times will be updated again, which is wrong,
since the memory was _not_ modified through the mapping.

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Miklos Szeredi
> >> Why is the flag checked in __fput()?
> >> 
> >
> > It's because of this bit in the standard:
> >
> > If there is no such call and if the underlying file is modified
> > as a result of a write reference, then these fields shall be
> > marked for update at some time after the write reference.
> >
> > It could be done in munmap/mremap, but it seemed more difficult to
> > track down all the places where the vma is removed.  But yes, that may
> > be a nicer solution.
> 
> It seems to me that, with this support, a file, which is mmap'd,
> modified, but never msync'd or munmap'd, will never get its mtime
> updated.  Or did I miss that?
> 
> I also don't see how an mmap'd block device will get its mtime
> updated either.

__fput() will be called when there are no more references to 'file',
then it will update the time if the flag is set.  This applies to
regular files as well as devices.

But I've moved the check from __fput to remove_vma() in the next
revision of the patch, which would give slightly nicer semantics, and
be equally conforming.

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Peter Staubach

Miklos Szeredi wrote:

Inspired by Peter Staubach's patch and the resulting comments.

  
  
  

An updated version of the original patch was submitted to LKML
yesterday...  :-)



Strange coincidence :)

  
  

file = vma->vm_file;
start = vma->vm_end;
+   mapping_update_time(file);
if ((flags & MS_SYNC) && file &&
(vma->vm_flags & VM_SHARED)) {
get_file(file);
  
  
  

It seems to me that this might lead to file times being updated for
non-MAP_SHARED mappings.



In theory no, because the COW-ed pages become anonymous and are not
part of the original mapping any more.

  
  

I must profess to having a incomplete understanding of all of this
support, but then why would it be necessary to test VM_SHARED at
this point in msync()?



That's basically just an optimization.  If it wasn't there, then data
from a another (shared) mapping could be written back, which is not
wrong, but not required either.

  

I ran into problems early on with file times being updated incorrectly
so I am a little sensitive this aspect.



+int set_page_dirty_mapping(struct page *page);
  
  
  

This aspect of the design seems intrusive to me.  I didn't see a strong
reason to introduce new versions of many of the routines just to handle
these semantics.  What motivated this part of your design?  Why the new
_mapping versions of routines?



Because there's no way to know inside the set_page_dirty() functions
if the dirtying comes from a memory mapping or from a modification
through a normal write().  And they have different semantics, for
write() the modification times are updated immediately.
  

Perhaps I didn't understand what page_mapped() does, but it does seem to
have the right semantics as far as I could see.



The problems will start, when you have a file that is both mapped and
modified with write().  Then the dirying from the write() will set the
flag, and that will have undesirable consequences.


I don't think that I quite follow the logic.  The dirtying from write()
will set the flag, but then the mtime will get updated and the flag will
be cleared by the hook in file_update_time().  Right?

   Thanx...

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-22 Thread Peter Staubach

Miklos Szeredi wrote:

On Wed, 21 Feb 2007 18:51:52 +0100 Miklos Szeredi <[EMAIL PROTECTED]> wrote:



This patch makes writing to shared memory mappings update st_ctime and
st_mtime as defined by SUSv3:

   The st_ctime and st_mtime fields of a file that is mapped with
   MAP_SHARED and PROT_WRITE shall be marked for update at some point
   in the interval between a write reference to the mapped region and
   the next call to msync() with MS_ASYNC or MS_SYNC for that portion
   of the file by any process. If there is no such call and if the
   underlying file is modified as a result of a write reference, then
   these fields shall be marked for update at some time after the
   write reference.

A new address_space flag is introduced: AS_CMTIME.  This is set each
time a page is dirtied through a userspace memory mapping.  This
includes write accesses via get_user_pages().

Note, the flag is set unconditionally, even if the page is already
dirty.  This is important, because the page might have been dirtied
earlier by a non-mmap write.

This flag is checked in msync() and __fput(), and if set, the file
times are updated and the flag is cleared

The flag is also cleared, if the time update is triggered by a normal
write.  This is not mandated by the standard, but seems to be a sane
thing to do.
  

Why is the flag checked in __fput()?



It's because of this bit in the standard:

If there is no such call and if the underlying file is modified
as a result of a write reference, then these fields shall be
marked for update at some time after the write reference.

It could be done in munmap/mremap, but it seemed more difficult to
track down all the places where the vma is removed.  But yes, that may
be a nicer solution.


It seems to me that, with this support, a file, which is mmap'd,
modified, but never msync'd or munmap'd, will never get its mtime
updated.  Or did I miss that?

I also don't see how an mmap'd block device will get its mtime
updated either.

   Thanx...

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Miklos Szeredi
> On Wed, 21 Feb 2007 18:51:52 +0100 Miklos Szeredi <[EMAIL PROTECTED]> wrote:
> 
> > This patch makes writing to shared memory mappings update st_ctime and
> > st_mtime as defined by SUSv3:
> > 
> >The st_ctime and st_mtime fields of a file that is mapped with
> >MAP_SHARED and PROT_WRITE shall be marked for update at some point
> >in the interval between a write reference to the mapped region and
> >the next call to msync() with MS_ASYNC or MS_SYNC for that portion
> >of the file by any process. If there is no such call and if the
> >underlying file is modified as a result of a write reference, then
> >these fields shall be marked for update at some time after the
> >write reference.
> > 
> > A new address_space flag is introduced: AS_CMTIME.  This is set each
> > time a page is dirtied through a userspace memory mapping.  This
> > includes write accesses via get_user_pages().
> > 
> > Note, the flag is set unconditionally, even if the page is already
> > dirty.  This is important, because the page might have been dirtied
> > earlier by a non-mmap write.
> > 
> > This flag is checked in msync() and __fput(), and if set, the file
> > times are updated and the flag is cleared
> > 
> > The flag is also cleared, if the time update is triggered by a normal
> > write.  This is not mandated by the standard, but seems to be a sane
> > thing to do.
> 
> Why is the flag checked in __fput()?

It's because of this bit in the standard:

If there is no such call and if the underlying file is modified
as a result of a write reference, then these fields shall be
marked for update at some time after the write reference.

It could be done in munmap/mremap, but it seemed more difficult to
track down all the places where the vma is removed.  But yes, that may
be a nicer solution.

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Andrew Morton
On Wed, 21 Feb 2007 18:51:52 +0100 Miklos Szeredi <[EMAIL PROTECTED]> wrote:

> This patch makes writing to shared memory mappings update st_ctime and
> st_mtime as defined by SUSv3:
> 
>The st_ctime and st_mtime fields of a file that is mapped with
>MAP_SHARED and PROT_WRITE shall be marked for update at some point
>in the interval between a write reference to the mapped region and
>the next call to msync() with MS_ASYNC or MS_SYNC for that portion
>of the file by any process. If there is no such call and if the
>underlying file is modified as a result of a write reference, then
>these fields shall be marked for update at some time after the
>write reference.
> 
> A new address_space flag is introduced: AS_CMTIME.  This is set each
> time a page is dirtied through a userspace memory mapping.  This
> includes write accesses via get_user_pages().
> 
> Note, the flag is set unconditionally, even if the page is already
> dirty.  This is important, because the page might have been dirtied
> earlier by a non-mmap write.
> 
> This flag is checked in msync() and __fput(), and if set, the file
> times are updated and the flag is cleared
> 
> The flag is also cleared, if the time update is triggered by a normal
> write.  This is not mandated by the standard, but seems to be a sane
> thing to do.

Why is the flag checked in __fput()?
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Miklos Szeredi
> >>> Inspired by Peter Staubach's patch and the resulting comments.
> >>>
> >>>   
> >>>   
> >> An updated version of the original patch was submitted to LKML
> >> yesterday...  :-)
> >> 
> >
> > Strange coincidence :)
> >
> >   
> >>>   file = vma->vm_file;
> >>>   start = vma->vm_end;
> >>> + mapping_update_time(file);
> >>>   if ((flags & MS_SYNC) && file &&
> >>>   (vma->vm_flags & VM_SHARED)) {
> >>>   get_file(file);
> >>>   
> >>>   
> >> It seems to me that this might lead to file times being updated for
> >> non-MAP_SHARED mappings.
> >> 
> >
> > In theory no, because the COW-ed pages become anonymous and are not
> > part of the original mapping any more.
> >
> >   
> 
> I must profess to having a incomplete understanding of all of this
> support, but then why would it be necessary to test VM_SHARED at
> this point in msync()?

That's basically just an optimization.  If it wasn't there, then data
from a another (shared) mapping could be written back, which is not
wrong, but not required either.

> I ran into problems early on with file times being updated incorrectly
> so I am a little sensitive this aspect.
> 
> >>> +int set_page_dirty_mapping(struct page *page);
> >>>   
> >>>   
> >> This aspect of the design seems intrusive to me.  I didn't see a strong
> >> reason to introduce new versions of many of the routines just to handle
> >> these semantics.  What motivated this part of your design?  Why the new
> >> _mapping versions of routines?
> >> 
> >
> > Because there's no way to know inside the set_page_dirty() functions
> > if the dirtying comes from a memory mapping or from a modification
> > through a normal write().  And they have different semantics, for
> > write() the modification times are updated immediately.
> 
> Perhaps I didn't understand what page_mapped() does, but it does seem to
> have the right semantics as far as I could see.

The problems will start, when you have a file that is both mapped and
modified with write().  Then the dirying from the write() will set the
flag, and that will have undesirable consequences.

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Peter Staubach

Miklos Szeredi wrote:

Inspired by Peter Staubach's patch and the resulting comments.

  
  

An updated version of the original patch was submitted to LKML
yesterday...  :-)



Strange coincidence :)

  

file = vma->vm_file;
start = vma->vm_end;
+   mapping_update_time(file);
if ((flags & MS_SYNC) && file &&
(vma->vm_flags & VM_SHARED)) {
get_file(file);
  
  

It seems to me that this might lead to file times being updated for
non-MAP_SHARED mappings.



In theory no, because the COW-ed pages become anonymous and are not
part of the original mapping any more.

  


I must profess to having a incomplete understanding of all of this
support, but then why would it be necessary to test VM_SHARED at
this point in msync()?

I ran into problems early on with file times being updated incorrectly
so I am a little sensitive this aspect.


+int set_page_dirty_mapping(struct page *page);
  
  

This aspect of the design seems intrusive to me.  I didn't see a strong
reason to introduce new versions of many of the routines just to handle
these semantics.  What motivated this part of your design?  Why the new
_mapping versions of routines?



Because there's no way to know inside the set_page_dirty() functions
if the dirtying comes from a memory mapping or from a modification
through a normal write().  And they have different semantics, for
write() the modification times are updated immediately.


Perhaps I didn't understand what page_mapped() does, but it does seem to
have the right semantics as far as I could see.

   Thanx...

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Miklos Szeredi
> > > > This flag is checked in msync() and __fput(), and if set, the file
> > > > times are updated and the flag is cleared
> > > 
> > > Why not also check inside vfs_getattr?
> > 
> > This is the minimum, that the standard asks for.
> > 
> > Note, your porposal would touch the times in vfs_getattr(), which
> > means, that the modification times would depend on the time of the
> > last stat() call, which is not really right, though it would still be
> > conforming.
> > 
> > It is much saner, if the modification time is always the time of the
> > last write() or msync().
> 
> I disagree. The above doesn't allow a program like 'make' to discover
> whether or not the file has changed by simply calling stat(). Instead,
> you're forcing a call to msync()+stat().

Yes, but that's the only portable way _anyway_.

And it probably doesn't matter, programs using mmap to write to a file
_will_ call msync, or at least close the file, when they're done.

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Peter Staubach

Trond Myklebust wrote:

On Wed, 2007-02-21 at 19:28 +0100, Miklos Szeredi wrote:
  

This flag is checked in msync() and __fput(), and if set, the file
times are updated and the flag is cleared


Why not also check inside vfs_getattr?
  

This is the minimum, that the standard asks for.

Note, your porposal would touch the times in vfs_getattr(), which
means, that the modification times would depend on the time of the
last stat() call, which is not really right, though it would still be
conforming.

It is much saner, if the modification time is always the time of the
last write() or msync().



I disagree. The above doesn't allow a program like 'make' to discover
whether or not the file has changed by simply calling stat(). Instead,
you're forcing a call to msync()+stat().


Right, but that's what the specification specifies.  The file times
for mmap'd files are not maintained as tightly as they are for files
which are being modified via write(2).

Rightly or wrongly.

  ps
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Trond Myklebust
On Wed, 2007-02-21 at 19:28 +0100, Miklos Szeredi wrote:
> > > This flag is checked in msync() and __fput(), and if set, the file
> > > times are updated and the flag is cleared
> > 
> > Why not also check inside vfs_getattr?
> 
> This is the minimum, that the standard asks for.
> 
> Note, your porposal would touch the times in vfs_getattr(), which
> means, that the modification times would depend on the time of the
> last stat() call, which is not really right, though it would still be
> conforming.
> 
> It is much saner, if the modification time is always the time of the
> last write() or msync().

I disagree. The above doesn't allow a program like 'make' to discover
whether or not the file has changed by simply calling stat(). Instead,
you're forcing a call to msync()+stat().

Cheers
  Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Miklos Szeredi
> > This flag is checked in msync() and __fput(), and if set, the file
> > times are updated and the flag is cleared
> 
> Why not also check inside vfs_getattr?

This is the minimum, that the standard asks for.

Note, your porposal would touch the times in vfs_getattr(), which
means, that the modification times would depend on the time of the
last stat() call, which is not really right, though it would still be
conforming.

It is much saner, if the modification time is always the time of the
last write() or msync().

Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Miklos Szeredi
> > Inspired by Peter Staubach's patch and the resulting comments.
> >
> >   
> 
> An updated version of the original patch was submitted to LKML
> yesterday...  :-)

Strange coincidence :)

> > file = vma->vm_file;
> > start = vma->vm_end;
> > +   mapping_update_time(file);
> > if ((flags & MS_SYNC) && file &&
> > (vma->vm_flags & VM_SHARED)) {
> > get_file(file);
> >   
> 
> It seems to me that this might lead to file times being updated for
> non-MAP_SHARED mappings.

In theory no, because the COW-ed pages become anonymous and are not
part of the original mapping any more.

> > +int set_page_dirty_mapping(struct page *page);
> >   
> 
> This aspect of the design seems intrusive to me.  I didn't see a strong
> reason to introduce new versions of many of the routines just to handle
> these semantics.  What motivated this part of your design?  Why the new
> _mapping versions of routines?

Because there's no way to know inside the set_page_dirty() functions
if the dirtying comes from a memory mapping or from a modification
through a normal write().  And they have different semantics, for
write() the modification times are updated immediately.

Thanks,
Miklos
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Trond Myklebust
On Wed, 2007-02-21 at 18:51 +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <[EMAIL PROTECTED]>
> 
> This patch makes writing to shared memory mappings update st_ctime and
> st_mtime as defined by SUSv3:
> 
>The st_ctime and st_mtime fields of a file that is mapped with
>MAP_SHARED and PROT_WRITE shall be marked for update at some point
>in the interval between a write reference to the mapped region and
>the next call to msync() with MS_ASYNC or MS_SYNC for that portion
>of the file by any process. If there is no such call and if the
>underlying file is modified as a result of a write reference, then
>these fields shall be marked for update at some time after the
>write reference.
> 
> A new address_space flag is introduced: AS_CMTIME.  This is set each
> time a page is dirtied through a userspace memory mapping.  This
> includes write accesses via get_user_pages().
> 
> Note, the flag is set unconditionally, even if the page is already
> dirty.  This is important, because the page might have been dirtied
> earlier by a non-mmap write.
> 
> This flag is checked in msync() and __fput(), and if set, the file
> times are updated and the flag is cleared

Why not also check inside vfs_getattr?

Cheers
  Trond

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update ctime and mtime for mmaped write

2007-02-21 Thread Peter Staubach

Miklos Szeredi wrote:

From: Miklos Szeredi <[EMAIL PROTECTED]>

This patch makes writing to shared memory mappings update st_ctime and
st_mtime as defined by SUSv3:

   The st_ctime and st_mtime fields of a file that is mapped with
   MAP_SHARED and PROT_WRITE shall be marked for update at some point
   in the interval between a write reference to the mapped region and
   the next call to msync() with MS_ASYNC or MS_SYNC for that portion
   of the file by any process. If there is no such call and if the
   underlying file is modified as a result of a write reference, then
   these fields shall be marked for update at some time after the
   write reference.

A new address_space flag is introduced: AS_CMTIME.  This is set each
time a page is dirtied through a userspace memory mapping.  This
includes write accesses via get_user_pages().

Note, the flag is set unconditionally, even if the page is already
dirty.  This is important, because the page might have been dirtied
earlier by a non-mmap write.

This flag is checked in msync() and __fput(), and if set, the file
times are updated and the flag is cleared

The flag is also cleared, if the time update is triggered by a normal
write.  This is not mandated by the standard, but seems to be a sane
thing to do.

Fixes Novell Bugzilla #206431.

Inspired by Peter Staubach's patch and the resulting comments.

  


An updated version of the original patch was submitted to LKML
yesterday...  :-)

Some comments below --


Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]>
---

Index: linux/include/linux/pagemap.h
===
--- linux.orig/include/linux/pagemap.h  2007-02-21 14:15:06.0 +0100
+++ linux/include/linux/pagemap.h   2007-02-21 14:16:04.0 +0100
@@ -18,6 +18,7 @@
  */
 #defineAS_EIO  (__GFP_BITS_SHIFT + 0)  /* IO error on async 
write */
 #define AS_ENOSPC  (__GFP_BITS_SHIFT + 1)  /* ENOSPC on async write */
+#define AS_CMTIME  (__GFP_BITS_SHIFT + 2)  /* ctime/mtime update needed */
 
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)

 {
Index: linux/mm/msync.c
===
--- linux.orig/mm/msync.c   2007-02-21 14:14:43.0 +0100
+++ linux/mm/msync.c2007-02-21 14:16:04.0 +0100
@@ -77,6 +77,7 @@ asmlinkage long sys_msync(unsigned long 
 		}

file = vma->vm_file;
start = vma->vm_end;
+   mapping_update_time(file);
if ((flags & MS_SYNC) && file &&
(vma->vm_flags & VM_SHARED)) {
get_file(file);
  


It seems to me that this might lead to file times being updated for
non-MAP_SHARED mappings.


Index: linux/fs/file_table.c
===
--- linux.orig/fs/file_table.c  2007-02-21 14:14:43.0 +0100
+++ linux/fs/file_table.c   2007-02-21 14:16:04.0 +0100
@@ -165,6 +165,7 @@ void fastcall __fput(struct file *file)
 */
eventpoll_release(file);
locks_remove_flock(file);
+   mapping_update_time(file);
 
 	if (file->f_op && file->f_op->release)

file->f_op->release(inode, file);
Index: linux/fs/inode.c
===
--- linux.orig/fs/inode.c   2007-02-21 14:14:43.0 +0100
+++ linux/fs/inode.c2007-02-21 14:16:04.0 +0100
@@ -1219,6 +1219,7 @@ void file_update_time(struct file *file)
struct timespec now;
int sync_it = 0;
 
+	clear_bit(AS_CMTIME, &file->f_mapping->flags);

if (IS_NOCMTIME(inode))
return;
if (IS_RDONLY(inode))
@@ -1241,6 +1242,12 @@ void file_update_time(struct file *file)
 
 EXPORT_SYMBOL(file_update_time);
 
+void mapping_update_time(struct file *file)

+{
+   if (test_bit(AS_CMTIME, &file->f_mapping->flags))
+   file_update_time(file);
+}
+
 int inode_needs_sync(struct inode *inode)
 {
if (IS_SYNC(inode))
Index: linux/include/linux/fs.h
===
--- linux.orig/include/linux/fs.h   2007-02-21 14:15:06.0 +0100
+++ linux/include/linux/fs.h2007-02-21 14:16:04.0 +0100
@@ -1887,6 +1887,7 @@ extern int inode_change_ok(struct inode 
 extern int __must_check inode_setattr(struct inode *, struct iattr *);
 
 extern void file_update_time(struct file *file);

+extern void mapping_update_time(struct file *file);
 
 static inline ino_t parent_ino(struct dentry *dentry)

 {
Index: linux/include/linux/mm.h
===
--- linux.orig/include/linux/mm.h   2007-02-21 14:14:43.0 +0100
+++ linux/include/linux/mm.h2007-02-21 14:16:04.0 +0100
@@ -790,6 +790,7 @@ int redirty_page_for_writepage(struct wr
struct page *page);