Re: [PATCH] update ctime and mtime for mmaped write
> Miklos Szeredi wrote: > This still does not address the situation where a file is 'permanently' > mmap'd, does it? > > >>> So? If application doesn't do msync, then the file times won't be > >>> updated. That's allowed by the standard, and so portable applications > >>> will have to call msync. > >>> > >> It is allowed, but it is clearly not useful behaviour. Nowhere is it set > >> in stone that we should be implementing just the minimum allowed. > >> > > > > You're right. In theory, at least. But in practice I don't think > > this matters. Show me an application that writes to a shared mapping > > then doesn't call either msync or munmap and doesn't even exit. > > > > If there were lot of these apps, then this bug would have been fixed > > lots of years earlier. In fact there are _very_ few apps writing to > > shared mappings at all. > > > > > > Perhaps true, although I know of at least one customer of Red Hat who > does have an application (or more) than uses mmap'd files and is > suffering from this lack of appropriate semantics. They are not > getting files backed up which need to be. Yeah, but I bet, that just updating the times on msync and munmap will cure the problem. Until you prove me wrong, this is a purely theoretical argument. > > Applications should be encouraged to call msync(MS_ASYNC) because: > > > > - it's very fast (basically a no-op) on recent linux kernels > > > > - it's the only portable way to guarantee, that the data you written > > will _ever_ hit the disk. > > > > There's really no downside to using msync(MS_ASYNC) in your > > application, so making an effort to support applications that don't do > > this is stupid, IMO. > > It may be a no-op on recent Linux kernels, but I don't think that it > is a no-op on other systems. Exactly. Which is the reason applications _have_ to call msync on these OSs if they want data to be synchronized with the file. There are two cases: 1) app don't cares about it's data being synchronized with the file 2) app does care In the first case there's no problem if the times aren't updated, since the app didn't care. In the second case the a portable application has to call msync, and on linux it doesn't even lose any performance from this, so it should be happy. You are trying to provide for an app that requires 1) on non-linux and 2) on linux. Does that make any sense? Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: This still does not address the situation where a file is 'permanently' mmap'd, does it? So? If application doesn't do msync, then the file times won't be updated. That's allowed by the standard, and so portable applications will have to call msync. It is allowed, but it is clearly not useful behaviour. Nowhere is it set in stone that we should be implementing just the minimum allowed. You're right. In theory, at least. But in practice I don't think this matters. Show me an application that writes to a shared mapping then doesn't call either msync or munmap and doesn't even exit. If there were lot of these apps, then this bug would have been fixed lots of years earlier. In fact there are _very_ few apps writing to shared mappings at all. Perhaps true, although I know of at least one customer of Red Hat who does have an application (or more) than uses mmap'd files and is suffering from this lack of appropriate semantics. They are not getting files backed up which need to be. Applications should be encouraged to call msync(MS_ASYNC) because: - it's very fast (basically a no-op) on recent linux kernels - it's the only portable way to guarantee, that the data you written will _ever_ hit the disk. There's really no downside to using msync(MS_ASYNC) in your application, so making an effort to support applications that don't do this is stupid, IMO. It may be a no-op on recent Linux kernels, but I don't think that it is a no-op on other systems. I do not wish to defend applications which don't use msync or some such, but it seems that the appropriate semantics to support are more than just depending upon the application to "do the right thing". Thanx... ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> > > This still does not address the situation where a file is 'permanently' > > > mmap'd, does it? > > > > So? If application doesn't do msync, then the file times won't be > > updated. That's allowed by the standard, and so portable applications > > will have to call msync. > > It is allowed, but it is clearly not useful behaviour. Nowhere is it set > in stone that we should be implementing just the minimum allowed. You're right. In theory, at least. But in practice I don't think this matters. Show me an application that writes to a shared mapping then doesn't call either msync or munmap and doesn't even exit. If there were lot of these apps, then this bug would have been fixed lots of years earlier. In fact there are _very_ few apps writing to shared mappings at all. Applications should be encouraged to call msync(MS_ASYNC) because: - it's very fast (basically a no-op) on recent linux kernels - it's the only portable way to guarantee, that the data you written will _ever_ hit the disk. There's really no downside to using msync(MS_ASYNC) in your application, so making an effort to support applications that don't do this is stupid, IMO. Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
On Thu, 2007-02-22 at 21:48 +0100, Miklos Szeredi wrote: > > This still does not address the situation where a file is 'permanently' > > mmap'd, does it? > > So? If application doesn't do msync, then the file times won't be > updated. That's allowed by the standard, and so portable applications > will have to call msync. It is allowed, but it is clearly not useful behaviour. Nowhere is it set in stone that we should be implementing just the minimum allowed. Trond - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: __fput() will be called when there are no more references to 'file', then it will update the time if the flag is set. This applies to regular files as well as devices. I suspect that you will find that, for a block device, the wrong inode gets updated. That's where the bd_inode_update_time() portion of my proposed patch came from. How horrible :( I haven't noticed that part of the patch. But I don't think that's needed. Updating the times through the file pointer should be OK. You have this problem because you use the inode which comes from the blockdev pseudo-filesystem. It was nasty, I certainly agree. :-) But I've moved the check from __fput to remove_vma() in the next revision of the patch, which would give slightly nicer semantics, and be equally conforming. This still does not address the situation where a file is 'permanently' mmap'd, does it? So? If application doesn't do msync, then the file times won't be updated. That's allowed by the standard, and so portable applications will have to call msync. Well, there allowable by the specification, and then there is expected and reasonable. It seems reasonable to me that the file times should be updated _sometime_, even if the application does not take proactive action to cause them to be updated. Otherwise, it would be easy to end up with a file, whose contents are updated and reside on stable storage, but whose mtime never changes. Part of the motivation behind starting this work was to address the situation where an application modifies files using mmap but then backup software would never see the need to backup those files. It seems to me that if something like sync() causes the file contents to be written to stable storage, then the file metadata should follow in a not too distant fashion. Thanx... ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: Take this example: fd = open() addr = mmap(.., fd) write(fd, ...) close(fd) sleep(100) msync(addr,...) munmap(addr) The file times will be updated in write(), but with your patch, the bit in the mapping will also be set. Then in msync() the file times will be updated again, which is wrong, since the memory was _not_ modified through the mapping. This is correct. I have updated my proposed patch to include the clearing of AS_MCTIME in the routine which updates the mtime field. That doesn't really help. Look at __generic_file_aio_write_nolock(). file_update_time() is called before the data is written, so after the last write, there will be nothing to clear the flag. And even if fixed this case by moving the file_update_time() call to the end of the function, there's no guarantee, that some filesystem won't do something exotic and call set_page_dirty() indenpendently of write(). Good luck auditing all the set_page_dirty() calls ;) Interesting. No, I wouldn't want to get into worrying about set_page_dirty() calls. Life is short and people have too much imagination... :-) Thanx... ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> > __fput() will be called when there are no more references to 'file', > > then it will update the time if the flag is set. This applies to > > regular files as well as devices. > > > > > > I suspect that you will find that, for a block device, the wrong inode > gets updated. That's where the bd_inode_update_time() portion of my > proposed patch came from. How horrible :( I haven't noticed that part of the patch. But I don't think that's needed. Updating the times through the file pointer should be OK. You have this problem because you use the inode which comes from the blockdev pseudo-filesystem. > > > But I've moved the check from __fput to remove_vma() in the next > > revision of the patch, which would give slightly nicer semantics, and > > be equally conforming. > > This still does not address the situation where a file is 'permanently' > mmap'd, does it? So? If application doesn't do msync, then the file times won't be updated. That's allowed by the standard, and so portable applications will have to call msync. Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> > Take this example: > > > > fd = open() > > addr = mmap(.., fd) > > write(fd, ...) > > close(fd) > > sleep(100) > > msync(addr,...) > > munmap(addr) > > > > The file times will be updated in write(), but with your patch, the > > bit in the mapping will also be set. > > > > Then in msync() the file times will be updated again, which is wrong, > > since the memory was _not_ modified through the mapping. > > This is correct. I have updated my proposed patch to include the clearing > of AS_MCTIME in the routine which updates the mtime field. That doesn't really help. Look at __generic_file_aio_write_nolock(). file_update_time() is called before the data is written, so after the last write, there will be nothing to clear the flag. And even if fixed this case by moving the file_update_time() call to the end of the function, there's no guarantee, that some filesystem won't do something exotic and call set_page_dirty() indenpendently of write(). Good luck auditing all the set_page_dirty() calls ;) Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: Why is the flag checked in __fput()? It's because of this bit in the standard: If there is no such call and if the underlying file is modified as a result of a write reference, then these fields shall be marked for update at some time after the write reference. It could be done in munmap/mremap, but it seemed more difficult to track down all the places where the vma is removed. But yes, that may be a nicer solution. It seems to me that, with this support, a file, which is mmap'd, modified, but never msync'd or munmap'd, will never get its mtime updated. Or did I miss that? I also don't see how an mmap'd block device will get its mtime updated either. __fput() will be called when there are no more references to 'file', then it will update the time if the flag is set. This applies to regular files as well as devices. I suspect that you will find that, for a block device, the wrong inode gets updated. That's where the bd_inode_update_time() portion of my proposed patch came from. But I've moved the check from __fput to remove_vma() in the next revision of the patch, which would give slightly nicer semantics, and be equally conforming. This still does not address the situation where a file is 'permanently' mmap'd, does it? Thanx... ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: +int set_page_dirty_mapping(struct page *page); This aspect of the design seems intrusive to me. I didn't see a strong reason to introduce new versions of many of the routines just to handle these semantics. What motivated this part of your design? Why the new _mapping versions of routines? Because there's no way to know inside the set_page_dirty() functions if the dirtying comes from a memory mapping or from a modification through a normal write(). And they have different semantics, for write() the modification times are updated immediately. Perhaps I didn't understand what page_mapped() does, but it does seem to have the right semantics as far as I could see. The problems will start, when you have a file that is both mapped and modified with write(). Then the dirying from the write() will set the flag, and that will have undesirable consequences. I don't think that I quite follow the logic. The dirtying from write() will set the flag, but then the mtime will get updated and the flag will be cleared by the hook in file_update_time(). Right? Take this example: fd = open() addr = mmap(.., fd) write(fd, ...) close(fd) sleep(100) msync(addr,...) munmap(addr) The file times will be updated in write(), but with your patch, the bit in the mapping will also be set. Then in msync() the file times will be updated again, which is wrong, since the memory was _not_ modified through the mapping. This is correct. I have updated my proposed patch to include the clearing of AS_MCTIME in the routine which updates the mtime field. I haven't reposted it yet until I complete testing of the new resulting system. I anticipate doing this later today. Thanx.. ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> > +int set_page_dirty_mapping(struct page *page); > > > > > > > This aspect of the design seems intrusive to me. I didn't see a strong > reason to introduce new versions of many of the routines just to handle > these semantics. What motivated this part of your design? Why the new > _mapping versions of routines? > > > >>> Because there's no way to know inside the set_page_dirty() functions > >>> if the dirtying comes from a memory mapping or from a modification > >>> through a normal write(). And they have different semantics, for > >>> write() the modification times are updated immediately. > >>> > >> Perhaps I didn't understand what page_mapped() does, but it does seem to > >> have the right semantics as far as I could see. > >> > > > > The problems will start, when you have a file that is both mapped and > > modified with write(). Then the dirying from the write() will set the > > flag, and that will have undesirable consequences. > > I don't think that I quite follow the logic. The dirtying from write() > will set the flag, but then the mtime will get updated and the flag will > be cleared by the hook in file_update_time(). Right? Take this example: fd = open() addr = mmap(.., fd) write(fd, ...) close(fd) sleep(100) msync(addr,...) munmap(addr) The file times will be updated in write(), but with your patch, the bit in the mapping will also be set. Then in msync() the file times will be updated again, which is wrong, since the memory was _not_ modified through the mapping. Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> >> Why is the flag checked in __fput()? > >> > > > > It's because of this bit in the standard: > > > > If there is no such call and if the underlying file is modified > > as a result of a write reference, then these fields shall be > > marked for update at some time after the write reference. > > > > It could be done in munmap/mremap, but it seemed more difficult to > > track down all the places where the vma is removed. But yes, that may > > be a nicer solution. > > It seems to me that, with this support, a file, which is mmap'd, > modified, but never msync'd or munmap'd, will never get its mtime > updated. Or did I miss that? > > I also don't see how an mmap'd block device will get its mtime > updated either. __fput() will be called when there are no more references to 'file', then it will update the time if the flag is set. This applies to regular files as well as devices. But I've moved the check from __fput to remove_vma() in the next revision of the patch, which would give slightly nicer semantics, and be equally conforming. Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: Inspired by Peter Staubach's patch and the resulting comments. An updated version of the original patch was submitted to LKML yesterday... :-) Strange coincidence :) file = vma->vm_file; start = vma->vm_end; + mapping_update_time(file); if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) { get_file(file); It seems to me that this might lead to file times being updated for non-MAP_SHARED mappings. In theory no, because the COW-ed pages become anonymous and are not part of the original mapping any more. I must profess to having a incomplete understanding of all of this support, but then why would it be necessary to test VM_SHARED at this point in msync()? That's basically just an optimization. If it wasn't there, then data from a another (shared) mapping could be written back, which is not wrong, but not required either. I ran into problems early on with file times being updated incorrectly so I am a little sensitive this aspect. +int set_page_dirty_mapping(struct page *page); This aspect of the design seems intrusive to me. I didn't see a strong reason to introduce new versions of many of the routines just to handle these semantics. What motivated this part of your design? Why the new _mapping versions of routines? Because there's no way to know inside the set_page_dirty() functions if the dirtying comes from a memory mapping or from a modification through a normal write(). And they have different semantics, for write() the modification times are updated immediately. Perhaps I didn't understand what page_mapped() does, but it does seem to have the right semantics as far as I could see. The problems will start, when you have a file that is both mapped and modified with write(). Then the dirying from the write() will set the flag, and that will have undesirable consequences. I don't think that I quite follow the logic. The dirtying from write() will set the flag, but then the mtime will get updated and the flag will be cleared by the hook in file_update_time(). Right? Thanx... ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: On Wed, 21 Feb 2007 18:51:52 +0100 Miklos Szeredi <[EMAIL PROTECTED]> wrote: This patch makes writing to shared memory mappings update st_ctime and st_mtime as defined by SUSv3: The st_ctime and st_mtime fields of a file that is mapped with MAP_SHARED and PROT_WRITE shall be marked for update at some point in the interval between a write reference to the mapped region and the next call to msync() with MS_ASYNC or MS_SYNC for that portion of the file by any process. If there is no such call and if the underlying file is modified as a result of a write reference, then these fields shall be marked for update at some time after the write reference. A new address_space flag is introduced: AS_CMTIME. This is set each time a page is dirtied through a userspace memory mapping. This includes write accesses via get_user_pages(). Note, the flag is set unconditionally, even if the page is already dirty. This is important, because the page might have been dirtied earlier by a non-mmap write. This flag is checked in msync() and __fput(), and if set, the file times are updated and the flag is cleared The flag is also cleared, if the time update is triggered by a normal write. This is not mandated by the standard, but seems to be a sane thing to do. Why is the flag checked in __fput()? It's because of this bit in the standard: If there is no such call and if the underlying file is modified as a result of a write reference, then these fields shall be marked for update at some time after the write reference. It could be done in munmap/mremap, but it seemed more difficult to track down all the places where the vma is removed. But yes, that may be a nicer solution. It seems to me that, with this support, a file, which is mmap'd, modified, but never msync'd or munmap'd, will never get its mtime updated. Or did I miss that? I also don't see how an mmap'd block device will get its mtime updated either. Thanx... ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> On Wed, 21 Feb 2007 18:51:52 +0100 Miklos Szeredi <[EMAIL PROTECTED]> wrote: > > > This patch makes writing to shared memory mappings update st_ctime and > > st_mtime as defined by SUSv3: > > > >The st_ctime and st_mtime fields of a file that is mapped with > >MAP_SHARED and PROT_WRITE shall be marked for update at some point > >in the interval between a write reference to the mapped region and > >the next call to msync() with MS_ASYNC or MS_SYNC for that portion > >of the file by any process. If there is no such call and if the > >underlying file is modified as a result of a write reference, then > >these fields shall be marked for update at some time after the > >write reference. > > > > A new address_space flag is introduced: AS_CMTIME. This is set each > > time a page is dirtied through a userspace memory mapping. This > > includes write accesses via get_user_pages(). > > > > Note, the flag is set unconditionally, even if the page is already > > dirty. This is important, because the page might have been dirtied > > earlier by a non-mmap write. > > > > This flag is checked in msync() and __fput(), and if set, the file > > times are updated and the flag is cleared > > > > The flag is also cleared, if the time update is triggered by a normal > > write. This is not mandated by the standard, but seems to be a sane > > thing to do. > > Why is the flag checked in __fput()? It's because of this bit in the standard: If there is no such call and if the underlying file is modified as a result of a write reference, then these fields shall be marked for update at some time after the write reference. It could be done in munmap/mremap, but it seemed more difficult to track down all the places where the vma is removed. But yes, that may be a nicer solution. Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
On Wed, 21 Feb 2007 18:51:52 +0100 Miklos Szeredi <[EMAIL PROTECTED]> wrote: > This patch makes writing to shared memory mappings update st_ctime and > st_mtime as defined by SUSv3: > >The st_ctime and st_mtime fields of a file that is mapped with >MAP_SHARED and PROT_WRITE shall be marked for update at some point >in the interval between a write reference to the mapped region and >the next call to msync() with MS_ASYNC or MS_SYNC for that portion >of the file by any process. If there is no such call and if the >underlying file is modified as a result of a write reference, then >these fields shall be marked for update at some time after the >write reference. > > A new address_space flag is introduced: AS_CMTIME. This is set each > time a page is dirtied through a userspace memory mapping. This > includes write accesses via get_user_pages(). > > Note, the flag is set unconditionally, even if the page is already > dirty. This is important, because the page might have been dirtied > earlier by a non-mmap write. > > This flag is checked in msync() and __fput(), and if set, the file > times are updated and the flag is cleared > > The flag is also cleared, if the time update is triggered by a normal > write. This is not mandated by the standard, but seems to be a sane > thing to do. Why is the flag checked in __fput()? - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> >>> Inspired by Peter Staubach's patch and the resulting comments. > >>> > >>> > >>> > >> An updated version of the original patch was submitted to LKML > >> yesterday... :-) > >> > > > > Strange coincidence :) > > > > > >>> file = vma->vm_file; > >>> start = vma->vm_end; > >>> + mapping_update_time(file); > >>> if ((flags & MS_SYNC) && file && > >>> (vma->vm_flags & VM_SHARED)) { > >>> get_file(file); > >>> > >>> > >> It seems to me that this might lead to file times being updated for > >> non-MAP_SHARED mappings. > >> > > > > In theory no, because the COW-ed pages become anonymous and are not > > part of the original mapping any more. > > > > > > I must profess to having a incomplete understanding of all of this > support, but then why would it be necessary to test VM_SHARED at > this point in msync()? That's basically just an optimization. If it wasn't there, then data from a another (shared) mapping could be written back, which is not wrong, but not required either. > I ran into problems early on with file times being updated incorrectly > so I am a little sensitive this aspect. > > >>> +int set_page_dirty_mapping(struct page *page); > >>> > >>> > >> This aspect of the design seems intrusive to me. I didn't see a strong > >> reason to introduce new versions of many of the routines just to handle > >> these semantics. What motivated this part of your design? Why the new > >> _mapping versions of routines? > >> > > > > Because there's no way to know inside the set_page_dirty() functions > > if the dirtying comes from a memory mapping or from a modification > > through a normal write(). And they have different semantics, for > > write() the modification times are updated immediately. > > Perhaps I didn't understand what page_mapped() does, but it does seem to > have the right semantics as far as I could see. The problems will start, when you have a file that is both mapped and modified with write(). Then the dirying from the write() will set the flag, and that will have undesirable consequences. Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: Inspired by Peter Staubach's patch and the resulting comments. An updated version of the original patch was submitted to LKML yesterday... :-) Strange coincidence :) file = vma->vm_file; start = vma->vm_end; + mapping_update_time(file); if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) { get_file(file); It seems to me that this might lead to file times being updated for non-MAP_SHARED mappings. In theory no, because the COW-ed pages become anonymous and are not part of the original mapping any more. I must profess to having a incomplete understanding of all of this support, but then why would it be necessary to test VM_SHARED at this point in msync()? I ran into problems early on with file times being updated incorrectly so I am a little sensitive this aspect. +int set_page_dirty_mapping(struct page *page); This aspect of the design seems intrusive to me. I didn't see a strong reason to introduce new versions of many of the routines just to handle these semantics. What motivated this part of your design? Why the new _mapping versions of routines? Because there's no way to know inside the set_page_dirty() functions if the dirtying comes from a memory mapping or from a modification through a normal write(). And they have different semantics, for write() the modification times are updated immediately. Perhaps I didn't understand what page_mapped() does, but it does seem to have the right semantics as far as I could see. Thanx... ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> > > > This flag is checked in msync() and __fput(), and if set, the file > > > > times are updated and the flag is cleared > > > > > > Why not also check inside vfs_getattr? > > > > This is the minimum, that the standard asks for. > > > > Note, your porposal would touch the times in vfs_getattr(), which > > means, that the modification times would depend on the time of the > > last stat() call, which is not really right, though it would still be > > conforming. > > > > It is much saner, if the modification time is always the time of the > > last write() or msync(). > > I disagree. The above doesn't allow a program like 'make' to discover > whether or not the file has changed by simply calling stat(). Instead, > you're forcing a call to msync()+stat(). Yes, but that's the only portable way _anyway_. And it probably doesn't matter, programs using mmap to write to a file _will_ call msync, or at least close the file, when they're done. Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Trond Myklebust wrote: On Wed, 2007-02-21 at 19:28 +0100, Miklos Szeredi wrote: This flag is checked in msync() and __fput(), and if set, the file times are updated and the flag is cleared Why not also check inside vfs_getattr? This is the minimum, that the standard asks for. Note, your porposal would touch the times in vfs_getattr(), which means, that the modification times would depend on the time of the last stat() call, which is not really right, though it would still be conforming. It is much saner, if the modification time is always the time of the last write() or msync(). I disagree. The above doesn't allow a program like 'make' to discover whether or not the file has changed by simply calling stat(). Instead, you're forcing a call to msync()+stat(). Right, but that's what the specification specifies. The file times for mmap'd files are not maintained as tightly as they are for files which are being modified via write(2). Rightly or wrongly. ps - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
On Wed, 2007-02-21 at 19:28 +0100, Miklos Szeredi wrote: > > > This flag is checked in msync() and __fput(), and if set, the file > > > times are updated and the flag is cleared > > > > Why not also check inside vfs_getattr? > > This is the minimum, that the standard asks for. > > Note, your porposal would touch the times in vfs_getattr(), which > means, that the modification times would depend on the time of the > last stat() call, which is not really right, though it would still be > conforming. > > It is much saner, if the modification time is always the time of the > last write() or msync(). I disagree. The above doesn't allow a program like 'make' to discover whether or not the file has changed by simply calling stat(). Instead, you're forcing a call to msync()+stat(). Cheers Trond - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> > This flag is checked in msync() and __fput(), and if set, the file > > times are updated and the flag is cleared > > Why not also check inside vfs_getattr? This is the minimum, that the standard asks for. Note, your porposal would touch the times in vfs_getattr(), which means, that the modification times would depend on the time of the last stat() call, which is not really right, though it would still be conforming. It is much saner, if the modification time is always the time of the last write() or msync(). Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
> > Inspired by Peter Staubach's patch and the resulting comments. > > > > > > An updated version of the original patch was submitted to LKML > yesterday... :-) Strange coincidence :) > > file = vma->vm_file; > > start = vma->vm_end; > > + mapping_update_time(file); > > if ((flags & MS_SYNC) && file && > > (vma->vm_flags & VM_SHARED)) { > > get_file(file); > > > > It seems to me that this might lead to file times being updated for > non-MAP_SHARED mappings. In theory no, because the COW-ed pages become anonymous and are not part of the original mapping any more. > > +int set_page_dirty_mapping(struct page *page); > > > > This aspect of the design seems intrusive to me. I didn't see a strong > reason to introduce new versions of many of the routines just to handle > these semantics. What motivated this part of your design? Why the new > _mapping versions of routines? Because there's no way to know inside the set_page_dirty() functions if the dirtying comes from a memory mapping or from a modification through a normal write(). And they have different semantics, for write() the modification times are updated immediately. Thanks, Miklos - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
On Wed, 2007-02-21 at 18:51 +0100, Miklos Szeredi wrote: > From: Miklos Szeredi <[EMAIL PROTECTED]> > > This patch makes writing to shared memory mappings update st_ctime and > st_mtime as defined by SUSv3: > >The st_ctime and st_mtime fields of a file that is mapped with >MAP_SHARED and PROT_WRITE shall be marked for update at some point >in the interval between a write reference to the mapped region and >the next call to msync() with MS_ASYNC or MS_SYNC for that portion >of the file by any process. If there is no such call and if the >underlying file is modified as a result of a write reference, then >these fields shall be marked for update at some time after the >write reference. > > A new address_space flag is introduced: AS_CMTIME. This is set each > time a page is dirtied through a userspace memory mapping. This > includes write accesses via get_user_pages(). > > Note, the flag is set unconditionally, even if the page is already > dirty. This is important, because the page might have been dirtied > earlier by a non-mmap write. > > This flag is checked in msync() and __fput(), and if set, the file > times are updated and the flag is cleared Why not also check inside vfs_getattr? Cheers Trond - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] update ctime and mtime for mmaped write
Miklos Szeredi wrote: From: Miklos Szeredi <[EMAIL PROTECTED]> This patch makes writing to shared memory mappings update st_ctime and st_mtime as defined by SUSv3: The st_ctime and st_mtime fields of a file that is mapped with MAP_SHARED and PROT_WRITE shall be marked for update at some point in the interval between a write reference to the mapped region and the next call to msync() with MS_ASYNC or MS_SYNC for that portion of the file by any process. If there is no such call and if the underlying file is modified as a result of a write reference, then these fields shall be marked for update at some time after the write reference. A new address_space flag is introduced: AS_CMTIME. This is set each time a page is dirtied through a userspace memory mapping. This includes write accesses via get_user_pages(). Note, the flag is set unconditionally, even if the page is already dirty. This is important, because the page might have been dirtied earlier by a non-mmap write. This flag is checked in msync() and __fput(), and if set, the file times are updated and the flag is cleared The flag is also cleared, if the time update is triggered by a normal write. This is not mandated by the standard, but seems to be a sane thing to do. Fixes Novell Bugzilla #206431. Inspired by Peter Staubach's patch and the resulting comments. An updated version of the original patch was submitted to LKML yesterday... :-) Some comments below -- Signed-off-by: Miklos Szeredi <[EMAIL PROTECTED]> --- Index: linux/include/linux/pagemap.h === --- linux.orig/include/linux/pagemap.h 2007-02-21 14:15:06.0 +0100 +++ linux/include/linux/pagemap.h 2007-02-21 14:16:04.0 +0100 @@ -18,6 +18,7 @@ */ #defineAS_EIO (__GFP_BITS_SHIFT + 0) /* IO error on async write */ #define AS_ENOSPC (__GFP_BITS_SHIFT + 1) /* ENOSPC on async write */ +#define AS_CMTIME (__GFP_BITS_SHIFT + 2) /* ctime/mtime update needed */ static inline gfp_t mapping_gfp_mask(struct address_space * mapping) { Index: linux/mm/msync.c === --- linux.orig/mm/msync.c 2007-02-21 14:14:43.0 +0100 +++ linux/mm/msync.c2007-02-21 14:16:04.0 +0100 @@ -77,6 +77,7 @@ asmlinkage long sys_msync(unsigned long } file = vma->vm_file; start = vma->vm_end; + mapping_update_time(file); if ((flags & MS_SYNC) && file && (vma->vm_flags & VM_SHARED)) { get_file(file); It seems to me that this might lead to file times being updated for non-MAP_SHARED mappings. Index: linux/fs/file_table.c === --- linux.orig/fs/file_table.c 2007-02-21 14:14:43.0 +0100 +++ linux/fs/file_table.c 2007-02-21 14:16:04.0 +0100 @@ -165,6 +165,7 @@ void fastcall __fput(struct file *file) */ eventpoll_release(file); locks_remove_flock(file); + mapping_update_time(file); if (file->f_op && file->f_op->release) file->f_op->release(inode, file); Index: linux/fs/inode.c === --- linux.orig/fs/inode.c 2007-02-21 14:14:43.0 +0100 +++ linux/fs/inode.c2007-02-21 14:16:04.0 +0100 @@ -1219,6 +1219,7 @@ void file_update_time(struct file *file) struct timespec now; int sync_it = 0; + clear_bit(AS_CMTIME, &file->f_mapping->flags); if (IS_NOCMTIME(inode)) return; if (IS_RDONLY(inode)) @@ -1241,6 +1242,12 @@ void file_update_time(struct file *file) EXPORT_SYMBOL(file_update_time); +void mapping_update_time(struct file *file) +{ + if (test_bit(AS_CMTIME, &file->f_mapping->flags)) + file_update_time(file); +} + int inode_needs_sync(struct inode *inode) { if (IS_SYNC(inode)) Index: linux/include/linux/fs.h === --- linux.orig/include/linux/fs.h 2007-02-21 14:15:06.0 +0100 +++ linux/include/linux/fs.h2007-02-21 14:16:04.0 +0100 @@ -1887,6 +1887,7 @@ extern int inode_change_ok(struct inode extern int __must_check inode_setattr(struct inode *, struct iattr *); extern void file_update_time(struct file *file); +extern void mapping_update_time(struct file *file); static inline ino_t parent_ino(struct dentry *dentry) { Index: linux/include/linux/mm.h === --- linux.orig/include/linux/mm.h 2007-02-21 14:14:43.0 +0100 +++ linux/include/linux/mm.h2007-02-21 14:16:04.0 +0100 @@ -790,6 +790,7 @@ int redirty_page_for_writepage(struct wr struct page *page);