Re: process killed: text file modification
Julian Elischer wrote:
On 13/4/17 5:45 am, Rick Macklem wrote:
> I have just committed a patch to head (r316745) which should fix this.
> (It includes code to handle the recent change to head to make the pageouts
> write through the buffer cache.)
>
> It will be MFC'd and should be in 11.1.
>
is there any relevance of this change to stable/10

Yes, I do plan on MFC'ing this to stable/10. If Kostik doesn't MFC the
VOP_PUTPAGES() changes, I'll just leave the ncl_flush() call out of
nfs_set_text().
When I noted it would be in 11.1, I didn't intend to imply that it would be
MFC'd to stable/11 only.

rick
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: process killed: text file modification
On Mon, Apr 17, 2017 at 12:56:43AM +0800, Julian Elischer wrote:
> On 13/4/17 5:45 am, Rick Macklem wrote:
> > I have just committed a patch to head (r316745) which should fix this.
> > (It includes code to handle the recent change to head to make the pageouts
> > write through the buffer cache.)
> >
> > It will be MFC'd and should be in 11.1.
> >
> is there any relevance of this change to stable/10?

The code to kill the process on VV_TEXT mtime inconsistency has been there
since the initial newnfs import commit r191783. It is also present in the
oldnfs client, but I am sure that it will not be changed.
Re: process killed: text file modification
On 13/4/17 5:45 am, Rick Macklem wrote:
> I have just committed a patch to head (r316745) which should fix this.
> (It includes code to handle the recent change to head to make the pageouts
> write through the buffer cache.)
>
> It will be MFC'd and should be in 11.1.

is there any relevance of this change to stable/10?
Re: process killed: text file modification
I have just committed a patch to head (r316745) which should fix this.
(It includes code to handle the recent change to head to make the pageouts
write through the buffer cache.)

It will be MFC'd and should be in 11.1.

Thanks everyone, for your help with this, rick
Re: process killed: text file modification
I can't do commits until I get home in mid-April. That's why it will be
waiting until then. It should make it into stable/11 in plenty of time for
11.1.

Thanks for your help with this, rick

From: owner-freebsd-curr...@freebsd.org on behalf of Konstantin Belousov
Sent: Friday, March 24, 2017 3:01:41 AM
To: Rick Macklem
Cc: Gergely Czuczy; Dimitry Andric; Ian Lepore; FreeBSD Current
Subject: Re: process killed: text file modification

On Thu, Mar 23, 2017 at 09:39:00PM +, Rick Macklem wrote:
> Try whatever you like. However, if the case that failed before doesn't
> fail now, I'd call the test a success.
>
> Thanks, rick
> ps: It looks like Kostik is going to work on converting the NFS
>     vop_putpages() to using the buffer cache. However, if this isn't
>     ready for head/current by mid-April, I will commit this patch to
>     help fix things in the meantime.

I do not see a reason to wait for my work before committing your patch.
IMO, instead, it should be committed ASAP and merged into stable/11 for
upcoming 11.1. I will do required adjustments if/when _putpages() patch
progresses enough.
Re: process killed: text file modification
On Thu, Mar 23, 2017 at 09:39:00PM +, Rick Macklem wrote:
> Try whatever you like. However, if the case that failed before doesn't
> fail now, I'd call the test a success.
>
> Thanks, rick
> ps: It looks like Kostik is going to work on converting the NFS
>     vop_putpages() to using the buffer cache. However, if this isn't
>     ready for head/current by mid-April, I will commit this patch to
>     help fix things in the meantime.

I do not see a reason to wait for my work before committing your patch.
IMO, instead, it should be committed ASAP and merged into stable/11 for
upcoming 11.1. I will do required adjustments if/when _putpages() patch
progresses enough.
Re: process killed: text file modification
Try whatever you like. However, if the case that failed before doesn't fail
now, I'd call the test a success.

Thanks, rick
ps: It looks like Kostik is going to work on converting the NFS
    vop_putpages() to using the buffer cache. However, if this isn't ready
    for head/current by mid-April, I will commit this patch to help fix
    things in the meantime.

From: Gergely Czuczy
Sent: Thursday, March 23, 2017 2:25:11 AM
To: Rick Macklem; Konstantin Belousov
Cc: Dimitry Andric; Ian Lepore; FreeBSD Current
Subject: Re: process killed: text file modification

On 2017. 03. 21. 3:40, Rick Macklem wrote:
> Gergely Czuczy wrote:
> [stuff snipped]
>> Actually I want to test it, but you guys are so vehemently discussing
>> it, I thought it would be better to do so, once you guys settled your
>> analysis on the code. Also, me not having the problem occurring, I don't
>> think would mean it's solved, since that would only mean, the codepath
>> for my specific usecase works. There might be other things there as
>> well, what I don't hit.
> I hope by vehemently, you didn't find my comments as nasty. If they did
> come out that way, it was not what I intended and I apologize.
>
>> Let me know which patch should I test, and I will see to it in the next
>> couple of days, when I get the time to do it.
> I've attached it here again and, yes, I would agree that the results you
> get from testing are just another data point and not definitive.
> (I'd say this statement is true of all testing of nontrivial code.)
>
> Thanks in advance for any testing you can do, rick
>
So, I've copied the patched kernel over, and apparently it's working
properly. I'm not getting the error anymore.

So far I've only done a quick test, should I do something more extensive,
like build a couple of ports or something over NFS?
Re: process killed: text file modification
On 2017. 03. 21. 3:40, Rick Macklem wrote:
> Gergely Czuczy wrote:
> [stuff snipped]
>> Actually I want to test it, but you guys are so vehemently discussing
>> it, I thought it would be better to do so, once you guys settled your
>> analysis on the code. Also, me not having the problem occurring, I don't
>> think would mean it's solved, since that would only mean, the codepath
>> for my specific usecase works. There might be other things there as
>> well, what I don't hit.
> I hope by vehemently, you didn't find my comments as nasty. If they did
> come out that way, it was not what I intended and I apologize.
>
>> Let me know which patch should I test, and I will see to it in the next
>> couple of days, when I get the time to do it.
> I've attached it here again and, yes, I would agree that the results you
> get from testing are just another data point and not definitive.
> (I'd say this statement is true of all testing of nontrivial code.)
>
> Thanks in advance for any testing you can do, rick
>
So, I've copied the patched kernel over, and apparently it's working
properly. I'm not getting the error anymore.

So far I've only done a quick test, should I do something more extensive,
like build a couple of ports or something over NFS?
Re: process killed: text file modification
On Thu, Mar 23, 2017 at 12:55:09AM +, Rick Macklem wrote:
> Wow, this is looking good to me. I had thought that the simple way to make
> ncl_putpages() go through the buffer cache was to replace ncl_writerpc()
> with VOP_WRITE(). My concern was all the memory<->memory copying that would
> go on between the pages being written and the buffers allocated by
> VOP_WRITE().
> If there is a way to avoid some (if not all) of this memory<->memory
> copying, then I think it would be a big improvement.
UIO_NOCOPY means that the uio is only updated to indicate the operation as
performed, but no real copying occurs. This is exactly what the _putpages()
case needs, since the data is already in the pages. When buffers are created
for the corresponding file offsets, the appropriate pages are put into the
buffer's page array and the data appears in the buffer with zero copying.
This is how the generic putpages code works for local filesystems, e.g. UFS.

>
> As far as the commit goes, you don't need to do anything if you are calling
> VOP_WRITE().
> (The code below VOP_WRITE() takes care of all of that.)
> --> You might want to implement a function like nfs_write(), but with extra
>     arguments.
>     If you did that, you could indicate when you want the writes to happen
>     synchronously vs. async/delayed and that would decide when FILESYNC
>     would be specified.
>
Yes, this is what I want to improve in the patch. As I noted, I added
translation of the VM_PAGER_PUT_* flags into IO_* flags, but handling the
IO_* flags needs more code. Most important is IO_ASYNC, which probably should
become similar to the current !IO_SYNC ncl_write(), but without clustering.

You mentioned that NFSWRITE_FILESYNC/NFSWRITE_UNSTABLE should be specified,
and it seems that this is managed by the B_NEEDCOMMIT buffer flag. I see that
B_NEEDCOMMIT is cleared in ncl_write().

> As far as I know, the unpatched ncl_putpages() is badly broken for the
> UNSTABLE/commit case. For UNSTABLE writes, the client is supposed to
> know how to write the data again if the server crashes/reboots before
> a Commit RPC is successfully done for the data. (The ncl_clearcommit()
> function is the one called when the server indicates it has rebooted
> and needs this. It makes no sense whatsoever and breaks the client
> to call it in ncl_putpages() when mustcommit is set. All mustcommit
> being set indicates is that the write RPC was done UNSTABLE and the
> above applies to it. Some servers always do FILESYNC, so it isn't ever
> necessary to do a Commit RPC or redo the write RPCs.)
>
> Summary. If you are calling VOP_WRITE() or a similar call above the
> buffer cache, then you don't have to worry about any of this.
Ok, thanks.

>
> > What still needs to be done is to add the missing handling of the IO
> > flags to ncl_write().
>
> > +	if (error == 0 || !nfs_keep_dirty_on_error)
> >  		vnode_pager_undirty_pages(pages, rtvals, count -
> > uio.uio_resid);
> If the data isn't copied, will this data still be available to the NFS
> buffer cache code, so that it can redo the writes for the UNSTABLE case,
> if the server reboots before a Commit RPC has succeeded?
As long as the buffers are there (e.g. not marked clean), the data is there.
Of course, userspace can modify the data in the pages if a writeable mapping
exists, but that is expected.

Oh, I remembered one more question I wanted to ask in the previous mail.
With the patch, ncl_write() can be called from delayed contexts like the
pagedaemon, or after all writeable file descriptors referencing the file
are closed. Wouldn't some calls to VOP_OPEN()/VOP_CLOSE() around the
VOP_WRITE() be needed there?

>
> > -		if (must_commit)
> > -			ncl_clearcommit(vp->v_mount);
> No matter what else we do, this should go away. As above, it breaks the NFS
> client and basically forces all dirty buffer cache blocks to be rewritten
> when it shouldn't be necessary.
Re: process killed: text file modification
On 2017. 03. 21. 3:40, Rick Macklem wrote:
> Gergely Czuczy wrote:
> [stuff snipped]
>> Actually I want to test it, but you guys are so vehemently discussing
>> it, I thought it would be better to do so, once you guys settled your
>> analysis on the code. Also, me not having the problem occurring, I don't
>> think would mean it's solved, since that would only mean, the codepath
>> for my specific usecase works. There might be other things there as
>> well, what I don't hit.
> I hope by vehemently, you didn't find my comments as nasty. If they did
> come out that way, it was not what I intended and I apologize.
>
>> Let me know which patch should I test, and I will see to it in the next
>> couple of days, when I get the time to do it.
> I've attached it here again and, yes, I would agree that the results you
> get from testing are just another data point and not definitive.
> (I'd say this statement is true of all testing of nontrivial code.)
>
> Thanks in advance for any testing you can do, rick
>
I finally had the time to give it a go, but unfortunately there was
something wrong with the built image: it was unable to find the root device
during boot. I will try to just copy the kernel over a bit later, and see
how it goes. I hope there are no ABI changes between the two revisions (the
previously built world, and the patched kernel).
Re: process killed: text file modification
Konstantin Belousov wrote:
[stuff snipped]
> Below is something to discuss. This is not finished, but it worked for
> the simple tests I performed. Clustering should be somewhat handled by
> the ncl_write() as is. As an additional advantage, I removed the now
> unneeded phys buffer allocation.
>
> If you agree with the approach on principle, I want to ask what to do
> about the commit stuff there (I simply removed that for now).
Wow, this is looking good to me. I had thought that the simple way to make
ncl_putpages() go through the buffer cache was to replace ncl_writerpc() with
VOP_WRITE(). My concern was all the memory<->memory copying that would
go on between the pages being written and the buffers allocated by
VOP_WRITE().
If there is a way to avoid some (if not all) of this memory<->memory copying,
then I think it would be a big improvement.

As far as the commit goes, you don't need to do anything if you are calling
VOP_WRITE().
(The code below VOP_WRITE() takes care of all of that.)
--> You might want to implement a function like nfs_write(), but with extra
    arguments.
    If you did that, you could indicate when you want the writes to happen
    synchronously vs. async/delayed and that would decide when FILESYNC
    would be specified.

As far as I know, the unpatched ncl_putpages() is badly broken for the
UNSTABLE/commit case. For UNSTABLE writes, the client is supposed to
know how to write the data again if the server crashes/reboots before
a Commit RPC is successfully done for the data. (The ncl_clearcommit()
function is the one called when the server indicates it has rebooted
and needs this. It makes no sense whatsoever and breaks the client
to call it in ncl_putpages() when mustcommit is set. All mustcommit
being set indicates is that the write RPC was done UNSTABLE and the
above applies to it. Some servers always do FILESYNC, so it isn't ever
necessary to do a Commit RPC or redo the write RPCs.)

Summary. If you are calling VOP_WRITE() or a similar call above the
buffer cache, then you don't have to worry about any of this.

> What still needs to be done is to add the missing handling of the IO
> flags to ncl_write().

> +	if (error == 0 || !nfs_keep_dirty_on_error)
>  		vnode_pager_undirty_pages(pages, rtvals, count -
> uio.uio_resid);
If the data isn't copied, will this data still be available to the NFS
buffer cache code, so that it can redo the writes for the UNSTABLE case, if
the server reboots before a Commit RPC has succeeded?

> -		if (must_commit)
> -			ncl_clearcommit(vp->v_mount);
No matter what else we do, this should go away. As above, it breaks the NFS
client and basically forces all dirty buffer cache blocks to be rewritten
when it shouldn't be necessary.

rick
Re: process killed: text file modification
On Tue, Mar 21, 2017 at 09:41:19PM +, Rick Macklem wrote:
> Konstantin Belousov wrote:
> > Anyway, my position is that nfs VOP_PUTPAGES() should do write through
> > buffer cache, not issuing the direct rpc call with the pages as source.
> Hmm. Interesting idea. Since a "struct buf" can only refer to
> contiguous bytes, I suspect each page might end up as a separate
> "struct buf", at least until some clustering algorithm succeeded in
> merging them.
>
> I would agree that it would be nice to change VOP_PUTPAGES(), since
> it currently results in a lot of 4K writes (with FILE_SYNC I think?)
> and this is normally slow/inefficient for the server. (It would be
> interesting to try your suggestion above and see if the pages would
> cluster into larger writes. Also, the "struct buf" code knows how to
> do UNSTABLE writes followed by a Commit.)

Below is something to discuss. This is not finished, but it worked for
the simple tests I performed. Clustering should be somewhat handled by
the ncl_write() as is. As an additional advantage, I removed the now
unneeded phys buffer allocation.

If you agree with the approach on principle, I want to ask what to do
about the commit stuff there (I simply removed that for now).

What still needs to be done is to add the missing handling of the IO flags
to ncl_write().

diff --git a/sys/fs/nfsclient/nfs_clbio.c b/sys/fs/nfsclient/nfs_clbio.c
index 1c225c1469a..562754609b1 100644
--- a/sys/fs/nfsclient/nfs_clbio.c
+++ b/sys/fs/nfsclient/nfs_clbio.c
@@ -266,9 +266,7 @@ ncl_putpages(struct vop_putpages_args *ap)
 {
 	struct uio uio;
 	struct iovec iov;
-	vm_offset_t kva;
-	struct buf *bp;
-	int iomode, must_commit, i, error, npages, count;
+	int ioflags, i, error, npages, count;
 	off_t offset;
 	int *rtvals;
 	struct vnode *vp;
@@ -322,44 +320,34 @@ ncl_putpages(struct vop_putpages_args *ap)
 	}
 	mtx_unlock(&np->n_mtx);
 
-	/*
-	 * We use only the kva address for the buffer, but this is extremely
-	 * convenient and fast.
-	 */
-	bp = getpbuf(&ncl_pbuf_freecnt);
-
-	kva = (vm_offset_t) bp->b_data;
-	pmap_qenter(kva, pages, npages);
 	PCPU_INC(cnt.v_vnodeout);
 	PCPU_ADD(cnt.v_vnodepgsout, count);
 
-	iov.iov_base = (caddr_t) kva;
+	iov.iov_base = unmapped_buf;
 	iov.iov_len = count;
 	uio.uio_iov = &iov;
 	uio.uio_iovcnt = 1;
 	uio.uio_offset = offset;
 	uio.uio_resid = count;
-	uio.uio_segflg = UIO_SYSSPACE;
+	uio.uio_segflg = UIO_NOCOPY;
 	uio.uio_rw = UIO_WRITE;
 	uio.uio_td = td;
 
-	if ((ap->a_sync & VM_PAGER_PUT_SYNC) == 0)
-		iomode = NFSWRITE_UNSTABLE;
-	else
-		iomode = NFSWRITE_FILESYNC;
+	ioflags = IO_VMIO;
+	if (ap->a_sync & (VM_PAGER_PUT_SYNC | VM_PAGER_PUT_INVAL))
+		ioflags |= IO_SYNC;
+	else if ((ap->a_sync & VM_PAGER_CLUSTER_OK) == 0)
+		ioflags |= IO_ASYNC;
+	ioflags |= (ap->a_sync & VM_PAGER_PUT_INVAL) ? IO_INVAL: 0;
+	ioflags |= (ap->a_sync & VM_PAGER_PUT_NOREUSE) ? IO_NOREUSE : 0;
+	ioflags |= IO_SEQMAX << IO_SEQSHIFT;
 
-	error = ncl_writerpc(vp, &uio, cred, &iomode, &must_commit, 0);
+	error = VOP_WRITE(vp, &uio, ioflags, cred);
 	crfree(cred);
 
-	pmap_qremove(kva, npages);
-	relpbuf(bp, &ncl_pbuf_freecnt);
-
-	if (error == 0 || !nfs_keep_dirty_on_error) {
+	if (error == 0 || !nfs_keep_dirty_on_error)
 		vnode_pager_undirty_pages(pages, rtvals, count - uio.uio_resid);
-		if (must_commit)
-			ncl_clearcommit(vp->v_mount);
-	}
-	return rtvals[0];
+	return (rtvals[0]);
 }
 
 /*
Re: process killed: text file modification
Konstantin Belousov wrote:
[stuff snipped]
> By 'impossible' I mean some arbitrary combination of bytes which were
> written by many means to the file at arbitrary moments. In other words,
> the file content, or even a single page/block content is not atomic
> WRT the client updates.
Yes. For multiple processes writing the same file, I'd agree that's going
to happen unless the processes use advisory byte range locking to order the
updates.
And, I'm pretty sure a process that does both write(2) syscalls on a file
and modifies pages of it that are mmap()'d will produce "interesting"
results as you describe.

[stuff snipped]
> Or, what seems more likely to me, the code was written on a system where
> buffer cache and page queues are not coherent.
>
> Anyway, my position is that nfs VOP_PUTPAGES() should do write through
> buffer cache, not issuing the direct rpc call with the pages as source.
Hmm. Interesting idea. Since a "struct buf" can only refer to
contiguous bytes, I suspect each page might end up as a separate
"struct buf", at least until some clustering algorithm succeeded in
merging them.

I would agree that it would be nice to change VOP_PUTPAGES(), since
it currently results in a lot of 4K writes (with FILE_SYNC I think?)
and this is normally slow/inefficient for the server. (It would be
interesting to try your suggestion above and see if the pages would
cluster into larger writes. Also, the "struct buf" code knows how to
do UNSTABLE writes followed by a Commit.)
--> I am currently working on a pNFS server (which is coming along fairly
    well), so I have no idea if/when I might get around to trying to do
    this.

> Then your patch would need an update with the mentioned call to ncl_flush().
Yes.

rick
Re: process killed: text file modification
On Tue, Mar 21, 2017 at 02:30:38AM +, Rick Macklem wrote:
> Konstantin Belousov wrote:
> [stuff snipped]
> > Yes, I have to somewhat retract my claims, but then I have another set
> > of surprises.
> Righto.
>
> > I realized (remembered) that nfs has its own VOP_PUTPAGES() method.
> > Implementation seems to directly initiate write RPC request using the
> > pages as the source buffer. I do not see anything in the code which
> > would mark the buffers, which possibly contain the pages, as clean,
> > or mark the buffer range as undirty.
> The only place I know of that the code does this is in the "struct buf's"
> hanging off of v_bufobj.bo_dirty.
> I imagine there would be a race between the write-back to the NFS server
> and further changes to the page by the process. For the most part, the
> VOP_PUTPAGES() is likely to happen after the process is done modifying
> the pages (often exited). For cases where it happens sooner, I would expect
> the page(s) to be written multiple times, but the last write should bring
> the file "up to date" on the server.
>
> > At very least, this might cause unnecessary double-write of the same
> > data. I am not sure if it could cause coherency issues between data
> > written using mappings and write(2). Also, both vm_object_page_clean()
> > and vfs_busy_pages() only ensure the shared-busy state on the pages,
> > so write(2) can occur while pageout sends data to server, causing
> > 'impossible' content transmitted over the wire.
> I'm not sure what you mean by impossible content, but for NFS the only
> time the file on the NFS server should be "up to date" will be after a
> process doing write(2) writing has closed the fd (and only then if options
> like "nocto" have not been used) or after an fsync(2) done by the process
> doing the writing.
By 'impossible' I mean some arbitrary combination of bytes which were
written by many means to the file at arbitrary moments. In other words,
the file content, or even a single page/block content is not atomic
WRT the client updates.

> For mmap'd writing, I think msync(2) is about the only
> thing the process can do to ensure the data is written back to the server.
> (There was a patch to the NFS VOP_CLOSE() that does a vm_object_page_clean()
> but without the OBJPC_SYNC flag which tries to make sure the pages get
> written shortly after the file is closed. Of course, an mmap'd file can
> still be modified by the process after close(2), so it is "just an attempt
> to make the common case work".
> I don't recall, but I don't think I was the author of this patch.)
>
> I also wouldn't be surprised that multiple writes of the same page(s) occur
> under certain situations. (NFS has no rules w.r.t. write ordering. Each RPC
> is separate and simply writes N bytes at file offset S.) It definitely
> happens when there are multiple write(2)s of partial buffers, depending on
> when a sync() happens.
>
> > Could you, please, explain the reasons for such implementation of
> > ncl_putpages() ?
> Well, I wasn't the author (it was just cribbed from the old NFS client and
> I don't know who wrote it), so I'm afraid I don't know. (It's code I avoid
> changing because I don't really understand it.)
>
> I suspect that the author assumed that processes would either mmap the file
> or use write(2) and wouldn't ever try and mix them together.
Or, what seems more likely to me, the code was written on a system where
buffer cache and page queues are not coherent.

Anyway, my position is that nfs VOP_PUTPAGES() should do write through
buffer cache, not issuing the direct rpc call with the pages as source.

Then your patch would need an update with the mentioned call to ncl_flush().
Re: process killed: text file modification
On 2017. 03. 21. 3:40, Rick Macklem wrote:
> Gergely Czuczy wrote:
> [stuff snipped]
>> Actually I want to test it, but you guys are so vehemently discussing
>> it, I thought it would be better to do so, once you guys settled your
>> analysis on the code. Also, me not having the problem occurring, I don't
>> think would mean it's solved, since that would only mean, the codepath
>> for my specific usecase works. There might be other things there as
>> well, what I don't hit.
> I hope by vehemently, you didn't find my comments as nasty. If they did
> come out that way, it was not what I intended and I apologize.
Oh, totally not. I merely meant that you guys are right in the middle of
the technical discussion, and it doesn't seem settled.
>
>> Let me know which patch should I test, and I will see to it in the next
>> couple of days, when I get the time to do it.
> I've attached it here again and, yes, I would agree that the results you
> get from testing are just another data point and not definitive.
> (I'd say this statement is true of all testing of nontrivial code.)
>
> Thanks in advance for any testing you can do, rick
Updated the tree and the patch has applied:

# patch < /home/phoemix/textmod.patch
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|--- fs/nfsclient/nfs_clvnops.c.text 2017-03-16 21:55:16.263393000 -0400
|+++ fs/nfsclient/nfs_clvnops.c 2017-03-17 09:31:23.632814000 -0400
--
Patching file fs/nfsclient/nfs_clvnops.c using Plan A...
Hunk #1 succeeded at 140.
Hunk #2 succeeded at 177.
Hunk #3 succeeded at 3375.
done

When I'm back home from work, I will check the build out, and see how it
goes.

And thank you very much guys for working on fixing this one.

-czg
Re: process killed: text file modification
Gergely Czuczy wrote: [stuff snipped] > Actually I want to test it, but you guys are so vehemently discussing > it, I thought it would be better to do so, once you guys settled your > analysis on the code. Also, me not having the problem occurring, I don't > think would mean it's solved, since that would only mean, the codepath > for my specific usecase works. There might be other things there as > well, what I don't hit. I hope by vehemently, you didn't find my comments as nasty. If they did come out that way, it was not what I intended and I apologize. > Let me know which patch should I test, and I will see to it in the next > couple of days, when I get the time to do it. I've attached it here again and, yes, I would agree that the results you get from testing are just another data point and not definitive. (I'd say this statement is true of all testing of nontrivial code.) Thanks in advance for any testing you can do, rick textmod.patch Description: textmod.patch ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: process killed: text file modification
Konstantin Belousov wrote: [stuff snipped] > Yes, I have to somewhat retract my claims, but then I have another set > of surprises. Righto. > I realized (remembered) that nfs has its own VOP_PUTPAGES() method. > Implementation seems to directly initiate write RPC request using the > pages as the source buffer. I do not see anything in the code which > would mark the buffers, which possibly contain the pages, as clean, > or mark the buffer range as undirty. The only place I know of that the code does this is in the "struct buf's" hanging off of v_bufobj.bo_dirty. I imagine there would be a race between the write-back to the NFS server and further changes to the page by the process. For the most part, the VOP_PUTPAGES() is likely to happen after the process is done modifying the pages (often exited). For cases where it happens sooner, I would expect the page(s) to be written multiple times, but the last write should bring the file "up to date" on the server. > At very least, this might cause unnecessary double-write of the same > data. I am not sure if it could cause coherency issues between data > written using mappings and write(2). Also, both vm_object_page_clean() > and vfs_busy_pages() only ensure the shared-busy state on the pages, > so write(2) can occur while pageout sends data to server, causing > 'impossible' content transmitted over the wire. I'm not sure what you mean by impossible content, but for NFS the only time the file on the NFS server should be "up to date" will be after a file doing write(2) writing has closed the fd (and only then if options like "nocto" has not been used) or after an fsync(2) done by the process doing the writing. For mmap'd writing, I think msync(2) is about the only thing the process can do to ensure the data is written back to the server. (There was a patch to the NFS VOP_CLOSE() that does a vm_object_page_clean() but without the OBJPC_SYNC flag which tries to make sure the pages get written shortly after the file is closed. 
Of course, an mmap'd file can still be modified by the process after close(2), so it is "just an attempt to make the common case work". I don't recall, but I don't think I was the author of this patch.)

I also wouldn't be surprised if multiple writes of the same page(s) occur under certain situations. (NFS has no rules w.r.t. write ordering. Each RPC is separate and simply writes N bytes at file offset S.) It definitely happens when there are multiple write(2)s of partial buffers, depending on when a sync() happens.

> Could you, please, explain the reasons for such implementation of
> ncl_putpage() ?
Well, I wasn't the author (it was just cribbed from the old NFS client and I don't know who wrote it), so I'm afraid I don't know. (It's code I avoid changing because I don't really understand it.)

I suspect that the author assumed that processes would either mmap the file or use write(2) and wouldn't ever try to mix them together.

Hope this helps, rick
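
To make the msync(2) point concrete, here is a minimal userland sketch, assuming a POSIX system. The function name write_mapped_and_sync() is made up for this example; it is not taken from the patch or from Dimitry's test program. It follows the open/ftruncate/mmap/memcpy/munmap/close sequence discussed in the thread, with one msync(MS_SYNC) added before the munmap so the process itself asks for the dirty pages to be written back.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path through a shared mapping
 * and msync(2) them before unmapping. Returns 0 on success, -1 on error.
 */
int
write_mapped_and_sync(const char *path, const void *data, size_t len)
{
    void *p;
    int fd, ret = -1;

    if (len == 0)
        return (-1);    /* mmap(2) rejects zero-length mappings */
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0755);
    if (fd == -1)
        return (-1);
    if (ftruncate(fd, (off_t)len) == -1)
        goto out;
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        goto out;
    memcpy(p, data, len);
    /* MS_SYNC blocks until the dirty pages have been written back. */
    if (msync(p, len, MS_SYNC) == 0)
        ret = 0;
    if (munmap(p, len) == -1)
        ret = -1;
out:
    if (close(fd) == -1)
        ret = -1;
    return (ret);
}
```

Whether this alone avoids the "text file modification" kill on an NFS mount is not something the sketch can show; it only demonstrates the one call that, as far as I know, lets the writer force the write-back from its side.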
Re: process killed: text file modification
On Sun, Mar 19, 2017 at 08:52:50PM +, Rick Macklem wrote: > Kostik wrote: > [stuff snipped] > >> >> Dirty pages are flushed by writes, so if we have a set of dirty pages > >> >> and > >> >> async vm_object_page_clean() is called on the vnode' vm_object, we get > >> >> a bunch of delayed-write AKA dirty buffers. This is possible even after > >> >> VOP_CLOSE() was done, e.g. by syncer performing regular run involving > >> >> vfs_msync(). > >> When I was talking about ncl_flush() above, I was referring to buffer cache > >> buffers written by a write(2) syscall, not the case of mmap'd pages. > > But dirty buffers can appear on the vnode queue due to dirty pages msyncing > > by syncer, for instance. > Ok, just to clarify this, in case I don't understand it... > - You aren't saying that anything will be added to v_bufobj.bo_dirty.bv_hd by > vfs_msync() or similar, after VOP_CLOSE(), right? Yes, I have to somewhat retract my claims, but then I have another set of surprises. I realized (remembered) that nfs has its own VOP_PUTPAGES() method. Implementation seems to directly initiate write RPC request using the pages as the source buffer. I do not see anything in the code which would mark the buffers, which possibly contain the pages, as clean, or mark the buffer range as undirty. At very least, this might cause unnecessary double-write of the same data. I am not sure if it could cause coherency issues between data written using mappings and write(2). Also, both vm_object_page_clean() and vfs_busy_pages() only ensure the shared-busy state on the pages, so write(2) can occur while pageout sends data to server, causing 'impossible' content transmitted over the wire. Could you, please, explain the reasons for such implementation of ncl_putpage() ? > --> ncl_flush() { was called nfs_flush() in the old NFS client } only deals > with > "struct buf's" hanging off v_bufobj.bo_dirty.bv_hd, so I don't see a use > for > it in the patch. 
> > As for pages added to v_bufobj.bo_object...the patch assumes that the > process that was writing the executable file mmap'd is done { normally > exited } before the exec() syscall occurs. If it is still dirtying > pages when the exec() occurs, then failing with "Text file modified" > seems correct to me. As you mentioned, another client can do this to > the file anyhow. > > My understanding is that vm_object_page_clean() will get all the dirty > pages written back to the server at that point and if that is done in > VOP_SET_TEXT() as this patch does, what more can the NFS client do? > > [more stuff snipped] > > Syncer does not open the vnode inside the vfs_msync() operations. > Ok, but this doesn't put "struct buf's" on v_bufobj.bo_dirty.bv_hd. Am I > right? > (When I said "buffers". I meant "struct buf's" under bo_dirty, not stuff under > v_bufobj.bo_object.) > > > We do track writeability to the file, and do not allow execution if there is > > an active writer, be it a file descriptor opened for write, or a writeable > > mapping. And in reverse, if the file is executed (VV_TEXT is set), then > > we disallow opening the file for write. > Yes, and that was why I figured doing this in VOP_SET_TEXT(), just before > setting VV_TEXT, was the right place to do it. > [more stuff snipped] > > > > Thanks for testing the patch. Now, if others can test it...rick > > > Again, hopefully others (especially the original reporter) will be able to > test the patch, rick ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: process killed: text file modification
On 2017. 03. 19. 21:52, Rick Macklem wrote: Kostik wrote: [stuff snipped] Dirty pages are flushed by writes, so if we have a set of dirty pages and async vm_object_page_clean() is called on the vnode' vm_object, we get a bunch of delayed-write AKA dirty buffers. This is possible even after VOP_CLOSE() was done, e.g. by syncer performing regular run involving vfs_msync(). When I was talking about ncl_flush() above, I was referring to buffer cache buffers written by a write(2) syscall, not the case of mmap'd pages. But dirty buffers can appear on the vnode queue due to dirty pages msyncing by syncer, for instance. Ok, just to clarify this, in case I don't understand it... - You aren't saying that anything will be added to v_bufobj.bo_dirty.bv_hd by vfs_msync() or similar, after VOP_CLOSE(), right? --> ncl_flush() { was called nfs_flush() in the old NFS client } only deals with "struct buf's" hanging off v_bufobj.bo_dirty.bv_hd, so I don't see a use for it in the patch. As for pages added to v_bufobj.bo_object...the patch assumes that the process that was writing the executable file mmap'd is done { normally exited } before the exec() syscall occurs. If it is still dirtying pages when the exec() occurs, then failing with "Text file modified" seems correct to me. As you mentioned, another client can do this to the file anyhow. My understanding is that vm_object_page_clean() will get all the dirty pages written back to the server at that point and if that is done in VOP_SET_TEXT() as this patch does, what more can the NFS client do? [more stuff snipped] Syncer does not open the vnode inside the vfs_msync() operations. Ok, but this doesn't put "struct buf's" on v_bufobj.bo_dirty.bv_hd. Am I right? (When I said "buffers". I meant "struct buf's" under bo_dirty, not stuff under v_bufobj.bo_object.) We do track writeability to the file, and do not allow execution if there is an active writer, be it a file descriptor opened for write, or a writeable mapping. 
And in reverse, if the file is executed (VV_TEXT is set), then we disallow opening the file for write. Yes, and that was why I figured doing this in VOP_SET_TEXT(), just before setting VV_TEXT, was the right place to do it. [more stuff snipped] Thanks for testing the patch. Now, if others can test it...rick Again, hopefully others (especially the original reporter) will be able to test the patch, rick Actually I want to test it, but you guys are so vehemently discussing it, I thought it would be better to do so, once you guys settled your analysis on the code. Also, me not having the problem occurring, I don't think would mean it's solved, since that would only mean, the codepath for my specific usecase works. There might be other things there as well, what I don't hit. Let me know which patch should I test, and I will see to it in the next couple of days, when I get the time to do it. Regards, -czg ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: process killed: text file modification
Kostik wrote: [stuff snipped] >> >> Dirty pages are flushed by writes, so if we have a set of dirty pages and >> >> async vm_object_page_clean() is called on the vnode' vm_object, we get >> >> a bunch of delayed-write AKA dirty buffers. This is possible even after >> >> VOP_CLOSE() was done, e.g. by syncer performing regular run involving >> >> vfs_msync(). >> When I was talking about ncl_flush() above, I was referring to buffer cache >> buffers written by a write(2) syscall, not the case of mmap'd pages. > But dirty buffers can appear on the vnode queue due to dirty pages msyncing > by syncer, for instance. Ok, just to clarify this, in case I don't understand it... - You aren't saying that anything will be added to v_bufobj.bo_dirty.bv_hd by vfs_msync() or similar, after VOP_CLOSE(), right? --> ncl_flush() { was called nfs_flush() in the old NFS client } only deals with "struct buf's" hanging off v_bufobj.bo_dirty.bv_hd, so I don't see a use for it in the patch. As for pages added to v_bufobj.bo_object...the patch assumes that the process that was writing the executable file mmap'd is done { normally exited } before the exec() syscall occurs. If it is still dirtying pages when the exec() occurs, then failing with "Text file modified" seems correct to me. As you mentioned, another client can do this to the file anyhow. My understanding is that vm_object_page_clean() will get all the dirty pages written back to the server at that point and if that is done in VOP_SET_TEXT() as this patch does, what more can the NFS client do? [more stuff snipped] > Syncer does not open the vnode inside the vfs_msync() operations. Ok, but this doesn't put "struct buf's" on v_bufobj.bo_dirty.bv_hd. Am I right? (When I said "buffers". I meant "struct buf's" under bo_dirty, not stuff under v_bufobj.bo_object.) > We do track writeability to the file, and do not allow execution if there is > an active writer, be it a file descriptor opened for write, or a writeable > mapping. 
And in reverse, if the file is executed (VV_TEXT is set), then > we disallow opening the file for write. Yes, and that was why I figured doing this in VOP_SET_TEXT(), just before setting VV_TEXT, was the right place to do it. [more stuff snipped] > > Thanks for testing the patch. Now, if others can test it...rick > Again, hopefully others (especially the original reporter) will be able to test the patch, rick ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
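
The writer/executor exclusion described above can be modelled in a few lines of userland C. This is only a toy: struct toy_vnode and the toy_* functions are invented for the illustration and are not kernel interfaces, and returning ETXTBSY for both refusals is my assumption about the errno a caller would see.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for a vnode; field names are invented for this sketch. */
struct toy_vnode {
    int  writers;   /* open-for-write descriptors plus writable mappings */
    bool text;      /* models VV_TEXT: the file is being executed */
};

/* Model of opening for write: refused while the file is executing. */
int
toy_open_write(struct toy_vnode *vp)
{
    if (vp->text)
        return (ETXTBSY);
    vp->writers++;
    return (0);
}

/* Model of dropping a writer (closing the fd or unmapping the file). */
void
toy_close_write(struct toy_vnode *vp)
{
    if (vp->writers > 0)
        vp->writers--;
}

/* Model of starting execution: refused while any writer is active. */
int
toy_vnode_set_text(struct toy_vnode *vp)
{
    if (vp->writers > 0)
        return (ETXTBSY);
    vp->text = true;
    return (0);
}
```

Note what the model leaves out: it tracks writers, not dirty data. A file can have zero writers and still have pages waiting to be written back, which is the gap the VOP_SET_TEXT() flush is meant to close. It also cannot see a writer on another NFS client.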
Re: process killed: text file modification
On Fri, Mar 17, 2017 at 09:23:00PM +, Rick Macklem wrote: > Dimitry Andric wrote: > >On 17 Mar 2017, at 15:19, Konstantin Belousov wrote: > >> > >> On Fri, Mar 17, 2017 at 01:53:46PM +, Rick Macklem wrote: > >>> Well, I don't mind adding ncl_flush(), but it shouldn't be > >>> necessary. I actually had it in the first > >>> rendition of the patch, but took it out because it should happen > >>> on the VOP_CLOSE() if any writing to the buffer cache happened > >>> and that code hasn't changed in many years. > >> Dirty pages are flushed by writes, so if we have a set of dirty pages and > >> async vm_object_page_clean() is called on the vnode' vm_object, we get > >> a bunch of delayed-write AKA dirty buffers. This is possible even after > >> VOP_CLOSE() was done, e.g. by syncer performing regular run involving > >> vfs_msync(). > When I was talking about ncl_flush() above, I was referring to buffer cache > buffers written by a write(2) syscall, not the case of mmap'd pages. But dirty buffers can appear on the vnode queue due to dirty pages msyncing by syncer, for instance. > > >> > >> I agree that the patch would not create new dirty buffers, but it is > >> possible > >> to get them by other means. > To write to a buffer cache block, the file would be opened by another > thread and that is what this sanity check was meant to catch. As > for dirtying pages that are mmap'd, as far as I understand it, the > NFS client has no way of knowing if this will happen more until > VOP_INACTIVE() is called on the vnode. > Syncer does not open the vnode inside the vfs_msync() operations. > To be honest, this check could easily be deleted. After all, NFS could care > less if a file > is being executed (all it sees are reads and writes). Without the check, the > executable > might do "interesting" things;-) > [stuff snipped] > > FWIW, Rick's patch seems to do the trick, both for my test case and lld > > itself. And even with vfs.timestamp_precision=3 on both server and > > client. 
> Hopefully the original reporter of the problem (Gergely ??) can test the > patch as well. > I think the patch is pretty harmless, although it could be argued that setting > np->m_mtime = np->n_nattr.na_mtime (or close to that) > shouldn't happen for the case where there isn't any dirty pages found to > flush. > However, once a file mmap'd we don't know when it does get modified anyhow > (as discussed above), so setting it here doesn't seem harmful to me. We do track writeability to the file, and do not allow execution if there is an active writer, be it a file descriptor opened for write, or a writeable mapping. And in reverse, if the file is executed (VV_TEXT is set), then we disallow opening the file for write. Of course, this cannot work when we execute file on one NFS client, and another client modifies the file, but then exactly this check and kill should provide some sanity. > > Thanks for testing the patch. Now, if others can test it...rick > ___ freebsd-current@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
Re: process killed: text file modification
Dimitry Andric wrote:
>On 17 Mar 2017, at 15:19, Konstantin Belousov wrote:
>>
>> On Fri, Mar 17, 2017 at 01:53:46PM +, Rick Macklem wrote:
>>> Well, I don't mind adding ncl_flush(), but it shouldn't be
>>> necessary. I actually had it in the first
>>> rendition of the patch, but took it out because it should happen
>>> on the VOP_CLOSE() if any writing to the buffer cache happened
>>> and that code hasn't changed in many years.
>> Dirty pages are flushed by writes, so if we have a set of dirty pages and
>> async vm_object_page_clean() is called on the vnode' vm_object, we get
>> a bunch of delayed-write AKA dirty buffers. This is possible even after
>> VOP_CLOSE() was done, e.g. by syncer performing regular run involving
>> vfs_msync().
When I was talking about ncl_flush() above, I was referring to buffer cache buffers written by a write(2) syscall, not the case of mmap'd pages.

>>
>> I agree that the patch would not create new dirty buffers, but it is possible
>> to get them by other means.
To write to a buffer cache block, the file would be opened by another thread and that is what this sanity check was meant to catch. As for dirtying pages that are mmap'd, as far as I understand it, the NFS client has no way of knowing if this will happen more until VOP_INACTIVE() is called on the vnode.

To be honest, this check could easily be deleted. After all, NFS could care less if a file is being executed (all it sees are reads and writes). Without the check, the executable might do "interesting" things;-)

[stuff snipped]
> FWIW, Rick's patch seems to do the trick, both for my test case and lld
> itself. And even with vfs.timestamp_precision=3 on both server and
> client.
Hopefully the original reporter of the problem (Gergely ??) can test the patch as well.

I think the patch is pretty harmless, although it could be argued that setting
    np->n_mtime = np->n_vattr.na_mtime (or close to that)
shouldn't happen for the case where there aren't any dirty pages found to flush. However, once a file is mmap'd we don't know when it does get modified anyhow (as discussed above), so setting it here doesn't seem harmful to me.

Thanks for testing the patch. Now, if others can test it...rick
Re: process killed: text file modification
On 17 Mar 2017, at 15:19, Konstantin Belousov wrote:
>
> On Fri, Mar 17, 2017 at 01:53:46PM +, Rick Macklem wrote:
>> Well, I don't mind adding ncl_flush(), but it shouldn't be
>> necessary. I actually had it in the first
>> rendition of the patch, but took it out because it should happen
>> on the VOP_CLOSE() if any writing to the buffer cache happened
>> and that code hasn't changed in many years.
> Dirty pages are flushed by writes, so if we have a set of dirty pages and
> async vm_object_page_clean() is called on the vnode' vm_object, we get
> a bunch of delayed-write AKA dirty buffers. This is possible even after
> VOP_CLOSE() was done, e.g. by syncer performing regular run involving
> vfs_msync().
>
> I agree that the patch would not create new dirty buffers, but it is possible
> to get them by other means.
>
>>
>> What the patch was missing was updating n_mtime after the dirty
>> page flush.
>>
>> Btw, a flush without OBJPC_SYNC happens when the file is VOP_CLOSE()'d
>> unless the default value of vfs.nfs.clean_pages_on_close is changed, which
>> I think is why the 1sec resolution always seemed to work, at least for the
>> example where there was an munmap before close.
>>
>> Attached is an updated version with that in it, rick

FWIW, Rick's patch seems to do the trick, both for my test case and lld itself. And even with vfs.timestamp_precision=3 on both server and client.

-Dimitry
Re: process killed: text file modification
On Fri, Mar 17, 2017 at 01:53:46PM +, Rick Macklem wrote:
> Well, I don't mind adding ncl_flush(), but it shouldn't be
> necessary. I actually had it in the first
> rendition of the patch, but took it out because it should happen
> on the VOP_CLOSE() if any writing to the buffer cache happened
> and that code hasn't changed in many years.
Dirty pages are flushed by writes, so if we have a set of dirty pages and async vm_object_page_clean() is called on the vnode's vm_object, we get a bunch of delayed-write AKA dirty buffers. This is possible even after VOP_CLOSE() was done, e.g. by the syncer performing a regular run involving vfs_msync().

I agree that the patch would not create new dirty buffers, but it is possible to get them by other means.

>
> What the patch was missing was updating n_mtime after the dirty
> page flush.
>
> Btw, a flush without OBJPC_SYNC happens when the file is VOP_CLOSE()'d
> unless the default value of vfs.nfs.clean_pages_on_close is changed, which
> I think is why the 1sec resolution always seemed to work, at least for the
> example where there was an munmap before close.
>
> Attached is an updated version with that in it, rick
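
One plausible reading of why 1-second resolution "always seemed to work" is that two write-backs landing close together get the same mtime once the timestamp is coarse enough. The sketch below only illustrates that arithmetic. The enum values follow the vfs.timestamp_precision settings as I recall them (0, 2 and 3; mode 1 is tied to HZ and is left out), and toy_truncate() and toy_mtime_changed() are invented names, not kernel functions.

```c
#include <stdbool.h>
#include <time.h>

/* Modes numbered after vfs.timestamp_precision; an assumption, see above. */
enum toy_precision {
    TOY_SEC = 0,    /* whole seconds only */
    TOY_USEC = 2,   /* nanoseconds truncated to microseconds */
    TOY_NSEC = 3    /* full nanoseconds */
};

/* Reduce a timestamp to the resolution selected by mode. */
struct timespec
toy_truncate(struct timespec ts, enum toy_precision mode)
{
    if (mode == TOY_SEC)
        ts.tv_nsec = 0;
    else if (mode == TOY_USEC)
        ts.tv_nsec -= ts.tv_nsec % 1000;
    return (ts);
}

/* Would an mtime comparison at this resolution notice a difference? */
bool
toy_mtime_changed(struct timespec a, struct timespec b,
    enum toy_precision mode)
{
    a = toy_truncate(a, mode);
    b = toy_truncate(b, mode);
    return (a.tv_sec != b.tv_sec || a.tv_nsec != b.tv_nsec);
}
```

With these definitions, two stamps 700 ms apart inside the same second compare equal in TOY_SEC mode and different in the finer modes, which matches the observation that the problem shows up with higher precision settings.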
Re: process killed: text file modification
Well, I don't mind adding ncl_flush(), but it shouldn't be necessary. I actually had it in the first rendition of the patch, but took it out because it should happen on the VOP_CLOSE() if any writing to the buffer cache happened and that code hasn't changed in many years.

What the patch was missing was updating n_mtime after the dirty page flush.

Btw, a flush without OBJPC_SYNC happens when the file is VOP_CLOSE()'d unless the default value of vfs.nfs.clean_pages_on_close is changed, which I think is why the 1sec resolution always seemed to work, at least for the example where there was an munmap before close.

Attached is an updated version with that in it, rick

From: owner-freebsd-curr...@freebsd.org on behalf of Konstantin Belousov
Sent: Friday, March 17, 2017 4:36:05 AM
To: Rick Macklem
Cc: Dimitry Andric; Ian Lepore; Gergely Czuczy; FreeBSD Current
Subject: Re: process killed: text file modification

On Fri, Mar 17, 2017 at 03:10:57AM +, Rick Macklem wrote:
> Hope you don't mind a top post...
> Attached is a little patch you could test maybe?
>
> rick
>
> From: owner-freebsd-curr...@freebsd.org on behalf of Rick Macklem
> Sent: Thursday, March 16, 2017 9:57:23 PM
> To: Dimitry Andric; Ian Lepore
> Cc: Gergely Czuczy; FreeBSD Current
> Subject: Re: process killed: text file modification
>
> Dimitry Andric wrote:
> [lots of stuff snipped]
> > I'm also running into this problem, but while using lld. I must set
> > vfs.timestamp_precision to 1 (e.g. sec + ns accurate to 1/HZ) on both
> > the client and the server, to make it work.
> >
> > Instead of GNU ld, lld uses mmap to write to the output executable. If
> > this executable is more than one page, and resides on an NFS share,
> > running it will almost always result in "text file modification", if
> > vfs.timestamp_precision >= 2.
> > > > A small test case: http://www.andric.com/freebsd/test-mmap-write.c, > > which writes a simple "hello world" i386-freebsd executable file, using > > the sequence: open() -> ftruncate() -> mmap() -> memcpy() -> munmap() -> > > close(). > Hopefully Kostik will correct me if I have this wrong, but I don't believe any > of the above syscalls guarantee that dirty pages have been flushed. > At least for cases without munmap(), the writes of dirty pages can occur after > the file descriptor is closed. I run into this in NFSv4, where there is a > Close (NFSv4 one) > that can't be done until VOP_INACTIVE(). > If you look in the NFS VOP_INACTIVE() { called ncl_inactive() } you'll see: > if (NFS_ISV4(vp) && vp->v_type == VREG) { > 237 /* > 238 * Since mmap()'d files do I/O after VOP_CLOSE(), the > NFSv4 > 239 * Close operations are delayed until now. Any dirty > 240 * buffers/pages must be flushed before the close, so > that the > 241 * stateid is available for the writes. > 242 */ > 243 if (vp->v_object != NULL) { > 244 VM_OBJECT_WLOCK(vp->v_object); > 245 retv = vm_object_page_clean(vp->v_object, 0, > 0, > 246 OBJPC_SYNC); > 247 VM_OBJECT_WUNLOCK(vp->v_object); > 248 } else > 249 retv = TRUE; > 250 if (retv == TRUE) { > 251 (void)ncl_flush(vp, MNT_WAIT, NULL, ap->a_td, > 1, 0); > 252 (void)nfsrpc_close(vp, 1, ap->a_td); > 253 } > 254 } > Note that nothing like this is done for NFSv3. > What might work is implementing a VOP_SET_TEXT() vnode op for the NFS > client that does most of the above (except for nfsrpc_close()) and then sets > VV_TEXT. > --> That way, all the dirty pages will be flushed to the server before the > executable > starts executing. > > > Running this on an NFS share, and then attempting to run the resulting > > 'helloworld' executable will result in the "text file modification" > > error, and it will be killed. But if you simply copy the executable to > > something else, then it runs fine, even if you use -p to retain the > > properties! 
> > > > IMHO this is a rather surprising problem with the NFS code, and Kostik > > remarked that the problem seems to be that the VV_TEXT flag is set too > > early, before the nfs cache is invalidated. Rick, do you have any ideas > &
Re: process killed: text file modification
On Fri, Mar 17, 2017 at 03:10:57AM +, Rick Macklem wrote: > Hope you don't mind a top post... > Attached is a little patch you could test maybe? > > rick > > From: owner-freebsd-curr...@freebsd.org > on behalf of Rick Macklem > Sent: Thursday, March 16, 2017 9:57:23 PM > To: Dimitry Andric; Ian Lepore > Cc: Gergely Czuczy; FreeBSD Current > Subject: Re: process killed: text file modification > > Dimitry Andric wrote: > [lots of stuff snipped] > > I'm also running into this problem, but while using lld. I must set > > vfs.timestamp_precision to 1 (e.g. sec + ns accurate to 1/HZ) on both > > the client and the server, to make it work. > > > > Instead of GNU ld, lld uses mmap to write to the output executable. If > > this executable is more than one page, and resides on an NFS share, > > running it will almost always result in "text file modification", if > > vfs_timestamp_precision >= 2. > > > > A small test case: http://www.andric.com/freebsd/test-mmap-write.c, > > which writes a simple "hello world" i386-freebsd executable file, using > > the sequence: open() -> ftruncate() -> mmap() -> memcpy() -> munmap() -> > > close(). > Hopefully Kostik will correct me if I have this wrong, but I don't believe any > of the above syscalls guarantee that dirty pages have been flushed. > At least for cases without munmap(), the writes of dirty pages can occur after > the file descriptor is closed. I run into this in NFSv4, where there is a > Close (NFSv4 one) > that can't be done until VOP_INACTIVE(). > If you look in the NFS VOP_INACTIVE() { called ncl_inactive() } you'll see: > if (NFS_ISV4(vp) && vp->v_type == VREG) { > 237 /* > 238 * Since mmap()'d files do I/O after VOP_CLOSE(), the > NFSv4 > 239 * Close operations are delayed until now. Any dirty > 240 * buffers/pages must be flushed before the close, so > that the > 241 * stateid is available for the writes. 
> 242 */ > 243 if (vp->v_object != NULL) { > 244 VM_OBJECT_WLOCK(vp->v_object); > 245 retv = vm_object_page_clean(vp->v_object, 0, > 0, > 246 OBJPC_SYNC); > 247 VM_OBJECT_WUNLOCK(vp->v_object); > 248 } else > 249 retv = TRUE; > 250 if (retv == TRUE) { > 251 (void)ncl_flush(vp, MNT_WAIT, NULL, ap->a_td, > 1, 0); > 252 (void)nfsrpc_close(vp, 1, ap->a_td); > 253 } > 254 } > Note that nothing like this is done for NFSv3. > What might work is implementing a VOP_SET_TEXT() vnode op for the NFS > client that does most of the above (except for nfsrpc_close()) and then sets > VV_TEXT. > --> That way, all the dirty pages will be flushed to the server before the > executable > starts executing. > > > Running this on an NFS share, and then attempting to run the resulting > > 'helloworld' executable will result in the "text file modification" > > error, and it will be killed. But if you simply copy the executable to > > something else, then it runs fine, even if you use -p to retain the > > properties! > > > > IMHO this is a rather surprising problem with the NFS code, and Kostik > > remarked that the problem seems to be that the VV_TEXT flag is set too > > early, before the nfs cache is invalidated. Rick, do you have any ideas > > about this? > I don't think it is the "nfs cache" that needs invalidation, but the dirty > pages written by the mmap'd file need to be flushed, before the VV_TEXT > is set, I think? > > If Kostik meant "attribute cache" when he said "nfs cache", I'll note that the > cached attributes (including mtime) are updated by the reply to every write. > (As such, I think it is the writes that must be done before setting VV_TEXT > that is needed.) > > It is a fairly simple patch to create. I'll post one to this thread in a day > or so > unless Kostik thinks this isn't correct and not worth trying. > I think that the patch is in right direction, but I am not convinced that it is enough. 
What we need is to ensure that mtime on the server does not change after VV_TEXT is set. Dirty pages would indeed cause an mtime update on flush, but wouldn't dirty buffer writes cause the same issue?

In other words, I think that an enhanced VOP_SET_TEXT() for nfs must flush everything, to ensure that mtime on the server would not change due to further actions by this machine's nfs client.
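
The ordering argument can be shown with a small userland model. Everything here is hypothetical: struct toy_nfsnode and the toy_* functions are invented for the illustration, the "mtime" is just a counter bumped by each simulated write RPC, and the real patch works on the vnode with vm_object_page_clean() (and, per this thread, possibly ncl_flush()) rather than on counters.

```c
#include <stdbool.h>

/* Toy model; none of these names exist in the kernel. */
struct toy_nfsnode {
    long server_mtime;  /* mtime as the server currently has it */
    long n_mtime;       /* mtime the client remembered at set-text */
    int  dirty_pages;   /* mmap'd pages not yet written back */
    int  dirty_bufs;    /* buffer cache buffers not yet written back */
    bool text;          /* models VV_TEXT */
};

/* Each write RPC that reaches the server bumps the server's mtime. */
static void
toy_write_rpc(struct toy_nfsnode *np)
{
    np->server_mtime++;
}

/* Push out everything that is dirty: pages first, then buffers. */
void
toy_flush_all(struct toy_nfsnode *np)
{
    for (; np->dirty_pages > 0; np->dirty_pages--)
        toy_write_rpc(np);
    for (; np->dirty_bufs > 0; np->dirty_bufs--)
        toy_write_rpc(np);
}

/* Set-text, optionally flushing first as proposed in the thread. */
void
toy_nfs_set_text(struct toy_nfsnode *np, bool flush_first)
{
    if (flush_first)
        toy_flush_all(np);
    np->n_mtime = np->server_mtime;
    np->text = true;
}

/* The sanity check: has the server's mtime moved since set-text? */
bool
toy_text_modified(const struct toy_nfsnode *np)
{
    return (np->text && np->n_mtime != np->server_mtime);
}
```

In the model, setting text without flushing and then letting a late write-back run makes toy_text_modified() true, which corresponds to the process being killed. Flushing both pages and buffers first leaves nothing for a later write-back to send, so the remembered mtime stays valid as long as no other client touches the file.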
Re: process killed: text file modification
Hope you don't mind a top post...
Attached is a little patch you could test maybe?

rick

From: owner-freebsd-curr...@freebsd.org on behalf of Rick Macklem
Sent: Thursday, March 16, 2017 9:57:23 PM
To: Dimitry Andric; Ian Lepore
Cc: Gergely Czuczy; FreeBSD Current
Subject: Re: process killed: text file modification

It is a fairly simple patch to create. I'll post one to this thread in a day or so unless Kostik thinks this isn't correct and not worth trying.

rick

textmod.patch
Description: textmod.patch
Re: process killed: text file modification
Dimitry Andric wrote:
[lots of stuff snipped]
> I'm also running into this problem, but while using lld. I must set
> vfs.timestamp_precision to 1 (e.g. sec + ns accurate to 1/HZ) on both
> the client and the server, to make it work.
>
> Instead of GNU ld, lld uses mmap to write to the output executable. If
> this executable is more than one page, and resides on an NFS share,
> running it will almost always result in "text file modification", if
> vfs_timestamp_precision >= 2.
>
> A small test case: http://www.andric.com/freebsd/test-mmap-write.c,
> which writes a simple "hello world" i386-freebsd executable file, using
> the sequence: open() -> ftruncate() -> mmap() -> memcpy() -> munmap() ->
> close().

Hopefully Kostik will correct me if I have this wrong, but I don't believe any of the above syscalls guarantee that dirty pages have been flushed. At least for cases without munmap(), the writes of dirty pages can occur after the file descriptor is closed. I run into this in NFSv4, where there is a Close (NFSv4 one) that can't be done until VOP_INACTIVE().

If you look in the NFS VOP_INACTIVE() { called ncl_inactive() } you'll see:

    if (NFS_ISV4(vp) && vp->v_type == VREG) {
        /*
         * Since mmap()'d files do I/O after VOP_CLOSE(), the NFSv4
         * Close operations are delayed until now. Any dirty
         * buffers/pages must be flushed before the close, so that the
         * stateid is available for the writes.
         */
        if (vp->v_object != NULL) {
            VM_OBJECT_WLOCK(vp->v_object);
            retv = vm_object_page_clean(vp->v_object, 0, 0,
                OBJPC_SYNC);
            VM_OBJECT_WUNLOCK(vp->v_object);
        } else
            retv = TRUE;
        if (retv == TRUE) {
            (void)ncl_flush(vp, MNT_WAIT, NULL, ap->a_td, 1, 0);
            (void)nfsrpc_close(vp, 1, ap->a_td);
        }
    }

Note that nothing like this is done for NFSv3.

What might work is implementing a VOP_SET_TEXT() vnode op for the NFS client that does most of the above (except for nfsrpc_close()) and then sets VV_TEXT.
--> That way, all the dirty pages will be flushed to the server before the executable starts executing.

> Running this on an NFS share, and then attempting to run the resulting
> 'helloworld' executable will result in the "text file modification"
> error, and it will be killed. But if you simply copy the executable to
> something else, then it runs fine, even if you use -p to retain the
> properties!
>
> IMHO this is a rather surprising problem with the NFS code, and Kostik
> remarked that the problem seems to be that the VV_TEXT flag is set too
> early, before the nfs cache is invalidated. Rick, do you have any ideas
> about this?

I don't think it is the "nfs cache" that needs invalidation, but the dirty pages written by the mmap'd file need to be flushed, before the VV_TEXT is set, I think?

If Kostik meant "attribute cache" when he said "nfs cache", I'll note that the cached attributes (including mtime) are updated by the reply to every write. (As such, I think it is the writes that must be done before setting VV_TEXT that is needed.)

It is a fairly simple patch to create. I'll post one to this thread in a day or so unless Kostik thinks this isn't correct and not worth trying.

rick
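For a user-space writer that follows the open() -> ftruncate() -> mmap() -> memcpy() -> munmap() -> close() sequence quoted above, one thing that could be tried is an explicit msync() with MS_SYNC before munmap(), which asks the kernel to write the dirty pages back synchronously. The following is a sketch under assumptions: `write_via_mmap()` is a hypothetical helper, and whether this avoids the "text file modification" kill on a given NFS client is not something this thread confirms.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to path through a shared mapping,
 * forcing write-back with msync(MS_SYNC) before unmapping.
 * Returns 0 on success, -1 on failure.
 */
int write_via_mmap(const char *path, const void *data, size_t len)
{
    int fd, rc = -1;
    void *p;

    if (len == 0)
        return -1;      /* zero-length mappings are not allowed */
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0755);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) == 0) {
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            memcpy(p, data, len);
            /* Write the dirty pages now rather than at some later time. */
            if (msync(p, len, MS_SYNC) == 0)
                rc = 0;
            if (munmap(p, len) != 0)
                rc = -1;
        }
    }
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

A caller would use it as `write_via_mmap("helloworld", image, image_len)`, where `image` and `image_len` are placeholders for the executable contents. This only changes the writer; it does not replace a client-side fix such as the VOP_SET_TEXT() approach described above.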
Re: process killed: text file modification
On 2017. 03. 09. 20:47, Gergely Czuczy wrote:
> On 2017. 03. 09. 19:44, John Baldwin wrote:
>> Can you print out the two mtimes?
>
> Here are the time values:
> Mar 9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 + -3298114786336 &np->n_vattr.na_mtime: -3298114786616 + -3298114786608
> Mar 9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed: text file modification
> Mar 9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 + -3298114786336 &np->n_vattr.na_mtime: -3298114786616 + -3298114786608
> Mar 9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed: text file modification
>
> Printed this way:
> printf("np->n_mtime: %ji + %ji &np->n_vattr.na_mtime: %ji + %ji",
>     (intmax_t)(&np->n_mtime.tv_sec), (intmax_t)(&np->n_mtime.tv_nsec),
>     (intmax_t)(&np->n_vattr.na_mtime.tv_sec),
>     (intmax_t)(&np->n_vattr.na_mtime.tv_nsec));

Sorry, I made a typo there. Here it is now:

Mar 9 20:05:35 rpi3 kernel: np->n_mtime: 1489089935 + 219323000 &np->n_vattr.na_mtime: 1489089935 + 221438000
Mar 9 20:05:35 rpi3 kernel: pid 847 (csh), uid 0, was killed: text file modification
Mar 9 20:05:35 rpi3 kernel: np->n_mtime: 1489089935 + 219323000 &np->n_vattr.na_mtime: 1489089935 + 221438000
Mar 9 20:05:35 rpi3 kernel: pid 847 (csh), uid 0, was killed: text file modification

That's a difference of 2115 microseconds.
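The 2115 microsecond figure can be checked from the two logged timestamps. A minimal sketch, where `timespec_diff_ns()` is a hypothetical helper rather than anything from the NFS client code:

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: signed difference b - a in nanoseconds. */
int64_t timespec_diff_ns(struct timespec a, struct timespec b)
{
    return ((int64_t)b.tv_sec - (int64_t)a.tv_sec) * INT64_C(1000000000) +
        ((int64_t)b.tv_nsec - (int64_t)a.tv_nsec);
}
```

With n_mtime = 1489089935 s + 219323000 ns and na_mtime = 1489089935 s + 221438000 ns, the seconds cancel and the difference is 221438000 - 219323000 = 2115000 ns, that is, 2115 microseconds.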
Re: process killed: text file modification
On 2017. 03. 09. 19:44, John Baldwin wrote:
> Can you print out the two mtimes? I wonder if what's happening is that
> your server uses different granularity (for example just seconds) than
> your client, so on the client we generate a timestamp with a non-zero
> nanoseconds but when the server receives that timestamp it "truncates"
> it. During open() we forcefully re-fetch the timestamp (for CTO
> consistency) and then notice it doesn't match.
>
> For now I would start with comparing the timestamps and maybe the
> vfs.timestamp_precision sysctls on client and server (if server is a
> FreeBSD box).

Here are the time values:

Mar 9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 + -3298114786336 &np->n_vattr.na_mtime: -3298114786616 + -3298114786608
Mar 9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed: text file modification
Mar 9 19:46:01 rpi3 kernel: np->n_mtime: -3298114786344 + -3298114786336 &np->n_vattr.na_mtime: -3298114786616 + -3298114786608
Mar 9 19:46:01 rpi3 kernel: pid 912 (csh), uid 0, was killed: text file modification

Printed this way:

    printf("np->n_mtime: %ji + %ji &np->n_vattr.na_mtime: %ji + %ji",
        (intmax_t)(&np->n_mtime.tv_sec), (intmax_t)(&np->n_mtime.tv_nsec),
        (intmax_t)(&np->n_vattr.na_mtime.tv_sec),
        (intmax_t)(&np->n_vattr.na_mtime.tv_nsec));
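The logged numbers are not plausible epoch times because the casts in that printf() are applied to the addresses of the fields (`&np->n_mtime.tv_sec`) rather than to the fields themselves. A minimal user-space sketch of the intended formatting follows; `format_mtimes()` is a hypothetical helper, and the `intmax_t` casts with `%jd` are used so the code does not depend on the width of `time_t`.

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: format two timestamps as "sec + nsec" pairs.
 * The casts apply to the field values, not to their addresses.
 * Returns the snprintf() result.
 */
int format_mtimes(char *buf, size_t len, const struct timespec *n_mtime,
    const struct timespec *na_mtime)
{
    return snprintf(buf, len,
        "np->n_mtime: %jd + %jd &np->n_vattr.na_mtime: %jd + %jd",
        (intmax_t)n_mtime->tv_sec, (intmax_t)n_mtime->tv_nsec,
        (intmax_t)na_mtime->tv_sec, (intmax_t)na_mtime->tv_nsec);
}
```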
Re: process killed: text file modification
On Thursday, March 09, 2017 03:31:56 PM Gergely Czuczy wrote:
> [+freebsd-fs]
>
> So, this error message comes from here:
> https://svnweb.freebsd.org/base/head/sys/fs/nfsclient/nfs_clbio.c?revision=314436&view=markup#l1674
>
> It's the NFS_TIMESPEC_COMPARE(&np->n_mtime, &np->n_vattr.na_mtime)
> comparison that fails, np should be the NFS node structure, from the
> vnode's v_data, and n_vattr is the attribute cache. As I've seen these
> two are being updated together, so I don't really see by the code why
> they might differ. Could someone please take a look at it, with more
> experience in the NFS code?
>
> -czg

Can you print out the two mtimes? I wonder if what's happening is that your server uses a different granularity (for example just seconds) than your client, so on the client we generate a timestamp with non-zero nanoseconds, but when the server receives that timestamp it "truncates" it. During open() we forcefully re-fetch the timestamp (for CTO consistency) and then notice it doesn't match.

For now I would start with comparing the timestamps and maybe the vfs.timestamp_precision sysctls on client and server (if server is a FreeBSD box).

-- 
John Baldwin
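The granularity hypothesis can be shown with a small user-space sketch. `truncate_timespec()` and `mtime_differs()` are hypothetical helpers, and the granularity values are illustrative rather than the exact behaviour of any particular vfs.timestamp_precision setting: a timestamp generated with nanosecond resolution no longer compares equal once the other side has rounded it down to a coarser unit.

```c
#include <stdbool.h>
#include <time.h>

/*
 * Hypothetical helper: round ts down to a multiple of gran_ns nanoseconds
 * within the second. gran_ns is assumed to divide 1000000000; passing
 * 1000000000 itself keeps whole seconds only.
 */
struct timespec truncate_timespec(struct timespec ts, long gran_ns)
{
    if (gran_ns > 0)
        ts.tv_nsec -= ts.tv_nsec % gran_ns;
    return ts;
}

/* True when the two timestamps differ in either field. */
bool mtime_differs(const struct timespec *a, const struct timespec *b)
{
    return a->tv_sec != b->tv_sec || a->tv_nsec != b->tv_nsec;
}
```

For example, a client-side value of 1489089935 s + 219323456 ns becomes 1489089935 s + 0 ns at one-second granularity and 1489089935 s + 219323000 ns at one-microsecond granularity; either result differs from the original under an exact comparison.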
Re: process killed: text file modification
[+freebsd-fs]

On 2017. 03. 09. 14:20, Gergely Czuczy wrote:
> So far, a few additions:
> Time is synced between the NFS server and the client.
> it's an open() call which is getting the kill, and it's not the file
> what's being opened, but the process executing it.

So, this error message comes from here:
https://svnweb.freebsd.org/base/head/sys/fs/nfsclient/nfs_clbio.c?revision=314436&view=markup#l1674

It's the NFS_TIMESPEC_COMPARE(&np->n_mtime, &np->n_vattr.na_mtime) comparison that fails. np should be the NFS node structure, from the vnode's v_data, and n_vattr is the attribute cache. As far as I've seen, these two are updated together, so I don't really see from the code why they might differ. Could someone with more experience in the NFS code please take a look at it?

-czg
Re: process killed: text file modification
On 2017. 03. 09. 11:27, Gergely Czuczy wrote:
> I'm trying to build a few things from ports on an rpi3, the ports
> collection is mounted over NFS from another machine. When it's trying
> to build pkg i'm getting the error message in syslog:
>
> rpi3 kernel: pid 4451 (sh), uid 0, was killed: text file modification

So far, a few additions:
- Time is synced between the NFS server and the client.
- It's an open() call which is getting the kill, and it's not the file that's being opened, but the process executing it.

Here's a simple program that reproduces it:

    #include <stdio.h>

    int main(void) {
        /* Any open() from the freshly built binary triggers the kill. */
        FILE *f = fopen("/bar", "w");

        if (f == NULL)
            return 1;
        fclose(f);
        return 0;
    }

Conditions to reproduce it:
- The resulting binary must be executed from the NFS mount.
- The binary must be built after mounting the NFS share.

I haven't tried building it on a different host, I don't have access to multiple RPis. Also, if I build the binary, umount/remount the NFS mount point which has the binary, and then execute it, it works.

I've also tried this with the raspbsd.org image, and I could reproduce it as well.

Another interesting thing is that when I first booted the RPi up, the NFS server was a 10.2-STABLE, and later got updated to 11-STABLE. While it was 10.2 I tried to build some port, and I don't remember having this issue.

So, could someone please help me figure this out and fix it? This stuff should work pretty much.
process killed: text file modification
Hello,

I'm trying to build a few things from ports on an rpi3; the ports collection is mounted over NFS from another machine. When it tries to build pkg, I get this error message in syslog:

rpi3 kernel: pid 4451 (sh), uid 0, was killed: text file modification

The report to pkg@: https://lists.freebsd.org/pipermail/freebsd-pkg/2017-March/002048.html

In ports-mgmt/pkg's config.log, it fails at the following entry:

configure:3726: checking whether we are cross compiling
configure:3734: cc -o conftest -O2 -pipe -Wno-error -fno-strict-aliasing conftest.c >&5
configure:3738: $? = 0
configure:3745: ./conftest
configure:3749: $? = 137
configure:3756: error: in `/usr/ports/ports-mgmt/pkg/work/pkg-1.10.0':
configure:3760: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details

# uname -a
FreeBSD rpi3 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r314949: Thu Mar 9 08:58:46 CET 2017 ae...@marvin.harmless.hu:/tank/rpi3/crochet/work/obj/arm64.aarch64/tank/rpi3/src/sys/AEGIR arm64

I have no idea what's causing it; it should pretty much work out of the box. Could someone please explain to me what's going on here, what's causing it, and how I can fix it?

Best regards,
-czg