Re: How to determine if a file is in use
On Tue, Nov 3, 2009 at 20:49, Cameron Simpson wrote:

> On 03Nov2009 13:31, Donald Russell wrote:
> | Another system uses FTP to drop files in a directory for me to process.
> | I have a bash script to process the incoming files. The script is started by
> | cron periodically.
> |
> | There's a problem if the FTP transfer is still in progress because the
> | process begins reading the file even though it isn't complete yet.
>
> I liked the upload-then-rename suggested by another poster, if you can
> get this implemented.
>
> Otherwise...
>
> [...]
> | I could also configure the ftp server to lock files being written, but that
> | seems to be discouraged. (based on man vsftpd.conf)
>
> It's not discouraged for any reason that seems to match your use case.
> You've got a well defined upload area and no malicious users.
> Use the lock facility! That's what it's for!
>
> | Basically, what I want is something like
> | Can I get an exclusive read on file x?
> | No - skip that file, go onto the next one
> | Yes - start processing that file
>
> Do it! See above! Have you tried it?
>
> Cheers,

Thank you all for some great suggestions :-)

Based on the feedback I've received, I'm going to ...

1 - configure vsftpd to lock files while writing (no malicious users etc)
2 - use ftp put/rename like
      put ftp-in-progress.foo.bar / rename ftp-in-progress.foo.bar foo.bar
    because it provides such a great "visual" for watchers, and a convenient
    way to determine which files are "in transit" and which are complete.
3 - use lockfile/fuser to ensure my cron job doesn't start processing a file
    that's already being read by an earlier cron job.

Cheers

--
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
Guidelines: http://fedoraproject.org/wiki/Communicate/MailingListGuidelines
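A minimal sketch of step 2 as seen from the sending side, assuming a command-line ftp(1) client that reads its commands from standard input. The `ftp-in-progress.` prefix is taken from the message above; the function name `ftp_put_commands` and everything in the usage note (host, login variables) are hypothetical.

```shell
# Emit the ftp(1) commands for an upload-then-rename transfer of one file.
# The file is uploaded under a temporary name and renamed only after the
# put has finished, so the final name never refers to a partial upload.
# Assumes the file name contains no whitespace (ftp splits arguments on spaces).
ftp_put_commands() {
    name=$1
    printf 'put %s ftp-in-progress.%s\n' "$name" "$name"
    printf 'rename ftp-in-progress.%s %s\n' "$name" "$name"
}
```

It could be used along the lines of `{ echo "user $FTP_USER $FTP_PASS"; ftp_put_commands foo.bar; echo bye; } | ftp -n ftp.example.com`, where the host and the two variables are placeholders.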
Re: How to determine if a file is in use
On 03Nov2009 13:31, Donald Russell wrote:
| Another system uses FTP to drop files in a directory for me to process.
| I have a bash script to process the incoming files. The script is started by
| cron periodically.
|
| There's a problem if the FTP transfer is still in progress because the
| process begins reading the file even though it isn't complete yet.

I liked the upload-then-rename suggested by another poster, if you can
get this implemented.

Otherwise...

[...]
| I could also configure the ftp server to lock files being written, but that
| seems to be discouraged. (based on man vsftpd.conf)

It's not discouraged for any reason that seems to match your use case.
You've got a well defined upload area and no malicious users.
Use the lock facility! That's what it's for!

| Basically, what I want is something like
| Can I get an exclusive read on file x?
| No - skip that file, go onto the next one
| Yes - start processing that file

Do it! See above! Have you tried it?

Cheers,
--
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

Carpe Daemon - Seize the Background Process - Paul Tomblin
Re: How to determine if a file is in use
On Tue, 2009-11-03 at 19:23 -0800, Richard England wrote:
> On 11/03/2009 01:31 PM, Donald Russell wrote:
> > Another system uses FTP to drop files in a directory for me to process.
> > I have a bash script to process the incoming files. The script is
> > started by cron periodically.
> >
> > There's a problem if the FTP transfer is still in progress because the
> > process begins reading the file even though it isn't complete yet.
> >
> > From a bash script, is there a way to tell if the file is still being
> > written to?
> > I was looking at the lsof command, which will tell me if the file is
> > opened or not, so that's a possibility... but it sure seems awkward
> > for the task.
> >
> > I could also configure the ftp server to lock files being written, but
> > that seems to be discouraged. (based on man vsftpd.conf)
> >
> > Basically, what I want is something like
> > Can I get an exclusive read on file x?
> > No - skip that file, go onto the next one
> > Yes - start processing that file
> > (I'm not concerned about the possible race condition there... I have
> > other protections for that)
> >
> > Thanks for any suggestions...
>
> Perhaps "fuser" might be of use?

Duh! I had a nagging feeling that lsof wasn't the best answer, it's just
that I've been obsessing about it recently :-)

File locking is unimportant for the OP's application. If fuser says the
file's still in use, just postpone for another cycle. Presumably the
same file isn't going to be overwritten by another ftp process before it
has a chance to be read.

poc
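The "postpone for another cycle" idea might look roughly like the sketch below. It assumes the psmisc version of fuser, whose `-s` (silent) option exits 0 when at least one process has the file open. The function names and the handler argument are hypothetical, and the cron job may need enough privileges to see the FTP daemon's open files.

```shell
# Return success (0) only if fuser is available and reports that no
# process currently has the file open.
file_is_idle() {
    # Without fuser we cannot tell, so treat the file as "not idle".
    command -v fuser >/dev/null 2>&1 || return 2
    if fuser -s "$1" 2>/dev/null; then
        return 1    # still open somewhere: leave it for the next cron run
    fi
    return 0
}

# Run the handler command "$2" on every idle regular file in directory "$1".
process_idle_files() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue            # also covers the empty-directory case
        file_is_idle "$f" || continue
        "$2" "$f"
    done
}
```

As noted in the original post, there is still a window between the check and the processing, so this only narrows the race rather than closing it.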
How to determine if a file is in use
Another system uses FTP to drop files in a directory for me to process.
I have a bash script to process the incoming files. The script is started by
cron periodically.

There's a problem if the FTP transfer is still in progress because the
process begins reading the file even though it isn't complete yet.

From a bash script, is there a way to tell if the file is still being
written to?
I was looking at the lsof command, which will tell me if the file is opened
or not, so that's a possibility... but it sure seems awkward for the task.

I could also configure the ftp server to lock files being written, but that
seems to be discouraged. (based on man vsftpd.conf)

Basically, what I want is something like
Can I get an exclusive read on file x?
No - skip that file, go onto the next one
Yes - start processing that file
(I'm not concerned about the possible race condition there... I have other
protections for that)

Thanks for any suggestions...
Re: How to determine if a file is in use
On 11/03/2009 01:31 PM, Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to process.
> I have a bash script to process the incoming files. The script is started
> by cron periodically.
>
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.
>
> From a bash script, is there a way to tell if the file is still being
> written to?
> I was looking at the lsof command, which will tell me if the file is
> opened or not, so that's a possibility... but it sure seems awkward for
> the task.
>
> I could also configure the ftp server to lock files being written, but
> that seems to be discouraged. (based on man vsftpd.conf)
>
> Basically, what I want is something like
> Can I get an exclusive read on file x?
> No - skip that file, go onto the next one
> Yes - start processing that file
> (I'm not concerned about the possible race condition there... I have
> other protections for that)
>
> Thanks for any suggestions...

Perhaps "fuser" might be of use?

--
/~~R/
Re: How to determine if a file is in use
Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to process.
> I have a bash script to process the incoming files. The script is started
> by cron periodically.
>
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.

Do you have control of the FTP procedure that drops the files? If so,
transfer the files with one filename, and when complete, use ftp to rename
the file. The rename is atomic.

e.g.:
    put foo.bar foo.bar.xfer
    rename foo.bar.xfer foo.bar

Then have the cron job only process files without the .xfer appended to
name.

Chris
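The receiving half of this scheme could be sketched as below. The `.xfer` suffix comes from the example above; the function name `list_complete` and the directory used in the usage note are hypothetical.

```shell
# Print the regular files in directory "$1" that have finished uploading,
# i.e. everything that does not still carry the temporary ".xfer" suffix.
list_complete() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skips directories and the unmatched-glob case
        case $f in
            *.xfer) continue ;;      # upload still in progress
        esac
        printf '%s\n' "$f"
    done
}
```

The cron job could then loop over the result, for instance `list_complete /srv/ftp/incoming | while IFS= read -r f; do ...; done`, with the path being a placeholder.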
Re: How to determine if a file is in use
On Tue, 2009-11-03 at 13:31 -0800, Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to
> process.
> I have a bash script to process the incoming files. The script is
> started by cron periodically.
>
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.
>
> From a bash script, is there a way to tell if the file is still being
> written to?
> I was looking at the lsof command, which will tell me if the file is
> opened or not, so that's a possibility... but it sure seems awkward
> for the task.

Not really. Since you know that the ftp demon is the only potential
writer for the file, you can use lsof -p | grep

> I could also configure the ftp server to lock files being written, but
> that seems to be discouraged. (based on man vsftpd.conf)

inotify(7) could do the job, but would require some programming.

poc
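The arguments of the lsof command above did not survive in the archive, so the following is only one possible reading of the idea, not the poster's exact command. It assumes the PID (or comma-separated PID list) of the FTP daemon and the absolute path of the file are passed in. The function name is hypothetical, and `-F n` is used instead of a plain grep over the default listing so that a name can be matched as a whole line.

```shell
# Return success (0) if one of the processes in "$1" (a PID or a
# comma-separated PID list) has the file "$2" (absolute path) open.
# lsof -F n prints each open file name on its own line, prefixed with "n".
pid_has_open() {
    lsof -p "$1" -F n 2>/dev/null | grep -qxF -- "n$2"
}
```

A cron job could then skip any file for which `pid_has_open "$ftp_pids" "$file"` succeeds, where `ftp_pids` would have to be obtained separately.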
Re: How to determine if a file is in use
Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to process.
> I have a bash script to process the incoming files. The script is started
> by cron periodically.
>
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.
>
> From a bash script, is there a way to tell if the file is still being
> written to?
> I was looking at the lsof command, which will tell me if the file is
> opened or not, so that's a possibility... but it sure seems awkward for
> the task.
>
> I could also configure the ftp server to lock files being written, but
> that seems to be discouraged. (based on man vsftpd.conf)
>
> Basically, what I want is something like
> Can I get an exclusive read on file x?
> No - skip that file, go onto the next one
> Yes - start processing that file
> (I'm not concerned about the possible race condition there... I have
> other protections for that)
>
> Thanks for any suggestions...

The "lsof(1)" command can tell you if a file is open or in use by some
process, but it is not atomic. Lock files are useful, but can cause
problems if, say, the process that created the lockfile dies for some
reason without removing it.

IIRC, vsftpd creates an exclusive write lock on files that are being
created. That, or it creates a temp file and when complete, renames it.
Can't recall...it's been a while since I went trudging through the source.

You can try to use flock(1) to get locks on files in shells. See the man
page for it for suggested uses. I'd suggest using exclusive locks rather
than advisory.

--
Rick Stevens, Systems Engineer    ri...@nerd.com
AIM/Skype: therps2    ICQ: 22643734    Yahoo: origrps2
This message printed using recycled bandwidth
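A small sketch of the flock(1) suggestion, assuming the util-linux flock utility. Locks taken this way are advisory: they only keep out other processes that also ask for the lock, and exclusive versus shared is a separate choice (exclusive is the default). The sketch therefore only coordinates cooperating runs of the same script and makes no assumption about how it interacts with any locking done by vsftpd. The function name and the exit code 75 are arbitrary choices for illustration.

```shell
# Run a command on a file only if an exclusive lock on that file can be
# taken without waiting. Returns 75 if another cooperating process holds it.
with_file_lock() {
    file=$1
    shift
    (
        flock -n 9 || exit 75      # -n: fail at once instead of blocking
        "$@" "$file"               # lock is released when fd 9 is closed
    ) 9<"$file"                    # open read-only so the file is not truncated
}
```

A call such as `with_file_lock "$f" process_one` (with `process_one` being a placeholder for the real handler) lets a second cron run skip a file that an earlier run is still reading, and since the lock goes away when the process exits, it avoids the stale-lockfile problem mentioned above.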