Re: How to determine if a file is in use

2009-11-03 Thread Donald Russell
On Tue, Nov 3, 2009 at 20:49, Cameron Simpson wrote:

> On 03Nov2009 13:31, Donald Russell wrote:
> | Another system uses FTP to drop files in a directory for me to process.
> | I have a bash script to process the incoming files. The script is started by
> | cron periodically.
> |
> | There's a problem if the FTP transfer is still in progress because the
> | process begins reading the file even though it isn't complete yet.
>
> I liked the upload-then-rename suggested by another poster, if you can
> get this implemented.
>
> Otherwise...
>
> [...]
> | I could also configure the ftp server to lock files being written, but that
> | seems to be discouraged. (based on man vsftpd.conf)
>
> It's not discouraged for any reason that seems to match your use case.
> You've got a well defined upload area and no malicious users.
> Use the lock facility! That's what it's for!
>
> | Basically, what I want is something like
> | Can I get an exclusive read on file x?
> | No - skip that file, go onto the next one
> | Yes - start processing that file
>
> Do it! See above! Have you tried it?
>
> Cheers,
>
>

Thank you all for some great suggestions :-)

Based on the feedback I've received, I'm going to ...

1 - configure vsftpd to lock files while writing (no malicious users etc)
2 - use ftp put/rename (put ftp-in-progress.foo.bar, then rename
ftp-in-progress.foo.bar foo.bar), because it provides a clear "visual" for
watchers and a convenient way to tell which files are "in transit" and
which are complete.
3 - use lockfile/fuser to ensure my cron job doesn't start processing a file
that's already being read by an earlier cron job.

Cheers
-- 
fedora-list mailing list
fedora-list@redhat.com
To unsubscribe: https://www.redhat.com/mailman/listinfo/fedora-list
Guidelines: http://fedoraproject.org/wiki/Communicate/MailingListGuidelines

Re: How to determine if a file is in use

2009-11-03 Thread Cameron Simpson
On 03Nov2009 13:31, Donald Russell wrote:
| Another system uses FTP to drop files in a directory for me to process.
| I have a bash script to process the incoming files. The script is started by
| cron periodically.
| 
| There's a problem if the FTP transfer is still in progress because the
| process begins reading the file even though it isn't complete yet.

I liked the upload-then-rename suggested by another poster, if you can 
get this implemented.

Otherwise...

[...]
| I could also configure the ftp server to lock files being written, but that
| seems to be discouraged. (based on man vsftpd.conf)

It's not discouraged for any reason that seems to match your use case.
You've got a well defined upload area and no malicious users.
Use the lock facility! That's what it's for!

| Basically, what I want is something like
| Can I get an exclusive read on file x?
| No - skip that file, go onto the next one
| Yes - start processing that file

Do it! See above! Have you tried it?

Cheers,
-- 
Cameron Simpson  DoD#743
http://www.cskk.ezoshosting.com/cs/

Carpe Daemon - Seize the Background Process
- Paul Tomblin 



Re: How to determine if a file is in use

2009-11-03 Thread Patrick O'Callaghan
On Tue, 2009-11-03 at 19:23 -0800, Richard England wrote:
> On 11/03/2009 01:31 PM, Donald Russell wrote:
> > Another system uses FTP to drop files in a directory for me to process.
> > I have a bash script to process the incoming files. The script is 
> > started by cron periodically.
> >
> > There's a problem if the FTP transfer is still in progress because the 
> > process begins reading the file even though it isn't complete yet.
> >
> > From a bash script, is there a way to tell if the file is still being 
> > written to?
> > I was looking at the lsof command, which will tell me if the file is 
> > opened or not, so that's a possibility... but it sure seems awkward 
> > for the task.
> >
> > I could also configure the ftp server to lock files being written, but 
> > that seems to be discouraged. (based on man vsftpd.conf)
> >
> > Basically, what I want is something like
> > Can I get an exclusive read on file x?
> > No - skip that file, go onto the next one
> > Yes - start processing that file
> > (I'm not concerned about the possible race condition there... I have 
> > other protections for that)
> >
> > Thanks for any suggestions...
> >
> >
> >
> 
> Perhaps "fuser" might be of use?

Duh! I had a nagging feeling that lsof wasn't the best answer, it's just
that I've been obsessing about it recently :-)

File locking is unimportant for the OP's application. If fuser says the
file's still in use, just postpone for another cycle. Presumably the
same file isn't going to be overwritten by another ftp process before it
has a chance to be read.
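
A rough sketch of that check in bash follows. It is not from the thread:
the function name file_is_idle and its return codes are made up for
illustration. It assumes a fuser that supports -s (silent) and exits with
zero when at least one process has the file open. Note that fuser may
only see processes owned by other users when it is run with enough
privileges, so the cron job may need to run as root or as the ftp user.

```shell
# Hypothetical helper: report whether any process currently has a file open.
# Returns 0 if nobody has it open, 1 if it is in use, 2 if fuser is missing.
file_is_idle() {
    # Without fuser we cannot tell, so report that instead of guessing.
    command -v fuser >/dev/null 2>&1 || return 2
    # fuser exits 0 when at least one process is accessing the file.
    if fuser -s "$1" 2>/dev/null; then
        return 1
    fi
    return 0
}

# Example use inside the cron script (process_file is a placeholder):
#   file_is_idle "$f" && process_file "$f"
```

Anything other than a zero return means "leave the file for the next
cron cycle".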

poc



How to determine if a file is in use

2009-11-03 Thread Donald Russell
Another system uses FTP to drop files in a directory for me to process.
I have a bash script to process the incoming files. The script is started by
cron periodically.

There's a problem if the FTP transfer is still in progress because the
process begins reading the file even though it isn't complete yet.

From a bash script, is there a way to tell if the file is still being
written to?
I was looking at the lsof command, which will tell me if the file is opened
or not, so that's a possibility... but it sure seems awkward for the task.

I could also configure the ftp server to lock files being written, but that
seems to be discouraged. (based on man vsftpd.conf)

Basically, what I want is something like
Can I get an exclusive read on file x?
No - skip that file, go onto the next one
Yes - start processing that file
(I'm not concerned about the possible race condition there... I have other
protections for that)

Thanks for any suggestions...

Re: How to determine if a file is in use

2009-11-03 Thread Richard England

On 11/03/2009 01:31 PM, Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to process.
> I have a bash script to process the incoming files. The script is
> started by cron periodically.
>
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.
>
> From a bash script, is there a way to tell if the file is still being
> written to?
> I was looking at the lsof command, which will tell me if the file is
> opened or not, so that's a possibility... but it sure seems awkward
> for the task.
>
> I could also configure the ftp server to lock files being written, but
> that seems to be discouraged. (based on man vsftpd.conf)
>
> Basically, what I want is something like
> Can I get an exclusive read on file x?
> No - skip that file, go onto the next one
> Yes - start processing that file
> (I'm not concerned about the possible race condition there... I have
> other protections for that)
>
> Thanks for any suggestions...

Perhaps "fuser" might be of use?

--

/~~R/



Re: How to determine if a file is in use

2009-11-03 Thread Christopher K. Johnson

Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to process.
> I have a bash script to process the incoming files. The script is
> started by cron periodically.
>
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.

Do you have control of the FTP procedure that drops the files?  If so,
transfer the files with one filename, and when complete, use ftp to
rename the file.  The rename is atomic.  e.g.:

put foo.bar foo.bar.xfer
rename foo.bar.xfer foo.bar

Then have the cron job process only files whose names do not end in
.xfer.
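
On the receiving side, that filter can be sketched in bash as below.
This is only a minimal illustration: the function name ready_files and
the directory path in the usage comment are assumptions, not part of the
original setup.

```shell
# Hypothetical helper: print the regular files in a directory that are
# ready to process, skipping uploads that still carry the .xfer suffix.
ready_files() {
    local dir=$1 f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue      # skip directories and an unmatched glob
        case $f in
            *.xfer) continue ;;      # upload not yet renamed: still in transit
        esac
        printf '%s\n' "$f"
    done
}

# Example use from the cron script (path and process_file are placeholders):
#   ready_files /srv/ftp/incoming | while IFS= read -r f; do
#       process_file "$f"
#   done
```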


Chris



Re: How to determine if a file is in use

2009-11-03 Thread Patrick O'Callaghan
On Tue, 2009-11-03 at 13:31 -0800, Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to
> process.
> I have a bash script to process the incoming files. The script is
> started by cron periodically.
> 
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.
> 
> From a bash script, is there a way to tell if the file is still being
> written to?
> I was looking at the lsof command, which will tell me if the file is
> opened or not, so that's a possibility... but it sure seems awkward
> for the task.

Not really. Since you know that the ftp daemon is the only potential
writer for the file, you can use

lsof -p  | grep 

> I could also configure the ftp server to lock files being written, but
> that seems to be discouraged. (based on man vsftpd.conf)

inotify(7) could do the job, but would require some programming.

poc



Re: How to determine if a file is in use

2009-11-03 Thread Rick Stevens

Donald Russell wrote:
> Another system uses FTP to drop files in a directory for me to process.
> I have a bash script to process the incoming files. The script is started by
> cron periodically.
>
> There's a problem if the FTP transfer is still in progress because the
> process begins reading the file even though it isn't complete yet.
>
> From a bash script, is there a way to tell if the file is still being
> written to?
> I was looking at the lsof command, which will tell me if the file is opened
> or not, so that's a possibility... but it sure seems awkward for the task.
>
> I could also configure the ftp server to lock files being written, but that
> seems to be discouraged. (based on man vsftpd.conf)
>
> Basically, what I want is something like
> Can I get an exclusive read on file x?
> No - skip that file, go onto the next one
> Yes - start processing that file
> (I'm not concerned about the possible race condition there... I have other
> protections for that)
>
> Thanks for any suggestions...


The "lsof(1)" command can tell you if a file is open or in use by some
process, but it is not atomic.

Lock files are useful, but can cause problems if, say, the process that
created the lockfile dies for some reason without removing it.
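
One common way to soften the stale lock file problem is to record the
owner's PID in the lock file and check whether that process still
exists. A minimal sketch, with hypothetical helper names acquire_lock
and lock_is_stale: it relies on the shell's noclobber option to refuse
to overwrite an existing file, assumes all jobs run as the same user
(kill -0 also fails for processes you are not allowed to signal), and
ignores PID reuse.

```shell
# Hypothetical helper: create the lock file only if it does not exist yet.
# Returns 0 if this shell now owns the lock, non-zero if it already existed.
acquire_lock() {
    local lock=$1
    # With noclobber set, ">" fails instead of overwriting an existing file.
    # $$ is the PID of the main script, even inside the subshell.
    ( set -o noclobber; echo "$$" > "$lock" ) 2>/dev/null
}

# Hypothetical helper: succeed if the PID stored in the lock file no longer
# refers to a process we can signal (assumption: same user, no PID reuse).
lock_is_stale() {
    local pid
    pid=$(cat "$1" 2>/dev/null) || return 1
    [ -n "$pid" ] && ! kill -0 "$pid" 2>/dev/null
}

# Example use (lock path is a placeholder):
#   lock=/tmp/incoming.lock
#   lock_is_stale "$lock" && rm -f "$lock"
#   acquire_lock "$lock" || exit 0      # another run is still active
#   trap 'rm -f "$lock"' EXIT
```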

IIRC, vsftpd creates an exclusive write lock on files that are being
created.  That, or it creates a temp file and when complete, renames it.
Can't recall...it's been a while since I went trudging through the
source.

You can try to use flock(1) to get locks on files in shells.  See the
man page for it for suggested uses.  I'd suggest using exclusive locks
rather than advisory.
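
For illustration, a non-blocking use of flock(1) for the "skip or
process" logic could look like the sketch below. The function name
process_if_unlocked and the exit code 75 are arbitrary choices, not
anything from the man page. Keep in mind that flock(1) takes flock(2)
locks, which are advisory: they only keep out other processes that also
take a lock on the same file. On Linux they do not interact with
fcntl(2) locks, so whether this notices a lock held by the FTP server
depends on which call the server uses (not checked here).

```shell
# Hypothetical helper: run a command on a file only if an exclusive lock
# on that file can be taken right away; otherwise skip it.
# Usage: process_if_unlocked FILE COMMAND [ARGS...]
# Returns the command's exit status, or 75 if the file is locked elsewhere.
process_if_unlocked() {
    local file=$1
    shift
    (
        # -n: do not wait for the lock, -x: exclusive lock on descriptor 9
        flock -n -x 9 || exit 75
        "$@" "$file"
    ) 9<"$file"
    # The lock is dropped automatically when the subshell closes descriptor 9.
}

# Example use from the cron script (process_file is a placeholder):
#   process_if_unlocked "$f" process_file || echo "skipped $f"
```

Because the lock lives on an open file descriptor, it disappears if the
holder dies, which avoids the stale lock file problem.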
--
- Rick Stevens, Systems Engineer  ri...@nerd.com -
- AIM/Skype: therps2    ICQ: 22643734    Yahoo: origrps2 -
--
- This message printed using recycled bandwidth -
--
