Re: waiting for new files in a directory
Peter Pentchev <[EMAIL PROTECTED]> writes:
> On Thu, Dec 28, 2000 at 01:44:34PM +0100, Dag-Erling Smorgrav wrote:
> > Volker Stolz <[EMAIL PROTECTED]> writes:
> > > On Thu, Dec 28, 2000 at 01:35:08PM +0100, Dag-Erling Smorgrav wrote:
> > > > What are you guys smoking?
> > > *shrug* Can you spell "event-driven"? There are ways to do things much
> > > more elegantly today (see all the references to kevent()).
> > I choose simple and working over elegant.
> I think opendir-readdir-closedir-sleep is a bit simpler than the locking
> you yourself admit is non-trivial :)

Locking in Perl is a known problem with a known solution which takes
me five or ten minutes to implement off the top of my head, and I
don't trust not to start multiple copies of the spool scanner.

You can of course write the scanner in such a way that multiple
instances can run in parallel without harm even without locking; this
is left as an exercise for the reader.

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-hackers" in the body of the message
Re: waiting for new files in a directory
On Thu, Dec 28, 2000 at 01:44:34PM +0100, Dag-Erling Smorgrav wrote:
> Volker Stolz <[EMAIL PROTECTED]> writes:
> > On Thu, Dec 28, 2000 at 01:35:08PM +0100, Dag-Erling Smorgrav wrote:
> > > What are you guys smoking?
> > *shrug* Can you spell "event-driven"? There are ways to do things much
> > more elegantly today (see all the references to kevent()).
>
> I choose simple and working over elegant.

I think opendir-readdir-closedir-sleep is a bit simpler than the locking
you yourself admit is non-trivial :)

G'luck,
Peter

-- 
When you are not looking at it, this sentence is in Spanish.
Re: waiting for new files in a directory
Volker Stolz <[EMAIL PROTECTED]> writes:
> On Thu, Dec 28, 2000 at 01:35:08PM +0100, Dag-Erling Smorgrav wrote:
> > What are you guys smoking?
> *shrug* Can you spell "event-driven"? There are ways to do things much
> more elegantly today (see all the references to kevent()).

I choose simple and working over elegant.

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]
Re: waiting for new files in a directory
On Thu, Dec 28, 2000 at 01:35:08PM +0100, Dag-Erling Smorgrav wrote:
> What are you guys smoking? Use cron to run a spool scanning job every
> minute or so, and use a lock file to make sure one doesn't start until
> the previous one is done. Note that reliable locking is non-trivial in
> Perl; a quick workaround is to use a lock directory instead (mkdir()
> will fail if the directory exists; make sure to differentiate between
> "somebody already holds the lock" and "the lock can't be created due
> to permission errors or some other problem" by examining $!)

I've tried this, and I still believe that a process continuously
watching the directory is better than a cronjob for several reasons,
which I have outlined in a previous mail.

First, there is *no* need for locking if a single process is there all
the time; this eliminates all sorts of locking problems.

Second, there is no overhead in starting Perl (yeah, yeah, so it's
cached after the first few times, but still..) each and every minute
just to find nothing and die quietly - as somebody else said, that's
exactly why poll(2) and later kqueue/kevent work on directory vnodes.

Third, if a process uses poll(2) or kqueue, it will react to new mail
the moment it arrives, not up to a minute later.

G'luck,
Peter

-- 
because I didn't think of a good beginning of it.
Re: waiting for new files in a directory
On Thu, Dec 28, 2000 at 01:35:08PM +0100, Dag-Erling Smorgrav wrote:
> What are you guys smoking?

*shrug* Can you spell "event-driven"? There are ways to do things much
more elegantly today (see all the references to kevent()).

-- 
\usepackage[latin1]{inputenc}!
Volker Stolz * [EMAIL PROTECTED] * PGP + S/MIME
Re: waiting for new files in a directory
What are you guys smoking? Use cron to run a spool scanning job every
minute or so, and use a lock file to make sure one doesn't start until
the previous one is done. Note that reliable locking is non-trivial in
Perl; a quick workaround is to use a lock directory instead (mkdir()
will fail if the directory exists; make sure to differentiate between
"somebody already holds the lock" and "the lock can't be created due
to permission errors or some other problem" by examining $!)

DES
-- 
Dag-Erling Smorgrav - [EMAIL PROTECTED]
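The lock-directory workaround can be sketched roughly as follows. This is only an illustration, not DES's actual code: the function names `acquire_lock` and `release_lock` and the three return values are made up for the example. It relies on mkdir() being atomic and on comparing $! against EEXIST to tell "already locked" from any other failure.

```perl
use strict;
use warnings;
use Errno qw(EEXIST);

# Hypothetical helper: try to take the lock by creating a directory.
# Returns 'acquired', 'held' (somebody else has the lock),
# or "error: <reason>" for any other failure.
sub acquire_lock {
    my ($lockdir) = @_;
    return 'acquired' if mkdir($lockdir, 0700);
    return 'held' if $! == EEXIST;    # the directory already exists
    return "error: $!";               # permissions, missing parent, ...
}

# Hypothetical helper: drop the lock again.
sub release_lock {
    my ($lockdir) = @_;
    rmdir($lockdir) or warn("Removing $lockdir: $!\n");
}
```

A cron-started scanner would exit quietly on 'held', complain loudly on "error: ...", and call release_lock() once it is done.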
Re: waiting for new files in a directory
On Thu, Dec 28, 2000 at 02:35:19AM -0800, Peter Wemm wrote:
> This sort of thing is why we added poll(2) and later kqueue(2) support
> for getting notifications on directory changes.. eg: you can get an event
> to tell you that a new file "appeared" in your directory.

See how the l0pht-watch port does exactly this. In fact you could
probably use that program as-is - I think it has the capability to
execute another process on file creation..

Kris
Re: waiting for new files in a directory
On Thu, Dec 28, 2000 at 11:36:50PM +1300, Dan Langille wrote:
> On 28 Dec 2000, at 11:29, Volker Stolz wrote:
>
> > On 28 Dec 2000 at 10:33 MET, Dan Langille wrote:
> > > What about a daemon signalling a waiting perl script?
> > > Is it an issue if the daemon signals the perl script when it's already
> > > processing? Could a signal be missed?
> >
> > How about using a FIFO (maybe in /tmp) and let the daemon printf,echo,cat,...
> > control-msgs into the FIFO and have a perl script sitting on the other end?
>
> That sounds good to me. It meets the criteria.

Actually, there's no need for the FIFO. What I've been thinking about
is a little C program that spawns a Perl script, then sits, watching
the spool directory through the kevent interface. When a new file
appears, the parent lets the child know - this need not be
signal-based, I'm thinking more along the lines of writing to a
previously-opened pipe.

This has the added benefit that the parent can monitor the child's
status, and respawn it if it dies; with separate processes and a FIFO,
if the reader dies, the writer either blocks or goes haywire, judging
from my (admittedly limited) experience. Handling SIGCHLD and
respawning seems easy :) Also, the Perl child can find out the parent
has died (the pipe will close or something), and die gracefully, to be
reborn as the parent is respawned.

Respawning the parent could be done as either an /etc/inittab-respawned
process, or a service running under svscan from DJB's daemontools
package. The latter case also has an almost-built-in logging with
support for log rotation through multilog.

G'luck,
Peter

-- 
If you think this sentence is confusing, then change one pig.
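The pipe arrangement described above could look something like the sketch below. It is written entirely in Perl for brevity, so the parent here is only a stand-in for the proposed C program and does no kevent watching at all. `spawn_worker` and its callback argument are hypothetical names. The point it illustrates is that the child wakes once per line written to the pipe and sees EOF when the parent closes the pipe or dies.

```perl
use strict;
use warnings;
use IO::Handle;
use POSIX ();

# Hypothetical helper: fork a worker that calls $on_wakeup once for
# every line received on the pipe.
# Returns (child PID, write end of the pipe).
sub spawn_worker {
    my ($on_wakeup) = @_;
    pipe(my $reader, my $writer) or die("pipe: $!\n");
    my $pid = fork();
    die("fork: $!\n") unless defined $pid;

    if ($pid == 0) {
        # Child: this is where the Perl spool scanner would live.
        close($writer);
        while (defined(my $note = <$reader>)) {
            $on_wakeup->($note);
        }
        # EOF: the parent closed the pipe or died, so exit quietly.
        POSIX::_exit(0);
    }

    # Parent: this is where the directory watcher would live.
    close($reader);
    $writer->autoflush(1);
    return ($pid, $writer);
}
```

The parent would print one line to the returned handle per directory event, and waitpid() (or a SIGCHLD handler) tells it when the child needs respawning.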
Re: waiting for new files in a directory
On 28 Dec 2000, at 11:29, Volker Stolz wrote:

> On 28 Dec 2000 at 10:33 MET, Dan Langille wrote:
> > What about a daemon signalling a waiting perl script?
> > Is it an issue if the daemon signals the perl script when it's already
> > processing? Could a signal be missed?
>
> How about using a FIFO (maybe in /tmp) and let the daemon printf,echo,cat,...
> control-msgs into the FIFO and have a perl script sitting on the other end?

That sounds good to me. It meets the criteria.

> Signals suck. Another advantage would be that the perl script could choose
> its own pace and let things queue up in the FIFO. However, a FIFO only
> has limited capacity.

Given that we are processing incoming messages from cvs-all, I don't
think we'll meet that capacity (not that I know what the capacity is).

> If I were using Haskell (http://www.haskell.org), I'd
> throw in a forkIO() and would get a neatly multi-threaded solution where one
> thread reads the FIFO and queues up requests while the other thread queries
> him for more work -- I don't know about threaded perl, though.

That sounds great. But without knowing more, I think it's too much
for the task at hand. I would like to keep things simple and free from
complexity. Writing a multi-threaded solution, unless someone else
wants to do it, may be too big of a task for me. Volunteers? ;)

thank you.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
Re: waiting for new files in a directory
Volker Stolz wrote:
> On 28 Dec 2000 at 10:33 MET, Dan Langille wrote:
> > What about a daemon signalling a waiting perl script?
> > Is it an issue if the daemon signals the perl script when it's already
> > processing? Could a signal be missed?
>
> How about using a FIFO (maybe in /tmp) and let the daemon printf,echo,cat,...
> control-msgs into the FIFO and have a perl script sitting on the other end?
> Signals suck. Another advantage would be that the perl script could choose
> its own pace and let things queue up in the FIFO. However, a FIFO only
> has limited capacity. If I were using Haskell (http://www.haskell.org), I'd
> throw in a forkIO() and would get a neatly multi-threaded solution where one
> thread reads the FIFO and queues up requests while the other thread queries
> him for more work -- I don't know about threaded perl, though.

This sort of thing is why we added poll(2) and later kqueue(2) support
for getting notifications on directory changes.. e.g. you can get an
event to tell you that a new file "appeared" in your directory.

Cheers,
-Peter
--
Peter Wemm - [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
"All of this is for nothing if we don't go to the stars" - JMS/B5
Re: waiting for new files in a directory
On 28 Dec 2000 at 10:33 MET, Dan Langille wrote:
> What about a daemon signalling a waiting perl script?
> Is it an issue if the daemon signals the perl script when it's already
> processing? Could a signal be missed?

How about using a FIFO (maybe in /tmp) and let the daemon printf,echo,cat,...
control-msgs into the FIFO and have a perl script sitting on the other end?
Signals suck. Another advantage would be that the perl script could choose
its own pace and let things queue up in the FIFO. However, a FIFO only
has limited capacity. If I were using Haskell (http://www.haskell.org), I'd
throw in a forkIO() and would get a neatly multi-threaded solution where one
thread reads the FIFO and queues up requests while the other thread queries
him for more work -- I don't know about threaded perl, though.

-- 
\usepackage[latin1]{inputenc}!
Volker Stolz * [EMAIL PROTECTED] * PGP + S/MIME
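The Perl end of such a FIFO might be sketched like this. The name `read_control_messages`, the one-message-per-line convention and the handler callback are assumptions made for the illustration, not something specified in the thread. Note that opening a FIFO for reading blocks until some writer opens it, and the read loop ends when the last writer closes its end.

```perl
use strict;
use warnings;
use POSIX qw(mkfifo);

# Hypothetical helper: read newline-terminated control messages from
# an existing FIFO and hand each one to $handler.
# Returns the number of messages seen before the writers went away.
sub read_control_messages {
    my ($fifo, $handler) = @_;
    # This open() blocks until a writer opens the FIFO.
    open(my $in, '<', $fifo) or die("Opening $fifo: $!\n");
    my $count = 0;
    while (defined(my $msg = <$in>)) {
        chomp($msg);
        $handler->($msg);
        $count++;
    }
    close($in);
    return $count;
}
```

The FIFO itself would be created once with mkfifo($path, 0600), and a long-running reader would simply call read_control_messages() again after each EOF.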
Re: waiting for new files in a directory
On 28 Dec 2000, at 10:50, Peter Pentchev wrote:

> Hmm. On second thoughts, I wonder if the sleep/opendir method might
> not work better under temporarily high load - even better than the
> cron-based one. If a bunch of mails arrive at the same time.. hmm
> I should play around with kevent to see how it could handle this -
> notifying me for each and every message could be suboptimal.

I would appreciate that very much.

> I could play around with kevent in a couple of days to see how it
> behaves when multiple messages arrive. When a file or multiple files
> arrive, the sleeper would have to go through the opendir/readdir
> dance, and either only process the first file it finds, or process them
> all. In the second case, if multiple files should arrive, those would
> be all processed in response to one event, and the next events would
> trigger lots of opendir/readdir/closedir calls with no files found.

I'll include my thoughts in case they help:

What about a daemon signalling a waiting perl script? The script
would wake up, take the first file, process it, repeat until no more
files, then go back to sleep.

Is it an issue if the daemon signals the perl script when it's already
processing? Could a signal be missed?

thank you.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
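One common way to make missed or merged signals harmless is to let the handler only set a flag and have the script rescan the whole directory whenever the flag is set. The sketch below assumes SIGUSR1 as the wake-up signal; `drain_spool` and `handle_wakeups` are hypothetical names, and the "processing" is whatever callback is passed in.

```perl
use strict;
use warnings;

my $wakeup = 0;
# The handler only records that at least one signal arrived.
$SIG{USR1} = sub { $wakeup = 1 };

# Process every regular file currently in $dir, removing each one
# afterwards. Returns the number of files handled.
sub drain_spool {
    my ($dir, $process) = @_;
    opendir(my $dh, $dir) or die("Opening $dir: $!\n");
    my @files = sort grep { -f "$dir/$_" } readdir($dh);
    closedir($dh);
    foreach my $name (@files) {
        $process->("$dir/$name");
        unlink("$dir/$name") or warn("Removing $dir/$name: $!\n");
    }
    return scalar(@files);
}

# Rescan for as long as signals arrived since the last scan started.
sub handle_wakeups {
    my ($dir, $process) = @_;
    my $total = 0;
    while ($wakeup) {
        # Clear the flag *before* scanning, so that a signal landing
        # in the middle of a scan causes one more pass.
        $wakeup = 0;
        $total += drain_spool($dir, $process);
    }
    return $total;
}
```

With this shape, two signals arriving back to back cost at most one extra empty scan, and no file is left behind just because two notifications collapsed into one.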
Re: waiting for new files in a directory
On Thu, Dec 28, 2000 at 12:23:12PM +1300, Dan Langille wrote:
> On 27 Dec 2000, at 19:56, Peter Pentchev wrote:
>
> > On Wed, Dec 27, 2000 at 09:16:34AM -0800, Alfred Perlstein wrote:
> > > * Dan Langille <[EMAIL PROTECTED]> [001226 23:50] wrote:
> > > >
> > > > My idea is to have a daemon, or something resembling one, sitting on
> > > > the box watching the directory. When a new file appears, it starts a perl
> > > > script. This perl script is beyond the scope of my question, but it
> > > > processes all the files in the directory. When finished, it looks for any
> > > > more files and repeats as necessary. If no more files, it exits.
> > > >
> > >
> > > This isn't an answer to your main question (i see it's already been
> > > discussed), but you may be able to set up a kevent on the
> > > directory which should inform you if any files are added to it.
> >
> > Unfortunately, I gather that Dan intends to write most of the FreshPorts
> > code in Perl, and AFAIK, Perl has no kqueue/kevent interface :(
>
> Unfortunately? *grin* FWIW, most of the existing and new code will be
> PHP based. Perl is used primarily for importing data from cvs-all. And
> for various mailings out to users.

The 'unfortunately' part was not to say that I don't like Perl, or
that I don't think it should be written in Perl; rather, that at the
moment, Perl has no easy way of using the kqueue/kevent interface.
If there were such an iface for Perl, it would all be done with one
little filter invoked from procmail to write the message, and one sleepy
Perl thing, idling in a kevent() call most of the time, and only
waking up when there are changes to the dir.

Hmm. On second thoughts, I wonder if the sleep/opendir method might
not work better under temporarily high load - even better than the
cron-based one. If a bunch of mails arrive at the same time.. hmm
I should play around with kevent to see how it could handle this -
notifying me for each and every message could be suboptimal.

The sleep/opendir way would process as many new messages as have
arrived; ditto for the cron-based one, *except* that if there are
too many messages, there could be two or three Perl interpreter
invocations, which find an old script still running, and die quietly,
having used up some CPU resources in the meantime.

> > Thus, to make use of kevent (which I certainly agree would be a better
> > FreeBSD-specific solution), he'd have to either 1. have a C program
> > which spawns Perl and his script on every change, or 2. have a C program
> > which spawns Perl once and signals it on every change.
> >
> > The first way would be downright stupid IMHO.. The second one may
> > very well be more efficient than the readdir, sleep solution which
> > I proposed in other postings, seeing that Dan wants to process
> > the cvs-all mailings, which certainly do not arrive every few seconds :)
>
> I like the 2nd concept. It appeals to me. I haven't done any C in about
> 7 years and all of that was in Windows. Never in a Unix environment.
> This solution is more complex than the "cron job every minute" which I
> discussed with Mark, but it fits with my goal of having processed the
> cvs-all messages as quickly as I can.

I could play around with kevent in a couple of days to see how it
behaves when multiple messages arrive. When a file or multiple files
arrive, the sleeper would have to go through the opendir/readdir
dance, and either only process the first file it finds, or process them
all. In the second case, if multiple files should arrive, those would
be all processed in response to one event, and the next events would
trigger lots of opendir/readdir/closedir calls with no files found.

Hmm.. as a side note.. I'm not quite sure how kqueues operate on vnodes.
If I should request an EVFILT_VNODE filter with NOTE_WRITE, receive
an event, find a new file, then unlink() it (which involves writing
to the vnode I'm monitoring), will *my* write generate another event
I'd have to process?

G'luck,
Peter

-- 
You have, of course, just begun reading the sentence that you have just finished reading.
Re: waiting for new files in a directory
On 27 Dec 2000, at 11:25, Jack Rusher wrote:

> > At present the files are created through procmail like this:
> >
> > |/usr/bin/perl $HOME/process_cvs_mail.pl > ~/msgs/$FILE
>
> ...this fragment tells me that you are in control of the process of
> creating these files.

That is correct.

> This makes the whole problem much easier to solve
> and side steps the issue of watching the directory altogether.
>
> In addition to the suggestions above, you could also:
>
> You could set up the message processing daemon to listen on a named pipe
> and send the messages there from the process_cvs_mail script.
>
> You could handle queue entry with the process_cvs_mail script and
> queue exit with your daemon; signal the daemon from the script when
> new work appears in the queue. This would mirror a threaded work queue
> approach that blocks on a condition variable until work comes into the
> queue.

Will this approach tie up the procmail script? I want the MTA to be
freed up ASAP. That's one of the primary reasons for wanting separate
processes.

From time to time, the website can be "flooded" with messages. This is
usually the result of the website being offline or otherwise
disconnected from the net. The mail builds up and then arrives all at
once. That's the reason for freeing up the MTA quickly.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
Re: waiting for new files in a directory
On 27 Dec 2000, at 19:56, Peter Pentchev wrote:

> On Wed, Dec 27, 2000 at 09:16:34AM -0800, Alfred Perlstein wrote:
> > * Dan Langille <[EMAIL PROTECTED]> [001226 23:50] wrote:
> > >
> > > My idea is to have a daemon, or something resembling one, sitting on
> > > the box watching the directory. When a new file appears, it starts a perl
> > > script. This perl script is beyond the scope of my question, but it
> > > processes all the files in the directory. When finished, it looks for any
> > > more files and repeats as necessary. If no more files, it exits.
> > >
> >
> > This isn't an answer to your main question (i see it's already been
> > discussed), but you may be able to set up a kevent on the
> > directory which should inform you if any files are added to it.
>
> Unfortunately, I gather that Dan intends to write most of the FreshPorts
> code in Perl, and AFAIK, Perl has no kqueue/kevent interface :(

Unfortunately? *grin* FWIW, most of the existing and new code will be
PHP based. Perl is used primarily for importing data from cvs-all. And
for various mailings out to users.

> Thus, to make use of kevent (which I certainly agree would be a better
> FreeBSD-specific solution), he'd have to either 1. have a C program
> which spawns Perl and his script on every change, or 2. have a C program
> which spawns Perl once and signals it on every change.
>
> The first way would be downright stupid IMHO.. The second one may
> very well be more efficient than the readdir, sleep solution which
> I proposed in other postings, seeing that Dan wants to process
> the cvs-all mailings, which certainly do not arrive every few seconds :)

I like the 2nd concept. It appeals to me. I haven't done any C in about
7 years and all of that was in Windows. Never in a Unix environment.
This solution is more complex than the "cron job every minute" which I
discussed with Mark, but it fits with my goal of having processed the
cvs-all messages as quickly as I can.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
Re: waiting for new files in a directory
On 27 Dec 2000, at 12:53, Peter Pentchev wrote:

> Something like..
> | /usr/bin/perl $HOME/process.pl > ~/msgs/$FILE.tmp && \
>     mv ~/msgs/$FILE.tmp ~/msgs/$FILE.cvs

Thanks for that. It's helped me solve a procmail problem I was having.
The files were 600 instead of 640, so I did this:

|/usr/bin/perl $HOME/process_cvs_mail.pl > ~/msgs/$FILE && chmod o+r ~/msgs/$FILE

Works great. Cheers.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
Re: waiting for new files in a directory
I was about to write up a group of suggestions that include the notion
that you could use kqueue to watch the directory's vnode, you could use
Erez's stackable file system code to pass all file creates through a
filter, use lpd's spooling mechanism to treat the incoming directory like
a print queue, use a standard issue cron job, etc, etc. But...

> At present the files are created through procmail like this:
>
> |/usr/bin/perl $HOME/process_cvs_mail.pl > ~/msgs/$FILE

...this fragment tells me that you are in control of the process of
creating these files. This makes the whole problem much easier to solve
and side steps the issue of watching the directory altogether.

In addition to the suggestions above, you could also:

You could set up the message processing daemon to listen on a named pipe
and send the messages there from the process_cvs_mail script.

You could handle queue entry with the process_cvs_mail script and
queue exit with your daemon; signal the daemon from the script when
new work appears in the queue. This would mirror a threaded work queue
approach that blocks on a condition variable until work comes into the
queue.

--
Jack Rusher, Senior Engineer | mailto:[EMAIL PROTECTED]
Integratus, Inc. | http://www.integratus.com
Re: waiting for new files in a directory
On Wed, Dec 27, 2000 at 09:16:34AM -0800, Alfred Perlstein wrote:
> * Dan Langille <[EMAIL PROTECTED]> [001226 23:50] wrote:
> >
> > My idea is to have a daemon, or something resembling one, sitting on
> > the box watching the directory. When a new file appears, it starts a perl
> > script. This perl script is beyond the scope of my question, but it
> > processes all the files in the directory. When finished, it looks for any
> > more files and repeats as necessary. If no more files, it exits.
> >
>
> This isn't an answer to your main question (i see it's already been
> discussed), but you may be able to set up a kevent on the
> directory which should inform you if any files are added to it.

Unfortunately, I gather that Dan intends to write most of the FreshPorts
code in Perl, and AFAIK, Perl has no kqueue/kevent interface :(

Thus, to make use of kevent (which I certainly agree would be a better
FreeBSD-specific solution), he'd have to either 1. have a C program
which spawns Perl and his script on every change, or 2. have a C program
which spawns Perl once and signals it on every change.

The first way would be downright stupid IMHO.. The second one may
very well be more efficient than the readdir, sleep solution which
I proposed in other postings, seeing that Dan wants to process
the cvs-all mailings, which certainly do not arrive every few seconds :)

As a side-point - does Perl really have a kqueue/kevent interface?
If not, how hard would it be to write a little Perl module to implement
that? (Unfortunately, I am a complete stranger to Perl modules..)
A Perl script which uses kevent to wait on a directory would certainly
be more efficient than any of the above solutions :)

G'luck,
Peter

-- 
I am jealous of the first word in this sentence.
Re: waiting for new files in a directory
* Dan Langille <[EMAIL PROTECTED]> [001226 23:50] wrote:
>
> My idea is to have a daemon, or something resembling one, sitting on
> the box watching the directory. When a new file appears, it starts a perl
> script. This perl script is beyond the scope of my question, but it
> processes all the files in the directory. When finished, it looks for any
> more files and repeats as necessary. If no more files, it exits.
>

This isn't an answer to your main question (i see it's already been
discussed), but you may be able to set up a kevent on the
directory which should inform you if any files are added to it.

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."
RE: waiting for new files in a directory
Dear All,

What you'd really want is some kind of message queueing system for this
kind of work. What message queueing systems are (non-commercially)
available on UNIX systems?

Kees Jan

You are only young once, but you can stay immature all your life.
Re: waiting for new files in a directory
> > > unlock the file
> > >
> > > The cleaner you mentioned: run it every 15 minutes, compare the
> > > date/time on the lockfile, if more than 15 minutes old, grab the PID,
> > > and kill the job, remove the lock.
> >
> > Correct.

Actually, you can make it a lot better:

If the lockfile exists, then kill -0 the PID to see if it is still live.
If not, blow away the lockfile.
If still alive and older than N minutes, blow away the PID and break the lock.

M
--
Mark Murray
Warning: this .sig is umop ap!sdn
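The cleaner's decision could be sketched along these lines. The function name `lock_state`, its four return values and the choice to treat a lockfile without a usable PID as 'dead' are all assumptions for the example. It assumes the lockfile holds the holder's PID on its first line and uses the file's mtime as the lock's age; kill with signal 0 delivers nothing and only checks whether the process exists.

```perl
use strict;
use warnings;
use Errno qw(EPERM);

# Hypothetical helper: classify a lockfile that contains a PID.
#   'free'  - there is no lockfile
#   'dead'  - the holder is gone; the lockfile can be removed
#   'stuck' - the holder is alive but the lock is older than $max_age
#   'busy'  - the holder is alive and the lock is recent
sub lock_state {
    my ($lockfile, $max_age) = @_;
    open(my $fh, '<', $lockfile) or return 'free';
    my $line = <$fh>;
    close($fh);

    my ($pid) = defined $line ? $line =~ /^(\d+)/ : ();
    return 'dead' unless defined $pid;    # no usable PID recorded

    # EPERM means the process exists but belongs to another user.
    my $alive = kill(0, $pid) || $! == EPERM;
    return 'dead' unless $alive;

    my $age = time() - (stat($lockfile))[9];
    return $age > $max_age ? 'stuck' : 'busy';
}
```

A cleaner run from cron would unlink the lockfile on 'dead', and on 'stuck' kill the recorded PID before breaking the lock.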
Re: waiting for new files in a directory
On Wed, Dec 27, 2000 at 01:18:28PM +0200, Peter Pentchev wrote:
[snip..]
>     closedir(D);
>     foreach $fname (@files) {
>         next if (($fname eq ".") || ($fname eq ".."));
>         # more filename vailidity checks go here
                          ^ validity.. *sigh* :P
>         # pattern filtering and subexpression to 'untaint'
>         next unless $fname =~ /^([\w\d._-]+\.cvs)$/;
>         $fname = $1;

G'luck,
Peter

-- 
Thit sentence is not self-referential because "thit" is not a word.
Re: waiting for new files in a directory
On Wed, Dec 27, 2000 at 11:09:40AM +, Mike Bristow wrote:
> On Wed, Dec 27, 2000 at 12:53:37PM +0200, Peter Pentchev wrote:
> > Btw, anybody reading this discussion - I tried the attached script with
> > #!/usr/bin/perl -wT, and Perl died on the unlink() - "unsafe dependency".
> > What gives?
>
> $ man perldiag
> [snip]
>     Insecure dependency in %s
>         (F) You tried to do something that the tainting
>         mechanism didn't like. The tainting mechanism is
>         turned on when you're running setuid or setgid, or
>         when you specify -T to turn it on explicitly. The
>         tainting mechanism labels all data that's derived
>         directly or indirectly from the user, who is
>         considered to be unworthy of your trust. If any such
>         data is used in a "dangerous" operation, you get this
>         error. See the perlsec manpage for more information.
> [snip]
>
> Note that a filename you get from readdir is (indirectly) from the
> user, and unlink counts as dangerous.
>
> Basically, you need to "untaint" $fname in OnePass before using it in
> the unlink call; this is fairly trivial to do, and if you can't work it
> out from perlsec(1), feel free to contact me off-list.

Whoops. Yup, thanks. Updated version attached.

G'luck,
Peter

-- 
Nostalgia ain't what it used to be.

#!/usr/bin/perl -wT
# $Id: procdir.pl,v 1.2 2000/12/27 11:16:38 roam Exp $

use strict;

sub OnePass {
    my $dir = (shift || "");
    my ($fname, @files);

    die("OnePass() requires a dir argument\n") if ($dir eq "");
    opendir(D, $dir) or die("Opening $dir: $!\n");
    @files = readdir(D);
    closedir(D);
    foreach $fname (@files) {
        next if (($fname eq ".") || ($fname eq ".."));
        # more filename vailidity checks go here
        # pattern filtering and subexpression to 'untaint'
        next unless $fname =~ /^([\w\d._-]+\.cvs)$/;
        $fname = $1;
        # ok, we want this file
        print "Processing $dir/$fname\n";
        # done with it..
        unlink("$dir/$fname") or warn("Removing $dir/$fname: $!\n");
        # this is evil - if we could process it, but could not
        # remove it, we might end up processing it again at the next
        # iteration :(
    }
}

sub ProcessDir {
    my $dir = (shift || "");

    die("ProcessDir() requires a dir argument\n") if ($dir eq "");
    for (;;) {
        OnePass($dir);
        # this could be done with select(), with a signal handler,
        # many different ways.. polling and sleep() is easy
        sleep(2);
    }
}

MAIN:{
    # obtain directory name in some way
    my $d = "/tmp";

    ProcessDir($d);
    # er heh.. this should never return :)
    die("ProcessDir() returned?.. $!\n");
}
Re: waiting for new files in a directory
On Wed, Dec 27, 2000 at 12:53:37PM +0200, Peter Pentchev wrote:
> Btw, anybody reading this discussion - I tried the attached script with
> #!/usr/bin/perl -wT, and Perl died on the unlink() - "unsafe dependency".
> What gives?

$ man perldiag
[snip]
    Insecure dependency in %s
        (F) You tried to do something that the tainting
        mechanism didn't like. The tainting mechanism is
        turned on when you're running setuid or setgid, or
        when you specify -T to turn it on explicitly. The
        tainting mechanism labels all data that's derived
        directly or indirectly from the user, who is
        considered to be unworthy of your trust. If any such
        data is used in a "dangerous" operation, you get this
        error. See the perlsec manpage for more information.
[snip]

Note that a filename you get from readdir is (indirectly) from the
user, and unlink counts as dangerous.

Basically, you need to "untaint" $fname in OnePass before using it in
the unlink call; this is fairly trivial to do, and if you can't work it
out from perlsec(1), feel free to contact me off-list.

-- 
Mike Bristow, seebitwopie
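For readers who have not met tainting before, the usual laundering idiom is a pattern match with a capture group: under -T, a captured submatch such as $1 is considered clean. The helper below is only a sketch; the name `untaint_spool_name` and the exact set of permitted characters are assumptions, so tighten the pattern to whatever your spool files really look like.

```perl
use strict;
use warnings;

# Hypothetical helper: return a laundered copy of $name if it looks
# like a spool file name, or undef if it does not.
# Only word characters, dots and dashes are accepted, so a name
# containing a slash or shell metacharacters is rejected.
sub untaint_spool_name {
    my ($name) = @_;
    return $name =~ /^([\w.-]+\.cvs)\z/ ? $1 : undef;
}
```

In a loop over readdir() results you would skip any entry for which this returns undef and pass only the returned value to unlink().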
Re: waiting for new files in a directory
On 27 Dec 2000, at 10:11, Mark Murray wrote:

> > Any ideas on how to do this? Any suggestions on the process?
>
> Simple lock (like flock(2)) in the perl script. Lock some ${FILE},
> and if you can't get the lock, die. The file should contain the PID
> of the process that holds the lock, so that a cleanerd can kill
> stuck processes, or so that the lock can be blown away if needed.
>
> Works like a charm.

Mark and I have been msging offline and he's agreed to my posting the
results of our discussion:

> > > > Thanks Mark. But what part of the solution does flock solve?
> > >
> > > It prevents more than one perl script from running. You can then
> > > cron perl scripts to deal with the incoming, and not worry about
> > > them jumping on each other.
> >
> > Yes. That does make some things much easier. That's a very
> > simple solution.
> >
> > I was looking for a gold-plated solution where messages are
> > processed right away. But it sounds too complicated. I guess
> > setting up a cron job to run every minute is fine.
> >
> > The perl script looks like this:
> >
> >     flock a file, if it fails, die.
> >     Write PID to flocked file.
> >     Loop
> >         Get oldest file in directory (files are named Y.m.d.h.m.s.PID)
> >         process it
> >         move file to archives
> >     until no more files
> >     Truncate file
> >     unlock the file
> >
> > The cleaner you mentioned: run it every 15 minutes, compare the
> > date/time on the lockfile, if more than 15 minutes old, grab the PID,
> > and kill the job, remove the lock.
>
> Correct.

Thanks Mark.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
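The outline above translates into Perl fairly directly. What follows is a sketch under a few assumptions, not the script Dan ended up with: `run_once` and its arguments are hypothetical, "process it" is a callback, "oldest" is taken to be the first name in plain string order (which matches Y.m.d.h.m.s.PID only if the fields are zero-padded), and the archive directory is assumed to exist already.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Copy qw(move);
use IO::Handle;

# Hypothetical driver for one cron run.
# Returns the number of files processed, or -1 if another instance
# already holds the lock.
sub run_once {
    my ($lockfile, $spool, $archive, $process) = @_;

    open(my $lock, '>>', $lockfile) or die("Opening $lockfile: $!\n");
    flock($lock, LOCK_EX | LOCK_NB) or return -1;

    # Record our PID so that a cleaner can find a stuck process.
    truncate($lock, 0) or die("Truncating $lockfile: $!\n");
    $lock->autoflush(1);
    print {$lock} "$$\n";

    my $count = 0;
    for (;;) {
        opendir(my $dh, $spool) or die("Opening $spool: $!\n");
        my @files = sort grep { -f "$spool/$_" } readdir($dh);
        closedir($dh);
        last unless @files;

        my $oldest = $files[0];
        $process->("$spool/$oldest");
        move("$spool/$oldest", "$archive/$oldest")
            or die("Archiving $oldest: $!\n");
        $count++;
    }

    truncate($lock, 0);
    flock($lock, LOCK_UN);
    close($lock);
    return $count;
}
```

Rereading the directory before each file follows the outline literally; it costs a little more than one scan per run, but it also picks up files that arrive while the script is busy.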
Re: waiting for new files in a directory
On Wed, Dec 27, 2000 at 11:17:47PM +1300, Dan Langille wrote:

> On 27 Dec 2000, at 12:11, Peter Pentchev wrote:
>
> > I would do that (and have done it in several projects) using opendir()
> > and readdir(). Open the directory, read entry by entry, when you find
> > a file you want, process it and unlink() it. Get to the end of the dir,
> > sleep, repeat.
>
> Thanks for that.
>
> Do you have code I can use as a starting position?

Try the attached file, it works for me.

Btw, anybody reading this discussion - I tried the attached script with
#!/usr/bin/perl -wT, and Perl died on the unlink() - "unsafe dependency".
What gives?

> > DJB's Maildir concept is based on having two directories - a temporary
> > one where files are created and then atomically move/rename'd to
> > the real one. This works best when the tempdir and the real dir are
> > located on the same filesystem, and you can use the rename() syscall.
>
> At present the files are created through procmail like this:
>
> |/usr/bin/perl $HOME/process_cvs_mail.pl > ~/msgs/$FILE
>
> I guess I could add a rename.

Something like..

| /usr/bin/perl $HOME/process.pl > ~/msgs/$FILE.tmp && \
    mv ~/msgs/$FILE.tmp ~/msgs/$FILE.cvs

..or alternatively, use safecat (which I shall commit real-soon-now), and..

| /usr/bin/perl process.pl | /usr/local/bin/safecat ~/msgs/tmpdir/ ~/msgs/

safecat takes two arguments - a temp dir and the real dir - reads stdin,
and stores it there.

G'luck,
Peter

--
What would this sentence be like if pi were 3?

#!/usr/bin/perl -w
# $Id: procdir.pl,v 1.1 2000/12/27 10:48:30 roam Exp $

use strict;

sub OnePass {
    my $dir = (shift || "");
    my ($fname, @files);

    die("OnePass() requires a dir argument\n") if ($dir eq "");
    opendir(D, $dir) or die("Opening $dir: $!\n");
    @files = readdir(D);
    closedir(D);

    foreach $fname (@files) {
        next if (($fname eq ".") || ($fname eq ".."));
        # more filename validity checks go here
        next unless $fname =~ /\.cvs$/;

        # ok, we want this file
        print "Processing $dir/$fname\n";

        # done with it..
        unlink("$dir/$fname") or warn("Removing $dir/$fname: $!\n");
        # this is evil - if we could process it, but could not
        # remove it, we might end up processing it again at the next
        # iteration :(
    }
}

sub ProcessDir {
    my $dir = (shift || "");

    die("ProcessDir() requires a dir argument\n") if ($dir eq "");
    for (;;) {
        OnePass($dir);
        # this could be done with select(), with a signal handler,
        # many different ways.. polling and sleep() is easy
        sleep(2);
    }
}

MAIN:{
    # obtain directory name in some way
    my $d = "/tmp";

    ProcessDir($d);
    # er heh.. this should never return :)
    die("ProcessDir() returned?.. $!\n");
}
Re: waiting for new files in a directory
> On 27 Dec 2000, at 10:11, Mark Murray wrote:
> >
> > [use flock(2)]
>
> But what part of the solution does flock solve?

It solves the problem of finding out whether the Perl script is already
running, but as I understood the original posting, this isn't what you
were asking. See below.

> I'm not sure if my lack of comprehension stems from my initial
> description being inadequate or my knowledge being too narrow.

Probably from it being a little confusing. Here's how I understand it.

You have some program putting files in directory /x. You need something
that will be notified when a new file appears in /x. That something then
starts a Perl script to process the files.

If you control the program that's putting files into /x, the easiest way
would be to have it send a signal to your daemon. The daemon can put its
PID in a well-known file for the program to look at.

If, however, you don't control the program, you may have to resort to
looking at the directory every now and then and checking for new files
(``polling''). Depending on your application, this may or may not be
acceptable.

If you don't want to use polling, you might try fooling around with the
select(2), poll(2), or kqueue(2) interfaces. The former two were designed
to be used with regular files or sockets, but in unix, a directory is
just a special type of file. I don't know how they'd react to it. In
particular, the EVFILT_VNODE filter with the NOTE_EXTEND event/flag
(notifies you when the file descriptor specified was extended) looks
promising. Then again, I'm not a filesystem whiz, so this may all be
nonsense.

Hopefully I've at least interpreted your question correctly.

Regards

Dima Dorfman
[EMAIL PROTECTED]
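The "send a signal to your daemon" variant could be sketched as below. Everything specific in it is an assumption made for illustration: the choice of SIGUSR1, the sub names, and the idea of pairing the signal with a timeout so that a missed signal only delays processing rather than losing it. It does not touch kqueue at all.

```perl
use strict;
use warnings;

my $wakeups = 0;                      # bumped by the signal handler
$SIG{USR1} = sub { $wakeups++ };      # SIGUSR1 is an arbitrary choice

# Daemon side: record our PID where the producer can find it.
sub write_pidfile {
    my ($path) = @_;
    open(my $fh, '>', $path) or die "Writing $path: $!\n";
    print {$fh} "$$\n";
    close($fh) or die "Closing $path: $!\n";
}

# Producer side: tell the daemon that a new file is ready.
# Returns 1 if the signal was sent, 0 otherwise.
sub notify_daemon {
    my ($path) = @_;
    open(my $fh, '<', $path) or return 0;
    my $line = <$fh>;
    close($fh);
    return 0 unless defined $line && $line =~ /\A(\d+)\s*\z/;
    return kill('USR1', $1) ? 1 : 0;
}

# Daemon side: sleep until signalled or until $timeout seconds pass,
# then report how many signals arrived since the last call.
sub wait_for_work {
    my ($timeout) = @_;
    sleep($timeout) unless $wakeups;  # a signal cuts sleep() short
    my $n = $wakeups;
    $wakeups = 0;
    return $n;
}
```

The daemon would call write_pidfile() once at startup and then loop over wait_for_work() followed by a directory scan. A signal that lands between the check and the sleep() is not lost, but it is only noticed once the timeout expires, so the timeout should be no longer than the delay you can tolerate.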
Re: waiting for new files in a directory
On 27 Dec 2000, at 12:11, Peter Pentchev wrote:

> I would do that (and have done it in several projects) using opendir()
> and readdir(). Open the directory, read entry by entry, when you find
> a file you want, process it and unlink() it. Get to the end of the dir,
> sleep, repeat.

Thanks for that.

Do you have code I can use as a starting position?

> DJB's Maildir concept is based on having two directories - a temporary
> one where files are created and then atomically move/rename'd to
> the real one. This works best when the tempdir and the real dir are
> located on the same filesystem, and you can use the rename() syscall.

At present the files are created through procmail like this:

|/usr/bin/perl $HOME/process_cvs_mail.pl > ~/msgs/$FILE

I guess I could add a rename.

cheers

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
Re: waiting for new files in a directory
On Wed, Dec 27, 2000 at 08:49:51PM +1300, Dan Langille wrote:

> FreshPorts2 will have a new processing strategy for incoming
> messages. Each message will be in a separate file in a predetermined
> directory. As each file arrives, it is processed by a perl script. I want
> only one instance of that perl script running at a given time. This is
> primarily for serialization and to ensure the system doesn't get bogged
> down running perl scripts if many messages arrive in a short period of
> time.
>
> My idea is to have a daemon, or something resembling one, sitting on
> the box watching the directory. When a new file appears, it starts a perl
> script. This perl script is beyond the scope of my question, but it
> processes all the files in the directory. When finished, it looks for any
> more files and repeats as necessary. If no more files, it exits.
>
> If a file arrives, the daemon checks to see if the perl script is already
> running. If so, it doesn't start another one.
>
> Any ideas on how to do this? Any suggestions on the process?

I would do that (and have done it in several projects) using opendir()
and readdir(). Open the directory, read entry by entry, when you find
a file you want, process it and unlink() it. Get to the end of the dir,
sleep, repeat.

Beware of a subtle problem here though - make sure the process which
creates the files does not create them directly in that directory under
their final names; you might very well wind up with a file being
processed before it's fully written. There are two solutions to this
problem - either DJB's Maildir style, or processing files based on
filenames.

DJB's Maildir concept is based on having two directories - a temporary
one where files are created and then atomically move/rename'd to
the real one. This works best when the tempdir and the real dir are
located on the same filesystem, and you can use the rename() syscall.
However, there is a solution if you want the temporary dir on another
filesystem - there is a safecat program, which I shall shortly commit
a port for (it's been sitting in my to-do tree for several weeks now).

The other way is to create the files in the same directory, but with a
different name style, e.g. ending in .tmp; then when you readdir() an
entry, only process those not ending in .tmp, or only process those
ending in .xml, or something like that. This might be a bit easier
to implement.

G'luck,
Peter

--
If there were no counterfactuals, this sentence would not have been paradoxical.
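The second approach (same directory, different name style) might look like this in Perl. Again just a sketch: the sub names deliver and ready_files are invented for the example, and the .tmp and .cvs suffixes are only one possible naming convention. rename() within one directory stays on one filesystem, so the scanner never sees a .cvs file that is only partly written.

```perl
use strict;
use warnings;

# Producer: write $data under a ".tmp" name, then rename it into place.
# Returns the final path.
sub deliver {
    my ($dir, $name, $data) = @_;
    my $tmp   = "$dir/$name.tmp";
    my $final = "$dir/$name.cvs";

    open(my $fh, '>', $tmp) or die "Creating $tmp: $!\n";
    print {$fh} $data or die "Writing $tmp: $!\n";
    close($fh) or die "Closing $tmp: $!\n";

    # Only now does the file become visible under a name the scanner
    # is willing to pick up.
    rename($tmp, $final) or die "Renaming $tmp: $!\n";
    return $final;
}

# Consumer: names in $dir that are complete and safe to process.
sub ready_files {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Opening $dir: $!\n";
    my @ready = sort grep { /\.cvs\z/ } readdir($dh);
    closedir($dh);
    return @ready;
}
```

A scanner built on ready_files() simply never looks at anything still ending in .tmp, so a producer that crashes mid-write leaves behind a stale .tmp file rather than a truncated message.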
Re: waiting for new files in a directory
On 27 Dec 2000, at 10:11, Mark Murray wrote:

> > Any ideas on how to do this? Any suggestions on the process?
>
> Simple lock (like flock(2)) in the perl script. Lock some ${FILE},
> and if you can't get the lock, die. The file should contain the PID
> of the process that holds the lock, so that a cleanerd can kill
> stuck processes, or so that the lock can be blown away if needed.
>
> Works like a charm.

Thanks Mark. But what part of the solution does flock solve?

I'm not sure if my lack of comprehension stems from my initial
description being inadequate or my knowledge being too narrow.

--
Dan Langille
The FreeBSD Diary - http://freebsddiary.org/
FreshPorts - http://freshports.org/
NZ Broadband - http://unixathome.org/broadband/
Re: waiting for new files in a directory
> Any ideas on how to do this? Any suggestions on the process?

Simple lock (like flock(2)) in the perl script. Lock some ${FILE},
and if you can't get the lock, die. The file should contain the PID
of the process that holds the lock, so that a cleanerd can kill
stuck processes, or so that the lock can be blown away if needed.

Works like a charm.

M

--
Mark Murray
Warning: this .sig is umop ap!sdn