Bug#590074: awstats: DO NOT use cron scripts to update stats database
On Fri, Jul 23, 2010 at 08:44:18PM +0200, Jonas Smedegaard wrote:
> On Fri, Jul 23, 2010 at 09:54:23PM +0400, Sergey B Kirpichev wrote:
>>> I guess because I haven't written a proper config file yet? Anyway, it's still spamming my syslog *every 10 minutes*. This should at least be an option that's off by default.
>>
>> It's a fresh install, right?
>>
>> On Fri, Jul 23, 2010 at 9:39 PM, Jonas Smedegaard jo...@jones.dk wrote:
>>> I experienced cron spam too when trying to install awstats recently (and too busy at the time to investigate further - just cursed a bit and uninstalled awstats again). Possibly not a helpful comment - just want to hint that there might actually be an issue of cron spam in virgin installs of awstats currently.
>>
>> I guess, we can disable cron jobs by default on a fresh install. As /etc/awstats/awstats.conf is not configured by default,
>
> A virgin install must not cause cron spam. If you implicitly acknowledge above that awstats currently does, then yes, we should disable it by default (or figure out something more clever).

I'm also experiencing such spamming, and it gets even worse as it runs as www-data, and there's no /etc/aliases redirecting it to a real user by default in exim, it seems. So there's an ever growing mailbox in /var/spool/mail/www-data :-(

See #496029 that seems to relate to the aliases problem. Still this needs to be addressed on awstats side too, I guess.

Hope this helps.

Best regards,

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Bug#590074: awstats: DO NOT use cron scripts to update stats database
Package: awstats
Version: 6.9.5~dfsg-2
Severity: important

Currently this package installs a cron job that runs every ten minutes. This is a VERY bad idea:

- if logrotate(8) runs during those 10 minutes, some log entries will fail to be accounted for by awstats
- it wastes resources parsing the same log files every 10 minutes, especially if they get big
- it makes logcheck(8) spam my inbox every hour due to the cron job failing every 10 minutes

A better solution is to hook the update script onto the logrotate(8) entries for any installed webservers (eg. /etc/logrotate.d/lighttpd,apache2). This solves all of the 3 problems I just mentioned.

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: LANG=en_GB.utf8, LC_CTYPE=en_GB.utf8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages awstats depends on:
ii  perl                 5.10.1-13  Larry Wall's Practical Extraction

Versions of packages awstats recommends:
ii  libnet-xwhois-perl   0.90-3     Whois Client Interface for Perl5

Versions of packages awstats suggests:
pn  libgeo-ipfree-perl   none       (no description available)
ii  libnet-dns-perl      0.66-2     Perform DNS queries from a Perl sc
ii  libnet-ip-perl       1.25-2     Perl extension for manipulating IP
ii  liburi-perl          1.54-1     module to manipulate and access UR
ii  lighttpd [httpd]     1.4.26-3   A fast webserver with minimal memo

-- no debconf information
Bug#590074: [Pkg-awstats-devel] Bug#590074: awstats: DO NOT use cron scripts to update stats database
Hi Ximin,

On Fri, Jul 23, 2010 at 01:31:57PM +0100, Ximin Luo wrote:
> Currently this package installs a cron job that runs every ten minutes. This is a VERY bad idea:
>
> - if logrotate(8) runs during those 10 minutes, some log entries will fail to be accounted for by awstats
> - it wastes resources parsing the same log files every 10 minutes, especially if they get big
> - it makes logcheck(8) spam my inbox every hour due to the cron job failing every 10 minutes
>
> A better solution is to hook the update script onto the logrotate(8) entries for any installed webservers (eg. /etc/logrotate.d/lighttpd,apache2). This solves all of the 3 problems I just mentioned.

Good points!

Frequent updates of logfiles have their uses too, however. But not always - and the downsides you raise here are valid.

I suggest to

a) split the current cron job into infrequent and frequent jobs,
b) make the frequent one optional (ideally through debconf), and
c) invoke the infrequent job also (or instead?) as a logrotate hook.

How does that sound?

 - Jonas

--
 * Jonas Smedegaard - idealist Internet-arkitekt
 * Tlf.: +45 40843136 Website: http://dr.jones.dk/

 [x] quote me freely [ ] ask before reusing [ ] keep private
Bug#590074: [Pkg-awstats-devel] Bug#590074: awstats: DO NOT use cron scripts to update stats database
On 23/07/10 14:31, Jonas Smedegaard wrote:
> I suggest to
>
> a) split the current cron job into infrequent and frequent jobs,
> b) make the frequent one optional (ideally through debconf), and
> c) invoke the infrequent job also (or instead?) as a logrotate hook.
>
> How does that sound?

Yeah, that works. Though looking at the current script, there's not really much need to split it up. The main problem is to do the logrotate hook itself - ideally you'd add it directly to the webserver entry rather than a new awstats entry.

Is that going to be a pain - editing another package's configuration files? I dunno what infrastructure / policy Debian has for this sort of thing.
Bug#590074: [Pkg-awstats-devel] Bug#590074: Bug#590074: awstats: DO NOT use cron scripts to update stats database
On Fri, Jul 23, 2010 at 02:57:43PM +0100, Ximin Luo wrote:
> On 23/07/10 14:31, Jonas Smedegaard wrote:
>> I suggest to
>>
>> a) split the current cron job into infrequent and frequent jobs,
>> b) make the frequent one optional (ideally through debconf), and
>> c) invoke the infrequent job also (or instead?) as a logrotate hook.
>>
>> How does that sound?
>
> Yeah, that works. Though looking at the current script, there's not really much need to split it up. The main problem is to do the logrotate hook itself - ideally you'd add it directly to the webserver entry rather than a new awstats entry.
>
> Is that going to be a pain - editing another package's configuration files? I dunno what infrastructure / policy Debian has for this sort of thing.

Ah, yes. That was it: Not doable!

Or more accurately: It needs to be implemented in webserver packages that this awstats package can hook into - not doable in awstats alone.

 - Jonas
Bug#590074: [Pkg-awstats-devel] Bug#590074: Bug#590074: awstats: DO NOT use cron scripts to update stats database
clone 590074 -1 -2 -3
reassign -1 apache2
reassign -2 lighttpd
reassign -3 nginx
retitle -1 add pre-rotate hook to logrotate script
retitle -2 add pre-rotate hook to logrotate script
retitle -3 add pre-rotate hook to logrotate script
severity -1 wishlist
severity -2 wishlist
severity -3 wishlist
thanks

Please add something like the following snippet to the logrotate script for your package:

    prerotate
        if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
            run-parts /etc/logrotate.d/httpd-prerotate; \
        fi; \
    endscript

(or some suitable directory other than the one suggested; I'm not sure what Debian naming conventions are.)

This would be greatly helpful to log-parsing packages such as awstats, which can then set up hooks to process these logs before they get rotated (see #590074 for an example).

X

On 23/07/10 17:03, Jonas Smedegaard wrote:
> On Fri, Jul 23, 2010 at 02:57:43PM +0100, Ximin Luo wrote:
>> On 23/07/10 14:31, Jonas Smedegaard wrote:
>>> I suggest to
>>>
>>> a) split the current cron job into infrequent and frequent jobs,
>>> b) make the frequent one optional (ideally through debconf), and
>>> c) invoke the infrequent job also (or instead?) as a logrotate hook.
>>>
>>> How does that sound?
>>
>> Yeah, that works. Though looking at the current script, there's not really much need to split it up. The main problem is to do the logrotate hook itself - ideally you'd add it directly to the webserver entry rather than a new awstats entry.
>>
>> Is that going to be a pain - editing another package's configuration files? I dunno what infrastructure / policy Debian has for this sort of thing.
>
> Ah, yes. That was it: Not doable!
>
> Or more accurately: It needs to be implemented in webserver packages that this awstats package can hook into - not doable in awstats alone.
>
>  - Jonas
Bug#590074: [Pkg-awstats-devel] Bug#590074: awstats: DO NOT use cron scripts to update stats database
On Fri, Jul 23, 2010 at 4:31 PM, Ximin Luo infini...@gmx.com wrote:
> Currently this package installs a cron job that runs every ten minutes. This is a VERY bad idea:
>
> - if logrotate(8) runs during those 10 minutes, some log entries will fail to be accounted for by awstats

logrotate every 10 minutes - could be the source of trouble. Not awstats.

> - it wastes resources parsing the same log files every 10 minutes, especially if they get big

Do you mean, it parses the _same_ log entries? No, awstats doesn't do such stupid things. Actually, it does an lseek on the file to the last known entry and then begins parsing.

> - it makes logcheck(8) spam my inbox every hour due to the cron job failing every 10 minutes

Why exactly does it fail? Did you first try to comment out the crontab entry and fix the source of the failure?

> A better solution is to hook the update script onto the logrotate(8) entries for any installed webservers (eg. /etc/logrotate.d/lighttpd,apache2). This solves all of the 3 problems I just mentioned.
>
> Package: awstats
> Version: 6.9.5~dfsg-2
> Severity: important

I disagree with the severity. Looks like a very site-specific/workload-specific issue. Your logrotate-based solution could be suggested as an option in README.Debian for specific setups.

On Fri, Jul 23, 2010 at 5:31 PM, Jonas Smedegaard jo...@jones.dk wrote:
> Frequent updates of logfiles have their uses too, however. But not always - and the downsides you raise here are valid.

True.

> I suggest to
>
> a) split the current cron job into infrequent and frequent jobs,
> b) make the frequent one optional (ideally through debconf), and
> c) invoke the infrequent job also (or instead?) as a logrotate hook.

How to split a) or c)? It's easy only from the local admin side.

We can make the cron job frequency configurable through debconf. Is that an option?
Bug#590074: [Pkg-awstats-devel] Bug#590074: awstats: DO NOT use cron scripts to update stats database
On 23/07/10 17:46, Sergey B Kirpichev wrote:
> On Fri, Jul 23, 2010 at 4:31 PM, Ximin Luo infini...@gmx.com wrote:
>> Currently this package installs a cron job that runs every ten minutes. This is a VERY bad idea:
>>
>> - if logrotate(8) runs during those 10 minutes, some log entries will fail to be accounted for by awstats
>
> logrotate every 10 minutes - could be the source of trouble. Not awstats.

No, logrotate isn't running every 10 minutes. I think you misunderstood my point. If logrotate runs between the 10-minute cron runs of awstats, it will rotate the log entries since the last 10-minute run, so the next 10-minute run won't be able to see it any more.

>> - it wastes resources parsing the same log files every 10 minutes, especially if they get big
>
> Do you mean, it parses the _same_ log entries? No, awstats doesn't do such stupid things. Actually, it does an lseek on the file to the last known entry and then begins parsing.

What if the file has been truncated or removed by logrotate?

>> - it makes logcheck(8) spam my inbox every hour due to the cron job failing every 10 minutes
>
> Why exactly does it fail? Did you first try to comment out the crontab entry and fix the source of the failure?

I guess because I haven't written a proper config file yet? Anyway, it's still spamming my syslog *every 10 minutes*. This should at least be an option that's off by default.

>> A better solution is to hook the update script onto the logrotate(8) entries for any installed webservers (eg. /etc/logrotate.d/lighttpd,apache2). This solves all of the 3 problems I just mentioned.
>>
>> Package: awstats
>> Version: 6.9.5~dfsg-2
>> Severity: important
>
> I disagree with the severity. Looks like a very site-specific/workload-specific issue. Your logrotate-based solution could be suggested as an option in README.Debian for specific setups.

logrotate is part of the standard install for Debian webservers (at least apache2 and lighttpd). this is not site specific.

> On Fri, Jul 23, 2010 at 5:31 PM, Jonas Smedegaard jo...@jones.dk wrote:
>> Frequent updates of logfiles have their uses too, however. But not always - and the downsides you raise here are valid.
>
> True.
>
>> I suggest to
>>
>> a) split the current cron job into infrequent and frequent jobs,
>> b) make the frequent one optional (ideally through debconf), and
>> c) invoke the infrequent job also (or instead?) as a logrotate hook.
>
> How to split a) or c)? It's easy only from the local admin side.
>
> We can make the cron job frequency configurable through debconf. Is that an option?
Bug#590074: [Pkg-awstats-devel] Bug#590074: Bug#590074: awstats: DO NOT use cron scripts to update stats database
On Fri, Jul 23, 2010 at 08:46:27PM +0400, Sergey B Kirpichev wrote:
> On Fri, Jul 23, 2010 at 4:31 PM, Ximin Luo infini...@gmx.com wrote:
>> Currently this package installs a cron job that runs every ten minutes. This is a VERY bad idea:
>>
>> - if logrotate(8) runs during those 10 minutes, some log entries will fail to be accounted for by awstats
>
> logrotate every 10 minutes - could be the source of trouble. Not awstats.

Looks like a possible language problem:

  during != every
  during ~= in between

If this didn't help, please follow-up on Ximin's response instead of mine :-)

>> - it makes logcheck(8) spam my inbox every hour due to the cron job failing every 10 minutes
>
> Why exactly does it fail? Did you first try to comment out the crontab entry and fix the source of the failure?

I experienced cron spam too when trying to install awstats recently (and too busy at the time to investigate further - just cursed a bit and uninstalled awstats again).

Possibly not a helpful comment - just want to hint that there might actually be an issue of cron spam in virgin installs of awstats currently.

> On Fri, Jul 23, 2010 at 5:31 PM, Jonas Smedegaard jo...@jones.dk wrote:
>> I suggest to
>>
>> a) split the current cron job into infrequent and frequent jobs,
>> b) make the frequent one optional (ideally through debconf), and
>> c) invoke the infrequent job also (or instead?) as a logrotate hook.
>
> How to split a) or c)? It's easy only from the local admin side.

I must admit that I have lost track of most recent improvements by you, but seem to recall in the past that it made sense for my local scripts to distinguish between heavier monthly/weekly log analysis routines and smaller hourly ones. But perhaps that was because I (for other reasons) analyzed the files from scratch again each month...

Let's first figure out if the current frequent cron job really is heavy on system resources, and only if it is I can try to elaborate more on my ideas here.

> We can make the cron job frequency configurable through debconf. Is that an option?

I would prefer to keep the cron file as a conffile and instead have the invoked script check a flag in /etc/default/awstats if it should really run or just quit immediately.

But again, let's first resolve if it really is necessary.

 - Jonas
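The flag check Jonas describes could look roughly like the following. This is a minimal sketch under stated assumptions: the variable name AWSTATS_ENABLE_CRON and the helper cron_enabled are hypothetical, since the thread only says that the invoked script should consult "a flag in /etc/default/awstats".

```shell
#!/bin/sh
# Sketch only: AWSTATS_ENABLE_CRON and cron_enabled are hypothetical names,
# not taken from the awstats package.

# cron_enabled DEFAULTS_FILE
# Succeeds (exit status 0) only if the defaults file sets the flag to "yes".
cron_enabled() {
    defaults_file=$1
    AWSTATS_ENABLE_CRON=no            # off unless the admin opts in
    if [ -r "$defaults_file" ]; then
        # The defaults file is plain shell variable assignments.
        . "$defaults_file"
    fi
    [ "$AWSTATS_ENABLE_CRON" = "yes" ]
}

# The script invoked from cron would then start with something like:
#   cron_enabled /etc/default/awstats || exit 0
```

With this shape the cron file itself stays an unmodified conffile, and a fresh install with no defaults file simply exits quietly instead of failing every ten minutes.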
Bug#590074: [Pkg-awstats-devel] Bug#590074: awstats: DO NOT use cron scripts to update stats database
> No, logrotate isn't running every 10 minutes. I think you misunderstood my point. If logrotate runs between the 10-minute cron runs of awstats, it will rotate the log entries since the last 10-minute run, so the next 10-minute run won't be able to see it any more.

That's true. But it's a known issue and your logrotate hint is already documented in README.Debian for this purpose.

On Fri, Jul 23, 2010 at 9:39 PM, Jonas Smedegaard jo...@jones.dk wrote:
> Looks like a possible language problem:
>
>   during != every
>   during ~= in between

Probably ;)

>>> - it wastes resources parsing the same log files every 10 minutes, especially if they get big
>>
>> Do you mean, it parses the _same_ log entries? No, awstats doesn't do such stupid things. Actually, it does an lseek on the file to the last known entry and then begins parsing.
>
> What if the file has been truncated or removed by logrotate?

In this case it starts from the first line of course...

Please consider enabling the EnableLockForUpdate feature. From the README.Debian:

--8<--
Also consider enabling lock files in /etc/awstats/awstats.conf with EnableLockForUpdate=1 so that only one AWStats update process is running at a time. This will reduce system resources especially if the AWStats update process takes longer than 10 minutes to complete. This solution has some security drawbacks: lockfile with well-known name and writable by www-data user.
-->8--

> I guess because I haven't written a proper config file yet? Anyway, it's still spamming my syslog *every 10 minutes*. This should at least be an option that's off by default.

It's a fresh install, right?

On Fri, Jul 23, 2010 at 9:39 PM, Jonas Smedegaard jo...@jones.dk wrote:
> I experienced cron spam too when trying to install awstats recently (and too busy at the time to investigate further - just cursed a bit and uninstalled awstats again).
>
> Possibly not a helpful comment - just want to hint that there might actually be an issue of cron spam in virgin installs of awstats currently.

I guess, we can disable cron jobs by default on a fresh install. As /etc/awstats/awstats.conf is not configured by default,

>> I disagree with the severity. Looks like a very site-specific/workload-specific issue. Your logrotate-based solution could be suggested as an option in README.Debian for specific setups.
>
> logrotate is part of the standard install for Debian webservers (at least apache2 and lighttpd). this is not site specific.

Yes, but your logrotate settings are very site specific and far, far away from defaults...
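The "seek to the last known entry, restart from the first line if the file shrank" behaviour discussed above can be illustrated in a few lines of shell. This is not AWStats code; it is a sketch of the general technique, and the function name new_entries and the state-file layout are made up for the example.

```shell
#!/bin/sh
# Illustration only, not AWStats code: remember how many bytes of a log were
# already processed and continue from that offset on the next run.

# new_entries LOGFILE STATEFILE
# Prints the part of LOGFILE added since the previous call and records the
# new offset in STATEFILE. If the log is now shorter than the stored offset
# (truncated or replaced by logrotate), reading restarts from the first byte.
new_entries() {
    logfile=$1
    statefile=$2
    offset=0
    if [ -r "$statefile" ]; then
        offset=$(cat "$statefile")
    fi
    offset=${offset:-0}
    size=$(wc -c < "$logfile")
    size=$((size + 0))        # drop padding some wc implementations print
    if [ "$size" -lt "$offset" ]; then
        offset=0              # file shrank: start over
    fi
    # tail -c +N starts output at byte N (1-based); head caps it at the
    # size measured above so the stored offset matches what was printed.
    tail -c "+$((offset + 1))" "$logfile" | head -c "$((size - offset))"
    printf '%s\n' "$size" > "$statefile"
}
```

The sketch also shows the gap raised in the bug report: whatever was appended between the last run and the rotation is in the rotated file, not in the new one, so restarting from the first line does not recover it.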
Bug#590074: [Pkg-awstats-devel] Bug#590074: awstats: DO NOT use cron scripts to update stats database
On 23/07/10 18:54, Sergey B Kirpichev wrote:
>> I guess because I haven't written a proper config file yet? Anyway, it's still spamming my syslog *every 10 minutes*. This should at least be an option that's off by default.
>
> It's a fresh install, right?

yeah.

>> logrotate is part of the standard install for Debian webservers (at least apache2 and lighttpd). this is not site specific.
>
> Yes, but your logrotate settings are very site specific and far, far away from defaults...

Where did I suggest that I edited my logrotate scripts? They are unchanged since being installed...

Or do you mean the solution I proposed? AFAICS Debian utility packages normally assume they're going to be used on/by other Debian packages, so it's fine to assume that awstats is being installed for the logs on some local Debian webserver package.

In fact, the solution I described in the cloned bug reports above won't put any extra effort on awstats maintenance:

- awstats adds some update scripts into DIR. job done on the awstats side.
- default logrotate scripts of various webservers call DIR when rotating logs. (what I made those cloned reports for)
- if a site admin wants to use non-default log settings, then they'll need to edit their logrotate scripts anyhow.

X
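An update script of the kind awstats would drop into DIR might look like the sketch below. Several things here are assumptions, not facts from the thread: the path /usr/lib/cgi-bin/awstats.pl, the convention that each site has a file named awstats.NAME.conf under /etc/awstats, and the helper names list_configs and update_all. Only the -config and -update options of awstats.pl and the run-parts mechanism are taken as given.

```shell
#!/bin/sh
# Hypothetical hook, e.g. /etc/logrotate.d/httpd-prerotate/awstats (the
# directory is only the name suggested in the cloned bug reports).
# Debian's run-parts skips file names containing a dot, so no ".sh" suffix.

# Assumed location of the AWStats script; adjust for the local install.
AWSTATS_CMD=${AWSTATS_CMD:-/usr/lib/cgi-bin/awstats.pl}

# list_configs DIR: print the NAME part of every DIR/awstats.NAME.conf
list_configs() {
    for f in "$1"/awstats.*.conf; do
        [ -e "$f" ] || continue       # glob matched nothing
        name=${f##*/awstats.}
        printf '%s\n' "${name%.conf}"
    done
}

# update_all DIR: refresh the statistics of every configured site, so the
# entries are accounted for before the web server log is rotated away.
update_all() {
    list_configs "$1" | while read -r name; do
        "$AWSTATS_CMD" -config="$name" -update >/dev/null </dev/null
    done
}

# A real hook would end with:  update_all /etc/awstats
```

Because the hook runs from the web server's prerotate script, it fires exactly once per rotation and needs no knowledge of which web server is installed.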
Bug#590074: awstats: DO NOT use cron scripts to update stats database
On Fri, Jul 23, 2010 at 09:54:23PM +0400, Sergey B Kirpichev wrote:
>> I guess because I haven't written a proper config file yet? Anyway, it's still spamming my syslog *every 10 minutes*. This should at least be an option that's off by default.
>
> It's a fresh install, right?
>
> On Fri, Jul 23, 2010 at 9:39 PM, Jonas Smedegaard jo...@jones.dk wrote:
>> I experienced cron spam too when trying to install awstats recently (and too busy at the time to investigate further - just cursed a bit and uninstalled awstats again).
>>
>> Possibly not a helpful comment - just want to hint that there might actually be an issue of cron spam in virgin installs of awstats currently.
>
> I guess, we can disable cron jobs by default on a fresh install. As /etc/awstats/awstats.conf is not configured by default,

A virgin install must not cause cron spam. If you implicitly acknowledge above that awstats currently does, then yes, we should disable it by default (or figure out something more clever).

>>> I disagree with the severity. Looks like a very site-specific/workload-specific issue. Your logrotate-based solution could be suggested as an option in README.Debian for specific setups.
>>
>> logrotate is part of the standard install for Debian webservers (at least apache2 and lighttpd). this is not site specific.
>
> Yes, but your logrotate settings are very site specific and far, far away from defaults...

I fail to understand: Did I miss something and we have already been provided the specific logrotate config of that host?

If you are simply guessing, I suggest you state that more clearly, and be kinder about alternative viewpoints here. :-)

 - Jonas