On 01/24/2011 02:43 PM, Kyle Ambroff wrote:
> On Mon, Jan 24, 2011 at 1:27 PM, Alex Mandel <tech_...@wildintellect.com>
> wrote:
>> So I'm trying to setup automated remote backup of some files from
>> machine1 to machine2 using something simple like rsync. What I'm having
>> trouble figuring out is what user to run it as and how to get that user
>> the correct permissions.
>>
>> In the example use case I want to copy my Apache logs over to a 2nd
>> machine to run awstats on it without putting much of a load on the
>> actual web server. I was thinking of creating a "backup" user,
>> generating a passphraseless key and then rsync on a cron timer.
>> Should this user be a system user (below 1000) or a regular user (above
>> 1000), since it needs a key I would assume it needs to be a regular user
>> with a home directory?
>>
>> Question 2 is how do I make sure it has permissions to read the logs?
>> It appears that most of /var/log/apache2 files are root:adm but some are
>> root:root. If they were all g+r for adm then just adding my backup user
>> to the adm group should work?
>>
>> Looks like I need to go figure out why some logs have a different group.
>
> I really don't think using SSH for stuff like this is a good idea.
> It's just too hard to get the security right, especially with a
> passphraseless key. Too scary for me. Just don't do it.
>
> If your main use case is collecting statistics from your web servers
> then I suggest you look at Ganglia[1]. One of my coworkers has
> released a bunch of really awesome Ganglia tools that we use at Linden
> Lab for monitoring 10k+ servers, many of which are running Apache.
>
> http://ben.hartshorne.net/ganglia/
>
> Check out ganglia-logtailer for example. It includes support for
> collecting the following stats from Apache:
>
> * Requests per second
> * Requests per second broken down by HTTP method
> * Average query processing time
> * Ninetieth percentile query processing time
> * Number of 200, 300, 400 and 500 responses per second.
>
> All of this data ends up on your Ganglia dashboard, along with general
> system health. As an added bonus you can use his ganglios plugin for
> Nagios[2] to set up alerts on any value in Ganglia. This is just
> fantastic once you have it set up. You can set it up to send SMS
> messages or emails if you have a spike in 500 responses, for example.
> Having historical performance data can be a life saver as well.
>
> -Kyle
>
> [1] http://ganglia.sourceforge.net/
> [2] http://www.nagios.org/

Interesting idea, but that isn't really the data I'm trying to get. I
already have munin running for general health tracking. The analysis of
the logs is more about who is visiting what, where they are from, and
inferring how long they stayed on the site. So it really relies on having
the log files as a whole and running them through some tools, one of
which is awstats; another is an import into an RDBMS for more exact query
reporting.

Thanks,
Alex
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech
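
For illustration only, the kind of whole-log analysis described in the reply above (who visited what, and roughly how long they stayed) might be sketched in Python as below. This is not how awstats works internally, and nothing here comes from the thread: the names LOG_RE, TIME_FMT and summarize_visits are hypothetical, and the sketch assumes the logs use Apache's common/combined log format. It treats each client address as one visitor and takes "time on site" to be the gap between that address's first and last request, which is a crude approximation. It makes no attempt to work out where visitors are from.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: Apache common/combined log format, e.g.
# 203.0.113.5 - - [24/Jan/2011:13:27:00 -0800] "GET /index.html HTTP/1.1" 200 2326 ...
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)
TIME_FMT = "%d/%b/%Y:%H:%M:%S %z"


def summarize_visits(lines):
    """Group requests by client address (hypothetical helper).

    Returns {host: {"paths": [...], "seconds": float}}, where "seconds" is
    the time between the first and last request seen from that host.
    """
    visits = defaultdict(lambda: {"paths": [], "first": None, "last": None})
    for line in lines:
        match = LOG_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        when = datetime.strptime(match.group("time"), TIME_FMT)
        visit = visits[match.group("host")]
        visit["paths"].append(match.group("path"))
        if visit["first"] is None or when < visit["first"]:
            visit["first"] = when
        if visit["last"] is None or when > visit["last"]:
            visit["last"] = when
    return {
        host: {
            "paths": visit["paths"],
            "seconds": (visit["last"] - visit["first"]).total_seconds(),
        }
        for host, visit in visits.items()
    }
```

A sketch like this needs every request from a visitor in one place, which is why it depends on having the complete log files on the analysis machine rather than sampled metrics.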