This mail is an automated notification from the task tracker
of the project: Gna! Administration.
/**************************************************************************/
[task #119] Latest Modifications:
Changes by:
Vincent Caron <[EMAIL PROTECTED]>
Date:
Mon 10/04/04 at 15:17 (Europe/Paris)
------------------ Additional Follow-up Comments ----------------------------
I'd like to give this a try this week. I propose creating the dummy project
'stats'; this way home.gna.org/stats and download.gna.org/stats will be
reserved for the stat outputs (and we can go on with mail.gna.org/stats, etc.).
If that's fine with you, please create it.
/**************************************************************************/
[task #119] Full Item Snapshot:
URL: <http://gna.org/task/?func=detailitem&item_id=119>
Project: Gna! Administration
Submitted by: Mathieu Roy
On: Mon 02/02/04 at 02:00
Should Start On: Mon 02/02/04 at 00:00
Should be Finished on: Thu 12/02/04 at 00:00
Category: Services Functionalities
Priority: 5 - Normal
Resolution: None
Privacy: Public
Assigned to: None
Percent Complete: 0%
Status: Open
Effort: 0.00
Summary: webalizer statistics for download area + homepage per project
Original Submission: We should provide "webalizer statistics for download area
+ homepage per project"
Follow-up Comments
------------------
-------------------------------------------------------
Date: Mon 10/04/04 at 15:17 By: Vincent Caron <zerodeux>
I'd like to give this a try this week. I propose creating the dummy project
'stats'; this way home.gna.org/stats and download.gna.org/stats will be
reserved for the stat outputs (and we can go on with mail.gna.org/stats, etc.).
If that's fine with you, please create it.
-------------------------------------------------------
Date: Thu 09/23/04 at 18:40 By: Mathieu Roy <yeupou>
Vincent, do you think you'll have time to look into Analog? Otherwise, we can,
in the meantime, work to get webalizer stats.
-------------------------------------------------------
Date: Sat 09/18/04 at 18:18 By: Nicolas LAURENT <nicoo>
How can I help to make things happen?
-------------------------------------------------------
Date: Wed 06/02/04 at 12:38 By: Mathieu Roy <yeupou>
Feel free to go for it :)
-------------------------------------------------------
Date: Tue 06/01/04 at 15:45 By: Vincent Caron <zerodeux>
Webalizer:
* Woody: 2.01.10, 2002/04/22
* lang: C
* deps: libgd
* demo: http://demo.latinwebs.net/webalizer/
* output: mostly HTML 4.0 compliant
Analog:
* Woody: 5.23, 2002/05/18
* lang: C
* deps: libgd
* demo: http://www.chiark.greenend.org.uk/~sret1/stats/
* output: fully XHTML 1.0 compliant
Analog can draw good-looking pseudo-bargraphs without invoking libgd, which can
save a lot of CPU cycles. I'd like to experiment with that one.
-------------------------------------------------------
Date: Tue 06/01/04 at 12:14 By: Vincent Caron <zerodeux>
RRD is just about graphing some data; it's the underlying tool for MRTG.
I've made a little experiment where I graph per-site total hits, loaded pages
and bandwidth ('hometraffic' script attached). The idea was to make it simple
and light enough to be run every 5 min in order to have a real-time measure,
just like with MRTG. I'm not very satisfied: it can become CPU-intensive in
some cases.
Anyway, this gives us some code to demux per-site data, which we could feed to
webalizer. The simplest way is to run the analysis during the Apache log
rotation (this way we don't need logtail). I'd suggest having it rotate twice
a day for now.
BTW, I'm used to analog, which is also technically very nice. It would be
cool to make a few comparisons to see which one fits better.
-------------------------------------------------------
Date: Fri 05/28/04 at 15:47 By: Mathieu Roy <yeupou>
I'm generally not satisfied with rrd, but the truth is that I'm not very
familiar with it. Can you generate data as meaningful as webalizer's with
rrd?
I was also interested in the idea of giving projects their logs, not just
statistics, but that's not a priority.
-------------------------------------------------------
Date: Thu 05/27/04 at 18:02 By: Vincent Caron <zerodeux>
It would be very simple to feed an rrd database from the logs and have a
per-project throughput graph. I just figured out that barely 20 lines of perl
called from logrotate on access.log would do the trick. What do you think?
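
As a rough illustration of the idea above, here is a minimal sketch of the
per-project demultiplexing step. It is written in Python for brevity (the
scripts discussed in this item are Perl). It assumes Common Log Format lines
and assumes that the project is the first path component of the requested URL;
the function name is hypothetical, and feeding the totals to rrd is left out.

```python
import re
from collections import defaultdict

# Assumed Common Log Format:
#   host ident user [date] "METHOD /path PROTO" status bytes
# The first path component is captured as the project name (an assumption
# about the URL layout, not something confirmed in this item).
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ /([^/\s?]+)\S* [^"]*" \d{3} (\d+|-)'
)


def per_project_traffic(lines):
    """Return {project: [hits, bytes]} summed over the given log lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            # Unparseable lines and requests for "/" are skipped.
            continue
        project, size = match.groups()
        totals[project][0] += 1
        if size != "-":  # "-" means no body was sent
            totals[project][1] += int(size)
    return dict(totals)
```

Note that a top-level file such as /robots.txt would be counted as a project
by this sketch; a real script would need a list of known project names.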
-------------------------------------------------------
Date: Tue 05/18/04 at 17:20 By: Mathieu Roy <yeupou>
« The purpose seems to update web pages from cvs, is it ? »
Yes. It just reads a text file giving the unix name of each project and then
downloads the content.
The list of projects is generated by a script that lives outside of the
chroot. This allows the www chroot to run without a mysql client and without
mysql access.
Apart from that, Gna! uses the backend with nothing really specific.
In the case of interest to us, the issue is not really related to Savane.
What the script has to do relates only to apache and webalizer.
As I detailed before:
The best is to have a script that takes as arguments on the command line:
--conffile= path to the configuration file, like /var/webalizer/group.conf
--logfile= path to the apache log, like /var/log/apache/bygroup/group.log
--outputdir= path to the output directory for webalizer,
             like /var/www/group/webalizer/
--title= title for the webalizer results
It would be easy then to adapt the script homepage-update.pl to call that
script.
No group info other than the system name is available. The home.gna.org
system is in a chroot that has no access to the database or anything else.
--
Another script (a trivial one) like homepage-update.pl should be written to run
webalizer on each conffile, and that script will be run via cron. In fact,
homepage-update.pl would only need very trivial changes for that.
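
The cron-driven loop described above could be sketched as follows, in Python
for brevity rather than the Perl the project uses. It assumes one conffile per
group in /var/webalizer (the directory from the example path above) and
assumes each conffile already names its own log file and output directory, so
only webalizer's -c option is passed. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def commands_for(conf_paths):
    """Build one webalizer command per configuration file."""
    # -c tells webalizer which configuration file to use.
    return [["webalizer", "-c", str(path)] for path in conf_paths]


def run_all(conf_dir="/var/webalizer"):
    """Run webalizer once for every *.conf file found in conf_dir."""
    conf_paths = sorted(Path(conf_dir).glob("*.conf"))
    for command in commands_for(conf_paths):
        # check=False: a failure for one group should not stop the others.
        subprocess.run(command, check=False)
```

A single cron entry calling run_all() would then cover every project, which
matches the "only one cron entry" requirement stated elsewhere in this item.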
-------------------------------------------------------
Date: Tue 05/18/04 at 16:40 By: David Jobet <djobet>
OK, I've read the file quickly. The purpose seems to update web pages from cvs,
is it ?
Is there some documentation somewhere detailing the installation process
between savane and gna?
I mean, I've installed savane, but what else do I need to install? (It's
hard to guess the next step when you're in the dark.)
-------------------------------------------------------
Date: Tue 05/18/04 at 16:20 By: Mathieu Roy <yeupou>
Hum, sorry, I hadn't had time to follow the issue.
The script is not part of savane (as it does not interact at all with the
savane software) but of the gna scripts. I'm attaching it to this item.
-------------------------------------------------------
Date: Tue 05/18/04 at 15:53 By: David Jobet <djobet>
I've installed and run savane (at least partially) on my home system, and I've
posted to savane-dev but got no answer.
Perhaps it was not the right place.
However, where can I find the homepage-update.pl script you're talking about?
-------------------------------------------------------
Date: Sat 05/08/04 at 20:00 By: Mathieu Roy <yeupou>
No, it is not complex to set up webalizer. However, we need to make apache
save logs for each area in separate files. That's not a big task either;
it's doable. But it still needs to be done.
You are right, what would be needed is a minimal script that creates a
webalizer conf (hum, it may not even be necessary: just one conffile + command
line args may be enough).
The robots.txt blocking access to search engines is a good idea, indeed.
However, a script handling webalizer conf creation should not touch cron. There
should be only one cron entry. Otherwise it would not be scalable.
--
As I understand your script, it would be called by the script that creates the
homepage area at http://home.gna.org
--
The preferred language is Perl, as all the other scripts are in Perl.
The scripts must use "use strict;" and should use "use Getopt::Long;" for
command line arguments.
--
The best is to have a script that takes as arguments on the command line:
--conffile= path to the configuration file, like /var/webalizer/group.conf
--logfile= path to the apache log, like /var/log/apache/bygroup/group.log
--outputdir= path to the output directory for webalizer,
             like /var/www/group/webalizer/
--title= title for the webalizer results
It would be easy then to adapt the script homepage-update.pl to call that
script.
No group info other than the system name is available. The home.gna.org
system is in a chroot that has no access to the database or anything else.
--
Another script (a trivial one) like homepage-update.pl should be written to run
webalizer on each conffile, and that script will be run via cron. In fact,
homepage-update.pl would only need very trivial changes for that.
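
To make the proposed interface concrete, here is a minimal sketch of the first
script. It uses Python's argparse as a stand-in for the Getopt::Long Perl
version requested above, so it is an illustration only. The four options are
the ones listed above; the function names are hypothetical. The webalizer
options used are -c (configuration file), -o (output directory) and -t (report
title), with the log file as the trailing argument.

```python
import argparse


def parse_args(argv=None):
    """Parse the four options proposed in this item."""
    parser = argparse.ArgumentParser(
        description="Run webalizer for one group."
    )
    parser.add_argument("--conffile", required=True,
                        help="path to the webalizer configuration file")
    parser.add_argument("--logfile", required=True,
                        help="path to the apache log for the group")
    parser.add_argument("--outputdir", required=True,
                        help="output directory for the webalizer report")
    parser.add_argument("--title", required=True,
                        help="title for the webalizer results")
    return parser.parse_args(argv)


def webalizer_command(args):
    """Translate the parsed options into a webalizer command line."""
    return [
        "webalizer",
        "-c", args.conffile,
        "-o", args.outputdir,
        "-t", args.title,
        args.logfile,
    ]
```

For example, calling parse_args() with --conffile=/var/webalizer/group.conf,
--logfile=/var/log/apache/bygroup/group.log,
--outputdir=/var/www/group/webalizer/ and --title=group yields a command that
can be handed to the caller, e.g. an adapted homepage-update.pl equivalent.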
--
So if you can write the first script, it would be easy for us to include it in
our bunch of scripts.
If you want, we can give you write access to the cvs repository, which is not
public (but all the code is GPL).
-------------------------------------------------------
Date: Fri 05/07/04 at 10:36 By: David Jobet <djobet>
From my limited knowledge, setting up webalizer is not too complicated: we
need to set up a webalizer.conf file, plus set up a cron job to launch
webalizer once a day.
I guess we have to provide a tool (in which form? perl? bash? other?) that
creates the conf file from the project info (in which form can we retrieve the
project info?).
One bothersome task with webalizer is regularly checking the logs to add
IgnoreReferrer entries for xxx sites (they use webalizer as a way to increase
their google ranking).
I guess we should add a robots.txt file forbidding search engines from
indexing the webalizer page...
If you can tell me what kind of tool I can use and how I can get basic
information on the project (such as the name, the path on the servers, ...), I
can create a script that:
- creates the webalizer.conf
- adds an entry in the cron
I think we will need to decide whether we want a referrer entry in the
webalizer page (I think that's a cool feature because I can see who is talking
about or using my project) and, in that case, how we can fight against porn
sites.
See my last webalizer entry: http://www.nosica.net/webalizer/usage_200405.html
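
The conf-generation step proposed above could be sketched like this, in Python
for illustration only. LogFile, OutputDir, ReportTitle and IgnoreReferrer are
webalizer configuration keywords; the paths follow the examples given in the
later replies to this item, and the function names, the report title wording
and the robots.txt paths are assumptions.

```python
def render_conf(group, ignored_referrers=()):
    """Return the text of a per-group webalizer configuration file."""
    lines = [
        # Paths follow the examples given in this item; adjust as needed.
        f"LogFile /var/log/apache/bygroup/{group}.log",
        f"OutputDir /var/www/{group}/webalizer/",
        f"ReportTitle Usage statistics for {group}",
    ]
    # One IgnoreReferrer line per referrer pattern to drop from the reports.
    lines += [f"IgnoreReferrer {ref}" for ref in ignored_referrers]
    return "\n".join(lines) + "\n"


def robots_txt(paths):
    """Return a robots.txt asking all crawlers to skip the given URL paths."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"
```

Keeping the referrer list as a plain argument means the list of spamming
sites can be maintained in one place and applied to every group's conf.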
CC List
-------
CC Address | Comment
------------------------------------+-----------------------------
nicoo | any roadmap at this point?
File Attachments
-------------------
-------------------------------------------------------
Date: Tue 06/01/04 at 12:14 Name: hometraffic Size: 4.75KB By: zerodeux
http://gna.org/task/download.php?item_id=119&item_file_id=16
-------------------------------------------------------
Date: Tue 05/18/04 at 16:20 Name: homepage-update.pl Size: 2.13KB By: yeupou
homepage update
http://gna.org/task/download.php?item_id=119&item_file_id=13