Dear Greg,

Thank you so much for your work. Both you and DM had offered to help on this. As DM has a ton of other tasks, I'm sure he would appreciate it if you wanted to own this. Here is the history up to now.
Originally, I believe Joachim, Chris, Martin, and DM had a hand in creating, improving, debugging, etc., a Perl script to do module statistics. I think they worked out a good way to minimize skewed numbers from multiple retries, multiple files per module, etc. I've moved their script to ~sword/bin/ on the server and placed it under version control.

If you'd like to own this task moving forward, you are more than welcome to, and I think I can say this for all those involved in the process in the past (though they can speak up if they still have a heartfelt attachment to the task). However, so as not to neglect gleaning from their past work, I would like to ask you to take a look at their script and see how they decided to compute the numbers.

This script is run from a daily cron job to produce the top20.html file on sword's front page. The arguments for the run are:

/home/sword/bin/makeDownloadsStats.pl /home/sword/html/top20.html 20 30

If your new Python script could take the same params and generate a similar file, it would make it easy for me to substitute it into the cron job.

If you don't feel this is something you'd like to own, maybe DM is still willing to look into updating the current Perl script.

Thanks everyone for your recent work and work from the past on this. Automation is our friend: it captures nebulous knowledge floating around and places it into a solid description, and keeps humans out of the role of 'bottleneck'. :)

-Troy.

Greg Hellings wrote:
> Troy,
>
> I've written up a log processor for the download statistics. It's the
> executable .py file in my user directory on the server. Below is an
> example run of it:
>
> [EMAIL PROTECTED] ~]$ ./process_log.py ESV <path-to-log snipped>
> Total downloads: 362
> Unique downloads: 210
>
> It will accept as many files on the command line as you desire and
> report their statistics in aggregate. Such is most useful for
> maintaining information about the IP-address across the multiple
> files.
> It also works for the FTP files, but for those, relying on the
> total downloads is misleading, since it reports individual downloads
> of both new AND old testament .bz* files. Thus, each individual
> download of the module should crop up as about 6 files in the "total
> downloads" section. Unique downloads are based solely on IP address.
> As an example of the discrepancy of the counting:
>
> [EMAIL PROTECTED] ~]$ ./process_log.py ESV <path-to-log snipped>
> Total downloads: 540
> Unique downloads: 84
>
> Examples for comparison:
> [EMAIL PROTECTED] ~]$ ./process_log.py KJV <ftp log>
> Total downloads: 2098
> Unique downloads: 163
> [EMAIL PROTECTED] ~]$ ./process_log.py KJV <http log>
> Total downloads: 342
> Unique downloads: 198
>
> Those stats are based off of the currently in-use log files. If you
> would like a version of the script that will also report all module
> download totals, that can be provided for little extra work.
>
> --Greg
>
>
> On Tue, Aug 19, 2008 at 4:14 PM, Greg Hellings <[EMAIL PROTECTED]> wrote:
>> Troy,
>>
>> On Tue, Aug 19, 2008 at 4:04 PM, Troy A. Griffitts <[EMAIL PROTECTED]> wrote:
>>> Hey guys. We have a few needs which need addressing:
>>>
>>> Log files got a new naming convention recently. Instead of:
>>>
>>> ffff
>>> ffff.1
>>> ffff.2
>>> ...
>>>
>>> It has become
>>>
>>> ffff
>>> ffff-20080819
>>> ffff-20080818
>>> ...
>>>
>>> Hence our perl scripts that generate module statistics are not working,
>>> seen on the left panel here:
>> I don't know thing 1 on Perl, so editing that is out for me. A
>> rewrite is possible into Python if no one with Perl knowledge shows
>> up.
>>
>>> http://crosswire.org/sword
>>>
>>> Also, Crossway asks for periodic download statistics for their ESV
>>> module. I generated the last report for them by hand, but I would love
>>> for someone to write a script that would run on the first of each month
>>> and email them statistics for the previous month.
>> What format is the file in (I'm guessing it's an Apache file access
>> log)? A simple Python script should be more than sufficient for this
>> purpose. I can probably whip one up in little time. Also, what
>> statistics are you in need of -- just a download count or do you also
>> want to have information on the unique IP address downloads, etc. A
>> sample of one line of the file (or multiple lines, if a file access is
>> spread across several lines) which pertains to the ESV should be
>> sufficient to base the work off of -- more would be appropriate if
>> there are multiple formats the line appears in. Also, odds are good
>> that the same script can be used to generate the statistics for any
>> individual module.
>>
>> --Greg
>>
>>> Any takers?
>>>
>>> -Troy.
>>>
>>>
>>> _______________________________________________
>>> sword-devel mailing list: sword-devel@crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>> Instructions to unsubscribe/change your settings at above page
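For anyone following the thread, the "total" versus "unique" counting described above could be sketched roughly as below. This is not Greg's process_log.py, which is not shown here. It is a minimal illustration under several assumptions: the logs are Apache-style access logs with the client IP as the first field (only guessed at in the thread), a download of a module is any request whose path contains the module name, and the names `count_downloads`, `LINE_RE`, and the sample paths are all hypothetical.

```python
import re

# ASSUMPTION: Apache-style access-log lines, client IP first and the request
# in double quotes. The real log format is not shown in this thread.
LINE_RE = re.compile(r'^(?P<ip>\S+) .*?"(?:GET|POST) (?P<path>\S+)')


def count_downloads(module, lines):
    """Return (total, unique) download counts for `module`.

    total  -- every request line whose path mentions the module
    unique -- number of distinct client IP addresses among those lines
    """
    total = 0
    seen_ips = set()
    needle = module.lower()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not look like access-log entries
        # ASSUMPTION: a plain substring test identifies the module. It would
        # also match any other module whose name contains this one.
        if needle not in match.group("path").lower():
            continue
        total += 1
        seen_ips.add(match.group("ip"))
    return total, len(seen_ips)


# Hypothetical sample lines; the paths and addresses are invented.
SAMPLE = [
    '10.0.0.1 - - [19/Aug/2008:16:04:00 -0700] "GET /modules/ESV.zip HTTP/1.1" 200 100',
    '10.0.0.1 - - [19/Aug/2008:16:05:00 -0700] "GET /modules/ESV.zip HTTP/1.1" 200 100',
    '10.0.0.2 - - [19/Aug/2008:16:06:00 -0700] "GET /modules/ESV.zip HTTP/1.1" 200 100',
    '10.0.0.3 - - [19/Aug/2008:16:07:00 -0700] "GET /modules/KJV.zip HTTP/1.1" 200 100',
    'not a log line',
]

total, unique = count_downloads("ESV", SAMPLE)
print("Total downloads:", total)
print("Unique downloads:", unique)
```

Because the set of seen IP addresses lives inside a single call, feeding the lines of several log files through one call keeps the unique count consistent across files, which is the aggregate behaviour described above. A retry from the same address raises the total but not the unique count.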