On Wed, Jun 08, 2011 at 05:19:33PM +0200, "Paweł Hajdan, Jr." wrote:
> On 6/8/11 4:36 PM, Vikraman wrote:
> > I'm working on the `Package statistics` project this year. Till now, I
> > have managed to write a client and server[0] to collect the following
> > information from hosts:
>
> Excellent, good luck with the idea! I think that better information
> about how Gentoo is actually used will greatly help improving it.
>

Well, that information cannot be collected automatically, can it?

> > Is there a need to collect files installed by a package? Doesn't PFL[1]
> > already provide that?
>
> Well, PFL is not an official Gentoo project. It might be useful, but I
> wouldn't say it's a priority.
>
> > Please provide some feedback on what other data should be collected, etc.
>
> In my opinion it's *not* about collecting as much data as possible. I
> think it's most important to get the core functionality working really
> well, and convincing as large percentage of users as possible to enable
> reporting the statistics (to make the results - hopefully - accurately
> represent the user base). Please note that in some cases it may mean
> collecting _less_ data, or thinking more about the privacy of the users.
>
> For me, as a developer, even a list of packages sorted by popularity
> (aka Debian/Ubuntu popcon) would be very useful.
>
> Ah, and maybe files in /etc/portage: package.keywords and so on. It
> could be useful to see what people are masking/unmasking, that may be an
> indication of stale stabilizations or brokenness hitting the tree.
> Anyway, I'd call it an enhancement.
>
> > Also, I'm starting work on the webUI, and would like some
> > recommendations for stats pages, such as:
> >
> > * Packages installed sorted by users
>
> Cool!
>
> > * Top arches, keywords, profiles
>
> And percentage of ~arch vs arch users?
>
> > * Most enabled, disabled useflags per package/globally
>
> Also great, especially the per-package variant. It'd be also useful to
> have per-profile data, to better tune the profile defaults.
>
> > [0]
> > http://git.overlays.gentoo.org/gitweb/?p=proj/gentoostats.git;a=commit;h=1b9697a090515d2a373e83b1094d6e08ec405c02
>
> I took a quick look at the code. Some random comments:
>
> - it uses portage Python API a lot. But it's not stable, or at least not
> guaranteed to be stable. Have you considered using helpers like portageq
> (or eventually enhancing those helpers)?
>
> - make the licensing super-clear (a LICENSE file, possibly some header
> in every source file, and so on)
>
> - how about submitting the data over HTTPS and not HTTP to better help
> privacy?

Fair points, thanks!

> - don't leave exception handling as a TODO; it should be a part of your
> design, not an afterthought
>
> - instead of or in addition to the setup.txt file, how about just
> writing the real setup.py file for distutils?
>

Yes, these are part of my sub-goals for next week.

-- 
Vikraman