Re: analyzing popcon data for bogus recommends

2008-05-14 Thread Petter Reinholdtsen
[Joey Hess]
> It would be nice to have a list of which Recommends are
> ignored/overridden the most when installing packages, to identify
> Recommends that need to be downgraded to Suggests. Could we derive
> such a list from popcon data?

I have no idea if that can be done. :)

> I think it would need to be done by analyzing each individual popcon
> data submission, so I can't do it as that data is not published.

The raw popcon data is available to all Debian Developers on
popcon.debian.org.  I am unable to log in to confirm the exact location
at the moment, but it is there somewhere. :)

Putting [EMAIL PROTECTED] on the CC list, as
it is a better place to discuss the use of popcon data.

Happy hacking,
-- 
Petter Reinholdtsen


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: analyzing popcon data for bogus recommends

2008-05-14 Thread Enrico Zini
On Tue, May 13, 2008 at 10:51:37PM -0400, Joey Hess wrote:

> It would be nice to have a list of which Recommends are ignored/overridden
> the most when installing packages, to identify Recommends that need to be
> downgraded to Suggests. Could we derive such a list from popcon data? I
> think it would need to be done by analyzing each individual popcon data
> submission, so I can't do it as that data is not published.

Yes you can.  Also, there's a Xapian database in my home directory
(~enrico/anapop/something IIRC) on people.debian.org that is built from
the popcon data, and you can query that database to quickly get a count
of submissions having package X AND NOT package Y, and of those having
package X AND package Y.

That Xapian index indexes popcon submissions as if they were
documents, and installed packages as if they were terms.
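
To illustrate the documents-and-terms idea, here is a minimal sketch in
plain Python rather than Xapian itself. It assumes each submission has
already been reduced to a set of installed package names; the host ids,
package names and helper functions are hypothetical and are not part of
popcon or of the index described above.

```python
from collections import defaultdict


def build_index(submissions):
    """Map each package (term) to the set of submission ids (documents) listing it."""
    index = defaultdict(set)
    for sub_id, packages in submissions.items():
        for pkg in packages:
            index[pkg].add(sub_id)
    return index


def count_with_and_without(index, x, y):
    """Return (submissions with X AND Y, submissions with X AND NOT Y)."""
    has_x = index.get(x, set())
    has_y = index.get(y, set())
    return len(has_x & has_y), len(has_x - has_y)


# Made-up submissions, for illustration only.
submissions = {
    "host1": {"foo", "foo-doc"},
    "host2": {"foo"},
    "host3": {"foo", "bar"},
    "host4": {"bar"},
}
index = build_index(submissions)
both, x_only = count_with_and_without(index, "foo", "foo-doc")
print(both, x_only)
```

A real Xapian query would express the same two counts as an AND query
and an AND NOT query over package-name terms.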

The database is updated by a weekly cronjob that rescans the whole
popcon database.  I briefly tried in the past[1] to come up with
ways to hook the indexing process into popcon so that I could do
real-time indexing of the data (which would give an up-to-date index and
avoid using 100% CPU on gluck once a week), but I got the impression that
it would require more discussion than I was motivated to have at the
time.  If more people are interested in using that Xapian index, it may
make sense to revisit this.


Ciao,

Enrico

[1] http://lists.alioth.debian.org/pipermail/popcon-developers/2007-June/001374.html
-- 
GPG key: 1024D/797EBFAB 2000-12-05 Enrico Zini [EMAIL PROTECTED]


signature.asc
Description: Digital signature


analyzing popcon data for bogus recommends

2008-05-13 Thread Joey Hess
It would be nice to have a list of which Recommends are ignored/overridden
the most when installing packages, to identify Recommends that need to be
downgraded to Suggests. Could we derive such a list from popcon data? I
think it would need to be done by analyzing each individual popcon data
submission, so I can't do it as that data is not published.

-- 
see shy jo




Re: analyzing popcon data for bogus recommends

2008-05-13 Thread Felipe Sateler
Joey Hess wrote:

> It would be nice to have a list of which Recommends are ignored/overridden
> the most when installing packages, to identify Recommends that need to be
> downgraded to Suggests. Could we derive such a list from popcon data? I
> think it would need to be done by analyzing each individual popcon data
> submission, so I can't do it as that data is not published.

I think you need more than popcon data: AFAIK popcon doesn't say which
packages were installed manually and which were installed automatically.
Maybe package B is installed and only recommended by A, but there is no way
to tell whether package B was needed on its own.

-- 

  Felipe Sateler





Re: analyzing popcon data for bogus recommends

2008-05-13 Thread Daniel Burrows
On Tue, May 13, 2008 at 11:09:20PM -0400, Felipe Sateler [EMAIL PROTECTED] 
was heard to say:
> Joey Hess wrote:
>
> > It would be nice to have a list of which Recommends are ignored/overridden
> > the most when installing packages, to identify Recommends that need to be
> > downgraded to Suggests. Could we derive such a list from popcon data? I
> > think it would need to be done by analyzing each individual popcon data
> > submission, so I can't do it as that data is not published.
>
> I think you need more than popcon data: AFAIK popcon doesn't say which
> packages were installed manually and which were installed automatically.
> Maybe package B is installed and only recommended by A, but there is no way
> to tell whether package B was needed on its own.

  It's true that you probably couldn't use this to find recommendations
that *should* exist, but if a Recommends is being widely ignored /
overridden (i.e., if the number of systems with A installed but not B is
high), then it might be worth re-examining that dependency.
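
The heuristic above could be sketched roughly as follows. This is only
an illustration under stated assumptions: submissions are given as sets
of installed package names, the list of (A, B) Recommends pairs is
supplied by hand, and the function and package names are hypothetical.
A high fraction is merely a hint that the Recommends deserves a second
look, since popcon cannot tell why B is present or absent.

```python
def rank_ignored_recommends(submissions, recommends):
    """Rank (A recommends B) pairs by the fraction of A installs lacking B.

    Returns a list of (fraction_ignored, a, b, installs_of_a) tuples,
    most-ignored first.
    """
    rows = []
    for a, b in recommends:
        with_a = [pkgs for pkgs in submissions if a in pkgs]
        if not with_a:
            continue  # nobody reported A, so there is nothing to measure
        without_b = sum(1 for pkgs in with_a if b not in pkgs)
        rows.append((without_b / len(with_a), a, b, len(with_a)))
    return sorted(rows, reverse=True)


# Made-up data, for illustration only.
submissions = [
    {"editor", "editor-plugins"},
    {"editor"},
    {"editor"},
    {"mailer", "spamfilter"},
    {"mailer", "spamfilter"},
]
recommends = [("editor", "editor-plugins"), ("mailer", "spamfilter")]

ranking = rank_ignored_recommends(submissions, recommends)
for fraction, a, b, installs in ranking:
    print(f"{a} -> {b}: ignored on {fraction:.0%} of {installs} systems")
```

Reporting the number of systems with A next to the fraction helps avoid
over-reading pairs that only a handful of submissions contain.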

  Daniel

