Re: Let's eliminate the Module List
* Simon Cozens [EMAIL PROTECTED] [2004-08-27 13:53]: [EMAIL PROTECTED] (A. Pagaltzis) writes: I object. Browsing is problematic when the amount of data becomes overwhelming, but it is useful as a concept. You're thinking in terms of use, I'm thinking in terms of implementation. Sorry, I realized that after reading further into the rest of the thread. Regards, -- Aristotle If you can't laugh at yourself, you don't take life seriously enough.
Re: Let's eliminate the Module List
* Simon Cozens [EMAIL PROTECTED] [2004-08-27 13:53]: [EMAIL PROTECTED] (A. Pagaltzis) writes: That is essentially correct, but beware of metacrap[1]. [1] http://www.well.com/~doctorow/metacrap.htm Niggly comments are great! I love the way they really motivate me to get this finished! I'm not trying to demotivate you. :-) As that article says, inherent metadata you can harvest from the existing data base in some way without reliance on external effort is very useful. I'd also like to think that a number of the issues Cory brings up won't be too pronounced on CPAN: I'd hope modules don't routinely get uploaded with misspelt names. It's just something to keep in mind. I've had to deal with such concerns in an intensely painful project attempting to infer structure from weakly marked up documents. Sorry if it seemed I was saying that your project is doomed or something; that wasn't my intent at all. Regards, -- Aristotle If you can't laugh at yourself, you don't take life seriously enough.
Re: Let's eliminate the Module List
* Simon Cozens [EMAIL PROTECTED] [2004-08-24 15:37]: Repeat after me: browsing is just searching metadata. That is essentially correct, but beware of metacrap[1]. [1] http://www.well.com/~doctorow/metacrap.htm Regards, -- Aristotle If you can't laugh at yourself, you don't take life seriously enough.
Re: Let's eliminate the Module List
[EMAIL PROTECTED] (A. Pagaltzis) writes: I object. Browsing is problematic when the amount of data becomes overwhelming, but it is useful as a concept. You're thinking in terms of use, I'm thinking in terms of implementation. -- It's usually // either for a good reason // or a bad reason - Larry Wall haiku
Re: Let's eliminate the Module List
[EMAIL PROTECTED] (A. Pagaltzis) writes: * Simon Cozens [EMAIL PROTECTED] [2004-08-24 15:37]: Repeat after me: browsing is just searching metadata. That is essentially correct, but beware of metacrap[1]. [1] http://www.well.com/~doctorow/metacrap.htm Niggly comments are great! I love the way they really motivate me to get this finished! -- In matters of principle, stand like a rock; in matters of taste, swim with the current. -- Thomas Jefferson
Re: Let's eliminate the Module List
On Fri, Aug 27, 2004 at 12:53:15PM +0100, Simon Cozens wrote: [EMAIL PROTECTED] (A. Pagaltzis) writes: I object. Browsing is problematic when the amount of data becomes overwhelming, but it is useful as a concept. You're thinking in terms of use, I'm thinking in terms of implementation. I've done the same thing that Simon is doing here-- implementing browsing and searching with the same code. I still had explicit browse options like browse by date and browse by size. They simply had hard links to a search result set. I think Simon is proposing the same-- the concept of browsing wouldn't have to go away, it could easily be implemented by creating explicit search links using the meta data in the system. Mark -- http://mark.stosberg.com/
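The "browse links are just canned searches" idea described above can be sketched roughly as follows. This is only an illustration in Python: the /search path, the q and sort parameters, and the link labels are all assumptions, not part of any existing CPAN service.

```python
from urllib.parse import urlencode

# Hypothetical search endpoint; not a real search.cpan.org URL.
SEARCH_URL = "/search"

# Each browse entry is nothing more than a label plus a pre-filled search query.
# The query syntax and sort keys are assumptions made for this sketch.
BROWSE_LINKS = {
    "Browse by date": {"q": "type:module", "sort": "date"},
    "Browse by size": {"q": "type:module", "sort": "size"},
    "Email modules": {"q": "category:email", "sort": "name"},
}

def browse_href(label):
    """Return the search URL that a given browse link points at."""
    return SEARCH_URL + "?" + urlencode(BROWSE_LINKS[label])

for label in BROWSE_LINKS:
    print(label, "->", browse_href(label))
```

The browse page then needs no code of its own: adding a new browse option means adding one more entry to the table.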
Re: Let's eliminate the Module List
On 8/24/2004 9:28 AM, Simon Cozens wrote: [EMAIL PROTECTED] (Randy W. Sims) writes: hmm, are you going to generate multiple indexes? It might be interesting if we could search over the various fields provided by META.yml[1]

I am only going to generate one index, but this is because Plucene indexes are better than you think they are. :) Plucene allows me to index documents with multiple fields, like so:

type: module
name: Email::Simple
category: email
author: SIMON
description: ...
synopsis: ...
depends:
rating: ...

In fact, I can just index META.yml plus the documentation pretty much as is and get the data in the right fields. There's a lot more fields you can (and I will add) of course, such as when the module was released, the license, and so on.

Very cool. You're right. I was thinking of the old-fashioned type text indexes. I've been doing a little reading on P/Lucene, and it's very exciting.

Oooh, I think I like it I think I like what Im feelin Even though its such a surprise But you know Oooh, I think I really like it I think I like what I feel And changes really open your eyes - Boston, lyrics to I think I like it

When you have such an index, to find all the good email handling modules, search for category:email rating:4-5. Or email author:SIMON, it comes to the same thing. :)

Is that a bug? ;-)

Browsing is just searching metadata.
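To make the fielded-index idea above concrete, here is a toy sketch in Python. It is not Plucene and does not use Plucene's query parser: the matching rules are a simplified stand-in, and every record apart from the field names quoted above (including the ratings) is made up for illustration.

```python
# Toy fielded index: each document is a dict of field -> value.
# Field names follow the example above; module names marked Hypothetical
# and all ratings are invented for this sketch.
DOCS = [
    {"type": "module", "name": "Email::Simple", "category": "email",
     "author": "SIMON", "rating": 5},
    {"type": "module", "name": "Hypothetical::Mailer", "category": "email",
     "author": "NOBODY", "rating": 2},
    {"type": "module", "name": "Hypothetical::Graph", "category": "graphics",
     "author": "SIMON", "rating": 4},
]

def matches(doc, term):
    """Match one query term: 'word', 'field:value' or 'field:lo-hi'."""
    if ":" not in term:
        # Bare word: look for it in any field value.
        return any(term.lower() in str(v).lower() for v in doc.values())
    field, value = term.split(":", 1)
    if field not in doc:
        return False
    actual = doc[field]
    if isinstance(actual, int):
        # Numeric field: accept a single number or an inclusive range.
        lo, _, hi = value.partition("-")
        return int(lo) <= actual <= int(hi or lo)
    return str(actual).lower() == value.lower()

def search(query):
    """Return names of documents matching every term (implicit AND)."""
    terms = query.split()
    return [d["name"] for d in DOCS if all(matches(d, t) for t in terms)]

print(search("category:email rating:4-5"))
print(search("email author:SIMON"))
print(search("type:module"))
```

With this sample data the first two queries select the same module, and the last one is the "browse everything" case expressed as a search.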
Re: Let's eliminate the Module List
[EMAIL PROTECTED] (Randy W. Sims) writes: Looks like you and Simon should collaborate. We've been chatting. Is it possible or realistic for it to have pluggable search browse engines. I think so. There are three things at issue, all of which can and should be implemented distinctly: 1) Viewing the contents of packages 2) Browsing by category 3) Searching I have no desire to work on 1), since I think search.cpan.org does this very well. (Andy agrees with this idea.) I have no desire to work on 2), since I think that all browsing is a subset of searching - for instance, browsing by keyword is just doing a search for a particular keyword; browsing all modules is just doing a search for type:module, and so on. (Andy... well, tolerates this idea. ;) 3) is the thing I want to work on, and Andy wants to work on 2), so my plan for the time being is to index all the CPAN module metadata and link search results to the current search.cpan.org pages for displaying a given module. Then Andy can come along and turn canned searches into a browse interface on top of that. I still think sourceforge-like hierarchical categories (Topics) in META.yml would make for good light-weight search and improved by-category browsing I disagree quite violently with this, but I'm not going to implement searching and indexing in a way that precludes it. I think that the world moved from browse to search some time in the mid 90s (hell, we're even being encouraged to search rather than browse email these days) and that this is because browsing is useful if your search engine isn't good enough. Even so, I recognise that everyone who comes to working on CPAN metadata has their own conceptual axe to grind, so I'll just index whatever the heck is in META.yml and let everyone else sort out the details. -- Using Outlook is like running barefoot through a hot ward snogging all the patients - Peter da Silva
Re: [unclassified] Re: Let's eliminate the Module List
Graham Barr [EMAIL PROTECTED] writes: Right. And a grand job he has been doing. I second this. For a while I've tried to do something sensible with the registry requests, but failed due to lack of response in times where feedback was needed (i.e., with ambiguous or dubious requests). So _I_ decided to leave the module registration for what it is and only handle user ID requests, most of which brian handles as well, he's just quicker than me :-). -- Johan
Re: Let's eliminate the Module List
On Tue, 24 Aug 2004, Simon Cozens wrote: I still think sourceforge-like hierarchical categories (Topics) in META.yml would make for good light-weight search and improved by-category browsing I disagree quite violently with this, but I'm not going to implement searching and indexing in a way that precludes it. I think that the world moved from browse to search some time in the mid 90s (hell, we're even being encouraged to search rather than browse email these days) and that this is because browsing is useful if your search engine isn't good enough. Browsing and searching each have their place. It is conceivable that a powerful enough search could emulate browsing. Consider how much discussion has been generated by talk of removing a browse-oriented document like the module list. Some people and activities are more fond of browsing than searching. It may only be 10% of the cases where browsing works better than searching, but if I want to answer the question - what are all of the perl web applications it would take lots of searches and result munging to find out what a painless browse could produce. A hierarchical system that people can add into META.yml supplemented by an effort to fill in the gaps left by maintainers not motivated to fix their META.yml would be a wonderful thing. -- /chris There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. -- C.A.R. Hoare
Re: Let's eliminate the Module List
On Tue, 24 Aug 2004, Simon Cozens wrote: Repeat after me: browsing is just searching metadata. For our current purposes I'm willing to go along with that. Once the metadata exists people can do whatever they want with it. I strongly suspect that one of those things will be making something that is vaguely yahooish. This brings to mind an interesting question - shouldn't there be some central file of meta data that's automatically generated? Maybe in Storable and XML? That way people that want to experiment don't have to have a full CPAN mirror or dig data out of all of the tar files. -- /chris There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. -- C.A.R. Hoare
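A central, automatically generated metadata file along the lines suggested above might be built like this. The sketch is in Python and writes JSON rather than the Storable or XML formats mentioned in the message, purely to stay within the standard library. The sample metadata, including version numbers and abstracts, is invented, and the parser handles only flat "key: value" lines, which is an assumption and not a full YAML reader.

```python
import json

# Made-up META.yml-style text for two distributions. Real META.yml files
# carry more fields and nested structures that need a proper YAML parser.
META_FILES = {
    "Email-Simple": "name: Email-Simple\nversion: 1.0\nabstract: simple email parsing\n",
    "Hypothetical-Dist": "name: Hypothetical-Dist\nversion: 0.01\nabstract: placeholder\n",
}

def parse_flat_meta(text):
    """Parse flat 'key: value' lines only; indented or comment lines are skipped."""
    meta = {}
    for line in text.splitlines():
        if ":" in line and not line.startswith((" ", "#")):
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return meta

def build_central_index(meta_files):
    """Merge every distribution's metadata into one dict keyed by distribution."""
    return {dist: parse_flat_meta(text) for dist, text in meta_files.items()}

central = build_central_index(META_FILES)
# One file like this could be published so experimenters need no full mirror.
print(json.dumps(central, indent=2, sort_keys=True))
```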
Re: Let's eliminate the Module List
* Fergal Daly [EMAIL PROTECTED] [2004-08-20 11:39]: So why not auto generate another list, giving keywords and descriptions of _every_ module? Particularly as that's now almost trivial to write. I've done that before -- http://www.perlmonks.org/index.pl?node_id=281203 f.ex, which is far from the only piece of such code I've written (most other stuff is unpublished though). If no one else has the time, I could probably cook something up in an evening, assuming I have specs on the repository structure (filesystem layout, what kinds of files I need to look at, etc). I guess I'd know that already if I'd ever set up a personal CPAN mirror, but I haven't. Regards, -- Aristotle If you can't laugh at yourself, you don't take life seriously enough.
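As a rough idea of what auto-generating a description and keyword list could involve, the Python sketch below pulls the one-line description from the customary "=head1 NAME" section of a module's POD and derives naive keywords from it. The sample POD, the module name, and the stopword list are all hypothetical; this is not the code from the PerlMonks node linked above.

```python
import re

# Made-up POD for a hypothetical module, following the usual
# "=head1 NAME" / "Module::Name - description" convention.
SAMPLE_POD = """\
=head1 NAME

Hypothetical::Widget - frobnicate widgets quickly

=head1 SYNOPSIS

  use Hypothetical::Widget;
"""

# Module name, a dash separator, then the rest of the line as the description.
NAME_RE = re.compile(
    r"^=head1[ \t]+NAME[ \t]*\n+(\S+)[ \t]+-+[ \t]+(.+)$", re.MULTILINE
)

def name_and_description(pod):
    """Return (module, description) from the NAME section, or None if absent."""
    m = NAME_RE.search(pod)
    return (m.group(1), m.group(2).strip()) if m else None

def keywords(description, stopwords=("a", "an", "the", "of", "and", "for")):
    """Naive keyword guess: lowercase words from the description minus stopwords."""
    words = re.findall(r"[a-z]+", description.lower())
    return [w for w in words if w not in stopwords]

module, description = name_and_description(SAMPLE_POD)
print(module, "|", description, "|", keywords(description))
```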
Re: Let's eliminate the Module List
On Mon, Aug 23, 2004 at 04:11:33PM -0400, Robert Rothenberg wrote: It would be a lot of work to implement a workflow system (I wish I had the time), but once it's implemented, the approval work could be Your honesty with I wish I had the time illustrates the problem here. [and the following isn't personal, but for the list as a whole:] Talk is cheap. Sadly none of this will get done unless someone with sufficient desire to do this creates themselves the time and does it. There is nothing stopping anyone on this list prototyping their own improved substitute for search.cpan.org. (although it helps if you have a public facing webserver if you want to show it to others). Yet no-one does. Until someone does, nothing will change. No-one on this list is preventing anyone from trying this. Nicholas Clark
Re: Let's eliminate the Module List
[EMAIL PROTECTED] (Nicholas Clark) writes: Until someone does, nothing will change. No-one on this list is preventing anyone from trying this. I'm working on it. The only thing that sucks about search.cpan.org is the search engine, which is a shame since that's the major part of it. Thankfully, I have this really handy Perl search engine toolkit up my sleeve... -- And it should be the law: If you use the word `paradigm' without knowing what the dictionary says it means, you go to jail. No exceptions. -- David Jones
Re: Let's eliminate the Module List
On Mon, Aug 23, 2004 at 10:43:38PM +0100, Nicholas Clark wrote: There is nothing stopping anyone on this list prototyping their own improved substitute for search.cpan.org. (although it helps if you have a public facing webserver if you want to show it to others). Yet no-one does. Randy Kobes did: http://kobesearch.cpan.org/ But apparently it's not sufficiently better or sufficiently well known to come up in future of CPAN conversations much. At least the code for it is easily available: http://cpan-search.sourceforge.net/ It uses mod_perl and Template Toolkit. Mark -- Mark Stosberg, Principal Developer [EMAIL PROTECTED] Summersault, LLC 765-939-9301 ext 202 database driven websites http://www.summersault.com/
Re: Let's eliminate the Module List
Andy Lester wrote: On Mon, Aug 23, 2004 at 10:43:38PM +0100, Nicholas Clark ([EMAIL PROTECTED]) wrote: There is nothing stopping anyone on this list prototyping their own improved substitute for search.cpan.org. (although it helps if you have a public facing webserver if you want to show it to others). Yet no-one does. I'm working on it. I've already pulled the minicpan (a la Randal's mini mirror) and I'm working on the Template Toolkit-fu to make reasonable pages. If anyone's interested in helping out, let me know. Looks like you and Simon should collaborate. Is it possible or realistic for it to have pluggable search browse engines. I still think sourceforge-like hierarchical categories (Topics) in META.yml would make for good light-weight search and improved by-category browsing (modules can list multiple categories). There may be other useful info in META.yml like OS Platform and requirements, etc that could be used in advanced searches. Also, some info might be pulled from cpanratings. Web development is not my area, but I've been trying to remedy that. I've been trying to set up a local cpanratings to play with and hopefully do some work on, but it's going slow right now. When I do get up to speed, I'd be willing to do some work... Randy.
Re: Let's eliminate the Module List
On Thu, 2004-08-19 at 13:54, Simon Cozens wrote: [EMAIL PROTECTED] (Jose Alves de Castro) writes: I don't want to show the results of a search. I want to say Here is the link to the module list. See how long it is? It contains practically everything you need, doesn't it? http://www.cpan.org/modules/02packages.details.txt.gz Hmmm that is pretty complete. Good point. I kind of like the idea of a standard KEYWORDS section for PODs... Gives the author ability to help a potential user find the module. Sort of like man -k. Adding (say up to 10?) keywords after the tarball name would make _this_ pretty useful... It would be a useful enhancement to search.cpan.org too.
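The keyword suggestion above could look something like the following Python sketch. The sample text only imitates the general shape of 02packages.details.txt (a header block, a blank line, then package, version and path columns). The trailing comma-separated keyword column is the hypothetical extension being proposed and does not exist in the real file, and all package names and paths are made up.

```python
# Made-up sample in the general shape of 02packages.details.txt.
# The fourth column (keywords) is the hypothetical extension discussed above.
SAMPLE = """\
File:         02packages.details.txt
Description:  sample data for illustration only

Hypothetical::Mail    1.02  A/AU/AUTHOR/Hypothetical-Mail-1.02.tar.gz  email,smtp
Hypothetical::Graph   0.10  A/AU/AUTHOR/Hypothetical-Graph-0.10.tar.gz  plot,chart
Hypothetical::Plain   undef A/AU/AUTHOR/Hypothetical-Plain-0.01.tar.gz
"""

def parse_packages(text):
    """Skip the header block, then split each row into its columns."""
    _header, _, body = text.partition("\n\n")
    rows = []
    for line in body.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        package, version, path = parts[:3]
        kw = parts[3].split(",") if len(parts) > 3 else []
        rows.append({"package": package, "version": version,
                     "path": path, "keywords": kw})
    return rows

def apropos(rows, keyword):
    """Rough analogue of `man -k`: packages whose keywords include the word."""
    return [r["package"] for r in rows if keyword in r["keywords"]]

rows = parse_packages(SAMPLE)
print(apropos(rows, "email"))
```

Rows without a keyword column still parse, so such a file could be extended gradually as authors add keywords.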
Re: Let's eliminate the Module List
Christopher Hicks wrote: On Thu, 19 Aug 2004, Hugh S. Myers wrote: 2. Push hard on the notion of adding a keywords item to the 'standard' for pod documentation. What should those keywords be? Who decides? I'm personally much more interested in seeing a dmoz-ish hierarchy so related modules can be easily found and compared. I agree[1]. A static list of categories like DMOZ or SourceForge uses would provide for improved, more consistent searches and an improved by-category browsing catalog. Categories could be added to META.yml for easy indexing by search.cpan.org. Fixed categories make it more likely that similar modules will be found together both when browsing by-category and when searching. And you eliminate the abuses of keywords. 1. http://www.nntp.perl.org/group/perl.module-authors/2601/
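A fixed category list of the kind argued for above might be checked and browsed as in this Python sketch. The category names, the slash-separated hierarchy, and the module names are all hypothetical; nothing here comes from an actual META.yml specification.

```python
# Hypothetical fixed category list, with "/" separating hierarchy levels.
ALLOWED = {
    "Internet/Email",
    "Internet/WWW",
    "Text/Parsing",
}

# Hypothetical modules; a module may list several categories.
MODULES = {
    "Hypothetical::Mailer": ["Internet/Email"],
    "Hypothetical::MailParser": ["Internet/Email", "Text/Parsing"],
    "Hypothetical::Crawler": ["Internet/WWW", "Spiders"],  # "Spiders" is not allowed
}

def invalid_categories(modules):
    """Report categories that are not in the fixed list, per module."""
    report = {}
    for name, cats in modules.items():
        unknown = sorted(set(cats) - ALLOWED)
        if unknown:
            report[name] = unknown
    return report

def browse(modules, prefix):
    """List modules filed under a category or any of its subcategories."""
    return sorted(
        name for name, cats in modules.items()
        if any(c == prefix or c.startswith(prefix + "/") for c in cats)
    )

print(invalid_categories(MODULES))
print(browse(MODULES, "Internet"))
print(browse(MODULES, "Text"))
```

Rejecting unknown categories at index time is what keeps similar modules together, which is the advantage claimed over free-form keywords.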
Re: Let's eliminate the Module List
On Thu, 2004-08-19 at 18:54, Simon Cozens wrote: [EMAIL PROTECTED] (Jose Alves de Castro) writes: I don't want to show the results of a search. I want to say Here is the link to the module list. See how long it is? It contains practically everything you need, doesn't it? http://www.cpan.org/modules/02packages.details.txt.gz It seems like I'm the only one, but I still prefer the other list... :-( It has the module descriptions and all... :-( -- José Alves de Castro [EMAIL PROTECTED] http://natura.di.uminho.pt/~jac
Re: Let's eliminate the Module List
On Wed, Aug 18, 2004 at 04:57:34PM -0500, Mark Stosberg wrote: On Wed, Aug 18, 2004 at 04:54:32PM -0500, Andy Lester wrote: I propose eliminating the Long Module List. I'm talking about http://www.cpan.org/modules/00modlist.long.html (2998 modules), not http://www.cpan.org/modules/01modules.index.html (6800 modules). As a long time CPAN module author and user, I second this proposal. Thirded, and in fact I proposed this on [EMAIL PROTECTED] last week but hadn't followed up on it yet. But I had intended to crusade for it, so count me in. K. -- Kirrily 'Skud' Robert - [EMAIL PROTECTED] - http://infotrope.net/ Heavily armed, easily bored, and off my medication.
Re: Let's eliminate the Module List
On Thu, Aug 19, 2004 at 05:24:57PM +0100, Jose Alves de Castro wrote: On Thu, 2004-08-19 at 16:47, Christopher Hicks wrote: On Thu, 19 Aug 2004, Hugh S. Myers wrote: It seems to me that ANY thing that contributes to the solution set of 'How do I find the module I'm looking for?' needs to be kept until it can be replaced with something of equal or greater value. search.cpan.org seems to be of greater value than the modules list according to most of the people that have chimed in. Try asking beginners what they think. I believe it is easier for them to look at a long list of modules than searching for a specific one, particularly because they often don't know what they should be looking for. The problem is that the list is missing many modules and in some cases it is missing the right module for a particular job while listing other inferior modules and since no one is adding to the list, this can only get worse. Anyway, I like to have a long list of modules to show my Java friends and say see? If we had keywords you could just search on a keyword and show them that list instead, F
Re: Let's eliminate the Module List
On Thu, 2004-08-19 at 17:35, Fergal Daly wrote: On Thu, Aug 19, 2004 at 05:24:57PM +0100, Jose Alves de Castro wrote: On Thu, 2004-08-19 at 16:47, Christopher Hicks wrote: On Thu, 19 Aug 2004, Hugh S. Myers wrote: It seems to me that ANY thing that contributes to the solution set of 'How do I find the module I'm looking for?' needs to be kept until it can be replaced with something of equal or greater value. search.cpan.org seems to be of greater value than the modules list according to most of the people that have chimed in. Try asking beginners what they think. I believe it is easier for them to look at a long list of modules than searching for a specific one, particularly because they often don't know what they should be looking for. The problem is that the list is missing many modules and in some cases it is missing the right module for a particular job while listing other inferior modules and since no one is adding to the list, this can only get worse. I know that, but what I'm saying is Let's keep the list updated! I had already volunteered to brian to do that, and at the same time this whole thing of killing the list has exploded... I agree with you all, I know the list is probably doing more harm than good, but it wasn't like that years ago, and the only reason it is like that now is that the list isn't being updated! If someone keeps it up to date, I think it'll be a good thing for all of us once again. Anyway, I like to have a long list of modules to show my Java friends and say see? If we had keywords you could just search on a keyword and show them that list instead, I don't want to show the results of a search. I want to say Here is the link to the module list. See how long it is? It contains practically everything you need, doesn't it? And I also want to be able to look at the list and think of what other things are still lacking... -- José Alves de Castro [EMAIL PROTECTED] http://natura.di.uminho.pt/~jac
RE: Let's eliminate the Module List
I agree with you all, I know the list is probably doing more harm than good, but it wasn't like that years ago, and the only reason it is like that now is that the list isn't being updated! If someone keeps it up to date, I think it'll be a good thing for all of us once again. If someone keeps it up to date they won't be doing much else is the impression I get. Some things that work well for small communities just don't work well in large ones. It seems to me that the module list is one of them. I personally would like to see it go. Yves
Re: Let's eliminate the Module List
[EMAIL PROTECTED] (Jose Alves de Castro) writes: I don't want to show the results of a search. I want to say Here is the link to the module list. See how long it is? It contains practically everything you need, doesn't it? http://www.cpan.org/modules/02packages.details.txt.gz -- yes /dev/kmem # Shutdown is broken. This'll have to do - plan9 has a bad day
Let's eliminate the Module List
I propose eliminating the Long Module List. I'm talking about http://www.cpan.org/modules/00modlist.long.html (2998 modules), not http://www.cpan.org/modules/01modules.index.html (6800 modules).

=over 4

=item * It's no longer relevant.

Way back when, it was cool to have a single readable source of information. With search.cpan.org, it's just not necessary any more. The list gives two aims:

* FOR DEVELOPERS: To change duplication of effort into cooperation.
* FOR USERS: To quickly locate existing software which can be reused.

Both are addressed, and very effectively, by search.cpan.org.

=item * Few people look at it.

Looking at pair.com's mirror logs, I see that since Jan 2003, downloads of 00mod* have averaged fewer than five per month. Per month, not per day. Pair is not a lightly-used mirror, either. They served up 615K distros for July 2004. Five out of 615,000 is close enough to zero for me.

=item * Inclusion on the list is effectively arbitrary.

It doesn't mean anything to have a module on that list. It's certainly not a stamp of quality. I don't mean to ignite the debate over whether some Perl Approved CPAN module apparatus should exist; only that inclusion on the Module List is not it.

=item * The resources used could be better used elsewhere.

There's a significant amount of human time and machine resources that go into maintaining the Long Module List. For that matter, it's a waste of developer time proposing inclusion on a list that nobody looks at.

=item * search.cpan.org browsing is misleading

Browsing search.cpan.org gives the user the impression that he or she is browsing all modules on the CPAN. This is not the case. The 26 categories don't make sense any more, anyway.

=back

The one bit of value that I see in this process is where Graham looks at submissions that people have sent in and, if something seems like it's duplicate effort, tries to redirect the author to reduce the duplication. (http://www.nntp.perl.org/group/perl.modules/34207) Unfortunately, that requires the author to submit a proposal for inclusion, and since fewer than half of the authors submit the modules, it's hardly a complete filter.

I welcome your thoughts. How can we capture the good part of the module list (the human filtering), and remove the obsoleted infrastructure?

xoxo, Andy -- Andy Lester = [EMAIL PROTECTED] = www.petdance.com = AIM:petdance
Re: Let's eliminate the Module List
On Wed, Aug 18, 2004 at 04:54:32PM -0500, Andy Lester wrote: I propose eliminating the Long Module List. I'm talking about http://www.cpan.org/modules/00modlist.long.html (2998 modules), not http://www.cpan.org/modules/01modules.index.html (6800 modules). As a long time CPAN module author and user, I second this proposal. Mark
Re: Let's eliminate the Module List
On Wed, 18 Aug 2004, Randy W. Sims wrote: I made a suggestion regarding this before that I thought provided a fair solution http://www.nntp.perl.org/group/perl.module-authors/2615, but no one commented. Basically, upon submission of a new module, a notice would be auto-posted to some list. If no one replies to that posting within some time frame, the module is automatically accepted. If anyone does reply, then it requires moderator approval. The moderator(s) isn't required to do anything other than monitor the discussion and act according to the consensus reached in the discussion. The list members do what is currently done on a voluntary basis on module-authors; that is, they make name suggestions, discuss prior art, etc. That sounds good to me! Are you volunteering to implement it too? :) -- /chris There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies. -- C.A.R. Hoare
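The approval workflow proposed above is small enough to sketch as a status function. This Python version is only an illustration: the 14-day window is an arbitrary placeholder for the unspecified "some time frame", and the class, field and status names are invented here.

```python
from dataclasses import dataclass, field

# Hypothetical value; the proposal only says "some time frame".
REVIEW_WINDOW_DAYS = 14

@dataclass
class Submission:
    module: str
    posted_day: int                       # day the notice was auto-posted
    replies: list = field(default_factory=list)
    moderator_decision: str = ""          # "", "accept" or "reject"

def status(sub, today):
    """Return 'pending', 'accepted', 'rejected' or 'needs-moderator'."""
    if sub.replies:
        # Any reply switches the submission to explicit moderator approval.
        decisions = {"accept": "accepted", "reject": "rejected"}
        return decisions.get(sub.moderator_decision, "needs-moderator")
    if today - sub.posted_day >= REVIEW_WINDOW_DAYS:
        return "accepted"  # silence for the whole window counts as acceptance
    return "pending"

quiet = Submission("Hypothetical::Quiet", posted_day=0)
discussed = Submission("Hypothetical::Clash", posted_day=0,
                       replies=["is this name already taken?"])
print(status(quiet, 3), status(quiet, 14), status(discussed, 30))
```

The moderator only ever acts on submissions that drew a reply, which matches the stated goal of keeping the moderator's workload limited to following the discussion.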
Re: Let's eliminate the Module List
On Wed, 2004-08-18 at 17:57, Mark Stosberg wrote: On Wed, Aug 18, 2004 at 04:54:32PM -0500, Andy Lester wrote: I propose eliminating the Long Module List. I'm talking about http://www.cpan.org/modules/00modlist.long.html (2998 modules), not http://www.cpan.org/modules/01modules.index.html (6800 modules). As a long time CPAN module author and user, I second this proposal. Mark ditto, I have modules both on the list and off it. Some time ago, I came to the same conclusion: it really adds no significant value today. I did use it years ago, before search.cpan.org. Not much really since. Lincoln