Re: [request] modperl mailing lists searchable archives wanted

2001-10-16 Thread Stas Bekman

Joshua Chamas wrote:

 Stas Bekman wrote:
 
dev@@perl.apache.org - 2.5, but their search engines suck
[EMAIL PROTECTED] - none
[EMAIL PROTECTED] - none
[EMAIL PROTECTED]  - none
[EMAIL PROTECTED]   - 1


 
 Hey Stas, 
 
 I have the asp list getting archived at:
 
   http://www.mail-archive.com/asp%40perl.apache.org/


Added. Thanks Joshua

 
 Thanks for keeping up on this.  It would be nice to 
 have another search archive for the asp list too.

:)

_
Stas Bekman JAm_pH  --   Just Another mod_perl Hacker
http://stason.org/  mod_perl Guide   http://perl.apache.org/guide
mailto:[EMAIL PROTECTED]  http://ticketmaster.com http://apacheweek.com
http://singlesheaven.com http://perl.apache.org http://perlmonth.com/




Re: [request] modperl mailing lists searchable archives wanted

2001-10-10 Thread Stas Bekman

Bill Moseley wrote:

 Hi Stas,
 
 I just updated the search site for Apache.org with a newer version of
 swish.  The context highlighting is a bit silly, but that can be fixed.
 I'm only caching the first 15K of text from each page for context
 highlighting.
 
 http://search.apache.org
 
 It seems reasonably fast (it's not running under mod_perl currently, but
 could -- if mod_perl was in that server ;).
 
 It takes about eight or nine minutes to reindex ~35,000 docs on *.apache.org
 so the mod_perl list (and others) shouldn't be too much trouble, I'd think,
 with smaller numbers and smaller content.
 
 It doesn't do incremental indexing at this point, which is a drawback, but
 indexing is so fast it normally doesn't matter (and there's an easy
 work-around for something like a mailing list to pick up new messages as
 they come in during the day).
 
 Swish-e can also call a perl program which feeds docs to swish.  That makes
 it easy to parse the email into fields for something like:
 
   http://swish-e.org/Discussion/search/swish.cgi
 
 which looks a lot like the Apache search site...
 
 But, what would be needed is a good threaded mail archiver, which there are
 many to pick from, I'd expect.
 
 
Some archives are browsable, but their search engines simply suck. E.g.
marc.theaimsgroup.com is, I think, the only one that archives
[EMAIL PROTECTED], but if you try to search for a Perl string like
APR::Table::FETCH it won't find anything. If you search for
get_dir_config it will split it into 'get', 'dir', 'config' and give you
a zillion matches when you know that there are just a few.

 
 On swish you could say : and _ are part of words and those would index
 as full words.  Or, just simply search for phrase: get_dir_config and it
 would search for the phrase get dir config which would probably find what
 you want.
 
 Maybe : and _ are ok in words, but you have to think carefully about
 others.  It's more flexible to split the words and use phrases in many cases.

Hi Bill,

It's great that search.apache.org gets a new engine, but if you run a
few simple tests, it's still not very good, given what you've just
explained. When I search for mod_perl, I am searching for 'mod_perl' and
not for 'mod' and 'perl'. There may be hundreds of pages that contain
either mod_perl or both 'mod' and 'perl', and currently those with
'mod_perl' don't get a higher relevance than those with 'mod' and
'perl'. So it's not good.
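
To illustrate the ranking behaviour asked for here, below is a minimal sketch in Python. It is not how Swish-e computes relevance; the score function, the phrase_bonus value, and the two sample documents are all assumptions made up for the example. It splits words the same way the engine does, but adds a bonus whenever the query words appear next to each other, so a page containing 'mod_perl' outranks a page that merely contains 'mod' and 'perl' somewhere.

```python
import re

def words(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, doc, phrase_bonus=10):
    """Hypothetical relevance score: word hits plus a bonus for adjacent matches."""
    q = words(query)
    d = words(doc)
    if not q:
        return 0
    # One point for every occurrence of any query word.
    base = sum(d.count(w) for w in set(q))
    # Extra points each time the query words appear consecutively, in order.
    n = len(q)
    adjacent = sum(1 for i in range(len(d) - n + 1) if d[i:i + n] == q)
    return base + phrase_bonus * adjacent

# Sample documents, invented for illustration.
docs = [
    "a perl script to mod the config",
    "install mod_perl on the server",
]
ranked = sorted(docs, key=lambda d: score("mod_perl", d), reverse=True)
```

Both documents contain 'mod' and 'perl' once each, so the plain word count ties them; only the adjacency bonus moves the mod_perl page to the top.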

Well, we have been through this already with Randy Kobes's version of the
searchable guide (Swish-E too), which has been tuned to work with Perl
content. You may want to ask Randy to give you the tuned configuration.
You can compare all three search engines used at
http://perl.apache.org/guide/#search.

Thanks Bill!





Re: [request] modperl mailing lists searchable archives wanted

2001-10-10 Thread Joshua Chamas

Stas Bekman wrote:
 
 dev@@perl.apache.org - 2.5, but their search engines suck
 [EMAIL PROTECTED] - none
 [EMAIL PROTECTED] - none
 [EMAIL PROTECTED]  - none
 [EMAIL PROTECTED]   - 1
 

Hey Stas, 

I have the asp list getting archived at:

  http://www.mail-archive.com/asp%40perl.apache.org/

Thanks for keeping up on this.  It would be nice to 
have another search archive for the asp list too.

Josh

_
Joshua Chamas   Chamas Enterprises Inc.
NodeWorks Founder   Huntington Beach, CA  USA 
http://www.nodeworks.com    1-714-625-4051



Re: [request] modperl mailing lists searchable archives wanted

2001-10-09 Thread Bill Moseley

Hi Stas,

I just updated the search site for Apache.org with a newer version of
swish.  The context highlighting is a bit silly, but that can be fixed.
I'm only caching the first 15K of text from each page for context
highlighting.

http://search.apache.org

It seems reasonably fast (it's not running under mod_perl currently, but
could -- if mod_perl was in that server ;).

It takes about eight or nine minutes to reindex ~35,000 docs on *.apache.org
so the mod_perl list (and others) shouldn't be too much trouble, I'd think,
with smaller numbers and smaller content.

It doesn't do incremental indexing at this point, which is a drawback, but
indexing is so fast it normally doesn't matter (and there's an easy
work-around for something like a mailing list to pick up new messages as
they come in during the day).

Swish-e can also call a perl program which feeds docs to swish.  That makes
it easy to parse the email into fields for something like:

  http://swish-e.org/Discussion/search/swish.cgi

which looks a lot like the Apache search site...
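
As a rough illustration of parsing an email into fields before handing it to an indexer, here is a minimal sketch. It is written in Python rather than Perl and does not use Swish-e's actual interface; the email_to_fields helper, the choice of fields, and the sample message are assumptions made up for the example.

```python
from email import message_from_string

def email_to_fields(raw):
    """Split a raw mail message into the fields an indexer might want."""
    msg = message_from_string(raw)
    # Only handle simple single-part messages in this sketch.
    body = msg.get_payload() if not msg.is_multipart() else ""
    return {
        "subject": msg.get("Subject", ""),
        "from": msg.get("From", ""),
        "date": msg.get("Date", ""),
        "body": body.strip(),
    }

# Sample message, invented for illustration.
raw = (
    "From: someone@example.com\n"
    "Subject: searchable archives\n"
    "Date: Tue, 09 Oct 2001 10:00:00 +0000\n"
    "\n"
    "Does get_dir_config work here?\n"
)
fields = email_to_fields(raw)
```

Each key in the returned dictionary could then be indexed as its own searchable field, so a search can be limited to subjects or senders.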

But, what would be needed is a good threaded mail archiver, which there are
many to pick from, I'd expect.

Some archives are browsable, but their search engines simply suck. E.g.
marc.theaimsgroup.com is, I think, the only one that archives
[EMAIL PROTECTED], but if you try to search for a Perl string like
APR::Table::FETCH it won't find anything. If you search for
get_dir_config it will split it into 'get', 'dir', 'config' and give you
a zillion matches when you know that there are just a few.

On swish you could say : and _ are part of words and those would index
as full words.  Or, just simply search for phrase: get_dir_config and it
would search for the phrase get dir config which would probably find what
you want.

Maybe : and _ are ok in words, but you have to think carefully about
others.  It's more flexible to split the words and use phrases in many cases.
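
To make the trade-off concrete, here is a rough sketch in Python. It is not Swish-e itself; the tokenize and phrase_match helpers and the sample text are made up for illustration. It shows the two options: keeping : and _ inside words so identifiers index whole, or splitting on them and relying on a phrase search over the pieces.

```python
import re

def tokenize(text, extra_word_chars=""):
    """Split text into lowercase words; extra_word_chars are kept inside words."""
    pattern = "[A-Za-z0-9" + re.escape(extra_word_chars) + "]+"
    return [w.lower() for w in re.findall(pattern, text)]

def phrase_match(words, phrase_words):
    """Return True if phrase_words occur consecutively, in order, in words."""
    n = len(phrase_words)
    return any(words[i:i + n] == phrase_words
               for i in range(len(words) - n + 1))

# Sample text, invented for illustration.
doc = "call get_dir_config from APR::Table::FETCH"

# Option 1: split on : and _ and search for the pieces as a phrase.
split_words = tokenize(doc)
found_by_phrase = phrase_match(split_words, tokenize("get_dir_config"))

# Option 2: treat : and _ as word characters so identifiers stay whole.
whole_words = tokenize(doc, extra_word_chars="_:")
found_as_word = "apr::table::fetch" in whole_words
```

With the first option a search for just 'config' still finds the document; with the second it no longer does, which is the flexibility cost of adding characters to words.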



Bill Moseley
mailto:[EMAIL PROTECTED]



RE: [request] modperl mailing lists searchable archives wanted

2001-10-09 Thread Geoffrey Young


 
 I've just updated the archives list at 
 http://perl.apache.org/#maillists, so here is what we have:
 
 dev@@perl.apache.org - 2.5, but their search engines suck
 [EMAIL PROTECTED] - none
 [EMAIL PROTECTED] - none
 [EMAIL PROTECTED]  - none
 [EMAIL PROTECTED]   - 1

as far as I know, nobody is archiving [EMAIL PROTECTED] either,
which is also of interest to us mod_perl folks :)

--Geoff



Re: [request] modperl mailing lists searchable archives wanted

2001-10-09 Thread Elizabeth Mattijsen

At 05:59 PM 10/9/01 +0800, Stas Bekman wrote:
Please try to send links only for good archives with good search engines.
Thanks a bunch!

Still in beta phase, and only containing Perl newsgroups, it nonetheless 
might be interesting to check out:

   http://news.search.nl/style/search.en/read/category/Programming_Languages
   http://news.search.nl/style/search.en/read/category/Programming_Languages/Perl/list/page1.html

Currently refreshed 4 times a day, with searching being refreshed once a day.

The site actually runs ModPerl with Matt Sergeant's LibXML and LibXSLT modules.




Elizabeth Mattijsen

Note: I am the main developer of this website, so I am prejudiced  ;-)




Re: [request] modperl mailing lists searchable archives wanted

2001-10-09 Thread Stas Bekman

Geoffrey Young wrote:

I've just updated the archives list at 
http://perl.apache.org/#maillists, so here is what we have:

dev@@perl.apache.org - 2.5, but their search engines suck
[EMAIL PROTECTED] - none
[EMAIL PROTECTED] - none
[EMAIL PROTECTED]  - none
[EMAIL PROTECTED]   - 1

 
 as far as I know, nobody is archiving [EMAIL PROTECTED] either,
 which is also of interest to us mod_perl folks :)

At least: http://www.apachelabs.org/test-dev/

I'll add this link.





Re: [request] modperl mailing lists searchable archives wanted

2001-10-09 Thread Stas Bekman

Elizabeth Mattijsen wrote:

 At 05:59 PM 10/9/01 +0800, Stas Bekman wrote:
 
 Please try to send links only for good archives with good search engines.
 Thanks a bunch!
 
 
 Still in beta phase, and only containing Perl newsgroups, it nonetheless 
 might be interesting to check out:
 
   
 http://news.search.nl/style/search.en/read/category/Programming_Languages
 http://news.search.nl/style/search.en/read/category/Programming_Languages/Perl/list/page1.html
 
 Currently refreshed 4 times a day, with searching being refreshed once a 
 day.
 
 The site actually runs ModPerl with Matt Sergeant's LibXML and LibXSLT 
 modules.


That's cool, but I asked for links to archives of the modperl-foo lists
that I listed in my original post (we have enough archives of the
modperl list itself).

Thanks, Elizabeth



