Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, Randy Kobes wrote:

> On Fri, 19 May 2000, Stas Bekman wrote:
>
> > On Fri, 19 May 2000, raptor wrote:
> >
> > > hi,
> > >
> > > very interesting. Search for : "statinc" returns nothing and the box gets filled
> > > with "tatinc" instead of "statinc" ?!?!:")
> > >
> > > this under KDE viewer, now will try netscape ...!!
> >
> > it's not the client -- it's a bug.
> >
> > This happened after Randy made non-stemming the default. When you
> > turn the stemming on you get it right. Randy, ideas?
>
> Hi,
> This was a bug, which was just fixed - 'statinc' now returns
> reasonable results. Also, I fixed it so a search term of
> $SIG{__DIE__}, for example, also returns some results.

Almost, when you search for it for the first time, it's Ok. But then you
append \ before $SIG{__DIE__} and it searches for \\$SIG{__DIE__} which
yields nothing.

'VINC' gives nothing as well :(

Looks like a try and catch game...

_
Stas Bekman JAm_pH -- Just Another mod_perl Hacker
http://stason.org/ mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED] http://perl.org http://stason.org/TULARC
http://singlesheaven.com http://perlmonth.com http://sourcegarden.org
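One plausible reading of the \\$SIG{__DIE__} report is that the escaped form of the query ends up back in the search box and is escaped again on the next submit. The thread does not show the script, so the following is only a sketch of that failure mode and one way around it. The function names and the choice of `$` as the only character needing a backslash are assumptions made for illustration.

```python
def escape_for_engine(raw_query, special="$"):
    # Hypothetical escaping step: put a backslash in front of each
    # character the search engine would otherwise treat specially.
    return "".join("\\" + ch if ch in special else ch for ch in raw_query)


def run_search(raw_query):
    # Keep the user's raw text for the sticky search box and escape
    # only the copy that is handed to the engine.
    return {
        "box_value": raw_query,
        "engine_query": escape_for_engine(raw_query),
    }


first = run_search("$SIG{__DIE__}")
# Resubmitting the box value gives the same engine query again.
second = run_search(first["box_value"])
# Feeding the already-escaped form back in is what doubles the backslash.
doubled = escape_for_engine(first["engine_query"])
```

With this split, the box value never carries a backslash the user did not type, so repeating a search cannot accumulate escapes.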
Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, Stas Bekman wrote:

> On Fri, 19 May 2000, raptor wrote:
>
> > hi,
> >
> > very interesting. Search for : "statinc" returns nothing and the box gets filled
> > with "tatinc" instead of "statinc" ?!?!:")
> >
> > this under KDE viewer, now will try netscape ...!!
>
> it's not the client -- it's a bug.
>
> This happened after Randy made non-stemming the default. When you
> turn the stemming on you get it right. Randy, ideas?

Hi,
This was a bug, which was just fixed - 'statinc' now returns
reasonable results. Also, I fixed it so a search term of
$SIG{__DIE__}, for example, also returns some results.

best regards,
randy
Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, raptor wrote:

> hi,
>
> very interesting. Search for : "statinc" returns nothing and the box gets filled
> with "tatinc" instead of "statinc" ?!?!:")
>
> this under KDE viewer, now will try netscape ...!!

it's not the client -- it's a bug.

This happened after Randy made non-stemming the default. When you
turn the stemming on you get it right. Randy, ideas?

Stas Bekman
Re: [preview] Search engine for the Guide
hi,

very interesting. Search for : "statinc" returns nothing and the box gets filled
with "tatinc" instead of "statinc" ?!?!:")

this under KDE viewer, now will try netscape ...!!
RE: [preview] Search engine for the Guide
> > That would be nice to see. I'm afraid I'll continue working on the guide.
> > So if there is anyone with a few free minutes on his hands, he/she might like
> > to contribute something back to the community ;)
> >
> > Ideally, when we complete the tuning of the search engine, we will be able
> > to have the whole site, apache::asp and embperl pages searchable as well.
> > (with Perl style documentation in mind).
>
> Stas,
>
> there is already a search frontend for the apache sites, at
> http://www.apache.org/search.html which is also able to search under
> perl.apache.org, but if you enter mod_perl, it doesn't find anything :-(. Don't
> know if this is of any use and who is maintaining (or not maintaining) this
> page.

Heh, look at the bottom of http://perl.apache.org/guide/index.html --
the search box from http://www.apache.org/search.html has been there since
the day the guide went online. But as you said -- it's useless, as it's not
good for the kind of documentation we have.

I've posted a request for comments about the apache.org search engine to
the asf members list but it was ignored :(

Stas Bekman
RE: [preview] Search engine for the Guide
> That would be nice to see. I'm afraid I'll continue working on the guide.
> So if there is anyone with a few free minutes on his hands, he/she might like
> to contribute something back to the community ;)
>
> Ideally, when we complete the tuning of the search engine, we will be able
> to have the whole site, apache::asp and embperl pages searchable as well.
> (with Perl style documentation in mind).

Stas,

there is already a search frontend for the apache sites, at
http://www.apache.org/search.html which is also able to search under
perl.apache.org, but if you enter mod_perl, it doesn't find anything :-(. Don't
know if this is of any use and who is maintaining (or not maintaining) this
page.

Gerald
Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, Matt Sergeant wrote:

> On Thu, 18 May 2000, Randy Kobes wrote:
>
> > Another thing that was configured in is that words have
> > to be at least 3 characters long, which seems reasonable,
> > and also there's some stopwords that don't get indexed,
> > as they're too common. This list of stopwords is built
> > by hand - so far it only includes 'perl' and 'modperl'.
> > Also, the maximum number of hits is set at 30.
>
> It should also index $/, etc. So limiting to >2char words is another
> broken aspect...

Seems like for Perl documentation there should be no limiting at all, or
maybe one character is the only option...

> But I'm not complaining! It's 100% better than it was. Maybe someone
> would like my code for a db backed search engine and fix that up to
> something that could work? It's all built in perl so you're free to add
> and remove stopwords or change the min word length as you like.

That would be nice to see. I'm afraid I'll continue working on the guide.
So if there is anyone with a few free minutes on his hands, he/she might like
to contribute something back to the community ;)

Ideally, when we complete the tuning of the search engine, we will be able
to have the whole site, apache::asp and embperl pages searchable as well.
(with Perl style documentation in mind).

Stas Bekman
Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, Ged Haywood wrote:

> Hi all,
>
> On Thu, 18 May 2000, Randy Kobes wrote:
>
> > > The :: are stripped on the fly, since these cannot be used in index, so
> > > when you look for Foo::Bar you are actually looking for 'Foo && Bar'.
> >
> > That's a limitation of swish-e - you can configure it to
> > index characters like $, !, ... as part of a "word", but
> > the characters >, <, *, and : cannot be so indexed.
>
> If you use swish++4.4 then you can change this in "config.h"
>
> // Characters that are permissible in words: letters must be lower
> // case and upper case letters would be redundant.
> //
> char const Word_Chars[] = "&'-0123456789abcdefghijklmnopqrstuvwxyz_";
> // Characters that may be in a word. Note that '&' is here so
> // acronyms like "AT&T" are treated as one word. Unlike SWISH-E,
> // ';' does not need to be here to recognize and convert character
> // entity references.

Interesting, Randy what version did you use?

Thanks Ged!

Stas Bekman
Re: [preview] Search engine for the Guide
On Thu, 18 May 2000, Randy Kobes wrote:

> On Fri, 19 May 2000, Stas Bekman wrote:
>
> > On Thu, 18 May 2000, Vivek Khera wrote:
> >
> > > looks good... one minor issue with the stickiness of the next search
> > > feature:
> > >
> > > type "lexical file handles" in your original search. the "es" at the
> > > end is lost in the next search box on the result page.
> >
> > Yup, broken :(
>
> Hi,
> But fixable ...:) As I just mentioned, we can turn stemming
> off, or at least make it optional, so the full word only is
> searched for. I've found stemming useful, but that's perhaps
> just the way I do searches - should I turn it off by default to see if
> that's preferable? And make it then a configurable option?

Yup, turn it off. And have an option to turn it on.

Thanks!

Stas Bekman
Re: [preview] Search engine for the Guide
On Thu, 18 May 2000, Randy Kobes wrote:

> Another thing that was configured in is that words have
> to be at least 3 characters long, which seems reasonable,
> and also there's some stopwords that don't get indexed,
> as they're too common. This list of stopwords is built
> by hand - so far it only includes 'perl' and 'modperl'.
> Also, the maximum number of hits is set at 30.

It should also index $/, etc. So limiting to >2char words is another
broken aspect...

But I'm not complaining! It's 100% better than it was. Maybe someone
would like my code for a db backed search engine and fix that up to
something that could work? It's all built in perl so you're free to add
and remove stopwords or change the min word length as you like.

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org
Re: [preview] Search engine for the Guide
Hi all,

On Thu, 18 May 2000, Randy Kobes wrote:

> > The :: are stripped on the fly, since these cannot be used in index, so
> > when you look for Foo::Bar you are actually looking for 'Foo && Bar'.
>
> That's a limitation of swish-e - you can configure it to
> index characters like $, !, ... as part of a "word", but
> the characters >, <, *, and : cannot be so indexed.

If you use swish++4.4 then you can change this in "config.h"

// Characters that are permissible in words: letters must be lower
// case and upper case letters would be redundant.
//
char const Word_Chars[] = "&'-0123456789abcdefghijklmnopqrstuvwxyz_";
// Characters that may be in a word. Note that '&' is here so
// acronyms like "AT&T" are treated as one word. Unlike SWISH-E,
// ';' does not need to be here to recognize and convert character
// entity references.

73,
Ged.
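To see why the set of permitted word characters matters for Perl documentation, here is a small Python sketch of a tokenizer driven by such a set. It is a simplification for illustration only and not how swish++ itself splits words. The character set is copied from the config.h excerpt above, while the `tokenize` function and the idea of adding ':' to the set are assumptions.

```python
# Character set copied from the quoted swish++ config.h excerpt.
WORD_CHARS = set("&'-0123456789abcdefghijklmnopqrstuvwxyz_")


def tokenize(text, word_chars=WORD_CHARS):
    # Lower-case the text, then cut it into maximal runs of permitted
    # characters; anything outside the set acts as a separator.
    words, current = [], []
    for ch in text.lower():
        if ch in word_chars:
            current.append(ch)
        elif current:
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


default_words = tokenize("AT&T uses Apache::Constants")
# Hypothetical variant: treat ':' as a word character as well.
perl_words = tokenize("AT&T uses Apache::Constants", WORD_CHARS | {":"})
```

With the default set the package name falls apart into 'apache' and 'constants'; once ':' is allowed it survives as the single word 'apache::constants'.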
Re: [preview] Search engine for the Guide
On Thu, 18 May 2000, Jeremy Howard wrote:

> Stas Bekman <[EMAIL PROTECTED]> wrote:
> > Ok, We have a preview ready for you. Randy Kobes worked hard to prepare
> > this one. So your comments are very welcome. If you like it we'll put this
> > into production.
> >
> > Please keep either the list CC'ed or if you reply to me in person, make
> > sure you keep Randy CC'ed -- all the kudos should go his way :)
> >
> When I search for 'dbi' or 'DBI', it finds nothing, and the search box shows 'dby'!
>
> It looks like it's trying to helpfully change my search pattern...

Hi,
I turned stemming on by default - that's why the search
pattern gets changed. This obviously causes confusion - I'll
turn it off, and make it manually configurable. As well, the
indexing was configured for words greater than 3 characters;
I'll reduce it down to greater than 2 characters and see if
that helps.

best regards,
randy
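The behaviour being agreed on here (stemming off by default, available as an option, and never rewriting what the user typed in the search box) can be sketched roughly as follows. This is not the actual guide-search script: `toy_stem` is a hypothetical stand-in that only strips a couple of plural endings, and `prepare_search` is an assumed name.

```python
def toy_stem(word):
    # Hypothetical stand-in for a real stemmer: strip "es" or "s" when
    # at least three characters of the word remain.
    for suffix in ("es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def prepare_search(query, stem=False):
    # Return the text for the sticky search box (always exactly what the
    # user typed) together with the terms that go to the index.
    words = query.split()
    terms = [toy_stem(w) for w in words] if stem else words
    return query, terms


box_plain, terms_plain = prepare_search("lexical file handles")
box_stemmed, terms_stemmed = prepare_search("lexical file handles", stem=True)
```

Keeping the displayed query separate from the stemmed terms means that turning stemming on changes only what is looked up, not what appears in the next search box.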
Re: [preview] Search engine for the Guide
Stas Bekman <[EMAIL PROTECTED]> wrote:
> Ok, We have a preview ready for you. Randy Kobes worked hard to prepare
> this one. So your comments are very welcome. If you like it we'll put this
> into production.
>
> Please keep either the list CC'ed or if you reply to me in person, make
> sure you keep Randy CC'ed -- all the kudos should go his way :)
>
When I search for 'dbi' or 'DBI', it finds nothing, and the search box shows 'dby'!

It looks like it's trying to helpfully change my search pattern...

--
Jeremy Howard
[EMAIL PROTECTED]
Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, Stas Bekman wrote:

> On Thu, 18 May 2000, Matt Sergeant wrote:
>
> > One more point... The indexer or the searcher (or both) has a broken
> > tokenizer for anything involving perl. Try searching for
> > Apache::Constants, for example.
>
> That's right. It's broken :( After searching for 'Apache::Constants' I've
> got 'apach constant'...

Just to expand on this - I turned stemming of words on
by default in the search, which is why the stemmed words
get returned. Perhaps it'll be better to turn stemming off
by default, and rather make it a configurable option?

> The :: are stripped on the fly, since these cannot be used in index, so
> when you look for Foo::Bar you are actually looking for 'Foo && Bar'.

That's a limitation of swish-e - you can configure it to
index characters like $, !, ... as part of a "word", but
the characters >, <, *, and : cannot be so indexed. So
the script silently stripped ':' out, leaving the search
term to be 'Apache' && 'Constants'. This should be mentioned
on the search page.

Another thing that was configured in is that words have
to be at least 3 characters long, which seems reasonable,
and also there's some stopwords that don't get indexed,
as they're too common. This list of stopwords is built
by hand - so far it only includes 'perl' and 'modperl'.
Also, the maximum number of hits is set at 30.

best regards,
randy
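The three settings mentioned at the end of this message (minimum word length of 3, a hand-built stopword list, at most 30 hits) can be illustrated with a short Python sketch. The values come from the message; the function names are made up for illustration and the real indexer is swish-e, not this code.

```python
MIN_WORD_LENGTH = 3              # words must be at least 3 characters long
STOPWORDS = {"perl", "modperl"}  # hand-built list of too-common words
MAX_HITS = 30                    # cap on the number of results returned


def indexable_words(text, min_length=MIN_WORD_LENGTH):
    # Split on whitespace, lower-case, then drop short words and stopwords.
    words = text.lower().split()
    return [w for w in words if len(w) >= min_length and w not in STOPWORDS]


def limit_hits(hits):
    # Keep only the first MAX_HITS results.
    return hits[:MAX_HITS]


strict = indexable_words("Perl sets $/ in modperl handlers")
relaxed = indexable_words("Perl sets $/ in modperl handlers", min_length=1)
```

Under the length limit of 3, a two-character Perl variable such as $/ never reaches the index; lowering the limit to 1 lets it through, at the cost of also indexing short common words like 'in'.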
Re: [preview] Search engine for the Guide
On Thu, 18 May 2000, Matt Sergeant wrote:

> Looks cool, except can we take the guide splitting back 1 level? It
> seems to be split on =head2's, and should be split (IMO) on =head1's.

The reason for splitting on any =head level lies in the fact that there are
huge sections under =head1 which have many =head{2,5}, and I'm slowly
reworking the Guide to make it more categorized (nested), rather than
flattened as it was before (and still is).

But we have thought about this issue. Look at the links at the bottom of
the split page -- they can take you to the full version as well.

> One more point... The indexer or the searcher (or both) has a broken
> tokenizer for anything involving perl. Try searching for
> Apache::Constants, for example.

That's right. It's broken :( After searching for 'Apache::Constants' I've
got 'apach constant'...

The :: are stripped on the fly, since these cannot be used in index, so
when you look for Foo::Bar you are actually looking for 'Foo && Bar'.

Stas Bekman
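The on-the-fly rewrite described above, where Foo::Bar becomes 'Foo && Bar', might look something like this in Python. It is a minimal sketch of the idea, not the actual search script, and `rewrite_query` is a hypothetical name.

```python
def rewrite_query(query):
    # Split each whitespace-separated token on '::' and AND the pieces,
    # because ':' cannot be part of an indexed word.
    terms = []
    for token in query.split():
        parts = [p for p in token.split("::") if p]
        if parts:
            terms.append(" && ".join(parts))
    return " ".join(terms)


package_query = rewrite_query("Apache::Constants")
plain_query = rewrite_query("dbi")
```

The rewritten query matches any page containing both words, so it can return pages that mention 'Apache' and 'Constants' separately without ever naming the module.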
Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, Stas Bekman wrote:

> Ok, We have a preview ready for you. Randy Kobes worked hard to prepare
> this one. So your comments are very welcome. If you like it we'll put this
> into production.
>
> Please keep either the list CC'ed or if you reply to me in person, make
> sure you keep Randy CC'ed -- all the kudos should go his way :)
>
> So:
>
> The search is at:
>
> http://theoryx5.uwinnipeg.ca/cgi-bin/guide-search
>
> and the split guide is at:
>
> http://theoryx5.uwinnipeg.ca/guide/

One more point... The indexer or the searcher (or both) has a broken
tokenizer for anything involving perl. Try searching for
Apache::Constants, for example.

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org
Re: [preview] Search engine for the Guide
On Fri, 19 May 2000, Stas Bekman wrote:

> Ok, We have a preview ready for you. Randy Kobes worked hard to prepare
> this one. So your comments are very welcome. If you like it we'll put this
> into production.
>
> Please keep either the list CC'ed or if you reply to me in person, make
> sure you keep Randy CC'ed -- all the kudos should go his way :)
>
> So:
>
> The search is at:
>
> http://theoryx5.uwinnipeg.ca/cgi-bin/guide-search
>
> and the split guide is at:
>
> http://theoryx5.uwinnipeg.ca/guide/

Looks cool, except can we take the guide splitting back 1 level? It
seems to be split on =head2's, and should be split (IMO) on =head1's.

--
Fastnet Software Ltd. High Performance Web Specialists
Providing mod_perl, XML, Sybase and Oracle solutions
Email for training and consultancy availability.
http://sergeant.org http://xml.sergeant.org
[preview] Search engine for the Guide
Ok, We have a preview ready for you. Randy Kobes worked hard to prepare
this one. So your comments are very welcome. If you like it we'll put this
into production.

Please keep either the list CC'ed or if you reply to me in person, make
sure you keep Randy CC'ed -- all the kudos should go his way :)

So:

The search is at:

http://theoryx5.uwinnipeg.ca/cgi-bin/guide-search

and the split guide is at:

http://theoryx5.uwinnipeg.ca/guide/

Enjoy!

Stas Bekman