Re: search engine module?
People have been talking about backend search engines, but when I saw the subject I was thinking more about front-end classes. In particular, last time I looked there wasn't a standard class for integrating local search engines into your code. I ended up making a WWW::Search, but you kind of have to tweak the meaning of some values. If anyone is interested I ought to release it. It's a trivial example for very small web sites (it provides Google-like search syntax, and backends it with grep).

--
Kee Hinckley - Somewhere.Com, LLC
http://consulting.somewhere.com/
[EMAIL PROTECTED]
(or ...!alice!nazgul for time travelers :-)

I'm not sure which upsets me more: that people are so unwilling to accept responsibility for their own actions, or that they are so eager to regulate everyone else's.
Re: search engine module?
Kee Hinckley wrote:
> People have been talking about backend search engines, but when I saw the subject I was thinking more about front end classes. In particular, last time I looked there wasn't a standard class for integrating local search engines into your code. I ended up making a WWW::Search, but you kind of have to tweak the meaning of some values. If anyone is interested I ought to release it. It's a trivial example for very small web sites (it provides google-like search syntax, and backends it with grep).

You should have checked CPAN first: there is a load of WWW::Search:: modules there.

--
Stas Bekman  JAm_pH -- Just Another mod_perl Hacker
http://stason.org/  mod_perl Guide http://perl.apache.org/guide
mailto:[EMAIL PROTECTED]  http://ticketmaster.com  http://apacheweek.com
http://singlesheaven.com  http://perl.apache.org  http://perlmonth.com/
Re: search engine module?
At 12:56 AM +0800 10/19/01, Stas Bekman wrote:
> Kee Hinckley wrote:
>> People have been talking about backend search engines, but when I saw the subject I was thinking more about front end classes. In particular, last time I looked there wasn't a standard class for integrating local search engines into your code. I ended up making a WWW::Search, but you kind of have to tweak the meaning of some values. If anyone is interested I ought to release it. It's a trivial example for very small web sites (it provides google-like search syntax, and backends it with grep).
>
> You should have checked CPAN first: There is a load of WWW::Search:: modules there.

Yes. But my point is that they are all *offsite* searches as far as I can tell. What I wanted was a standard interface to a local search engine.

--
Kee Hinckley - Somewhere.Com, LLC
Re: search engine module?
Kee Hinckley wrote:
> At 12:56 AM +0800 10/19/01, Stas Bekman wrote:
>> Kee Hinckley wrote:
>>> People have been talking about backend search engines, but when I saw the subject I was thinking more about front end classes. In particular, last time I looked there wasn't a standard class for integrating local search engines into your code. I ended up making a WWW::Search, but you kind of have to tweak the meaning of some values. If anyone is interested I ought to release it. It's a trivial example for very small web sites (it provides google-like search syntax, and backends it with grep).
>>
>> You should have checked CPAN first: There is a load of WWW::Search:: modules there.
>
> Yes. But my point is that they are all *offsite* searches as far as I can tell. What I wanted was a standard interface to a local search engine.

Right, my point is that the WWW::Search namespace is taken :)

--
Stas Bekman
Re: search engine module?
At 11:36 AM +0800 10/19/01, Stas Bekman wrote:
> Right, my point is that WWW::Search namespace is taken :)

Ah. Sorry, my miscommunication. When I said that I ended up making "a WWW::Search" I should have put "an instance of" in there instead of "a". Basically WWW::Search provided a good interface, but everything was remote, so I wrote this. If you stick to the conventions provided here, it should be easy to make other variations using other local search engines. I was just surprised that nobody seemed to have done it before.

Grep(3)          User Contributed Perl Documentation          Grep(3)

NAME
    WWW::Search::Grep - class for searching a local web site using grep

SYNOPSIS
    require WWW::Search;
    $search = new WWW::Search('Grep');

DESCRIPTION
    This is a grep specialization of WWW::Search.

    This class exports no public interface; all interaction should be done through WWW::Search objects.

OPTIONS
    The default query syntax is:

        word
        word OR word
        "quoted phrase"

    Blank-separated words are implicitly separated by AND. OR refers only to the words or phrases directly to either side. The model is the same as that used by Google (http://www.google.com/).

    search_url
        Specifies the directory to search. All .html and .htm files in the specified directory and any subdirectories will be searched. This is an absolute pathname and is required. E.g. /home/httpd/html/foo/searchdir/

    base_path
        This is the part of that pathname that should be stripped off before prefixing the base_url. This is required. E.g. /home/httpd/html/

    base_url
        This is prepended to the pathname after stripping the base_path. This is optional; the default is none. E.g. http://www.somewhere.com/ or /

    search_debug, search_parse_debug
        See WWW::Search.

    grep
        Pathname to grep; the default is /bin/egrep.

AUTHOR
    Kee Hinckley, [EMAIL PROTECTED]

--
Kee Hinckley - Somewhere.Com, LLC
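The query rules in the OPTIONS section above (blank-separated terms are ANDed, OR binds only to its immediate neighbours, quotes make a phrase) can be sketched in a few lines of Perl. This is only an illustration of those rules, not the code inside WWW::Search::Grep, which was not posted. The function names parse_query and matches are made up for the example, and the case-insensitive substring match is an assumption.

```perl
use strict;
use warnings;

# Parse a query into a list of groups. Groups are ANDed together;
# the terms inside one group are ORed. (Hypothetical helper.)
sub parse_query {
    my ($query) = @_;
    my @groups;
    my $join_next = 0;    # true right after an OR operator
    while ($query =~ /\G\s*(?:"([^"]*)"|(\S+))/gc) {
        my ($phrase, $word) = ($1, $2);
        if (!defined $phrase && $word eq 'OR') {
            $join_next = 1 if @groups;    # a leading OR has nothing to join
            next;
        }
        my $term = defined $phrase ? $phrase : $word;
        if ($join_next) { push @{ $groups[-1] }, $term }
        else            { push @groups, [$term] }
        $join_next = 0;
    }
    return \@groups;
}

# True if every group has at least one term occurring in $text.
# Case-insensitive matching is an assumption of this sketch.
sub matches {
    my ($groups, $text) = @_;
    for my $group (@$groups) {
        return 0 unless grep { index(lc($text), lc($_)) >= 0 } @$group;
    }
    return 1;
}

my $groups = parse_query('perl apache OR "mod perl"');
print matches($groups, 'Running Perl under mod perl') ? "match\n" : "no match\n";
```

A backend built this way could turn each group into one grep alternation and intersect the file lists, but that part is left out here.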
Re: search engine module?
Daniel Sully wrote:
> Is the engine used at the math forum publicly available?

I don't know. Why don't you ask them :)

> Once upon a time Stas Bekman shaped the electrons to say...
>> The engine at mathforum does a great job; it's the best mailing list archive search engine that I've ever seen with regard to searching Perl strings and code in general. Just make sure to use the right options at:
>> http://mathforum.org/discussions/epi-search/modperl.html
>
> -D

--
Stas Bekman
Re: search engine module?
We use OpenFTS (http://openfts.sourceforge.net) at the PostgreSQL mailing list archive (http://fts.postgresql.org).

Regards,
Oleg

On Wed, 17 Oct 2001, Stas Bekman wrote:
> Ged Haywood wrote:
>> Hi all,
>>
>> On Mon, 15 Oct 2001, Ask Bjoern Hansen wrote:
>>> On Fri, 12 Oct 2001, Perrin Harkins wrote:
>>>> [...]
>>>> Plus lots of other stuff like Glimpse and Swish which interface to C-based engines.
>>>
>>> I've had good luck with http://swish-e.org/2.2/
>>
>> Please make sure that it's possible to do a plain ordinary literal text string search. Nothing fancy, no case-folding, no automatic removal of punctuation, nothing like that. Just a literal string. Last night I tried to find perl -V on all the search engines mentioned on the mod_perl home page and they all failed in various interesting ways. If somebody knows what I'm doing wrong, please post.
>
> The engine at mathforum does a great job; it's the best mailing list archive search engine that I've ever seen with regard to searching Perl strings and code in general. Just make sure to use the right options at:
> http://mathforum.org/discussions/epi-search/modperl.html

--
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
Re: search engine module?
Is the engine used at the math forum publicly available?

Once upon a time Stas Bekman shaped the electrons to say...
> The engine at mathforum does a great job; it's the best mailing list archive search engine that I've ever seen with regard to searching Perl strings and code in general. Just make sure to use the right options at:
> http://mathforum.org/discussions/epi-search/modperl.html

-D
--
Zim: I am the neighbourhood baby inspector. I have come to inspect the baby.
Mother: Oh, goodness! Inspect him for what?
Zim: YOUR RESISTANCE WILL BE NOTED!
Re: search engine module?
Hi all,

On Mon, 15 Oct 2001, Ask Bjoern Hansen wrote:
> On Fri, 12 Oct 2001, Perrin Harkins wrote:
>> [...]
>> Plus lots of other stuff like Glimpse and Swish which interface to C-based engines.
>
> I've had good luck with http://swish-e.org/2.2/

Please make sure that it's possible to do a plain ordinary literal text string search. Nothing fancy, no case-folding, no automatic removal of punctuation, nothing like that. Just a literal string. Last night I tried to find perl -V on all the search engines mentioned on the mod_perl home page and they all failed in various interesting ways. If somebody knows what I'm doing wrong, please post.

73,
Ged.
RE: search engine module?
-----Original Message-----
From: Ged Haywood [mailto:[EMAIL PROTECTED]]

> Hi all,
>
> On Mon, 15 Oct 2001, Ask Bjoern Hansen wrote:
>> On Fri, 12 Oct 2001, Perrin Harkins wrote:
>>> [...]
>>> Plus lots of other stuff like Glimpse and Swish which interface to C-based engines.
>>
>> I've had good luck with http://swish-e.org/2.2/
>
> Please make sure that it's possible to do a plain ordinary literal text string search. Nothing fancy, no case-folding, no automatic removal of punctuation, nothing like that. Just a literal string. Last night I tried to find perl -V on all the search engines mentioned on the mod_perl home page and they all failed in various interesting ways. If somebody knows what I'm doing wrong, please post.

I've written an RDBMS-backed search engine that could do such queries, but it gets expensive after a while, as you have to do table scans of the full text of every page in your DBMS to find the match. One thing I could have done would be to split up the match so that it first tried to match perl and -V through the word index, and then ran the full text search on only that subset of results. Never got around to doing that though.

I suspect most search engines (with the possible exception of Google) are in the same or a similar boat.

Matt.
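The two-stage idea Matt describes (narrow the candidates by the individual words first, then run the expensive literal scan only on that subset) might look roughly like this. It is a sketch against an in-memory hash, not his RDBMS engine. The corpus, the function names, and the tokenizing rule (whitespace-separated tokens with trailing sentence punctuation dropped) are all assumptions, and a phrase is assumed to start and end on token boundaries.

```perl
use strict;
use warnings;

# Hypothetical corpus standing in for the table of pages.
my %pages = (
    1 => 'Run perl -V to see how perl was built.',
    2 => 'The perl -v switch prints the version.',
    3 => 'Apache configuration notes.',
);

# Whitespace-separated tokens, case preserved, trailing punctuation dropped.
sub tokens {
    my ($text) = @_;
    my @out;
    for my $raw (split ' ', $text) {
        (my $tok = $raw) =~ s/[.,;:!?]+\z//;
        push @out, $tok if length $tok;
    }
    return @out;
}

# Word index: token => { page id => 1 }.
sub build_index {
    my ($pages) = @_;
    my %index;
    for my $id (keys %$pages) {
        $index{$_}{$id} = 1 for tokens($pages->{$id});
    }
    return \%index;
}

# Stage 1: keep pages containing every token of the phrase (index lookups only).
# Stage 2: exact, case-sensitive substring test on the surviving pages.
sub literal_search {
    my ($pages, $index, $phrase) = @_;
    my @tokens = tokens($phrase);
    return () unless @tokens;
    my $first = shift @tokens;
    my @candidates = keys %{ $index->{$first} || {} };
    for my $token (@tokens) {
        my $set = $index->{$token} || {};
        @candidates = grep { $set->{$_} } @candidates;
    }
    return sort grep { index($pages->{$_}, $phrase) >= 0 } @candidates;
}

my $index = build_index(\%pages);
print join(', ', literal_search(\%pages, $index, 'perl -V')), "\n";
```

In a database the first stage would be a lookup in a word table and the second a substring comparison restricted to the ids that stage returned.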
Re: search engine module?
> Please make sure that it's possible to do a plain ordinary literal text string search. Nothing fancy, no case-folding, no automatic removal of punctuation, nothing like that. Just a literal string. Last night I tried to find perl -V on all the search engines mentioned on the mod_perl home page and they all failed in various interesting ways.

The amazingly fast ht://Dig (http://www.htdig.org/) engine can do phrase searching, but I'm not certain how well it does with punctuation.

- Perrin
Re: [OT] search engine module?
At 02:04 PM 10/16/2001 +0100, Ged Haywood wrote:
>>> Plus lots of other stuff like Glimpse and Swish which interface to C-based engines.
>>
>> I've had good luck with http://swish-e.org/2.2/
>
> Please make sure that it's possible to do a plain ordinary literal text string search. Nothing fancy, no case-folding, no automatic removal of punctuation, nothing like that. Just a literal string. Last night I tried to find perl -V on all the search engines mentioned on the mod_perl home page and they all failed in various interesting ways.

I assume it's how the search engine is configured. In Swish, for example, you can define which characters make up a word. I'm not sure what you mean by a literal string. For performance reasons you can't just grep words (or parts of words), so you have to extract words from the text during indexing. You might define that a dash is OK at the start of a word but not at the end, and choose to ignore trailing dots, so that you could find both -V and -V. (at the end of a sentence).

Some search engines let you define a set of buzzwords that should be indexed as-is, but that's more helpful for technical writing than for indexing code.

Finally, in Swish, if you put something like perl -V in quotes to use a phrase search, it will most likely find what you are looking for, even if the dash is not indexed.

Bill Moseley
mailto:[EMAIL PROTECTED]
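To make the word-extraction rule above concrete, here is a small Perl sketch of an indexer-side tokenizer that accepts a dash at the start of a word but not at the end, and ignores trailing dots. It does not use Swish's configuration or API. The function name index_words and the exact character set (letters, digits, underscore, dash, dot) are assumptions chosen for the example.

```perl
use strict;
use warnings;

# Extract index words from text, keeping case as-is.
# Word characters here: \w, "-" and "." (an assumption of this sketch).
sub index_words {
    my ($text) = @_;
    my @words;
    for my $match ($text =~ /([\w.\-]+)/g) {
        my $word = $match;
        $word =~ s/[.\-]+\z//;    # no dash or dot at the end of a word
        $word =~ s/\A\.+//;       # no dot at the start; a leading dash is kept
        push @words, $word if length $word;
    }
    return @words;
}

print join(' | ', index_words('Try perl -V. Then compare it with perl -v')), "\n";
```

With these rules "-V" and "-V." index to the same word, while "-V" and "-v" stay distinct because no case folding is applied.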
Re: search engine module?
Hi,

> I've written a search engine that searches for jobs in a database based on keywords. I'm assembling a string of sql and then submitting it to the database based on the user's search criteria. It's working but is

It sounds like you are writing a web front end for MySQL. I'm not sure about modules on CPAN for that specifically. If you wanted to get a bit more fancy, you might try DBIx::FullTextSearch. This module is nice, though MySQL-specific. It creates an index of your content (even rows in a db), and allows the user to perform boolean searches on that index. Word stemming is also available by installing a separate module which FullTextSearch uses. It's a tad sluggish when the number of rows gets above 40,000, but certainly not unusable.

hth,
matt

> I don't want to reinvent the wheel and I'm sure this has been done a zillion times, so does anyone know of a module in CPAN that I can use for this? I'm using MySQL on the back end and DBI under mod perl which runs as a handler.

--
## Matt J. Avitable ([EMAIL PROTECTED])
## General Partner / Programmer
## Escapement Arts And Media
## http://www.escapement.net/
## Phone: (804) 400-0605
Re: search engine module? [drifting OT DBI related]
Matt J. Avitable wrote:
> Hi,
>
>> I've written a search engine that searches for jobs in a database based on keywords. I'm assembling a string of sql and then submitting it to the database based on the user's search criteria. It's working but is
>
> It sounds like you are writing a web front end for mysql. I'm not sure about modules on cpan about that specifically. If you wanted to get a bit more fancy, you might try DBIx::FullTextSearch.

Thanks. I checked out FullTextSearch on some earlier advice and it's not exactly what I'm after, but quite useful nonetheless. I've started using MySQL's MATCH/AGAINST with fulltext indexes instead, and it is extremely fast (!!), but I am waiting for a feature that's available in MySQL 4.0 (due at the end of this month) that allows you to use +word and -word syntax to specify required or unwanted keywords.

Also, just as an aside, MATCH/AGAINST only works with MyISAM tables, so I've had to convert some of mine from InnoDB at the cost of losing transactions.
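For readers who want to try the +word / -word syntax mentioned above, here is a minimal Perl sketch that builds a MATCH ... AGAINST query in boolean mode together with its single bind value. It assumes a server version that supports IN BOOLEAN MODE (the 4.0 series discussed above) and a FULLTEXT index covering the listed columns. The table name jobs, the columns id, title and description, and the function name are hypothetical.

```perl
use strict;
use warnings;

# Return the SQL and the one bind value for a boolean-mode fulltext search.
# Table and column names are placeholders for this example.
sub boolean_fulltext_query {
    my (%arg) = @_;
    my @terms;
    for my $spec ([required => '+'], [excluded => '-'], [optional => '']) {
        my ($key, $prefix) = @$spec;
        for my $word (@{ $arg{$key} || [] }) {
            # Drop anything that is not a word character so user input
            # cannot smuggle in extra boolean operators.
            (my $clean = $word) =~ s/\W+//g;
            push @terms, $prefix . $clean if length $clean;
        }
    }
    my $sql = 'SELECT id, title FROM jobs'
            . ' WHERE MATCH (title, description) AGAINST (? IN BOOLEAN MODE)';
    return ($sql, join(' ', @terms));
}

my ($sql, $against) = boolean_fulltext_query(
    required => ['perl', 'apache'],
    excluded => ['microsoft'],
);
print "$sql\n$against\n";
```

With DBI the pair can be passed straight through, for example $dbh->selectall_arrayref($sql, undef, $against), so the search string travels as a placeholder value rather than being spliced into the SQL.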
Re: search engine module? [drifting OT DBI related]
Mark Maunder wrote:
> I've started using MySQL's MATCH/AGAINST with fulltext indexes instead, and it is extremely fast (!!), but am waiting for a feature that's available in mysql 4.0 (due end of this month) that allows you to use +word and -word syntax to specify required or unwanted keywords. Also just as an aside, match/against only works with MyISAM tables so I've had to convert some of mine from InnoDB at the cost of losing transactions.

er - lo and behold, MySQL 4.0 alpha was released a few minutes ago by Monty.
http://www.mysql.com/downloads/mysql-4.0.html
Re: search engine module?
On Fri, 12 Oct 2001, Perrin Harkins wrote:
> [...]
> Plus lots of other stuff like Glimpse and Swish which interface to C-based engines.

I've had good luck with http://swish-e.org/2.2/

 - ask

--
ask bjoern hansen, http://ask.netcetera.dk/   !try; do();
more than a billion impressions per week, http://valueclick.com
search engine module?
I've written a search engine that searches for jobs in a database based on keywords. I'm assembling a string of SQL and then submitting it to the database based on the user's search criteria. It's working but is really simple right now: it just does a logical AND with all the keywords the user submits. I'd like to include features like the ability to submit a query like:

    (perl AND apache) OR java NOT microsoft

I don't want to reinvent the wheel and I'm sure this has been done a zillion times, so does anyone know of a module in CPAN that I can use for this? I'm using MySQL on the back end and DBI under mod_perl, which runs as a handler.
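As a rough illustration of what handling a query like the one above involves, here is a small recursive-descent sketch in Perl that turns such a string into a WHERE fragment with placeholders. It is not taken from any CPAN module. The grammar is an assumption (the post does not define one): adjacent terms are ANDed, NOT applies to the term that follows it, and OR binds loosest. The function name and the LIKE-based matching are also made up for the example, and LIKE wildcards inside keywords are not escaped.

```perl
use strict;
use warnings;

# Assumed grammar:
#   expr   := term ( OR term )*
#   term   := factor ( [AND] factor )*     adjacency means AND
#   factor := NOT factor | ( expr ) | word
# Returns the WHERE fragment followed by the bind values, in placeholder order.
# $column is trusted input (it is interpolated into the SQL).
sub query_to_where {
    my ($query, $column) = @_;
    my @tokens = $query =~ /\(|\)|[^\s()]+/g;
    my @bind;
    my ($expr, $term, $factor);

    $factor = sub {
        my $tok = shift @tokens;
        die "unexpected end of query\n" unless defined $tok;
        return 'NOT (' . $factor->() . ')' if $tok eq 'NOT';
        if ($tok eq '(') {
            my $inner = $expr->();
            my $close = shift @tokens;
            die "missing )\n" unless defined $close && $close eq ')';
            return "($inner)";
        }
        die "unexpected '$tok'\n" if $tok =~ /\A(?:\)|AND|OR)\z/;
        push @bind, "%$tok%";
        return "$column LIKE ?";
    };
    $term = sub {
        my @parts = ($factor->());
        while (@tokens && $tokens[0] ne 'OR' && $tokens[0] ne ')') {
            shift @tokens if $tokens[0] eq 'AND';
            push @parts, $factor->();
        }
        return join ' AND ', @parts;
    };
    $expr = sub {
        my @parts = ($term->());
        while (@tokens && $tokens[0] eq 'OR') {
            shift @tokens;
            push @parts, $term->();
        }
        return @parts > 1 ? '(' . join(' OR ', map { "($_)" } @parts) . ')' : $parts[0];
    };

    my $where = $expr->();
    die "unexpected '$tokens[0]'\n" if @tokens;
    return ($where, @bind);
}

my ($where, @values) = query_to_where('(perl AND apache) OR java NOT microsoft', 'keywords');
print "WHERE $where\n";
print "binds: @values\n";
```

The fragment and bind list can then be handed to DBI's prepare and execute; a fulltext index would replace the LIKE scans on anything but a small table.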
Re: search engine module?
> I don't want to reinvent the wheel and I'm sure this has been done a zillion times, so does anyone know of a module in CPAN that I can use for this?

Have you tried searching on http://search.cpan.org/?

    DBIx::FullTextSearch
    DBIx::TextIndex
    Search::InvertedIndex

Plus lots of other stuff like Glimpse and Swish which interface to C-based engines.

- Perrin