Re: [OT] Realizing a search functionality
Thank you. I think I'll go for Lucene.

Marco

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: [OT] Realizing a search functionality
On Friday, September 5, 2003 at 1:20:00 PM, John Turner wrote:

JT> The other tool I've used in the past to
JT> great success is Atomz (http://www.atomz.com). The "trial" is
JT> never-ending, so an index of up to 500 "pages" is free. Pages also =
JT> URL. The nice thing about Atomz is that it will spider your site and
JT> index the content returned, thus it works quite well for dynamic sites.

JT> In other words, it will take a URL like
JT> "http://your.domain.com/content.jsp?id=512&view=full" and index the
JT> content returned from that, not the actual text string of the URL.

I use atomz, because it's free. There are a couple of issues with it:

- the template for the search results is pretty hard to get right.
- because of the spidering, session tracking through the URL is not a good
  idea. It gets up to the limit of 500 *very* quickly, as the session id
  part of the URL makes it think that it's a whole new page. Luckily my web
  site isn't really dependent on sessions, so I was able to get round that
  (but it does mean that I can't use the struts rewriting tags...).

Otherwise I'm very happy with atomz.

--
Louise Pryor
http://www.louisepryor.com
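To make the session-id problem above concrete: servlet containers such as Tomcat rewrite URLs by appending a `;jsessionid=...` path parameter, so the same page reached in two sessions looks like two different URLs to a spider. The sketch below is only an illustration of that effect and of one way a crawler you control could normalise such URLs before counting pages. It is not something Atomz is known to offer, and the class name `SessionIdStripper` and its methods are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class SessionIdStripper {
    // Matches a ";jsessionid=..." path parameter up to the next '?', '#', ';' or '/'.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#;/]*", Pattern.CASE_INSENSITIVE);

    /** Returns the URL with any ";jsessionid=..." path parameter removed. */
    static String strip(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    /** Collapses crawled URLs that differ only in their session id. */
    static Set<String> distinctPages(List<String> urls) {
        Set<String> pages = new LinkedHashSet<>();
        for (String url : urls) {
            pages.add(strip(url));
        }
        return pages;
    }

    public static void main(String[] args) {
        // Hypothetical crawl result: the first two entries are the same page
        // seen in two different sessions.
        List<String> crawled = List.of(
                "http://your.domain.com/content.jsp;jsessionid=A1B2C3?id=512&view=full",
                "http://your.domain.com/content.jsp;jsessionid=D4E5F6?id=512&view=full",
                "http://your.domain.com/content.jsp?id=513&view=full");
        for (String page : distinctPages(crawled)) {
            System.out.println(page);
        }
    }
}
```

With a hosted spider the normalisation is out of your hands, which is why avoiding URL-based session tracking, as described above, is the practical fix.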
Re: [OT] Realizing a search functionality
Ulrich Mayring wrote:
> John Turner wrote:
> > Ulrich Mayring wrote:
> > > I can only recommend Lucene, it is vastly superior to any pre-packaged
> > > search engine, because you do not depend on specific features or
> > > behavior, but can customize everything to your needs.
> >
> > Assuming you have time, money, skills, etc. to do so, which is not
> > always the case.
>
> Skills is the key issue. It took me all of one week to write our own
> custom search engine and I doubt that anyone would be able to install and
> configure a third-party product any faster than that. I had no prior
> exposure to Lucene, but of course knew my way around Java.

Hmmm...I had Atomz working for several clients by lunch one day. ;)

I'm not arguing, just emphasizing that some of us are not Java developers.
Granted, the question was somewhat in a context of "using Java" and not
"using Tomcat", but not every Tomcat user is a developer.

John
Re: [OT] Realizing a search functionality
John Turner wrote:
> Ulrich Mayring wrote:
> > I can only recommend Lucene, it is vastly superior to any pre-packaged
> > search engine, because you do not depend on specific features or
> > behavior, but can customize everything to your needs.
>
> Assuming you have time, money, skills, etc. to do so, which is not always
> the case.

Skills is the key issue. It took me all of one week to write our own custom
search engine and I doubt that anyone would be able to install and configure
a third-party product any faster than that. I had no prior exposure to
Lucene, but of course knew my way around Java. So, I don't think time and
money are factors here at all.

BTW, the guy who originally wrote Lucene is now developing an open-source
version of Google with major financial backing. So you can see that there is
some serious technology behind Lucene and IMHO it's worth learning.

Ulrich
Re: [OT] Realizing a search functionality
Ulrich Mayring wrote:
> Lucene is not a search engine, but an API for writing a search engine, so
> it can do everything that you can write in Java. By itself it does
> nothing, like the JDK.

Thanks for the clarification.

> I can only recommend Lucene, it is vastly superior to any pre-packaged
> search engine, because you do not depend on specific features or behavior,
> but can customize everything to your needs.

Assuming you have time, money, skills, etc. to do so, which is not always
the case.

John
Re: [OT] Realizing a search functionality
Thanks for the clarification.

John

Tim Funk wrote:
> Lucene indexes "documents". A document is composed of fields and does not
> need to be (and actually is not) a physical file.
Re: [OT] Realizing a search functionality
Lucene indexes "documents". A document is composed of fields and does not
need to be (and actually is not) a physical file.

Take the simplistic example of a site consisting of a single dynamic web
page backed by a database. You would create "documents" based on the
database data, where the db data goes into named fields. Then, when you run
your query, it will return a list of documents. When you iterate through the
documents, you need to pull the appropriate field out of each one to
reconstruct the appropriate URL.

In a nutshell, it can do what you want, but there is a lot of setup work to
construct documents and a lot of work to display results from documents from
queries.

-Tim

John Turner wrote:
> AFAIK, Lucene indexes files. How then, do you index a dynamic site? The
> only files that exist on a dynamic site are source code files. Servlets
> would never be indexed...how then do you index the content returned from
> the servlet? Can Lucene do this?
>
> The Lucene site is pretty sparse in information. Not having worked with
> it, and not knowing every option available when using it, I think there
> might be some other alternatives.
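The document-and-fields idea Tim describes can be sketched without any library at all. The toy class below is deliberately NOT Lucene's API: `ToyFieldIndex`, `Doc`, `add`, `search` and `urlFor` are hypothetical names, and the "index" is a plain in-memory map. It only shows the shape of the work: build documents from database-like rows, look a term up in a named field, and rebuild the URL from a stored field of each hit. The URL pattern is borrowed from John's example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class ToyFieldIndex {
    /** A "document" is just a bag of named fields; no physical file is involved. */
    static final class Doc {
        final Map<String, String> fields = new LinkedHashMap<>();

        Doc field(String name, String value) {
            fields.put(name, value);
            return this;
        }

        String get(String name) {
            return fields.get(name);
        }
    }

    private final List<Doc> docs = new ArrayList<>();
    // Maps "field:term" to the positions (in docs) of the documents containing it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Splits every field into lower-cased terms and records where they occur. */
    void add(Doc doc) {
        int id = docs.size();
        docs.add(doc);
        for (Map.Entry<String, String> f : doc.fields.entrySet()) {
            for (String term : f.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(f.getKey() + ":" + term,
                            k -> new LinkedHashSet<>()).add(id);
                }
            }
        }
    }

    /** Returns the documents whose given field contains the given single term. */
    List<Doc> search(String field, String term) {
        List<Doc> hits = new ArrayList<>();
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        for (int id : postings.getOrDefault(key, Set.of())) {
            hits.add(docs.get(id));
        }
        return hits;
    }

    /** Rebuilds the page URL from the stored "id" field (URL pattern is an assumption). */
    static String urlFor(Doc hit) {
        return "http://your.domain.com/content.jsp?id=" + hit.get("id") + "&view=full";
    }

    public static void main(String[] args) {
        ToyFieldIndex index = new ToyFieldIndex();
        // Pretend these two rows came from the database behind content.jsp.
        index.add(new Doc().field("id", "512").field("title", "Tomcat clustering")
                .field("body", "How to configure session replication"));
        index.add(new Doc().field("id", "513").field("title", "Search with Lucene")
                .field("body", "Indexing database rows as documents"));
        for (Doc hit : index.search("body", "documents")) {
            System.out.println(hit.get("title") + " -> " + urlFor(hit));
        }
    }
}
```

A real Lucene setup replaces the map with an on-disk index and adds analysis, scoring and query parsing, but the two chores Tim mentions (constructing documents, turning hits back into links) stay with the application.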
Re: [OT] Realizing a search functionality
John Turner wrote:
> AFAIK, Lucene indexes files. How then, do you index a dynamic site? The
> only files that exist on a dynamic site are source code files. Servlets
> would never be indexed...how then do you index the content returned from
> the servlet? Can Lucene do this?

Lucene is not a search engine, but an API for writing a search engine, so it
can do everything that you can write in Java. By itself it does nothing,
like the JDK.

In my case I've implemented a search engine that gets local files and hands
them to the Lucene Indexer, but that could also be implemented so that it
retrieves files via HTTP.

I can only recommend Lucene, it is vastly superior to any pre-packaged
search engine, because you do not depend on specific features or behavior,
but can customize everything to your needs.

Ulrich
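Ulrich does not show his code, so the following is only a guess at the general shape of "get local files and hand them to an indexer", using the standard library alone. The `LocalFileFeeder` class and its `feed` method are hypothetical, and the `BiConsumer` parameter merely stands in for whatever indexing call a real application would make. Reading every file as UTF-8 is an assumption as well.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class LocalFileFeeder {
    /**
     * Walks the tree under root, reads each regular file whose name ends with
     * suffix, and passes (path, text) to the indexer callback.
     * Returns the number of files handed over.
     */
    static int feed(Path root, String suffix, BiConsumer<Path, String> indexer)
            throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(suffix))
                    .sorted()
                    .collect(Collectors.toList());
        }
        for (Path file : files) {
            // Assumes UTF-8 content; a real site may need charset detection.
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            indexer.accept(file, text);
        }
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // The lambda is a placeholder for the real indexing step.
        int n = feed(root, ".html",
                (path, text) -> System.out.println(path + ": " + text.length() + " chars"));
        System.out.println(n + " file(s) handed to the indexer");
    }
}
```

Fetching over HTTP instead, as Ulrich suggests is possible, would change only where the text comes from; the callback would stay the same.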
Re: [OT] Realizing a search functionality
AFAIK, Lucene indexes files. How then, do you index a dynamic site? The only
files that exist on a dynamic site are source code files. Servlets would
never be indexed...how then do you index the content returned from the
servlet? Can Lucene do this?

The Lucene site is pretty sparse in information. Not having worked with it,
and not knowing every option available when using it, I think there might be
some other alternatives. I've used Verity in the past, but that is a
commercial product. The other tool I've used in the past to great success is
Atomz (http://www.atomz.com). The "trial" is never-ending, so an index of up
to 500 "pages" is free. Pages also = URL. The nice thing about Atomz is that
it will spider your site and index the content returned, thus it works quite
well for dynamic sites.

In other words, it will take a URL like
"http://your.domain.com/content.jsp?id=512&view=full" and index the content
returned from that, not the actual text string of the URL.

The only requirement is that you display the Atomz logo on the search
results page. You can pay a small annual fee to have that removed. All
indexes and collections are kept on the Atomz site, not yours, and you can
define the stylesheet and template that is used to display the search
results, as well as define the frequency of indexing.

John

Schalk wrote:
> You may want to have a look at Lucene (open-source Jakarta project) at:
> http://jakarta.apache.org/lucene/docs/index.html
RE: [OT] Realizing a search functionality
Marco

You may want to have a look at Lucene (open-source Jakarta project) at:
http://jakarta.apache.org/lucene/docs/index.html

Kind Regards
Schalk Neethling
Volume4.Development.Multimedia.Branding
emotionalize.conceptualize.visualize.realize
Tel: +27125468436
Fax: +27125468436
email:[EMAIL PROTECTED]
web: www.volume4.co.za

:: -----Original Message-----
:: From: Marco Tedone [mailto:[EMAIL PROTECTED]
:: Sent: Friday, September 05, 2003 12:32 AM
:: To: Tomcat Users List
:: Subject: [OT] Realizing a search functionality
::
:: Could you please point me towards some good software (possibly
:: open-source, possibly Jakarta, possibly Java-oriented) and patterns to
:: use to realize a search functionality?
Re: [OT] Realizing a search functionality
Sorry, I found Jakarta Lucene. I'll work on it :)

Marco

----- Original Message -----
From: "Marco Tedone" <[EMAIL PROTECTED]>
To: "Tomcat Users List" <[EMAIL PROTECTED]>
Sent: Thursday, September 04, 2003 11:32 PM
Subject: [OT] Realizing a search functionality

> Hi, I must admit that I don't know anything about how to realize a search
> functionality. The only thing that I know is that most sites have a search
> functionality which, when searching for something, returns a list of links
> more or less related to the search string.
>
> The only things I know are:
>
> 1) An index of the web site contents should be created somehow
> 2) The search 'action' (I'm talking in Struts terms, but I think it could
>    be anything) should interact with this index to match the required
>    string
> 3) A list (what form does it take?) containing all the links related to
>    the query string should be created, eventually read and displayed to
>    the client
>
> Did any of you successfully realize a search functionality on your site?
> Could you please point me towards some good software (possibly
> open-source, possibly Jakarta, possibly Java-oriented) and patterns to use
> to realize a search functionality?
>
> Many thanks,
>
> Marco