Re: [OT] Realizing a search functionality

2003-09-06 Thread Marco Tedone
Thank you. I think I'll go for Lucene.

Marco
----- Original Message -----
From: "John Turner" <[EMAIL PROTECTED]>
To: "Tomcat Users List" <[EMAIL PROTECTED]>
Sent: Friday, September 05, 2003 1:20 PM
Subject: Re: [OT] Realizing a search functionality


>
> AFAIK, Lucene indexes files.  How then, do you index a dynamic site?
> The only files that exist on a dynamic site are source code files.
> Servlets would never be indexed...how then do you index the content
> returned from the servlet?  Can Lucene do this?
>
> The Lucene site is pretty sparse in information.  Not having worked with
> it, and not knowing every option available when using it, I think there
> might be some other alternatives.  I've used Verity in the past, but
> that is a commercial product.  The other tool I've used in the past to
> great success is Atomz (http://www.atomz.com).  The "trial" is
> never-ending, so an index of up to 500 "pages" is free.  Pages also =
> URL.  The nice thing about Atomz is that it will spider your site and
> index the content returned, thus it works quite well for dynamic sites.
>
> In other words, it will take a URL like
> "http://your.domain.com/content.jsp?id=512&view=full" and index the
> content returned from that, not the actual text string of the URL.
>
> The only requirement is that you display the Atomz logo on the search
> results page.  You can pay a small annual fee to have that removed.  All
> indexes and collections are kept on the Atomz site, not yours, and you
> can define the stylesheet and template that is used to display the
> search results, as well as define the frequency of indexing.
>
> John
>




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [OT] Realizing a search functionality

2003-09-05 Thread Louise Pryor

On Friday, September 5, 2003 at 1:20:00 PM, John Turner wrote:


JT>   The other tool I've used in the past to
JT> great success is Atomz (http://www.atomz.com).  The "trial" is 
JT> never-ending, so an index of up to 500 "pages" is free.  Pages also = 
JT> URL.  The nice thing about Atomz is that it will spider your site and 
JT> index the content returned, thus it works quite well for dynamic sites.

JT> In other words, it will take a URL like
JT> "http://your.domain.com/content.jsp?id=512&view=full" and index the
JT> content returned from that, not the actual text string of the URL.



I use Atomz, because it's free. There are a couple of issues with it:

- The template for the search results is pretty hard to get right.
- Because of the spidering, session tracking through the URL is not a
  good idea. The index reaches the limit of 500 pages *very* quickly,
  as the session id part of the URL makes the spider think that each
  URL is a whole new page. Luckily my web site isn't really dependent
  on sessions, so I was able to get round that (but it does mean that
  I can't use the Struts rewriting tags...).
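
To make the session id problem concrete, here is a minimal sketch. It
assumes the container appends the session as a ";jsessionid=..." path
parameter, which is how Tomcat's URL rewriting encodes it. The class and
method names are hypothetical, and nothing here is an Atomz feature; it
only shows why two rewritten URLs are really the same page.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper: collapses URLs that differ only in their session id.
final class SessionUrlNormalizer {

    // Matches the ";jsessionid=..." path parameter up to the query string or fragment.
    private static final Pattern JSESSIONID =
            Pattern.compile(";jsessionid=[^?#]*", Pattern.CASE_INSENSITIVE);

    // Returns the URL with any session id path parameter removed.
    static String stripSessionId(String url) {
        return JSESSIONID.matcher(url).replaceAll("");
    }

    // Counts how many distinct pages remain once session ids are ignored.
    static int countDistinctPages(Iterable<String> urls) {
        Set<String> seen = new LinkedHashSet<>();
        for (String url : urls) {
            seen.add(stripSessionId(url));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Example URLs are made up; only the ";jsessionid=" shape matters.
        String a = "http://your.domain.com/content.jsp;jsessionid=A1B2C3?id=512&view=full";
        String b = "http://your.domain.com/content.jsp;jsessionid=D4E5F6?id=512&view=full";
        System.out.println(stripSessionId(a));
        System.out.println(countDistinctPages(Arrays.asList(a, b)));
    }
}
```

A spider that compares raw URLs would count the two example URLs as two
pages against the 500-page limit; after normalization they are one.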

Otherwise I'm very happy with Atomz.

-- 
Louise Pryor
http://www.louisepryor.com






Re: [OT] Realizing a search functionality

2003-09-05 Thread John Turner
Ulrich Mayring wrote:
> John Turner wrote:
> > Ulrich Mayring wrote:
> > > I can only recommend Lucene, it is vastly superior to any
> > > pre-packaged search engine, because you do not depend on specific
> > > features or behavior, but can customize everything to your needs.
> >
> > Assuming you have time, money, skills, etc. to do so, which is not
> > always the case.
>
> Skills is the key issue. It took me all of one week to write our own
> custom search engine and I doubt that anyone would be able to install
> and configure a third-party product any faster than that. I had no prior
> exposure to Lucene, but of course knew my way around Java.

Hmmm...I had Atomz working for several clients by lunch one day. ;) I'm
not arguing, just emphasizing that some of us are not Java developers.
Granted, the question was somewhat in a context of "using Java" and not
"using Tomcat", but not every Tomcat user is a developer.

John





Re: [OT] Realizing a search functionality

2003-09-05 Thread Ulrich Mayring
John Turner wrote:
> Ulrich Mayring wrote:
> > I can only recommend Lucene, it is vastly superior to any pre-packaged
> > search engine, because you do not depend on specific features or
> > behavior, but can customize everything to your needs.
>
> Assuming you have time, money, skills, etc. to do so, which is not
> always the case.

Skills are the key issue. It took me all of one week to write our own
custom search engine, and I doubt that anyone would be able to install
and configure a third-party product any faster than that. I had no prior
exposure to Lucene, but of course knew my way around Java.

So, I don't think time and money are factors here at all. BTW, the guy
who originally wrote Lucene is now developing an open-source version of
Google with major financial backing. So you can see that there is some
serious technology behind Lucene, and IMHO it's worth learning.

Ulrich





Re: [OT] Realizing a search functionality

2003-09-05 Thread John Turner
Ulrich Mayring wrote:
> Lucene is not a search engine, but an API for writing a search engine,
> so it can do everything that you can write in Java. By itself it does
> nothing, like the JDK.

Thanks for the clarification.

> I can only recommend Lucene, it is vastly superior to any pre-packaged
> search engine, because you do not depend on specific features or
> behavior, but can customize everything to your needs.

Assuming you have time, money, skills, etc. to do so, which is not
always the case.

John





Re: [OT] Realizing a search functionality

2003-09-05 Thread John Turner
Thanks for the clarification.

John

Tim Funk wrote:
> Lucene indexes "documents". A document is composed of fields and does
> not need to be (and actually is not) a physical file.
>
> Take the simple example of a site consisting of a single dynamic web
> page backed by a database. You would create "documents" from the
> database data, with the db data going into named fields. When you run
> a query, it returns a list of documents. As you iterate through the
> documents, you pull the appropriate field out of each one to
> reconstruct the appropriate URL.
>
> In a nutshell, it can do what you want, but there is a lot of setup
> work to construct the documents and a lot of work to display the
> documents that queries return.
>
> -Tim



Re: [OT] Realizing a search functionality

2003-09-05 Thread Tim Funk
Lucene indexes "documents". A document is composed of fields and does not
need to be (and actually is not) a physical file.

Take the simple example of a site consisting of a single dynamic web page
backed by a database. You would create "documents" from the database data,
with the db data going into named fields. When you run a query, it returns
a list of documents. As you iterate through the documents, you pull the
appropriate field out of each one to reconstruct the appropriate URL.

In a nutshell, it can do what you want, but there is a lot of setup work to
construct the documents and a lot of work to display the documents that
queries return.

-Tim
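
The "document made of named fields" idea described above can be sketched
in plain Java. This is deliberately not the Lucene API: the class, the
field names ("id", "title", "body") and the URL pattern are all
assumptions made for illustration, and the search is a naive scan rather
than a real index. It only shows the flow of database row to document,
query to list of documents, and document field to URL.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java stand-in for documents with named fields.
// None of these types are Lucene classes; all names are hypothetical.
final class FieldDocumentSketch {

    // A "document" is just a bag of named fields built from one database row.
    static Map<String, String> documentFromRow(int id, String title, String body) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", Integer.toString(id));
        doc.put("title", title);
        doc.put("body", body);
        return doc;
    }

    // Naive linear scan: returns the documents whose title or body contains the term.
    // A real index would avoid looking at every document.
    static List<Map<String, String>> search(List<Map<String, String>> docs, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            String text = (doc.get("title") + " " + doc.get("body")).toLowerCase(Locale.ROOT);
            if (text.contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Rebuilds the page URL from the stored "id" field (the URL pattern is assumed).
    static String urlFor(Map<String, String> doc) {
        return "http://your.domain.com/content.jsp?id=" + doc.get("id") + "&view=full";
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(documentFromRow(512, "Tomcat clustering", "Notes on session replication"));
        docs.add(documentFromRow(513, "Struts actions", "Writing a search action"));
        for (Map<String, String> hit : search(docs, "search")) {
            System.out.println(hit.get("title") + " -> " + urlFor(hit));
        }
    }
}
```

The point of storing the id as a field is the last step: the search
result is a document, not a page, so the link shown to the user has to
be rebuilt from whatever was put into the document at indexing time.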

John Turner wrote:
> AFAIK, Lucene indexes files.  How then, do you index a dynamic site? The
> only files that exist on a dynamic site are source code files. Servlets
> would never be indexed...how then do you index the content returned from
> the servlet?  Can Lucene do this?
>
> The Lucene site is pretty sparse in information.  Not having worked with
> it, and not knowing every option available when using it, I think there
> might be some other alternatives.
>
> John





Re: [OT] Realizing a search functionality

2003-09-05 Thread Ulrich Mayring
John Turner wrote:
> AFAIK, Lucene indexes files.  How then, do you index a dynamic site? The
> only files that exist on a dynamic site are source code files. Servlets
> would never be indexed...how then do you index the content returned from
> the servlet?  Can Lucene do this?

Lucene is not a search engine, but an API for writing a search engine,
so it can do everything that you can write in Java. By itself it does
nothing, like the JDK.

In my case I've implemented a search engine that gets local files and 
hands them to the Lucene Indexer, but that could also be implemented so 
that it retrieves files via HTTP.
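
Retrieving pages via HTTP instead of from local files could look roughly
like the sketch below. It is an illustration only, not the search engine
described in this message: the class and method names are hypothetical,
the tag stripping is a crude regex where a real crawler would use an
HTML parser, and the hand-off to the indexer is left out. It uses the
JDK's own java.net.http client (Java 11 or later).

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical fetch step: gets the rendered page over HTTP rather than a local file.
final class PageFetcher {

    // Downloads the response body for a URL as a string.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    // Crude tag stripping: replaces tags with spaces and collapses whitespace.
    static String toPlainText(String html) {
        return html.replaceAll("<[^>]*>", " ").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) throws Exception {
        // Pass a URL on the command line; the printed text is what an indexer would receive.
        if (args.length == 1) {
            System.out.println(toPlainText(fetch(args[0])));
        }
    }
}
```

Because the text comes from the HTTP response, it is the output of the
servlet or JSP that gets indexed, not its source code.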

I can only recommend Lucene. It is vastly superior to any pre-packaged
search engine, because you do not depend on specific features or
behavior, but can customize everything to your needs.

Ulrich





Re: [OT] Realizing a search functionality

2003-09-05 Thread John Turner
AFAIK, Lucene indexes files.  How then, do you index a dynamic site? 
The only files that exist on a dynamic site are source code files. 
Servlets would never be indexed...how then do you index the content 
returned from the servlet?  Can Lucene do this?

The Lucene site is pretty sparse in information.  Not having worked with 
it, and not knowing every option available when using it, I think there 
might be some other alternatives.  I've used Verity in the past, but 
that is a commercial product.  The other tool I've used in the past to 
great success is Atomz (http://www.atomz.com).  The "trial" is
never-ending, so an index of up to 500 "pages" is free; each URL counts
as one page.  The nice thing about Atomz is that it will spider your
site and index the content returned, so it works quite well for dynamic
sites.

In other words, it will take a URL like
"http://your.domain.com/content.jsp?id=512&view=full" and index the
content returned from that, not the actual text string of the URL.

The only requirement is that you display the Atomz logo on the search 
results page.  You can pay a small annual fee to have that removed.  All 
indexes and collections are kept on the Atomz site, not yours, and you 
can define the stylesheet and template that is used to display the 
search results, as well as define the frequency of indexing.

John

Schalk wrote:
> Marco
>
> You may want to have a look at Lucene (open-source Jakarta project) at:
> http://jakarta.apache.org/lucene/docs/index.html


RE: [OT] Realizing a search functionality

2003-09-04 Thread Schalk
Marco

You may want to have a look at Lucene (open-source Jakarta project) at:
http://jakarta.apache.org/lucene/docs/index.html

Kind Regards
Schalk Neethling
Volume4.Development.Multimedia.Branding
emotionalize.conceptualize.visualize.realize
Tel: +27125468436
Fax: +27125468436
email:[EMAIL PROTECTED]
web: www.volume4.co.za
 




Re: [OT] Realizing a search functionality

2003-09-04 Thread Marco Tedone
Sorry... I found Jakarta Lucene... I'll work on it :)

Marco
----- Original Message -----
From: "Marco Tedone" <[EMAIL PROTECTED]>
To: "Tomcat Users List" <[EMAIL PROTECTED]>
Sent: Thursday, September 04, 2003 11:32 PM
Subject: [OT] Realizing a search functionality


> Hi, I must admit that I don't know anything about how to implement a
> search function. The only thing I know is that most sites have a
> search function which, given a search string, returns a list of links
> more or less related to it.
>
> The only things I know are:
>
> 1) An index of the web site contents should be created somehow
> 2) The search 'action' (I'm talking in Struts terms, but I think it
> could be anything) should interact with this index to match the
> required string
> 3) A list (what form does it take?) containing all the links related
> to the query string should be created, eventually read and displayed
> to the client
>
> Has any of you successfully implemented a search function on your
> site? Could you please point me towards some good software (possibly
> open-source, possibly Jakarta, possibly Java-oriented) and patterns to
> use to implement a search function?
>
> Many thanks,
>
> Marco
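
The three steps listed in the question map onto a very small data
structure, an inverted index from words to links. The sketch below is a
toy under stated assumptions: the class name, the example links and the
single-word query are all made up for illustration, and it is not how
Lucene or any other product mentioned in this thread is implemented.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: maps each word to the set of links whose text contains it.
final class TinySiteIndex {

    private final Map<String, Set<String>> linksByWord = new HashMap<>();

    // Step 1: index a page's text under its link.
    void add(String link, String text) {
        for (String word : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!word.isEmpty()) {
                linksByWord.computeIfAbsent(word, w -> new TreeSet<>()).add(link);
            }
        }
    }

    // Steps 2 and 3: look up one query word and return the matching links, sorted.
    Set<String> search(String word) {
        Set<String> links = linksByWord.get(word.toLowerCase(Locale.ROOT));
        return links == null
                ? Collections.<String>emptySet()
                : Collections.unmodifiableSet(links);
    }

    public static void main(String[] args) {
        TinySiteIndex index = new TinySiteIndex();
        // Hypothetical pages; in a Struts application the action would call search().
        index.add("/docs/install.html", "Installing Tomcat on Linux");
        index.add("/docs/search.html", "Adding search to a Tomcat site");
        System.out.println(index.search("tomcat"));
    }
}
```

The returned set of links is the "list" from step 3; a search action
would put it in the request and let the view render it.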