Indexing Dynamically Generated Web Pages

2002-02-20 Thread Paul Friedman

Hello Lucene Users,

I am a novice developer researching Lucene for use on a web site that
primarily uses JSPs.

How do you index dynamically generated web pages with Lucene?
Or is it even possible?

The JSPs themselves don't have searchable data, only methods to get that
data.
When parsing these files for indexing, the necessary data for the search
would not yet be in the page.
The data upon which to search is generated only after certain parameters are
passed to the JSP ( resulting in the HTML output ).

Thanks in advance.

pax et bonum. p.


_
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]




Re: Indexing Dynamically Generated Web Pages

2002-02-20 Thread Steven J. Owens

Paul Friedman [EMAIL PROTECTED] asks:
 I am a novice developer researching Lucene for use on a web site
 that primarily uses JSPs.  How do you index dynamically generated
 web pages with Lucene?  Or is it even possible?
 The JSPs themselves don't have searchable data, only methods to get
 that data.  When parsing these files for indexing, the necessary
 data for the search would not yet be in the page.

 Lucene doesn't do any crawling for you - it's your responsibility
to write the code that crawls your website, i.e. selects the data
sources to be indexed, chooses how to convert them to Lucene
Documents, and adds them to the indexes.  I suspect you're assuming
you would use the demo application.  That would not be appropriate for
your project.

 Instead, what you should do is:

1) write some java code that walks through your data source - i.e. if
your data source is a database, it would select each row in the
appropriate table - and composes a Lucene Document, and adds it to
your index.

2) write some servlets/jsps that do the search, and when it gets to
the point where you would redirect the user's browser off to the
appropriate HTML page, you either:

 a) invoke the appropriate JSP in your site with appropriate
arguments

 or

 b) compose a page dynamically, roughly the same way the JSP page
would compose it, and return it to the user.

 or

 c) redirect the user to a new JSP page, which you will have to
write, which looks much like your old JSP page, but is
designed to be invoked with an argument specifying what
data to load.

 You may find http://darksleep.com/puff/lucene/lucene.html useful
in further clarifying this.  There is also a getting started article
at http://jakarta.apache.org/lucene/docs/index.html that may be of
some use.

Steven J. Owens
[EMAIL PROTECTED]

--
To unsubscribe, e-mail:   mailto:[EMAIL PROTECTED]
For additional commands, e-mail: mailto:[EMAIL PROTECTED]