Hi,

I am a new user of Lucene, so please point me to the
documentation or archives if these issues have been covered before.

I plan to use Lucene in an application with the following (fairly
standard) requirements:
- Index documents that contain a title, author, date and content
- Search for some text across all the fields (this is the common case)
- Give matches in the title field more weight than matches in the
content field
- Provide an option to restrict search to documents within a date range

Given these requirements, what is a good index design with search speed in mind?
Documents will have the fields "title", "author", "date" and "content".
Should I copy the title and author text into the "content" field as well,
so that "search across all fields" becomes a search in the "content"
field alone? If so, how do I give more weight to matches in the "title"
field?

The other option would be to expand a simple query to include searches
across all fields.
For example, expand "abcd" to "title:abcd^4 OR content:abcd". Also, should
the boost for the title field be applied in the query, or is it better to
boost the title field during indexing (is that possible)?
Which of these options will work, and which is more efficient?
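
To make the second option concrete, this is roughly the expansion I have
in mind. It is only a sketch in plain Java that builds the query string;
it makes no Lucene calls, and the class and method names (QueryExpander,
expand) are made up for illustration. It assumes a single-word query and
does not escape any query syntax characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class QueryExpander {
    // Hypothetical helper (not a Lucene API): expands one term into an
    // OR of per-field clauses, appending "^boost" when the boost is not 1.
    public static String expand(String term, Map<String, Integer> fieldBoosts) {
        StringJoiner clauses = new StringJoiner(" OR ");
        for (Map.Entry<String, Integer> entry : fieldBoosts.entrySet()) {
            String clause = entry.getKey() + ":" + term;
            if (entry.getValue() != 1) {
                clause += "^" + entry.getValue();
            }
            clauses.add(clause);
        }
        return clauses.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, Integer> boosts = new LinkedHashMap<>();
        boosts.put("title", 4);
        boosts.put("content", 1);
        System.out.println(expand("abcd", boosts)); // title:abcd^4 OR content:abcd
    }
}
```

The resulting string would then be handed to the query parser. Adding
"author" would just be one more entry in the map.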

For a search limited to a date range, can field values be integers? If
not, is the right approach to encode the date as "YYYYMMDDHHMM" and then
use a filter or a RangeQuery?
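
Here is a sketch of the string encoding I mean. It is plain Java with no
Lucene calls, and the names (DateKey, encode, inRange) are made up for
illustration. The idea is that a fixed-width, zero-padded key sorts as a
string in the same order as the dates it represents, which, as far as I
understand, is what a range comparison on terms would rely on.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DateKey {
    // Fixed-width pattern: 4-digit year, then zero-padded month, day,
    // hour and minute, so string order matches chronological order.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    // Hypothetical helper: turns a date into the value stored in the "date" field.
    public static String encode(LocalDateTime date) {
        return date.format(FORMAT);
    }

    // Inclusive range check using plain string comparison on the keys.
    public static boolean inRange(String key, String from, String to) {
        return key.compareTo(from) >= 0 && key.compareTo(to) <= 0;
    }

    public static void main(String[] args) {
        String from = encode(LocalDateTime.of(2006, 1, 1, 0, 0));
        String to = encode(LocalDateTime.of(2006, 12, 31, 23, 59));
        String doc = encode(LocalDateTime.of(2006, 3, 9, 8, 5));
        System.out.println(doc);                    // 200603090805
        System.out.println(inRange(doc, from, to)); // true
    }
}
```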

Thanks,
Venkat
