Interestingly, you have extra spaces when you construct your queries, e.g.
queries[2]= " accountmanagement" has an extra space at the beginning but
when you index the document, there are no spaces. I believe that since
you're indexing the fields UN_TOKENIZED, that the spaces are preserved in
the query (but I'm not entirely clear on this point, so don't take my word
for it completely <G>).
Have you used Luke to examine your index? You can also put parsed form of
the query into Luke and play around with that to see what *should* work.
Google lucene luke and you'll find it right away.
Best
Erick
On 9/12/06, Pramodh Shenoy <[EMAIL PROTECTED]> wrote:
Hi Eric/Usergroup,
I am working on a help content index-search project based on Lucene.
One of my requirements is to search for a particular text in the content
of files from specific directories. When I index the content
Eg. guides/accountmanagement/index.htm and
guides/databasemanagement/index.htm
doc.add(new Field("booktype", "guides", Field.Store.YES,
Field.Index.UN_TOKENIZED))
doc.add(new Field("subtype", "accountmanagement", Field.Store.YES,
Field.Index.UN_TOKENIZED))
doc.add(new Field("subtype", "databasemanagement", Field.Store.YES,
Field.Index.UN_TOKENIZED))
doc.add(new Field("content",
all-content-read-from-html-body-as-a-string, Field.Store.NO,
Field.Index.TOKENIZED))
Now I want to search for all occurrences of "management" in the
"content" field (which already exists in both the above index.htm files
body), in files under subtype/accountmanagement and under subtype/
databasemanagement
Iam creating the query as below:
String [] queries = new String [3];// =new String[4]
String [] fields = new String [3] ];// =new String[4]
BooleanClause.Occur[] flags = new BooleanClause.Occur[3] ];//
=new String[4]
queries[0]= " guides ";
fields[0]=" booktype ";
flags[0] = BooleanClause.Occur.MUST;
queries[1]= " management ";
fields[1]="content";
flags[1] = BooleanClause.Occur.MUST;
/* ######## A ####### */
queries[2]= " accountmanagement databasemanagement ";
fields[2]=" subtype ";
flags[2] = BooleanClause.Occur.MUST;
/* ##### B #######
queries[2]= " accountmanagement";
fields[2]="subtype";
flags[2] = BooleanClause.Occur.MUST;
queries[3]= " databasemanagement ";
fields[3]=" subtype ";
flags[3] = BooleanClause.Occur.MUST;
*/
Query queryObj = null;
//parse the query string
try {
queryObj = MultiFieldQueryParser.parse(queries, fields,
flags, new StandardAnalyzer());
} catch (ParseException exp) { }
With option A , the query generated looks like:
+booktype:guides +content:management +(subtype: accountmanagement
subtype: databasemanagement)
With option B , the query generated looks like:
+booktype:guides +content:management +subtype: accountmanagement
+subtype: databasemanagement
Both return no Hits.!
Any idea how I should create the query. In Lucene In Action, this is
explained as "you can group field selection over several terms using
field:(a b c)". How can I achieve this with the code above ?
Thanks
Pramodh