Dear Karsten:
Sorry for the multiple posts, but I have made some progress. To search
multiple fields, I believe I should use the MultiFieldQueryParser class
and pass it a String array containing the fields I wish to search. My
follow-up question is this: how do I highlight the results returned from
a MultiFieldQueryParser query?
At the moment, my Searcher code looks like this:
    List searchResult = new ArrayList();
    Directory fsDir = FSDirectory.getDirectory(indexDir);
    IndexSearcher is = new IndexSearcher(fsDir);
    String[] fields = {"SCENE-COMMENTARY", "LINES"};
    Analyzer analyser = new StandardAnalyzer();
    // Despite its name, "parser" holds the parsed Query, not the parser itself.
    Query parser = new MultiFieldQueryParser(fields, analyser).parse(q);
    //parser.setAllowLeadingWildcard(true);
    long start = new Date().getTime();
    Hits hits = is.search(parser);
    long end = new Date().getTime();
    QueryScorer scorer = new QueryScorer(parser);
    SimpleHTMLFormatter formatter = new SimpleHTMLFormatter("", "");
    // "highlighter" returns the whole field text as one fragment (NullFragmenter);
    // "high" returns a short excerpt (SimpleFragmenter with size 250).
    Highlighter highlighter = new Highlighter(formatter, scorer);
    //Highlighter highlighter = new Highlighter(scorer);
    Highlighter high = new Highlighter(formatter, scorer);
    //Highlighter high = new Highlighter(scorer);
    Fragmenter fragmenter = new NullFragmenter();
    Fragmenter fragment = new SimpleFragmenter(250);
    highlighter.setTextFragmenter(fragmenter);
    high.setTextFragmenter(fragment);
    for (int i = 0; i < hits.length(); i++) {
        Document doc = hits.doc(i);
        String com = doc.get("SCENE-COMMENTARY");
        String lns = doc.get("LINES");
        //String spkr = doc.get("SPEAKER");
        TokenStream lines = analyser.tokenStream("LINES", new StringReader(lns));
        // Cache the tokens so the same stream can be replayed for the second highlighter.
        CachingTokenFilter filter = new CachingTokenFilter(lines);
        //TokenStream speaker = analyser.tokenStream("SPEAKER", new StringReader(spkr));
        String highlightedLines = highlighter.getBestFragment(filter, lns);
        filter.reset();
        String highlight = high.getBestFragment(filter, lns);
        SearchResult resultBean = new SearchResult();
        resultBean.setReference(hits.doc(i).get("REFERENCE"));
        resultBean.setNarrator(hits.doc(i).get("SPEAKER"));
        resultBean.setHitResult(highlight);
        resultBean.setQuote(highlightedLines);
        searchResult.add(resultBean);
        System.out.println(resultBean.getReference());
        System.out.println(resultBean.getNarrator());
        System.out.println(resultBean.getHitResult());
        System.out.println("");
        System.out.println(resultBean.getQuote());
        System.out.println("");
    }
    System.err.println("Found " + hits.length() + " document(s)(in "
            + (end-start) + " milliseconds) that matched query '" + q + "':");
    return searchResult;
}
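
A note on the loop above: it reads SCENE-COMMENTARY into "com" but only
ever highlights LINES, and Document.get() returns null for a field that a
document does not contain, which new StringReader(lns) would turn into a
NullPointerException. One possible shape for the loop body is to walk the
same "fields" array that was handed to MultiFieldQueryParser and highlight
each field that is present. The sketch below shows only that control flow
and runs without Lucene: the Map stands in for a Lucene Document, the
highlight() helper is a hypothetical stand-in for
Highlighter.getBestFragment(), the class and method names are invented for
illustration, and the sample text is made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PerFieldHighlightSketch {

    // Hypothetical stand-in for Lucene's Highlighter: wraps every
    // case-insensitive occurrence of the term in [ and ].
    // Returns null when the term does not occur in the text.
    public static String highlight(String text, String term) {
        Pattern p = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        if (!m.find()) {
            return null;
        }
        return m.replaceAll("[$0]");
    }

    // Highlights every searched field that the document actually contains.
    // The Map is an assumption standing in for a Lucene Document.
    public static Map<String, String> highlightFields(Map<String, String> doc,
                                                      String[] fields,
                                                      String term) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String field : fields) {
            String text = doc.get(field);
            if (text == null) {
                continue; // this document has no such field, so skip it
            }
            String marked = highlight(text, term);
            // Fall back to the raw text when nothing matched in this field.
            out.put(field, marked != null ? marked : text);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] fields = {"SCENE-COMMENTARY", "LINES"};
        // Made-up document that has a SCENE-COMMENTARY field but no LINES field.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("SCENE-COMMENTARY", "Enter the ghost.");
        System.out.println(highlightFields(doc, fields, "ghost"));
        // prints {SCENE-COMMENTARY=Enter the [ghost].}
    }
}
```

In the real Searcher, the body of the for-each loop would presumably call
analyser.tokenStream(field, new StringReader(text)) and
highlighter.getBestFragment(...) in place of highlight(), keeping the same
null checks.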
Thanks again for all of your help, I do sincerely appreciate it.
Take care.
Fayyaz
Karsten F. wrote:
>
> Hi Fayyaz,
>
> Again, this is about the SAX handler, not about Lucene.
>
> My understanding of what you want:
> 1. one Lucene document for each SPEECH element (already implemented)
> 2. one Lucene document for each SCENE-COMMENTARY element (not implemented
> yet).
>
> correct?
>
> If yes, you can write
>     if(qName.equals("SPEECH") || qName.equals("SCENE-COMMENTARY")){
>         doc=new Document();
>     }
> and
>
>     public void endElement(String uri, String localName, String qName) throws SAXException{
>         ...
>         else if(qName.equals("SCENE-COMMENTARY")){
>             Field lines = new Field(qName, elementBuffer.toString(), Field.Store.YES,
>                     Field.Index.TOKENIZED, Field.TermVector.YES);
>             doc.add(lines);
>         }
>         ...
>         if(qName.equals("SPEECH") || qName.equals("SCENE-COMMENTARY")){
>             indexWriter.addDocument(doc);
>         }
>
> (instead of "indexWriter.addDocument(doc);" in the block of
> if(qName.equals("LINES")){ )
>
>
>
> Best regards
> Karsten
>
> P.S.:
> If you want to learn Java:
> I really like
> http://www.java-hamster-modell.de/
> Possibly there is an English version somewhere?
>
>
> syedfa wrote:
>>
>> I think I understand what you are saying, but I was hoping you could
>> clarify a little further. In the startElement method, I have the
>> following:
>>
>>     if(qName.equals("SPEECH")){
>>         doc=new Document();
>>     }
>>
>> Are you saying that I should add an identical block of code for
>> <SCENE-COMMENTARY>, and include a similar clause in the
>> endElement method as well? i.e.
>>
>>     else if(qName.equals("SCENE-COMMENTARY")){
>>         Field lines = new Field(qName, elementBuffer.toString(),
>>                 Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.YES);
>>         lines.setBoost(1.0f);
>>         doc.add(lines);
>>         indexWriter.addDocument(doc);
>>     }
>>
>> Does it also matter where in the if/else-if chain I mention the
>> "SCENE-COMMENTARY" tag, i.e. should I mention it first or last, or does
>> the order not matter?
>>
>> Just wondering.
>> Thanks again for your prompt reply.
>> Sincerely;
>> Fayyaz
>>
>> P.S. This is actually a personal project, as I have developed an
>> interest in Information Retrieval and simply wanted to work on a creative
>> project to help me develop my skills. :-)
>>
>
>
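
Karsten's suggestion quoted above (start a new document on SPEECH or
SCENE-COMMENTARY, add it to the index when that same element ends) can be
tried out with the JDK's built-in SAX parser alone. The following is a
minimal sketch under stated assumptions, not the actual indexer: a List of
Maps stands in for the IndexWriter and its Documents, the class name
PlayHandlerSketch and the helper startsDocument() are invented for
illustration, and the sample XML, including the SCENE wrapper element and
the nesting of SPEAKER and LINES inside SPEECH, is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PlayHandlerSketch extends DefaultHandler {

    // Stand-in for the Lucene index: one map of field name -> text per document.
    public final List<Map<String, String>> docs = new ArrayList<>();

    private final StringBuilder elementBuffer = new StringBuilder();
    private Map<String, String> doc; // document currently being built, or null

    // Elements that open (and later close) a document of their own.
    private static boolean startsDocument(String qName) {
        return qName.equals("SPEECH") || qName.equals("SCENE-COMMENTARY");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        elementBuffer.setLength(0); // discard text collected for earlier elements
        if (startsDocument(qName)) {
            doc = new LinkedHashMap<>();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        elementBuffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (doc == null) {
            return; // outside any SPEECH or SCENE-COMMENTARY element
        }
        if (qName.equals("SPEAKER") || qName.equals("LINES")
                || qName.equals("SCENE-COMMENTARY")) {
            doc.put(qName, elementBuffer.toString().trim()); // store as a field
        }
        if (startsDocument(qName)) {
            docs.add(doc); // the real handler would call indexWriter.addDocument(doc)
            doc = null;
        }
    }

    // Parses the XML string and returns the collected stand-in documents.
    public static List<Map<String, String>> parse(String xml) throws Exception {
        PlayHandlerSketch handler = new PlayHandlerSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.docs;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<SCENE>"
                + "<SCENE-COMMENTARY>Enter two players.</SCENE-COMMENTARY>"
                + "<SPEECH><SPEAKER>FIRST</SPEAKER><LINES>Hello there.</LINES></SPEECH>"
                + "</SCENE>";
        for (Map<String, String> d : parse(xml)) {
            System.out.println(d);
        }
    }
}
```

On the question of ordering: the qName.equals(...) tests compare against
distinct element names, so at most one of them can be true for a given
element, and the position of the SCENE-COMMENTARY branch within the
if/else-if chain does not change the result.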
--
View this message in context:
http://www.nabble.com/Creating-an-index-from-an-XML-file-using-Lucene-in-Java-tp18678779p18705179.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.