Hi! Eric,
I end up purchasing your book "Lucene in Action". I have downloaded
your code samples. I am able to retrieve "result" only some time. Below is the
code I have taken from Search.jhtml in lucene demo. I have 2 problem
a) I am unable to display "result" using
b) When I click on the title to retrieve document I do not see my query
highlighted.
<java>
Searcher searcher = new IndexSearcher(getReader(indexName));
// get query from request
String queryString = request.getParameter("query");
query = QueryParser.parse(queryString, "contents", analyzer);
Hits hits = searcher.search(query);
SimpleHTMLFormatter formatter =
new SimpleHTMLFormatter();
Highlighter highlighter = new Highlighter(formatter, new QueryScorer(query));
highlighter.setTextFragmenter(new SimpleFragmenter(50));
String FIELD_NAME = "contents";
for (int i = start; i < end; i++) { // display the hits
Document doc = hits.doc(i);
String text = hits.doc(i).get(FIELD_NAME);
int maxNumFragmentsRequired = 5;
String fragmentSeparator = "...";
if ( text != null){
TokenStream tokenStream = new
StandardAnalyzer().tokenStream(FIELD_NAME, new java.io.StringReader(text));
String result = highlighter.getBestFragments
(tokenStream,text,maxNumFragmentsRequired,fragmentSeparator);
System.out.println("result=" +result);
}
String title = doc.get("title");
if (title.equals("")) // use url for docs w/o title
title = doc.get("path");
</java>
<p><b><java type=print>(int)(hits.score(i) * 100.0f)</java>%
<a href="`doc.get("path")`">
<java type=print>Entities.encode(title)</java>
</b></a>
<java>
if (showSummaries) { // maybe show summary
</java>
<ul><i>Summary</i>:
<java type=print>Entities.encode(doc.get("summary"))</java>
</ul>
<java>
}
}
</java>
-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 31, 2005 8:04 PM
To: [email protected]
Subject: Re: HTML pages highlighter
On Mar 31, 2005, at 6:36 PM, Yagnesh Shah wrote:
> try {
> fis = new FileInputStream(f);
> HTMLParser parser = new HTMLParser(fis);
>
> // Add the tag-stripped contents as a Reader-valued Text field
> so it will
> // get tokenized and indexed.
> // doc.add(new Field("contents", parser.getReader()));
> LineNumberReader reader = new
> LineNumberReader(parser.getReader());
> for (String l = reader.readLine(); l != null; l =
> reader.readLine())
> // System.out.println(l);
> doc.add(Field.Text("contents", l));
Notice that your loop here is adding a "contents" field for *every*
line read since that is where the first semi-colon is.
Look at using Luke to explore your index. Try indexing just a dummy
String:
doc.add(Field.Text("contents", "some dummy text"));
to show that it works. Always always always simplify a complicated
situation by doing the most obvious thing that _should_ work.
Also, the demo Lucene code is not really designed to be used in a
production application (sadly), so you're better off borrowing code
from the many articles or our book to begin with.
Erik
>
> // Add the summary as a field that is stored and returned with
> // hit documents for display.
> doc.add(new Field("summary", parser.getSummary(),
> Field.Store.YES, Field.Index.NO));
>
> // Add the title as a field that it can be searched and that is
> stored.
> doc.add(new Field("title", parser.getTitle(), Field.Store.YES,
> Field.Index.TOKENIZED));
> }
>
>
>
> -----Original Message-----
> From: Erik Hatcher [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 30, 2005 7:38 PM
> To: [email protected]
> Subject: Re: HTML pages highlighter
>
>
>
> On Mar 30, 2005, at 4:46 PM, Yagnesh Shah wrote:
>
>> Hi! Eric,
>
> Erik - with a 'k' - Sorry, I let it slide once though :)
>
>> I try to modified that with this but I get compile error. Do you have
>> any code snippet of highlighting code to pull the contents from the
>> original source?
>
> I have a whole book full of code examples :)
> http://www.lucenebook.com - Grab the source code and look in
> src/lia/tools at Highlight*.java
>
>> or Do you know how I can do field store?
>>
>> doc.add(new Field("contents", parser.getReader(),
>> Field.Store.YES, Field.Index.NO));
>
> You cannot store it with a Reader. You need to use Field.Text(String,
> String), or one of the other variations.
>
> Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]