Oh, I forgot your last question: that's why the field "line" has to be
stored. Upon query you have to get the "line" number from the document
that represents the line, and in "forward" / "back" actions you will
have to sort the result set by line value and print only chunks of that
result.
Best regards, Karl Øi
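The forward/back paging described above can be sketched without Lucene: sort the hits by their stored "line" value, then return one chunk per page. The LineHit type and the page size are made up for illustration; in Lucene itself you would read the stored "line" field from each hit document.

```java
import java.util.*;

public class LinePaging {
    // Hypothetical stand-in for a hit carrying the stored "line" field.
    record LineHit(int line, String text) {}

    // Return the page-th chunk of hits, ordered by line number.
    static List<LineHit> page(List<LineHit> hits, int page, int pageSize) {
        List<LineHit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt(LineHit::line));
        int from = Math.min(page * pageSize, sorted.size());
        int to = Math.min(from + pageSize, sorted.size());
        return sorted.subList(from, to);
    }

    public static void main(String[] args) {
        List<LineHit> hits = List.of(
            new LineHit(42, "worst of times"),
            new LineHit(7, "best of times"));
        System.out.println(page(hits, 0, 1).get(0).line()); // 7
    }
}
```

"Forward" then just means requesting page + 1, "back" page - 1.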
Yes, the biggest drawback is text spanning lines:
L1 - it was the best of times,
L2 - it was the worst of times
will return no hits for the search "it was the best of times, it was
the worst of times" (with quotes), because no single Lucene document
contains the whole text alone.
I would be inte
Most indexing creates a Lucene document for each source document. What
you would need is to create a Lucene document for each line:

String src_doc = "crash.java";
int line_number = 0;
String line;
while ((line = reader.readLine()) != null) {
    Document ld = new Document();
    // store the source file and line number so hits can be mapped back
    ld.add(new Field("src_doc", src_doc, Field.Store.YES, Field.Index.UN_TOKENIZED));
    ld.add(new Field("line", Integer.toString(line_number++), Field.Store.YES, Field.Index.UN_TOKENIZED));
    ld.add(new Field("contents", line, Field.Store.YES, Field.Index.TOKENIZED));
    writer.addDocument(ld); // writer: an open IndexWriter
}
I don't think you can figure out the language from the input box value
alone; I can't see any way to select the correct language analyzer at
that point. What you can do is put the Chinese, Japanese, English and
Dutch content in separate indexes and use MultiSearcher to search in
all of them, and
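Conceptually, MultiSearcher runs the query against every sub-index and merges the hits into one ranked list. A rough, Lucene-free sketch of that merge step (the Hit record and the per-index result lists are invented here for illustration, not Lucene's actual classes):

```java
import java.util.*;

public class MergeHits {
    // Hypothetical stand-in for a Lucene hit: document id and score.
    record Hit(String doc, float score) {}

    // Combine per-index hit lists into one list ranked by score.
    static List<Hit> merge(List<List<Hit>> perIndexHits) {
        List<Hit> all = new ArrayList<>();
        for (List<Hit> hits : perIndexHits) all.addAll(hits);
        all.sort((a, b) -> Float.compare(b.score(), a.score()));
        return all;
    }

    public static void main(String[] args) {
        List<Hit> english = List.of(new Hit("en-1", 0.9f), new Hit("en-2", 0.4f));
        List<Hit> dutch = List.of(new Hit("nl-1", 0.7f));
        System.out.println(merge(List.of(english, dutch)).get(0).doc()); // en-1
    }
}
```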
If you use a servlet and an HTML form to feed queries to the QueryParser,
take good care of all configuration around the servlet container. If
you, like me, use Tomcat, you might have to recode the query into the
correct encoding (UTF-8) before you pass it to Lucene.
read this:
http://www.crazysqui
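The usual fix for this Tomcat quirk (a hedged sketch; by default Tomcat decodes request parameters as ISO-8859-1 even when the browser sent UTF-8 bytes) is to re-decode the parameter yourself:

```java
import java.nio.charset.StandardCharsets;

public class RecodeQuery {
    // Re-decode a parameter that the container decoded as ISO-8859-1
    // but that the browser actually sent as UTF-8.
    static String recode(String raw) {
        return new String(raw.getBytes(StandardCharsets.ISO_8859_1),
                          StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Ã¸" is what UTF-8 "ø" looks like when mis-decoded as Latin-1.
        System.out.println(recode("\u00C3\u00B8")); // ø
    }
}
```

Alternatively, setting the connector's URIEncoding to UTF-8 in Tomcat's configuration avoids the manual recode.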
This might sound a bit lame, but it has worked for me. I have had the
same problem where the amount of small Lucene documents slows down the
building of large indexes.
Search is pretty fast, and read-only, so for my case I just created
three indexes and saved every three Lucene documents into on
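One reading of the (truncated) sentence above is a round-robin split: each document goes to one of the three indexes in turn, so no single index grows too large during the build. That assignment, sketched without Lucene:

```java
public class RoundRobin {
    // Pick which of indexCount indexes the docNumber-th document goes to.
    static int target(int docNumber, int indexCount) {
        return docNumber % indexCount;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 6; i++) {
            System.out.println("doc " + i + " -> index " + target(i, 3));
        }
    }
}
```

At search time the three indexes can then be queried together with MultiSearcher.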