On 11/29/06, Java Programmer <[EMAIL PROTECTED]> wrote:
Hello,
I have trouble writing to and searching a Lucene index at the same time.
All I have done so far is write a class with two methods:
    private String indexLocation;

    public void addDocument(int id, String title, String body) throws IOException {
        IndexWriter indexWriter = new IndexWriter(indexLocation, new SimpleAnalyzer(), false);
        Document doc = new Document();
        doc.add(new Field("id", Integer.toString(id), Store.YES, Index.NO));
        doc.add(new Field("title", title, Store.NO, Index.TOKENIZED));
        doc.add(new Field("body", body, Store.NO, Index.TOKENIZED));
        indexWriter.addDocument(doc);
        indexWriter.close();
    }

    public List<Integer> search(String query) throws IOException, ParseException {
        IndexSearcher indexSearcher = new IndexSearcher(indexLocation);
        MultiFieldQueryParser queryparser = new MultiFieldQueryParser(
                new String[]{"title", "body"}, new SimpleAnalyzer());
        Query q = queryparser.parse(query);
        Hits hits = indexSearcher.search(q);
        Iterator it = hits.iterator();
        List<Integer> output = new ArrayList<Integer>();
        while (it.hasNext()) {
            output.add(Integer.parseInt(((Hit) it.next()).getDocument().get("id")));
        }
        indexSearcher.close();
        return output;
    }
What I don't like is that each method opens its own IndexWriter or
IndexSearcher. I tried to open them once and keep them open throughout
the whole lifecycle of the application (which will be very long, because
it is a news search running as a web service), but when I didn't close
the IndexWriter, the IndexSearcher didn't see any new documents in the
index. My next step was to keep the IndexWriter open and reopen only the
IndexSearcher, but in that case the IndexSearcher also saw the old index
without the new documents. So my final version is the one above. Could
it be made better, without closing the IndexWriter after each addition
and opening an IndexSearcher before each search query? What is the best
pattern for building such systems?
Hi there, it's quite a good idea to keep the IndexWriter/Reader(Searcher)
open as long as possible. Quite a nice pattern is used by Solr and
gdata-server. The IndexWriter is closed after a certain number of
documents have been added to the index, or when a timeout expires
without any updates or inserts. As soon as the writer's close method has
been called, a new IndexSearcher is opened and released to the
application. If you work in a webapp, that is, in a multithreaded
environment, you should track the references to your searcher so that
you can close it once nobody uses it anymore.
Keeping searchers and writers open will result in much better
performance, provided it is acceptable for updates and inserts to be
invisible to the searcher for a certain amount of time.
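The reference tracking described above could be sketched roughly as
follows. This is only an illustration under my own assumptions, not the
gdata-server ReferenceCounter: the class name SharedResource and its
methods are hypothetical, and the resource is modelled as a plain
java.io.Closeable so the sketch does not depend on a particular Lucene
version. The owner holds one reference, and every search borrows one more.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical reference-counting holder (an assumption for illustration).
 * The wrapped resource is closed once the owner and all borrowers released it.
 */
final class SharedResource<T extends Closeable> {
    private final T resource;
    // Starts at 1: the reference held by the owner that published the resource.
    private final AtomicInteger refs = new AtomicInteger(1);

    SharedResource(T resource) {
        this.resource = resource;
    }

    /** Borrows the resource, or returns null if it has already been closed. */
    T acquire() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                return null; // closed; the caller should fetch the current holder
            }
            if (refs.compareAndSet(n, n + 1)) {
                return resource;
            }
        }
    }

    /** Gives back one reference; the last release closes the resource. */
    void release() throws IOException {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                throw new IllegalStateException("released more often than acquired");
            }
            if (refs.compareAndSet(n, n - 1)) {
                if (n == 1) {
                    resource.close();
                }
                return;
            }
        }
    }
}
```

A search would call acquire(), run the query, and call release() in a
finally block. When a new searcher is published, the owner swaps in a new
holder and calls release() once on the old one, so the old searcher is
closed as soon as the last running search has finished with it.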
Have a look at this:
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/search/index/GDataIndexer.java
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/utils/ReferenceCounter.java
http://svn.apache.org/repos/asf/lucene/java/trunk/contrib/gdata-server/src/java/org/apache/lucene/gdata/search/index/IndexController.java
--> private ReferenceCounter<IndexSearcher>
getNewServiceSearcher(final Directory dir)
Another question: do I need to provide any synchronization around the
indexWriter.addDocument(doc) method? I see that it isn't synchronized,
so maybe the programmer needs to do it himself?
Best regards,
Adr
You could queue the documents to be added to the index to keep your
IndexWriter busy. Might be a good idea anyway.
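A rough sketch of such a queue, under my own assumptions: request threads
only enqueue documents, and a single worker thread drains the queue, so
only that one thread ever touches the writer. DocumentSink and
QueuedIndexer are hypothetical names. The sink stands in for whatever
code owns the IndexWriter, and commit() stands for "close the writer and
publish a new searcher". The batch size and idle timeout are
illustrative parameters.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the code that owns the IndexWriter. */
interface DocumentSink<D> {
    void add(D doc) throws Exception;  // e.g. would call indexWriter.addDocument(doc)
    void commit() throws Exception;    // e.g. would close the writer and open a new searcher
}

/** Single-consumer indexing loop: commit after maxPending docs or when idle. */
final class QueuedIndexer<D> implements Runnable {
    private final BlockingQueue<D> queue = new LinkedBlockingQueue<D>();
    private final DocumentSink<D> sink;
    private final int maxPending;   // commit after this many uncommitted documents
    private final long idleMillis;  // or after this long without a new document
    private volatile boolean running = true;

    QueuedIndexer(DocumentSink<D> sink, int maxPending, long idleMillis) {
        this.sink = sink;
        this.maxPending = maxPending;
        this.idleMillis = idleMillis;
    }

    /** Called from any request thread; never touches the writer directly. */
    void submit(D doc) {
        queue.add(doc);
    }

    /** Asks the loop to stop once the queue has been drained. */
    void shutdown() {
        running = false;
    }

    @Override
    public void run() {
        int pending = 0;
        try {
            while (running || !queue.isEmpty()) {
                D doc = queue.poll(idleMillis, TimeUnit.MILLISECONDS);
                if (doc != null) {
                    sink.add(doc);
                    pending++;
                }
                boolean idle = (doc == null);
                if (pending > 0 && (pending >= maxPending || idle)) {
                    sink.commit();
                    pending = 0;
                }
            }
            if (pending > 0) {
                sink.commit(); // flush what is left on shutdown
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // uncommitted documents are dropped here
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

The indexer would typically run on its own thread, for example
new Thread(indexer).start(), with the web service calling submit() for
each incoming document. Because only the worker thread calls the sink,
the question of synchronizing addDocument yourself does not come up in
this design.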
best regards simon
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]