IndexReader.delete( term ) bug?
Hello,

I am trying to delete all docs in my index that contain a field with a given value. The API says that the delete( term ) method in IndexReader can do that for me. The problem is that it doesn't seem to work properly. When I apply delete( term ) in a case where I know that only one document exists with the given ID, the document is deleted as it should be, but when there are more documents with the same ID, nothing happens.

It should be said that all documents have a unique ID, and some documents also have another ID field called modelTreeFileID, which indicates membership of a group of documents.

This little code snippet is how I am doing it:

    IndexReader ir = IndexReader.open( pathToIndex );
    Term term = new Term( "modelTreeFileID", "324i28383gvvb" );
    ir.delete( term );

If only one document is found with the value 324i28383gvvb, it is deleted, but if more documents are found they are not :-(

Is this a bug in IndexReader or am I doing something wrong? Any help is appreciated.

/Lars Hammer
http://www.dezide.com
Re: IndexReader.delete( term ) bug?
Sorry, I should have posted some more code the first time around. Here is the whole method that I use. I search for the term first to check whether any documents exist:

    private void deleteModelTree( String modelTreeFileID, String pathToIndex )
    {
        try
        {
            IndexReader ir = IndexReader.open( pathToIndex );
            Term term = new Term( "modelTreeFileID", modelTreeFileID );
            TermQuery query = new TermQuery( term );
            Searcher searcher = new IndexSearcher( pathToIndex );
            Hits hits = searcher.search( query );
            if( hits.length() != 0 )
            {
                ir.delete( term );
            }
            ir.close();
        }
        catch( Exception e )
        {
            e.printStackTrace();
        }
    }

I catch all exceptions to see if anything goes wrong, and so far no exceptions are thrown.

Thanks in advance!

/Lars Hammer

Lars,

The code looks fine. However, you are not showing how you deal with any possible exceptions (IOExceptions, for instance), so it is possible that you are ignoring exceptions. That delete(Term) method returns an int, so you could also capture that and see what its value is. I assume that you are also closing your IndexReader instance...

Otis

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
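Otis's suggestion to capture the int returned by delete(Term) can be illustrated without a real index. The sketch below is a toy in-memory stand-in, not Lucene code: ToyIndex, addDocument, deleteByTerm and numLiveDocs are hypothetical names invented for this illustration. It only models the behaviour the API documentation is said to promise, namely that every document carrying the term is marked deleted and the caller gets back the number of documents affected.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory stand-in for an index. This is NOT Lucene; every name is hypothetical.
class ToyIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();
    private final List<Boolean> deleted = new ArrayList<>();

    // Stores a document given as alternating field name / value strings.
    void addDocument(String... nameValuePairs) {
        Map<String, String> doc = new HashMap<>();
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            doc.put(nameValuePairs[i], nameValuePairs[i + 1]);
        }
        docs.add(doc);
        deleted.add(Boolean.FALSE);
    }

    // Marks every live document whose field has the given value as deleted
    // and returns how many documents were marked.
    int deleteByTerm(String field, String value) {
        int count = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && value.equals(docs.get(i).get(field))) {
                deleted.set(i, Boolean.TRUE);
                count++;
            }
        }
        return count;
    }

    // Number of documents not yet marked as deleted.
    int numLiveDocs() {
        int live = 0;
        for (boolean isDeleted : deleted) {
            if (!isDeleted) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.addDocument("id", "1", "modelTreeFileID", "324i28383gvvb");
        index.addDocument("id", "2", "modelTreeFileID", "324i28383gvvb");
        index.addDocument("id", "3");
        int removed = index.deleteByTerm("modelTreeFileID", "324i28383gvvb");
        System.out.println("deleted " + removed + ", live " + index.numLiveDocs());
    }
}
```

In the real method, printing the value returned by ir.delete( term ) next to hits.length() would show whether the reader matched as many documents as the searcher did.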
Re: Reference for Lucene as a search tool built into a CD
I think DocSearcher can do just that: http://www.brownsite.net/docsearch.htm

/Hammer

- Original Message -
From: Pete Lewis [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Tuesday, September 09, 2003 4:58 PM
Subject: Reference for Lucene as a search tool built into a CD

Does anyone know of Lucene being packaged onto a CD to provide a search facility for the data on that CD? If so, would it be possible to reference it?

Thanks
Pete
Re: Hits not serializable?
> The Hits collection needs to get back to the index itself to retrieve Documents. My strategy has been to collect all the Documents from Hits as a List of Maps, and hand that back across a session bean boundary.

Sounds like a fair solution to me; maybe I'll do that.

> I will look at ejindex in detail in the near future to see how it works.

I too will look into ejindex in the near future.

Thanks for the input!

/Lars Hammer
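The List-of-Maps strategy quoted in this message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the poster's actual code: HitCopier, StoredDoc, copyHits and roundTrip are hypothetical names, and StoredDoc is a stand-in for whatever gives access to a hit's stored field values (with Lucene this would presumably be the Document returned for each hit, read via its get method). The point is that only plain serializable collections of strings cross the bean boundary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class HitCopier {
    // Hypothetical stand-in for a hit's document: returns a stored field value, or null.
    interface StoredDoc {
        String get(String fieldName);
    }

    // Copies the named fields of every hit into plain serializable collections.
    // Fields that a document does not have are simply left out of its map.
    static ArrayList<HashMap<String, String>> copyHits(List<StoredDoc> hits, String[] fieldNames) {
        ArrayList<HashMap<String, String>> rows = new ArrayList<>();
        for (StoredDoc doc : hits) {
            HashMap<String, String> row = new HashMap<>();
            for (String name : fieldNames) {
                String value = doc.get(name);
                if (value != null) {
                    row.put(name, value);
                }
            }
            rows.add(row);
        }
        return rows;
    }

    // Round-trips a value through Java serialization, roughly as a remote call would.
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization failed", e);
        }
    }
}
```

The copy has to be made while the index is still open, since the result no longer refers back to it. Copying only the fields the JSP page needs for display also keeps the payload small.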
Hits not serializable?
Has anyone experimented with using EJBs for carrying out searches? I'm thinking of using an EJB to carry out the searches and return the hits to a JSP page, which handles displaying the results. But Hits isn't serializable, so it cannot be sent across the network from, for example, JBoss to Tomcat. Does anyone have any experience with using Lucene through EJBs?

Thanks in advance

/Lars Hammer
www.dezide.com
updating a document
Hello,

I'm trying to update a document in my index. As far as I can tell from the FAQ and other documentation, the only way to do this is by deleting the document and adding it again. Now, I want to be able to add the document anew without having to re-parse the original file. That is, I want to extract a document from the index (and keep a copy in memory), delete the document from the index, update a field in the doc in memory, and add the doc to the index once again.

I imagine that it has to be done something like this:

1. Extract the desired document from the index with a function returning the document (not complete code):

    Document doc;
    String fileNameToGetFromIdx;
    String tmpName;
    for ( int i = 0; i < numDocs; i++ )
    {
        if ( !indexreader.isDeleted( i ) )
        {
            doc = indexreader.document( i );
            if ( doc != null )
            {
                tmpName = doc.get( "pathToFileOnDisk" );
                if ( tmpName.equals( fileNameToGetFromIdx ) )
                {
                    indexreader.delete( i );
                    return doc;
                }
            }
        }
    }

This would leave me with the document in memory and the document deleted from the index, right?

2. Update a field in the document by adding the field again:

    doc.add( Field.Text( "someField", value ) );

The API for Document says that if multiple fields exist with the same name, the value of the last one added is returned when getting the value.

3. Add the document to the index again:

    indexwriter.addDocument( doc );

Is this a correct way of doing an update? I can't seem to get it to work properly. The reason for trying it this way is to avoid having to reindex the original file. I have many large PDF documents which take some time to index :-(

Bottom line: when I do a search and a list of search results is displayed to the user, the user clicks the title of a document and the document is shown to the user. Before the document is shown I execute an update function to increase the number of times the document has been visited, hence I need to update the visited field of that particular document in the index.
Uhm, hope you get the idea :-) Any suggestions and comments are very welcome.

Thanks in advance

BTW: does anyone know if an update function is planned to be added to Lucene? Would it be hard to write it yourself?

/Lars Hammer
www.dezide.com
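The delete-then-re-add idea in this message can be sketched on a toy model to make the three steps concrete. Nothing below is Lucene code: VisitCounterUpdate, withIncrementedCounter and update are hypothetical names, the "index" is just a list of field maps, and a document is a map of stored field values. The one design choice worth noting is that the counter field is replaced in a rebuilt copy instead of being added a second time under the same name, so there is never any question about which of two values is returned.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of "update = delete + add". NOT Lucene; all names are hypothetical.
class VisitCounterUpdate {
    // Builds a fresh copy of the stored fields with the counter replaced, not appended.
    // A missing or unreadable counter is treated as zero.
    static Map<String, String> withIncrementedCounter(Map<String, String> stored, String counterField) {
        Map<String, String> rebuilt = new LinkedHashMap<>(stored);
        int visits = 0;
        String old = stored.get(counterField);
        if (old != null) {
            try {
                visits = Integer.parseInt(old.trim());
            } catch (NumberFormatException e) {
                visits = 0; // unreadable counter: start again from zero
            }
        }
        rebuilt.put(counterField, String.valueOf(visits + 1));
        return rebuilt;
    }

    // Finds the first document whose keyField equals keyValue, deletes it and
    // adds the rebuilt copy. Returns false when no such document exists.
    static boolean update(List<Map<String, String>> index, String keyField,
                          String keyValue, String counterField) {
        for (int i = 0; i < index.size(); i++) {
            Map<String, String> doc = index.get(i);
            if (keyValue.equals(doc.get(keyField))) {
                Map<String, String> rebuilt = withIncrementedCounter(doc, counterField);
                index.remove(i);    // delete the old document
                index.add(rebuilt); // add the rebuilt copy
                return true;
            }
        }
        return false;
    }
}
```

One caveat when carrying this over to a real index: a Document read back from Lucene contains only its stored fields, so text that was indexed without being stored (the body of a large PDF, for instance) would not survive the round trip and would no longer be searchable after the re-add.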