IndexReader.delete( term ) bug?

2003-09-16 Thread Lars Hammer
Hello

I am trying to delete all docs in my index containing a field with a given value. The 
API says that the delete( term ) method in IndexReader can do that for me. The problem 
is that it doesn't seem to work properly. When I apply the delete( term ) method in a 
case where I know that only one document exists with the given ID, the document is 
deleted as it should be, but when several documents share the same ID, nothing 
happens.
Note that all documents have a unique ID, and some documents also 
have another ID field called modelTreeFileID, which indicates membership of a group of 
documents. 

This little code snippet is how I am doing it :

IndexReader ir = IndexReader.open( pathToIndex );
Term term = new Term( "modelTreeFileID", "324i28383gvvb" );
ir.delete( term );


If only one document is found with the value 324i28383gvvb, it is deleted, 
but if more documents are found, they are not :-(

Is this a bug in IndexReader or am I doing something wrong?

Any help is appreciated.

/Lars Hammer

http://www.dezide.com




Re: IndexReader.delete( term ) bug?

2003-09-16 Thread Lars Hammer
Sorry, I should have posted some more code the first time around. Here is the
whole method that I use. I search for the term first to check whether any
matching documents exist:

private void deleteModelTree( String modelTreeFileID, String pathToIndex )
{
    try
    {
        IndexReader ir = IndexReader.open( pathToIndex );

        // The field name is the literal "modelTreeFileID"; the value is the argument.
        Term term = new Term( "modelTreeFileID", modelTreeFileID );

        // Search first to check whether any document matches the term.
        TermQuery query = new TermQuery( term );
        Searcher searcher = new IndexSearcher( pathToIndex );
        Hits hits = searcher.search( query );

        if( hits.length() != 0 )
        {
            ir.delete( term );
        }

        ir.close();
    }
    catch( Exception e )
    {
        e.printStackTrace();
    }
}

I catch all Exceptions to see if anything goes wrong, and so far no
exceptions are thrown.

-thanks in advance!

/Lars Hammer





 Lars,

 The code looks fine.  However, you are not showing how you deal with
 any possible exceptions (IOExceptions, for instance), so it is possible
 that you are ignoring exceptions.
 That delete(Term) method returns an int, so you could also capture
 that and see what its value is.
 I assume that you are also closing your IndexReader instance...

 Otis








-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Reference for Lucene as a search tool built into a CD

2003-09-09 Thread Lars Hammer
I think DocSearcher can do just that :

http://www.brownsite.net/docsearch.htm

/Hammer

- Original Message - 
From: Pete Lewis [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Tuesday, September 09, 2003 4:58 PM
Subject: Reference for Lucene as a search tool built into a CD


Does anyone know of Lucene being packaged onto a CD to provide a search
facility for the data on that CD?  If so, would it be possible to reference it?

Thanks

Pete







Re: Hits not serializable?

2003-08-25 Thread Lars Hammer


 The Hits collection needs to get back to the index itself to retrieve 
 Documents.  My strategy has been to collect all the Documents from 
 Hits as a List of Maps, and hand that back across a session bean 
 boundary.

Sounds like a fair solution to me; maybe I'll do that.
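
The List-of-Maps strategy quoted above can be sketched roughly as follows. This is only an illustration and deliberately avoids a Lucene dependency: HitSource is a hypothetical stand-in for Hits (in real code I would expect its two methods to be backed by hits.length() and hits.doc( i ).get( name )), and the field names "title" and "path" are made up. The idea is to copy the stored values into plain collections that are Serializable, so they can cross the session bean boundary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;

class HitsSnapshot {
    /** Hypothetical stand-in for a Lucene Hits object; not part of Lucene. */
    interface HitSource {
        int length();
        String get(int index, String fieldName);
    }

    /** Copies the requested fields of every hit into a serializable list of maps. */
    static ArrayList<HashMap<String, String>> snapshot(HitSource hits, String... fieldNames) {
        ArrayList<HashMap<String, String>> rows = new ArrayList<>();
        for (int i = 0; i < hits.length(); i++) {
            HashMap<String, String> row = new HashMap<>();
            for (String name : fieldNames) {
                row.put(name, hits.get(i, name));
            }
            rows.add(row);
        }
        return rows;
    }

    /** Serializes and deserializes a value, as a remote call would do implicitly. */
    static Object roundTrip(Serializable value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        // Fake hit source with two hits, used only to exercise the sketch.
        HitSource fake = new HitSource() {
            public int length() { return 2; }
            public String get(int index, String fieldName) { return fieldName + "-" + index; }
        };
        ArrayList<HashMap<String, String>> rows = snapshot(fake, "title", "path");
        System.out.println(rows.equals(roundTrip(rows))); // prints true
    }
}
```

Only the values copied into the maps travel across the wire, so the snapshot has to be taken while the searcher is still open, and it should be limited to the fields the JSP page actually displays.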

 
 I will look at ejindex in detail in the near future to see how it works.
 

I too will look into ejindex in the near future.

-Thanks for the input!

/Lars Hammer





Hits not serializable?

2003-08-22 Thread Lars Hammer
Has anyone experimented with using EJBs for carrying out searches? I'm thinking of 
using an EJB to carry out the searches and return the hits to a JSP page, which 
handles displaying the results. 
But Hits isn't serializable, so it cannot be sent across the network from, 
for example, JBoss to Tomcat.

Does anyone have any experience with using Lucene through EJBs?

Thanks in advance

/Lars Hammer

www.dezide.com


updating a document

2003-08-20 Thread Lars Hammer
Hello

I'm trying to update a document in my index. As far as I can tell from the FAQ and 
other documentation, the only way to do this is by deleting the document and 
adding it again.

Now, I want to be able to add the document anew without having to re-parse the 
original file. That is, I want to extract a document from the index (and keep a 
copy in memory), delete the document from the index, update a field in the doc in 
memory, and add the doc to the index once again.

I imagine that it has to be done something like this :

1. extract the desired document from index with a function returning the document (not 
complete code):

Document doc;
String fileNameToGetFromIdx;
String tmpName;

// numDocs is assumed to be the upper bound on document numbers, e.g. indexreader.maxDoc()
for ( int i = 0; i < numDocs; i++ )
{
    if ( !indexreader.isDeleted( i ) )
    {
        doc = indexreader.document( i );

        if ( doc != null )
        {
            tmpName = doc.get( "pathToFileOnDisk" );

            if ( tmpName != null && tmpName.equals( fileNameToGetFromIdx ) )
            {
                indexreader.delete( i );
                return doc;
            }
        }
    }
}

This would leave me with the document in memory and the document deleted from the 
index -right?


2. update a field in the document by adding the field again.

 doc.add( Field.Text( someField, value ) );

The API for Document says that if multiple fields exist with the same name, the 
value of the last one added is returned when getting the value.


3. add the document to the index again

indexwriter.addDocument( doc );
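
The field-update part of the steps above can be modelled without Lucene, roughly as follows. This is a sketch under stated assumptions, not a Lucene implementation: the stored fields of the extracted document are represented as a plain Map, the class and method names are hypothetical, and the example path is made up. It shows replacing the counter value rather than adding a second field with the same name, so the rebuilt document carries exactly one visited field. One caveat that I believe applies to the real thing: a Document read back from the index carries only its stored fields, so anything that was indexed without being stored would be missing from the re-added document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class VisitCounterUpdate {
    /**
     * Returns a copy of the stored fields with the counter field replaced by
     * its incremented value. A missing or blank counter is treated as 0.
     * The input map is left unchanged.
     */
    static Map<String, String> withIncrementedCounter(Map<String, String> storedFields,
                                                      String counterField) {
        Map<String, String> updated = new LinkedHashMap<>(storedFields);
        String current = storedFields.get(counterField);
        int visits = (current == null || current.trim().isEmpty())
                ? 0
                : Integer.parseInt(current.trim());
        // put() replaces the old value, so the field never appears twice.
        updated.put(counterField, Integer.toString(visits + 1));
        return updated;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("pathToFileOnDisk", "/docs/manual.pdf"); // hypothetical path
        stored.put("visited", "4");
        // prints {pathToFileOnDisk=/docs/manual.pdf, visited=5}
        System.out.println(withIncrementedCounter(stored, "visited"));
    }
}
```

In the real index, the updated map would then be turned back into a new Document, field by field, before handing it to the writer.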


Is this a correct way of doing an update? I can't seem to get it to work 
properly.
The reason for trying it this way is to avoid having to reindex the original file. 
I have many large PDF documents, which take some time to index :-(

Bottom line: when I do a search and a list of search results is displayed to the 
user, the user clicks the title of a document and the document is shown. 
Before the document is shown, I execute an update function to increase the number of 
times the document has been visited, hence I need to update the visited field of that 
particular document in the index.

Uhm -hope you get the idea :-)

Any suggestions and comments are very welcome

thanks in advance

BTW: does anyone know if an update function is planned for Lucene? Would 
it be hard to write it yourself?


/Lars Hammer

www.dezide.com