Lucene in Action: Batch indexing by using RAMDirectory

2005-01-22 Thread Oscar Picasso
Hi,

On page 52 of Lucene in Action (Indexing > Controlling the indexing process >
Batch indexing by using RAMDirectory as a buffer) I read:

A more sophisticated approach would involve keeping track of RAMDirectory's
memory consumption, in order to prevent RAMDirectory from growing too large.

I've taken a look at Runtime.totalMemory() and the related methods, but I
couldn't figure out how to use them to prevent an OutOfMemoryError while using
RAMDirectory that way.

Any idea?
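
One way to approach the question above is to compare the heap currently in use
against the JVM's maximum heap and flush the RAMDirectory once a chosen
fraction is crossed. The following is a minimal sketch under that assumption.
HeapBudget and its method names are hypothetical (they are not part of Lucene
or of the book), and the Lucene calls appear only as comments, assuming the
ramWriter/fsWriter/ramDir setup of the book's batch-indexing example.

```java
// Sketch only: HeapBudget is a hypothetical helper, not a Lucene class.
final class HeapBudget {
    // Fraction of the maximum heap at which a flush is requested, e.g. 0.75.
    private final double maxFraction;

    HeapBudget(double maxFraction) {
        if (maxFraction <= 0.0 || maxFraction > 1.0) {
            throw new IllegalArgumentException("maxFraction must be in (0, 1]");
        }
        this.maxFraction = maxFraction;
    }

    /** Heap currently in use: the allocated heap minus its free part. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Pure threshold check, kept separate so it can be tested with fixed numbers. */
    static boolean exceeds(long usedBytes, long maxBytes, double fraction) {
        return usedBytes >= (long) (maxBytes * fraction);
    }

    /** True when the used heap has reached the configured share of the maximum heap. */
    boolean shouldFlush() {
        return exceeds(usedHeapBytes(), Runtime.getRuntime().maxMemory(), maxFraction);
    }
}

// Possible use inside the indexing loop (assumed variable names):
//
//   HeapBudget budget = new HeapBudget(0.75);
//   ramWriter.addDocument(doc);
//   if (budget.shouldFlush()) {
//       ramWriter.close();
//       fsWriter.addIndexes(new Directory[] { ramDir });
//       // then create a fresh RAMDirectory and a new ramWriter on top of it
//   }
```

The used-heap figure also counts garbage that has not been collected yet, so
the check errs on the side of flushing early. It measures the whole JVM heap,
not the RAMDirectory alone, which is usually what matters for avoiding an
OutOfMemoryError.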



__ 
Do you Yahoo!? 
Read only the mail you want - Yahoo! Mail SpamGuard. 
http://promotions.yahoo.com/new_mail 

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Closed IndexWriter reuse

2005-01-21 Thread Oscar Picasso
--- Otis Gospodnetic [EMAIL PROTECTED] wrote:

> No, you can't add documents to an index once you close the IndexWriter.
> You can re-open the IndexWriter and add more documents, of course.
>
> Otis

That's what I expected at first, but:
1- It's a disappointment, because such a 'feature' would have made IndexWriter
management much easier. You would open an IndexWriter at startup and reuse it
for the whole life of the application, just flushing on a regular basis using
the close() method, without worrying whether other objects are currently using
the writer.

2- When you say you can't add, do you mean it's impossible, or that you
shouldn't because, for example, it could corrupt the index?
Maybe I'm wrong, but I think it's possible. Let's look at the following code:



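import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Searcher;
import org.apache.lucene.search.TermQuery;

// Wrapper class so that the snippet compiles; the name is arbitrary.
public class ClosedWriterReuse
{
    public static void main(String[] args) throws IOException
    {
        // create == true: start a new, empty index at this path
        final IndexWriter writer1 = new IndexWriter("/tmp/test-reuse",
                new StandardAnalyzer(), true);

        // First write with the writer
        Document doc = new Document();
        doc.add(new Field("name", "John", Field.Store.YES,
                Field.Index.UN_TOKENIZED));
        writer1.addDocument(doc);
        System.out.println("1  After first write, before closing the writer ---");
        Searcher searcher = new IndexSearcher("/tmp/test-reuse");
        Query query = new TermQuery(new Term("name", "John"));
        Hits hits = searcher.search(query);
        System.out.println("=== hits: " + hits.length());
        System.out.println();

        // CLOSING THE WRITER ONCE
        writer1.close();
        System.out.println("2  After first write, after closing the writer ---");
        searcher = new IndexSearcher("/tmp/test-reuse");
        hits = searcher.search(query);
        System.out.println("=== hits: " + hits.length());
        System.out.println();

        // Second write, THE WRITER HAS ALREADY BEEN CLOSED ONCE
        writer1.addDocument(doc);
        System.out.println("3  After second write, the writer has been closed once ---");
        // The searcher is not re-opened here, so it still sees the old index
        hits = searcher.search(query);
        System.out.println("=== hits: " + hits.length());
        System.out.println();

        // Closing the writer again
        writer1.close();
        System.out.println("4  After second write, the writer has been closed twice ---");
        searcher = new IndexSearcher("/tmp/test-reuse");
        hits = searcher.search(query);
        System.out.println("=== hits: " + hits.length());
    }
}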

== Results ==
1  After first write, before closing the writer ---
=== hits: 0

2  After first write, after closing the writer ---
=== hits: 1

3  After second write, the writer has been closed once ---
=== hits: 1

4  After second write, the writer has been closed twice ---
=== hits: 2


As you can see, not only does the code above execute without complaint, but it
also gives the right results.

Thanks for your comments.
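
For comparison, the re-open approach Otis describes can be hidden behind a
small holder, so that the application still keeps one long-lived object while
the underlying writer is closed and re-opened. This is only a sketch of the
pattern: Reopenable, ResourceFactory and ResourceAction are hypothetical names,
not Lucene classes. With Lucene, the factory would construct the IndexWriter
with create set to false, and a thin adapter may be needed if the writer class
does not implement java.io.Closeable.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical factory: opens a fresh resource each time it is called. */
interface ResourceFactory<T extends Closeable> {
    T open() throws IOException;
}

/** Hypothetical callback: the work to do with the currently open resource. */
interface ResourceAction<T> {
    void apply(T resource) throws IOException;
}

/** Holds one resource that is closed on flush() and lazily re-opened on next use. */
final class Reopenable<T extends Closeable> {
    private final ResourceFactory<T> factory;
    private T current; // null while the resource is closed

    Reopenable(ResourceFactory<T> factory) {
        this.factory = factory;
    }

    /** Runs the action under the lock, opening the resource first if needed. */
    synchronized void use(ResourceAction<T> action) throws IOException {
        if (current == null) {
            current = factory.open();
        }
        action.apply(current);
    }

    /** Closes the current resource, if any; the next use() opens a new one. */
    synchronized void flush() throws IOException {
        if (current != null) {
            T toClose = current;
            current = null; // forget it first, so a failed close is not reused
            toClose.close();
        }
    }
}
```

Because use() and flush() share one lock, a flush cannot happen while another
caller is in the middle of adding a document, which addresses the worry about
other objects using the writer at the time it is closed.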






Closed IndexWriter reuse

2005-01-20 Thread Oscar Picasso
Hi,

Is it safe to add documents to an IndexWriter that has been closed? 

From what I have seen, the close method flushes the changes and closes the
files, but it then creates new files, which allows new documents to be added.

Am I right?

Thanks.






Restricting searches to some URLs

2004-09-30 Thread Oscar Picasso
Hi,

I use Lucene inside a Cocoon application. The indexed documents URLs are
something like domain-name//integer/integer/integer. For example:
flowers//1/234/23
flowers//2/23/90
...
animals//1/34/65
animals//21/98/4532
...

The searches are typically done in a subset of all available URLs.
For example, in all URLs that start with flowers//2/23/ or animals//21/ or
animals//.

I don't see how to do this with Cocoon. Maybe it's only due to some limitations
in the Cocoon implementations of searches and indexing, but anyway I would be
interested to know how it can be done in Cocoon and if not, also directly with
Lucene.

Here is what I have tried:
- doing a search in the @field
I have read somewhere that you can index the @uri field, but I don't see how to
do this with Cocoon.

- adding a new url field and using it for the search
It doesn't work. It seems that, because of how the URLs are written (a
combination of integers and '/'), I cannot search for something like:
url//flowers/*.

Any idea?
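
On the Lucene side, one common approach is to index the URL as a single
untokenized term (so the analyzer does not split it at '/' and digits) and to
search it with a PrefixQuery built from a Term on that field, combining several
prefixes in a BooleanQuery if needed. The sketch below does not use Lucene at
all. It only imitates the idea with a sorted set of whole-URL terms, to show
why prefix matching works once each URL is kept as one term. UrlPrefixIndex is
a hypothetical name, and the URLs are the examples from this message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical stand-in for a term dictionary holding whole, untokenized URLs. */
final class UrlPrefixIndex {
    // Sorted terms: all URLs sharing a prefix sit next to each other.
    private final TreeSet<String> terms = new TreeSet<String>();

    void add(String url) {
        terms.add(url);
    }

    /** Returns every stored URL that starts with the given prefix, in sorted order. */
    List<String> withPrefix(String prefix) {
        List<String> matches = new ArrayList<String>();
        // Start at the first term >= prefix and stop at the first non-match.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        UrlPrefixIndex index = new UrlPrefixIndex();
        index.add("flowers//1/234/23");
        index.add("flowers//2/23/90");
        index.add("animals//1/34/65");
        index.add("animals//21/98/4532");
        System.out.println(index.withPrefix("flowers//2/23/"));
        System.out.println(index.withPrefix("animals//21/"));
        System.out.println(index.withPrefix("animals//"));
    }
}
```

If the url field is tokenized instead, the index holds fragments such as
separate numbers rather than whole URLs, and no stored term starts with
flowers//2/23/, which may explain why the search on the new url field found
nothing.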

Oscar


