Re: optimize()

2002-11-26 Thread Otis Gospodnetic
This was just mentioned a few days ago. Check the archives.
Not needed for indexing; it is good to do after you are done indexing, so
that the index reader has to open and search through fewer files.

Otis

--- Leo Galambos <[EMAIL PROTECTED]> wrote:
> How does it affect overall performance, when I do not call
> optimize()?
> 
> THX
> 
> -g-



--
To unsubscribe, e-mail:   
For additional commands, e-mail: 




Re: optimize()

2002-11-26 Thread Leo Galambos
Did you try any tests in this area (figures, charts...)?

AFAIK the reader reads an identical number of (giga)bytes either way. BTW,
it could read segments in many threads. I do not see why it would be slower
(unless you do many delete()s). Whether the reader opens 1 or 50 files is
still nothing.

-g-

On Tue, 26 Nov 2002, Otis Gospodnetic wrote:

> This was just mentioned a few days ago. Check the archives.
> Not needed for indexing, good to do after you are done indexing, as the
> index reader needs to open and search through less files.
> 
> Otis






Re: optimize()

2002-11-26 Thread Otis Gospodnetic
No tests, just intuition that it's faster to find something in 1 file
than in 100 of them.  If you do some tests, I'd love to hear the real
numbers :)

Otis

--- Leo Galambos <[EMAIL PROTECTED]> wrote:
> Did you try any tests in this area? (figures, charts...)
> 
> AFAIK reader reads identical number of (giga)bytes. BTW, it could
> read
> segments in many threads. I do not see why it would be slower (until
> you
> do many delete()-s). If reader opens 1 or 50 files, it is still
> nothing.
> 
> -g-







Re: optimize()

2002-11-26 Thread Leo Galambos
Hmmm. The question is: what would I measure?

Otis, do you know which implementation is used in Lucene (I am lost in the
hierarchy of readers/writers):

a) a single thread for solving a query
b) more than one thread per query

(a) would mean that Lucene could solve queries more than 50% slower than in
case (b). It would also mean that Lucene's index is in its optimal state
when just one segment exists. And it also means that if you remove half of
the documents from a collection, you have to rebuild one big segment into a
smaller one, and so on... It would cost a lot of CPU/HDD time.

So it looks like I would measure the effect of random insert/remove
operations. The problem is, how often should I call optimize in the test?

Any thoughts?

-g-

On Tue, 26 Nov 2002, Otis Gospodnetic wrote:

> No tests, just intuition that it's faster to find something in 1 file
> than in 100 of them.  If you do some tests, I'd love to hear the real
> numbers :)
> 
> Otis






RE: optimize()

2002-11-26 Thread Stephen Eaton
I don't know if this answers your question, but I had a lot of problems
with Lucene bombing out with out-of-memory errors. I was not using
optimize(); I tried it and, hey presto, no more problems.

-Original Message-
From: Leo Galambos [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 27 November 2002 5:22 AM
To: [EMAIL PROTECTED]
Subject: optimize()


How does it affect overall performance, when I do not call optimize()?

THX

-g-









Re: optimize()

2002-11-26 Thread Otis Gospodnetic
This should answer your a) or b) question:

[otis@linux2 java]$ pwd
/mnt/disk2/cvs-repositories/jakarta/jakarta-lucene/src/java

[otis@linux2 java]$ which ffjg
alias ffjg='find . -type f -name \*.java|xargs grep'
/usr/bin/find
/usr/bin/xargs

[otis@linux2 java]$ ffjg Thread
./org/apache/lucene/store/Lock.java:   
Thread.sleep(sleepInterval);
[otis@linux2 java]$


An unoptimized index is not a problem for document additions; they take
constant time, regardless of the size of the index and regardless of
whether the index is optimized or not.
Searches of an unoptimized index take longer than searches of an optimized
index.
Here's a test:
Write a class that indexes X documents, where X is a substantial number.
Then make a copy of that index and call it Y.  Optimize index Y.
Then do a search against one index and against the other, and time them.
Then let us know which one is faster and by how much.
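That recipe can be mocked up outside Lucene, too. Below is a hedged sketch in plain Python (not Lucene; the file names, sizes, and the naive term scan are all made-up stand-ins) that times a scan over one big file versus the same data split across many small files:

```python
# Toy stand-in for the optimized vs. unoptimized comparison: scan for a
# term in one big file vs. the same lines split across many small files.
# (Assumption: this is NOT Lucene; it only illustrates the per-file
# open/read overhead being discussed.)
import os
import tempfile
import time

def build_files(dirname, n_files, lines_per_file):
    """Write n_files small files, each with lines_per_file matching lines."""
    paths = []
    for i in range(n_files):
        path = os.path.join(dirname, f"part{i}.txt")
        with open(path, "w") as f:
            for j in range(lines_per_file):
                f.write(f"doc {i}-{j} lucene optimize test\n")
        paths.append(path)
    return paths

def count_hits(paths, term):
    """Naive scan: open every file and count occurrences of term."""
    hits = 0
    for path in paths:
        with open(path) as f:
            hits += f.read().count(term)
    return hits

with tempfile.TemporaryDirectory() as d:
    many = build_files(d, 100, 200)    # "unoptimized": 100 small segments
    big = os.path.join(d, "all.txt")   # "optimized": 1 merged segment
    with open(big, "w") as out:
        for p in many:
            with open(p) as f:
                out.write(f.read())

    t0 = time.time(); hits_many = count_hits(many, "lucene"); t_many = time.time() - t0
    t0 = time.time(); hits_one = count_hits([big], "lucene"); t_one = time.time() - t0

    # Both scans must find the same number of hits; timings vary by machine.
    print(hits_many, hits_one, f"many: {t_many:.4f}s  one: {t_one:.4f}s")
```

The many-files scan typically pays extra open/seek overhead per file, but the real answer for Lucene still requires the index-based test described above.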

Good luck,
Otis



--- Leo Galambos <[EMAIL PROTECTED]> wrote:
> Hmmm. The question is what would I measure?
> 
> Otis, do you know what implementation is used in Lucene (I am lost in
> 
> hiearchy of readers/writers):
> 
> a) single thread for solving query
> b) more than one thread for a query
> 
> (a) would mean that Lucene could solve queries more than 50% slower
> than in case (b). It would also mean, that Lucene's index is in
> optimal
> state when just one segment exists. And it also means that if you
> remove
> half of documents from a collection you have to rebuild one big
> segment to
> a smaller one, and so on... It would cost a lot of CPU/HDD time.
> 
> So it looks like I would measure effect of random insert/remove 
> operations. The problem is, how often I would call optimize in the
> test?
> 
> Any thoughts?
> 
> -g-







Re: optimize()

2002-11-27 Thread Leo Galambos
> Unoptimized index is not a problem for document additions, they take
> constant time, regardless of the size of the index and regardless of
> whether the index is optimized or not.

IMHO it is not true. It would mean that O(log(n/M)) = O(1) (n = number of
documents in the index, M = max number of segments per level). I think that
if you were right, we would be able to sort an array in O(n) and not in
O(n log n).
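The complexity argument can be checked numerically with a toy model. The sketch below is my own simplification of a mergeFactor-style policy in plain Python (not Lucene's actual code; `merge_factor=10` mirrors Lucene's default but is an assumption here):

```python
def total_merge_work(n_docs, merge_factor=10):
    """Count how many times documents get copied by a mergeFactor-style
    policy while n_docs are added one at a time (toy model, not Lucene)."""
    segments = []   # sizes of on-disk segments
    work = 0
    for _ in range(n_docs):
        segments.append(1)          # each added doc starts as a tiny segment
        # whenever merge_factor equal-sized segments pile up, merge them
        while (len(segments) >= merge_factor
               and len(set(segments[-merge_factor:])) == 1):
            merged = sum(segments[-merge_factor:])
            segments = segments[:-merge_factor] + [merged]
            work += merged          # every doc in the merge is copied once
    return work

# Total copy work grows like n * log10(n), not linearly in n.
print(total_merge_work(100), total_merge_work(1000), total_merge_work(10000))
```

With this policy the total copy work for n documents comes out to n·log10(n), i.e. 200, 3000 and 40000 for n = 100, 1000 and 10000, so the amortized per-document cost is O(log n) rather than O(1).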

> Searches of unoptimized index take longer than searches of an optimized
> index.

Is there any limitation in the Lucene architecture that prevents you from
using a multithreaded algorithm for the calculation of hit lists? I think
it would boost performance. Otis, thank you for your proof that Lucene does
not have it now (you got me :-)). But what about future releases?

> Then do a search against one, and against the other index, and time it.
> Then let us know which one is faster and by how much.

OK, I will.

I would like to compare Lucene to another engine. The test must be precise,
because I want to use it in an academic paper.

The aim of my question was how I could configure Lucene to get maximum
performance for the test. It looks to be pretty hard, because:

- if I do not call optimize(), I can build the index at maximum speed, but
searches are slow, so it is not a configuration for a dynamic environment

- if I call optimize() regularly (as a real application would), indexing
gets slower and slower as I add more and more documents to the collection

IMHO the second option describes a "real environment", so we get:

loop:
  K-times indexDoc()
  optimize()
end-of-loop

What *K* should I use? 1000, 1 or 10 or 100? Folks, what *K* do you use
in your applications? Thank you.
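To put rough numbers on the choice of *K*: if optimize() rewrites the whole index each time, the sketch below (toy arithmetic in plain Python; it ignores Lucene's normal background merges, so it is not a Lucene measurement) counts the documents copied for different values of *K*:

```python
def optimize_copy_work(n_docs, k):
    """Documents copied if optimize() rewrites the whole index and is
    called after every k additions (toy model; ignores normal merges)."""
    work = 0
    for indexed_so_far in range(k, n_docs + 1, k):
        work += indexed_so_far   # optimize copies everything indexed so far
    return work

# Small K keeps searches fast, but the total copy work grows like n^2/(2K):
for k in (100, 1000, 10000):
    print(k, optimize_copy_work(10000, k))
```

For 10,000 documents this gives 505,000 copies at K=100, 55,000 at K=1,000 and 10,000 at K=10,000, which is exactly the "indexing gets slower and slower" tradeoff being asked about.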

-g-







Re: optimize()

2002-11-27 Thread Scott Ganyo
We generally optimize only after a full index (re-)build or during
periods where the index is not being used.

Scott

Leo Galambos wrote:
> What *K* would I use? 1000, 1 or 10 or 100? Folks, what *K* do you use
> in your applications? Thank you.
>
> -g-

--
Brain: Pinky, are you pondering what I’m pondering?
Pinky: I think so, Brain, but calling it a pu-pu platter? Huh, what were 
they thinking?





Re: Optimize crash

2004-04-19 Thread Paul
Dear all,

I hate to be insistent, but I have a large live website with a growing,
un-optimizable Lucene index, which therefore has its appointment with
destiny pencilled into The Diary of Doom on a date roughly three weeks
hence.

So if I'm doing something stupid, or there's a workaround, or someone is
already looking into this problem, *please* let me know. My alternative is
to spend two days re-indexing the archive and then just wait for the
inevitable repeat of this problem, like Groundhog Day, which isn't a
particularly attractive option.

(NB: The original message is under the same subject line in the archive.)

Thanks.

Cheers,
Paul.




Re: optimize(), delete() calls on IndexWriter

2002-03-08 Thread Otis Gospodnetic

No they don't. Note that delete() is in IndexReader.

Otis

--- Aruna Raghavan <[EMAIL PROTECTED]> wrote:
> Hi,
> Do calls like optimize() and delete() on the Indexwriter cause a
> separate
> thread to be kicked off?
> Thanks!
> Aruna.







RE: optimize(), delete() calls on IndexWriter

2002-03-08 Thread Aruna Raghavan

Yes, thanks.

-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]]
Sent: Friday, March 08, 2002 11:46 AM
To: Lucene Users List
Subject: Re: optimize(), delete() calls on IndexWriter


No they don't. Note that delete() is in IndexReader.

Otis





Re: Optimize not deleting all files

2005-02-03 Thread åç
Your understanding is right!

The old existing files should be deleted, but it will also build new files!


On Thu, 03 Feb 2005 17:36:27 -0800 (PST),
[EMAIL PROTECTED] <[EMAIL PROTECTED]>
wrote:
> Hi,
> 
> When I run an optimize in our production environment, old index are
> left in the directory and are not deleted.
> 
> My understanding is that an
> optimize will create new index files and all existing index files should be
> deleted.  Is this correct?
> 
> We are running Lucene 1.4.2 on Windows.
> 
> Any help is appreciated.  Thanks!






Re: Optimize not deleting all files

2005-02-04 Thread Ernesto De Santis
Hi all
We have the same problem.
We guess that the problem is that windows lock files.
Our enviroment:
Windows 2000
Tomcat 5.5.4
Ernesto.
[EMAIL PROTECTED] escribió:
Hi,
When I run an optimize in our production environment, old index are
left in the directory and are not deleted.  

My understanding is that an
optimize will create new index files and all existing index files should be
deleted.  Is this correct?
We are running Lucene 1.4.2 on Windows.  

Any help is appreciated.  Thanks!




Re: Optimize not deleting all files

2005-02-04 Thread Otis Gospodnetic
Get and try Lucene 1.4.3.  One of the older versions had a bug that was
not deleting old index files.

Otis

--- [EMAIL PROTECTED] wrote:

> Hi,
> 
> When I run an optimize in our production environment, old index are
> left in the directory and are not deleted.  
> 
> My understanding is that an
> optimize will create new index files and all existing index files
> should be
> deleted.  Is this correct?
> 
> We are running Lucene 1.4.2 on Windows.  
> 
> 
> Any help is appreciated.  Thanks!





Re: Optimize not deleting all files

2005-02-04 Thread yahootintin . 1247688
Ernesto, what version of Lucene are you running?

--- Lucene Users List wrote:
> We have the same problem.
> We guess that the problem is that windows lock files.
>
> Our enviroment:
> Windows 2000
> Tomcat 5.5.4
>
> Ernesto.
>
> [EMAIL PROTECTED] wrote:
> > Hi,
> >
> > When I run an optimize in our production environment, old index are
> > left in the directory and are not deleted.
> >
> > My understanding is that an
> > optimize will create new index files and all existing index files should be
> > deleted.  Is this correct?
> >
> > We are running Lucene 1.4.2 on Windows.
> >
> > Any help is appreciated.  Thanks!




Re: Optimize not deleting all files

2005-02-04 Thread Patricio Keilty
Hi all, I'll answer on behalf of Ernesto. Our environment is:
Lucene 1.4.2
Tomcat 5.5.4
Java 1.4.2_04
Windows 2000 SP4
--p
[EMAIL PROTECTED] wrote:
> Ernesto, what version of Lucene are you running?
>
> --- Lucene Users List wrote:
> > We have the same problem.
> > We guess that the problem is that windows lock files.
> >
> > Our enviroment:
> > Windows 2000
> > Tomcat 5.5.4
> >
> > Ernesto.
> >
> > [EMAIL PROTECTED] wrote:
> > > Hi,
> > > When I run an optimize in our production environment, old index are
> > > left in the directory and are not deleted.
> > > My understanding is that an
> > > optimize will create new index files and all existing index files should be
> > > deleted.  Is this correct?
> > > We are running Lucene 1.4.2 on Windows.
> > > Any help is appreciated.  Thanks!



Re: Optimize not deleting all files

2005-02-04 Thread Patricio Keilty
Hi Otis, I tried version 1.4.3 without success; old index files still
remain in the directory.
I also tried not calling optimize() and still get the same behaviour, so
maybe our problem is not related to the optimize() call at all.

--p
Otis Gospodnetic wrote:
Get and try Lucene 1.4.3.  One of the older versions had a bug that was
not deleting old index files.
Otis
--- [EMAIL PROTECTED] wrote:

Hi,
When I run an optimize in our production environment, old index are
left in the directory and are not deleted.  

My understanding is that an
optimize will create new index files and all existing index files
should be
deleted.  Is this correct?
We are running Lucene 1.4.2 on Windows.  

Any help is appreciated.  Thanks!


Re: Optimize not deleting all files

2005-02-04 Thread Steven Rowe
Hi Patricio,
Is it the case that the "old index files" are not removed from session to
session, or only within the same session?  The discussion below pertains to
the latter case, that is, where the "old index files" are used in the same
process as the files replacing them.
I was having a similar problem, and tracked the source down to IndexReaders
not being closed in my application.  

As far as I can tell, in order for IndexReaders to present a consistent
view of an index while changes are being made to it, read-only copies
of the index are kept around until all IndexReaders using them are
closed.  If any IndexReaders are open on the index, IndexWriters first
make a copy, then operate on the copy.  If you track down all of these
open IndexReaders and close them before optimization, all of the
"old index files" should be deleted.  (Lucene Gurus, please correct this
if I have misrepresented the situation).
In my application, I had a bad interaction between IndexReader caching,
garbage collection, and incremental indexing, in which a new IndexReader
was being opened on an index after each indexing increment, without
closing the already-opened IndexReaders.
On Windows, operating-system level file locking caused by IndexReaders
left open was disallowing index re-creation, because the IndexWriter
wasn't allowed to delete the index files opened by the abandoned
IndexReaders.
In short, if you need to write to an index more than once in a single
session, be sure to keep careful track of your IndexReaders.
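The file-pinning behaviour described above can be sketched with a toy model (plain Python; a single reader, invented file names, and my own simplification rather than Lucene's real bookkeeping):

```python
class ToyIndexDir:
    """Toy model of an index directory where optimize() swaps in new
    segment files, but files held open by a reader cannot be deleted
    (as on Windows).  Single reader only; NOT Lucene's real logic."""
    def __init__(self):
        self.live = {"_1.cfs", "_2.cfs"}   # segments of the current index
        self.on_disk = set(self.live)      # files present in the directory
        self.held = set()                  # files pinned by an open reader

    def open_reader(self):
        view = frozenset(self.live)        # reader sees a fixed snapshot
        self.held |= view
        return view

    def close_reader(self, view):
        self.held -= view
        self._delete_unused()              # now the old files can go

    def optimize(self):
        self.live = {"_opt.cfs"}           # everything merged into one segment
        self.on_disk |= self.live
        self._delete_unused()

    def _delete_unused(self):
        # a file may be removed only if it is neither live nor held open
        self.on_disk -= {f for f in self.on_disk
                         if f not in self.live and f not in self.held}

idx = ToyIndexDir()
view = idx.open_reader()
idx.optimize()
print(sorted(idx.on_disk))   # old files linger: the reader still holds them
idx.close_reader(view)
print(sorted(idx.on_disk))   # only the merged segment remains
```

The first print shows all three files still on disk after optimize(); only after close_reader() does the directory shrink to the merged segment, which matches the "close all readers before expecting cleanup" advice.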
Hope it helps,
Steve
Patricio Keilty wrote:
Hi Otis, tried version 1.4.3 without success, old index files still 
remain in the directory.
Also tried not calling optimize(), and still getting the same behaviour, 
maybe our problem is not related to optimize() call at all.

--p
Otis Gospodnetic wrote:
Get and try Lucene 1.4.3.  One of the older versions had a bug that was
not deleting old index files.
Otis
--- [EMAIL PROTECTED] wrote:

Hi,
When I run an optimize in our production environment, old index are
left in the directory and are not deleted. 
My understanding is that an
optimize will create new index files and all existing index files
should be
deleted.  Is this correct?

We are running Lucene 1.4.2 on Windows. 

Any help is appreciated.  Thanks!



Re: Optimize not deleting all files

2005-02-04 Thread yahootintin . 1247688
Yes, I believe my problem is related to open IndexReaders.  The issue is
that we can't shut down our live search application while we wait for a
10-minute optimization.  Search is a major part of our application, and
removing the feature would significantly affect our end users (even though
we run the optimize during the night).

After the optimize is completed, I close and re-open the readers so they
start reading from the new index files.  I'm thinking of adding code to
delete all the old files at that point.  I presume they will no longer be
locked.

--- Lucene Users List wrote:

> Is it the case that the "old index files" are not removed from session to
> session, or only within the same session?  The discussion below pertains to
> the latter case, that is, where the "old index files" are used in the same
> process as the files replacing them.
>
> I was having a similar problem, and tracked the source down to IndexReaders
> not being closed in my application.
>
> As far as I can tell, in order for IndexReaders to present a consistent
> view of an index while changes are being made to it, read-only copies
> of the index are kept around until all IndexReaders using them are
> closed.  If any IndexReaders are open on the index, IndexWriters first
> make a copy, then operate on the copy.  If you track down all of these
> open IndexReaders and close them before optimization, all of the
> "old index files" should be deleted.  (Lucene Gurus, please correct this
> if I have misrepresented the situation).
>
> In my application, I had a bad interaction between IndexReader caching,
> garbage collection, and incremental indexing, in which a new IndexReader
> was being opened on an index after each indexing increment, without
> closing the already-opened IndexReaders.
>
> On Windows, operating-system level file locking caused by IndexReaders
> left open was disallowing index re-creation, because the IndexWriter
> wasn't allowed to delete the index files opened by the abandoned
> IndexReaders.
>
> In short, if you need to write to an index more than once in a single
> session, be sure to keep careful track of your IndexReaders.
>
> Hope it helps,
> Steve




Re: optimize fails with "Negative seek offset"

2004-05-12 Thread Sascha Ottolski
Hi,

sorry for following up my own mail, but since no one responded so far, I
thought the stack trace might be of interest. The following exception
always occurs when trying to optimize one of our indexes, which had always
gone OK for about a year now. I just tried with 1.4-rc3, but with the same
result:

java.io.IOException: Negative seek offset
at java.io.RandomAccessFile.seek(Native Method)
at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:405)
at org.apache.lucene.store.InputStream.readBytes(InputStream.java:61)
at 
org.apache.lucene.index.CompoundFileReader$CSInputStream.readInternal(CompoundFileReader.java:222)
at org.apache.lucene.store.InputStream.refill(InputStream.java:158)
at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:63)
at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:238)
at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:185)
at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:92)
at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:483)
at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:362)
at LuceneRPCHandler.optimize(LuceneRPCHandler.java:398)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:324)
at org.apache.xmlrpc.Invoker.execute(Invoker.java:168)
at org.apache.xmlrpc.XmlRpcWorker.invokeHandler(XmlRpcWorker.java:123)
at org.apache.xmlrpc.XmlRpcWorker.execute(XmlRpcWorker.java:185)
at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:151)
at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:139)
at org.apache.xmlrpc.WebServer$Connection.run(WebServer.java:773)
at org.apache.xmlrpc.WebServer$Runner.run(WebServer.java:656)
at java.lang.Thread.run(Thread.java:534)


Any hint would be greatly appreciated.


Thanks,

Sascha

-- 
Gallileus - the power of knowledge

Gallileus GmbHhttp://www.gallileus.info/

Pintschstraße 16  fon +49-(0)30-41 93 43 43
10249 Berlin  fax +49-(0)30-41 93 43 45
Germany







Re: optimize fails with "Negative seek offset"

2004-05-12 Thread Anthony Vito
Looks like the same error I got when I tried to use Lucene version 1.3 to
search an index I had created with Lucene version 1.4. The versions are not
forward compatible. Did you by chance create the index with version 1.4,
and are you now searching with version 1.3? It's easy to get the
dependencies out of sync for different apps, which is what happened to me.

-vito

On Wed, 2004-05-12 at 04:59, Sascha Ottolski wrote:
> Hi,
> 
> sorry for following up my own mail, but since no one responded so
> far, I thought the stacktrace might be of interest. The following
> exception always occurs when trying to optimize one of our indices,
> which had always gone OK for about a year now. I just tried with 1.4-rc3,
> but with the same result:
> 
> java.io.IOException: Negative seek offset
> at java.io.RandomAccessFile.seek(Native Method)
> at org.apache.lucene.store.FSInputStream.readInternal(FSDirectory.java:405)
> at org.apache.lucene.store.InputStream.readBytes(InputStream.java:61)
> at 
> org.apache.lucene.index.CompoundFileReader$CSInputStream.readInternal(CompoundFileReader.java:222)
> at org.apache.lucene.store.InputStream.refill(InputStream.java:158)
> at org.apache.lucene.store.InputStream.readByte(InputStream.java:43)
> at org.apache.lucene.store.InputStream.readVInt(InputStream.java:83)
> at org.apache.lucene.index.FieldsReader.doc(FieldsReader.java:63)
> at org.apache.lucene.index.SegmentReader.document(SegmentReader.java:238)
> at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:185)
> at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:92)
> at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:483)
> at org.apache.lucene.index.IndexWriter.optimize(IndexWriter.java:362)
> at LuceneRPCHandler.optimize(LuceneRPCHandler.java:398)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:324)
> at org.apache.xmlrpc.Invoker.execute(Invoker.java:168)
> at org.apache.xmlrpc.XmlRpcWorker.invokeHandler(XmlRpcWorker.java:123)
> at org.apache.xmlrpc.XmlRpcWorker.execute(XmlRpcWorker.java:185)
> at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:151)
> at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:139)
> at org.apache.xmlrpc.WebServer$Connection.run(WebServer.java:773)
> at org.apache.xmlrpc.WebServer$Runner.run(WebServer.java:656)
> at java.lang.Thread.run(Thread.java:534)
> 
> 
> Any hint would be greatly appreciated.
> 
> 
> Thanks,
> 
> Sascha
> 



-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: optimize fails with "Negative seek offset"

2004-05-12 Thread Sascha Ottolski
Am Mittwoch, 12. Mai 2004 18:54 schrieb Anthony Vito:
> Looks like the same error I got when I tried to use Lucene version
> 1.3 to search an index I had created with Lucene version 1.4. The
> versions are not forward compatible. Did you by chance create the
> index with version 1.4 and are now searching with version 1.3? It's
> easy to get the dependencies out of sync for different apps, which is
> what happened to me.
>
> -vito

Hi vito,

thanks for the reply, but no, we have only upgraded so far and never 
downgraded. More than that, the failing index was completely rebuilt 
with 1.4-rc2 only two weeks ago. The problem started a short time 
afterwards (but not immediately).


Greets,

Sascha




Re: optimize() is not merging into single file? !!!!!!

2004-06-02 Thread iouli . golovatyi
I rechecked the results. Here they are:

IndexWriter compiled with v.1.4-rc2 generates after optimization:
_36d.cfs   3779 kb

IndexWriter compiled with v.1.4-rc3 generates after optimization:

_36d.cfs   3778 kb
_36c.cfs     31 kb
_35z.cfs     14 kb
_35o.cfs     14 kb
.
etc.

In both cases the segments file contains _36d.cfs.

Looks like the new version just "forgets" to clean up.






Iouli Golovatyi/X/GP/[EMAIL PROTECTED]
01.06.2004 17:22
Please respond to "Lucene Users List"

To: <[EMAIL PROTECTED]>
cc: 
Subject: optimize() is not merging into single file?



I optimize and close the index after that, but don't get just one .cfs 
file as promised in the docs. Instead I see several small segments and 
a couple of big ones.
This weird behavior seems to have started when I changed from v 1.4-rc2 
to 1.4-rc3.
Before, I got just one .cfs segment. Any ideas?
Thanks in advance
J.
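A quick way to check for leftover compound segment files like the ones reported above is simply to list the index directory. This is a self-contained diagnostic sketch (the directory path is supplied by the caller), not part of Lucene itself:

```java
import java.io.File;
import java.util.Arrays;

public class ListSegments {

    // Names of all compound segment files (*.cfs) in dir, sorted.
    // After a fully successful optimize() there should be exactly one.
    static String[] listCompoundFiles(File dir) {
        String[] names = dir.list((d, name) -> name.endsWith(".cfs"));
        if (names == null) {
            names = new String[0]; // dir missing or not a directory
        }
        Arrays.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // args[0] is the index directory to inspect
        for (String name : listCompoundFiles(new File(args[0]))) {
            System.out.println(name);
        }
    }
}
```

If several .cfs files remain after optimize() and close(), those are the stale segments this thread describes; deleting them by hand is risky while readers may still have them open.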



RE : optimize() is not merging into single file? !!!!!!

2004-06-02 Thread Rasik Pandey
Hello,

I am running a two-week-old version of Lucene from the CVS HEAD and seeing 
the same behavior.

Regards,
RBP 

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 2 June 2004 13:53
> To: Lucene Users List
> Subject: Re: optimize() is not merging into single file? !!
> 
> I rechecked the results. Here they are:
> 
> IndexWriter compiled with v.1.4-rc2 generates after
> optimization:
> _36d.cfs   3779 kb
> 
> IndexWriter compiled with v.1.4-rc3 generates after
> optimization:
> 
> _36d.cfs   3778 kb
> _36c.cfs     31 kb
> _35z.cfs     14 kb
> _35o.cfs     14 kb
> .
> etc.
> 
> In both cases the segments file contains _36d.cfs.
> 
> Looks like the new version just "forgets" to clean up.
> 
> 
> 
> 
> 
> 
> Iouli Golovatyi/X/GP/[EMAIL PROTECTED]
> 01.06.2004 17:22
> Please respond to "Lucene Users List"
> 
> To: <[EMAIL PROTECTED]>
> cc:
> Subject: optimize() is not merging into single file?
> 
> I optimize and close the index after that, but don't get just
> one .cfs
> file as promised in the docs. Instead I see several small
> segments and a couple of big ones.
> This weird behavior seems to have started when I changed from
> v 1.4-rc2 to
> 1.4-rc3.
> Before, I got just one .cfs segment. Any ideas?
> Thanks in advance
> J.




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]