I believe adding a new document to an index takes constant time,
because each added document is written to disk as a new segment,
separate from the existing index segments.
The size of the index may come into play when this new segment has to
be merged with the existing segments, which happens roughly every
mergeFactor documents.
I have built indices with several hundred thousand documents, but never
noticed an increase in the time it takes to add a new document.  Maybe
the difference was too small to notice.  I don't know Lucene well
enough to stand behind this 100%, and I could certainly be wrong :(.
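
To make the merge behaviour concrete, here is a minimal Python sketch.
It is a simplified model and not Lucene's actual code: it assumes every
added document becomes a one-document segment, that a merge is
triggered as soon as merge_factor segments of equal size exist, and
that the cost of a merge is the number of documents rewritten. The
names simulate_adds and merge_factor are made up for this illustration.

```python
def simulate_adds(num_docs, merge_factor=10):
    """Return the cost of each add, counted in documents written.

    Simplified model (an assumption, not Lucene's real merge policy):
    each add writes a one-document segment, and whenever the newest
    merge_factor segments have the same size they are merged into one.
    """
    segments = []  # segment sizes, oldest first
    costs = []
    for _ in range(num_docs):
        segments.append(1)  # the new one-document segment
        cost = 1            # writing that segment
        # Merges can cascade: ten 1-doc segments become one 10-doc
        # segment, ten 10-doc segments become one 100-doc segment, ...
        while (len(segments) >= merge_factor
               and len(set(segments[-merge_factor:])) == 1):
            merged = sum(segments[-merge_factor:])
            del segments[-merge_factor:]
            segments.append(merged)
            cost += merged  # every document in the merge is rewritten
        costs.append(cost)
    return costs


if __name__ == "__main__":
    costs = simulate_adds(1000, merge_factor=10)
    print("adds that cost a single write:", costs.count(1))
    print("most expensive add:", max(costs))
    print("average cost per add:", sum(costs) / len(costs))
```

Under these assumptions, with merge_factor = 10 and 1000 documents, 900
of the adds cost exactly one document write, which would explain why
most adds look constant. The average cost per add is 4, that is
1 + log_10(1000), because each document is rewritten once per merge
level, so the amortized cost does grow slowly with the index size.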

Otis


--- Leo Galambos <[EMAIL PROTECTED]> wrote:
> > Adding a new document does not immediately modify an index, so the
> > time it takes to add a new document to an existing index is not
> > proportional to the index size.  It is constant.  The execution time
> > of optimize() is proportional to the index size, so you want to do
> > that only if you really need it.  The Lucene article on
> > http://www.onjava.com/ from March 5th describes this in more detail.
> 
> Otis,
> 
> I am not sure if anything about constants is constant in
> non-constant IR systems :-)
> 
> I think that the correct answer is O(t/k*(1+log_m(k))), where t is
> the time you need to create and write one monolithic segment of k
> documents, m is the merge factor you use, and k is the number of
> documents which are already in the index. As you can see, the
> function grows with k.
> 
> Can you explain to me why adding one document takes constant time?
> 
> Thank you
> 
> -g-
