Shawn,

Thanks for the detailed answer! I'll experiment with this information in hand. Maybe a second optimize, or just a dummy commit after the optimize, will get me past this. Neither is ideal, but maybe it's a "do it because it's running on Windows" workaround. If it is indeed a file-locking issue, I can probably work around it, since my indexing is scheduled at certain times rather than "live": I could run the optimize again soon afterward, or do a single commit, which also seems to fix the issue. Or just not optimize.
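For reference, the two workarounds discussed (optimize twice, or optimize followed by a dummy commit) can be sketched against Solr's XML update handler. This is a minimal sketch only; the localhost URL and single-core `/solr/update` path are assumptions for a default setup, not taken from the thread:

```python
import urllib.request

# Assumed default Solr URL; adjust host/port/core path for your master.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"

def post_update(xml_body: str, url: str = SOLR_UPDATE_URL) -> bytes:
    """POST a Solr XML update command (e.g. <optimize/> or <commit/>)."""
    req = urllib.request.Request(
        url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

# Workaround A: optimize twice, with no index changes in between, so the
# second pass only cleans up what the first one couldn't delete:
#   post_update("<optimize/>"); post_update("<optimize/>")
#
# Workaround B: optimize once, then issue a commit to reopen the searcher,
# closing the old reader so the stale segment files become deletable:
#   post_update("<optimize/>"); post_update("<commit/>")
```

Both variants rely on the scheduling guarantee Mike mentions: nothing else may modify the index between the two requests.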
Thanks,
Mike

On Wed, Mar 14, 2012 at 6:34 PM, Shawn Heisey <s...@elyograg.org> wrote:
> On 3/14/2012 2:54 PM, Mike Austin wrote:
>> The odd thing is that if I optimize the index, it doubles in size. If I
>> then add one more document to the index, it goes back down to half size?
>>
>> Is there a way to force this without needing to wait until another
>> document is added? Or do you have more information on what you think is
>> going on? I'm using a trunk version of Solr 4 from 9/12/2011 with a
>> master and two slaves. Everything besides this is working great!
>
> The not-very-helpful-but-true answer: don't run on Windows. I checked
> your prior messages to the list to verify that this is your environment.
> If you can control index updates so they don't happen at the same time
> as your optimizes, you can also get around this problem by doing the
> optimize twice. You would have to be absolutely sure that no changes are
> made to the index between the two optimizes, so the second one basically
> doesn't do anything except take care of the deletes.
>
> Nuts and bolts of why this happens: Solr keeps the old files open so the
> existing reader can continue to serve queries. That reader will not be
> closed until the last query completes, which may not happen until well
> after the new reader is completely online and ready. I assume that the
> delete attempt occurs as soon as the new index segments are completely
> online, before the old reader begins to close. I've not read the source
> code to find out.
>
> On Linux and other UNIX-like environments, you can delete files while
> they are open by a process. They continue to exist as unlinked inodes
> and take up space until those processes close them, at which point they
> are truly gone. On Windows, an attempt to delete an open file will fail,
> even if it's open read-only.
>
> There are probably a number of ways that this problem could be solved
> for Windows platforms.
> The simplest that I can think of, assuming it's even possible, would be
> to wait until the old reader is closed before attempting the segment
> deletion. That may not be possible; the information may not be available
> to the portion of code that does the deletion. There are a few things
> standing in the way of me fixing this problem myself: 1) I'm a beginning
> Java programmer. 2) I'm not familiar with the Solr code at all. 3) My
> interest level is low because I run on Linux, not Windows.
>
> Thanks,
> Shawn
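The POSIX delete-while-open behavior Shawn describes is easy to demonstrate. A minimal sketch (POSIX-only; this is exactly the step that fails with a sharing violation on Windows):

```python
import os
import tempfile

# On Linux and other UNIX-like systems, unlinking a file that a process
# still holds open succeeds; the data stays reachable via the open handle.
fd, path = tempfile.mkstemp()
os.write(fd, b"old segment data")
os.lseek(fd, 0, os.SEEK_SET)

os.unlink(path)                      # directory entry removed immediately
assert not os.path.exists(path)      # no longer visible by name...
assert os.read(fd, 16) == b"old segment data"  # ...but still readable
os.close(fd)                         # disk space reclaimed only now
```

This mirrors the Solr case: the "open handle" is the old index reader, and the disk space of optimized-away segments is only reclaimed once that reader closes.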