index.20100127044500/ is a temp directory. It should have been cleaned
up if there was no problem during replication (check the logs to see
whether there was one). If there was a problem, the temp directory is
used as the new index directory and the old one is no longer used. At
any given point only one directory is used for the index. Check the
replication dashboard to see which one it is. Everything else can be
deleted.
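
For anyone who wants to script that check, here is a rough Python sketch
that reports which directory index.properties points at and which
index* directories are merely candidates for cleanup. It assumes
index.properties holds a line of the form index=<directory name> and
that plain index/ is in use when the file is absent; the function names
are made up for illustration. It deletes nothing, and the replication
dashboard should still be the final word before removing anything.

```python
import os


def read_current_index_name(data_dir):
    """Return the directory name recorded in index.properties.

    Assumption: the file contains a line like "index=index.20100127044500".
    Falls back to "index" when the file or the key is missing.
    """
    props_path = os.path.join(data_dir, "index.properties")
    if not os.path.isfile(props_path):
        return "index"
    with open(props_path, encoding="latin-1") as fh:
        for line in fh:
            line = line.strip()
            # Skip blank lines and properties-file comments.
            if not line or line.startswith(("#", "!")):
                continue
            key, sep, value = line.partition("=")
            if sep and key.strip() == "index":
                return value.strip()
    return "index"


def classify_index_dirs(data_dir):
    """Return (current, stale): the directory in use and the other index* dirs."""
    current = read_current_index_name(data_dir)
    candidates = sorted(
        name
        for name in os.listdir(data_dir)
        if (name == "index" or name.startswith("index."))
        and os.path.isdir(os.path.join(data_dir, name))
    )
    stale = [name for name in candidates if name != current]
    return current, stale
```

Calling classify_index_dirs on the data directory returns the name to
keep plus a list of directories to double-check against the dashboard.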

On Fri, Jan 29, 2010 at 6:03 AM, mark angelillo <li...@snooth.com> wrote:
> Thanks, Otis. Responses inline.
>
>
>>> Hi,
>>>
>>> We're using the new replication and it's working pretty well. There's one
>>> detail I'd like to get some more information about.
>>>
>>> As the replication works, it creates versions of the index in the data
>>> directory. Originally we had index/, but now there are dated versions
>>> such as index.20100127044500/, which are the replicated versions.
>>>
>>> Each copy is sized in the vicinity of 65G. With our current hard drive
>>> it's fine to have two around, but 3 gets a little dicey. Sometimes we're
>>> finding that the replication doesn't always clean up after itself. I would
>>> like to understand this better, or to not have this happen. It could be a
>>> configuration issue.
>>>
>>> Some more specific questions:
>>>
>>> - Is it safe to remove the index/ directory (that doesn't have the date
>>> on it)? I think I tried this once and the whole thing broke, however maybe
>>> something else was wrong at the time.
>>
>> No, that's the real, live index, you don't want to remove that one.
>
>
> Yeah... I tried it once and remember things breaking.
>
> However nothing in this directory has been modified for over a week (since
> the last replication initialization). And I'm still sitting on 130GB of data
> for what is only 65GB on the master
>
>
>
>>
>>> - Is there a way to know which one is the current one? (I'm looking at
>>> the file index.properties, and it seems to be correct, but sometimes
>>> there's a newer version in the directory, which later is removed)
>>
>> I think the "index" one is always current, no?  If not, I imagine the
>> admin replication page will tell you, or even the Statistics page.
>> e.g.
>> reader :
>>  SolrIndexReader{this=46a55e,r=readonlysegmentrea...@46a55e,segments=1}
>> readerDir :
>>  org.apache.lucene.store.NIOFSDirectory@/mnt/solrhome/cores/foo/data/index
>
>
> reader :
> SolrIndexReader{this=5c3aef1,r=readonlydirectoryrea...@5c3aef1,refCnt=1,segments=9}
> readerDir :
> org.apache.lucene.store.NIOFSDirectory@/home/solr/solr_1.4/solr/data/index.20100127044500
>
>
>
>
>>
>>> - Could it be that the index does not finish replicating in the poll
>>> interval I give it? What happens if, say, there's a poll interval X and
>>> replicating the index happens to take longer than X sometimes? (Our current
>>> poll interval is 45 minutes, and every time I'm watching it it completes
>>> in time.)

You can keep a very small pollInterval and it is OK. If a replication
is in progress, no new replication is initiated until the current one
completes.
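
To illustrate why a short pollInterval is harmless, here is a toy Python
sketch of that "skip if busy" behaviour. It is not Solr's actual code:
the class and the fetch_index callback are invented for the example, and
a non-blocking lock stands in for whatever guard the slave really uses.

```python
import threading


class ReplicationPoller:
    """Toy model: a poll is skipped while a replication is still running."""

    def __init__(self, fetch_index):
        # fetch_index is a hypothetical callable that copies the index.
        self._fetch_index = fetch_index
        self._in_progress = threading.Lock()

    def poll(self):
        """Return True if a replication ran, False if one was already running."""
        # Non-blocking acquire: give up at once instead of queueing behind
        # the replication that currently holds the lock.
        if not self._in_progress.acquire(blocking=False):
            return False
        try:
            self._fetch_index()
            return True
        finally:
            self._in_progress.release()
```

With this shape, a timer that fires every few seconds simply gets False
back for as long as a long copy is under way, so polls never pile up.
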
>>
>>
>> I think only 1 replication will/should be happening at a time.
>
> Whew, that's comforting.
>
>



-- 
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com
