I did not have good luck with super-high-speed polling. You probably
need to adjust the various parameters on both sides of the
replication.

Some sites (LinkedIn for example with Zoie) do not use replication.
They have all query servers do their own indexing, so that new content
will be available immediately. Network bandwidth is a silent killer of
distributed systems, and the update input text is generally smaller
than the binary update files.
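
On the Todo in the quoted copyAFile() below ("may be we should try a simple
copy if it fails"): here is a minimal sketch of what such a fallback could
look like. This is not Solr code. The class MoveWithFallback and the method
moveFile are hypothetical names, and it assumes Java 7+ for java.nio.file,
which Solr 1.4 itself does not rely on.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Solr: rename first, copy on failure.
public class MoveWithFallback {

    /**
     * Tries File#renameTo() first, as copyAFile() does. If the rename
     * fails, falls back to copying the bytes and deleting the source.
     * Returns false if the fallback also fails.
     */
    static boolean moveFile(File src, File dst) {
        if (src.renameTo(dst)) {
            return true;
        }
        try {
            // Fallback: plain copy, replacing any existing destination file.
            Files.copy(src.toPath(), dst.toPath(),
                    StandardCopyOption.REPLACE_EXISTING);
            // Remove the source only after the copy has completed.
            Files.delete(src.toPath());
            return true;
        } catch (IOException e) {
            // Unlike renameTo(), the exception says why the move failed;
            // a real implementation would log it here.
            return false;
        }
    }
}
```

A caller would use moveFile(indexFileInTmpDir, indexFileInIndex) in place of
the bare renameTo() call. Whether a copy would have succeeded in the case
reported below is unknown, since the cause of the rename failure was never
identified.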

On Thu, Jan 21, 2010 at 2:54 PM, Trey <solrt...@gmail.com> wrote:
> Unfortunately, when I went back to look at the logs this morning, the log
> file had been blown away... that puts a major damper on my debugging
> capabilities - so sorry about that.  As a double whammy, we optimize
> nightly, so the old index files have completely changed at this point.
>
> I do not remember seeing an exception / stack trace in the logs associated
> with the "SEVERE *Unable to move file*" entry, but we were grepping the
> logs, so if it was outputted onto another line it could have possibly been
> there.  I wouldn't really expect to see anything based upon the code in
> SnapPuller.java:
>
> /**
>   * Copy a file by the File#renameTo() method. If it fails, it is
>   * considered a failure
>   * <p/>
>   * Todo may be we should try a simple copy if it fails
>   */
>  private boolean copyAFile(File tmpIdxDir, File indexDir, String fname,
>      List<String> copiedfiles) {
>    File indexFileInTmpDir = new File(tmpIdxDir, fname);
>    File indexFileInIndex = new File(indexDir, fname);
>    boolean success = indexFileInTmpDir.renameTo(indexFileInIndex);
>    if (!success) {
>      LOG.error("Unable to move index file from: " + indexFileInTmpDir
>              + " to: " + indexFileInIndex);
>      for (String f : copiedfiles) {
>        File indexFile = new File(indexDir, f);
>        if (indexFile.exists())
>          indexFile.delete();
>      }
>      delTree(tmpIdxDir);
>      return false;
>    }
>    return true;
>  }
>
> In terms of whether this is an off case: this is the first occurrence of
> this I have seen in the logs.  We tried to replicate the conditions under
> which the exception occurred, but were unable.  I'll send along some more
> useful info if this happens again.
>
> In terms of the behavior we saw: It appears that a replication occurred and
> the "Unable to move file" error occurred.  As a result, it looks like the
> ENTIRE index was subsequently replicated again into a temporary directory
> (several times, over and over).
>
> The end result was that we had multiple full copies of the index in
> temporary index folders on the slave, and the original still couldn't be
> updated (the move to ./index wouldn't work).  Does Solr ever hold files open
> in a manner that would prevent a file in the index directory from being
> overwritten?
>
>
> 2010/1/21 Noble Paul നോബിള്‍ नोब्ळ् <noble.p...@corp.aol.com>
>
>> Is it a one-off case? Do you observe this frequently?
>>
>> On Thu, Jan 21, 2010 at 11:26 AM, Otis Gospodnetic
>> <otis_gospodne...@yahoo.com> wrote:
>> > It's hard to tell without poking around, but one of the first things I'd
>> > do would be to look for /home/solr/cores/core8/index.20100119103919/_6qv.fnm
>> > - does this file/dir really exist?  Or, rather, did it exist when the error
>> > happened?
>> >
>> > I'm not looking at the source code now, but is that really the only error
>> > you got?  No exception stack trace?
>> >
>> >  Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>> >
>> >
>> >
>> > ----- Original Message ----
>> >> From: Trey <solrt...@gmail.com>
>> >> To: solr-user@lucene.apache.org
>> >> Sent: Wed, January 20, 2010 11:54:43 PM
>> >> Subject: Replication Handler Severe Error: Unable to move index file
>> >>
>> >> Does anyone know what would cause the following error?:
>> >>
>> >> 10:45:10 AM org.apache.solr.handler.SnapPuller copyAFile
>> >>
>> >>      SEVERE: *Unable to move index file* from:
>> >> /home/solr/cores/core8/index.20100119103919/_6qv.fnm to:
>> >> /home/solr/cores/core8/index/_6qv.fnm
>> >> This occurred a few days back and we noticed that several full copies of the
>> >> index were subsequently pulled from the master to the slave, effectively
>> >> evicting our live index from RAM (the Linux OS cache), and killing our query
>> >> performance due to disk I/O contention.
>> >>
>> >> Has anyone experienced this behavior recently?  I found an old thread about
>> >> this error from early 2009, but it looks like it was patched almost a year
>> >> ago:
>> >>
>> >> http://old.nabble.com/%22Unable-to-move-index-file%22-error-during-replication-td21157722.html
>> >>
>> >>
>> >> Additional Relevant information:
>> >> -We are using the Solr 1.4 official release + a field collapsing patch from
>> >> mid December (which I believe should only affect query side, not indexing /
>> >> replication).
>> >> -Our Replication PollInterval for slaves checking the master is very small
>> >> (15 seconds)
>> >> -We have a multi-box distributed search with each box possessing multiple
>> >> cores
>> >> -We issue a manual (rolling) optimize across the cores on the master once a
>> >> day (occurred ~ 1-2 hours before the above timeline)
>> >> -maxWarmingSearchers is set to 1.
>> >
>> >
>>
>>
>>
>> --
>> -----------------------------------------------------
>> Noble Paul | Systems Architect| AOL | http://aol.com
>>
>



-- 
Lance Norskog
goks...@gmail.com
