We are currently implementing a simple/dumb solution which has a very
basic CRC check for chunks.
We shall look into this and see how we can optimize our index copy.
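
For illustration, here is a minimal sketch of what such a per-chunk CRC check could look like. It is not the code we are actually writing: the class and method names (ChunkCrc, chunkCrcs, corruptedChunks) are made up for this mail, and it works on an in-memory byte array for brevity, whereas a real implementation would stream the file from disk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical helper, not the actual replication code: computes one CRC32
// per fixed-size chunk so that a receiver can tell which chunks to re-fetch.
class ChunkCrc {

    // CRC32 of data[offset, offset + length).
    static long crc(byte[] data, int offset, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, offset, length);
        return crc.getValue();
    }

    // One checksum per chunk; the last chunk may be shorter than chunkSize.
    static List<Long> chunkCrcs(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Long> result = new ArrayList<>();
        for (int off = 0; off < data.length; off += chunkSize) {
            result.add(crc(data, off, Math.min(chunkSize, data.length - off)));
        }
        return result;
    }

    // Indexes of chunks whose checksum differs from what the master reported.
    // A chunk that is missing on the receiving side also counts as corrupted.
    static List<Integer> corruptedChunks(List<Long> expected, List<Long> actual) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < expected.size(); i++) {
            if (i >= actual.size() || !expected.get(i).equals(actual.get(i))) {
                bad.add(i);
            }
        }
        return bad;
    }
}
```

The receiver would compare the list reported by the master with the list computed over the downloaded parts, and re-fetch only the chunk numbers returned by corruptedChunks.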
--Noble

On Thu, May 1, 2008 at 12:09 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 29, 2008 at 2:02 PM, Noble Paul നോബിള്‍ नोब्ळ् <[EMAIL PROTECTED]> wrote:
>
> > Solrj/BinaryResponseWriter should be used for calls to get metadata on
>  >  the index. The actual index transfer must be done over simple http. I
>  >  may propose a Simple BinaryRawResponseWriter for that.
>  >
>  >  Sending a huge file in a single response is definitely a bad idea. It
>  >  should be sent in chunks of, say, 10MB or so (configurable). It must
>  >  also have some mechanism to generate checksums for the whole file
>  >  and, if possible, for individual chunks.
>
>  Checksumming is done by TCP (and by disk drives), so it's not strictly
>  necessary to maintain integrity.
>  Might be a nice option for debugging though.
>
>
>  >  A solution can look like this
>  >  * getFileList: get the names of index files and their checksums
>  >  (NamedList response)
>  >  * getFilePart: for 1...n of configured chunk size (simple binary output/http)
>  >  * join parts 1..n and compare checksums
>  >  * If it passes, keep the file and delete the parts
>  >  * If it fails, get checksums for individual chunks (NamedList response)
>  >  * and re-fetch the corrupted chunks (simple binary output/http)
>  >
>  >  Once all the files are downloaded and the checksums match,
>  >  trigger a snapinstall.
>  >
>  >  The details of the snapinstall on Windows (with or without hard links)
>  >  are still a bit fuzzy. But in the worst case, a copy should be OK
>  >  (better than having no replication at all).
>
>  Now that Lucene has lockless commits and changes almost no files,
>  there are perhaps other options that would be better for Windows.
>
>  For the Lucene index, we might be able to avoid hard links altogether
>  and only copy new files.
>  We could keep old segments from being removed while in use with custom
>  delete policies.
>  See SnapshotDeletionPolicy for example:
>  http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/SnapshotDeletionPolicy.html
>
>  -Yonik
>
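
The "only copy new files" idea quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions: NewFileSelector and filesToFetch are hypothetical names, and the sketch assumes index files are write-once, so that a file with the same name has the same content. The few files that do change would need to be compared by size or checksum instead.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: decides which index files a slave still has to copy.
class NewFileSelector {

    // Files the master has that are missing locally. Assumes files are
    // write-once, so an identical name implies identical content.
    static Set<String> filesToFetch(Set<String> masterFiles, Set<String> localFiles) {
        Set<String> missing = new TreeSet<>(masterFiles);
        missing.removeAll(localFiles);
        return missing;
    }
}
```

The master's file list would come from something like the getFileList call proposed earlier in the thread, and the local list from a plain listing of the index directory.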
