We are currently implementing a simple/dumb solution that does a very basic CRC check on each chunk. We will look into this and see how we can optimize our index copy.
--Noble
On Thu, May 1, 2008 at 12:09 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Tue, Apr 29, 2008 at 2:02 PM, Noble Paul നോബിള് नोब्ळ्
> <[EMAIL PROTECTED]> wrote:
> > Solrj/BinaryResponseWriter should be used for calls to get metadata on
> > the index. The actual index transfer must be done over simple http. I
> > may propose a Simple BinaryRawResponseWriter for that.
> >
> > Sending a huge file in a single response is definitely a bad idea. It
> > should be sent in chunks of say 10MB or so (configurable). It must
> > also have some mechanism to generate checksums for the whole file
> > and, if possible, for chunks.
>
> checksumming is done by TCP (and by disk drives), so it's not strictly
> necessary to maintain integrity.
> Might be a nice option for debugging though.
>
> > A solution can look like this
> > * getFileList: get the names of index files and their checksums
> >   (NamedList response)
> > * getFilePart: for 1...n of configured chunk size (simple binary
> >   output/http)
> > * join parts 1..n and compare checksums
> > * If it passes, keep the file and delete the parts
> > * If it fails, get checksums for individual chunks (NamedList response)
> > * and re-fetch the corrupted chunks (simple binary output/http)
> >
> > Once all the files are downloaded and the checksums are matched,
> > trigger a snapinstall
> >
> > The details of the snapinstall in Windows (with or without hardlinks)
> > are still a bit fuzzy. But in the worst case scenario a copy should be ok
> > (better than having no replication at all).
>
> Now that Lucene has lockless commits and changes almost no files,
> there are perhaps other options that would be better for Windows.
>
> For the Lucene index, we might be able to avoid hard links altogether
> and only copy new files.
> We could keep old segments from being removed while in use with custom
> delete policies.
> See SnapshotDeletionPolicy for example:
> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//org/apache/lucene/index/SnapshotDeletionPolicy.html
>
> -Yonik
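
For reference, a rough sketch of the fetch flow quoted above: fetch the parts, compare the whole-file checksum, and re-fetch only the parts whose own checksum is wrong. It is in Python for brevity rather than Solr's Java. The callables fetch_part and fetch_part_crcs are hypothetical stand-ins for the proposed getFilePart call and the per-chunk checksum call; nothing here is an existing Solr API, and CRC32 is only assumed as the checksum.

```python
import zlib

CHUNK_SIZE = 10 * 1024 * 1024  # ~10MB as suggested in the thread; meant to be configurable


def crc(data):
    """CRC32 of a byte string as an unsigned 32-bit integer."""
    return zlib.crc32(data) & 0xFFFFFFFF


def split_chunks(data, chunk_size=CHUNK_SIZE):
    """Master side: cut a file's bytes into fixed-size parts (the last may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def fetch_file(fetch_part, n_parts, whole_crc, fetch_part_crcs):
    """Slave side: download all parts and verify them.

    fetch_part(i)      -> bytes of part i           (hypothetical transfer call)
    fetch_part_crcs()  -> list of per-part CRCs     (hypothetical metadata call)
    whole_crc          -> CRC of the complete file, as reported by the file list
    """
    parts = [fetch_part(i) for i in range(n_parts)]
    if crc(b"".join(parts)) == whole_crc:
        return b"".join(parts)

    # Whole-file check failed: ask for per-part checksums only now,
    # and re-fetch just the parts that do not match.
    expected = fetch_part_crcs()
    for i, part in enumerate(parts):
        if crc(part) != expected[i]:
            parts[i] = fetch_part(i)

    joined = b"".join(parts)
    if crc(joined) != whole_crc:
        raise IOError("checksum mismatch after re-fetching corrupted parts")
    return joined
```

The per-part checksums are requested only on failure, so the common case costs one checksum comparison per file. This sketch retries each bad part once; a real implementation would presumably want a bounded retry loop.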
