Thanks for the reply, Asif. We have already tried removing the optimization step. Unfortunately, the commit command alone causes identical behaviour. Is there anything else that we are missing?
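To make the hard-link question concrete, here is a minimal sketch in Python of how shared versus duplicated space could be measured. It assumes snapshots are built from hard links, as described in the quoted thread. The directory and file names (`index`, `snapshot.1`, `segment_a`, `segment_b`) are hypothetical and are not Solr's real layout. It counts each inode once, so a hard-linked file adds nothing, while a rewritten file adds its full size.

```python
import os
import tempfile


def unique_disk_usage(paths):
    """Sum file sizes under the given directories, counting each inode once.

    Hard links to the same file share a (device, inode) pair, so they are
    only counted the first time they are seen.
    """
    seen = set()
    total = 0
    for root_dir in paths:
        for dirpath, _dirnames, filenames in os.walk(root_dir):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    total += st.st_size
    return total


def demo():
    """Illustration only: hypothetical names, not the snapshooter's real logic."""
    with tempfile.TemporaryDirectory() as base:
        index = os.path.join(base, "index")
        snap = os.path.join(base, "snapshot.1")
        os.mkdir(index)
        os.mkdir(snap)

        # One 1000-byte "segment" file in the index.
        seg = os.path.join(index, "segment_a")
        with open(seg, "wb") as f:
            f.write(b"x" * 1000)

        # The snapshot holds a hard link to it, so no extra space is used.
        os.link(seg, os.path.join(snap, "segment_a"))
        shared = unique_disk_usage([index, snap])

        # Simulate the index being rewritten: the old file is removed from the
        # index and a new one is written. The snapshot's link keeps the old
        # data on disk, so the total usage doubles.
        os.remove(seg)
        with open(os.path.join(index, "segment_b"), "wb") as f:
            f.write(b"y" * 1000)
        after = unique_disk_usage([index, snap])

        return shared, after


if __name__ == "__main__":
    shared, after = demo()
    print("usage while files are shared:", shared)
    print("usage after the index is rewritten:", after)
```

If every file in the index is replaced between two snapshots, each snapshot ends up holding the only remaining link to its own copy of the data, which would match the roughly 800 x 200 Mb = 160 Gb we are seeing. Running `unique_disk_usage` over the real index and snapshot directories would show how much is actually shared.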

Asif Rahman wrote:
>
> Tushar:
>
> Is it necessary to do the optimize on each iteration? When you run an
> optimize, the entire index is rewritten. Thus each index file can have at
> most one hard link and each snapshot will consume the full amount of space
> on your disk.
>
> Asir
>
> On Thu, Jul 9, 2009 at 3:26 AM, tushar kapoor <
> tushar_kapoor...@rediffmail.com> wrote:
>
>>
>> What I gather from this discussion is -
>>
>> 1. Snapshots are always hard links and not actual files so they cannot
>> possibly consume the same amount of space.
>> 2. Snapshots contain hard links to existing docs + delta docs.
>>
>> We are facing a situation wherein the snapshot occupies the same space as
>> the actual indexes thus violating the first point.
>> We have a batch processing scheme for refreshing indexes. the steps we
>> follow are -
>>
>> 1. Delete 200 documents in one go.
>> 2. Do an optimize.
>> 3. Create the 200 documents deleted earlier.
>> 4. Do a commit.
>>
>> This process continues for around 160,000 documents i.e. 800 times and by
>> the end of it we have 800 snapshots.
>>
>> The size of actual indexes is 200 Mb and remarkably all the 800 snapshots
>> are of size around 200 Mb each. In effect this process consumes around
>> 160 Gb space on our disks. This is causing a lot of pain right now.
>>
>> My concern are - Is our understanding of the snapshooter correct ? Should
>> this massive space consumption be happening at all ? Are we missing
>> something critical ?
>>
>> Regards,
>> Tushar.
>>
>> Shalin Shekhar Mangar wrote:
>> >
>> > On Sat, Apr 18, 2009 at 1:06 PM, Koushik Mitra
>> > <koushik_mi...@infosys.com>wrote:
>> >
>> >> Ok....
>> >>
>> >> If these are hard links, then where does the index data get stored?
>> >> Those must be getting stored somewhere in the file system.
>> >>
>> >
>> > Yes, of course they are stored on disk. The hard links are created from
>> > the actual files inside the index directory. When those older files are
>> > deleted by Solr, they are still left on the disk if at least one hard
>> > link to that file exists. If you are looking for how to clean old
>> > snapshots, you could use the snapcleaner script.
>> >
>> > Is that what you wanted to do?
>> >
>> > --
>> > Regards,
>> > Shalin Shekhar Mangar.
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Create-incremental-snapshot-tp23109877p24405434.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
> --
> Asif Rahman
> Lead Engineer - NewsCred
> a...@newscred.com
> http://platform.newscred.com
>

:-((

--
View this message in context: http://www.nabble.com/Create-incremental-snapshot-tp23109877p24447593.html
Sent from the Solr - User mailing list archive at Nabble.com.