Thanks for the reply, Asif. We have already tried removing the optimization
step. Unfortunately, the commit command alone causes identical behaviour.
Is there anything else that we are missing?


Asif Rahman wrote:
> 
> Tushar:
> 
> Is it necessary to do the optimize on each iteration?  When you run an
> optimize, the entire index is rewritten, so the new index files share no
> hard links with earlier snapshots and each snapshot will consume the full
> amount of space on your disk.
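The hard-link behaviour described above is easy to see outside Solr. A minimal Python sketch, with invented file names and nothing Solr-specific:

import os

# An "index file" plus a hard link to it in a snapshot directory.
os.makedirs("snapshot", exist_ok=True)
with open("segment_1", "w") as f:
    f.write("x" * 1024)
os.link("segment_1", "snapshot/segment_1")

# Both names point at the same inode, so the snapshot adds no extra space.
print(os.stat("segment_1").st_ino == os.stat("snapshot/segment_1").st_ino)  # True
print(os.stat("segment_1").st_nlink)                                        # 2

# An optimize rewrites the index into brand-new files (new inodes), so nothing
# is shared with the old snapshot and the disk pays for both copies.
with open("segment_2", "w") as f:
    f.write("y" * 2048)
print(os.stat("segment_2").st_ino == os.stat("snapshot/segment_1").st_ino)  # False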
> 
> Asif
> 
> On Thu, Jul 9, 2009 at 3:26 AM, tushar kapoor <
> tushar_kapoor...@rediffmail.com> wrote:
> 
>>
>> What I gather from this discussion is -
>>
>> 1. Snapshots are always hard links, not actual files, so they cannot
>> possibly consume the same amount of space as the actual index.
>> 2. Snapshots contain hard links to the existing docs plus the delta docs
>> (a rough sketch of this is below).
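If that understanding is right, taking a snapshot amounts to a hard-link copy of the index directory. A minimal Python sketch of the idea (paths and the timestamped directory name are only illustrative; the real snapshooter is a shell script):

import os
import time

def take_snapshot(index_dir, snapshot_root):
    # New snapshot directory named with a timestamp, e.g. snapshot.20090709103000
    snap_dir = os.path.join(snapshot_root,
                            "snapshot." + time.strftime("%Y%m%d%H%M%S"))
    os.makedirs(snap_dir)
    # Hard-link every index file; no data is copied, so the snapshot costs
    # almost nothing until the linked files are rewritten by Solr.
    for name in os.listdir(index_dir):
        os.link(os.path.join(index_dir, name), os.path.join(snap_dir, name))
    return snap_dir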
>>
>> We are facing a situation wherein the snapshots occupy the same space as
>> the actual index, thus violating the first point.
>> We have a batch processing scheme for refreshing indexes; the steps we
>> follow (sketched in code after the list) are -
>>
>> 1. Delete 200 documents in one go.
>> 2. Do an optimize.
>> 3. Re-add the 200 documents deleted earlier.
>> 4. Do a commit.
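For reference, a minimal sketch of that loop, assuming the XML update handler at a hypothetical http://localhost:8983/solr/update and made-up document fields; it only shows the shape of each batch, not our actual code:

import urllib.request

SOLR_UPDATE = "http://localhost:8983/solr/update"  # hypothetical URL

def post(xml):
    # POST one XML update message to Solr's update handler.
    req = urllib.request.Request(SOLR_UPDATE, data=xml.encode("utf-8"),
                                 headers={"Content-Type": "text/xml"})
    return urllib.request.urlopen(req).read()

def refresh_batch(docs):
    # 1. Delete the batch of ~200 documents.
    for d in docs:
        post("<delete><id>%s</id></delete>" % d["id"])
    # 2. Optimize (the step we have since tried removing).
    post("<optimize/>")
    # 3. Re-add the same documents (field values assumed XML-safe here).
    for d in docs:
        fields = "".join('<field name="%s">%s</field>' % (k, v) for k, v in d.items())
        post("<add><doc>%s</doc></add>" % fields)
    # 4. Commit; a snapshot gets taken after this in our setup.
    post("<commit/>")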
>>
>> This process is repeated for around 160,000 documents, i.e. 800 times, and
>> by the end of it we have 800 snapshots.
>>
>> The actual index is about 200 MB, yet each of the 800 snapshots is also
>> around 200 MB. In effect this process consumes around 160 GB of disk space,
>> which is causing a lot of pain right now.
>>
>> My concerns are: Is our understanding of the snapshooter correct? Should
>> this massive space consumption be happening at all? Are we missing
>> something critical?
>>
>> Regards,
>> Tushar.
>>
>> Shalin Shekhar Mangar wrote:
>> >
>> > On Sat, Apr 18, 2009 at 1:06 PM, Koushik Mitra
>> > <koushik_mi...@infosys.com>wrote:
>> >
>> >> Ok....
>> >>
>> >> If these are hard links, then where does the index data get stored?
>> >> Those must be getting stored somewhere in the file system.
>> >>
>> >
>> > Yes, of course they are stored on disk. The hard links are created from
>> > the actual files inside the index directory. When those older files are
>> > deleted by Solr, they are still left on the disk if at least one hard
>> > link to that file exists. If you are looking for how to clean old
>> > snapshots, you could use the snapcleaner script.
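That point is easy to demonstrate with plain files; a minimal Python sketch with invented names (nothing Solr-specific) showing that deleting the original name does not free the data while a hard link to it remains:

import os

with open("old_segment", "w") as f:
    f.write("x" * 1024)
os.link("old_segment", "snapshot_old_segment")   # the snapshot's hard link

os.remove("old_segment")                         # Solr deletes the old index file
# The bytes are still on disk and readable through the remaining link; only
# removing the snapshot link (e.g. via snapcleaner) actually frees the space.
print(len(open("snapshot_old_segment").read()))  # 1024
print(os.stat("snapshot_old_segment").st_nlink)  # 1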
>> >
>> > Is that what you wanted to do?
>> >
>> > --
>> > Regards,
>> > Shalin Shekhar Mangar.
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Create-incremental-snapshot-tp23109877p24405434.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Asif Rahman
> Lead Engineer - NewsCred
> a...@newscred.com
> http://platform.newscred.com
> 
> 
:-((
-- 
View this message in context: 
http://www.nabble.com/Create-incremental-snapshot-tp23109877p24447593.html
Sent from the Solr - User mailing list archive at Nabble.com.
