Hi,

It is the index folder. tlog is only a few MB.

I have analysed all the changes and found that only one field type in the 
schema was changed.

This field type in the non-cloud setup
 <fieldType name="text" class="solr.TextField" positionIncrementGap="100">

was changed to
 <fieldType name="text" class="solr.TextField" positionIncrementGap="100" 
termVectors="true" termPositions="true" termOffsets="true">

in the cloud setup, so that I can use the FastVectorHighlighter.

Could this change have doubled the index size?
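
One way to check this would be to compare how much disk space each kind of index file takes in the two setups. Below is a rough sketch, assuming a POSIX shell and that segments are not stored as compound (.cfs) files. The path in INDEX_DIR and the function name size_by_extension are placeholders, not anything from the actual installation. As far as I know, Lucene 4.1 stores term vectors in the .tvx, .tvd and .tvf files, so those would be the ones to watch.

```shell
#!/bin/sh
# Sum file sizes per extension in a Lucene index directory.
# INDEX_DIR is an assumed path; point it at the core's data/index folder.
INDEX_DIR="${1:-/var/solr/data/index}"

size_by_extension() {
  # "ls -l" prints the size in bytes in column 5 and the file name last.
  # The first line ("total ...") is skipped, as are names without a dot.
  ls -l "$1" | awk 'NR > 1 && $NF ~ /\./ {
      n = split($NF, parts, ".")
      bytes[parts[n]] += $5
    }
    END { for (ext in bytes) printf "%s %.0f\n", ext, bytes[ext] }' |
    sort -k2,2nr
}

# Only report when the directory exists, so a wrong path fails quietly.
if [ -d "$INDEX_DIR" ]; then
  size_by_extension "$INDEX_DIR"
fi
```

Running it against the index folder of the old core and of one shard should show whether the extra space sits in the term vector files or somewhere else.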

Thanks.
Alex.

 

 

-----Original Message-----
From: Jan Høydahl <jan....@cominvent.com>
To: solr-user <solr-user@lucene.apache.org>
Sent: Mon, Mar 4, 2013 2:24 pm
Subject: Re: solr cloud index size is too big


Can you tell whether it is the "index" folder itself that is that large, or 
does the size include the "tlog" transaction log folder?
If you have a huge transaction log, you need to send hard commits more often 
during indexing to flush the tlogs.
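
As a minimal sketch of what such a hard commit could look like from the command line: the host and collection name are taken from the original message and are assumptions about your setup, and commit_url and RUN_COMMIT are illustrative names only. The commit=true parameter on the update handler asks Solr for a hard commit.

```shell
#!/bin/sh
# Assumed values: adjust host, port and collection to your own setup.
SOLR_HOST="${SOLR_HOST:-slave:8983}"
COLLECTION="${COLLECTION:-mycollection}"

# Build the update URL that asks Solr for a hard commit.
commit_url() {
  printf 'http://%s/solr/%s/update?commit=true' "$1" "$2"
}

# Only contact Solr when explicitly asked; otherwise just show the request.
if [ "${RUN_COMMIT:-0}" = "1" ]; then
  curl -s "$(commit_url "$SOLR_HOST" "$COLLECTION")"
else
  echo "Would request: $(commit_url "$SOLR_HOST" "$COLLECTION")"
fi
```

Calling this every so often during a long indexing run (with RUN_COMMIT=1) is one option; configuring autoCommit in solrconfig.xml is another.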

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 4 March 2013 at 04:16, alx...@aim.com wrote:

> Hello,
> 
> I had a non-cloud collection with an index size of around 80 GB for 15M
> documents on solr-4.1.0. So I decided to use SolrCloud with two shards and
> sent Solr the following command:
> 
> curl 'http://slave:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=1&maxShardsPerNode=1'
> 
> I tried to set replicationFactor=0, but that command gave an error. After
> reindexing onto two separate Linux boxes, each running one instance of Solr,
> I see that the index size in each shard is 90 GB versus the expected 40 GB,
> although each shard holds half (7.5M) of the documents.
> 
> Any ideas what went wrong?
> 
> Thanks.
> Alex.


 
