Hi,

It is the index folder that is large. The tlog is only a few MB.
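For anyone who wants to repeat the check, here is a minimal sketch of how the space could be broken down per file type inside a data directory. It is not a Solr tool; the function names `size_by_extension` and `report` and the example path are made up for illustration.

```python
import os
from collections import defaultdict


def size_by_extension(directory):
    """Return total size in bytes per file extension for all files under `directory`."""
    totals = defaultdict(int)
    for root, _dirs, files in os.walk(directory):
        for name in files:
            # Files without an extension (e.g. segments_N) are grouped under "(none)".
            ext = os.path.splitext(name)[1] or "(none)"
            totals[ext] += os.path.getsize(os.path.join(root, name))
    return dict(totals)


def report(directory):
    """Print extensions sorted by total size, largest first."""
    totals = size_by_extension(directory)
    for ext, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{ext:10s} {size / 2**20:12.1f} MiB")
```

Calling `report("/path/to/collection/data/index")` (a hypothetical path) on the old and the new index shows which file types grew. As far as I know, the default Lucene 4.1 codec keeps term vector data in the `.tvx`, `.tvd` and `.tvf` files, so those are the ones to compare when a schema change turns term vectors on.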
I have analysed all the changes and found that only one field type in the schema was changed. In the non-cloud setup it was

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">

and in the cloud setup it was changed to

<fieldType name="text" class="solr.TextField" positionIncrementGap="100" termVectors="true" termPositions="true" termOffsets="true">

in order to use fastVectorHighlighting. Is it possible that this change could double the index size?

Thanks.
Alex.

-----Original Message-----
From: Jan Høydahl <jan....@cominvent.com>
To: solr-user <solr-user@lucene.apache.org>
Sent: Mon, Mar 4, 2013 2:24 pm
Subject: Re: solr cloud index size is too big

Can you tell whether it's the "index" folder that is that large, or does the size include the "tlog" transaction log folder? If you have a huge transaction log, you need to start sending hard commits more often during indexing to flush the tlogs.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 4 March 2013 at 04:16, alx...@aim.com wrote:

> Hello,
>
> I had a non-cloud collection with an index size of around 80 GB for 15M documents on solr-4.1.0. So I decided to use Solr cloud with two shards and sent Solr the following command:
>
> curl 'http://slave:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=1&maxShardsPerNode=1'
>
> I tried to set replicationFactor=0, but that command gave an error. After reindexing onto two separate Linux boxes, with one instance of Solr running on each of them, I see that the index size in each shard is 90 GB versus the expected 40 GB, although each shard holds half (7.5M) of the documents.
>
> Any ideas what went wrong?
>
> Thanks.
> Alex.