On 5 November 2012 11:11, mitra <mitra.re...@ornext.com> wrote:

> Hello all
>
> i have a csv file of size 10 gb which i have to index using solr
>
> my question is how to index the csv in such a way so that
> i can get two separate index files of which one of the index is the index
> for the first half of the csv and the second index is the index for the
> second half of the csv
>

I do not think that there is any automatic way to do that in Solr.
Could you not split the CSV file yourself, and index different
halves of it to different Solr indices?
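
As a rough illustration of splitting the file yourself, here is a
minimal Python sketch. It assumes that the CSV has a header row, which
is repeated in both output files, and the function name and file
paths are hypothetical. It reads the input twice (once to count
records, once to copy them), so it never holds the 10 GB in memory.

```python
import csv

def split_csv_in_half(src, first_out, second_out):
    """Write the first half of the records to first_out, the rest to second_out.

    Assumes src starts with a header row; the header is copied to both outputs.
    Returns (records_in_first, records_in_second).
    """
    # Pass 1: count data records. csv.reader copes with quoted fields
    # that contain embedded newlines, unlike a plain line count.
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        total = sum(1 for _ in reader)
    half = (total + 1) // 2  # the first file gets the extra record if total is odd

    # Pass 2: stream each record into the matching output file.
    with open(src, newline="") as f, \
         open(first_out, "w", newline="") as a, \
         open(second_out, "w", newline="") as b:
        reader = csv.reader(f)
        next(reader)  # skip the header in the input
        first_writer, second_writer = csv.writer(a), csv.writer(b)
        first_writer.writerow(header)
        second_writer.writerow(header)
        for i, row in enumerate(reader):
            (first_writer if i < half else second_writer).writerow(row)
    return half, total - half
```

Each output file can then be posted to its own Solr core or instance.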


>
>
> also coming to index settings what should be the optimal value of auto
> commit maxdocs and maxtime for the 10gb csv file it has around 28 million
> records
>

That would depend on various local factors, such as how much RAM
you can give to Solr, network speed, etc. The best way would
be to experiment with these settings. Usually, your goal should
be to minimise auto-commits, so you can try setting these numbers
to high values. You could also disable auto-commit altogether, and
do manual commits.
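
If you go the manual-commit route, the idea is to post each file with
commit=false and commit only once at the end. Below is a minimal Python
sketch of that pattern. The base URL, core name, and function names are
hypothetical, and I am assuming that the CSV update handler is reachable
at /update/csv and accepts a commit parameter, so please check this
against your Solr version and configuration.

```python
import os
from urllib.parse import urlencode
from urllib.request import Request, urlopen

def csv_update_url(base_url, core, commit=False):
    """Build the URL for posting CSV to a core; commit=False defers the commit.

    The /update/csv path is an assumption about how the handler is mapped.
    """
    params = urlencode({"commit": "true" if commit else "false"})
    return f"{base_url.rstrip('/')}/{core}/update/csv?{params}"

def index_csv_files(base_url, core, paths):
    """Post each CSV file in turn, asking Solr to commit only with the last one."""
    for i, path in enumerate(paths):
        is_last = i == len(paths) - 1
        headers = {
            "Content-Type": "text/csv; charset=utf-8",
            # An explicit length lets the file be streamed from disk
            # without switching to chunked transfer encoding.
            "Content-Length": str(os.path.getsize(path)),
        }
        with open(path, "rb") as body:
            req = Request(csv_update_url(base_url, core, commit=is_last),
                          data=body, headers=headers, method="POST")
            urlopen(req).close()
```

For example, index_csv_files("http://localhost:8983/solr", "core0",
["part1.csv", "part2.csv"]) would send two requests and commit once,
assuming auto-commit is disabled or set high in solrconfig.xml.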

Given your data size, I think that the indexing should be quite fast
on reasonable hardware.

Regards,
Gora
