Slow indexing with data import handler

Gustavo A. Lopes Sat, 18 Apr 2009 19:01:43 -0700

I'm indexing around 1 million documents of one type, each of which requires 4 
additional queries, plus 0.5 million documents that need only 1 query for all 
of them.


I'm using the data import handler from contrib, with SolrWriter modified to set 
allowDups = true (which doesn't seem to have made any difference).

This doesn't seem like that many documents; however, after 21 hours, I have 
only ~700k documents of the first type indexed. The index size is currently 
2.1 GB.

I'm noticing that the initial import rate is relatively high: if it were 
maintained, all the documents of the first type would be indexed in less than 
6 hours. As the number of documents already imported rises, the import rate 
falls significantly.
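
To put those figures in perspective, here is a rough back-of-the-envelope 
calculation. It is only a sketch using the numbers quoted in this message; the 
variable names are illustrative, and the "initial" rate is a lower bound, since 
the estimate was "less than 6 hours".

```python
# Figures quoted in the message; names are illustrative only.
total_docs = 1_000_000        # documents of the first type
indexed_so_far = 700_000      # approximate count indexed after 21 hours
elapsed_hours = 21
initial_estimate_hours = 6    # time for all first-type docs at the initial rate


def docs_per_second(docs, hours):
    """Average throughput in documents per second."""
    return docs / (hours * 3600)


# Average rate actually observed over the whole run so far.
observed_rate = docs_per_second(indexed_so_far, elapsed_hours)

# Rate implied by the early part of the import (a lower bound).
initial_rate = docs_per_second(total_docs, initial_estimate_hours)

slowdown = initial_rate / observed_rate

print(f"observed average: {observed_rate:.1f} docs/s")
print(f"initial rate (at least): {initial_rate:.1f} docs/s")
print(f"average slowdown: {slowdown:.1f}x")
```

Because the observed figure is an average over 21 hours, the current rate is 
presumably lower still than that average.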

Does anyone have any suggestions on how to speed up full imports? What is the 
bottleneck? I will probably have to make some changes to the schema over the 
next few days that will require new imports.

thanks



