I'm indexing around 1 million documents of one type, each of which requires 4 additional queries, plus 0.5 million documents that require only 1 query for all of them.
I'm using the DataImportHandler from contrib, with SolrWriter modified to set allowDups = true (this doesn't seem to have made any difference). This doesn't seem like that many documents; however, after 21 hours I have only ~700k documents of the first type indexed. The index is currently 2.1 GB.

The initial import rate is relatively high: if it were maintained, all the documents of the first type would be indexed in less than 6 hours. But as the number of documents already imported rises, the import rate falls significantly.

Does anyone have any suggestions on how to speed up full imports? What is the bottleneck? I will probably have to make some changes to the schema over the next few days, which will require new imports.

Thanks