You can skip explicit commits and let Solr autocommit at configured intervals.
Or use soft commits if you also have to answer search queries at the same time.
550,000 pages of 3,500 words each is not a big deal for a Solr server. What is
the hardware configuration?
What is a single Solr document for you: a single newspaper? A single page?
Do you have a SolrCloud with 8 nodes? Or are you sending the same document to 8
standalone Solr servers?
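
As a rough sketch of the autocommit suggestion: the fragment below would go
inside the <updateHandler> element of solrconfig.xml. The interval values are
illustrative assumptions, not tuned recommendations for this workload. With
something like this in place, the indexing clients would send their adds and
deletes without issuing commits of their own.

```xml
<!-- Hard commit: flushes changes to disk at most every 60 s (assumed value).
     openSearcher=false means the commit does not open a new searcher,
     so it stays relatively cheap while indexing is in progress. -->
<autoCommit>
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<!-- Soft commit: makes new documents visible to searches at most every
     15 s (assumed value) without forcing a full flush to disk. -->
<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>
```

Longer intervals reduce the commit load on the server; shorter soft-commit
intervals make new documents searchable sooner.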

On Wed, 26 Feb 2020 at 19:22, Massimiliano Randazzo <
massimiliano.randa...@gmail.com> wrote:

> Good morning
>
> I have the following situation: I have to index the OCR of about 550,000
> newspaper pages, averaging 3,500 words per page, and since I create one
> document per word, the number of records is very large.
>
> At the moment I have 1 instance of Solr and 8 servers that all read and
> write to that same instance at the same time. At the beginning everything
> is fine, but after a while, when I add, delete or commit, I get a timeout
> error from the Solr server.
>
> I suspect the problem is that I commit very large batches of docs at a
> time (in practice, if a newspaper has 30 pages, I do 105,000 adds and
> commit at the end). If all 8 servers do this within a short time of each
> other, I think this creates problems for Solr.
>
> What can I do to solve the problem?
> Should I commit after each add?
> Is it possible to configure the Solr server so that it applies the add and
> delete commands and then commits on its own, according to the available
> resources, as it seems to do for the optimize command?
> Reading the documentation, I found this configuration, but I do not know
> whether it solves my problem:
>
> <deletionPolicy class="solr.SolrDeletionPolicy">
>   <str name="maxCommitsToKeep">1</str>
>   <str name="maxOptimizedCommitsToKeep">0</str>
>   <str name="maxCommitAge">1DAY</str>
> </deletionPolicy>
> <infoStream>false</infoStream>
>
>
>
> Thanks for your consideration
> Massimiliano Randazzo
>


-- 

Dario Rigolin
Comperio srl - CTO
Mobile: +39 347 7232652 - Office: +39 0425 471482
Skype: dario.rigolin
