Hi Nawab,

> One indexing thread in Lucene corresponds to one segment being written. I
> need a fine control on the number of segments.

I didn't check the code, but I would be surprised if that is how things work.
It can appear to work like that if each client thread is doing commits. Is
that the case?

Thanks,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 1 Nov 2017, at 18:00, Nawab Zada Asad Iqbal <khi...@gmail.com> wrote:
> 
> Well, the reason I want to control the number of indexing threads is to
> restrict the number of "segments" being created at one time in RAM. One
> indexing thread in Lucene corresponds to one segment being written. I need
> fine control over the number of segments. With fewer threads than that, I
> will not be fully utilizing my writing capacity. On the other hand, if I
> have more threads, I will end up with a lot more small segments, which I
> will need to flush frequently and then merge, and that will cause a
> different kind of problem.
> 
> Your suggestion will require me and other such Solr users to create a tight
> coupling between the clients and the Solr servers. My client is not SolrJ
> based. In a scenario where I am connecting and indexing to Solr remotely, I
> want more requests to be waiting on the Solr side so that they start
> writing as soon as an indexing thread is available, rather than waiting on
> my client side, on the other side of the wire.
> 
> Thanks
> Nawab
> 
> On Wed, Nov 1, 2017 at 7:11 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
>> On 10/31/2017 4:57 PM, Nawab Zada Asad Iqbal wrote:
>> 
>>> I hit this issue https://issues.apache.org/jira/browse/SOLR-11504 while
>>> migrating to Solr 6 and am working around it locally in Lucene code. I
>>> am thinking of fixing it properly and hopefully contributing the patch
>>> back to Solr. Since the Lucene code does not want to keep any such
>>> config, I am thinking of using a counting semaphore in Solr code before
>>> calling IndexWriter.addDocument(s) or IndexWriter.updateDocument(s).
>>> 
>> 
>> There's a fairly simple way to control the number of indexing threads that
>> doesn't require ANY changes to Solr:  Don't start as many threads/processes
>> on your indexing client(s).  If you control the number of simultaneous
>> requests sent to Solr, then Solr won't start as many indexing threads.
>> That kind of control over your indexing system is something that's always
>> preferable to have.
>> 
>> Thanks,
>> Shawn
>> 
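
For reference, the counting-semaphore idea from the original message could
look roughly like the sketch below. It is not Solr or Lucene code: the class
name IndexingThrottle and its methods are hypothetical, and the Callable
stands in for the real IndexWriter.addDocument(s) or
IndexWriter.updateDocument(s) call. It only shows how a semaphore caps the
number of threads inside the indexing call while extra requests wait on the
server side.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of Solr or Lucene): caps how many threads
// may be inside the wrapped indexing call at the same time.
final class IndexingThrottle {
    private final Semaphore permits;

    public IndexingThrottle(int maxConcurrentIndexers) {
        // Fair semaphore: waiting requests are admitted in arrival order.
        this.permits = new Semaphore(maxConcurrentIndexers, true);
    }

    // Runs the given call once a permit is free. The Callable is an
    // assumed stand-in for IndexWriter.addDocument(s)/updateDocument(s).
    public <T> T call(Callable<T> indexingCall) throws Exception {
        permits.acquire();
        try {
            return indexingCall.call();
        } finally {
            permits.release(); // always return the permit, even on failure
        }
    }

    public int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) throws Exception {
        IndexingThrottle throttle = new IndexingThrottle(2);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        // Eight request threads compete for two indexing permits.
        ExecutorService requestThreads = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            Callable<Void> request = () -> throttle.call(() -> {
                peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                Thread.sleep(10); // simulated indexing work
                active.decrementAndGet();
                return null;
            });
            requestThreads.submit(request);
        }
        requestThreads.shutdown();
        requestThreads.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("peak concurrent indexing calls: " + peak.get());
    }
}
```

With this shape, requests beyond the limit block in acquire() on the server
rather than on the client, which is the behaviour asked for above. Whether
this actually bounds the number of in-memory segments depends on Lucene
internals that the sketch does not model.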
