Hi Edwin,
I’ll let somebody with more knowledge of merging comment on the merge aspects.
What do you use to merge those cores: the merge tool, or do you run it through
Solr’s core API? What is the heap size? How many documents are in those two cores?
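
For reference, by the core API I mean the CoreAdmin MERGEINDEXES action. Below is a minimal sketch of how such a request URL is put together, so we are talking about the same thing. The host, port and core names ("merged", "core1", "core2") are placeholders I made up, not your actual setup, and the helper only builds the URL without sending anything.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, target_core, source_cores):
    """Build (but do not send) a CoreAdmin MERGEINDEXES request URL.

    base_url, target_core and source_cores are hypothetical values
    chosen for illustration only.
    """
    # A list of pairs lets the srcCore parameter repeat once per source core.
    params = [("action", "MERGEINDEXES"), ("core", target_core)]
    params += [("srcCore", name) for name in source_cores]
    return f"{base_url}/admin/cores?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder host and core names, assumed for the example.
    print(merge_indexes_url("http://localhost:8983/solr", "merged", ["core1", "core2"]))
```

If you merge from index directories rather than from live cores, the same action takes repeated indexDir parameters in place of srcCore.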

Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 21 Nov 2017, at 14:20, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
> 
> Hi Emir,
> 
> Thanks for your reply.
> 
> There is only 1 host, 1 node and 1 shard for this 3.5TB.
> The merging has already written the additional 3.5TB to another segment.
> However, it is still not a single segment, and the size of the folder where
> the merged index is supposed to be is now 4.6TB. This excludes the original
> 3.5TB, meaning it is already using up 8.1TB of space, but the merging is
> still going on.
> 
> The indexes are currently update-free. We have only indexed the data in 2
> different collections, and we now need to merge them into a single
> collection.
> 
> Regards,
> Edwin
> 
> On 21 November 2017 at 16:52, Emir Arnautović <emir.arnauto...@sematext.com>
> wrote:
> 
>> Hi Edwin,
>> How many hosts/nodes/shards hold those 3.5TB? I am not familiar with the
>> merge code, but I am trying to think through what it might involve, so
>> don’t take any of the following as ground truth.
>> Merging will certainly involve rewriting segments, so you had better have an
>> additional 3.5TB free if you are merging into a single segment. But that
>> should not take days on SSD. My guess is that you are running at the edge
>> of your heap and doing a lot of GC, and you may hit an OOM at some point. I
>> would guess that merging is a memory-intensive operation and, even if it
>> does not hold large structures in memory, it probably creates a lot of
>> garbage. Merging requires a lot of comparisons, so it is also possible that
>> you are exhausting CPU resources.
>> Bottom line - without more details and some monitoring tool, it is hard to
>> tell why it is taking that long.
>> There is also the question of whether merging is a good choice in your
>> case - is the index static/update-free?
>> 
>> Regards,
>> Emir
>> 
>> 
>> 
>>> On 20 Nov 2017, at 17:35, Zheng Lin Edwin Yeo <edwinye...@gmail.com>
>> wrote:
>>> 
>>> Hi,
>>> 
>>> Does anyone know how long merging in Solr usually takes?
>>> 
>>> I am currently merging about 3.5TB of data. It has been running for more
>>> than 28 hours and has not completed yet. The merging is running on an
>>> SSD.
>>> 
>>> I am using Solr 6.5.1.
>>> 
>>> Regards,
>>> Edwin
>> 
>> 
