Hi,

Maybe you could use the external file field type as an example of how to hook up values from a DB: https://lucene.apache.org/solr/guide/6_6/working-with-external-files-and-processes.html
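For instance, a minimal sketch of the wiring, adapted from that page (the names entryRankFile, pkId and entryRank are just the guide's placeholders). Note that ExternalFileField only holds float values keyed by document, so it illustrates the pattern of pulling per-document values from outside the index rather than a way to store large text itself:

    <!-- schema.xml: a field type whose values come from a file, not the index -->
    <fieldType name="entryRankFile" keyField="pkId" defVal="0"
               stored="false" indexed="false"
               class="solr.ExternalFileField"/>
    <field name="entryRank" type="entryRankFile"/>

    <!-- solrconfig.xml: re-read the external values when a new searcher opens -->
    <listener event="newSearcher"
              class="org.apache.solr.schema.ExternalFileFieldReloader"/>

The values live in a plain file named external_entryRank (typically in the core's data directory), one key=value line per document, e.g. doc33=1.414, which you could regenerate from the database on whatever schedule suits you.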
HTH,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 20 Feb 2018, at 20:39, Roman Chyla <roman.ch...@gmail.com> wrote:
>
> Say there is a high load and I'd like to bring up a new machine and let it
> replicate the index; if 100 GB or more can be shaved off, that will have a
> significant impact on how quickly the new searcher is ready and added to
> the cluster. The impact on search speed is likely minimal.
>
> We are investigating the idea of two clusters, but I have to say it seems
> more complex to me than storing/loading a field from an external source.
> Having said that, I wonder why this was not done before (maybe it was) and
> what the cons are (besides the obvious ones: maintenance, and the database
> being a potential point of failure; well, in that case I'd miss highlights -
> I can live with that...)
>
> On Tue, Feb 20, 2018 at 10:36 AM, David Hastings <
> hastings.recurs...@gmail.com> wrote:
>
>> It really depends on what you consider too large, and why the size is a
>> big issue, since most replication will go at about 100 MB/second give or
>> take, and replicating a 300 GB index is only an hour or two. What I do for
>> this purpose is store my text in a separate index altogether, and call on
>> that core for highlighting. So for my use case, the primary index with no
>> stored text is around 300 GB and replicates as needed, and the full-text
>> indexes with stored text total around 500 GB and are replicating non-stop.
>> All searching goes against the primary index, and for highlighting I call
>> on the full-text indexes, which have a stupid-simple schema. This has
>> worked pretty well for me, at least.
>>
>> On Tue, Feb 20, 2018 at 10:27 AM, Roman Chyla <roman.ch...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> We have a use case of a very large index (master-slave; for unrelated
>>> reasons the search cannot work in cloud mode) - one of the fields is a
>>> very large text, stored mostly for highlighting. To cut down the index
>>> size (for purposes of replication/scaling), I thought I could try to
>>> save it in a database - and not in the index.
>>>
>>> Lucene has codecs - one of the codec methods is for 'stored fields' - so
>>> that seems like a natural path for me.
>>>
>>> However, I'd expect somebody else has had a similar problem before. I
>>> googled and couldn't find any solutions. Using the codecs seems like a
>>> really good fit for this particular problem - am I missing something? Is
>>> there a better way to cut down on index size (besides SolrCloud/sharding
>>> and compression)?
>>>
>>> Thank you,
>>>
>>> Roman
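For the codec route Roman asks about, a minimal sketch against the Lucene 6.x codec API could look like the following. It is an illustration, not a tested implementation: the class names and the "fulltext" field are made up, the read-side database lookup at highlight time is left to the application, and a real codec also needs SPI registration (a META-INF/services/org.apache.lucene.codecs.Codec entry) so that segments written with it can be read back:

    import java.io.IOException;

    import org.apache.lucene.codecs.Codec;
    import org.apache.lucene.codecs.FilterCodec;
    import org.apache.lucene.codecs.StoredFieldsFormat;
    import org.apache.lucene.codecs.StoredFieldsReader;
    import org.apache.lucene.codecs.StoredFieldsWriter;
    import org.apache.lucene.index.FieldInfo;
    import org.apache.lucene.index.FieldInfos;
    import org.apache.lucene.index.IndexableField;
    import org.apache.lucene.index.SegmentInfo;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.IOContext;

    /** Wraps the default codec but never writes the large text field. */
    public class DroppedTextCodec extends FilterCodec {

        // Hypothetical name of the huge stored field kept in the database.
        private static final String BIG_FIELD = "fulltext";

        private final StoredFieldsFormat storedFields = new StoredFieldsFormat() {
            @Override
            public StoredFieldsReader fieldsReader(Directory dir, SegmentInfo si,
                    FieldInfos fn, IOContext ctx) throws IOException {
                // Reading is unchanged; the dropped field simply is not there.
                return delegate.storedFieldsFormat().fieldsReader(dir, si, fn, ctx);
            }

            @Override
            public StoredFieldsWriter fieldsWriter(Directory dir, SegmentInfo si,
                    IOContext ctx) throws IOException {
                return new DroppingWriter(
                        delegate.storedFieldsFormat().fieldsWriter(dir, si, ctx));
            }
        };

        public DroppedTextCodec() {
            // Delegate everything except stored fields to the default codec.
            super("DroppedTextCodec", Codec.getDefault());
        }

        @Override
        public StoredFieldsFormat storedFieldsFormat() {
            return storedFields;
        }

        /** Delegates all calls, but silently skips the big field. */
        private static final class DroppingWriter extends StoredFieldsWriter {
            private final StoredFieldsWriter delegate;

            DroppingWriter(StoredFieldsWriter delegate) {
                this.delegate = delegate;
            }

            @Override
            public void startDocument() throws IOException {
                delegate.startDocument();
            }

            @Override
            public void finishDocument() throws IOException {
                delegate.finishDocument();
            }

            @Override
            public void writeField(FieldInfo info, IndexableField field) throws IOException {
                if (!BIG_FIELD.equals(info.name)) {
                    delegate.writeField(info, field);
                }
                // else: the text goes to (or already lives in) the database,
                // keyed by the document's unique id.
            }

            @Override
            public void finish(FieldInfos fis, int numDocs) throws IOException {
                delegate.finish(fis, numDocs);
            }

            @Override
            public void close() throws IOException {
                delegate.close();
            }
        }
    }

You would plug it in at index time with IndexWriterConfig.setCodec(new DroppedTextCodec()) and, at query time, fetch the text from the database by unique key for the documents being highlighted. Whether that ends up simpler than David's two-index setup is exactly the maintenance trade-off Roman mentions.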