Do you have some more information on index and size? 

Do you have to store everything in the index? Can you store some data (blobs, 
etc.) outside?

I think you are generally right with your solution, but also be aware that it 
is sometimes cheaper to add several servers than to keep an engineer busy for 
some months finding a solution. I am not saying this is the case for your 
solution, and I am also not a fan of throwing hardware at a problem, but an 
engineer (even if it affects him/herself) should always make that decision. 
That does not necessarily mean that the engineer loses a job: one can 
implement other valuable features for a customer.

> On 06.08.2019 at 08:21, Updates Profimedia <upda...@profimedia.com> wrote:
> 
> 
> 
>> On 2019/08/03 18:00:28, Furkan KAMACI <furkankam...@gmail.com> wrote: 
>> Hi,
>> 
>> First of all, could you check here:
>> https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>> to better understand hard commits, soft commits and transaction logs to
>> achieve NRT search.
>> 
>> Kind Regards,
>> Furkan KAMACI
>> 
>>> On Wed, Jul 31, 2019 at 3:47 PM profiuser <upda...@profimedia.com> wrote:
>>> 
>>> Hi,
>>> 
>>> we have about 400 000 000 items in a Solr collection.
>>> We have set the auto commit property for this collection to 15 minutes.
>>> It is a big collection and we use some caches etc. Therefore we have a big
>>> autocommit value.
>>> 
>>> This has the disadvantage that we don't have NRT searches.
>>> 
>>> We would like to have NRT at least for searching for the newly added items.
>>> 
>>> We read about the new functionality "Category Routed Aliases" in Solr
>>> version 8.1.
>>> 
>>> And we got the idea that we could add a routing field to our collection
>>> schema.
>>> At indexing time we check whether the item is new and set the routing field
>>> to "new"; if the item is older than some time period, we set the value to
>>> "old".
>>> We will have one category routed alias, routedCollection, and there will
>>> be 2 collections, old and new.
>>> 
>>> If we index a new item, the router chooses the new collection and the item
>>> is inserted into it. After some period we reindex the item, decide that it
>>> is old, and set the routing field to "old". The router then decides to
>>> update (insert) the item in the old collection. We expected that Solr would
>>> automatically check uniqueness across all routed collections, and that if
>>> Solr found the item in another collection, it would be deleted
>>> automatically. But it is not!
>>> 
>>> Is this expected behaviour?
>>> 
>>> Could this functionality be used for the issue we have? Or could someone
>>> suggest another solution that ensures all new items are ready for NRT
>>> searches?
>>> 
>>> Thanks for your help
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>> 
>> 
> 
> Hi,
> 
> we know this page, and we understand how commits and transaction logs work, 
> but as I said we have a very big index ;-) Therefore we cannot commit too 
> often.
> We must cache data for fast search, and if we commit too often, then we 
> throw out every cache.
> 
> Now we have only one server, and we are preparing a new solution with 
> SolrCloud, where we would have several servers. We have limited resources 
> and we cannot afford to have, for example, 20 Solr servers, which I believe 
> is a standard solution for big indexes.
> 
> Therefore we are looking for a compromise between price and performance, and 
> we are thinking about having more collections. One collection would be a 
> daily feed (small index), which we could commit every few seconds. These 
> collections would be merged into the main collection alias.
> 
> Do you have another idea?
> 
> Best
> 
> 
> 
