Do you have some more information on the index and its size? Do you have to store everything in the index? Can you store some data (blobs etc.) outside?
I think you are generally right with your solution, but be aware that it is sometimes cheaper to run several servers than to keep an engineer busy for some months finding a solution. I am not saying this is the case for you, and I am not a fan of throwing hardware at a problem either, but an engineer (even if it affects him/herself) should always make that decision. It does not necessarily mean that the engineer loses a job; he or she can implement other valuable features for a customer instead.

> On 06.08.2019 at 08:21, Updates Profimedia <upda...@profimedia.com> wrote:
>
>> On 2019/08/03 18:00:28, Furkan KAMACI <furkankam...@gmail.com> wrote:
>> Hi,
>>
>> First of all, could you check here:
>> https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>> to better understand hard commits, soft commits and transaction logs to
>> achieve NRT search.
>>
>> Kind Regards,
>> Furkan KAMACI
>>
>>> On Wed, Jul 31, 2019 at 3:47 PM profiuser <upda...@profimedia.com> wrote:
>>>
>>> Hi,
>>>
>>> we have about 400 000 000 items in a Solr collection.
>>> We have set the auto commit property for this collection to 15 minutes.
>>> It is a big collection and we are using some caches etc. Therefore we have
>>> a big autocommit value.
>>>
>>> This has the disadvantage that we don't have NRT searches.
>>>
>>> We would like to have NRT at least for searching the newly added items.
>>>
>>> We read about the new functionality "Category routed aliases" in Solr
>>> version 8.1.
>>>
>>> And we got an idea that we could add a field for routing to our
>>> collection schema.
>>> At indexing time we check whether the item is new and set the routing
>>> field to "new", or, if the item is older than some time period, we set
>>> the value to "old".
>>> And we will have one category routed alias, routedCollection, and there
>>> will be 2 collections, old and new.
>>>
>>> If we index a new item, the router chooses the new collection and the
>>> item is inserted into it. After some period we reindex the item and
>>> decide that it is old, so we set the routing field to "old". The router
>>> decides to update (insert) the item in the collection old. But we
>>> expected that Solr would automatically check uniqueness across all routed
>>> collections, and that if Solr found the item in another collection, it
>>> would be deleted automatically. But it is not!
>>>
>>> Is this expected behaviour?
>>>
>>> Could this functionality be used for the issue we have? Or could someone
>>> suggest another solution which ensures that all new items are ready for
>>> NRT searches?
>>>
>>> Thanks for your help
>>>
>>> --
>>> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
>>>
>>
>
> Hi,
>
> we know this page, and we understand how commits and transaction logs work,
> but as I said we have a very big index ;-) Therefore we cannot commit
> too often.
> We must cache data for fast search, and if we commit too often, then we
> throw away every cache.
>
> Now we have only one server, and we are preparing a new solution with Solr
> Cloud, where we would have several servers. We have limited resources and we
> cannot afford to have for example 20 Solr servers, which I believe is a
> standard solution for big indexes.
>
> Therefore we are searching for some compromise between price and
> performance, and we are thinking about having more collections. One
> collection would be a daily feed (small index), which we could then commit
> every few seconds. And these collections would be merged into the main
> collection alias.
>
> Do you have another idea?
>
> Best
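
On the duplicate problem in the quoted message: Solr enforces the unique key only within a single collection, so when an item changes category the stale copy has to be deleted explicitly. Below is a minimal sketch of how a client might build the two update requests. It only constructs URLs and JSON bodies and sends nothing. The helper names, the unique key field `id`, the routing field `route_s`, the base URL and the `commitWithin` value are all assumptions for illustration, and the `<alias>__CRA__<category>` collection naming should be checked against your Solr version.

```python
import json
from urllib.parse import quote


def collection_for(alias: str, category: str) -> str:
    # Assumption: a category routed alias names its collections
    # <alias>__CRA__<category>; verify this against your installation.
    return f"{alias}__CRA__{category}"


def move_requests(solr_url, alias, doc, route_field, categories,
                  commit_within_ms=1000):
    """Build (url, json_body) pairs for re-indexing `doc` through the alias
    and deleting its stale copies from every other category collection.

    Hypothetical helper: it only builds requests, it does not send them.
    Assumes the unique key field is called "id".
    """
    target = doc[route_field]
    params = f"commitWithin={commit_within_ms}"

    # 1. Index through the alias; the router picks the collection
    #    from the value of the routing field.
    requests = [(f"{solr_url}/{quote(alias)}/update?{params}",
                 json.dumps([doc]))]

    # 2. Delete the same id from the collections of all other categories,
    #    because uniqueness is not checked across collections.
    for category in categories:
        if category == target:
            continue
        stale = collection_for(alias, category)
        requests.append((f"{solr_url}/{quote(stale)}/update?{params}",
                         json.dumps({"delete": {"id": doc["id"]}})))
    return requests


if __name__ == "__main__":
    for url, body in move_requests("http://localhost:8983/solr",
                                   "routedCollection",
                                   {"id": "42", "route_s": "old"},
                                   "route_s", ["new", "old"]):
        print(url, body)
```

Each pair would be sent as an HTTP POST with Content-Type application/json. Doing the delete on the client keeps the routing logic in one place, at the cost of one extra request per moved item.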