On 2019/08/03 18:00:28, Furkan KAMACI <furkankam...@gmail.com> wrote:
> Hi,
>
> First of all, could you check here:
> https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> to better understand hard commits, soft commits and transaction logs
> in order to achieve NRT search.
>
> Kind Regards,
> Furkan KAMACI
>
> On Wed, Jul 31, 2019 at 3:47 PM profiuser <upda...@profimedia.com> wrote:
>
> > Hi,
> >
> > we have about 400 000 000 items in a Solr collection.
> > We have set the auto commit interval for this collection to 15 minutes.
> > It is a big collection and we use caches etc., so we need a large
> > autocommit value.
> >
> > The disadvantage is that we don't have NRT search.
> >
> > We would like to have NRT at least for searching for the newly added items.
> >
> > We read about the new "Category Routed Aliases" functionality in Solr
> > version 8.1.
> >
> > We got the idea that we could add a routing field to our collection
> > schema.
> > At indexing time we check whether the item is new and set the routing
> > field to "new"; if the item is older than some time period, we set it
> > to "old".
> > We would have one category routed alias, routedCollection, with
> > 2 collections behind it, old and new.
> >
> > When we index a new item, the router chooses the new collection and the
> > item is inserted into it. After some period we reindex the item, decide
> > that it is old, and set the routing field to "old". The router then
> > updates (inserts) the item into the old collection. We expected Solr to
> > automatically check uniqueness across all routed collections and, if it
> > found the item in another collection, to delete it there automatically.
> > But it does not!
> >
> > Is this expected behaviour?
> >
> > Could this functionality be used for our issue? Or could someone suggest
> > another solution that ensures all new items are ready for NRT
> > searches?
> >
> > Thanks for your help
> >
> >
>
Hi,
we know this page, and we understand how commits and transaction logs work,
but as I said, our index is very large ;-) Therefore we cannot commit too
often.
We must cache data for fast searches, and if we commit too often, the caches
are thrown away on every commit.
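
To make the trade-off concrete, here is a minimal sketch in Python of two
commit profiles, written as request bodies for Solr's Config API (they would
be POSTed to /solr/<collection>/config). The 60-second hard commit and the
5-second soft commit are illustrative assumptions, not our real settings, and
the code only builds the payloads without contacting any server.

```python
import json

FIFTEEN_MINUTES_MS = 15 * 60 * 1000


def commit_settings(hard_commit_ms, soft_commit_ms):
    """Build a Config API 'set-property' body for the commit intervals.

    The hard commit only flushes to disk (openSearcher=false); the soft
    commit interval decides how often new documents become searchable,
    and therefore how often the searcher caches are discarded.
    """
    return {
        "set-property": {
            "updateHandler.autoCommit.maxTime": hard_commit_ms,
            "updateHandler.autoCommit.openSearcher": False,
            "updateHandler.autoSoftCommit.maxTime": soft_commit_ms,
        }
    }


# Large collection: changes become visible every 15 minutes, caches live long.
main_settings = commit_settings(60_000, FIFTEEN_MINUTES_MS)

# Small daily collection (assumed): changes become visible every 5 seconds.
daily_settings = commit_settings(60_000, 5_000)

if __name__ == "__main__":
    print(json.dumps(daily_settings, indent=2))
```

With a small index, reopening the searcher every few seconds should be cheap,
which is the reason for splitting off the new items at all.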
Right now we have only one server, and we are preparing a new solution with
SolrCloud, where we would have several servers. We have limited resources and
cannot afford, for example, 20 Solr servers, which I believe is the standard
solution for big indexes.
So we are looking for a compromise between price and performance, and we are
thinking about using more collections. One collection would hold the daily
feed (a small index), which we could then commit every few seconds. These
collections would be merged into the main collection alias.
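
One possible reading of this idea, as a rough Python sketch: a standard alias
created with the Collections API CREATEALIAS action spans both collections for
searching, and a periodic job moves documents from the daily collection to the
main one, deleting them from the daily collection by id, since uniqueness is
not enforced across collections. The collection names, the alias name and the
base URL are hypothetical; the code only builds URLs and JSON bodies and does
not send them.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # assumed base URL


def create_alias_url(alias, collections, base=SOLR):
    """URL for a CREATEALIAS call that makes one alias span several collections."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    return f"{base}/admin/collections?{urlencode(params)}"


def move_requests(docs, source, target, base=SOLR):
    """Return (url, json_body) pairs that move docs from source to target.

    First the documents are added to the target collection, then they are
    deleted by id from the source collection so no duplicates remain.
    """
    ids = [doc["id"] for doc in docs]
    return [
        (f"{base}/{target}/update", docs),
        (f"{base}/{source}/update", {"delete": ids}),
    ]


if __name__ == "__main__":
    print(create_alias_url("items", ["items_main", "items_daily"]))
    for url, body in move_requests([{"id": "a1"}], "items_daily", "items_main"):
        print(url, body)
```

Searches would go to the alias, while indexing would go directly to the daily
collection. Until the delete is visible in the daily collection, a moved
document can show up twice in search results.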
Do you have another idea?
Best