Thank you for the interesting reply.

That confirms our assumptions. Using two or more collections, as Jörn Franke
said, is more complicated to develop for. So for now we will only try to split
the index across more shards and servers, and try to reduce commit times as
well.

I think NRT times of about one minute are acceptable.

Thank you


On 2019/08/06 19:59:49, Shawn Heisey <apa...@elyograg.org> wrote: 
> On 7/31/2019 6:47 AM, profiuser wrote:
> > We have about 400 000 000 items in a Solr collection.
> > We have set the autoCommit property for this collection to 15 minutes.
> > It is a big collection and we use some caches etc., so we have a large
> > autoCommit value.
> 
> I would set autoCommit to 60 seconds (a value of 60000) with 
> openSearcher set to false.  This will not affect change visibility in 
> any way, but it will keep your transaction logs from becoming huge. 
> Commits that do NOT open a new searcher are very fast.
> 
> Then I would use autoSoftCommit as a failsafe on change visibility. 
> Start with a value between two and five minutes.
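
A minimal sketch of the settings described above. These values normally live in the updateHandler section of solrconfig.xml; the sketch instead builds the equivalent request body for Solr's Config API, assuming its set-property command and the updateHandler.* property names. The two-minute soft commit is just the low end of the suggested range, and the target URL in the comment uses a placeholder host and collection name.

```python
import json


def commit_settings_payload(hard_commit_ms=60000, soft_commit_ms=120000):
    """Build a Config API 'set-property' body for the suggested commit settings.

    hard_commit_ms: autoCommit interval (60 seconds, as suggested above).
    soft_commit_ms: autoSoftCommit interval (two minutes is an assumed
                    starting point within the two-to-five-minute range).
    """
    return {
        "set-property": {
            # Hard commit: flushes to disk and keeps transaction logs small.
            "updateHandler.autoCommit.maxTime": hard_commit_ms,
            # Do not open a new searcher, so visibility is unaffected.
            "updateHandler.autoCommit.openSearcher": False,
            # Soft commit: failsafe for change visibility.
            "updateHandler.autoSoftCommit.maxTime": soft_commit_ms,
        }
    }


# The body would be POSTed as JSON to a URL of the form
# http://localhost:8983/solr/<collection>/config (placeholder host/collection).
body = json.dumps(commit_settings_payload(), indent=2)
print(body)
```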
> 
> > The disadvantage is that we don't have NRT search.
> > 
> > We would like to have NRT at least for searching newly added items.
> > 
> > We read about the new "Category Routed Aliases" functionality in Solr
> > 8.1.
> > 
> > We had the idea of adding a routing field to our collection schema.
> > At indexing time we check whether the item is new and set the routing
> > field to "new"; if the item is older than some time period, we set it
> > to "old".
> > We will have one category routed alias, routedCollection, with
> > 2 collections behind it, old and new.
> > 
> > When we index a new item, the router chooses the new collection and the
> > item is inserted into it. After some period we reindex the item, decide
> > that it is old, and set the routing field to "old". The router then
> > updates (inserts) the item into the old collection. We expected Solr to
> > automatically check uniqueness across all routed collections and, if it
> > found the item in another collection, to delete it there automatically.
> > But it does not!
> > 
> > Is this expected behaviour?
> 
> I know very little about the new routed collection capability, but in 
> general, I would not expect Solr to check more than one collection for 
> an existing ID value when it is indexing.  I don't think there's 
> anything happening at that level that even knows about other 
> collections.  If you want to split your index into hot and cold pieces, 
> you're probably going to need to have your indexing software be aware of 
> that and either figure out where to send deletes, or just send deletes 
> to all parts of the index.
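
A rough sketch of the "send deletes to all parts" idea on the indexing-client side. It only builds the delete requests and sends nothing. The base URL and the collection names "new" and "old" are hypothetical placeholders (the real collections behind a routed alias may be named differently), and the helper name is made up for illustration. It assumes Solr's JSON update format for delete-by-id.

```python
import json

SOLR_BASE = "http://localhost:8983/solr"  # assumption: local Solr node
COLLECTIONS = ["new", "old"]              # hypothetical collection names


def delete_requests(doc_id, keep_in, collections=COLLECTIONS, base=SOLR_BASE):
    """Return (url, json_body) pairs that delete doc_id from every
    collection except the one the document is being indexed into."""
    requests = []
    for name in collections:
        if name == keep_in:
            continue  # the fresh copy lives here, so do not delete it
        url = f"{base}/{name}/update"
        body = json.dumps({"delete": {"id": doc_id}})
        requests.append((url, body))
    return requests


# Example: the item moves to "old", so it must be removed from "new".
for url, body in delete_requests("item-1", keep_in="old"):
    print(url, body)
```

Each pair would be POSTed with a JSON content type after the document has been indexed into its new home; ordering and failure handling are left to the indexing software.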
> 
> What kind of lag time do you think about when you imagine near real time 
> indexing?  Note that extremely short NRT times may not be achievable, 
> especially with the large index you're using.  A good starting point in 
> my opinion is 30000, which is 30 seconds.
> 
> What I would do is use the autoCommit and autoSoftCommit settings that I 
> mentioned above, and include a "commitWithin" parameter on all indexing 
> requests.  The commitWithin would be for NRT.
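
A small sketch of attaching commitWithin to an indexing request. It only constructs the URL and JSON body. The host, the collection name "items", and the helper name are placeholders for illustration, and the 30000 default mirrors the 30-second starting point mentioned above.

```python
import json
from urllib.parse import urlencode


def update_request(collection, docs, commit_within_ms=30000,
                   base="http://localhost:8983/solr"):
    """Build the URL and JSON body for an update request that asks Solr
    to make the documents visible within commit_within_ms milliseconds."""
    query = urlencode({"commitWithin": commit_within_ms})
    url = f"{base}/{collection}/update?{query}"
    return url, json.dumps(docs)


# Example with a placeholder collection and a single document.
url, body = update_request("items", [{"id": "1"}])
print(url)
print(body)
```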
> 
> Thanks,
> Shawn
> 
