It is the number of recommendations for a single user that matters. The more there are, the worse the performance. The best way to find out, though, is to try it and see.
I personally would have one doc per recommendation. It will reduce the amount of churn in your index, as updating a multivalued field involves deleting the entire document that preceded it, which will then need merging, etc. One doc per recommendation effectively makes your index append-only, which is much cleaner.

Regarding sharding, you can shard your original index, but a full copy of your user recommendations collection must exist alongside every shard/replica of that original index. The recommendations collection itself cannot be sharded.

HTH

Upayavira

On Thu, Jun 11, 2015, at 06:06 PM, Reitzel, Charles wrote:
> So long as the fields are indexed, I think performance should be ok.
>
> Personally, I would also look at using a single document per user with a
> multi-valued field for recommendation ID. Assuming only a small
> fraction of all recommendation IDs are ever presented to any single user,
> this schema would be physically much smaller and require only a single
> document per user.
>
> I don't know the answer to your sharding question. The join query is
> available out of the box, so it should be quick work to set up a
> two-shard sample and test the distributed sub-query.
>
> That said, with the scales you are talking about, I question if sharding
> is necessary. You can still use replication for load balancing without
> sharding.
>
> -----Original Message-----
> From: amid [mailto:a...@donanza.com]
> Sent: Thursday, June 11, 2015 12:36 PM
> To: solr-user@lucene.apache.org
> Subject: RE: The best way to exclude "seen" results from search queries
>
> Thanks a lot Charles,
>
> This seems to be what I'm looking for.
> Do you know if join for this amount of documents & users will still have
> good query performance? Also, are there any limitations for the Solr
> architecture once using the "join" method (i.e. sharding)?
>
> Many thanks,
> Ami
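
For anyone reading this thread later, here is a rough sketch of the one-doc-per-recommendation layout and the negated join filter discussed above. It is only an illustration: the core name `user_seen` and the field names `user_id`, `rec_id` and `id` are assumptions made up for the example, not anything from the original poster's schema, and the exact filter syntax should be checked against your own Solr version.

```python
from urllib.parse import urlencode


def seen_doc(user_id, rec_id):
    """One small document per (user, recommendation) pair.

    Field names are hypothetical. Adding a new "seen" entry only appends a
    document; no existing document has to be rewritten.
    """
    return {
        "id": "{}_{}".format(user_id, rec_id),  # assumed unique key
        "user_id": str(user_id),
        "rec_id": str(rec_id),
    }


def exclude_seen_params(query, user_id, seen_core="user_seen"):
    """Build request parameters that filter out items the user has seen.

    The filter joins from the (assumed) side collection onto the main index
    and negates the result, so matching documents are excluded.
    """
    join = "{{!join from=rec_id to=id fromIndex={core}}}user_id:{uid}".format(
        core=seen_core, uid=user_id
    )
    return {"q": query, "fq": "-" + join}


if __name__ == "__main__":
    params = exclude_seen_params("title:solr", 42)
    # The encoded string can be appended to a /select request URL.
    print(urlencode(params))
```

Whether this performs acceptably depends, as noted above, on how many recommendations a single user has, so it is worth measuring with realistic data.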