> You can certainly replicate the joined collection to every shard. It
> must fit in one shard and a replica of that shard must be co-located
> with every replica of the “to” collection.

Yes, I found this in the documentation, with a clear example, just after
this mail. I will test it today. I also read your blog post about join
performance[1], and I suspect the performance impact of joins will be huge
because the joined collection is about 10M documents (only two fields: a
unique id and an array of longs, with a filter applied to the array; the
join key is 10M unique IDs).

> Have you looked at streaming and “streaming expressions”? It does not
> have the same problem, although it does have its own limitations.

I have never tested them, and I am not yet very comfortable with how to
test them. Is it possible to mix query parsers and streaming expressions in
the client call via HTTP parameters, or do streaming expressions apply
programmatically only?

[1] https://lucidworks.com/post/solr-and-joins/

On Tue, Oct 15, 2019 at 07:12:25PM -0400, Erick Erickson wrote:
> You can certainly replicate the joined collection to every shard. It must fit
> in one shard and a replica of that shard must be co-located with every
> replica of the “to” collection.
>
> Have you looked at streaming and “streaming expressions”? It does not have
> the same problem, although it does have its own limitations.
>
> Best,
> Erick
>
> > On Oct 15, 2019, at 6:58 PM, Nicolas Paris <nicolas.pa...@riseup.net> wrote:
> >
> > Hi
> >
> > I have several large collections that cannot fit in a standalone solr
> > instance. They are split over multiple shards in solr-cloud mode.
> >
> > Those collections are supposed to be joined to an other collection to
> > retrieve subset. Because I am using distributed collections, I am not
> > able to use the solr join feature.
> >
> > For this reason, I denormalize the information by adding the joined
> > collection within every collections. Naturally, when I want to update
> > the joined collection, I have to update every one of the distributed
> > collections.
> >
> > In standalone mode, I only would have to update the joined collection.
> >
> > I wonder if there is a way to overcome this limitation. For example, by
> > replicating the joined collection to every shard - or other method I am
> > ignoring.
> >
> > Any thought ?
> >
> > --
> > nicolas
>

--
nicolas
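
For reference, the co-located join discussed in this thread could be queried roughly as sketched below. This is only a minimal sketch, not a tested setup: it assumes the joined collection lives in a single shard with a replica next to every replica of the "to" collection, and every name in it (the collections `joined` and `big_collection`, the fields `id`, `joined_id` and `values`, the host URL) is hypothetical. It relies on the `fromIndex` local parameter of Solr's `{!join}` query parser.

```python
from urllib.parse import urlencode


def build_join_params(from_collection, from_field, to_field, from_query,
                      main_query="*:*"):
    """Build request parameters for a cross-collection join filter.

    The join runs `from_query` against `from_collection`, collects the
    values of `from_field`, and keeps documents of the queried collection
    whose `to_field` matches one of them.
    """
    join_filter = (
        f"{{!join from={from_field} fromIndex={from_collection} "
        f"to={to_field}}}{from_query}"
    )
    return {"q": main_query, "fq": join_filter}


def build_select_url(base_url, collection, params):
    """Return the URL-encoded /select request for the given collection."""
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical names: adapt to the real collections and fields.
    params = build_join_params("joined", "id", "joined_id", "values:42")
    print(params["fq"])
    print(build_select_url("http://localhost:8983/solr", "big_collection", params))
```

Putting the join in `fq` rather than `q` keeps it out of scoring and lets Solr cache the filter, which may matter with a 10M-document joined collection.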