Hi,

There is no JOIN functionality in Solr. The common solution is either to accept 
the high-volume update churn, or to add client-side code that builds a "join" 
layer on top of the two indices. I know that Attivio (www.attivio.com) has 
built some kind of JOIN functionality on top of Solr in its AIE product, but 
I do not know the details or the actual performance.
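
To make the client-side option concrete, here is a minimal sketch of such a 
"join" layer in Python. It is only an illustration under stated assumptions: 
the function names, the "id" key field and the sample fields are all 
hypothetical, and the actual HTTP calls to the two Solr indices are left out. 
The idea is to take the keys returned by the main index, build one query 
against the small index, and merge the two result lists by key.

```python
def build_key_query(keys, field="id"):
    """Build a query string that matches any of the given keys.

    Assumes the key field is called "id" (hypothetical); values are
    quoted so that special characters are not parsed as query syntax.
    """
    quoted = []
    for k in keys:
        text = str(k).replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"%s"' % text)
    return "%s:(%s)" % (field, " OR ".join(quoted))


def merge_by_key(primary_docs, secondary_docs, key="id"):
    """Attach fields from the small index to documents from the main index.

    Order follows the main index, so its ranking is preserved. Fields
    already present on the main document are not overwritten.
    """
    secondary = {doc[key]: doc for doc in secondary_docs}
    merged = []
    for doc in primary_docs:
        combined = dict(doc)
        for name, value in secondary.get(doc[key], {}).items():
            combined.setdefault(name, value)
        merged.append(combined)
    return merged


# Hypothetical sample data standing in for the two Solr responses.
businesses = [{"id": "b1", "name": "Acme"}, {"id": "b2", "name": "Bolt"}]
ads = [{"id": "b2", "coupon": "10% off"}]

lookup = build_key_query([d["id"] for d in businesses])
results = merge_by_key(businesses, ads)
```

This still costs one extra roundtrip per result page, but only one, since all 
keys for the page go into a single query against the small index.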

Why not open a JIRA issue, if one does not already exist, to request this as a 
feature?

--
Jan Høydahl  - search architect
Cominvent AS - www.cominvent.com

On 25. jan. 2010, at 22.01, Aaron McKee wrote:

> 
> Is there any somewhat convenient way to collate/integrate fields from 
> separate indices during result writing, if the indices use the same unique 
> keys? Basically, some sort of cross-index JOIN?
> 
> As a bit of background, I have a rather heavyweight dataset of every US 
> business (~25m records, an on-disk index footprint of ~30g, and 5-10 hours to 
> fully index on a decent box). Given the size and relative stability of the 
> dataset, I generally only update this monthly. However, I have separate 
> advertising-related datasets that need to be updated either hourly or daily 
> (e.g. today's coupon, click revenue remaining, etc.). These advertiser feeds 
> reference the same keyspace that I use in the main index, but are otherwise 
> significantly lighter weight. Importing and indexing them discretely only 
> takes a couple minutes. Given that Solr/Lucene doesn't support field 
> updating, without having to drop and re-add an entire document, it doesn't 
> seem practical to integrate this data into the main index (the system would 
> be under a constant state of churn, if we did document re-inserts, and the 
> performance impact would probably be debilitating). It may be nice if this 
> data could participate in filtering (e.g. only show advertisers), but it 
> doesn't need to participate in scoring/ranking.
> 
> I'm guessing that someone else has had a similar need, at some point?  I can 
> have our front-end query the smaller indices separately, using the keys 
> returned by the primary index, but would prefer to avoid the extra sequential 
> roundtrips. I'm hoping to also avoid a coding solution, if only to avoid the 
> maintenance overhead as we drop in new builds of Solr, but that's also 
> feasible.
> 
> Thank you for your insight,
> Aaron
> 
