RE: Solr 4.0 - Join performance

Eric Khoury Tue, 14 Aug 2012 13:20:04 -0700
Thanks David, that does indeed sound like it'll help.  Is there an issue number 
I can use to track development/availability?

Eric.
> From: dsmi...@mitre.org
> To: solr-user@lucene.apache.org
> Subject: Re: Solr 4.0 - Join performance
> Date: Tue, 14 Aug 2012 20:15:27 +0000
> 
> Stepping back a bit, the reason you are using multiple cores with a join is 
> because Solr doesn't have a multi-valued numeric range type.  The spatial 
> work I'm doing in Lucene-spatial does, and it's 2-dimensional for an x & y 
> whereas your case calls for one dimension.  It's taking a bit of time, but 
> when finished you should be able to use it for your use case ignoring the 
> 'y'.  Eventually I'd like to develop such a Solr field type for a 
> numeric/time range to do it more natively but that's a ways off.
> 
> Cheers,
>   ~ David Smiley
> 
> On Aug 2, 2012, at 10:45 AM, Eric Khoury wrote:
> 
> > Hello all,
> > 
> > 
> > 
> > I’m testing out the new join feature, hitting some perf
> > issues, as described in Erick’s article 
> > (http://architects.dzone.com/articles/solr-experimenting-join).
> > 
> > Basically, I’m using 2 objects in solr (this is a simplified
> > view):
> > 
> > Item
> > - Id
> > - Name
> > 
> > Grant
> > - ItemId
> > - AvailabilityStartTime
> > - AvailabilityEndTime
> > 
> > 
> > Each item can have multiple grants attached to it.
> > 
> > 
> > 
> > The query I'm using is the following, to find items by
> > name, filtered by the grants' availability window:
> > 
> > 
> > 
> > solr/select?fq=Name:XXX&q={!join from=ItemId to=Id} AvailabilityStartTime:[* TO NOW] AND -AvailabilityEndTime:[* TO NOW]
> > 
> > 
> > 
> > With a hundred thousand items, this query can take multiple seconds
> > to perform, due to the large number of ItemIds returned from the join query.
> > 
> > Has anyone come up with a better way to use joins for these types of 
> > queries?  Are there improvements planned in 4.0 RTM in this area?
> > 
> > 
> > 
> > Btw, I’ve explored simply adding Start-End times to items, but
> > the flat data model makes it hard to maintain start-end pairs.
> > 
> > 
> > 
> > Thanks for the help!
> > 
> > Eric.
>
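
The pairing problem raised in the thread (keeping each grant's start and end together) can be illustrated with a small Python sketch. It is not Solr code: the `Grant` tuple, the function names, and the integer timestamps are all hypothetical, and the flat variant assumes that a range clause on a multi-valued field matches when any one value falls in the range. It compares the per-grant test `start <= now < end`, which is what the join query expresses, with the same two clauses applied to flattened start/end fields on the item.

```python
from typing import List, Tuple

# Hypothetical representation of a grant: (start, end) as integer timestamps.
Grant = Tuple[int, int]


def available_paired(grants: List[Grant], now: int) -> bool:
    """True if at least one grant satisfies start <= now < end."""
    return any(start <= now < end for start, end in grants)


def available_flat(starts: List[int], ends: List[int], now: int) -> bool:
    """Same two clauses on flattened multi-valued fields (pairing is lost).

    Mirrors: AvailabilityStartTime:[* TO NOW] AND -AvailabilityEndTime:[* TO NOW]
    under the assumption that a clause matches if ANY value of the field matches.
    """
    some_start_reached = any(s <= now for s in starts)
    some_end_passed = any(e <= now for e in ends)
    return some_start_reached and not some_end_passed


# One item with an expired grant and a currently valid grant.
grants = [(0, 10), (40, 60)]
now = 50

paired_result = available_paired(grants, now)
flat_result = available_flat(
    [start for start, _ in grants],
    [end for _, end in grants],
    now,
)
print(paired_result, flat_result)
```

In this example the expired grant causes the negated end-time clause to reject the whole item under the flat model, even though another grant is still open. A range-aware field type of the kind David describes would keep each start/end pair together.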