[ https://issues.apache.org/jira/browse/SOLR-4787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13716342#comment-13716342 ]
Joel Bernstein commented on SOLR-4787:
--------------------------------------

Kranti,

Let me know how the pjoin is performing for you. I'm going to be testing out some different data structures for the pjoin to see if I can get better performance.

> Join Contrib
> ------------
>
>                 Key: SOLR-4787
>                 URL: https://issues.apache.org/jira/browse/SOLR-4787
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 4.2.1
>            Reporter: Joel Bernstein
>            Priority: Minor
>             Fix For: 4.4
>
>         Attachments: SOLR-4787-deadlock-fix.patch, SOLR-4787.patch,
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch,
> SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch, SOLR-4787.patch,
> SOLR-4787.patch, SOLR-4787-pjoin-long-keys.patch
>
>
> This contrib provides a place where different join implementations can be
> contributed to Solr. This contrib currently includes two join implementations.
> The initial patch was generated from the Solr 4.3 tag. Because of changes in
> the FieldCache API this patch will only build with Solr 4.2 or above.
>
> *PostFilterJoinQParserPlugin aka "pjoin"*
>
> The pjoin provides a join implementation that filters results in one core
> based on the results of a search in another core. This is similar in
> functionality to the JoinQParserPlugin but the implementation differs in a
> couple of important ways.
>
> The first difference is that the pjoin is designed to work with integer join
> keys only. So, in order to use pjoin, integer join keys must be included in
> both the "to" and "from" cores.
>
> The second difference is that the pjoin builds memory structures that are
> used to quickly connect the join keys. It also uses a custom SolrCache named
> "join" to hold intermediate DocSets which are needed to build the join memory
> structures. So, the pjoin will need more memory than the JoinQParserPlugin to
> perform the join.
>
> The main advantage of the pjoin is that it can scale to join millions of keys
> between cores.
> Because it's a PostFilter, it only needs to join records that match the main
> query.
>
> The syntax of the pjoin is the same as the JoinQParserPlugin except that the
> plugin is referenced by the string "pjoin" rather than "join".
>
> fq={!pjoin fromCore=collection2 from=id_i to=id_i}user:customer1
>
> The example filter query above will search the fromCore (collection2) for
> "user:customer1". This query will generate a list of values from the "from"
> field that will be used to filter the main query. Only records from the main
> query whose "to" field value is present in the "from" list will be included
> in the results.
>
> The solrconfig.xml in the main query core must contain the reference to the
> pjoin.
>
> <queryParser name="pjoin"
>              class="org.apache.solr.joins.PostFilterJoinQParserPlugin"/>
>
> And the join contrib jars must be registered in the solrconfig.xml.
>
> <lib dir="../../../dist/" regex="solr-joins-\d.*\.jar" />
>
> The solrconfig.xml in the fromCore must have the "join" SolrCache configured.
>
> <cache name="join"
>        class="solr.LRUCache"
>        size="4096"
>        initialSize="1024"
>        />
>
> *ValueSourceJoinParserPlugin aka vjoin*
>
> The second implementation is the ValueSourceJoinParserPlugin aka "vjoin".
> This implements a ValueSource function query that can return a value from a
> second core based on join keys and a limiting query. The limiting query can
> be used to select a specific subset of data from the join core. This allows
> customer-specific relevance data to be stored in a separate core and then
> joined in the main query.
>
> The vjoin is called using the "vjoin" function query. For example:
>
> bf=vjoin(fromCore, fromKey, fromVal, toKey, query)
>
> This example shows "vjoin" being called by the edismax boost function
> parameter. This example will return the "fromVal" from the "fromCore". The
> "fromKey" and "toKey" are used to link the records from the main query to the
> records in the "fromCore".
> The "query" is used to select a specific set of records to join with in
> fromCore.
>
> Currently the fromKey and toKey must be longs but this will change in future
> versions. Like the pjoin, the "join" SolrCache is used to hold the join
> memory structures.
>
> To configure the vjoin you must register the ValueSource plugin in the
> solrconfig.xml as follows:
>
> <valueSourceParser name="vjoin"
>                    class="org.apache.solr.joins.ValueSourceJoinParserPlugin" />
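
For readers who want the join semantics spelled out, here is a minimal in-memory sketch in Python. It only mirrors the behaviour described in the issue text (pjoin keeps main-query records whose "to" key appears among the "from" keys; vjoin returns "fromVal" for the record whose "fromKey" matches the main record's "toKey"). It is not the contrib's implementation and does not touch Solr. The function names, the dict-based records, the `boost_f` field, and the default of 0.0 for an unmatched vjoin key are all assumptions made for illustration.

```python
from typing import Dict, Iterable, List

Doc = Dict[str, object]  # hypothetical stand-in for a Solr document


def pjoin_filter(main_hits: Iterable[Doc], from_hits: Iterable[Doc],
                 from_field: str, to_field: str) -> List[Doc]:
    """Keep main-query hits whose `to_field` value occurs in the from-side keys."""
    # Collect the join keys produced by the query against the fromCore.
    from_keys = {doc[from_field] for doc in from_hits if from_field in doc}
    # Post-filter: only records that already matched the main query are checked.
    return [doc for doc in main_hits if doc.get(to_field) in from_keys]


def vjoin_value(main_doc: Doc, from_hits: Iterable[Doc], from_key: str,
                from_val: str, to_key: str, default: float = 0.0) -> object:
    """Return `from_val` of the from-side record linked to `main_doc`."""
    # Map each from-side join key to the value that should be returned.
    lookup = {doc[from_key]: doc[from_val] for doc in from_hits}
    # The default for an unmatched key is an assumption, not documented behaviour.
    return lookup.get(main_doc.get(to_key), default)


if __name__ == "__main__":
    main_hits = [{"id": "a", "id_i": 1}, {"id": "b", "id_i": 2}, {"id": "c", "id_i": 3}]
    # Pretend these are the fromCore records matching "user:customer1".
    from_hits = [{"id_i": 1, "boost_f": 2.5}, {"id_i": 3, "boost_f": 0.5}]
    print(pjoin_filter(main_hits, from_hits, "id_i", "id_i"))
    print(vjoin_value(main_hits[0], from_hits, "id_i", "boost_f", "id_i"))
```

A hash set and a dict are used here purely for brevity; the issue text says only that the contrib builds its own memory structures and caches intermediate DocSets in the "join" SolrCache.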