[jira] Commented: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12856517#action_12856517 ] Yonik Seeley commented on SOLR-1632: Rewrite not working through function query is not the end of the problems either... there is also stuff like extractTerms. There is also the issue of Lucene changing rapidly... and the difficulty of adding new methods to ValueSource and making sure that all implementations correctly propagate them through to sub ValueSources. Perhaps one idea is to use a visitor pattern to decouple tree traversal with the operations being performed. Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib-2.patch, distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12856220#action_12856220 ] Yonik Seeley commented on SOLR-1632: Was looking into this a little offline with Mark, who noticed that some queries were not being rewritten, and would thus throw an exception during weighting. It looks like the issue is this: rewrite() doesn't work for function queries (there is no propagation mechanism to go through value sources). This is a problem when real queries are embedded in function queries. Solr Function queries do have a mechanism to weight (via ValueSource.createWeight()). QueryValueSource does Weight w = q.weight(searcher); and that implementation of weight calls Query query = searcher.rewrite(this); This patch calls rewrite explicitly (which does nothing for embedded queries), and then when using the DFSource implementation of searcher, rewrite does nothing, and hence the embedded query is never rewritten and the subsequent createWeight() throws an exception. Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib-2.patch, distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Commented: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12793283#action_12793283 ] Marc Sturlese commented on SOLR-1632: - Wich should be the value of the parameter shard.purpose to enable or disable the exact version of global IDF? Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789174#action_12789174 ] Andrzej Bialecki commented on SOLR-1632: - I'm not sure what approach you are referring to. Following the terminology in that thread, this implementation follows the approach where there is a single merged big idf map at the master, and it's sent out to slaves on each query. However, when exactly this merging and sending happens is implementation-specific - in the ExactDFSource it happens on every query, but I hope the API can support other scenarios as well. Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789379#action_12789379 ] Otis Gospodnetic commented on SOLR-1632: I didn't look a the patch, but from your comments it looks like you already have that 1 merged big idf map, which is really what I was aiming at, so that's good! I was just thinking that this map (file) would be periodically updated and pushed to slaves, so that slaves can compute the global IDF *locally* instead of any kind of extra requests. Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789607#action_12789607 ] Andrzej Bialecki commented on SOLR-1632: - I believe the API that I propose would support such implementation as well. Please note that it's usually not feasible to compute and distribute the complete IDF table for all terms - you would have to replicate a union of all term dictionaries across the cluster. In practice, you limit the amount of information by various means, e.g. only distributing data related to the current request (this implementation) or reducing the frequency of updates (e.g. LRU caching), or approximating global DF with a constant for frequent terms (where the contribution of their IDF to the score would be negligible anyway). Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (SOLR-1632) Distributed IDF
[ https://issues.apache.org/jira/browse/SOLR-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789120#action_12789120 ] Otis Gospodnetic commented on SOLR-1632: What about this approach: http://markmail.org/message/mjfmpzfspguepixx ? Distributed IDF --- Key: SOLR-1632 URL: https://issues.apache.org/jira/browse/SOLR-1632 Project: Solr Issue Type: New Feature Components: search Affects Versions: 1.5 Reporter: Andrzej Bialecki Attachments: distrib.patch Distributed IDF is a valuable enhancement for distributed search across non-uniform shards. This issue tracks the proposed implementation of an API to support this functionality in Solr. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.