[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
Jack Krupansky commented on SOLR-3442: Example schema switch to DisMax instead of CopyField

When I initially read this issue I mistakenly read it as edismax rather than dismax. So I would ask that the intent be made crystal clear: is it reasonable to switch the default query handler to the edismax query parser, or is it being suggested that the more limited dismax query parser be the new default? If the latter, we won't even be able to query specific fields without config changes.

Some of the discussion over on SOLR-2368 might be relevant, as to whether the default query parser for the example should be severely "locked-down" as opposed to highly functional (fields, Lucene syntax, etc.).

I was going to proceed with an edismax-based patch, but now I am not so sure.

This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
[ https://issues.apache.org/jira/browse/SOLR-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269794#comment-13269794 ]

Jack Krupansky commented on SOLR-3442:

bq. The lucene query parser generally shouldn't be used for user queries...

If that is the general sentiment, then having the default example *user* query parser be edismax makes perfect sense.

> Example schema switch to DisMax instead of CopyField
>
> Key: SOLR-3442
> URL: https://issues.apache.org/jira/browse/SOLR-3442
> Project: Solr
> Issue Type: Improvement
> Components: Schema and Analysis
> Reporter: Jan Høydahl
> Labels: dismax
>
> Spinoff from SOLR-3439:
> The use of copyField in today's example schema is an anti-pattern since we
> indirectly teach people to duplicate most of their content, while most would
> be better off using DisMax, or at least a combination.
[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
[ https://issues.apache.org/jira/browse/SOLR-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269710#comment-13269710 ]

Yonik Seeley commented on SOLR-3442:

bq. I would cringe a little if we change the schema so that it doesn't work very well if the user does drop back to the lucene query parser

The lucene query parser generally shouldn't be used for user queries, only programmatically generated ones. Using explicit field names (or specifying df) for that case should be fine.
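A minimal sketch of what "explicit field names (or specifying df)" could look like for a programmatically generated query. The request parameters `defType`, `q`, and `df` are standard Solr parameters; the helper names and the field names `name` and `cat` are assumptions made up for illustration, not something taken from this thread.

```python
from urllib.parse import urlencode


def quote_term(term):
    # Escape backslashes and double quotes so the term is a valid quoted phrase.
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def lucene_params(clauses, default_field=None):
    """Build request parameters for a query sent to the lucene parser.

    clauses is a list of (field, term) pairs. A field of None leaves the
    term unqualified, so it is searched in the field named by df.
    """
    parts = []
    for field, term in clauses:
        quoted = quote_term(term)
        parts.append(f"{field}:{quoted}" if field else quoted)
    params = {"defType": "lucene", "q": " AND ".join(parts)}
    if default_field:
        params["df"] = default_field
    return params


# Hypothetical field names: every clause names its field explicitly.
explicit = lucene_params([("name", "ipod"), ("cat", "electronics")])

# Alternatively, leave the term unqualified and pass df with the request.
with_df = lucene_params([(None, "ipod")], default_field="name")

print(urlencode(explicit))
print(urlencode(with_df))
```

Neither form depends on a catch-all default field in the schema, which is the point being made above.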
[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
[ https://issues.apache.org/jira/browse/SOLR-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269684#comment-13269684 ]

Jack Krupansky commented on SOLR-3442:

I don't disagree with the gist of your argument, but I would cringe a little if we change the schema so that it doesn't work very well when the user drops back to the lucene query parser with &defType=lucene, which has only a single default field.

OTOH, maybe that is simply the cost of making the example schema (and config) more representative of "best practices". But that sort of implies that the Lucene query parser is not a "best practice", at least when searchable text content is spread over multiple fields.
[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
[ https://issues.apache.org/jira/browse/SOLR-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269603#comment-13269603 ]

Jan Høydahl commented on SOLR-3442:

I'm not saying anything is "dead". Both the "lucene" query parser and copyField have their place and are supported, and you can mix and match them with DisMax to fit your needs. But for the example we should select the most useful and flexible way to show indexing and search, and that is no longer the "text" catch-all field and copyField. Aside from doubling the size of your index, it is inflexible in that adding or removing a field from search means a schema update and re-indexing. Catch-all fields with copyField can sometimes be used as a performance optimization, but that is not where you start.

Maintaining many examples has proven not to be a very good strategy. Look at the multi-core and DIH examples: they lag several versions behind when it comes to schema version and new solrconfig syntax. Instead, a single schema which handles both the product search and document search use cases well is easy to achieve. The Velocity GUI can be extended with two tabs if need be, one "products" tab and one "documents" tab. If we choose the example documents to index wisely, e.g. user guides for the products, we get a nice connection: you can search for "ipod" and see both products and user guides matching your search.
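A rough sketch of the flexibility argument above, assuming the edismax parser and its `qf` parameter (both real Solr features). The helper name, the field names, and the boosts are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode


def edismax_params(user_query, field_boosts):
    """Build request parameters that search several fields directly via qf.

    field_boosts maps a field name to its boost; a boost of 1 is written
    without the ^boost suffix.
    """
    qf = " ".join(
        field if boost == 1 else f"{field}^{boost}"
        for field, boost in field_boosts.items()
    )
    return {"defType": "edismax", "q": user_query, "qf": qf}


# Hypothetical fields: search three fields, weighting "name" more heavily.
wide = edismax_params("ipod", {"name": 2, "features": 1, "cat": 1})

# Removing a field from search only changes the request parameters.
narrow = edismax_params("ipod", {"name": 2, "features": 1})

print(urlencode(wide))
print(urlencode(narrow))
```

With a copyField catch-all, the same change would mean editing the schema and re-indexing; here the set of searched fields is decided per request (or in the handler defaults).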
[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
[ https://issues.apache.org/jira/browse/SOLR-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269584#comment-13269584 ]

Jack Krupansky commented on SOLR-3442:

Maybe Solr has outgrown the concept of a single example schema/config. "Full function" and "maximal performance" conflict to some degree, and picking one arbitrary point on the design spectrum does a disservice to those who have varying requirements. The current example already has performance tips and a warning not to use it for benchmarking. And SolrCell documents, which have "core", common metadata, are somewhat distinct from full-custom schema design.

The copyField-to-"text" pattern is more clearly targeted at non-dismax users, where "text" is the single default search field. This issue essentially raises the question: is non-dismax query parsing dead? If not, the copyField/text pattern still seems relevant.

Maybe it would be worth having a modest library of schema/config files that the user can select from when running "example". OTOH, maintaining a lot of somewhat similar files can be a pain. A way to configure the schema/config files (conditionals) would be helpful.
[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
[ https://issues.apache.org/jira/browse/SOLR-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269509#comment-13269509 ]

Jan Høydahl commented on SOLR-3442:

Sure, I've seen it successfully used too, and I use it myself now and then to reduce the number of fields required in "qf". For very small indexes without much need for tuning analysis or relevancy it does not matter very much. But I'm arguing that copyField is the legacy way of searching multiple fields in one go, while DisMax is the current recommendation. So why stick to the legacy in the default example?
[jira] [Commented] (SOLR-3442) Example schema switch to DisMax instead of CopyField
[ https://issues.apache.org/jira/browse/SOLR-3442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13269489#comment-13269489 ]

Chris Male commented on SOLR-3442:

I think it's a pretty bold claim to call it an anti-pattern. I've seen it successfully used in many projects and it continues to fulfill user needs.