[ https://issues.apache.org/jira/browse/SOLR-14701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17170013#comment-17170013 ]
Jan Høydahl commented on SOLR-14701:
------------------------------------

{quote}When we guess wrong, you can't index some documents{quote}
Sure. But when we guess right, you can :)

{quote}The mechanism for updating the schema is fragile{quote}
Doesn't matter, as this is NOT a production feature, so don't expect there to be any load or any large number of servers/shards involved.

{quote}It's another instance of complex code that we have to maintain.{quote}
True. We always have to weigh benefit vs complexity. Since this is mostly contained to *one* URP, I'm not overly worried. It would be interesting to hear what the user community says about it. Perhaps they love it, or perhaps they hate it. Probably a good chunk of both.

{quote}We don't really deliver "schemaless". What we deliver is something that doesn't (and can't) work correctly.{quote}
For some use cases with well-formatted, typed data it can work really well. Other times not so much. What I tend to do is a first run, identify the problematic 1-3 fields that get mixed up, then write {{add-field}} Schema API commands for those in a script that runs before ingestion, and let the system guess the rest. If you have used Elastic, this is exactly what you need to do there as well.

{quote}and the users _still_ have to go in and tweak the schema{quote}
Of course they do. ALL search apps need to tweak the schema. For Solr. For Elastic. For MySQL. And we must tell them clearly. This feature is only an aid very early on in exploring your data, to avoid having to hand-edit 142 {{<field>}} tags in a schema before you can even look at your data.

{quote}Version control is another hidden gotcha.{quote}
It's not hidden, is it? We recommend AGAINST this feature in production, i.e. turn it off once you reach a stable schema, keep your schema in version control with your application, and use the Schema API or whatever. Perhaps that can be documented even better.
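A minimal sketch of the pre-ingestion step described above: explicit {{add-field}} commands for the few fields that get guessed wrong, leaving the rest to schemaless guessing. The field names, the Solr URL, and the collection name are hypothetical, and the field types ({{string}}, {{pdouble}}, {{pdate}}) are assumed to exist in the configset, as they do in {{_default}}. Treat it as an illustration, not as the script the commenter actually uses.

```python
import json
import urllib.request


def build_add_field_payload(fields):
    """Build a Schema API request body with one add-field command per field.

    `fields` maps a field name to the name of an existing field type.
    """
    commands = []
    for name, field_type in fields.items():
        commands.append({"name": name, "type": field_type, "stored": True})
    return {"add-field": commands}


def post_schema_update(solr_url, collection, payload):
    """POST the payload to the collection's /schema endpoint and return the parsed reply."""
    request = urllib.request.Request(
        f"{solr_url}/{collection}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical fields that a first schemaless run guessed wrong.
PROBLEM_FIELDS = {"zip_code": "string", "price": "pdouble", "released": "pdate"}

if __name__ == "__main__":
    payload = build_add_field_payload(PROBLEM_FIELDS)
    print(json.dumps(payload, indent=2))
    # Uncomment to apply against a running Solr (URL and collection are assumptions):
    # post_schema_update("http://localhost:8983/solr", "mycollection", payload)
```

Running this before the first document is indexed pins the types of the listed fields; any field not listed is still created by the guessing logic when schemaless mode is enabled.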
{quote}Hmmm, though if we wanted to help them make a real schema, we could write something that processed an existing index{quote}
Or we could just make a page in the Admin UI schema tab, a schema wizard, where they could paste N documents, similar to what they can do in the "Documents" tab, and we detect the most likely schema from those documents and spit out JSON that can be used with the Schema API to bootstrap that schema?

> Deprecate Schemaless Mode (Discussion)
> --------------------------------------
>
>                 Key: SOLR-14701
>                 URL: https://issues.apache.org/jira/browse/SOLR-14701
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: Schema and Analysis
>            Reporter: Marcus Eagan
>            Priority: Major
>
> I know this won't be the most popular ticket out there, but I am growing more
> and more sympathetic to the idea that we should rip many of the freedoms out
> that cause users more harm than not. One of the freedoms I saw time and time
> again to cause issues was schemaless mode. It doesn't work as named or
> documented, so I think it should be deprecated.
> If you use it in production reliably and in a way that cannot be accomplished
> another way, I am happy to hear from more knowledgeable folks as to why
> deprecation is a bad idea.