[ https://issues.apache.org/jira/browse/SOLR-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230243#comment-13230243 ]
Yonik Seeley commented on SOLR-3250:
------------------------------------

Of course hopefully everyone knows "schemaless" is mostly marketing b.s. - when people do this, there is still a schema, but it's guessed on first use (and hence generally a horrible idea for production systems).

It would be easy enough on a single node... but how does one handle a cluster? Say you index price=0 on nodeA, and price=100.0 on nodeB?

A quick thought on how it might work:
- have a separate file auto_fields.json that keeps track of the mappings that would be the same for all cores using that schema
- when we run across a field we haven't seen before, we must guess a type for it, then grab a lock
- update the auto_fields.json
- we can update our in-memory schema with any new fields we find in auto_fields.json
- works the same in ZK mode... it's just the auto_fields.json is in ZK, and we would use something like optimistic locking to update it

> Dynamic Field capabilities based on value not name
> --------------------------------------------------
>
> Key: SOLR-3250
> URL: https://issues.apache.org/jira/browse/SOLR-3250
> Project: Solr
> Issue Type: Improvement
> Reporter: Grant Ingersoll
>
> In some situations, one already knows the schema of their content, so having
> to declare a schema in Solr becomes cumbersome in some situations. For
> instance, if you have all your content in JSON (or can easily generate it) or
> other typed serializations, then you already have a schema defined. It would
> be nice if we could have support for dynamic fields that used whatever name
> was passed in, but then picked the appropriate FieldType for that field based
> on the value of the content. So, for instance, if the input is a number, it
> would select the appropriate numeric type. If it is a plain text string, it
> would pick the appropriate text field (you could even add in language
> detection here). If it is comma separated, it would treat them as keywords,
> etc.
> Also, we could likely send in a hint as to the type too.
> With this approach, you of course have a "first in wins" situation, but
> assuming you have this schema defined elsewhere, it is likely fine.
> Supporting such cases would allow us to be schemaless when appropriate, while
> offering the benefits of schemas when appropriate. Naturally, one could mix
> and match these too.
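
The two ideas in this thread, guessing a field type from the value and getting every node to agree on that guess through a shared auto_fields.json updated with optimistic locking, can be sketched roughly as follows. This is an illustrative Python sketch, not Solr code: the type names ("long", "double", "keywords", "text"), the `SharedFieldStore` class (an in-memory stand-in for the file or ZK node), and the `resolve_field` helper are all assumptions made up for the example.

```python
import json


def guess_type(value):
    """Guess a (hypothetical) field type name from a single value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str) and "," in value:
        return "keywords"  # comma-separated input treated as keywords
    return "text"


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of the mapping."""


class SharedFieldStore:
    """In-memory stand-in for auto_fields.json: a JSON blob plus a version."""

    def __init__(self):
        self._data = "{}"
        self._version = 0

    def read(self):
        return json.loads(self._data), self._version

    def write(self, mapping, expected_version):
        # Optimistic locking: accept the write only if nobody else wrote first.
        if expected_version != self._version:
            raise VersionConflict()
        self._data = json.dumps(mapping, sort_keys=True)
        self._version += 1


def resolve_field(store, local_schema, name, value, max_retries=5):
    """Return the agreed type for `name`, registering a guess if it is new."""
    if name in local_schema:
        return local_schema[name]
    for _ in range(max_retries):
        mapping, version = store.read()
        local_schema.update(mapping)  # pick up fields other nodes registered
        if name in mapping:
            return mapping[name]  # first in wins
        mapping[name] = guess_type(value)
        try:
            store.write(mapping, version)
        except VersionConflict:
            continue  # someone else updated the mapping; re-read and retry
        local_schema[name] = mapping[name]
        return mapping[name]
    raise RuntimeError("could not register field after retries")


# The price=0 / price=100.0 case from the comment above:
store = SharedFieldStore()
node_a, node_b = {}, {}
print(resolve_field(store, node_a, "price", 0))      # long
print(resolve_field(store, node_b, "price", 100.0))  # long (node A's guess wins)
```

Note that this only makes the nodes agree; it does not make the first guess a good one. In the example, price ends up as an integer type even though a later document carries 100.0, which is the "guessed on first use" hazard raised in the comment.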