[ 
https://issues.apache.org/jira/browse/SOLR-3250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230243#comment-13230243
 ] 

Yonik Seeley commented on SOLR-3250:
------------------------------------

Of course, hopefully everyone knows that "schemaless" is mostly marketing b.s.: 
when people do this, there is still a schema, but it is guessed on first use 
(and hence generally a horrible idea for production systems).

It would be easy enough on a single node... but how does one handle a cluster?
Say you index price=0 on nodeA and price=100.0 on nodeB: each node could guess 
a different type for the same field.
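
To make the cluster problem concrete, here is a minimal sketch of two nodes 
guessing independently. The guess_type function and the type names ("int", 
"float", "text", "boolean") are hypothetical, chosen for illustration only; 
they are not Solr APIs or Solr field type names.

```python
def guess_type(value):
    """Guess a field type name from a single value (hypothetical type names)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    return "text"


# Each node guesses from the first value it happens to see for the field.
node_a = {"price": guess_type(0)}      # nodeA sees price=0
node_b = {"price": guess_type(100.0)}  # nodeB sees price=100.0

print(node_a["price"], node_b["price"])  # int float
```

Without some shared record of the first guess, the two nodes end up with 
incompatible schemas for the same field.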

A quick thought on how it might work:
 - have a separate file, auto_fields.json, that keeps track of the mappings, 
which would be the same for all cores using that schema
 - when we run across a field we haven't seen before, we guess a type for it, 
then grab a lock and update auto_fields.json
 - we can update our in-memory schema with any new fields we find in 
auto_fields.json
 - it works the same way in ZK mode: auto_fields.json just lives in ZK, and we 
would use something like optimistic locking to update it
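
The steps above could be sketched roughly as follows. This is only an 
illustration under stated assumptions: VersionedStore is a hypothetical 
in-memory stand-in for wherever auto_fields.json lives, and register_field is a 
hypothetical helper, not an existing Solr function. The compare-and-set write 
mimics the kind of version-checked update that ZooKeeper offers, which is one 
way to get optimistic locking.

```python
import json


class VersionedStore:
    """Hypothetical in-memory stand-in for the shared auto_fields.json."""

    def __init__(self):
        self.data = "{}"
        self.version = 0

    def read(self):
        return self.data, self.version

    def write_if_version(self, data, expected_version):
        # Compare-and-set: succeed only if nobody has written since our read.
        if expected_version != self.version:
            return False
        self.data = data
        self.version += 1
        return True


def register_field(store, name, guessed_type, max_retries=10):
    """Return the shared type for `name`, recording our guess if it is new."""
    for _ in range(max_retries):
        raw, version = store.read()
        fields = json.loads(raw)
        if name in fields:
            return fields[name]  # someone already registered it: first in wins
        fields[name] = guessed_type
        if store.write_if_version(json.dumps(fields, sort_keys=True), version):
            return guessed_type
        # Lost the race to another writer; re-read and try again.
    raise RuntimeError("could not update auto_fields.json")


store = VersionedStore()
print(register_field(store, "price", "int"))    # nodeA registers first: int
print(register_field(store, "price", "float"))  # nodeB gets nodeA's type: int
```

Each node would then update its in-memory schema from whatever type 
register_field returns, rather than from its own guess.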


                
> Dynamic Field capabilities based on value not name
> --------------------------------------------------
>
>                 Key: SOLR-3250
>                 URL: https://issues.apache.org/jira/browse/SOLR-3250
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>
> In some situations, one already knows the schema of their content, so having 
> to declare a schema in Solr becomes cumbersome.  For 
> instance, if you have all your content in JSON (or can easily generate it) or 
> other typed serializations, then you already have a schema defined.  It would 
> be nice if we could have support for dynamic fields that used whatever name 
> was passed in, but then picked the appropriate FieldType for that field based 
> on the value of the content.  So, for instance, if the input is a number, it 
> would select the appropriate numeric type.  If it is a plain text string, it 
> would pick the appropriate text field (you could even add in language 
> detection here).  If it is comma separated, it would treat the values as 
> keywords, etc.  Also, we could likely send in a hint as to the type too.
> With this approach, you of course have a "first in wins" situation, but 
> assuming you have this schema defined elsewhere, it is likely fine.
> Supporting such cases would allow us to be schemaless when appropriate, while 
> offering the benefits of schemas when appropriate.  Naturally, one could mix 
> and match these too.
