[jira] [Commented] (SOLR-5017) Allow sharding based on the value of a field

Erick Erickson (JIRA) Wed, 10 Jul 2013 06:09:55 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-5017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704533#comment-13704533
 ]


Erick Erickson commented on SOLR-5017:
--------------------------------------

bq: If I have an already working system where ids cannot be changed, I have no 
option with the current scheme of things.

_Do_ you have such a system? Theoretically I agree. But it also seems like this 
change has enough edge cases that it might be better to wait and see whether 
there's enough pressure to move this forward before trying to anticipate 
problems. Premature optimization?

bq: If your code is using that API then your code should continue to work 
right...

Don't really know, I've been meaning to dive into that patch but haven't. It's 
on the SolrJ side, mostly I'm using it as an example of a place things can get 
out of sync. I'm sure there are others.

bq: What if I want to have a clean 'id' value which is devoid of extra 
information? Should I do id.substring(id.indexOf("!")) every time I use it 
elsewhere?

Yeah, that's a pain. But perhaps not as much as trying to maintain two schemes 
to route documents and deal with the issues that are sure to come up. Frankly I 
don't have a firm sense of which is better/worse, my antennae are just quivering 
based on introducing a feature that'll have repercussions before there's a 
demonstrated need. I've gotten myself into trouble too often doing that...
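
For what it's worth, the stripping the quote complains about can at least be confined to one helper. A minimal sketch, assuming ids carry a single "!"-separated routing prefix; the class and method names are made up for illustration and are not part of Solr or SolrJ:

```java
// Hypothetical helper, not a Solr/SolrJ API.
final class CompositeIds {
    private CompositeIds() {}

    /**
     * Returns the part of the id after the first '!'.
     * Ids without a '!' are returned unchanged.
     */
    static String stripRoutePrefix(String id) {
        int sep = id.indexOf('!');
        // +1 skips the separator itself, so "tenantA!doc42" yields "doc42"
        return sep < 0 ? id : id.substring(sep + 1);
    }

    public static void main(String[] args) {
        System.out.println(stripRoutePrefix("tenantA!doc42"));
        System.out.println(stripRoutePrefix("doc42"));
    }
}
```

It still has to be called everywhere the clean id is needed, which is the pain point being raised.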

bq: what happens when a document is updated and the value of this field changes?

This is exactly what I'm talking about, I'm afraid the edge cases will go on 
forever (or nearly). An N+1 kind of thing. 
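
To make that particular edge case concrete, here is a toy model of routing on a field value. It is only a sketch under loud assumptions: shards are plain in-memory maps, the shard is picked with String.hashCode() modulo the shard count (not Solr's actual hashing), and every name in it is hypothetical:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only; none of these names exist in Solr.
final class FieldRoutingSketch {
    // One map per shard: document id -> value of the routing field
    final List<Map<String, String>> shards = new ArrayList<>();

    FieldRoutingSketch(int numShards) {
        for (int i = 0; i < numShards; i++) {
            shards.add(new HashMap<>());
        }
    }

    // Assumption: simple hash-mod placement, chosen for brevity
    static int shardFor(String routeValue, int numShards) {
        return Math.floorMod(routeValue.hashCode(), numShards);
    }

    // Sends the document to whichever shard its routing field value selects
    void index(String id, String routeValue) {
        shards.get(shardFor(routeValue, shards.size())).put(id, routeValue);
    }

    // Counts how many shards currently hold a document with this id
    int copiesOf(String id) {
        int n = 0;
        for (Map<String, String> shard : shards) {
            if (shard.containsKey(id)) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        FieldRoutingSketch cluster = new FieldRoutingSketch(4);
        cluster.index("doc1", "us");
        cluster.index("doc1", "eu"); // same id, routing field changed
        System.out.println("copies of doc1: " + cluster.copiesOf("doc1"));
    }
}
```

In this model the update lands on a different shard and the old copy is never overwritten, so something would have to find and delete it.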


All that said, I'm not totally against the idea. In fact I kind of wish a 
separate "routing field" was the way it was implemented in the first place. But 
did I think to suggest it when it first started to be implemented? Nooooooo.....

But I fear at this point that having two ways of routing things around without 
a compelling _existing_ use case will generate a lot of work and lots of 
ongoing maintenance, and the effort could well be spent elsewhere in the near 
term.

But since I'm not volunteering to do the work, I really don't have all that 
much to say.


                
> Allow sharding based on the value of a field
> --------------------------------------------
>
>                 Key: SOLR-5017
>                 URL: https://issues.apache.org/jira/browse/SOLR-5017
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>
> We should be able to create a collection where sharding is done based on the 
> value of a given field.
> Collections can be created with shardField=fieldName, which will be persisted 
> in DocCollection in ZK.
> The implicit DocRouter would look at this field instead of the _shard_ field.
> CompositeIdDocRouter can also use this field instead of looking at the id 
> field. 
