[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2013-03-22 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610468#comment-13610468
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1417070

SOLR-4114: Allow creating more than one shard per instance with the Collection 
API.


 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Mark Miller
  Labels: collection-api, multicore, shard, shard-allocation
 Fix For: 4.1, 5.0

 Attachments: 
 SOLR-4114_mocking_OverseerCollectionProcessorTest_branch_4x.patch, 
 SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, 
 SOLR-4114_trunk.patch


 We should support running multiple shards from one collection on the same 
 Solr server - e.g. run a collection with 8 shards on a 4 Solr server cluster 
 (each Solr server running 2 shards).
 Performance tests on our side have shown that this is a good idea, and it is 
 also a good idea for easy elasticity later on - it is much easier to move an 
 entire existing shard from one Solr server to another one that just joined 
 the cluster than it is to split an existing shard between the Solr server 
 that used to run it and the new one.
 See the dev mailing list discussion "Multiple shards for one collection on 
 the same Solr server".
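
To make the 8-shards-on-4-servers example concrete, here is a minimal sketch of a round-robin placement. It is only an illustration: the class name ShardPlacement, the method assign, and the node and shard names are hypothetical and are not taken from the Solr code base or from the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ShardPlacement {

    /**
     * Hypothetical helper: distributes numShards shards over the given nodes
     * in round-robin order, so a node may end up hosting several shards.
     */
    public static Map<String, List<String>> assign(int numShards, List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("need at least one node");
        }
        // LinkedHashMap keeps the nodes in the order they were given.
        Map<String, List<String>> placement = new LinkedHashMap<>();
        for (String node : nodes) {
            placement.put(node, new ArrayList<>());
        }
        for (int i = 0; i < numShards; i++) {
            String node = nodes.get(i % nodes.size());
            placement.get(node).add("shard" + (i + 1));
        }
        return placement;
    }

    public static void main(String[] args) {
        // 8 shards on a 4-server cluster: each server ends up with 2 shards.
        List<String> nodes = Arrays.asList("solr1", "solr2", "solr3", "solr4");
        assign(8, nodes).forEach((node, shards) -> System.out.println(node + " -> " + shards));
    }
}
```

With such a layout, growing the cluster means moving whole shards to the new server rather than splitting an existing one.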

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540383#comment-13540383
 ] 

Per Steffensen commented on SOLR-4114:
--

Maybe you ought to fix the real problem instead - making Solr behave the same 
whether it is checked out from GitHub or SVN.

Had a look at your fix - it ought to be
{code}
 ((value = server.getAttribute(mbean, coreName)) != null)
{code}
instead of
{code}
 ((indexDir = server.getAttribute(mbean, coreName)) != null)
{code}

Regards, Per Steffensen




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540390#comment-13540390
 ] 

Per Steffensen commented on SOLR-4114:
--

Guess the following will do
{code}
 (server.getAttribute(mbean, coreName) != null)
{code}
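
For readers unfamiliar with JMX, the null check above can be tried in isolation with the standard javax.management API. This is only a sketch under stated assumptions: the CoreInfo MBean, its IndexDir attribute and the object name are hypothetical stand-ins, not Solr's actual solrcore mbean.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class AttributeCheck {

    // Hypothetical management interface; a standard MBean interface
    // must be named <ClassName>MBean.
    public interface CoreInfoMBean {
        String getIndexDir();
    }

    // Hypothetical MBean exposing a single read-only attribute "IndexDir".
    public static class CoreInfo implements CoreInfoMBean {
        private final String indexDir;

        public CoreInfo(String indexDir) {
            this.indexDir = indexDir;
        }

        @Override
        public String getIndexDir() {
            return indexDir;
        }
    }

    /** Registers a CoreInfo MBean under the given name (illustration only). */
    public static ObjectName register(MBeanServer server, String name, String indexDir) {
        try {
            ObjectName objectName = new ObjectName(name);
            server.registerMBean(new CoreInfo(indexDir), objectName);
            return objectName;
        } catch (JMException e) {
            throw new IllegalStateException("could not register " + name, e);
        }
    }

    /** True if the attribute can be read from the MBean and is non-null. */
    public static boolean hasAttribute(MBeanServer server, ObjectName mbean, String attribute) {
        try {
            return server.getAttribute(mbean, attribute) != null;
        } catch (JMException e) {
            // Unknown MBean or attribute, or the getter failed.
            return false;
        }
    }

    public static void main(String[] args) {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName mbean = register(server, "example:type=CoreInfo,name=demo", "/tmp/demo/index");
        System.out.println(hasAttribute(server, mbean, "IndexDir"));
        System.out.println(hasAttribute(server, mbean, "NoSuchAttribute"));
    }
}
```

Since only the non-null test matters here, the returned value does not need to be stored in a variable.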




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-27 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540122#comment-13540122
 ] 

Mark Miller commented on SOLR-4114:
---

Ran into some trouble with the 'no two shards use the same index dir' check 
with a git checkout - finally traced it down to checking the source attribute 
from the solrcore mbean - and when not using svn, this can be the literal 
$URL, because the svn keyword is never substituted. I'll look at an alternate 
check to do instead.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-27 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540154#comment-13540154
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1426329

SOLR-4114: tests: make happy with git - source attrib counts on svn substitution





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-27 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13540160#comment-13540160
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1426330

SOLR-4114: tests: make happy with git - source attrib counts on svn substitution





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-12 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529791#comment-13529791
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. Great patch! I just had to make a couple small tweaks to make the policeman 
happy - added license files for the test jars, changed the new test to extend 
LuceneTestCase. Good stuff found by the long but useful precommit ant target.

Yes, I do not run the precommit target before sending a patch. Of course I will 
do so before committing myself if/when I become a committer.

Just out of curiosity, why do tests HAVE to extend LuceneTestCase?

bq. I think that covers all of your points Per - let me know

Checking only on branch_4x, hoping that you made the exact same changes on 
trunk. I updated my local checkout of branch_4x and checked that everything 
seemed to be there. It was. I am ready to close this ticket.






[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-12 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529797#comment-13529797
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. Feel free to give feedback in that issue - this was my first experience 
looking at easymock.

I added a comment on SOLR-4136, but basically you did a fine job!

Now that SOLR-4114 is closed would you mind continuing with SOLR-4120, Mark?




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-12 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529976#comment-13529976
 ] 

Mark Miller commented on SOLR-4114:
---

I don't often run precommit - it takes too long - but I generally do when adding 
new dependencies because I know there will be license stuff. Jenkins will catch 
it even if we don't before commit, though.

You have to extend LuceneTestCase because that's what hooks you into our test 
framework, I believe - e.g. it enforces things like setup and teardown calling 
super and a ton of other nice stuff.

If you don't do it, Jenkins will complain and fail.

I'll look at SOLR-4120.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-12 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530650#comment-13530650
 ] 

Mark Miller commented on SOLR-4114:
---

Thanks for the contribution Per!




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529069#comment-13529069
 ] 

Mark Miller commented on SOLR-4114:
---

Hey Per - thanks! I'll try and get back to helping finish up this issue - at a 
minimum I'll get a commit or two in on it today.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: 
 SOLR-4114_mocking_OverseerCollectionProcessorTest_branch_4x.patch, 
 SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, 
 SOLR-4114_trunk.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529093#comment-13529093
 ] 

Per Steffensen commented on SOLR-4114:
--

Thanks Mark

I guess what is left is:
* Decide if you want to enable the call to checkCollectionIsNotCreated and 
set the wait in it to 10 sec (or so), or if you are prepared to drop this part 
of the test, now that you have the OverseerCollectionProcessorTest
* Maybe make a comment on SOLR-4043 that you ought to modify the 
BasicDistributedZkTest.testCollectionsAPI test as a part of solving that issue.
* Add the checkNoTwoShardsUseTheSameIndexDir thing if you (also) find it 
useful (see above)
* Add SOLR-4114_mocking_OverseerCollectionProcessorTest_branch_4x.patch
But you probably want to read through the (last of the) comments above to make 
sure we didn't miss something we promised (each other and/or Robert Muir) :-)

Regards, Per Steffensen




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529180#comment-13529180
 ] 

Mark Miller commented on SOLR-4114:
---

Great patch! I just had to make a couple small tweaks to make the policeman 
happy - added license files for the test jars, changed the new test to extend 
LuceneTestCase. Good stuff found by the long but useful precommit ant target.

I also added in the 10 second wait for now.

I'll commit very shortly.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529184#comment-13529184
 ] 

Mark Miller commented on SOLR-4114:
---

bq. Add the checkNoTwoShardsUseTheSameIndexDir thing if you (also) find it 
useful (see above)

Missed that comment - I'll add this in to the mix.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529198#comment-13529198
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1420327

SOLR-4114: add back commented out test with 10 second wait, add Per's new test, 
add test to ensure no two cores use the same index directory
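
The idea behind a "no two cores use the same index directory" check can be sketched as follows. This is not the committed test: the class IndexDirCheck and its method are hypothetical, and the sketch assumes the index directories have already been collected as plain path strings.

```java
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class IndexDirCheck {

    /**
     * Hypothetical helper: returns true if every path in indexDirs refers to
     * a distinct directory after normalization (e.g. "a/./b" equals "a/b").
     */
    public static boolean noTwoCoresUseTheSameIndexDir(List<String> indexDirs) {
        Set<String> seen = new HashSet<>();
        for (String dir : indexDirs) {
            String normalized = Paths.get(dir).toAbsolutePath().normalize().toString();
            // Set.add returns false if the directory was already recorded.
            if (!seen.add(normalized)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> distinct = Arrays.asList("/data/shard1/index", "/data/shard2/index");
        List<String> clashing = Arrays.asList("/data/shard1/index", "/data/shard1/./index");
        System.out.println(noTwoCoresUseTheSameIndexDir(distinct));
        System.out.println(noTwoCoresUseTheSameIndexDir(clashing));
    }
}
```

Such a check matters once several shards of one collection run in the same Solr instance, since two cores sharing a directory would overwrite each other's index.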





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529207#comment-13529207
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revision&revision=1420329

SOLR-4114: add back commented out test with 10 second wait, add Per's new test, 
add test to ensure no two cores use the same index directory





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529209#comment-13529209
 ] 

Mark Miller commented on SOLR-4114:
---

I think that covers all of your points Per - let me know.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-11 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529298#comment-13529298
 ] 

Mark Miller commented on SOLR-4114:
---

Note: hossman has run into trouble with the EasyMock test in SOLR-4136 - I've 
attached a small patch there that extends the mocking to cover SolrZkClient and 
a call to get the base URL.

I suppose it's not ideal in that it simply returns _ replaced with /, so the 
test won't work with a Solr context that contains an underscore - but it's a 
start, works currently, and can be improved.
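
The underscore limitation can be seen in a minimal sketch. The class and method names here are hypothetical and are not taken from the SOLR-4136 patch; the sketch only assumes node names of the form host:port_context and a mock that replaces every underscore with a slash.

```java
public class BaseUrlSketch {

    // Hypothetical stand-in for the mocked call: every '_' becomes '/'.
    static String naiveBaseUrl(String nodeName) {
        return nodeName.replace('_', '/');
    }

    public static void main(String[] args) {
        // A context without an underscore converts as intended.
        System.out.println(naiveBaseUrl("127.0.0.1:8983_solr"));
        // A context that itself contains an underscore gets split by mistake.
        System.out.println(naiveBaseUrl("127.0.0.1:8983_my_solr"));
    }
}
```

With a context such as my_solr, the second call yields 127.0.0.1:8983/my/solr rather than 127.0.0.1:8983/my_solr, which is the case the mock does not handle.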

Feel free to give feedback in that issue - this was my first experience looking 
at EasyMock.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-06 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511336#comment-13511336
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. I'm personally okay with adding like a 10 second wait until that gets in.

So please do it.
Enable the call to checkCollectionIsNotCreated and set the wait in it to 10 
sec.
Maybe also make a comment on SOLR-4043 that you ought to modify the 
BasicDistributedZkTest.testCollectionsAPI test as a part of solving this issue.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, 
 SOLR-4114.patch, SOLR-4114_trunk.patch


 We should support running multiple shards from one collection on the same 
 Solr server - for example, running a collection with 8 shards on a cluster 
 of 4 Solr servers (each Solr server running 2 shards).
 Performance tests on our side have shown that this is a good idea, and it is 
 also a good idea for easy elasticity later on - it is much easier to move an 
 entire existing shard from one Solr server to another one that just joined 
 the cluster than it is to split an existing shard between the Solr server 
 that used to run it and the new Solr server.
 See the dev mailing list discussion "Multiple shards for one collection on 
 the same Solr server".




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-06 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13511348#comment-13511348
 ] 

Per Steffensen commented on SOLR-4114:
--

I verified that removing my controlled instance-dir and data-dir from 
OverseerCollectionProcessor.createCollection is OK. I needed to investigate 
how instance-dir and data-dir work. Now I know, and I can see that the 
controlled instance-dir and data-dir were a bad idea. Thanks for being so 
thorough, Mark.

During my investigation of instance-dir and data-dir I came up with an 
additional test for BasicDistributedZkTest.testCollectionsAPI: a check that, 
after creating a lot of collections, no two (or more) shards end up using the 
same index-dir. That was actually what I was afraid would happen when you 
(Mark) removed the controlled instance-dir and data-dir. This additional 
test part runs very fast (200 ms on my local machine), so including it will 
not noticeably extend the run time of the test. Instead of sending a patch I 
will just explain how to get this additional check into 
BasicDistributedZkTest (this description works on 4.0, but I can't imagine 
that it wouldn't work on 5.x or 4.x):
* Add this method somewhere in BasicDistributedZkTest
{code}
  private void checkNoTwoShardsUseTheSameIndexDir() throws Exception {
    // Maps each index directory to the names of the cores that report it
    Map<String, Set<String>> indexDirToShardNamesMap =
        new HashMap<String, Set<String>>();

    // Collect every MBean server that may hold SolrCore MBeans
    List<MBeanServer> servers = new LinkedList<MBeanServer>();
    servers.add(ManagementFactory.getPlatformMBeanServer());
    servers.addAll(MBeanServerFactory.findMBeanServer(null));
    for (final MBeanServer server : servers) {
      Set<ObjectName> mbeans = new HashSet<ObjectName>();
      mbeans.addAll(server.queryNames(null, null));
      for (final ObjectName mbean : mbeans) {
        Object value;
        Object indexDir;
        Object name;
        try {
          // Only consider MBeans of category CORE whose source is SolrCore
          if (((value = server.getAttribute(mbean, "category")) != null
                  && value.toString().equals(Category.CORE.toString()))
              && ((value = server.getAttribute(mbean, "source")) != null
                  && value.toString().contains(SolrCore.class.getSimpleName()))
              && ((indexDir = server.getAttribute(mbean, "indexDir")) != null)
              && ((name = server.getAttribute(mbean, "name")) != null)) {
            if (!indexDirToShardNamesMap.containsKey(indexDir.toString())) {
              indexDirToShardNamesMap.put(indexDir.toString(),
                  new HashSet<String>());
            }
            indexDirToShardNamesMap.get(indexDir.toString()).add(name.toString());
          }
        } catch (Exception e) {
          // ignore, just continue - probably a "category" or "source"
          // attribute not found
        }
      }
    }

    // Guard against the check silently passing because nothing was found
    assertTrue("Something is broken in the assert for no shards using the same "
        + "indexDir - probably something was changed in the attributes "
        + "published in the MBean of " + SolrCore.class.getSimpleName(),
        indexDirToShardNamesMap.size() > 0);
    for (Entry<String, Set<String>> entry : indexDirToShardNamesMap.entrySet()) {
      if (entry.getValue().size() > 1) {
        fail("We have shards using the same indexDir. E.g. shards "
            + entry.getValue().toString() + " all use indexDir " + entry.getKey());
      }
    }
  }
{code}
* Add a call to this method (checkNoTwoShardsUseTheSameIndexDir();) at the end 
of BasicDistributedZkTest.testCollectionsAPI
* Add the line lst.add("indexDir", getIndexDir()); to 
SolrCore.getStatistics() so that index-dir will also be part of the information 
exposed in the MBean of SolrCore

Please consider including the additional test. It scans all SolrCores in the 
system to see if any of them share index-dir. I do the scanning by accessing 
MBean info from SolrCores - the simplest way I could come up with. It means 
that SolrCore will now also expose index-dir through its MBean, but I guess no 
one would have anything against that.
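
The core of the check, separated from the MBean lookup, can be sketched as follows. This is only an illustration: the class name, method name and sample core names are hypothetical, and the sketch assumes the core-to-index-dir pairs have already been collected.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class SharedIndexDirSketch {

    /** Groups core names by index dir and keeps only dirs used by two or more cores. */
    static Map<String, Set<String>> findSharedIndexDirs(Map<String, String> indexDirByCore) {
        Map<String, Set<String>> coresByDir = new TreeMap<>();
        for (Map.Entry<String, String> e : indexDirByCore.entrySet()) {
            coresByDir.computeIfAbsent(e.getValue(), dir -> new TreeSet<>()).add(e.getKey());
        }
        // Drop every directory that is used by exactly one core.
        coresByDir.values().removeIf(cores -> cores.size() < 2);
        return coresByDir;
    }

    public static void main(String[] args) {
        // Made-up sample data: two cores wrongly share /data/a/index.
        Map<String, String> indexDirByCore = new TreeMap<>();
        indexDirByCore.put("coll_shard1", "/data/a/index");
        indexDirByCore.put("coll_shard2", "/data/b/index");
        indexDirByCore.put("coll_shard3", "/data/a/index");
        System.out.println(findSharedIndexDirs(indexDirByCore));
    }
}
```

An empty result means no two cores share an index-dir; any entry that remains names the directory and the cores that collide on it.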

Regards, Per Steffensen


[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510351#comment-13510351
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. I suppose a 10 second sleep is more agreeable, but these things add up and 
I'd rather come up with a better test. 

I would rather too, but I believe it is hard to make sure something does not 
happen without giving it a chance to happen and then checking that it did not. 
So there is no better way of testing it, as I see it. At least not if you want 
this kind of integration-ish test, where you start a full system and do 
something to it as if you were a real client acting against a real system. And 
I like this kind of test. If we do a more unit-test-ish test directly on 
OverseerCollectionProcessor we might be able to do something faster, but it 
will not ensure the correct system-level functionality to nearly the same 
degree.

I think you should commit with a 10-20 sec wait, and then if you (or someone 
else) can come up with a faster way of testing it properly, it is fine with me 
to make the change. I do not believe I will be able to come up with a proper 
test of this that is faster. But protect the feature the slow way until 
someone comes up with a faster way of testing.
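
The shape of such a negative check can be sketched like this. It is not the code of checkCollectionIsNotCreated; the class and method names are hypothetical and the timings are arbitrary. The sketch shows why the passing case always costs the full wait: the condition has to be given the whole window in which to become true.

```java
import java.util.function.BooleanSupplier;

public class NegativeWaitSketch {

    /**
     * Polls the condition until the wait window has elapsed.
     * Returns true only if the condition was never observed to be true.
     */
    static boolean staysFalseFor(BooleanSupplier condition, long waitMillis, long pollMillis) {
        long deadline = System.nanoTime() + waitMillis * 1_000_000L;
        while (deadline - System.nanoTime() > 0) {
            if (condition.getAsBoolean()) {
                return false; // the unwanted event happened, so stop early
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        // One last look after the window has closed.
        return !condition.getAsBoolean();
    }

    public static void main(String[] args) {
        // Stand-in for "the collection must not have been created".
        boolean ok = staysFalseFor(() -> false, 200, 20);
        System.out.println("condition never became true: " + ok);
    }
}
```

A failing run returns as soon as the unwanted event is seen, but a passing run cannot finish before the window closes, so the wait length is a direct trade-off between test run time and confidence.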




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510450#comment-13510450
 ] 

Robert Muir commented on SOLR-4114:
---

{quote}
If we go do a more unit-test-ish test directly on OverseerCollectionProcessor 
we might be able to do something faster, but it will not ensure the correct 
system-level functionality to nearly the same degree.
{quote}

But these tests are really necessary. 

I think Solr already has enough integration tests and not enough unit tests. 
This makes it difficult to change Solr code, because there is no direct 
correlation between the functionality and the tests.

Integration-level tests aren't nearly as useful, and much more difficult to 
debug.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510471#comment-13510471
 ] 

Per Steffensen commented on SOLR-4114:
--

I guess there are two movements in the industry at the moment:
* TDD: Mostly focused on testing units (single components), hoping that if 
all units work as they are supposed to, the entire system, where all those 
units are used in combination, will also work as it is supposed to.
* BDD: Mostly focused on testing the behaviour of a system as seen from the 
outside, because that is basically all you care about. No one cares how the 
system works internally. Everyone cares about how the system works as a 
whole, when used to the extent you can use it from the outside. This is 
especially true for the end users of the system, and at the end of the day a 
system is created to fulfil the users' needs.

bq. Integration-level tests aren't nearly as useful

I completely disagree with that.

I, personally, am not that much into unit tests, because they basically do 
not show that the system as a whole behaves as it is supposed to. 
Behavioural/integration tests do: you test against an entire system, using 
it the way it can be used from the outside, and assert that the result is as 
expected as seen from the outside.

bq. and much more difficult to debug

You are right about that. I like unit tests as a supplement to 
behavioural/integration tests whenever it is found to be necessary. We can 
add an OverseerCollectionProcessor unit test covering some of this new 
functionality, but in my mind (as I said) it will not ensure the correct 
system-level functionality to nearly the same degree, and basically that's 
all we are interested in.

If the community would like, we can supplement with unit tests for this new 
feature, but you will have to fight me (FWIW) to get rid of the 
integration-ish test of the feature.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510478#comment-13510478
 ] 

Robert Muir commented on SOLR-4114:
---

I don't even think we need to argue about it really. Currently Solr has a lot 
of integration tests, how is that working out?!

Look at Solr's queryparser tests. They index documents and run actual queries, 
instead of simply doing assertEquals(expectedQuery, actual).

This provides no benefit, and only makes it harder to debug. Instead of seeing 
what is different about the query structure, you typically get "xpath failed" 
and have to start digging.
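
As a toy illustration of asserting on the parsed query itself: the parser below is hypothetical and unrelated to Solr's real query parsers. It only assumes that a parsed query can be compared through a structural string form.

```java
import java.util.Locale;

public class QueryStructureSketch {

    // Hypothetical toy parser: every whitespace-separated term becomes a
    // required, lower-cased clause on the given field.
    static String parse(String field, String input) {
        StringBuilder sb = new StringBuilder();
        for (String term : input.trim().split("\\s+")) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(field).append(':').append(term.toLowerCase(Locale.ROOT));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String expected = "+text:solr +text:cloud";
        String actual = parse("text", "Solr Cloud");
        // A mismatch here would show both query structures side by side.
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
        System.out.println(actual);
    }
}
```

When such a comparison fails, the message contains the expected and the actual structure, which is the debugging information the comment above says is lost when a test only checks search results.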

This is just the simplest possible Captain Obvious example. The problem runs 
much deeper.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510487#comment-13510487
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. I don't even think we need to argue about it really.

Well, if that is your opinion you will have a problem working on big 
projects. You should find a private project to work on in your basement. I 
think it is worth arguing about. Maybe not on this issue/ticket, but in 
general in the community on the dev mailing list. I am surrounded by 
professional testers in my everyday work, and what I hear from them is 
pulling more and more towards behavioural testing.

bq. Currently solr has a lot of integration tests, how is that working out?!

I don't know how it works out, but if you see a lot of problems, I wouldn't 
blame the integration-test-over-unit-test strategy (if such a strategy 
exists). I have a gut feeling that the main problem of Solr is that no one 
ever dares to do refactoring - the code is a mess - and that you really do 
not trust your test suite.

If you want to be able to trust your test suite enough to dare doing big 
refactorings, integration/behavioural tests are by far the best. Typically, 
when you do a major refactoring you do not change the system behaviour seen 
from the outside. You basically reorganize the code internally in order to be 
able to keep adding features, fixing bugs etc. without getting too confused. 
Therefore integration/behavioural tests do not have to be changed during a 
big refactor. Unit tests usually do, because a big refactor often includes 
getting rid of existing units, splitting up existing units, adding new units 
etc.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510490#comment-13510490
 ] 

Robert Muir commented on SOLR-4114:
---

It's not working out, and it's exactly due to this attitude (which must be 
changed).

(release manager who had to release Solr 4.0, where its own test suite didn't 
pass because no one really gives a shit about this stuff)





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13510492#comment-13510492
 ] 

Robert Muir commented on SOLR-4114:
---

Per: you contradict yourself. You say only what is seen from the outside 
matters, yet you discourage the unit tests that make it easier to develop new 
APIs, and the unit tests that test actual usage of the API. This is critical 
to refactoring.

It's a bonus that you have to change unit tests when you refactor APIs. It's 
like having little example apps that use your APIs to test whether the API 
change has some quirks and so on. You are forced to actually use it.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch, SOLR-4114.patch, 
 SOLR-4114.patch, SOLR-4114_trunk.patch


 We should support running multiple shards from one collection on the same 
 Solr server - e.g. run a collection with 8 shards on a 4 Solr server cluster 
 (each Solr server running 2 shards).
 Performance tests on our side have shown that this is a good idea, and it is 
 also a good idea for easy elasticity later on - it is much easier to move an 
 entire existing shard from one Solr server to another one that just joined 
 the cluster than it is to split an existing shard among the Solr server that 
 used to run it and the new Solr server.
 See the dev mailing list discussion "Multiple shards for one collection on the 
 same Solr server".



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510497#comment-13510497
 ] 

Robert Muir commented on SOLR-4114:
---

I'm also against more tests with sleeps. You can expect to see me vote on 
commits that have huge sleeps.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510504#comment-13510504
 ] 

Jack Krupansky commented on SOLR-4114:
--

Definition of integration testing: A process where you spend 75% to 95% of your 
time (and the time of people tracking down test failures) GUESSING what numbers 
to use for sleeps.

I am a fan of integration testing, but it should be used as an adjunct, not a 
replacement for hard-core unit testing.

Solr is screaming out for a mocking capability so that more integration 
testing can be done at the unit test level. Mocking can also improve testing by 
varying the characteristics of dependencies in a controlled manner rather than 
have integration tests that test only a narrow range of characteristics that 
vary between environments and over time in an unpredictable and non-repeatable 
manner. I mean, it would be nice if we could develop components that were less 
sensitive to changes in the performance of components that they depend on.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510508#comment-13510508
 ] 

Robert Muir commented on SOLR-4114:
---

I agree with Jack about the mocking; it's really needed.

I feel like with SolrCloud it could help a lot, e.g. simulate this guy going 
down and that guy, and not deal with timing issues and so on.
Lucene doesn't actually write continuously to your disk until it's really full 
in order to run TestIndexWriterOnDiskFull; it mocks the disk-full condition.
Sure, these mock techniques aren't perfect, but they are much easier to debug.
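
To illustrate the disk-full idea above, here is a minimal Java sketch. It is 
not Lucene's actual test code: LimitedOutputStream and writeUntilFull are 
hypothetical names made up for this example, and the byte budget merely stands 
in for real disk capacity. The only point is that the failure is triggered 
deterministically, with no real disk and no timing involved.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class DiskFullSketch {
    /** Hypothetical wrapper: fails with IOException once a byte budget is spent. */
    static final class LimitedOutputStream extends OutputStream {
        private final OutputStream delegate;
        private long remaining;

        LimitedOutputStream(OutputStream delegate, long maxBytes) {
            this.delegate = delegate;
            this.remaining = maxBytes;
        }

        @Override
        public void write(int b) throws IOException {
            if (remaining <= 0) {
                throw new IOException("simulated disk full");
            }
            remaining--;
            delegate.write(b);
        }
    }

    /** Writes data through the wrapper and returns how many bytes reached the sink. */
    static int writeUntilFull(byte[] data, long maxBytes) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = new LimitedOutputStream(sink, maxBytes)) {
            out.write(data); // the inherited bulk write calls write(int) per byte
        } catch (IOException expected) {
            // The simulated failure happens at a known point, so a test can
            // assert on the state left behind without guessing at timings.
        }
        return sink.size();
    }
}
```

A test built this way can assert on exactly what was written before the 
failure, and it fails or passes the same way on every machine.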

For real integration tests maybe JUnit isn't even the best tool for the job 
anyway, so I would prefer if these were separate from the unit tests.

These integration tests are especially frustrating for Lucene developers, who 
*seriously* don't want to break Solr when they change things underneath it. But 
the test suite doesn't really allow this, you know, and when something does 
break it's hard to tell if it's just an unrelated sporadic fail, because the 
test suite is unreliable in general.

There is just no replacement for good unit tests.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510515#comment-13510515
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. Its not working out: and its exactly due to this attitude (which must be 
changed)

It's also my impression that it is not working out; we just do not share an 
opinion about the reason.

bq. you contradict yourself. You say only what is seen from the outside 
matters, yet discourage the unit tests that make it easier to develop new APIs, 
and the unit tests that test actual usage of the API. This is critical to 
refactoring.

No, I do not contradict myself - you misunderstand me. If an API can be used 
from the outside it certainly needs test coverage - whether you see it as unit 
or behavioural tests. The collection-creation/reload/removal features of 
OverseerCollectionProcessor (the one I question here) are not accessible from 
the outside. The Collection API is; it is triggered by sending a request that 
gets handled by CollectionHandler etc., eventually submitting a job for the 
OverseerCollectionProcessor. This should (also) be tested by sending such a 
request from the outside to a complete Solr cluster, and asserting that the 
expected collection/shards end up being created/reloaded/removed, and in the 
case of create that you can index data into the new collection and search the 
data up again, and in the case of remove that the collections/shards disappear 
from the system (ZK, and that data-dirs are deleted if that is what you asked 
for) etc.

If you want to supplement with tests directly on OverseerCollectionProcessor, 
that is fine. But such tests are mainly useful during the development process, 
and not to ensure that no one in the future breaks the feature you introduced. 
The feature seen from the outside is typically unchanged during refactoring, 
and the feature seen from the outside is what matters. Say we some day 
decide that collection creation shouldn't really be handled asynchronously by 
the Overseer, but that we want to handle it synchronously before responding to 
the one that sent the collection creation request. In that case 
OverseerCollectionProcessor will probably be deleted (yes, most of the code 
will still remain, but will probably be moved/restructured to other 
classes/units somewhere else), and there will be tests in an 
OverseerCollectionProcessorTest that need to be moved and changed, and it is 
not certain that the one doing the refactoring gets the reason for (or point 
of) all the tests of OverseerCollectionProcessor, or that he is able to tweak 
them into similar tests of the new components handling the same aspects. The 
behavioural test does not need to change, as it ensures that the feature seen 
from the outside did not change, and that is very important when refactoring.

As I said, I am certainly not against unit-tests, but they are mainly a 
working-tool for the developers. Behavioural tests are the ones that ensure 
that your system as a whole works as it is supposed to - and whether you want 
it or not, it IS what creating a system is all about.
Guess you would realize a thing or two by working on a project for a real 
customer setting up real demands. Those demands are all requirements on the 
system as a whole seen from the outside. He doesn't care about the internal 
workings of the system. Our testers work a lot with the customer (or the PO 
representing him) to work out behavioural tests to make sure we fulfill his 
needs and requirements. We do lots of unit-tests in our project also, but have 
NO problem whatsoever refactoring all the time. So I can tell you that 
behavioural tests and daring to do constant refactoring can go hand in hand. It 
is a little harder when you have only unit-tests.

Start thinking about Solr as a product you need to deliver to some artificial 
customer, and try to think the way he would think - only system-level behaviour 
matters to him. Unit-tests are working-tools for the developers.

bq. I'm also against more tests with sleeps

As I said, me too. But I would rather test a feature the way it can be tested 
than not test it.

bq. you can expect to see me vote on commits that have huge sleeps

Uhh, I hope not, you just said you were against sleeps. But I guess you mean 
you cannot... :-)


[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510522#comment-13510522
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. but it should be used as an adjunct, not a replacement for hard-core unit 
testing

Guess I kinda agree, just don't overdo the hard-core unit testing, and 
certainly don't forget to do the behavioural/integration tests.

bq. Solr is screaming out for a mocking capability so that more integration 
testing can be done at the unit test level

Uhhh yes. Mocking is a nice way of having your integration tests not include 
the whole world. E.g. a test that includes the entire request handling, seen 
from the outside down to the OverseerCollectionProcessor, but where we mock the 
calls to the Core Admin API and just assert that the requests forwarded to the 
Core Admin API are as expected. Mocking capability in Solr will take tests to 
the next level, but why not just start using it? E.g. Mockito is ready to use.
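
A rough Java sketch of that kind of test, under loud assumptions: 
CoreAdminClient, RecordingClient and createCollection are hypothetical names 
invented for this example and are not Solr classes, and the round-robin shard 
placement is only a stand-in for whatever allocation the real processor does. 
A hand-written recording fake is used here to keep the example free of 
dependencies; a mocking library such as Mockito could play the same role.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MockCoreAdminSketch {
    /** Hypothetical seam standing in for calls to the Core Admin API. */
    interface CoreAdminClient {
        void createCore(String node, String collection, String shard);
    }

    /** Hand-rolled fake: records each request instead of sending it anywhere. */
    static final class RecordingClient implements CoreAdminClient {
        final List<String> requests = new ArrayList<>();

        @Override
        public void createCore(String node, String collection, String shard) {
            requests.add(node + ":" + collection + "_" + shard);
        }
    }

    /** Hypothetical processor logic (assumption): place shards on nodes round-robin. */
    static void createCollection(CoreAdminClient client, String collection,
                                 int numShards, List<String> nodes) {
        for (int i = 0; i < numShards; i++) {
            String node = nodes.get(i % nodes.size());
            client.createCore(node, collection, "shard" + (i + 1));
        }
    }

    /** Creates 4 shards on 2 nodes (2 shards per node) and returns the recorded requests. */
    static List<String> demo() {
        RecordingClient client = new RecordingClient();
        createCollection(client, "c1", 4, Arrays.asList("n1", "n2"));
        return client.requests;
    }
}
```

The test then only asserts on the recorded requests, so no cluster has to be 
started and no sleep is needed.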

bq. it's hard to tell if it's just an unrelated sporadic fail, because the test 
suite is unreliable in general

Yes, my impression is also that the tests are unreliable - they sometimes fail 
and sometimes not, and it is really hard to know if you did something wrong. 
But I will still claim that it is not because of integration tests - it might 
be because they, as so many other things in Solr, are done in an undisciplined 
way.

Well, everyone has their experience and beliefs, and I am always glad to 
discuss with you guys. Glad that you decided to really join, Robert, even 
though you thought that there was nothing to discuss. But I guess I have stated 
my opinion now, and I don't want to do any more debating here on this 
issue/ticket. If you want to continue on the mailing list I will probably join, 
but this issue/ticket should be about this issue/ticket from now on. We still 
have unhandled matters (e.g. controlling instance-dir and/or data-dir), and I 
don't want the discussion about those matters to drown in partly unrelated 
discussions about test strategies.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510526#comment-13510526
 ] 

Robert Muir commented on SOLR-4114:
---

Per I don't want to drown out your issue either. But when I see arguments to 
add an e.g. 60 second test like this, I felt the need to speak up. Solr has a 
lot of these tests today (many fail on a daily basis), but I'm not sure how 
much use they really are.

If tests are failing constantly and nobody is fixing them: then there is a 
problem :)

Look: I'm totally fine with such tests being annotated @Nightly and running on 
Jenkins at night (as long as they are reliable and debuggable). But slowness 
itself presents another barrier for someone else trying to debug the test.

So I think it's important to have quicker, simpler unit tests to encourage 
refactoring and good APIs. Solr is really missing this.

And of course, for the record, I agree with you about the whole issue of 
refactoring in general. I feel like refactoring in Solr is not really 
encouraged, because there is no faith in the test suite. So it's safer to just 
stick with the status quo instead. This is how projects die.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510534#comment-13510534
 ] 

Per Steffensen commented on SOLR-4114:
--

Agree with everything you just said!

bq. But when I see arguments to add an e.g. 60 second test like this, I felt 
the need to speak up

I understand, and it is not that I like this kind of test! I just want the 
feature tested and protected from someone else ruining it tomorrow. And I can't 
come up with another way of testing that nothing happens than to wait for a 
while and assert that it did not. And IMHO unit-tests directly on 
OverseerCollectionProcessor are not enough to ensure that the functionality 
seen as a whole from the outside will not be broken - and that IS the main 
concern (if we had a real customer :-) )
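
For reference, a minimal Java sketch of the wait-and-assert pattern described 
above, written as a bounded poll rather than one fixed sleep. The helper name 
waitFor and its parameters are assumptions made for this example, not an 
existing Solr or Lucene test utility. A check that something does happen 
returns as soon as the condition holds; a check that nothing happens still has 
to wait out the full window, but the window is explicit and sits in one place.

```java
import java.util.function.BooleanSupplier;

class WaitSketch {
    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition was observed, false if the window passed
     * (or the thread was interrupted) without it ever holding.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

Asserting that a collection was not created would then read as a check that 
waitFor returned false for the chosen window.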



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510546#comment-13510546
 ] 

Robert Muir commented on SOLR-4114:
---

But I feel like there is a compromise: 

Add the unit tests for good checks during normal test runs.
Also add the slower test but mark it @Nightly. So it runs at night in 
Jenkins test runs.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510549#comment-13510549
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. Add the unit tests for good checks during normal test runs

OK. I like a good discussion, so that will be your prize for participating so 
eagerly :-) Will try to find time to do it soon.

bq. Also add the slower test but mark it @Nightly. So it runs at night in 
Jenkins test runs.

So we agree to commit the test with a 10-20 sec wait?

Yeah I would rather not add the @Nightly, since I do not really know your 
test-target setup (nightly, pre-commit, etc). I certainly agree that such a 
slow test needs to be run at night, but actually I guessed that the @Slow 
annotation was for that. The test is already marked as @Slow. I certainly can 
add the @Nightly annotation if people agree that it should be added to 
BasicDistributedZkTest?



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510556#comment-13510556
 ] 

Robert Muir commented on SOLR-4114:
---

Unfortunately, @Slow is enabled by default.

I think if a test takes double-digit seconds then it should be @Nightly. 
I looked and it's strange that no Solr tests are ever marked this way; let's 
start here.

We do it with lots of Lucene tests, keeping the regular runs fast.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510562#comment-13510562
 ] 

Mark Miller commented on SOLR-4114:
---

bq. I think if a test takes double-digit seconds then it should be @Nightly.

-1 to running all these solrcloud tests only as nightly.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510590#comment-13510590
 ] 

Robert Muir commented on SOLR-4114:
---

thats ok if you want to veto progress mark. I really think we can be reasonable 
with solr tests and do:

* reasonably good unit tests that are simpler and easy to debug
* nightly-annotated slow tests that are more like the solr tests of today.

This would probably remove a lot of the frustration lucene developers have with 
solr tests.

In all cases, its fine if people want to veto the \@Nightly annotations of 
existing tests.

But these will be the very last slow solr tests we have in our build, because I 
already mentioned I'm going to raise issues about new tests added with huge 
sleeps, vetoing those commits before they can be added at all, so the test 
situation does not continue to get even worse.

I'm trying to take a stand against the always-failing/super-slow test situation 
of today. It's been like this for far too long.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510607#comment-13510607
 ] 

Mark Miller commented on SOLR-4114:
---

A veto requires a formal announcement accompanied with a valid technical 
argument. I know you are used to throwing them around rather casually, but 
generally, -1, +1 are used to indicate votes of direction at Apache.

This test in particular is actually many tests - compiled together to save the 
time of setting up and tearing down lots of jetties.

If you were to @Nightly it in the name of so-called progress, I would break 
it up into many tests, each under a minute, and increase the total test run 
time. I'd prefer it was all faster this way, but I'm okay with that way too.

@Slow was introduced the last time this discussion came up and if you don't 
want to run these tests, I suggest you take advantage of it. 




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510610#comment-13510610
 ] 

Mark Miller commented on SOLR-4114:
---

bq.  vetoing those commits before they can be added at all,

Vetoes are not made for that and IMO would not apply, but okay...




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-05 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510691#comment-13510691
 ] 

Mark Miller commented on SOLR-4114:
---

SOLR-4043 (collection API responses) should make the missing test much easier, 
it seems to me. I'm personally okay with adding something like a 10 second wait 
until that gets in.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509687#comment-13509687
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. We have to do something else other than a one minute sleep in the test.

In checkForCollection we are willing to wait up to 120 secs (used to be 60 
secs) for the collection to be created (correctly). So it is kinda obvious to 
also wait at least 60 secs to verify that nothing related to a collection was 
actually created in cases where you want to be sure that nothing is created. In 
the case of waiting for a collection being created correctly you can skip the 
waiting as soon as you see that it has been created correctly. In the case 
where you want to see that nothing gets created you have no chance to do that. 
I think the 60 sec wait fits well into the way the test already works.

bq. I'd like the default data dir to remain /data. If you are not doing 
anything special that works fine.

That's also fine, but as soon as you create more than one shard on the same 
node, don't you need different data-dirs? You can't have two shards using the 
same folder for lucene-index-files, right? Did I get the instance-dir and 
data-dir concepts wrong? Or is it the instance-dir you want to vary between 
different shards? I would say that running only one shard on a node IS doing 
something special.

bq. I'd also like the ability to override the data dir on per shard basis, but 
I'm not sure what the best API for that is.

Yes, me too, but that has to be another ticket/issue. For now, until the API 
allows control of the data-dir naming, it should be ok for the algorithm to 
decide on something reasonable by itself.

bq. So I'd like to support what you want, but I don't want to change the 
default behavior.

I agree, but if I got the concepts correct you cannot use /data as data-dir 
for more than one shard on each node. So default behaviour about using /data 
as data-dir will not work as soon as you run more than one shard on a node. I 
probably got something wrong - please try to explain.

bq. My latest patch - I'll commit this soon and we can iterate from there.

Well I would prefer you committed my patch, and we can iterate from there :-) 
It will also make it much easier to get SOLR-4120 in, which I hope you will 
also consider.

Had a quick peek at your patch and have the following comments
* I see that you removed the auto-reduce replica-per-shard feature (never have 
more than one replica of the same shard on the same node) and just issue a 
warning instead in OverseerCollectionProcessor (the if (numShardsPerSlice > 
nodeList.size())-thingy). It is ok for me, even though I believe it is 
pointless to replicate data to the same machine and under the same Solr 
instance. But then you probably need to change the BasicDistributedZkTest also - 
in checkCollectionExpectations I believe you'll need to change from
{code}
int expectedShardsPerSlice = (Math.min(clusterState.getLiveNodes().size(), 
numShardsNumReplicaList.get(1) + 1));
{code}
to
{code}
int expectedShardsPerSlice = numShardsNumReplicaList.get(1) + 1;
{code}
* You removed the following lines (because you just want default-values for 
instance-dir and data-dir)
{code}
params.set(CoreAdminParams.INSTANCE_DIR, ".");
params.set(CoreAdminParams.DATA_DIR, shardName + "_data");
{code}
I still do not understand how that will work, but hopefully you will explain
* You didn't like my rename of variable "replica" to 
"nodeUrlWithoutProtocolPart" in OverseerCollectionProcessor.createCollection. 
As I said on the mailing list I don't like the term "replica" as a replacement 
for what we used to call "shards", because I think it will cause 
misunderstandings, as "replica" is also (in my mind) a role played at runtime. 
But getting the terms right and reflecting them correctly in API, 
variable-names etc. across the code must be another issue/ticket. But here in 
this specific example "replica" is just a very bad name, because the variable 
does not even contain a replica-url, which would require the 
shard/replica-name to be postfixed to the string. So this "replica" variable 
is closest to being a node-url (without the protocol part) - NOT a 
shard/replica-url. I would accept my name-change if I were you, but I have a 
motto of carefully choosing my fights and this is a fight I will not fight for 
very long :-)
* I see that you did not include my changes to HttpShardHandler, making 
shardToUrls-map (renamed) concurrency protected through getURLs-method 
(renamed to getFullURLs), so that you do not have to use the map so carefully 
outside. I understand that it has very little to do with this issue SOLR-4114, 
but it is a nice modification. Please consider committing it - maybe related to 
another issue/ticket. It is a little bit of a problem that good refactoring does 
not easily get in as part of issues/tickets not requiring the refactor. If 

[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509861#comment-13509861
 ] 

Mark Miller commented on SOLR-4114:
---

Yeah, I minimized the patch down to just dealing with this issue. I'm away from 
home and just looking to get this issue completed with minimum fuss. 
'nodeUrlWithoutProtocolPart' is just so long and I didn't see it helping much 
in terms of code readability - if you have a better fitting name that is a 
little less verbose, I think I'd be more into it.

bq. I see that you removed the auto-reduce replica-per-shard t

Yeah, I don't think I agree with changing the user's params on him - I'd rather 
warn and let the user do what he wants to do rather than trying to outthink him 
ahead of time. If he decides he wants more than one replica on an instance for 
some reason, that's his deal - we can warn him though.
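
A minimal sketch of the "warn, but do not change the user's params" approach described above. It is not the OverseerCollectionProcessor code: the class name, method name, core-naming scheme and round-robin placement are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not Solr's actual implementation.
final class ShardPlacementSketch {

  /** Returns node name -> core names placed on that node, assigned round-robin. */
  static Map<String, List<String>> place(String collection, int numShards,
                                         int replicasPerShard, List<String> nodes) {
    if (nodes.isEmpty()) {
      throw new IllegalArgumentException("no live nodes");
    }
    if (replicasPerShard > nodes.size()) {
      // Warn only: the requested value is kept, not silently reduced.
      System.err.println("WARN: " + replicasPerShard + " replicas per shard requested but only "
          + nodes.size() + " nodes; some nodes will host several replicas of one shard");
    }
    Map<String, List<String>> placement = new LinkedHashMap<>();
    for (String node : nodes) {
      placement.put(node, new ArrayList<>());
    }
    int next = 0;
    for (int shard = 1; shard <= numShards; shard++) {
      for (int replica = 1; replica <= replicasPerShard; replica++) {
        // Walk the node list in order, wrapping around when it is exhausted.
        String node = nodes.get(next++ % nodes.size());
        placement.get(node).add(collection + "_shard" + shard + "_replica" + replica);
      }
    }
    return placement;
  }
}
```

With 8 shards, one replica each and 4 nodes, every node ends up with 2 cores, which is the scenario from the issue description.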

bq. You removed the following lines (because you just want default-values for 
instance-dir and data-dir)

Right - it should match collection1 - e.g. newcollection/data should be the data 
dir just like collection1/data, rather than something like 
newcollection_data. In my experience and testing, data ends up in the core's own 
instance dir - not some common dir.
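
To illustrate why a default data dir does not have to collide when several cores live on one node, here is a small sketch. It assumes that each core gets its own instance dir named after the core and that a relative data dir is resolved against that instance dir; the class, method and core names are hypothetical, not Solr API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of instance-dir / data-dir resolution; not Solr code.
final class DataDirSketch {

  /**
   * Resolves a core's data dir. A null dataDir falls back to "data", and a
   * relative dataDir is taken relative to the core's own instance dir.
   */
  static Path resolveDataDir(Path solrHome, String coreName, String dataDir) {
    Path instanceDir = solrHome.resolve(coreName);
    return instanceDir.resolve(dataDir == null ? "data" : dataDir).normalize();
  }

  public static void main(String[] args) {
    Path home = Paths.get("solr");
    // Two cores of the same collection on one node: same default, different directories.
    System.out.println(resolveDataDir(home, "newcollection_shard1_replica1", null));
    System.out.println(resolveDataDir(home, "newcollection_shard2_replica1", null));
  }
}
```

Under these assumptions the collision only appears if all cores are given the same instance dir, as with an instance dir of "." for every core.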

bq. making shardToUrls-map (renamed) concurrency protected 

Yup - it seemed unrelated and I'm busy so I didn't want to think about it. My 
goal is to get the essence of this thing committed - it's a lot easier to then 
fight over smaller pieces on top of that. Progress not perfection. 




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509864#comment-13509864
 ] 

Mark Miller commented on SOLR-4114:
---

bq. In checkForCollection we are willing to wait up to 120 secs (used to be 60 
secs) for the collection to be created (correctly). So it is kinda obvious to 
also wait at least 60 secs to verify that nothing related to a collection was 
actually created in cases where you want to be sure that nothing is created. 

We poll so that we wait up to 120 seconds for a slow comp, but a fast comp 
won't need to wait nearly that long. The 60 second wait hits everyone no matter what. 
We generally try to avoid any hard waits like that. I understand you can't really 
poll in this case, so I'm not sure the best way to test that - I made it a TODO 
for now.
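
The difference between the two kinds of wait can be sketched as follows. This is a generic polling helper, not the test's checkForCollection method; the names waitFor and staysFalseFor and their parameters are assumptions for illustration.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper; names and signatures are illustrative only.
final class PollingSketch {

  /**
   * Polls until the condition holds or the timeout elapses.
   * Returns early on a fast machine, so the timeout is only an upper bound.
   */
  static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", e);
      }
    }
  }

  /**
   * Negative check: true if the condition never became true within the window.
   * A passing run always costs the full window, which is the drawback discussed above.
   */
  static boolean staysFalseFor(BooleanSupplier condition, long windowMs, long pollMs) {
    return !waitFor(condition, windowMs, pollMs);
  }
}
```

A positive check such as waitFor(collectionExists, 120_000, 500) usually returns in well under the limit, while staysFalseFor(collectionExists, 60_000, 500) can only fail early, never pass early (collectionExists being a hypothetical supplier).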




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509882#comment-13509882
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[trunk commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revisionrevision=1417045

SOLR-4114: Allow creating more than one shard per instance with the Collection 
API.






[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Commit Tag Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509910#comment-13509910
 ] 

Commit Tag Bot commented on SOLR-4114:
--

[branch_4x commit] Mark Robert Miller
http://svn.apache.org/viewvc?view=revisionrevision=1417070

SOLR-4114: Allow creating more than one shard per instance with the Collection 
API.






[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509960#comment-13509960
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. We poll so that we wait up to 120 seconds for a slow comp, but a fast comp 
won't need to wait nearly that long. The 60 second wait hits everyone no matter what. 
We generally try to avoid any hard waits like that. I understand you can't really 
poll in this case, so I'm not sure the best way to test that - I made it a TODO 
for now.

Ok with a TODO, but it should be about making a more clever solution than the 
60 sec wait. You commented out the assert-method checkCollectionIsNotCreated, 
which means that we now have an unprotected feature in the code-base. Anyone 
can go ruin this feature tomorrow and no one will notice. Yes, I believe the 
main value of tests is their ability to protect other people from accidentally 
ruining existing functionality. All the comments below are very unimportant 
compared to this - I am really serious about this. Get 
checkCollectionIsNotCreated running now so that the code is protected, then 
think about a more clever solution later (if you think such a thing exists). 
TODO's have a tendency to be forgotten.

bq. Yeah, I minimized the patch down to just dealing with this issue. I'm away 
from home and just looking to get this issue completed with minimum fuss.

Then the easiest thing would probably have been to take everything in, except 
for things you thought would actually ruin existing stuff, instead of using 
time to find every little detail that could be left out. Do not misunderstand 
me, I am glad you used your time to get it committed, but I also want to 
influence the way you committers work, whenever I have the chance. Only 
thinking about our common interest - a good Solr. I have a bad gut feeling 
that the code-base is so messy because no one ever refactors. Refactoring is 
something you usually do while solving some specific (potentially) unrelated 
task. No one goes refactoring just to do refactoring, but it is extremely 
important that refactoring has an easy way into the code-base.

bq. 'nodeUrlWithoutProtocolPart' is just so long and I didn't see it helping 
much in terms of code readability - if you have a better fitting name that is a 
little less verbose, I think I'd be more into it.

Well, first of all a long, descriptive name is much better than a short misleading 
name, and second of all that name really isn't very long :-)

bq. Yeah, I don't think I agree with changing the user's params on him - I'd 
rather warn and let the user do what he wants to do rather than trying to 
outthink him ahead of time. If he decides he wants more than one replica on an 
instance for some reason, that's his deal - we can warn him though.

Ok, cool

bq. Right - it should match collection1 - e.g. newcollection/data should be the 
data dir just like collection1/data, rather than something like 
newcollection_data. In my experience and testing, data ends up in the core's own 
instance dir - not some common dir.

Didn't learn much from this explanation, but I will have to do a little studying 
on instance-dir and data-dir to understand how your solution will ever work. I 
will get back with an official complaint (just a comment here or a mail on the 
mailing list :-) ) if I still do not think it will work after I have done my 
studying.

bq. Yup - it seemed unrelated and I'm busy so I didn't want to think about it...

Still easier to just take it in, unless you saw it harming more than it helped. 
I am worried about refactoring in this project! Trust your test-suite :-)



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509967#comment-13509967
 ] 

Mark Miller commented on SOLR-4114:
---

bq. Still easier to just take it in

For me it's easier to not take it in :) I have to vet what I take in. I think 
you will find it easier to get stuff in if your refactoring is related to the 
issue. Otherwise make a refactoring issue.

bq. Trust your test-suite 

It's not my test-suite, it's a huge shared test-suite and I don't blindly trust 
it. It certainly doesn't test everything.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510020#comment-13510020
 ] 

Per Steffensen commented on SOLR-4114:
--

Ok, will you please consider enabling the 60 sec test (maybe reduce it to 10 or 
30 sec) so that the feature is protected until someone comes up with a 
better/faster test. Please!!!




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13510038#comment-13510038
 ] 

Mark Miller commented on SOLR-4114:
---

I'm happy to commit a test for this, but let's come up with something that 
doesn't have a long sleep like this. 
I suppose a 10 second sleep is more agreeable, but these things add up, and 
I'd rather come up with a better test. 
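
One common alternative to a fixed sleep is to poll for the expected condition and give up after a deadline. The following is a minimal sketch of that idea, not the test that was eventually committed; the class WaitFor, the method waitFor, and its parameters are hypothetical names, and the sketch assumes a Java 8+ runtime for BooleanSupplier.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical test helper: polls a condition instead of sleeping a fixed time.
final class WaitFor {
    private WaitFor() {}

    /**
     * Returns true as soon as the condition holds, or false if it still does
     * not hold once timeoutMillis has elapsed (or the thread is interrupted).
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

A test could then assert on something like WaitFor.waitFor(() -> allShardsActive(), 60000, 250), where allShardsActive() stands for a hypothetical check against the cluster state. The test returns as soon as the condition holds and only waits the full timeout in the failure case.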





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-03 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509328#comment-13509328
 ] 

Mark Miller commented on SOLR-4114:
---

bq. consider backporting to 4.x

Currently, all of 'my' work is targeted towards 4.x - with the caveat that some 
trickier stuff might bake in 5x before being backported.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-03 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13509330#comment-13509330
 ] 

Mark Miller commented on SOLR-4114:
---

Thanks for the updated patch.
Thanks for the corrected patch!

Two things I'd like to address before committing:

1. 
+// nocommit
+Thread.sleep(60000);
We have to do something other than a one-minute sleep in the test.

2. Setting the shard as part of the data dir.

I'd like the default data dir to remain /data. If you are not doing anything 
special that works fine.
I'd also like the ability to override the data dir on a per-shard basis, but 
I'm not sure what the best API for that is.

So I'd like to support what you want, but I don't want to change the default 
behavior.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-02 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13508287#comment-13508287
 ] 

Per Steffensen commented on SOLR-4114:
--

Where does your patch fit, Mark?





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-02 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13508288#comment-13508288
 ] 

Mark Miller commented on SOLR-4114:
---

Should be against 5x - I'm going to the US west coast for a week, so I'm not 
sure when I'll get back to this. I may try to get it going while I'm out there, 
or I may not have time until I get back.






[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-02 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13508382#comment-13508382
 ] 

Per Steffensen commented on SOLR-4114:
--

Hope you will commit, and consider backporting to 4.x, since we expect to 
upgrade to 4.1 when it is released, and we would really like this feature to be 
included.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-12-01 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507932#comment-13507932
 ] 

Per Steffensen commented on SOLR-4114:
--

Thanks, Mark. Let me know if I can help in any way. It is not that big a patch, 
but I did take the opportunity to do some refactoring - you know the bell in my 
head preventing me from just doing ctrl-c/ctrl-v :-) This makes the patch a 
little bigger. Let me know if you want me to make a patch fitting on top of 4.x 
or 5.x.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13507759#comment-13507759
 ] 

Mark Miller commented on SOLR-4114:
---

I started working on patching this into recent stuff, and it's more of a pain 
than I thought. I must have missed some piece as I tried to merge it up and the 
test is failing. Giving up for tonight.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505295#comment-13505295
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. As far as terminology, when I say replicationFactor of 3, I mean 3 copies 
of the data. I also count the leader as a replica of a shard (which is 
logical). It follows from the clusterstate.json, which lists all replicas for 
a shard and one of them just has a flag indicating it's the leader. This also 
makes it easier to talk about a shard having 0 replicas (meaning there is not 
even a leader).

OK, it's just that the replicationFactor you specify in your request is the 
other thing. You get replicationFactor + 1 shards per slice, if we define 
replicationFactor as the one you give in your request.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505296#comment-13505296
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. Solr 3.X to Solr 4.X back compat is not considered the same as Solr 4.0 to 
Solr 4.1 back compat.

Of course, I agree! But anyway...





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505368#comment-13505368
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. As far as terminology, when I say replicationFactor of 3, I mean 3 copies 
of the data. I also count the leader as a replica of a shard (which is 
logical). It follows from the clusterstate.json, which lists all replicas for 
a shard and one of them just has a flag indicating it's the leader. This also 
makes it easier to talk about a shard having 0 replicas (meaning there is not 
even a leader).

I understand that you can view all shards under a slice as replicas, but in 
my mind replica is also a role that a shard plays at runtime - all shards 
except one under a slice play the replica role at runtime, and the remaining 
shard plays the leader role. To avoid creating too much confusion, I suggest 
you use the term shards for all the instances under a slice, and that you use 
the term replica only for a role that a shard plays at runtime.
But that of course would require changes, e.g. to the Slice class, where e.g. 
getReplicas, getReplicasCopy and getReplicasMap would need to be renamed to 
getShardsXXX. It probably shouldn't be done now, but as part of a cross-code 
cleanup of term usage.

Suggested terms:
 * collection: A big logical bucket to fill data into
 * slice: A logical part of a collection. A part of the data going into a 
collection goes into a particular slice. Slices for a particular collection are 
non-overlapping
 * shard: A physical instance of a slice. Running without replicas, there is 
one shard per slice. Running with replication-factor X, there are X+1 shards 
per slice.
 * replica and leader: Roles played by shards at runtime. As soon as the system 
is not running, there are no replicas/leaders - there are just shards
 * node-base-url: The prefix/base (up to and including the webapp-context) of 
the URL for a specific Solr server
 * node-name: A logical name for the Solr server - the same as node-base-url 
except /'s are replaced by _'s and the protocol part (http(s)://) is removed
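
The node-name rule in the list above can be sketched as a small string transformation. This only illustrates the definition as suggested in this comment and is not taken from Solr's code; the class NodeNames and the method toNodeName are hypothetical names.

```java
// Hypothetical helper illustrating the suggested node-name definition.
final class NodeNames {
    private NodeNames() {}

    /**
     * Derives a node-name from a node-base-url by removing the protocol part
     * (http:// or https://) and replacing every '/' with '_'.
     */
    static String toNodeName(String nodeBaseUrl) {
        String withoutProtocol = nodeBaseUrl.replaceFirst("^https?://", "");
        return withoutProtocol.replace('/', '_');
    }
}
```

Under this rule, a node-base-url of http://host1:8983/solr maps to the node-name host1:8983_solr.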






[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505397#comment-13505397
 ] 

Per Steffensen commented on SOLR-4114:
--

Patch including the maxShardsPerNode feature coming up, along with (much) 
better testing of the create operation of the Collections API.
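
To illustrate what a maxShardsPerNode limit implies for the create operation, here is a minimal sketch of a round-robin assignment that rejects requests exceeding the cluster's capacity. It is not the logic from the patch; ShardAllocator, allocate, and the shard naming scheme are hypothetical. The sketch assumes the node names are distinct, and numShards here means the total number of physical shards to place.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical allocator: spreads shards over nodes round-robin, honouring a per-node cap.
final class ShardAllocator {
    private ShardAllocator() {}

    static Map<String, List<String>> allocate(List<String> nodes, int numShards, int maxShardsPerNode) {
        if (numShards < 0 || maxShardsPerNode < 1) {
            throw new IllegalArgumentException("numShards must be >= 0 and maxShardsPerNode >= 1");
        }
        long capacity = (long) nodes.size() * maxShardsPerNode;
        if (numShards > capacity) {
            throw new IllegalArgumentException(
                numShards + " shards do not fit on " + nodes.size()
                + " nodes with at most " + maxShardsPerNode + " shards per node");
        }
        // Keep the node order stable so the result is easy to inspect.
        Map<String, List<String>> assignment = new LinkedHashMap<>();
        for (String node : nodes) {
            assignment.put(node, new ArrayList<>());
        }
        // Round-robin never puts more than ceil(numShards / nodes) shards on a node,
        // which is within the cap because of the capacity check above.
        for (int i = 0; i < numShards; i++) {
            String node = nodes.get(i % nodes.size());
            assignment.get(node).add("shard" + (i + 1));
        }
        return assignment;
    }
}
```

With four nodes, eight shards, and a limit of two, every node ends up with exactly two shards, which is the scenario from the issue description. Asking for nine shards under the same limit is rejected.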





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505468#comment-13505468
 ] 

Mark Miller commented on SOLR-4114:
---

bq.  fixed in collectionCmd (used for delete and reload) but not in 
createCollection 

This fix belongs with the issue that fixed delete and reload - I'm going to fix 
it there.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505471#comment-13505471
 ] 

Yonik Seeley commented on SOLR-4114:


bq. OK, it's just that the replicationFactor you specify in your request is the 
other thing.

Hmmm, you're right:
Note: replicationFactor defines the maximum number of replicas created in 
addition to the leader from amongst the nodes currently running

That's not consistent with the original definition 
(http://wiki.apache.org/solr/NewSolrCloudDesign), the way the state is 
represented in clusterstate, or the way others use the term, such as in 
HBase/HDFS, Cassandra, Oracle, etc.  The important part is how many times the 
data is stored (the replication factor), and things like leaders are more of an 
implementation detail.

Luckily we don't yet store this in the cluster, so there's no back compat issue 
with existing clusters.  There's only a change when creating a new cluster, but 
that seems relatively minor.  Given that, I'd lean toward changing this 
parameter to be in line with common usage.

Per: this is unrelated to your patch of course - it just happened to come up 
here.
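
The two readings of replicationFactor discussed in this thread differ by one in the number of copies kept per slice. A tiny sketch makes the difference explicit; the class and method names are hypothetical and only restate the arithmetic from the comments.

```java
// Hypothetical helper contrasting the two interpretations of replicationFactor.
final class ReplicationFactorReadings {
    private ReplicationFactorReadings() {}

    /** Reading in which the factor counts replicas created in addition to the leader. */
    static int copiesPerSliceLeaderExcluded(int replicationFactor) {
        return replicationFactor + 1;
    }

    /** Reading in which the factor is the total number of copies, leader included. */
    static int copiesPerSliceLeaderIncluded(int replicationFactor) {
        return replicationFactor;
    }

    /** Total physical shards in a collection for a given number of copies per slice. */
    static int totalPhysicalShards(int numSlices, int copiesPerSlice) {
        return numSlices * copiesPerSlice;
    }
}
```

For a collection with 8 slices and a requested factor of 1, the first reading yields 16 physical shards and the second yields 8, so the choice of definition matters when sizing a cluster.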





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505472#comment-13505472
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. This fix belongs with the issue that fixed delete and reload - I'm going to 
fix it there.

Yes of course, it is just hard for me to split up the patch, because it is all 
needed for the tests to be green. But commit-wise it belongs to the other issue.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505496#comment-13505496
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. Per: this is unrelated to your patch of course - it just happened to come 
up here.

No problem. I could make it part of this patch if you want, but I'm not sure 
I agree with your way of interpreting the term replication-factor. I would 
expect replication-factor to say something about how many times the data is 
REPLICATED. If I run with only one copy of the data for each slice, I would 
logically say that my data is not replicated, and that matches a 
replication-factor of 0.

I used HDFS and HBase a little a year or so ago, but I'm not sure what 
meaning they give the term replica. I've also worked a lot with 
ElasticSearch (which I believe is more of a counterpart to Solr), and in 
ElasticSearch I believe they use the term replica for the number of ADDITIONAL 
copies of the data - the same as your/our current implementation in Solr.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505506#comment-13505506
 ] 

Per Steffensen commented on SOLR-4114:
--

Another more urgent problem (for me) is that I need to make another change to 
the Solr Collection API before we can use it as a replacement for what we 
already do in our project (where we create each shard one by one in OUR code). 
We split our set of Solr servers into two subsets - Data-Solrs and 
Search-Solrs. The Search-Solrs are not supposed to carry any data and 
therefore should not be occupied by indexing. Search-Solrs instead play the 
role of receiving queries from the outside, sub-querying the Data-Solrs and 
combining the results into the final response to the outside. Data-Solrs are 
where we create the data-carrying collections. Data-Solrs need more CPU and IO 
capacity while Search-Solrs need more RAM - hence the split.

Therefore I need to be able to provide a list of Solrs to the create operation 
of the Solr Collection API. The shards for the collection are then only 
allowed to be spread over the Solrs in this list - the default list could be 
all Solrs. In our Solr-based project we will pass our list of Data-Solrs as 
this list.

Can I add such a feature to this SOLR-4114 and include it in a combined patch, 
or do you prefer another ticket for this change? I can create another issue but 
provide a combined patch. Are you interested in such a feature at all? That 
is, a feature where the create operation takes a list of Solrs to spread the 
created shards over.
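
To make the request concrete, here is a minimal sketch of the placement rule 
described above: shards are spread round-robin, but only over the servers in 
a caller-supplied list, defaulting to all live servers. The function name, 
the node names and the round-robin policy are assumptions made for 
illustration only; this is not the code in the attached patch.

```python
from collections import defaultdict


def place_shards(num_shards, live_nodes, allowed_nodes=None):
    """Spread shards round-robin over the allowed subset of live nodes.

    Hypothetical helper: names and policy are illustrative assumptions.
    """
    # Default list is "all Solrs", as proposed in the comment.
    candidates = [n for n in live_nodes
                  if allowed_nodes is None or n in allowed_nodes]
    if not candidates:
        raise ValueError("no live node is in the allowed list")

    placement = defaultdict(list)
    for i in range(num_shards):
        node = candidates[i % len(candidates)]
        placement[node].append(f"shard{i + 1}")
    return dict(placement)


# Four Data-Solrs and two Search-Solrs; only the Data-Solrs may carry shards.
live = ["data1", "data2", "data3", "data4", "search1", "search2"]
data_solrs = ["data1", "data2", "data3", "data4"]
layout = place_shards(8, live, allowed_nodes=data_solrs)
```

With 8 shards and 4 Data-Solrs each Data-Solr ends up with 2 shards, and the 
Search-Solrs carry none.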

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505531#comment-13505531
 ] 

Mark Miller commented on SOLR-4114:
---

When grabbing the params fix I noticed you set the data dir to something like 
shardname+_data - that's not strictly necessary, right? Since each core should 
have its own instance dir?

I've been thinking about how to set custom datadirs with this api - it would be 
nice to be able to specify the data dir - and in some cases perhaps base it on 
something like the core name rather than just some static string. But have you 
found it 'necessary' with your work?

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505533#comment-13505533
 ] 

Radim Kolar commented on SOLR-4114:
---

Could you not do the same thing as Elastic Search? Build the index with a 
number of shards (the initial number is 5). If there is 1 machine in the 
cluster, then all shards are on this machine. If you add more machines, they 
will move to other machines. It is much simpler for administration.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505535#comment-13505535
 ] 

Mark Miller commented on SOLR-4114:
---

bq. Can I add such a feature to this SOLR-4114 and include it in a combined 
patch, or do you prefer another ticket for this change?

My preference would be a new issue. If it has to be done as one piece, I would 
wait for this to go in before supplying the patch for that issue. Or supply a 
patch for that issue and note that it requires applying this patch first. 
Combining multiple issues into one patch just makes it more difficult to get it 
in generally.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505539#comment-13505539
 ] 

Mark Miller commented on SOLR-4114:
---

bq. you add more machines, they will move to other machines.

Personally, I'm not really sold on this auto-rebalancing idea. I'd prefer the 
user had to explicitly make these moves.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505541#comment-13505541
 ] 

Jack Krupansky commented on SOLR-4114:
--

I certainly think of replica as a copy of the ORIGINAL, which makes perfect 
sense in a master with n-slaves configuration, but in a fully distributed 
environment such as SolrCloud where the leader of a shard can vary over time 
and updates are distributed to all nodes all of the time, there is no longer 
the concept of an original copy of the data. If anything, the original data 
is the source data on the wire before it gets instantiated on each node. No 
node is truly the original.

The terminology has this difficulty that it is only partially shared between 
the worlds of master/slave and the cloud. In master/slave, only the slaves are 
replicas and the master is the original, while in cloud ALL nodes are replicas 
since there are no originals. The leader is not a master copy of the data 
in the sense of master/slave.

So, I guess I am semi-comfortable with replica referring to all instances of 
the data, but we do need to be careful to highlight the distinction between how 
the term replica is used in the world of master/slave vs. SolrCloud, especially 
since many Cloud users will be migrating from the world of master/slave.

We also need to be careful not to refer to leader and replicas which implies 
that a leader is not a replica!


 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505547#comment-13505547
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. could not you do same thing as Elastic Search. Build index with number of 
shards (initial number is 5). If there is 1 machine in cluster, then all shards 
are on this machine. If you add more machines, they will move to other 
machines. It is way simple for administration.

Moving shards around as more Solr servers join the cluster is the easiest 
way to provide elasticity (as I mentioned above somewhere). That is one of the 
reasons that I want to be able to run multiple shards for a collection on the 
same Solr server. That way you will already have shards ready to move to other 
Solrs that might join the cluster later.

In Solr, right now, we don't have the ability to move shards from one server to 
another (ES has it), but in order to benefit from such a future feature, you 
will need to be able to have multiple shards on one Solr server. 
Alternatively you have to split a shard, but that is much harder, and should 
only be used if you did not foresee, when you created your collection, that you 
would add more servers later, and therefore did not create your collection with 
multiple shards per server.
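
The elasticity argument can be illustrated with a small sketch. It assumes a 
hypothetical "move whole shard" operation (which, as noted, Solr does not 
have at this point) and simply works out which shards would be handed to a 
server that just joined. The function name and the greedy policy are 
illustrative assumptions, not an existing Solr API.

```python
def rebalance(layout, new_node):
    """Plan whole-shard moves from the fullest nodes to a newly joined node.

    Hypothetical helper: returns the new layout and the list of
    (shard, from_node, to_node) moves; nothing is actually moved.
    """
    layout = {node: list(shards) for node, shards in layout.items()}
    layout.setdefault(new_node, [])
    total = sum(len(shards) for shards in layout.values())
    target = total // len(layout)  # fair share for the new node

    moves = []
    while len(layout[new_node]) < target:
        donor = max((n for n in layout if n != new_node),
                    key=lambda n: len(layout[n]))
        if len(layout[donor]) <= len(layout[new_node]) + 1:
            break  # moving another shard would not improve the balance
        shard = layout[donor].pop()
        layout[new_node].append(shard)
        moves.append((shard, donor, new_node))
    return layout, moves


# 8 shards on 4 servers: a fifth server can take over a whole shard.
two_per_server = {
    "solr1": ["shard1", "shard5"],
    "solr2": ["shard2", "shard6"],
    "solr3": ["shard3", "shard7"],
    "solr4": ["shard4", "shard8"],
}
balanced, moves = rebalance(two_per_server, "solr5")

# 1 shard on 1 server: nothing can move without splitting the shard.
_, no_moves = rebalance({"solr1": ["shard1"]}, "solr2")
```

In the first case one whole shard is planned to move to the new server; in 
the second the move list is empty, which is the situation where a shard 
split would be the only option.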

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505556#comment-13505556
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. Personally, I'm not really sold on this auto re balancing idea. I'd prefer 
the user had to explicitly make these moves.

Me neither - and I can say that ES sometimes f it up, at least when I was 
working with it, but that was mainly because of bad re-balancing algorithms. 
But I like moving shards manually from an admin console!

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505558#comment-13505558
 ] 

Mark Miller commented on SOLR-4114:
---

I've committed the shared params issue under SOLR-4055 and added Per to the 
Changes entry.

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505562#comment-13505562
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. When grabbing the params fix I noticed you set the data dir to something 
like shardname+_data - that's not strictly necessary right? Since each core 
should have its own instance dir

Well, I use the same instance-dir for all shards, but a different data-dir - 
this is just how we used to do it in my project, but it can be changed. As long 
as the code uses the same instance-dir, different data-dirs are necessary 
though.
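
As a small illustration of the layout described here, the sketch below 
derives one data dir per shard under a single shared instance dir, using 
the shardname+_data naming mentioned in the quoted question. The helper name 
and the example path are assumptions for illustration, not taken from the 
patch.

```python
import posixpath


def shard_data_dirs(shared_instance_dir, shard_names):
    """Map each shard to its own data dir under one shared instance dir.

    Hypothetical helper; the "<shard>_data" suffix follows the naming
    mentioned in the thread.
    """
    dirs = {name: posixpath.join(shared_instance_dir, name + "_data")
            for name in shard_names}
    # With a shared instance dir, two shards must never share a data dir.
    if len(set(dirs.values())) != len(dirs):
        raise ValueError("duplicate shard names would share a data dir")
    return dirs


# Example path is made up.
dirs = shard_data_dirs("/var/solr/mycollection", ["shard1", "shard5"])
```

If each core instead had its own instance dir, the default data dir inside 
it would already be unique and the suffix would not be needed.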

 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505564#comment-13505564
 ] 

Yonik Seeley commented on SOLR-4114:


bq. I would expect replication-factor to say something about how many times 
the data is REPLICATED.

I would too, but we would still disagree on what that meant since I would 
interpret the number of times the data is replicated to mean the total number 
of copies that exist after a write operation to the cluster.  That seems to be 
the much more common interpretation in this context since there is no 
original... everyone has stored/indexed a copy.

$ echo hello > file1.txt
$ cp file1.txt file2.txt

How many copies of the file are there? If you look at the state (and not the 
mechanism by which you arrived there) most would say there are 2 copies.
In one interpretation, there is only one copy, but that's too literal and 
assigns some special category to the original.


http://hadoop.apache.org/docs/r0.20.2/hdfs_design.html
The number of copies of a file is called the replication factor of that file.

http://www.datastax.com/docs/1.0/cluster_architecture/replication
The total number of replicas across the cluster is referred to as the 
replication factor. A replication factor of 1 means that there is only one copy 
of each row on one node.

Oracle NoSQL store:
http://docs.oracle.com/cd/NOSQL/html/AdminGuide/introduction.html#replicationfactor
http://docs.oracle.com/cd/NOSQL/html/AdminGuide/store-config.html
A Replication Factor of 3 gives you shards with one master plus two replicas.

Riak:
http://wiki.basho.com/What-is-Riak%3F.html
An n value of 3 (default) means that each object is replicated 3 times. When 
an object’s key is mapped onto a given partition, Riak won’t stop there – it 
automatically replicates the data onto the next two partitions as well.

Splunk:
http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Thereplicationfactor
The number of data/bucket copies is called the cluster's replication factor.
The cluster can tolerate a failure of (replication factor - 1) peer nodes. So, 
for example, to ensure that your system can tolerate a failure of two peers, 
you must configure a replication factor of 3, which means that the cluster 
stores three identical copies of each bucket on separate nodes. With a 
replication factor of 3, you can be certain that all your data will be 
available if no more than two peer nodes in the cluster fail. With two nodes 
down, you still have one complete copy of your data available on the remaining 
peer(s).

It's clear that 3 copies means 3 total instances of the same data, not 4 (an 
original plus 3 more copies of it.)
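
The two readings of replication factor discussed in this thread differ by 
exactly one. A minimal sketch of the arithmetic, with hypothetical function 
names, including the "replication factor - 1" failure tolerance from the 
Splunk quote:

```python
def total_copies(replication_factor, counts_additional_only):
    """Total instances of the data under either reading of the term.

    counts_additional_only=True  -> factor counts ADDITIONAL copies
    counts_additional_only=False -> factor counts ALL copies
    """
    if counts_additional_only:
        return replication_factor + 1
    return replication_factor


def tolerated_failures(copies):
    """Nodes that may fail while one complete copy still survives."""
    return max(copies - 1, 0)


# A single, unreplicated copy is factor 0 in one reading and 1 in the other.
single_additional = total_copies(0, counts_additional_only=True)
single_total = total_copies(1, counts_additional_only=False)

# Three total copies tolerate two failed nodes.
failures_with_three = tolerated_failures(total_copies(3, False))
```

Both calls for the unreplicated case describe the same state, one copy of 
the data, which is why the disagreement is about naming rather than 
behavior.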


 Collection API: Allow multiple shards from one collection on the same Solr 
 server
 -

 Key: SOLR-4114
 URL: https://issues.apache.org/jira/browse/SOLR-4114
 Project: Solr
  Issue Type: New Feature
  Components: multicore, SolrCloud
Affects Versions: 4.0
 Environment: Solr 4.0.0 release
Reporter: Per Steffensen
Assignee: Per Steffensen
  Labels: collection-api, multicore, shard, shard-allocation
 Attachments: SOLR-4114.patch, SOLR-4114.patch





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505565#comment-13505565
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. I've committed the shared params issue under SOLR-4055 and added Per to the 
Changes entry.

On which branch are you committing, Mark?




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505570#comment-13505570
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. I would too, but we would still disagree on what that meant since I would 
interpret the number of times the data is replicated...

I actually agree with you. I just don't like "replica" being part of the name 
for it then. If we rename replication-factor to number-of-copies or something, 
I would be much happier changing the semantics of it :-) But really, this is 
another issue.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505571#comment-13505571
 ] 

Mark Miller commented on SOLR-4114:
---

bq. On which branch are you committing, Mark?

5x and then merged to 4x - just that small fix though - have not had a chance 
to review this patch fully yet.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505575#comment-13505575
 ] 

Per Steffensen commented on SOLR-4114:
--

Well, I'm off for today. I will probably (if my PO's head does not turn green) 
be making the spread-shards-according-to-provided-list feature tomorrow. If you 
commit the entire patch for SOLR-4114, it will be easier for me to provide a new 
patch for this new feature and attach it to a new issue :-)




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505578#comment-13505578
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. 5x and then merged to 4x - just that small fix though - have not had a 
chance to review this patch fully yet.

But is it also going to be backported to lucene_solr_4_0, which is actually the 
branch I am working on top of?




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505590#comment-13505590
 ] 

Mark Miller commented on SOLR-4114:
---

bq. But is it also going to be backported to lucene_solr_4_0

Given past discussion, it's very unlikely that we will release a 4.0.1 (I was 
for it FWIW) and will just do a 4.1 - so no, generally nothing is being back 
ported to the 4.0 branch.

If we did end up deciding to do a 4.0.1, then we would select which issues 
should go in and then do those back ports later.





[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505597#comment-13505597
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. so no, generally nothing is being back ported to the 4.0 branch

Well, I guess the essence of my question is whether it is OK that I keep 
providing patches relative to lucene_solr_4_0, at least for this issue and the 
spread-shards-across-provided-list-of-solrs one?




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505612#comment-13505612
 ] 

Mark Miller commented on SOLR-4114:
---

Well, it makes things a little more painful in that I have to merge it to 
4x/5x, but I can do that. It's probably not too difficult.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-28 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505808#comment-13505808
 ] 

Per Steffensen commented on SOLR-4114:
--

Created another ticket for the spread-shards-according-to-provided-list thingy. 
SOLR-4120




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-27 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504659#comment-13504659
 ] 

Yonik Seeley commented on SOLR-4114:


What's the proposed API?  Perhaps a maxShardsPerNode parameter during the 
create?
Seems like it should default to 1 (the current behavior)?
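
The proposal above can be illustrated with a minimal sketch. It is not Solr's implementation: the function `assign_shards`, its arguments and the node names are hypothetical, and only the parameter name maxShardsPerNode and its proposed default of 1 come from the comment. Shards are spread round-robin over the live nodes, and creation is refused when the cap cannot be met.

```python
from collections import defaultdict


def assign_shards(live_nodes, num_shards, max_shards_per_node=1):
    """Spread shards round-robin over live nodes, respecting a per-node cap.

    Hypothetical helper for illustration only; the default of 1 mirrors the
    behavior described in the thread as the current one.
    """
    if not live_nodes:
        raise ValueError("no live nodes")
    capacity = len(live_nodes) * max_shards_per_node
    if num_shards > capacity:
        raise ValueError(
            f"need {num_shards} shard slots, only {capacity} available"
        )
    placement = defaultdict(list)
    for i in range(num_shards):
        # Round-robin keeps the per-node count within the cap because
        # num_shards <= len(live_nodes) * max_shards_per_node was checked above.
        node = live_nodes[i % len(live_nodes)]
        placement[node].append(f"shard{i + 1}")
    return dict(placement)


if __name__ == "__main__":
    nodes = ["solr1", "solr2", "solr3", "solr4"]
    # 8 shards on 4 servers, 2 shards each, as in the issue description.
    print(assign_shards(nodes, 8, max_shards_per_node=2))
```

With the default cap of 1, the same call with 8 shards on 4 nodes raises an error instead of doubling up, which is the existing behavior the parameter is meant to preserve.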




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-27 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504881#comment-13504881
 ] 

Per Steffensen commented on SOLR-4114:
--

Learned from Steve today that you usually develop for 5.x on trunk and then 
back port to 4.x.y branches. Let me know if you would like a trunk-based patch 
instead.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-27 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504889#comment-13504889
 ] 

Yonik Seeley commented on SOLR-4114:


bq. Well I see no reason to introduce (in the first step at least) a 
maxShardsPerNode. 

So that we don't lose functionality we currently have?
And I agree that it should be up to the user, hence the proposed parameter to 
control it.

bq. Only potential problem is if his create request is run when not all Solr 
servers are running, and in such case a maxShardsPerNode could help to stop the 
creation process.

Exactly... there's the main use case.

Example: you have 24 servers and create a collection with 8 shards and a target 
replication factor of 3... but one of the servers goes down in the meantime so 
one shard has only 2 replicas.  It's entirely reasonable for a user to want to 
wait until that machine comes back up rather than doubling up on a different 
node.

The other use case is the examples on http://wiki.apache.org/solr/SolrCloud
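
The arithmetic behind the 24-server example can be sketched as follows. This is only a counting model under stated assumptions: `replica_counts` is a hypothetical helper, "copies" uses the convention where the leader counts as a replica, and the model ignores the question of whether two copies of the same shard land on the same node.

```python
def replica_counts(live_nodes, num_shards, copies_per_shard, max_cores_per_node=1):
    """Return how many copies each shard ends up with, most-served first.

    Hypothetical counting model: every live node offers max_cores_per_node
    slots, and the available slots are shared as evenly as possible.
    """
    wanted = num_shards * copies_per_shard
    slots = live_nodes * max_cores_per_node
    placed = min(wanted, slots)
    base, extra = divmod(placed, num_shards)
    return [base + 1] * extra + [base] * (num_shards - extra)


if __name__ == "__main__":
    # All 24 servers up: every shard gets its 3 copies.
    print(replica_counts(24, 8, 3))
    # One server down and a cap of 1 core per node: one shard is left with 2.
    print(replica_counts(23, 8, 3))
    # Same outage with a cap of 2: the missing copy doubles up elsewhere.
    print(replica_counts(23, 8, 3, max_cores_per_node=2))
```

The second and third calls show the two outcomes being discussed: waiting for the missing machine versus doubling up on another node.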




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-27 Thread Per Steffensen (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504954#comment-13504954
 ] 

Per Steffensen commented on SOLR-4114:
--

bq. So that we don't lose functionality we currently have?

So now you care about backwards compatibility? :-) You didn't care about 
backwards compatibility from 3.6 to 4.0 when you introduced optimistic locking 
(including an error when updating an existing document without providing the 
correct version), which is forced upon you in 4.0 if you choose to run with 
version-field and update-log. There are perfectly valid reasons for wanting to 
use version-field and update-log without wanting full-blown optimistic 
locking. My solution to SOLR-3178 supports this kind of backwards compatibility 
by letting you explicitly choose among the update-semantics modes classic, 
consistency and classic-consistency-hybrid. So if you come from 3.6 and 
want backwards-compatible update-semantics, but also want version-field and 
update-log, you just choose update-semantics classic :-) See 
http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics.
I'm just teasing you a little :-)

But anyway, I like backwards compatibility, so you are right: we probably do not 
want to do something that changes the default behaviour in 4.0.0. I will have a 
look at a solution tomorrow. It is kind of late in Europe now.

bq. Example: you have 24 servers and create a collection with 8 shards and a 
target replication factor of 3... but one of the servers goes down in the 
meantime so one shard has only 2 replicas. It's entirely reasonable for a user 
to want to wait until that machine comes back up rather than doubling up on a 
different node.

I assume you mean a replication-factor of 2? With a replication-factor of 2 you 
will get 3 shards per slice.

With your current solution there will be no waiting until that machine comes 
back up. You will just end up with 8 slices, where 7 of them have 2 replicas 
and the last has only 1 replica. With the patch I provided today you will end 
up with 8 slices, all of which have 2 replicas - but one of the servers 
will be running two shards and the Solr that is down will not be running any 
(when it comes back up). I would probably prefer my current solution - at 
least you achieve the property that any two servers can crash (including disk 
crash) without you losing data - which is basically what you want to achieve 
when you request a replication-factor of 2.

But waiting for the machine to come back up before creating the collection 
would certainly be the best solution. It is just extremely hard to know whether 
a machine is down or not - or whether you intended to run one server more than 
what is currently running. In general there is no information in Solr/ZK about 
that - and there shouldn't be. In this case a maxShardsPerNode could be a nice 
way to tell the system that you just want to wait. But then it would have to be 
implemented correctly, and that is really hard. In OverseerCollectionProcessor 
you can check whether you can meet the maxShardsPerNode requirement with the 
current set of live Solrs, and if you can't, just don't initiate the creation 
process. But a server can go down between the time when the 
OverseerCollectionProcessor checks and the time when it is supposed to create a 
shard. Therefore it is impossible to guarantee that the 
OverseerCollectionProcessor does not create some shards of a new collection 
without being able to create them all while still living up to the 
maxShardsPerNode requirement. In such a case, if you really want to live up to 
the maxShardsPerNode requirement, the OverseerCollectionProcessor would have to 
try to delete the shards of the collection that were successfully created. But 
this deletion process can also fail. Ahhh, there is no guaranteed way.
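
The check-then-create gap described above can be simulated in a few lines. This is a toy model, not OverseerCollectionProcessor code: `create_collection` and its `fail_after` argument are hypothetical, and a node failure is simulated by dropping the last node from the list partway through creation.

```python
def create_collection(live_nodes, num_shards, max_shards_per_node, fail_after=None):
    """Check capacity up front, then create shards one at a time.

    fail_after (hypothetical knob) makes one node disappear after that many
    shards have been created, i.e. between the check and a later creation.
    Returns a (status, created) pair where created maps shard name to node.
    """
    nodes = list(live_nodes)
    # Up-front check against the set of nodes that is live right now.
    if num_shards > len(nodes) * max_shards_per_node:
        return "rejected", {}
    load = {node: 0 for node in nodes}
    created = {}
    for i in range(num_shards):
        if fail_after is not None and i == fail_after:
            lost = nodes.pop()  # a server goes down after the check passed
            load.pop(lost)
        candidates = [n for n in nodes if load[n] < max_shards_per_node]
        if not candidates:
            # The cap can no longer be met; some shards already exist.
            return "partial", created
        node = min(candidates, key=lambda n: load[n])
        load[node] += 1
        created[f"shard{i + 1}"] = node
    return "ok", created


if __name__ == "__main__":
    nodes = ["solr1", "solr2", "solr3", "solr4"]
    print(create_collection(nodes, 4, 1))
    print(create_collection(nodes, 4, 1, fail_after=2))
```

In the second call the up-front check passes, yet the run ends in the "partial" state with three shards created. Cleaning those up would be a further step that can itself fail, which is the point made in the comment.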

Therefore my idea about the whole thing is aimed more at just having all the 
shards created, and then moving them around later. I know this is not possible 
for now, but I do expect that we (at least my project) will add support for 
(manual and/or automatic) migration of shards from one server to another. 
This feature is needed to achieve nice elasticity (moving shards/load onto new 
servers as they join the cluster), but also to do re-balancing after e.g. a 
Solr was down (and a shard that should have been placed on this server was 
temporarily created to run on another server).

Well, as I said, I will consider the best (small patch :-) ) solution tomorrow. 
But if I can't come up with a better small-patch solution we can certainly do 
the maxShardsPerNode thing - no problemo. It just isn't going to be 100% 
guaranteed.



[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-27 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13504973#comment-13504973
 ] 

Mark Miller commented on SOLR-4114:
---

Solr 3.X to Solr 4.X back compat is not considered the same as Solr 4.0 to Solr 
4.1 back compat.




[jira] [Commented] (SOLR-4114) Collection API: Allow multiple shards from one collection on the same Solr server

2012-11-27 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-4114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13505001#comment-13505001
 ] 

Yonik Seeley commented on SOLR-4114:


bq.  So that we don't lose functionality we currently have?
bq. So now you care about backwards compatibility?

I was speaking specifically about functionality, not back compatibility.

bq. With your current solution there will be no waiting until that machine 
comes back up. You will just end up with 8 slices, where 7 of them have 2 
replica, and the last only have 1 replica.

Correct.  When I said it's entirely reasonable for a user to want to wait, I 
meant wait to create the additional replica for one shard, not wait to create 
the whole collection.  Although I guess it might be useful to be able to fail 
collection creation if certain specified constraints aren't met (including a 
min replication factor).

As far as terminology, when I say replicationFactor of 3, I mean 3 copies of 
the data.  I also count the leader as a replica of a shard (which is logical).  
It follows from the clusterstate.json, which lists all replicas for a shard 
and one of them just has a flag indicating it's the leader.  This also makes it 
easier to talk about a shard having 0 replicas (meaning there is not even a 
leader).
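
The two counting conventions in this thread can be put side by side in a small sketch. The structure below is a simplified, hypothetical stand-in for a cluster state entry (it is not the actual clusterstate.json layout): each shard lists all of its cores, and one of them carries a leader flag.

```python
# Simplified, hypothetical shard entry: every core is listed, one is the leader.
shard1 = [
    {"node": "host1:8983_solr", "leader": True},
    {"node": "host2:8983_solr", "leader": False},
    {"node": "host3:8983_solr", "leader": False},
]


def total_copies(cores):
    """Count every core, leader included (replicationFactor as copies of the data)."""
    return len(cores)


def additional_replicas(cores):
    """Count only the non-leader cores (replication-factor as extra copies)."""
    return sum(1 for core in cores if not core["leader"])


if __name__ == "__main__":
    print(total_copies(shard1))         # 3
    print(additional_replicas(shard1))  # 2
```

Under the first convention this shard has a replicationFactor of 3; under the second it has a replication-factor of 2. An empty list is a shard with 0 replicas in the first sense, meaning there is not even a leader.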

