[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-04-06 Thread Michael Garski (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13248516#comment-13248516
 ] 

Michael Garski commented on SOLR-2358:
--

I have a use case for shard distribution based on something other than a hash 
on the document's unique id, and was wondering how such functionality should 
be implemented. It looks like SOLR-2341 (Shard distribution policy) and 
SOLR-2592 (pluggable shard lookup mechanism) complement each other for 
indexing and searching. Does anyone have thoughts on which approach to take? 
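
To make the question concrete, here is a minimal sketch of what a pluggable 
policy could look like. Everything in it is hypothetical: the names 
ShardPolicies, ShardPolicy and hashOn are invented for illustration and are 
not the API proposed in SOLR-2341 or SOLR-2592, and String.hashCode() is only 
a stand-in for whatever hash function Solr actually uses.

```java
import java.util.Map;

/** Hypothetical illustration only; not an existing Solr API. */
class ShardPolicies {

  /** Assumed shape of a policy: maps a document's fields to a shard index. */
  interface ShardPolicy {
    int shardFor(Map<String, Object> doc, int numShards);
  }

  /**
   * Routes on the hash of an arbitrary field. hashOn("id") approximates
   * routing on the unique id; any other field name gives a different
   * distribution while keeping equal values on the same shard.
   */
  static ShardPolicy hashOn(String field) {
    // floorMod keeps the result in [0, numShards) even for negative hashes
    return (doc, numShards) ->
        Math.floorMod(String.valueOf(doc.get(field)).hashCode(), numShards);
  }
}
```

With a policy such as hashOn("customer") (the field name is an assumption), 
all documents sharing a customer value land on the same shard, so the search 
side could use the same policy to pick the shard to query.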

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: 2shard4server.jpg, SOLR-2358.patch, SOLR-2358.patch, 
 apache-solr-noggit-r1211150.jar, zookeeper-3.3.4.jar


 The indexing side of SolrCloud - the goal of this issue is to provide 
 durable, fault tolerant indexing to an elastic cluster of Solr instances.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-29 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195732#comment-13195732
 ] 

Robert Muir commented on SOLR-2358:
---

{quote}
I can't currently get into the hudson machine - used the wrong username the 
other day and seemed to get ip banned pretty much right away. Looking into 
getting that undone.
{quote}

Yeah, that's probably the best way to move forward. Otherwise you have to wait 
like an hour just to see if one tweak to a single test worked.

{quote}
Which tricks? This could be part of it by the sound of things.
{quote}

It depends on what the test is doing, but just a few ideas:
* Any client operations in tests should have a low connect() timeout/so_timeout. 
If you always set this, then it will never hang for long periods of time.
* If you absolutely need to test the case where you don't get a timeout but 
another exception, use an ipv6 test address (e.g. [ff01::114]). Because jenkins 
has no ipv6, it fails fast always. This won't work forever...
* In a situation where you have A talking to B, and you want to test a 
condition where B goes down, instead of just bringing B down you can consider 
mocking up a remote node to test failures: bring up a mock downed server (e.g. 
just a ServerSocket on that same port with reuseAddress=true). This one can 
return whatever error you want, or just disconnect, and even assert that A 
tried to connect to it. Maybe instead of using real remote jettys at all, most 
tests could even be totally implemented this way: it would be faster and 
simpler than spinning up so many jettys in all the tests.
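
The mock downed server idea can be sketched roughly as follows. This is a 
minimal illustration and not code from the Solr test suite: the class name 
MockDownedServer and the helper connectAndRead are made up for this example, 
and it only shows the "accept, count the attempt, then hang up" variant.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical test helper: a fake "downed" node that counts connects and hangs up. */
class MockDownedServer implements AutoCloseable {
  private final ServerSocket serverSocket;
  private final AtomicInteger connectAttempts = new AtomicInteger();
  private final Thread acceptor;

  MockDownedServer(int port) throws IOException {
    serverSocket = new ServerSocket();
    serverSocket.setReuseAddress(true); // has to be set before bind()
    serverSocket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
    acceptor = new Thread(() -> {
      while (!serverSocket.isClosed()) {
        try (Socket accepted = serverSocket.accept()) {
          // Record that someone tried to talk to us; the socket is then
          // closed by try-with-resources, which the client sees as a hang-up.
          connectAttempts.incrementAndGet();
        } catch (IOException e) {
          // accept() throws once close() is called; the loop condition then ends the thread
        }
      }
    }, "mock-downed-server");
    acceptor.setDaemon(true);
    acceptor.start();
  }

  int port() {
    return serverSocket.getLocalPort();
  }

  int connectAttempts() {
    return connectAttempts.get();
  }

  @Override
  public void close() throws IOException {
    serverSocket.close();
  }

  /** Client side with low timeouts; returns -1 when the peer disconnected. */
  static int connectAndRead(int port, int timeoutMs) throws IOException {
    try (Socket client = new Socket()) {
      client.connect(new InetSocketAddress(InetAddress.getLoopbackAddress(), port), timeoutMs);
      client.setSoTimeout(timeoutMs); // bound the read as well as the connect
      return client.getInputStream().read();
    }
  }
}
```

A test would start the mock on the port B used, point A at it, and afterwards 
check connectAttempts() to assert that A really tried to reach B. Passing 
port 0 lets the OS pick a free port.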





[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-29 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195759#comment-13195759
 ] 

Mark Miller commented on SOLR-2358:
---

These tests really need to be done with real jetty instances (at least some of 
them). I'll try adding some timeouts where we are not currently using them 
(generally they are used from any test code but not always in non test code).




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-29 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195761#comment-13195761
 ] 

Yonik Seeley commented on SOLR-2358:


We should be careful of using socket read timeouts in non-test code for 
operations that could potentially take a long time... commit, optimize, and 
even query requests (depending on what the request is).  By default, solr does 
not currently time out requests because we don't know what the upper bound is.
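
The distinction can be illustrated with plain java.net sockets: the connect 
timeout bounds only the connection attempt, while a read timeout (SO_TIMEOUT) 
of 0 means a read may block indefinitely. The sketch below is an assumption 
about how one might express "bounded connect, unbounded read"; the class and 
method names are invented and this is not how Solr's HTTP clients are 
configured.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical illustration of separating connect and read timeouts. */
class TimeoutPolicy {

  /**
   * Fails fast if the peer is unreachable, but never times out while waiting
   * for a response, which suits requests with no known upper bound
   * (commit, optimize, some queries).
   */
  static Socket openForLongRunningRequest(InetSocketAddress address, int connectTimeoutMs)
      throws IOException {
    Socket socket = new Socket();
    try {
      socket.connect(address, connectTimeoutMs); // bounded: avoids blackhole hangs
      socket.setSoTimeout(0);                    // 0 = reads wait forever
      return socket;
    } catch (IOException e) {
      socket.close(); // do not leak the socket when the connect fails
      throw e;
    }
  }
}
```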




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-29 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195783#comment-13195783
 ] 

Mark Miller commented on SOLR-2358:
---

Yup, I agree - in general in non test code we don't want to time out by default 
- that is why I've stuck to only using them in the tests until now. I've tried 
adding one to the Solr cmd distributor for a bit though - just to see if that 
helps on Jenkins any. I'd like to narrow in and at least know if this is the 
problem or not (blackhole hangups). For some things, like a request to recover, 
timeouts may be fine I think.

Once I am able to log into jenkins again, I can hopefully narrow down what is 
happening a lot faster.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-29 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195787#comment-13195787
 ] 

Yonik Seeley commented on SOLR-2358:


bq. For some things, like a request to recover, timeouts may be fine I think.

Definitely - we have a lot better handle on Solr created requests.  Replication 
(although it can take a long time to send a big file, there shouldn't be long 
periods where no packets are sent), PeerSync, etc.

Although IIRC, a new cloud-style replication request involves the recipient 
doing a commit?




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-28 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13195689#comment-13195689
 ] 

Mark Miller commented on SOLR-2358:
---

bq. Should another issue be opened for the tests?

I have another issue for the test problem: SOLR-3066

bq. Do the failures reproduce if you ssh into the hudson machine itself and 
test from there?

I can't currently get into the hudson machine - used the wrong username the 
other day and seemed to get ip banned pretty much right away. Looking into 
getting that undone.

bq. Do any tests rely upon not being able to connect to a tcp/udp port

Sometimes, yes - because jetties are going up and down during these tests, 
sometimes you wouldn't be able to connect - I wouldn't say we rely on it, but 
it seems it could happen. 

bq. unless you do some tricks.

Which tricks? This could be part of it by the sound of things.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-26 Thread Robert Muir (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13194360#comment-13194360
 ] 

Robert Muir commented on SOLR-2358:
---

Should another issue be opened for the tests?

Do the failures reproduce if you ssh into the hudson machine itself and test 
from there? I've found this useful before when things are hard to reproduce.

Do any tests rely upon *not* being able to connect to a tcp/udp port (even 
localhost)? Our hudson machine has an interesting network configuration: it 
blackholes connections to closed ports, so any tests that rely upon this will 
just hang (for a very long time!) unless you do some tricks. This is actually 
great for testing (imo), because it simulates how a real outage can behave, 
but it is likely different from anyone's local machine.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-25 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193254#comment-13193254
 ] 

Mark Miller commented on SOLR-2358:
---

Okay, I just hit commit. I expect I'll have to do some more test hardening, but 
I will be pretty responsive to that initially.

I have not worked out the whole changes entry and how to handle all of these 
sub issues - but I will start on that and leave this issue unresolved until I 
get that done (today or tomorrow depending on how it goes).




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-25 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13193434#comment-13193434
 ] 

Mark Miller commented on SOLR-2358:
---

I knew hudson would get me - that series of tubes runs stuff in some funny land 
I always have a hard time reproducing. I've ignored a couple tests for the very 
short term while I try and replicate the first fails on my mac, linux box, or 
windows VM. So far, it's proving difficult to replicate those fails, but I'll 
keep banging away over the short term.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-24 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13192219#comment-13192219
 ] 

Yonik Seeley commented on SOLR-2358:


+1, looks good!




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-23 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13191394#comment-13191394
 ] 

Mark Miller commented on SOLR-2358:
---

Okay, tests are passing on my linux box, mac and windows vm. I am working on a 
patch right now to highlight the changes, then I plan on committing this issue 
in a day or two. From there, we can iterate on any rough edges on trunk.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-20 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190181#comment-13190181
 ] 

Mark Miller commented on SOLR-2358:
---

I'm ready to start looking at merging this branch to trunk - the primary 
blocker to that that I see at the moment is that 
org.apache.solr.search.TestRecovery does not pass on Windows. After that is 
resolved, I hope to start the merge process!




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-20 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13190188#comment-13190188
 ] 

Yonik Seeley commented on SOLR-2358:


bq. the primary blocker to that that I see at the moment is that 
org.apache.solr.search.TestRecovery does not pass on Windows

Yeah, it's the old transaction logs that are still open after a shutdown (and 
the test tries to remove those log files).
I'm in the middle of some deleteByQuery stuff right now, but I should be able 
to figure out a workaround for the TestRecovery issue this weekend.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-19 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13189347#comment-13189347
 ] 

Mark Miller commented on SOLR-2358:
---

I've tried to make good use of atLeast to minimize the times of some of the 
larger new solrcloud tests, but they are still not super lightweight (a few of 
the new ones spin up multiple jetty instances).

Here is where they currently stand in comparison to current tests without any 
nightly or multiplier boosts:
{noformat}
Worst Times:
test:org.apache.solr.cloud.FullSolrCloudTest time:33.933
test:org.apache.solr.handler.TestReplicationHandler time:30.002
test:org.apache.solr.cloud.ChaosMonkeyNothingIsSafeTest time:24.572
test:org.apache.solr.cloud.ChaosMonkeySafeLeaderTest time:24.271
test:org.apache.solr.cloud.RecoveryZkTest time:22.875
test:org.apache.solr.cloud.FullSolrCloudDistribCmdsTest time:22.161
test:org.apache.solr.cloud.BasicDistributedZkTest time:16.696
test:org.apache.solr.search.TestRealTimeGet time:16.385
test:org.apache.solr.TestDistributedGrouping time:15.136
test:org.apache.solr.TestDistributedSearch time:14.609
{noformat}




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-11 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184537#comment-13184537
 ] 

Mark Miller commented on SOLR-2358:
---

This came up in a conversation with a user in #solr IRC - we really want to 
change the search param distrib to default to true rather than false when in 
SolrCloud mode.
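
In other words, an explicit distrib value still wins, and only the fallback 
changes. A minimal sketch of that rule, using a plain map in place of Solr's 
request parameters (the class and method names are invented for illustration 
and are not the actual change made in the branch):

```java
import java.util.Map;

/** Hypothetical illustration of the proposed default for the distrib param. */
class DistribDefault {

  /**
   * An explicit distrib=true/false is honoured; when the param is absent,
   * the default is true in SolrCloud mode and false otherwise.
   */
  static boolean isDistrib(Map<String, String> params, boolean cloudMode) {
    String value = params.get("distrib");
    return value == null ? cloudMode : Boolean.parseBoolean(value);
  }
}
```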




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-11 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13184621#comment-13184621
 ] 

Mark Miller commented on SOLR-2358:
---

I've made the above change in the branch.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-10 Thread Darren Govoni (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183238#comment-13183238
 ] 

Darren Govoni commented on SOLR-2358:
-

Great job Mark. Thanks! 




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-10 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13183436#comment-13183436
 ] 

Mark Miller commented on SOLR-2358:
---

bq. Perhaps within couple/few weeks, after we stabilize and finish up some 
hanging work?

I think we are pretty close to this! There are only a few more nocommits to 
work down. There is more to add, but I think we will have something stable 
enough to start iterating on in trunk - hopefully that will trigger even more 
testing and feedback - it is getting toward the point where the cost of the 
branch is starting to outweigh the benefits.




[jira] [Commented] (SOLR-2358) Distributing Indexing

2012-01-08 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13182305#comment-13182305
 ] 

Mark Miller commented on SOLR-2358:
---

Hey Darren - I have rewritten the description a bit, attached a little 
diagram, and started working on an updated version of the solrcloud wiki page 
(http://wiki.apache.org/solr/SolrCloud) at 
http://wiki.apache.org/solr/SolrCloud2.

If you have any user level questions, it might be more useful to do those on 
the user mailing list. Anything more related to development, fire away right 
here.

Loosely, this issue covers the indexing side of the solrcloud vision - the 
search side had already been largely done in an earlier phase (though some of 
that has been improved as well in this phase).




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-12-30 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177713#comment-13177713
 ] 

Mark Miller commented on SOLR-2358:
---

As I was working on transforming the old distrib update processor code into 
something we needed for solrcloud, I dropped its ability to buffer updates. It 
just made work quicker and I wasn't really sure how much re-factoring would end 
up happening, so I didn't want to spend too much time on something that only 
related to performance so early. I'm going to work on adding back buffering to 
the new SolrCmdDistributor class shortly - I think it means I have to move 
'forward failures' retry logic back into the SolrCmdDistributor - I had this 
there before, but it was ugly, so I pulled it up a level into the distrib 
update processor. I think with buffering though, it needs to go back. (when a 
forward to leader fails, we would often like to pause and retry as it is 
possible the leader went down and now there is a new one)
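
A minimal sketch of the "pause and retry" idea for failed forwards, assuming a hypothetical lookup_leader() callable that returns the current leader and a hypothetical send() transport that raises LeaderDown when the leader is unreachable. None of these names come from SolrCmdDistributor; they only illustrate re-resolving the leader between attempts.

```python
import time


class LeaderDown(Exception):
    """Raised by the hypothetical transport when the leader cannot be reached."""


def forward_with_retry(update, lookup_leader, send, retries=3, pause=0.0):
    """Forward `update` to the leader, looking the leader up again after each failure."""
    last_error = None
    for _ in range(retries):
        leader = lookup_leader()      # the leader may have changed since the last attempt
        try:
            return send(leader, update)
        except LeaderDown as err:
            last_error = err
            time.sleep(pause)         # give the cluster time to elect a new leader
    raise last_error
```

The leader is looked up inside the loop on purpose: retrying against the same dead node would not help if a new leader has been elected in the meantime.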

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-12-30 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177743#comment-13177743
 ] 

Mark Miller commented on SOLR-2358:
---

Okay - I've got basic buffering back - I've lost forwarding retries for the 
moment though - I'll wait to commit to the branch until I've brought that back.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-12-30 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13177864#comment-13177864
 ] 

Mark Miller commented on SOLR-2358:
---

Buffering is back in with retries on failed forwards to leaders.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-12-05 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163178#comment-13163178
 ] 

Mark Miller commented on SOLR-2358:
---

We are starting to get some stable, usable stuff here (even though there is 
much to do!). We are also starting to get some users that are interested in 
using this stuff (critical feedback there). So I'd like to propose we try and 
merge the branch into trunk sooner rather than later, and then iterate from 
there. Anything too experimental in the future could move back onto a branch 
again. This will make the merge a bit more digestible as well - rather than 
building up a crazy amount of differences on the branch. There are also a 
variety of improvements and fixes in the testing framework and elsewhere that 
would be nice to get back into trunk. Perhaps within a couple/few weeks, after we 
stabilize and finish up some hanging work?

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-12-05 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13163346#comment-13163346
 ] 

Mark Miller commented on SOLR-2358:
---

I just made it so that version can be specified on deletes in solrxml and did 
the work necessary for distrib deletes to work with versioning. You can do 
delete by id now.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-12-04 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13162464#comment-13162464
 ] 

Yonik Seeley commented on SOLR-2358:


I've made the distrib update processor default.  I had to @Ignore BasicZkTest 
for some reason though.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-12-02 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13161759#comment-13161759
 ] 

Mark Miller commented on SOLR-2358:
---

note: distrib delete by id not working at the moment - we need to start 
propagating versions on SolrCmd objects - right now they are lost on conversion 
to an update request, and the versioning code is not happy.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-11-22 Thread Lance Norskog (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13155693#comment-13155693
 ] 

Lance Norskog commented on SOLR-2358:
-

Lamport Clocks are a time-tested way to sequence actions across a network. In 
this case, you can use an iterate-until-happy algorithm using the locks.

[Google Lamport Clock|https://www.google.com/search?q=lamport+clock]
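
For readers unfamiliar with the idea, here is a minimal sketch of a Lamport logical clock. The class and method names are illustrative assumptions, not anything in the Solr code base.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp an outgoing message (sending counts as an event)."""
        return self.tick()

    def receive(self, remote_time):
        """Merge the sender's timestamp, then advance past it."""
        self.time = max(self.time, remote_time) + 1
        return self.time
```

Because a receiver always jumps past the timestamp it was sent, any event that causally follows a message gets a larger timestamp than the send that produced it.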

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-11 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13125421#comment-13125421
 ] 

Mark Miller commented on SOLR-2358:
---

P.S. This lock is simply for auto layout of the cluster - if you are going to 
manually specify the layout, it wouldn't be used. If we ended up with an 
overseer, this lock could happen on it instead. Basically, if all the nodes 
fire up at the same time, you still want them to be sanely assigned to be a 
shard / replica, which requires knowing the assignments that have already 
happened.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-09 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123702#comment-13123702
 ] 

Mark Miller commented on SOLR-2358:
---

Initially, a request will be fully synchronous and will not return success to 
the client until the request is sent to each replica. So if a leader goes down 
before all replicas receive and ACK the request, the client will not get an 
ACK. A new leader will be elected. When the downed, previous leader comes back, 
he will come up in recovery mode. I expect recovery to be a difficult part and 
we have not fully worked it out yet. To recover, the node will have to talk to 
the leader and figure out what it has that it should not, what it doesn't have, 
etc. Then the recovering node either receives replays, or replaces the entire 
index. Lots of details to work out here. 

You have an interesting problem in that some replica leader candidates may have 
an update while others don't, as the leader may have died in the middle of 
relaying requests. We might prefer a new leader with the greatest versioned 
doc? Most client retries in this case will be fine (globally unique ids are 
required, so no worry about dupes). Then replicas talk to the leader and sync 
up. Or when a new leader is elected, replicas just talk amongst each other and 
sync up, or…

If the leader fails right before sending an ACK, the client will likely repeat 
the request. In the case of doc adds/updates and the same id it will just 
replace the previous success or will be able to use optimistic locking to 
figure out that either its update or someone else's actually went through 
already. The client would already know that perhaps its update went through 
because the connection would have timed out rather than receive a failure.
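
A rough sketch of why a repeated add is harmless, assuming a hypothetical in-memory Replica keyed by unique id with a simple per-document version counter. This is not Solr's versioning code. A plain re-add replaces the earlier copy; a re-add carrying the version the client last saw fails if some update already went through.

```python
class VersionConflict(Exception):
    """The document changed since the client last saw it."""


class Replica:
    def __init__(self):
        self.docs = {}  # unique id -> (version, fields)

    def add(self, doc_id, fields, expected_version=None):
        current_version = self.docs[doc_id][0] if doc_id in self.docs else 0
        # Optimistic locking: only apply if the caller saw the latest version.
        if expected_version is not None and expected_version != current_version:
            raise VersionConflict(
                f"{doc_id}: expected {expected_version}, found {current_version}")
        new_version = current_version + 1
        self.docs[doc_id] = (new_version, fields)  # same id replaces, never duplicates
        return new_version
```

In the conflict case the client learns that either its own earlier attempt or someone else's update already landed, and can re-read the document to find out which.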

Eventually, we might consider a mode where the request is ACK'd before it's on 
all replicas, in which case you might accept a higher risk of data loss.

bq. indexes diverge because some replicas commit a change while others do not

It's an area we have not fully worked out (though Yonik has likely thought 
about a lot of this more than I have yet) - initially though, Yonik's point was 
that you can usually expect success on all nodes unless the issue is something 
that would require the node come down and then come back in recovery mode I 
think. We certainly want to be resilient here eventually though. As we work 
through recovery scenarios, I think this will become more clear.

Long, short, we have been discussing and thinking about these various 
scenarios, but largely we are also taking things an issue at a time.


 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-09 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123706#comment-13123706
 ] 

Mark Miller commented on SOLR-2358:
---

bq. Generally speaking, it seems like we should avoid locks as much as 
possible. Should be more scalable...

Yeah, I had the same initial reaction - a collection wide lock? Who likes 
locks? In reality, I'm not too worried though - it's a simple very short lock 
for changing the cluster layout for a collection - this is not a normal thing 
that will happen - normally the cluster layout will be stable - this is mostly 
just as the cluster is coming up. So for simplicity and in the spirit of 
getting something working, it's easy to just start with a simple lock here - if 
it's really a problem (I doubt it myself), it's easy enough to do this 
differently later.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-07 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122953#comment-13122953
 ] 

Mark Miller commented on SOLR-2358:
---

Okay, I'm going to commit some really early stuff to the branch here...ugly 
code and lots of system.out's still there...but we can start tying in 
versioning and what not...

Commit adds the distrib update processor and makes it cloud aware.

If you add a doc to a replica, it forwards it to the leader. If a doc comes to 
the leader, it versions it (super mock/fake at the moment - param is set to 
docversion=yes) and forwards it to each replica in the shard (including itself).
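
The flow above can be sketched roughly as follows. Node, make_shard, and the integer version counter are all hypothetical stand-ins; the real update processor works over HTTP and its versioning is still mocked at this point.

```python
class Node:
    """A hypothetical in-memory core, not Solr's actual update processor."""

    def __init__(self, name):
        self.name = name
        self.leader = self        # until wired into a shard, a node leads itself
        self.replicas = [self]    # the leader's fan-out list includes itself
        self.index = {}           # doc id -> (version, fields)
        self.next_version = 1

    def add(self, doc_id, fields):
        if self.leader is not self:
            # A non-leader replica forwards the doc to the leader.
            return self.leader.add(doc_id, fields)
        # The leader assigns the version, then sends to every replica.
        version = self.next_version
        self.next_version += 1
        for node in self.replicas:
            node.apply(doc_id, fields, version)
        return version

    def apply(self, doc_id, fields, version):
        self.index[doc_id] = (version, fields)


def make_shard(names):
    """Build one shard; the first name becomes the leader."""
    nodes = [Node(name) for name in names]
    for node in nodes:
        node.leader = nodes[0]
    nodes[0].replicas = nodes
    return nodes
```

Whichever node receives the add, the document ends up on every node in the shard with the version the leader chose.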

Also a couple basic tests added around this, and other little fixes that were 
found/needed along the way...

The current main test for this fires up a control and 3 shards, each with 1 
replica (6 cores total). Indexing is then round robin'd to each shard (randomly 
adding either to the leader or the replica). Then the standard distrib search 
tests are run (with load balancing across replicas) and results compared with 
control.

Early, early, stuff - but it's a start. None of the hashing stuff we will be 
doing involved yet.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-07 Thread Mark Miller (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123004#comment-13123004
 ] 

Mark Miller commented on SOLR-2358:
---

Actually, commit will be a bit delayed - new test likes to hang when running in 
parallel to others with ant test - will have to dig...

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-07 Thread Yonik Seeley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123198#comment-13123198
 ] 

Yonik Seeley commented on SOLR-2358:


As far as locking vs leader, I think maybe both can make sense.
Some things are logically more node specific and a lock can make more sense 
there (so that a node can modify its own state).
Also, something like a command to create a new collection might be easier with 
a cluster lock.  The node that received the command can just do it, rather than 
introducing logic to forward the command to the cluster leader (or put the 
request in a ZK queue or something, to be pulled by someone, which still needs 
coordination to make sure only one node is trying to do it).

On the other hand, cluster overseer code that might want to watch the cluster 
and change the configuration... a single cluster leader makes sense there (and 
they may end up also grabbing some sort of lock to avoid conflicts with what 
other nodes may do).

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-07 Thread Ted Dunning (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123217#comment-13123217
 ] 

Ted Dunning commented on SOLR-2358:
---

I think locks should be completely out of bounds if only because they are hell 
to deal with in the presence of failures.  This is a major reason that ZK is 
not a lock manager but supports atomic updates at a fundamental level.

State of a node doesn't need a lock.  The node should just update its own 
state and that state should be ephemeral so if the node disappears, the state 
reflects that.  Anybody who cares in a real-time kind of way about the state of 
that node should put a watch on that node's state.

Creating a new collection is relatively trivial without a lock as well.  One of 
the simplest ways is to simply put a specification of the new collection into a 
collections directory in ZK.  The cluster overseer sees the addition and it 
parcels out shard assignments to nodes.  The nodes see the assignments change 
and they take actions to conform to the specification, advertising their 
progress in their state files.  All that is needed here is atomic update which 
ZK does just fine.
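
As a rough illustration of the overseer step only, the sketch below parcels out the shard replicas of a new collection to live nodes. The spec format (name, shards, replicas) and the naive round-robin policy are assumptions made for this example; a real overseer would read the spec from ZK and write the result back with an atomic update.

```python
def assign_shards(collection_spec, live_nodes):
    """Round-robin every (shard, replica) slot of a collection across the live nodes."""
    nodes = sorted(live_nodes)
    assignments = {node: [] for node in nodes}
    slot = 0
    for shard in range(collection_spec["shards"]):
        for replica in range(collection_spec["replicas"]):
            node = nodes[slot % len(nodes)]
            assignments[node].append((collection_spec["name"], shard, replica))
            slot += 1
    return assignments
```

Plain round-robin does not guarantee that replicas of one shard land on different nodes for every combination of shard, replica, and node counts, so a production policy would need an explicit placement check.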

If it helps, there is a simplified form of this in Chapter 16 of Mahout in 
Action.  The source code for this example is available at 
https://github.com/tdunning/Chapter-16.  This example only has nodes, but the 
basic idea of parcelling out assignments is the same.

A summary of what I would suggest is this:

- three directories:
{code}
/collections
/node-assignments
/node-states
{code}
The /collections directory is updated by anybody wishing to advertise or delete 
a collection.  The node-assignments directory is updated only by the overseer.  
The node-states directory is updated by each node.

- one leader election file
{code}
/cluster-leader
{code}
All of the potential overseers try to create this file (ephemerally) and insert 
their IP and port.  The one that succeeds is the overseer, the others watch for 
the file to disappear.  On disconnect from ZK, the overseer stops acting as 
overseer, but does not tear down local state.  On reconnect, the overseer 
continues acting as overseer.  On session expiration, the overseer tears down 
local state and attempts to regain the leadership position.

The cluster overseer never needs to grab locks since atomic read-modify-write 
to node state is all that is required.  
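
The election on /cluster-leader can be sketched with an in-memory stand-in for ZooKeeper. FakeZk and its methods are hypothetical and are not the real ZooKeeper client API; they model only two behaviours: creating an ephemeral node fails if the path already exists, and a session's ephemeral nodes vanish when the session expires.

```python
class FakeZk:
    """In-memory stand-in for the small part of ZooKeeper used in this sketch."""

    def __init__(self):
        self.nodes = {}  # path -> (owning session, data)

    def create_ephemeral(self, path, data, session):
        """Create the node, or report failure if it already exists."""
        if path in self.nodes:
            return False
        self.nodes[path] = (session, data)
        return True

    def expire_session(self, session):
        """Drop every ephemeral node owned by the expired session."""
        self.nodes = {p: v for p, v in self.nodes.items() if v[0] != session}


def try_become_overseer(zk, session, address):
    """Whoever creates /cluster-leader first is the overseer."""
    return zk.create_ephemeral("/cluster-leader", address, session)
```

A losing candidate would set a watch on the file and call try_become_overseer again when it disappears; the watch mechanism is left out of this sketch.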

Again for emphasis,

1) cluster-wide locks are a bug in a scalable clustered system.  Leader 
election is an allowable special case.

2) locks are not required for clustered SOLR.

3) a lock-free design is incredibly simple to implement.



 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] [Commented] (SOLR-2358) Distributing Indexing

2011-10-07 Thread Ted Dunning (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13123231#comment-13123231
 ] 

Ted Dunning commented on SOLR-2358:
---

Mark,

How do you handle failure scenarios?

The failures I am curious about are:

- the leader fails, but a transaction is still sent to it because the client 
didn't get the memo in time

- the leader fails but has already written a transaction locally without having 
a chance to forward it to the followers

- the leader fails after writing locally and to the replicas but before sending 
an ACK

- a replica is partitioned from the cluster, a transaction is received and 
committed by all live replicas and then the failed index returns from the land 
of the living dead.

The bad behaviors that need to be avoided include

- document acked but not inserted

- document not acked, inserted again and two copies wind up in the index

- indexes diverge because some replicas commit a change while others do not

Two phase commit is not generally a viable solution for this in a cluster where 
failures can occur because it requires locks to be taken.  Once these locks are 
taken, the cluster cannot proceed until the locks are cleared and this cannot 
be done reliably in the presence of failures.

Zookeeper avoids this to a large degree by making updates idempotent before 
they are inserted into the update queue.  This means that if the updates are 
done more than once, most importantly during error recovery, that no error 
actually occurs.  This is what makes ZK able to take snapshots without stopping 
the world.  It does not entirely resolve the case of transactions that are 
committed but not acked.
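
To make the idempotency point concrete, here is a small sketch under assumed names: the leader turns a relative request ("increment n by 2") into an absolute transaction ("set n to 3") before it is queued, so replaying the transaction during recovery changes nothing. This illustrates the general technique and is not ZooKeeper's actual transaction format.

```python
def to_idempotent(leader_state, request):
    """Turn a relative ('incr', key, delta) request into an absolute ('set', key, value) txn."""
    op, key, delta = request
    if op != "incr":
        raise ValueError(f"unsupported request: {op}")
    return ("set", key, leader_state.get(key, 0) + delta)


def apply_txn(state, txn):
    """Apply an absolute transaction; applying it again leaves state unchanged."""
    _, key, value = txn
    state[key] = value
    return state
```

Replaying the original increment twice would have produced 5 from a starting value of 1, while replaying the converted transaction any number of times still yields 3.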



 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
  Components: SolrCloud, update
Reporter: William Mayor
Priority: Minor
 Fix For: 4.0

 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr




[jira] Commented: (SOLR-2358) Distributing Indexing

2011-02-16 Thread Alex Cowell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995570#comment-12995570
 ] 

Alex Cowell commented on SOLR-2358:
---

bq. Since this functionality is core to Solr and should always be present, it 
would be natural to either build it into the DirectUpdateHandler2 or to add 
this processor to the set of default UpdateProcessors that are executed if no 
update.processor parameter is specified.

What advantage would we gain from moving this functionality into 
DirectUpdateHandler2? From what I understand, the UpdateHandler deals directly 
with the index whereas the DistributedUpdateRequestProcessor merely takes 
requests deemed to be distributed by the request handler and distributes them 
to a list of shards based on a distribution policy. 
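
For context, a distribution policy of the kind mentioned here could be as small as the sketch below. The function name and the choice of MD5 are assumptions for illustration; the policy in the attached patch may work differently.

```python
import hashlib


def shard_for(doc_id, shards):
    """Pick a shard by hashing the document's unique id (stable across processes)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:4], "big") % len(shards)]
```

A cryptographic digest is used instead of Python's built-in hash() because the latter is randomized per process for strings, which would send the same id to different shards on different nodes.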

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
Reporter: William Mayor
Priority: Minor
 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr



[jira] Commented: (SOLR-2358) Distributing Indexing

2011-02-16 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12995628#comment-12995628
 ] 

Jan Høydahl commented on SOLR-2358:
---

I'm not sure if DirectUpdateHandler2 is the right location either. My point is 
that the user should not need to manually make sure that the UpdateProcessor is 
present in all his UpdateChains for distributed indexing to work. See new issue 
SOLR-2370 for a suggestion on how to tackle this.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
Reporter: William Mayor
Priority: Minor
 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr



[jira] Commented: (SOLR-2358) Distributing Indexing

2011-02-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12994766#comment-12994766
 ] 

Jan Høydahl commented on SOLR-2358:
---

See SOLR-2293 for some thoughts.

Since this functionality is core to Solr and should always be present, it would 
be natural to either build it into the DirectUpdateHandler2 or to add this 
processor to the set of default UpdateProcessors that are executed if no 
update.processor parameter is specified.

 Distributing Indexing
 -

 Key: SOLR-2358
 URL: https://issues.apache.org/jira/browse/SOLR-2358
 Project: Solr
  Issue Type: New Feature
Reporter: William Mayor
Priority: Minor
 Attachments: SOLR-2358.patch


 The first steps towards creating distributed indexing functionality in Solr
