[jira] Commented: (SOLR-1044) Use Hadoop RPC for inter Solr communication

Ken Krugler (JIRA) Mon, 02 Mar 2009 13:05:18 -0800

    [ https://issues.apache.org/jira/browse/SOLR-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678108#action_12678108 ]


Ken Krugler commented on SOLR-1044:
-----------------------------------

I agree with both of Yonik's points:

# We'd first want to measure real-world performance before deciding that using 
something other than HTTP was important.
# Using something other than HTTP has related costs that should be considered.

At Krugle we used Hadoop RPC to handle remote searchers. In general it worked 
well, but we did run into a problem similar to the one Yonik raised as a 
potential concern: occasionally a remote searcher would hang, and when that 
happened the socket would essentially become a zombie. Under very heavy load 
testing, this eventually caused the entire system to lock up.

We did hear that subsequent changes to Hadoop RPC fixed a number of similar 
bugs. I'm not sure of the details, though, and we never re-ran our tests with 
the latest Hadoop (as of that time, which was about a year ago).
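
To make the failure mode concrete, here is a minimal sketch of one way to keep a hung remote searcher from tying up its caller: run each remote call under a deadline and fall back to a partial result when the deadline passes. This is not what Krugle ran and it is not Hadoop RPC code; the class name BoundedRemoteCall, the method callWithTimeout, and the "partial" fallback value are all hypothetical, and the sketch assumes a modern JDK (Java 8 or later).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of Solr or Hadoop.
public class BoundedRemoteCall {

    /**
     * Runs the call on the pool and waits at most timeoutMillis for it.
     * On timeout the task is cancelled (its thread is interrupted) and the
     * fallback is returned; a failed call also yields the fallback.
     */
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> call,
                                 long timeoutMillis, T fallback)
            throws InterruptedException {
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the stuck worker thread
            return fallback;
        } catch (ExecutionException e) {
            return fallback;     // the remote call itself threw
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // A searcher that answers promptly.
            String ok = callWithTimeout(pool, () -> "3 hits", 2000, "partial");
            // A searcher that hangs, simulated here with a long sleep.
            String hung = callWithTimeout(pool, () -> {
                Thread.sleep(60_000);
                return "never";
            }, 200, "partial");
            System.out.println(ok + " / " + hung); // prints "3 hits / partial"
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Interrupting the worker only helps if the blocked I/O responds to interruption, so a real transport would presumably also need socket-level read and connect timeouts.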

If there are performance issues, I would be curious whether using a long-lived 
connection via keep-alive significantly reduces the overhead. I know that Jetty 
(for example) has a very efficient implementation of the Comet web app model, 
where you don't wind up needing a gazillion threads to handle many 
requests/second.
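
As a rough way to check whether keep-alive is actually in effect before measuring its benefit, the sketch below sends several sequential requests to a throwaway local server and counts how many distinct client sockets the server saw. It uses only the JDK's built-in java.net.http client and com.sun.net.httpserver server (Java 11 or later), not Solr or Jetty; the class name KeepAliveProbe, the method distinctConnections, and the /select path are hypothetical stand-ins.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical probe, not part of Solr.
public class KeepAliveProbe {

    /**
     * Sends n sequential GETs to a local test server and returns the number
     * of distinct client ports the server observed. A count well below n
     * suggests that connections are being reused.
     */
    static int distinctConnections(int n) throws Exception {
        Set<Integer> clientPorts = ConcurrentHashMap.newKeySet();
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/select", exchange -> {
            clientPorts.add(exchange.getRemoteAddress().getPort());
            exchange.getRequestBody().readAllBytes(); // drain the (empty) request body
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        try {
            // Force HTTP/1.1 so that reuse happens through keep-alive.
            HttpClient client = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1)
                    .build();
            URI uri = URI.create("http://127.0.0.1:"
                    + server.getAddress().getPort() + "/select?q=test");
            for (int i = 0; i < n; i++) {
                HttpResponse<String> response = client.send(
                        HttpRequest.newBuilder(uri).GET().build(),
                        HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() != 200) {
                    throw new IllegalStateException("unexpected status " + response.statusCode());
                }
            }
        } finally {
            server.stop(0);
        }
        return clientPorts.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("distinct connections for 5 requests: " + distinctConnections(5));
    }
}
```

Timing the same loop with and without connection reuse would give a first estimate of how much of the HTTP overhead is connection setup, which is the measurement that would need to come first anyway.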

> Use Hadoop RPC for inter Solr communication
> -------------------------------------------
>
>                 Key: SOLR-1044
>                 URL: https://issues.apache.org/jira/browse/SOLR-1044
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Noble Paul
>
> Solr uses HTTP for distributed search. We can make it a whole lot faster if 
> we use an RPC mechanism which is more lightweight/efficient. 
> Hadoop RPC looks like a good candidate for this.  
> The implementation should have just one protocol. It should follow Solr's 
> idiom for making remote calls: a URI + params + [optional stream(s)]. The 
> response can be a stream of bytes.
> To make this work we must make the SolrServer implementation pluggable in 
> distributed search. Users should be able to choose between the current 
> CommonsHttpSolrServer and a HadoopRpcSolrServer.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
