[ 
https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mathias Walter updated SOLR-1395:
---------------------------------

    Attachment: front-end.log
                back-end.log

I ported the patch to Solr 3.1 and Katta 0.6.2, except for the Katta test, and 
fixed some bugs along the way. The updated patch will be attached soon.

In the meantime I discovered a big issue. A SolrKattaNode (back-end server) 
often hosts many shards. When a Solr front-end server starts a new query, it 
sends as many queries in parallel to each back-end server as that server has 
shards. In contrast, a Katta/Lucene search sends just one query to each 
back-end server, which then queries all shards that server hosts.
The problem is that the Solr front-end server often does not receive all 
KattaResponses from the back-end servers, so some queries time out and an 
exception is raised: sometimes a {{NullPointerException}} in 
{{org.apache.solr.handler.component.QueryComponent.mergeIds}} (usually at 
startup of the front-end server) and sometimes a {{NullPointerException}} in 
{{org.apache.solr.handler.component.QueryComponent.returnFields}}:

{noformat}
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Done waiting, results = ClientResult: 0 results, 0 errors, 0/1 shards (id=6:0)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shutting down work queue, results = ClientResult: 0 results, 0 errors, 0/1 shards (id=6:0)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.ClientResult - close() called.
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.ClientResult - Notifying closed listener.
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shut down via ClientRequest.close()
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shutdown() called (id=6)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Returning results = ClientResult: 0 results, 0 errors, 0/1 shards (closed), took 9989 ms (id=6:0)
DEBUG 2010-08-18 10:32:25,730 [pool-3-thread-4] net.sf.katta.client.Client - broadcast(request([null, org.apache.solr.katta.kattarequ...@180a1d7b]), {ibis46.gsf.de:20001=[sen-00000#sen-00000]}) took 10001 msec for ClientResult: 0 results, 0 errors, 0/1 shards (closed)
DEBUG 2010-08-18 10:32:25,730 [pool-3-thread-4] org.apache.solr.katta.KattaSearchHandler - KattaCommComponent shard: sen-00000 results.size: 0
 WARN 2010-08-18 10:32:25,730 [pool-3-thread-4] org.apache.solr.katta.KattaSearchHandler - Received 0 responses for query [], not 1
ERROR 2010-08-18 10:32:25,731 [pool-1-thread-1] org.apache.solr.core.SolrCore - java.lang.NullPointerException
        at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:411)
        at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:308)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:284)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
        at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
        at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1144)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)


DEBUG 2010-08-18 10:37:55,295 [pool-3-thread-9] net.sf.katta.client.Client - broadcast(request([null, org.apache.solr.katta.kattarequ...@71ce5e7a]), {ibis46.gsf.de:20001=[sen-00003#sen-00003]}) took 10001 msec for ClientResult: 0 results, 0 errors, 0/1 shards (closed)
DEBUG 2010-08-18 10:37:55,295 [pool-3-thread-9] org.apache.solr.katta.KattaSearchHandler - KattaCommComponent shard: sen-00003 results.size: 0
 WARN 2010-08-18 10:37:55,295 [pool-3-thread-9] org.apache.solr.katta.KattaSearchHandler - Received 0 responses for query [], not 1
ERROR 2010-08-18 10:37:55,296 [918077...@qtp-87740549-8] org.apache.solr.core.SolrCore - java.lang.NullPointerException
        at org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:574)
        at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:312)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:284)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:926)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{noformat}
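
Both traces follow the same pattern: the handler logs "Received 0 responses 
for query [], not 1" and a {{NullPointerException}} follows in the merge step, 
presumably because the missing shard result is used anyway. As an illustration 
only, here is a minimal sketch of a guard that would report the missing shards 
with a descriptive message instead. The class {{ShardResponseCheck}} and its 
methods are hypothetical names, not part of Solr, Katta, or the patch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper, for illustration only; not part of Solr or Katta. */
final class ShardResponseCheck {

    /** Returns the requested shards for which no response arrived, in request order. */
    static List<String> missingShards(Collection<String> requested,
                                      Collection<String> answered) {
        Set<String> missing = new LinkedHashSet<>(requested);
        missing.removeAll(answered);
        return new ArrayList<>(missing);
    }

    /** Throws a descriptive exception if any requested shard did not answer. */
    static void requireComplete(Collection<String> requested,
                                Collection<String> answered) {
        List<String> missing = missingShards(requested, answered);
        if (!missing.isEmpty()) {
            throw new IllegalStateException(
                "No response from shards " + missing + " ("
                + (requested.size() - missing.size()) + " of "
                + requested.size() + " answered)");
        }
    }
}
```

Calling such a check before the results are merged would turn the two stack 
traces above into one error that names the shard that did not answer.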

Interestingly, the back-end servers process the queries immediately and send 
the results to the front-end server:

{noformat}
 INFO 2010-08-18 10:37:45,325 [pool-13-thread-9] org.apache.solr.core.SolrCore - [sen-00003#sen-00003] webapp=null path=/select params={start=0&ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&q=Human&isShard=true&rows=10} status=0 QTime=7 
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.solr.katta.SolrKattaServer - SolrServer.request: ibis46.gsf.de:20001 shards: [sen-00003#sen-00003] request params: start=0&ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&q=Human&isShard=true&rows=10&shards=sen-00003%23sen-00003 rsp: {response={numFound=4,start=0,docs=[SolrDocument[{id=pubmed:1567687:1:0, type=sentence, lang=en, pubdate=Fri Dec 15 11:39:20 CET 2006}], SolrDocument[{id=pubmed:17140099:8:0, type=sentence, lang=en, pubdate=Thu Mar 01 11:40:18 CET 2007}], SolrDocument[{id=pubmed:12807258:6:0, type=sentence, lang=en, pubdate=Thu Jun 11 11:37:14 CEST 2009}], SolrDocument[{id=pubmed:11701068:3:0, type=sentence, lang=en, pubdate=Fri Apr 28 11:36:26 CEST 2006}]]},QueriedShards=[Ljava.lang.String;@791ef9f6}
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server - Served: request queueTime= 8 procesingTime= 17
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server - IPC Server Responder: responding to #30 from 146.107.217.46:58679
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server - IPC Server Responder: responding to #30 from 146.107.217.46:58679 Wrote 386 bytes.
{noformat}

But whenever the front-end server cancels a query because of a timeout, it is 
always the last KattaResponse sent that the front-end server did not 
recognize. I've attached a full communication log of one failed query for both 
the front-end ([^front-end.log]) and the back-end ([^back-end.log]) server.

Has anyone run into the same issue? I hope so, because the error occurs quite 
often. I assume this bug is related to Hadoop RPC, but I could not find a 
matching Hadoop JIRA issue. I also tried the latest release candidate of 
Hadoop, 0.21.0.

My idea now is to combine the parallel queries to one back-end server into a 
single query, similar to the Lucene queries implemented in Katta.
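
A minimal sketch of the grouping step behind that idea, for illustration only: 
invert the shard-to-node assignment so that each back-end server receives one 
request listing all of its shards. The class {{ShardGrouping}} and both methods 
are hypothetical and not part of the Katta or Solr API. Joining the shard names 
with commas into a single {{shards}} value is an assumption about how the 
combined request would be expressed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, for illustration only; not part of Katta or Solr. */
final class ShardGrouping {

    /** Inverts a shard -> node assignment into node -> shards, keeping insertion order. */
    static Map<String, List<String>> groupByNode(Map<String, String> shardToNode) {
        Map<String, List<String>> nodeToShards = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : shardToNode.entrySet()) {
            nodeToShards
                .computeIfAbsent(e.getValue(), node -> new ArrayList<>())
                .add(e.getKey());
        }
        return nodeToShards;
    }

    /** Builds one comma-separated shard list per node (assumed request format). */
    static Map<String, String> shardsParamPerNode(Map<String, List<String>> nodeToShards) {
        Map<String, String> params = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : nodeToShards.entrySet()) {
            params.put(e.getKey(), String.join(",", e.getValue()));
        }
        return params;
    }
}
```

With the two shards from the logs above, sen-00000 and sen-00003, both hosted 
on ibis46.gsf.de:20001, this yields one entry for that node instead of two 
parallel requests.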

> Integrate Katta
> ---------------
>
>                 Key: SOLR-1395
>                 URL: https://issues.apache.org/jira/browse/SOLR-1395
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: Next
>
>         Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar, 
> katta-core-0.6-dev.jar, katta.node.properties, katta.zk.properties, 
> log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch, 
> solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch, 
> solr-1395-1431.patch, SOLR-1395.patch, SOLR-1395.patch, SOLR-1395.patch, 
> test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We'll integrate Katta into Solr so that:
> * Distributed search uses Hadoop RPC
> * Shard/SolrCore distribution and management
> * Zookeeper based failover
> * Indexes may be built using Hadoop

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
