[ https://issues.apache.org/jira/browse/SOLR-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mathias Walter updated SOLR-1395:
---------------------------------

    Attachment: front-end.log
                back-end.log

I ported the patch to Solr 3.1 and Katta 0.6.2, except for the Katta test. I also fixed some bugs. The updated patch will be added soon.

In the meantime I discovered a big issue. A SolrKattaNode (back-end server) often hosts many shards. When a Solr front-end server starts a new query, it sends as many queries in parallel to each back-end server as that server hosts shards. In contrast, a Katta/Lucene search sends just one query to each back-end server, and that query covers all shards the back-end server hosts.

The problem is that the Solr front-end server often does not receive all KattaResponses from the back-end servers, so some queries time out and an exception is raised. Sometimes it is a {{NullPointerException}} in {{org.apache.solr.handler.component.QueryComponent.mergeIds}} (usually at startup of the front-end server) and sometimes a {{NullPointerException}} in {{org.apache.solr.handler.component.QueryComponent.returnFields}}:

{noformat}
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Done waiting, results = ClientResult: 0 results, 0 errors, 0/1 shards (id=6:0)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shutting down work queue, results = ClientResult: 0 results, 0 errors, 0/1 shards (id=6:0)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.ClientResult - close() called.
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.ClientResult - Notifying closed listener.
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shut down via ClientRequest.close()
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Shutdown() called (id=6)
TRACE 2010-08-18 10:32:25,729 [pool-3-thread-4] net.sf.katta.client.WorkQueue - Returning results = ClientResult: 0 results, 0 errors, 0/1 shards (closed), took 9989 ms (id=6:0)
DEBUG 2010-08-18 10:32:25,730 [pool-3-thread-4] net.sf.katta.client.Client - broadcast(request([null, org.apache.solr.katta.kattarequ...@180a1d7b]), {ibis46.gsf.de:20001=[sen-00000#sen-00000]}) took 10001 msec for ClientResult: 0 results, 0 errors, 0/1 shards (closed)
DEBUG 2010-08-18 10:32:25,730 [pool-3-thread-4] org.apache.solr.katta.KattaSearchHandler - KattaCommComponent shard: sen-00000 results.size: 0
WARN 2010-08-18 10:32:25,730 [pool-3-thread-4] org.apache.solr.katta.KattaSearchHandler - Received 0 responses for query [], not 1
ERROR 2010-08-18 10:32:25,731 [pool-1-thread-1] org.apache.solr.core.SolrCore - java.lang.NullPointerException
    at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:411)
    at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:308)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:284)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
    at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:52)
    at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1144)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
DEBUG 2010-08-18 10:37:55,295 [pool-3-thread-9] net.sf.katta.client.Client - broadcast(request([null, org.apache.solr.katta.kattarequ...@71ce5e7a]), {ibis46.gsf.de:20001=[sen-00003#sen-00003]}) took 10001 msec for ClientResult: 0 results, 0 errors, 0/1 shards (closed)
DEBUG 2010-08-18 10:37:55,295 [pool-3-thread-9] org.apache.solr.katta.KattaSearchHandler - KattaCommComponent shard: sen-00003 results.size: 0
WARN 2010-08-18 10:37:55,295 [pool-3-thread-9] org.apache.solr.katta.KattaSearchHandler - Received 0 responses for query [], not 1
ERROR 2010-08-18 10:37:55,296 [918077...@qtp-87740549-8] org.apache.solr.core.SolrCore - java.lang.NullPointerException
    at org.apache.solr.handler.component.QueryComponent.returnFields(QueryComponent.java:574)
    at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:312)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:284)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:926)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
{noformat}

Interestingly, the back-end servers process the queries immediately and send the results to the front-end server:

{noformat}
INFO 2010-08-18 10:37:45,325 [pool-13-thread-9] org.apache.solr.core.SolrCore - [sen-00003#sen-00003] webapp=null path=/select params={start=0&ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&q=Human&isShard=true&rows=10} status=0 QTime=7
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.solr.katta.SolrKattaServer - SolrServer.request: ibis46.gsf.de:20001 shards: [sen-00003#sen-00003] request params: start=0&ids=pubmed%3A1567687%3A1%3A0%2Cpubmed%3A17140099%3A8%3A0%2Cpubmed%3A12807258%3A6%3A0%2Cpubmed%3A11701068%3A3%3A0&q=Human&isShard=true&rows=10&shards=sen-00003%23sen-00003 rsp: {response={numFound=4,start=0,docs=[SolrDocument[{id=pubmed:1567687:1:0, type=sentence, lang=en, pubdate=Fri Dec 15 11:39:20 CET 2006}], SolrDocument[{id=pubmed:17140099:8:0, type=sentence, lang=en, pubdate=Thu Mar 01 11:40:18 CET 2007}], SolrDocument[{id=pubmed:12807258:6:0, type=sentence, lang=en, pubdate=Thu Jun 11 11:37:14 CEST 2009}], SolrDocument[{id=pubmed:11701068:3:0, type=sentence, lang=en, pubdate=Fri Apr 28 11:36:26 CEST 2006}]]},QueriedShards=[Ljava.lang.String;@791ef9f6}
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server - Served: request queueTime= 8 procesingTime= 17
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server - IPC Server Responder: responding to #30 from 146.107.217.46:58679
DEBUG 2010-08-18 10:37:45,326 [IPC Server handler 17 on 20001] org.apache.hadoop.ipc.Server - IPC Server Responder: responding to #30 from 146.107.217.46:58679 Wrote 386 bytes.
{noformat}

However, whenever the front-end server cancels a query because of a timeout, it is always the last KattaResponse sent that the front-end server did not recognize. I've attached a full communication log of one failed query for both the front-end ([^front-end.log]) and the back-end ([^back-end.log]) server.

Has anyone run into the same issue? I hope so, because the error occurs quite often. I assume this bug is related to Hadoop RPC, but I could not find a matching Hadoop JIRA issue. I also tried the latest release candidate of Hadoop, 0.21.0.

My idea now is to combine the parallel queries to one back-end server into a single query, similar to the Lucene queries implemented in Katta.
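The grouping step behind that idea could look roughly like the sketch below. It is only an illustration, not code from the patch or from Katta: the class {{ShardGrouping}} and the method {{groupByNode}} are hypothetical names, the shard-to-node map is assumed to be available on the front-end, and the code uses Java 8 collections. It inverts a shard-to-node assignment into a node-to-shards map, so that the front-end can send one request per back-end server rather than one per shard.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the SOLR-1395 patch or of Katta.
class ShardGrouping {

    // Inverts a shard -> node assignment into node -> shards, so the caller
    // can issue one request per node instead of one request per shard.
    static Map<String, List<String>> groupByNode(Map<String, String> shardToNode) {
        Map<String, List<String>> nodeToShards = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : shardToNode.entrySet()) {
            nodeToShards
                .computeIfAbsent(entry.getValue(), node -> new ArrayList<>())
                .add(entry.getKey());
        }
        return nodeToShards;
    }

    public static void main(String[] args) {
        // The first two shards and their node appear in the logs of this comment;
        // the third shard and the second node are made up for illustration.
        Map<String, String> shardToNode = new LinkedHashMap<>();
        shardToNode.put("sen-00000", "ibis46.gsf.de:20001");
        shardToNode.put("sen-00003", "ibis46.gsf.de:20001");
        shardToNode.put("sen-00007", "example-node:20001");

        // One line per back-end server, listing every shard the single query would cover.
        groupByNode(shardToNode).forEach((node, shards) ->
            System.out.println("one query to " + node + " for shards " + shards));
    }
}
```

With such a grouping, the number of requests in flight per front-end query would equal the number of back-end servers, not the number of shards. Whether that avoids the lost responses is untested here.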
> Integrate Katta
> ---------------
>
>                 Key: SOLR-1395
>                 URL: https://issues.apache.org/jira/browse/SOLR-1395
>             Project: Solr
>          Issue Type: New Feature
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: Next
>
>         Attachments: back-end.log, front-end.log, hadoop-core-0.19.0.jar,
> katta-core-0.6-dev.jar, katta.node.properties, katta.zk.properties,
> log4j-1.2.13.jar, solr-1395-1431-3.patch, solr-1395-1431-4.patch,
> solr-1395-1431-katta0.6.patch, solr-1395-1431-katta0.6.patch,
> solr-1395-1431.patch, SOLR-1395.patch, SOLR-1395.patch, SOLR-1395.patch,
> test-katta-core-0.6-dev.jar, zkclient-0.1-dev.jar, zookeeper-3.2.1.jar
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> We'll integrate Katta into Solr so that:
> * Distributed search uses Hadoop RPC
> * Shard/SolrCore distribution and management
> * Zookeeper based failover
> * Indexes may be built using Hadoop