The process looks like this: each shard returns its top 100K documents (actually just the doc ID and whatever your sort criteria are, often just the score). That's 100K _from every shard_. The node that distributed the request then takes those 900K items and merges the lists to get the 100K that satisfy the request. Then that node sends out a second request to each shard to get the actual documents, perhaps asking for all 100K from a single node.
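To make the two phases concrete, here is a rough Python sketch of the scatter/gather described above. It is a toy model, not Solr's actual code: the function names (`shard_top_k`, `merge_top_k`, `group_for_fetch`) are made up for illustration, the shard contents are random, and the numbers are scaled down (9 shards with k=100 standing in for the 900K items merged down to 100K).

```python
import heapq
import random
from collections import defaultdict


def shard_top_k(shard_id, docs, k):
    """Phase 1 on a shard: return only (score, shard_id, doc_id) for its k best docs.

    `docs` is a list of (doc_id, score) pairs held by this hypothetical shard.
    """
    best = heapq.nlargest(k, docs, key=lambda d: d[1])
    return [(score, shard_id, doc_id) for doc_id, score in best]


def merge_top_k(per_shard_lists, k):
    """Phase 1 on the coordinating node: merge all shard lists, keep the global top k."""
    all_entries = [entry for lst in per_shard_lists for entry in lst]
    return heapq.nlargest(k, all_entries, key=lambda e: e[0])


def group_for_fetch(winners):
    """Phase 2: group the winning doc IDs by shard for the follow-up fetch requests."""
    by_shard = defaultdict(list)
    for _score, shard_id, doc_id in winners:
        by_shard[shard_id].append(doc_id)
    return dict(by_shard)


if __name__ == "__main__":
    random.seed(0)
    num_shards, k = 9, 100  # assumed, scaled-down stand-ins for 9 shards and rows=100000
    shards = {
        s: [(f"doc-{s}-{i}", random.random()) for i in range(1000)]
        for s in range(num_shards)
    }
    lists = [shard_top_k(s, docs, k) for s, docs in shards.items()]
    print("items the coordinator must merge:", sum(len(l) for l in lists))
    winners = merge_top_k(lists, k)
    for shard_id, ids in sorted(group_for_fetch(winners).items()):
        print(f"shard {shard_id}: fetch {len(ids)} documents")
```

The thing to notice is that the coordinating node always has to hold and sort `num_shards * k` entries before it can return anything, and nothing stops all k winners from living on one shard, so the cost grows with `rows` on every node involved.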
As Walter and Shalin say, don't do this. Solr isn't built to return massive result sets. Performance will suffer and you'll run into "interesting" limits like this. Why do you want to do this in the first place?

Best
Erick

On Fri, Jul 5, 2013 at 10:26 AM, eakarsu <eaka...@gmail.com> wrote:

> I am using Solr 4.3.1 on solrcloud with 10 nodes.
>
> I added 3 million documents from a csv file with this command
>
> curl 'http://localhost:8080/solr/trcollection2/update/csv?stream.file=/home/hduser/csvFile.csv&skipLines=1&fieldnames=,cache,segment,digest,tstamp,lang,url,,content,id,title,boost&stream.contentType=text/plain;charset=utf-8'
>
> Then I query the data, fetching first 100K documents with this. But I am
> getting "Invalid version (expected 2, but 60) or the data in not in
> 'javabin' format" error. I have appended what I got in output file
> "alltrcollection_309mil.csv"
>
> I appreciate if you can help me on this
>
> hduser@host1:~$ curl -o alltrcollection_309mil.csv 'http://localhost:8080/solr/trcollection2/select?q=*%3A*&rows=100000&wt=xml&indent=true'
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  3439    0  3439    0     0    103      0 --:--:--  0:00:33 --:--:--  1049
> hduser@host1:~$ more alltrcollection_309mil.csv
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>
> <lst name="responseHeader">
>   <int name="status">500</int>
>   <int name="QTime">33307</int>
>   <lst name="params">
>     <str name="indent">true</str>
>     <str name="q">*:*</str>
>     <str name="wt">xml</str>
>     <str name="rows">100000</str>
>   </lst>
> </lst>
> <lst name="error">
>   <str name="msg">java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format</str>
>   <str name="trace">org.apache.solr.common.SolrException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
>     at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:302)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1820)
>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
>     at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>     at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>     at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>     at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>     at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>     at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
>     at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>     at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
>     at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:109)
>     at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>     at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156)
>     at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>     ... 3 more
>   </str>
>   <int name="code">500</int>
> </lst>
> </response>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Invalid-version-expected-2-but-60-or-the-data-in-not-in-javabin-format-tp4075739.html
> Sent from the Solr - User mailing list archive at Nabble.com.