The process looks like this:

Each shard returns its top 100K documents
(actually just the doc ID and whatever your
sort criteria are, often just the score). The
node that distributes the request then takes
those 900K items from every shard and merges
the lists to get the 100K that satisfy the
request. Then the node sends out a second
request to each shard to get the actual
documents, perhaps asking for all 100K from
a single node.
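
To make the two phases concrete, here is a toy in-memory sketch in
Python. It is not Solr's code; the function names and the dict-based
"shard" layout (doc ID mapped to a score and a stored document) are
assumptions made up for illustration. It only shows why the
coordinating node has to merge up to shards x rows entries before it
can fetch a single document.

```python
import heapq
from collections import defaultdict
from itertools import islice


def shard_top_n(shard_no, shard, n):
    """Phase 1, on one shard: return (score, doc_id, shard_no) for its best n docs.

    `shard` is a hypothetical dict: doc_id -> (score, stored_document).
    """
    entries = ((score, doc_id, shard_no) for doc_id, (score, _body) in shard.items())
    return heapq.nlargest(n, entries)  # sorted best-first


def distributed_query(shards, n):
    """Coordinator: merge per-shard lists, then fetch only the winning documents."""
    # Phase 1: every shard sends back up to n lightweight entries.
    per_shard = [shard_top_n(i, shard, n) for i, shard in enumerate(shards)]
    merged_count = sum(len(entries) for entries in per_shard)

    # Merge the already-sorted lists and keep the global top n.
    winners = list(islice(heapq.merge(*per_shard, reverse=True), n))

    # Phase 2: ask each shard only for the winning documents it owns.
    wanted = defaultdict(list)
    for _score, doc_id, shard_no in winners:
        wanted[shard_no].append(doc_id)
    bodies = {}
    for shard_no, doc_ids in wanted.items():
        for doc_id in doc_ids:
            bodies[doc_id] = shards[shard_no][doc_id][1]

    results = [(doc_id, score, bodies[doc_id]) for score, doc_id, _ in winners]
    return results, merged_count


if __name__ == "__main__":
    demo_shards = [
        {"a": (0.9, "doc a"), "b": (0.1, "doc b")},
        {"c": (0.8, "doc c"), "d": (0.7, "doc d")},
        {"e": (0.2, "doc e")},
    ]
    top, merged = distributed_query(demo_shards, 2)
    print(top, merged)
```

In this model the work done in phase 1 grows with rows times the
number of shards, whatever the size of the final answer, which is the
cost described above.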

As Walter and Shalin say, don't do this. Solr
isn't built to return massive result sets.
Performance will suffer and you'll run into
"interesting" limits like this.

Why do you want to do this in the first place?

Best
Erick


On Fri, Jul 5, 2013 at 10:26 AM, eakarsu <eaka...@gmail.com> wrote:

> I am using Solr 4.3.1 on solrcloud with 10 nodes.
>
> I added 3 million documents from a csv file with this command
>
> curl 'http://localhost:8080/solr/trcollection2/update/csv?stream.file=/home/hduser/csvFile.csv&skipLines=1&fieldnames=,cache,segment,digest,tstamp,lang,url,,content,id,title,boost&stream.contentType=text/plain;charset=utf-8'
>
> Then I query the data, fetching first 100K documents with this. But I am
> getting "Invalid version (expected 2, but 60) or the data in not in
> 'javabin' format" error. I have appended what I got in output file
> "alltrcollection_309mil.csv"
>
> I appreciate if you can help me on this
>
> hduser@host1:~$ curl -o alltrcollection_309mil.csv 'http://localhost:8080/solr/trcollection2/select?q=*%3A*&rows=100000&wt=xml&indent=true'
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
> 100  3439 0 3439 0     0    103      0 --:--:--  0:00:33 --:--:--  1049
> hduser@host1:~$ more alltrcollection_309mil.csv
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
>
> <lst name="responseHeader">
>   <int name="status">500</int>
>   <int name="QTime">33307</int>
>   <lst name="params">
>     <str name="indent">true</str>
>     <str name="q">*:*</str>
>     <str name="wt">xml</str>
>     <str name="rows">100000</str>
>   </lst>
> </lst>
> <lst name="error">
>   <str name="msg">java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format</str>
>   <str name="trace">org.apache.solr.common.SolrException: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
>         at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:302)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1820)
>         at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:656)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:359)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
>         at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
>         at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
>         at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
>         at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
>         at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
>         at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
>         at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:931)
>         at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
>         at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:407)
>         at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1004)
>         at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
>         at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
> Caused by: java.lang.RuntimeException: Invalid version (expected 2, but 60) or the data in not in 'javabin' format
>         at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:109)
>         at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:41)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:385)
>         at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
>         at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:156)
>         at org.apache.solr.handler.component.HttpShardHandler$1.call(HttpShardHandler.java:119)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         ... 3 more
> </str>
>   <int name="code">500</int>
> </lst>
> </response>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Invalid-version-expected-2-but-60-or-the-data-in-not-in-javabin-format-tp4075739.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
