On 10/17/2017 5:53 PM, Phillip Wu wrote:
> I've indexed a lot of documents (*.docx & *.vsd).
>
> When I run a query from the website it returns only a small proportion of the 
> data in the index:
> {
> "responseHeader":{
> "status":0,
> "QTime":66,
> "params":{
>    "q":"NS Finance 9.2",
>    "fl":"id,date",
>    "start":"0",
>    "_":"1508193512223"}},
> "response":{"numFound":2053,"start":0,"docs":[
> ..here it returns only 9 documents of type *.doc
> ]

This shows a numFound value of 2053.  That means the query matched
2053 documents in the index.  If the results only contained 9 documents,
then the rows parameter was probably set to 9.  I'm betting that this is
a setting in the request handler definition in solrconfig.xml.  If you
want all results in a single request, you're going to have to increase
the rows parameter, which can also be set on a per-request basis.  There
is no rows value that always means "all documents".  You must tell Solr
how many you want it to return.
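
As a rough illustration of the numFound check described above, here is a
small Python sketch that compares numFound with the number of docs
actually returned.  The sample response is modeled on the one quoted
above; the document ids in it are made-up placeholders, and the function
name is my own, not part of any Solr client.

```python
import json

def summarize(response_text):
    """Return (num_found, returned, truncated) for a Solr JSON response."""
    body = json.loads(response_text)["response"]
    num_found = body["numFound"]
    returned = len(body["docs"])
    # Truncated means there are matches beyond this page of results.
    truncated = body["start"] + returned < num_found
    return num_found, returned, truncated

# Placeholder docs: the ids are hypothetical, only the counts matter here.
sample = json.dumps({
    "responseHeader": {"status": 0, "QTime": 66},
    "response": {
        "numFound": 2053,
        "start": 0,
        "docs": [{"id": "placeholder-%d" % i} for i in range(9)],
    },
})

num_found, returned, truncated = summarize(sample)
print(num_found, returned, truncated)
```

If truncated is true, the index matched more documents than the response
carried, so the limit is rows, not the query.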

Alternatively, you can get further results with multiple requests by
paging with the rows and start parameters:

https://lucene.apache.org/solr/guide/6_6/pagination-of-results.html
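
A minimal sketch of that paging approach, in Python, might look like the
following.  It only builds the request URLs, stepping start by rows until
numFound is covered.  The host, port, and core name ("mycore") are
assumptions for illustration, as is the page size of 500.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycore/select"

def page_urls(query, num_found, rows=100, fields="id,date"):
    """Yield one select URL per page, advancing start by rows each time."""
    for start in range(0, num_found, rows):
        params = {
            "q": query,
            "fl": fields,
            "start": start,
            "rows": rows,
            "wt": "json",
        }
        yield SOLR_SELECT_URL + "?" + urlencode(params)

# numFound comes from the first response; 2053 is the value quoted above.
urls = list(page_urls("NS Finance 9.2", num_found=2053, rows=500))
for url in urls:
    print(url)
```

In practice you would send the first request, read numFound from its
response, and then fetch the remaining pages.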

It is a bad idea to set a very large rows value on all requests.  This
is because Solr allocates a block of memory to reference the results
based purely on the value of the rows parameter, regardless of the
actual number of matches.  If the rows value is large, that memory block
will be large, which is wasted when few documents match and can lead to
problems with too much garbage collection.

> I know the search text occurs in some of the *.vsd files so I re-run:
> {
> "responseHeader":{
> "status":0,
> "QTime":754,
> "params":{
> "q":"\"NS Finance 9.2\" id:*FIN*.vsd",
> "fl":"id,date", "_":"1508193512223"}},
> "response":{"numFound":9,"start":0,"docs":[
> ..here it returns only 9 documents of *.vsd
> ]

For this query, numFound is 9, so this time it actually is showing all
the results.

Thanks,
Shawn
