Hi, Recently we have been trying to scale up our Solandra setup to make use of a more powerful server. To improve query speeds we tried reducing the index size, and thus increasing the number of shards on a single machine. While we had no trouble searching and return results when we had a single shard, with multiple shards we are getting an interesting bug where the search is able to find the matching documents and returns an accurate count, but returns 0 documents. What could be causing this problem? Are we missing an obvious parameter? While using multiple shards, if I set isShard=true, we do get back results with total number found from only one shard. We tried hitting each of the cores directly by using the setParam from SolrJ, but are not getting any results back. We found the names of the core from the logs (ie.10.1.10.200:8983/solandra/df~1).
To debug we have set up a test environment that has the latest release of Solandra and uses the default settings. In this setting with just two shards we are seeing the same issue. We have also tested with different size shards ranging from 256 to 4194304. In all cases as soon as we have more than 1 shard Solandra stops returning results even though it is clear the results were found. Below is some of the log information. Server specs: 8 cores 32 GB of memory (though we are only allocating 16GB for Solandra) Using the default settings in Solandra Properties, we added in 2,000,000 documents to ensure there were two shards on the same machine. For a query that has not been cached: DEBUG 10:28:55,004 core: df DEBUG 10:28:55,005 Adding shard(df): 10.1.10.200:8983/solandra/df~0 DEBUG 10:28:55,005 Adding shard(df): 10.1.10.200:8983/solandra/df~1 DEBUG 10:28:55,014 Fetching 0 Docs INFO 10:28:55,015 [df] webapp=/solandra path=/select params={fl=key,score&start=0&q=province:az&isShard=true&wt=javabin&fsv=true&rows=10&version=2} hits=0 status=0 QTime=3 INFO 10:28:55,821 GC for ParNew: 258 ms, 586012000 reclaimed leaving 2122387984 used; max is 16955473920 DEBUG 10:28:58,034 Fetching 10 Docs DEBUG 10:28:58,035 Going to bulk load 10 documents DEBUG 10:28:58,099 Document read took: 63ms INFO 10:28:58,099 [df] webapp=/solandra path=/select params={fl=key,score&start=0&q=province:az&isShard=true&wt=javabin&fsv=true&rows=10&version=2} hits=99470 status=0 QTime=3087 DEBUG 10:28:58,101 Document read took: 1ms DEBUG 10:28:58,102 Document read took: 1ms DEBUG 10:28:58,104 Document read took: 1ms DEBUG 10:28:58,105 Document read took: 1ms DEBUG 10:28:58,107 Document read took: 2ms DEBUG 10:28:58,108 Document read took: 1ms DEBUG 10:28:58,109 Document read took: 1ms DEBUG 10:28:58,110 Document read took: 1ms DEBUG 10:28:58,112 Document read took: 1ms DEBUG 10:28:58,113 Document read took: 1ms DEBUG 10:28:58,118 Fetching 0 Docs INFO 10:28:58,118 [df] webapp=/solandra path=/select params={isShard=true&wt=javabin&q=province:az&ids=[us/az/yuma/1152s4thave],[us/az/tempe/208sriverdr],[us/az/mundspark/475pinewoodblvd],[us/az/phoenix/2338wstellaln],[us/az/tucson/3341wwildwooddr],[us/az/surprise/15128wbellrd],[us/az/phoenix/3222egeorgiaave],[us/az/lakehavasucity/2250catamarandr],[us/az/huachucacity/264shuachucablvd],[us/az/tucson/6161sparkave]&version=2} status=0 QTime=1 INFO 10:28:58,119 [df] webapp=/solandra path=/select params={wt=javabin&q=province:az&version=2} status=0 QTime=3115 For a query that has been cached: DEBUG 10:27:36,350 core: df INFO 10:27:36,351 ShardInfo for df has expired INFO 10:27:36,353 Found reserved shard1(106758077800188110322537822484278066430):178410 TO 180224 DEBUG 10:27:36,353 Adding shard(df): 10.1.10.200:8983/solandra/df~0 DEBUG 10:27:36,353 Adding shard(df): 10.1.10.200:8983/solandra/df~1 DEBUG 10:27:36,359 Fetching 0 Docs INFO 10:27:36,360 [df] webapp=/solandra path=/select params={fl=key,score&start=0&q=province:ak&isShard=true&wt=javabin&fsv=true&rows=10&version=2} hits=0 status=0 QTime=2 DEBUG 10:27:36,362 Fetching 10 Docs DEBUG 10:27:36,363 Found doc in cache INFO 10:27:36,363 [df] webapp=/solandra path=/select params={fl=key,score&start=0&q=province:ak&isShard=true&wt=javabin&fsv=true&rows=10&version=2} hits=14707 status=0 QTime=5 DEBUG 10:27:36,363 Found doc in cache DEBUG 10:27:36,363 Found doc in cache DEBUG 10:27:36,363 Found doc in cache DEBUG 10:27:36,364 Found doc in cache DEBUG 10:27:36,364 Found doc in cache DEBUG 10:27:36,364 Found doc in cache DEBUG 10:27:36,364 Found doc in cache DEBUG 10:27:36,364 Found doc in cache DEBUG 10:27:36,365 Found doc in cache DEBUG 10:27:36,365 Found doc in cache DEBUG 10:27:36,369 Fetching 0 Docs INFO 10:27:36,369 [df] webapp=/solandra path=/select params={isShard=true&wt=javabin&q=province:ak&ids=[us/ak/fairbanks/1483ballainerd],[us/ak/anchorage/4451etudorrd],[us/ak/anchorage/600cordovast],[us/ak/anchorage/6048e6thave],[us/ak/anchorage/940tyonekdr],[us/ak/fairbanks/3800universityaves],[us/ak/kenai/47189sherwoodcir],[us/ak/anchorage/12801oldsewardhwy],[us/ak/anchorage/8400raintreecir],[us/ak/juneau/9150skywoodln]&version=2} status=0 QTime=1 INFO 10:27:36,370 [df] webapp=/solandra path=/select params={wt=javabin&q=province:ak&version=2} status=0 QTime=20 In case it was due to the way we were adding the documents, here is the log for a single write/update: INFO 10:33:14,666 {add=[us/ak/anchorage/429industrialway]} 0 8 INFO 10:33:14,666 [df] webapp=/solandra path=/update params={wt=javabin&version=2} status=0 QTime=8 DEBUG 10:33:14,683 update for document 606298 DEBUG 10:33:14,683 Adding 606298 to df~0 DEBUG 10:33:14,685 Deleted all terms for: 606298 DEBUG 10:33:14,685 df~0 - firstTerm: 142390502986727170797762286641249888713���key DEBUG 10:33:14,685 df~0 - firstTerm: 142390502986727170797762286641249888713���category DEBUG 10:33:14,686 df~0 - firstTerm: 142390502986727170797762286641249888713���category DEBUG 10:33:14,686 df~0 - firstTerm: 142390502986727170797762286641249888713���category DEBUG 10:33:14,686 df~0 - firstTerm: 142390502986727170797762286641249888713���category DEBUG 10:33:14,687 df~0 - firstTerm: 142390502986727170797762286641249888713���category DEBUG 10:33:14,687 df~0 - firstTerm: 142390502986727170797762286641249888713���source DEBUG 10:33:14,687 df~0 - firstTerm: 142390502986727170797762286641249888713���dateAdded DEBUG 10:33:14,687 df~0 - firstTerm: 142390502986727170797762286641249888713���dateUpdated DEBUG 10:33:14,687 df~0 - firstTerm: 142390502986727170797762286641249888713���type DEBUG 10:33:14,687 df~0 - firstTerm: 142390502986727170797762286641249888713���encoding DEBUG 10:33:14,688 df~0 - firstTerm: 142390502986727170797762286641249888713���postalcode DEBUG 10:33:14,688 df~0 - firstTerm: 142390502986727170797762286641249888713���sic DEBUG 10:33:14,688 df~0 - firstTerm: 142390502986727170797762286641249888713���phone DEBUG 10:33:14,688 df~0 - firstTerm: 142390502986727170797762286641249888713���long DEBUG 10:33:14,688 df~0 - firstTerm: 142390502986727170797762286641249888713���city DEBUG 10:33:14,688 df~0 - firstTerm: 142390502986727170797762286641249888713���country DEBUG 10:33:14,689 df~0 - firstTerm: 142390502986727170797762286641249888713���address DEBUG 10:33:14,689 df~0 - firstTerm: 142390502986727170797762286641249888713���name DEBUG 10:33:14,689 df~0 - firstTerm: 142390502986727170797762286641249888713���province DEBUG 10:33:14,689 df~0 - firstTerm: 142390502986727170797762286641249888713���lat