Sure, here it is directly in the mail. Hopefully it does not get chopped.
{
"responseHeader": {
"zkConnected": true,
"status": 0,
"QTime": 29,
"params": {
"q": "task_coopProcessId:20021454",
"indent": "true",
"fl": "task_coopProcessId",
"q.op": "OR",
"debug.explain.structured": "true",
"debugQuery": "true",
"useParams": ""
}
},
"response": {
<same as in response without debug>
},
"debug": {
"track": {
"rid":
"<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886",
"EXECUTE_QUERY": {
"https://<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard1_replica_n4/":
{
"QTime": "0",
"ElapsedTime": "11",
"RequestPurpose":
"GET_TOP_IDS,SET_TERM_STATS",
"NumFound": "0",
"Response":
"{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
distrib=false, debug=[false, timing, track], fl=[id, score],
shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS, NOW=1713520858995,
isShard=true, wt=javabin, debugQuery=false, useParams=}},
response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]},
sort_values={}, debug={timing={time=0.0, prepare={time=0.0, query={time=0.0},
facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, terms={time=0.0},
debug={time=0.0}}, process={time=0.0, query={time=0.0}, facet={time=0.0},
facet_module={time=0.0}, mlt={time=0.0}, highlight={time=0.0},
stats={time=0.0}, expand={time=0.0}, terms={time=0.0}, debug={time=0.0}}}}}"
},
"https://<insert-project-name>-solrcloud-2.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard2_replica_n2/":
{
"QTime": "0",
"ElapsedTime": "13",
"RequestPurpose":
"GET_TOP_IDS,SET_TERM_STATS",
"NumFound": "0",
"Response":
"{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
distrib=false, debug=[false, timing, track], fl=[id, score],
shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS, NOW=1713520858995,
isShard=true, wt=javabin, debugQuery=false, useParams=}},
response={numFound=0,numFoundExact=true,start=0,maxScore=0.0,docs=[]},
sort_values={}, debug={timing={time=0.0, prepare={time=0.0, query={time=0.0},
facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, terms={time=0.0},
debug={time=0.0}}, process={time=0.0, query={time=0.0}, facet={time=0.0},
facet_module={time=0.0}, mlt={time=0.0}, highlight={time=0.0},
stats={time=0.0}, expand={time=0.0}, terms={time=0.0}, debug={time=0.0}}}}}"
},
"https://<insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/":
{
"QTime": "0",
"ElapsedTime": "17",
"RequestPurpose":
"GET_TOP_IDS,SET_TERM_STATS",
"NumFound": "4",
"Response":
"{responseHeader={zkConnected=true, status=0, QTime=0, params={df=_text_,
distrib=false, debug=[false, timing, track], fl=[id, score],
shards.purpose=16388, start=0, fsv=true, q.op=OR, rows=10,
rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
omitHeader=false, requestPurpose=GET_TOP_IDS,SET_TERM_STATS, NOW=1713520858995,
isShard=true, wt=javabin, debugQuery=false, useParams=}},
response={numFound=4,numFoundExact=true,start=0,maxScore=1.0,docs=[SolrDocument{id=task_46914,
score=1.0}, SolrDocument{id=task_46915, score=1.0},
SolrDocument{id=task_46916, score=1.0}, SolrDocument{id=task_46917,
score=1.0}]}, sort_values={}, debug={timing={time=0.0, prepare={time=0.0,
query={time=0.0}, facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, terms={time=0.0},
debug={time=0.0}}, process={time=0.0, query={time=0.0}, facet={time=0.0},
facet_module={time=0.0}, mlt={time=0.0}, highlight={time=0.0},
stats={time=0.0}, expand={time=0.0}, terms={time=0.0}, debug={time=0.0}}}}}"
}
},
"GET_FIELDS": {
"https://<insert-project-name>-solrcloud-0.<insert-project-name>-solrcloud-headless.<insert-project-name>:8983/solr/workflow_shard3_replica_n1/":
{
"QTime": "1",
"ElapsedTime": "4",
"RequestPurpose":
"GET_FIELDS,GET_DEBUG,SET_TERM_STATS",
"NumFound": "4",
"Response":
"{responseHeader={zkConnected=true, status=0, QTime=1, params={df=_text_,
distrib=false, debug=[timing, track], fl=[task_coopProcessId, id],
shards.purpose=16704, q.op=OR, rows=10,
rid=<insert-project-name>-solrcloud-1.<insert-project-name>-solrcloud-headless.<insert-project-name>-14886,
debug.explain.structured=true, version=2, q=task_coopProcessId:20021454,
omitHeader=false, requestPurpose=GET_FIELDS,GET_DEBUG,SET_TERM_STATS,
NOW=1713520858995, ids=task_46915,task_46914,task_46917,task_46916,
isShard=true, wt=javabin, debugQuery=true, useParams=}},
response={numFound=4,numFoundExact=true,start=0,docs=[SolrDocument{task_coopProcessId=20021454},
SolrDocument{task_coopProcessId=2008387},
SolrDocument{task_coopProcessId=20021454},
SolrDocument{task_coopProcessId=2008403}]},
debug={rawquerystring=task_coopProcessId:20021454,
querystring=task_coopProcessId:20021454,
parsedquery=(task_coopProcessId:[20021454 TO 20021454]),
parsedquery_toString=task_coopProcessId:[20021454 TO 20021454],
explain={task_46915={match=true, value=1.0,
description=task_coopProcessId:[20021454 TO 20021454]},
task_46914={match=false, value=0.0, description=task_coopProcessId:[20021454 TO
20021454] doesn't match id 30378}, task_46917={match=true, value=1.0,
description=task_coopProcessId:[20021454 TO 20021454]},
task_46916={match=false, value=0.0, description=task_coopProcessId:[20021454 TO
20021454] doesn't match id 30330}}, QParser=LuceneQParser, timing={time=1.0,
prepare={time=0.0, query={time=0.0}, facet={time=0.0}, facet_module={time=0.0},
mlt={time=0.0}, highlight={time=0.0}, stats={time=0.0}, expand={time=0.0},
terms={time=0.0}, debug={time=0.0}}, process={time=0.0, query={time=0.0},
facet={time=0.0}, facet_module={time=0.0}, mlt={time=0.0},
highlight={time=0.0}, stats={time=0.0}, expand={time=0.0}, terms={time=0.0},
debug={time=0.0}}}}}"
}
}
},
"timing": {
"time": 1,
"prepare": {
"time": 0,
"query": {
"time": 0
},
"facet": {
"time": 0
},
"facet_module": {
"time": 0
},
"mlt": {
"time": 0
},
"highlight": {
"time": 0
},
"stats": {
"time": 0
},
"expand": {
"time": 0
},
"terms": {
"time": 0
},
"debug": {
"time": 0
}
},
"process": {
"time": 0,
"query": {
"time": 0
},
"facet": {
"time": 0
},
"facet_module": {
"time": 0
},
"mlt": {
"time": 0
},
"highlight": {
"time": 0
},
"stats": {
"time": 0
},
"expand": {
"time": 0
},
"terms": {
"time": 0
},
"debug": {
"time": 0
}
}
},
"rawquerystring": "task_coopProcessId:20021454",
"querystring": "task_coopProcessId:20021454",
"parsedquery": "(task_coopProcessId:[20021454 TO 20021454])",
"parsedquery_toString": "task_coopProcessId:[20021454 TO 20021454]",
"QParser": "LuceneQParser",
"explain": {
"task_46914": {
"match": false,
"value": 0,
"description": "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30378"
},
"task_46915": {
"match": true,
"value": 1,
"description": "task_coopProcessId:[20021454 TO 20021454]"
},
"task_46916": {
"match": false,
"value": 0,
"description": "task_coopProcessId:[20021454 TO 20021454] doesn't match id 30330"
},
"task_46917": {
"match": true,
"value": 1,
"description": "task_coopProcessId:[20021454 TO 20021454]"
}
}
}
}
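For anyone following along: the two things that stand out in this output are the hits whose explain entry says match=false, and the fact that the track section lists more than one shard. Both can be pulled out of such a response with a few lines of Python. This is only a sketch. The function names and the trimmed sample dict are made up for illustration, and the host names in the sample are placeholders, not our real ones.

```python
import re


def mismatched_hits(solr_response: dict) -> list[str]:
    """Return the ids of returned documents whose explain entry has match=false."""
    explain = solr_response.get("debug", {}).get("explain", {})
    return sorted(
        doc_id
        for doc_id, entry in explain.items()
        if isinstance(entry, dict) and entry.get("match") is False
    )


def shards_queried(solr_response: dict) -> list[str]:
    """Return the shard names found in the core URLs of debug.track.EXECUTE_QUERY."""
    track = solr_response.get("debug", {}).get("track", {})
    found = set()
    for core_url in track.get("EXECUTE_QUERY", {}):
        # Core names look like <collection>_shardN_replica_nM
        m = re.search(r"_(shard\d+)_replica", core_url)
        if m:
            found.add(m.group(1))
    return sorted(found)


# Trimmed-down stand-in for the debug response above (hosts are placeholders).
sample = {
    "debug": {
        "track": {
            "EXECUTE_QUERY": {
                "https://host-1:8983/solr/workflow_shard1_replica_n4/": {"NumFound": "0"},
                "https://host-2:8983/solr/workflow_shard2_replica_n2/": {"NumFound": "0"},
                "https://host-0:8983/solr/workflow_shard3_replica_n1/": {"NumFound": "4"},
            }
        },
        "explain": {
            "task_46914": {"match": False, "value": 0},
            "task_46915": {"match": True, "value": 1},
            "task_46916": {"match": False, "value": 0},
            "task_46917": {"match": True, "value": 1},
        },
    }
}

print(mismatched_hits(sample))  # ['task_46914', 'task_46916']
print(shards_queried(sample))   # ['shard1', 'shard2', 'shard3']
```

The function only trusts the structured explain output, so it needs debug.explain.structured=true as in the request above.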
-----Original Message-----
From: Mikhail Khludnev <[email protected]>
Sent: Saturday, April 20, 2024 11:36
To: [email protected]
Cc: [email protected]
Subject: Re: Wrong documents in Response
Hello Dario.
The mailing list chopped the attachment, but the debugQuery output is what we
need to look at here.
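For reference, a minimal sketch of building such a request in Python. The base URL and the collection name "workflow" are assumptions for illustration only. debugQuery and debug.explain.structured are standard Solr query parameters. The helper name is hypothetical.

```python
from urllib.parse import urlencode


def debug_query_url(base_url: str, collection: str, query: str, fields: str) -> str:
    """Build a /select URL that asks Solr for structured debug/explain output."""
    params = {
        "q": query,
        "fl": fields,
        "debugQuery": "true",                # include the debug section in the response
        "debug.explain.structured": "true",  # return explain as nested objects, not text
    }
    return f"{base_url}/{collection}/select?{urlencode(params)}"


# Assumed host and collection name; replace with the real ones.
url = debug_query_url(
    "http://localhost:8983/solr",
    "workflow",
    "task_coopProcessId:20021454",
    "task_coopProcessId",
)
print(url)
```

Pasting the resulting JSON inline in the mail body avoids the attachment being stripped.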
On Fri, Apr 19, 2024 at 1:41 PM <[email protected]> wrote:
> Hello All,
>
>
>
> We have a relatively new Solr Instance:
>
> solr-spec: 9.5.0
>
> solr-impl: 9.5.0 cdd27dd15c3a6574032e9b1b92b148ab4e383599 - gerlowskija -
> 2024-02-07 15:10:39
>
>
>
> lucene-spec: 9.9.2
>
> lucene-impl: 9.9.2 a2939784c4ca60bc28bf488b5479c02fc2e5e22c - 2024-01-25
> 09:51:09
>
>
>
> JVM Runtime: Eclipse Adoptium OpenJDK 64-Bit Server VM 17.0.10 17.0.10+7
>
>
>
> We run the Solr instance in a Kubernetes cluster on GCP.
>
>
>
> We have two collections, but only one of them contains documents right now.
> We have indexed ~70,000 tasks (one of the document types we index) in that
> collection. In total there are ~100,000 documents in this collection.
>
> Note that in production we still use an older Solr version (8.11.2) with
> ~5,000,000 tasks, and the following problem does not appear there.
>
>
>
> The collections are all set up with the _default config and use only 1
> shard each. autoAddReplicas is configured to be false, the
> replicationFactor is 1, and even maxShardsPerNode is 1.
>
> Or at least that's how we configured the collections. In the debug
> response you will see that somehow multiple shards are at play.
>
>
>
> Now the problem:
>
> Every Task has a parent id - we call it processId. We use this processId
> to find all the tasks that belong to one process.
>
> By searching for this processId we expect to find all the tasks that
> belong to the corresponding process.
>
>
>
> For example, we have a process with the processId 20021454 (this is the
> real processId, I have chosen to show you the real number, because maybe
> this number is forbidden in solr?!).
>
> One would expect to find all the tasks that belong to this process when
> using this query: "task_coopProcessId:20021454".
>
> We know for a fact that this process contains exactly four tasks. That's
> also what solr returns - four tasks.
>
> But two of the tasks don't belong to the correct process.
>
> Below is the response we get from solr (to keep the response short, I have
> included the fl parameter, to only show the important info for this problem
> description).
>
> I have also included the result when showing debug info as an attachment
> (example.json). You will need to mentally replace <insert-project-name>
> with a real project name, that I am not going to name here.
>
>
>
> {
>   "responseHeader": {
>     "zkConnected": true,
>     "status": 0,
>     "QTime": 9,
>     "params": {
>       "q": "task_coopProcessId:20021454",
>       "indent": "true",
>       "fl": "task_coopProcessId",
>       "q.op": "OR",
>       "useParams": ""
>     }
>   },
>   "response": {
>     "numFound": 4,
>     "start": 0,
>     "maxScore": 1,
>     "numFoundExact": true,
>     "docs": [
>       {
>         "task_coopProcessId": 2008387
>       },
>       {
>         "task_coopProcessId": 20021454
>       },
>       {
>         "task_coopProcessId": 2008403
>       },
>       {
>         "task_coopProcessId": 20021454
>       }
>     ]
>   }
> }
>
>
>
> With kind regards,
>
>
>
> Dario Viva
>
>
>
>
>
--
Sincerely yours
Mikhail Khludnev