On Mon, Nov 11, 2019 at 8:32 PM Chris Hostetter <hossman_luc...@fucit.org> wrote:
>
> Based on the info provided, it's hard to be certain, but reading between
> the lines here are the assumptions I'm making...
>
> 1) your core name is "dbtr"
> 2) the uniqueId field for the "dbtr" core is "debtor_id"
>
> ..are those assumptions correct?

Yes they are. Sorry I didn't provide that from the beginning.

> Two key pieces of information that don't seem to be assumable from the
> info you've provided:
>
> a) What is the fieldType of the uniqueKey field in use?

It is a textField.

> b) how are you determining that "The numFound: 35008"

I do a preliminary query to the solr core and print out the numFound from
this:

    my $solrResponse = $ua->post( $solrURI );
    my $decoded = decode_json( $solrResponse->{_content} );
    my $numFound = $decoded->{response}{numFound};

> ...
>
> You show the code that prints out "size of solrResults: 22006" but nothing
> in your code ever prints $numFound. there is a snippet of code at the top

I am printing numFound every time it loops. This should remain constant,
because it is the total of all documents found. It's not really necessary
that I print it. The number of docs is the size that I also print, and
that is 1000 every time, until the last little bit, and then it is 6 docs
found.

> of your perl logic that seems disconnected from the rest of the code which
> makes me think that before you do anything with a cursor you are already
> parsing some *other* query response to get $numFound that way...

I am running this query first, to get the cursor set:

    "http://10.40.10.14:8983/solr/debt/select?indent=on&rows=1000&sort=id asc&q=debt_id: 608384 OR debt_id: 393291&cursorMark=*"

This sets the cursor, and then returns a cursorMark that I start using in
order to grab 1000 documents at a time.

> ...what exactly does all the code *before* this look like? what is the
> request that you are using to get that initial '$solrResponse' that you
> are parsing to extract '$numFound'? are you sure it's exactly the same as
> the query whose cursor you are iterating over?

query from before the loop:

    "http://10.40.10.14:8983/solr/debt/select?indent=on&rows=1000&sort=id asc&q=debt_id: 608384 OR debt_id: 393291&cursorMark=*"

query in the loop:

    http://10.40.10.14:8983/solr/debt/select?indent=on&rows=1000&sort=id+asc&q=debt_id: 608384 OR debt_id: 393291&cursorMark=AoElMTg1MzE=

I do have some logic to make sure I grab the first 1000 from the first
query, but other than that, it's a simple loop.

> It looks like you are (also) extracting 'my $numFound =
> $decoded->{response}{numFound};' on every (cursor) request ... what do you
> get if you add this to your cursor loop...
>
>    print STDERR "numFound = $numFound at '$cursor'";

numFound is always 35008 because that is how many total documents are
found. The number of docs in the response is the number that I care about,
because that shows me how many came back for this slice.

> ...because unless documents are being added/deleted as you iterate over
> the cursor, the numFound value should be consistent on each request.

numFound is consistently 35008.

Thanks
Rhys
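
For reference, the cursor loop discussed in this message can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual script: fetch_all_with_cursor and the $fetch_page callback are hypothetical names, and the HTTP request is replaced by an in-memory stand-in so the example runs without a Solr server. The only Solr behaviour it relies on is that the end of a cursor is signalled by nextCursorMark coming back equal to the cursorMark that was sent.

```perl
use strict;
use warnings;

# Walk a Solr cursor to the end, collecting every document.
# $fetch_page is a hypothetical callback: given a cursorMark string, it
# returns the decoded JSON response (a hashref) for that page.
sub fetch_all_with_cursor {
    my ($fetch_page) = @_;
    my $cursor = '*';    # '*' asks Solr to start a new cursor
    my @docs;
    my $num_found;

    while (1) {
        my $page = $fetch_page->($cursor);
        $num_found = $page->{response}{numFound};
        push @docs, @{ $page->{response}{docs} || [] };

        # The diagnostic suggested in the thread, plus a running total.
        printf STDERR "numFound = %d, docs so far = %d, cursor = '%s'\n",
            $num_found, scalar(@docs), $cursor;

        my $next = $page->{nextCursorMark};
        # Solr signals the end by returning the cursorMark it was given.
        last if !defined $next || $next eq $cursor;
        $cursor = $next;
    }
    return ( $num_found, \@docs );
}

# Stand-in for the HTTP call (an assumption for this sketch): 2006 fake
# documents served 1000 at a time, with the offset used as the cursorMark.
my @index = map { +{ id => $_ } } 1 .. 2006;
my $fake_fetch = sub {
    my ($cursor) = @_;
    my $start = $cursor eq '*' ? 0 : $cursor;
    my $end   = $start + 999;
    $end = $#index if $end > $#index;
    my @slice = $start <= $#index ? @index[ $start .. $end ] : ();
    return {
        response       => { numFound => scalar(@index), docs => \@slice },
        nextCursorMark => ( @slice ? $end + 1 : $cursor ),
    };
};

my ( $num_found, $docs ) = fetch_all_with_cursor($fake_fetch);
print "numFound=$num_found collected=", scalar(@$docs), "\n";
```

In a real script the callback would issue the HTTP request with the current cursorMark and return the decoded JSON. If the collected count and numFound disagree at the end, the per-page lines on STDERR show at which cursor the running total stops growing as expected.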