Re: solr export get wrong results
Thanks a lot for your help, Joel. Just wondering, why does export have such limitations? It uses the same query handler as select, doesn't it?

2014-12-31 10:28 GMT+08:00 Joel Bernstein joels...@gmail.com:

For the initial release only JSON output format is supported with the /export feature. Also there is no built-in distributed support yet. Both of these features are likely to follow in future releases.

For the initial release you'll need a client that can handle the JSON format and distributed logic. The Heliosearch project includes a client called CloudSolrStream that you can use for this purpose. Here are two links to get started with CloudSolrStream:

https://github.com/Heliosearch/heliosearch/blob/helio_4_10/solr/solrj/src/java/org/apache/solr/client/solrj/streaming/CloudSolrStream.java
http://heliosearch.org/streaming-aggregation-for-solrcloud/

Joel Bernstein
Search Engineer at Heliosearch
Re: solr export get wrong results
For the initial release only JSON output format is supported with the /export feature. Also there is no built-in distributed support yet. Both of these features are likely to follow in future releases.

For the initial release you'll need a client that can handle the JSON format and distributed logic. The Heliosearch project includes a client called CloudSolrStream that you can use for this purpose. Here are two links to get started with CloudSolrStream:

https://github.com/Heliosearch/heliosearch/blob/helio_4_10/solr/solrj/src/java/org/apache/solr/client/solrj/streaming/CloudSolrStream.java
http://heliosearch.org/streaming-aggregation-for-solrcloud/

Joel Bernstein
Search Engineer at Heliosearch

On Mon, Dec 29, 2014 at 2:20 AM, Sandy Ding sandy.ding...@gmail.com wrote:

Hi Joel,

Thanks for your reply. It seems the weird export results occur because I removed the <str name="wt">xsort</str> invariant of the export request handler in the default solrconfig.xml to get CSV-format output. I don't quite understand the meaning of xsort, but I removed it because I always get a JSON response (as you said) with the xsort invariant.

Is there a way to get CSV output using export? And also, can I get full results from all shards? (I tried to set distrib=true but got SyntaxError:xport RankQuery is required for xsort: rq={!xport}, and I do have rq={!xport} in the export invariants.)
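Joel's note above that the client must handle the distributed logic can be illustrated with a small merge step. This is not CloudSolrStream; it is a hypothetical Python helper (`merge_export_responses` is an illustrative name) that assumes each shard's /export response has already been fetched and parsed into the JSON shape shown in this thread (`response.numFound` and `response.docs`), and that every shard was asked for the same sort.

```python
import heapq
from operator import itemgetter


def merge_export_responses(responses, sort_field, descending=False):
    """Merge per-shard /export responses into one sorted result.

    Each element of `responses` is a parsed JSON object shaped like
    {"response": {"numFound": N, "docs": [...]}}, with docs already
    sorted on `sort_field` by the shard that produced them.
    """
    per_shard_docs = [r["response"]["docs"] for r in responses]
    total = sum(r["response"]["numFound"] for r in responses)
    # heapq.merge performs a streaming k-way merge of pre-sorted inputs.
    merged = list(
        heapq.merge(*per_shard_docs, key=itemgetter(sort_field), reverse=descending)
    )
    return {"numFound": total, "docs": merged}


# Hypothetical responses from two shards, both sorted by join_i ascending.
shard_a = {"response": {"numFound": 2, "docs": [{"join_i": 50}, {"join_i": 52}]}}
shard_b = {"response": {"numFound": 2, "docs": [{"join_i": 51}, {"join_i": 500010}]}}

combined = merge_export_responses([shard_a, shard_b], "join_i")
print(combined["numFound"], [d["join_i"] for d in combined["docs"]])
```

Summing numFound is only meaningful if each document lives on exactly one shard, which matches Ahmet's remark that unique IDs are expected to be unique across shards.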
Re: solr export get wrong results
Hi Joel,

Thanks for your reply. It seems the weird export results occur because I removed the <str name="wt">xsort</str> invariant of the export request handler in the default solrconfig.xml to get CSV-format output. I don't quite understand the meaning of xsort, but I removed it because I always get a JSON response (as you said) with the xsort invariant.

Is there a way to get CSV output using export? And also, can I get full results from all shards? (I tried to set distrib=true but got SyntaxError:xport RankQuery is required for xsort: rq={!xport}, and I do have rq={!xport} in the export invariants.)

2014-12-27 3:21 GMT+08:00 Joel Bernstein joels...@gmail.com:

Hi Sandy,

I pulled Solr 4.10.3 to see if I could recreate the issue you are seeing with export, and I wasn't able to reproduce the bug.

For example, the following query:

http://localhost:8983/solr/collection1/export?q=join_i:[50 TO 500010]&wt=json&indent=true&sort=join_i+asc&fl=join_i,ShopId_i

brings back the following result:

{"responseHeader": {"status": 0}, "response":{"numFound":11, "docs":[{"join_i":50,"ShopId_i":578917},{"join_i":51,"ShopId_i":294217},{"join_i":52,"ShopId_i":199805},{"join_i":53,"ShopId_i":633461},{"join_i":54,"ShopId_i":472995},{"join_i":55,"ShopId_i":672122},{"join_i":56,"ShopId_i":394637},{"join_i":57,"ShopId_i":446443},{"join_i":58,"ShopId_i":697329},{"join_i":59,"ShopId_i":166988},{"join_i":500010,"ShopId_i":191261}]}}

Notice the join_i values are all within the correct range. If you can post the export handler configuration we should be able to see the issue.

Joel Bernstein
Search Engineer at Heliosearch
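Since Sandy traces the wrong results to a removed invariant on the /export handler, a quick check of solrconfig.xml can catch that kind of edit. The sketch below is a hypothetical Python helper, not part of Solr. The sample handler is an assumption reconstructed from this thread: rq={!xport} and xsort are mentioned above, while the parameter name wt and the class solr.SearchHandler are assumed to match the stock configuration.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the /export handler; compare against your own solrconfig.xml.
SAMPLE_CONFIG = """
<config>
  <requestHandler name="/export" class="solr.SearchHandler">
    <lst name="invariants">
      <str name="rq">{!xport}</str>
      <str name="wt">xsort</str>
    </lst>
  </requestHandler>
</config>
"""

# Invariants this sketch expects; the values are assumptions, see above.
REQUIRED = {"rq": "{!xport}", "wt": "xsort"}


def export_invariants(solrconfig_xml):
    """Return the /export handler's invariants as a dict, or None if absent."""
    root = ET.fromstring(solrconfig_xml)
    for handler in root.iter("requestHandler"):
        if handler.get("name") == "/export":
            invariants = handler.find("lst[@name='invariants']")
            if invariants is None:
                return {}
            return {s.get("name"): (s.text or "") for s in invariants.findall("str")}
    return None


def missing_invariants(found, required=REQUIRED):
    """List required invariant names that are absent or have another value."""
    found = found or {}
    return sorted(name for name, value in required.items() if found.get(name) != value)


print(missing_invariants(export_invariants(SAMPLE_CONFIG)))
```

Running the same check against a configuration with the xsort line deleted reports `wt` as missing, which is the situation described in the message above.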
solr export get wrong results
Hi all,

I've recently set up a Solr cluster and found that export returns different results from select. I confirmed that the export results are wrong by manually querying the results. Even simple queries like the following get different results:

curl "http://localhost:8983/solr/pa_info/select?q=*:*&fl=id&sort=id+desc":

<response><lst name="responseHeader"><int name="status">0</int><int name="QTime">11</int><lst name="params"><str name="sort">id desc</str><str name="fl">id</str><str name="q">*:*</str></lst></lst><result name="response" *numFound="1197"* start="0"><doc>...</doc></result>

curl "http://localhost:8983/solr/pa_info/export?q=*:*&fl=id&sort=id+desc":

{*"numFound":172*, "docs":[..]

I don't have a clue why this happens! Can anyone help?

Best,
Sandy
Re: solr export get wrong results
You neglected to tell us specifically in what way the export result is incorrect. Is some of the data missing, duplicated, garbled, or... what? Provide an example and be specific about what you think is wrong in the results.

Have you modified the default solrconfig file?

I notice that you don't have distrib=false on your select, which would make your select be from all nodes, while export would only be docs from the specific node you sent the request to.

Please confirm whether you have read the doc for the Solr export feature:
https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets

-- Jack Krupansky

On Fri, Dec 26, 2014 at 3:58 AM, Sandy Ding sandy.ding...@gmail.com wrote:

Hi all,

I've recently set up a Solr cluster and found that export returns different results from select. I confirmed that the export results are wrong by manually querying the results. Even simple queries like the following get different results:

curl "http://localhost:8983/solr/pa_info/select?q=*:*&fl=id&sort=id+desc":

<response><lst name="responseHeader"><int name="status">0</int><int name="QTime">11</int><lst name="params"><str name="sort">id desc</str><str name="fl">id</str><str name="q">*:*</str></lst></lst><result name="response" *numFound="1197"* start="0"><doc>...</doc></result>

curl "http://localhost:8983/solr/pa_info/export?q=*:*&fl=id&sort=id+desc":

{*"numFound":172*, "docs":[..]

I don't have a clue why this happens! Can anyone help?

Best,
Sandy
Re: solr export get wrong results
Hi,

Two different things: if you have a unique key defined, documents with the same id overwrite each other within a single shard. Plus, unique IDs are expected to be unique across shards.

Ahmet

On Friday, December 26, 2014 11:00 AM, Sandy Ding sandy.ding...@gmail.com wrote:

Hi all,

I've recently set up a Solr cluster and found that export returns different results from select. I confirmed that the export results are wrong by manually querying the results. Even simple queries like the following get different results:

curl "http://localhost:8983/solr/pa_info/select?q=*:*&fl=id&sort=id+desc":

<response><lst name="responseHeader"><int name="status">0</int><int name="QTime">11</int><lst name="params"><str name="sort">id desc</str><str name="fl">id</str><str name="q">*:*</str></lst></lst><result name="response" *numFound="1197"* start="0"><doc>...</doc></result>

curl "http://localhost:8983/solr/pa_info/export?q=*:*&fl=id&sort=id+desc":

{*"numFound":172*, "docs":[..]

I don't have a clue why this happens! Can anyone help?

Best,
Sandy
Re: solr export get wrong results
Thanks for your reply, Jack.

The export result sets are incorrect in the sense that the results don't match the query at all. For example, when I query age=20 (age is an int type), the results contain age=14, 22...

curl "http://localhost:8983/solr/pa_info/export?q=age:20&fl=id,age"

returns the following result:

<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst>
<result name="response" numFound="132309" start="0">
<doc><str name="id">26650337</str><int name="age">50</int></doc>
<doc><str name="id">26650348</str><int name="age">14</int></doc>
<doc><str name="id">26650351</str><int name="age">43</int></doc>
<doc><str name="id">26650353</str><int name="age">59</int></doc>
<doc><str name="id">26650355</str><int name="age">52</int></doc>
<doc><str name="id">26650357</str><int name="age">47</int></doc>
<doc><str name="id">26650361</str><int name="age">6</int></doc>
<doc><str name="id">26650367</str><int name="age">7</int></doc>
<doc><str name="id">26650372</str><int name="age">35</int></doc>
<doc><str name="id">26650374</str><int name="age">22</int></doc>
</result>
</response>

I've read the cwiki document, but I'm still not sure that export will return partial results, since the doc says: "It's possible to export fully sorted result sets using a special rank query parser https://cwiki.apache.org/confluence/display/solr/Query+Re-Ranking and response writer https://cwiki.apache.org/confluence/display/solr/Response+Writers." But as you can see from the above example, the results are not just partial, they are simply wrong.

2014-12-26 20:18 GMT+08:00 Jack Krupansky jack.krupan...@gmail.com:

You neglected to tell us specifically in what way the export result is incorrect. Is some of the data missing, duplicated, garbled, or... what? Provide an example and be specific about what you think is wrong in the results.

Have you modified the default solrconfig file?

I notice that you don't have distrib=false on your select, which would make your select be from all nodes, while export would only be docs from the specific node you sent the request to.

Please confirm whether you have read the doc for the Solr export feature:
https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets

-- Jack Krupansky
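The symptom Sandy reports above (documents that do not satisfy the query) can be confirmed mechanically instead of by eye. The following is a minimal sketch with a hypothetical helper name, assuming the returned documents have already been parsed into dictionaries; the sample values are the first three documents from the response quoted above.

```python
def docs_not_matching(docs, field, expected):
    """Return the documents whose `field` differs from the queried value."""
    return [doc for doc in docs if doc.get(field) != expected]


# First three documents returned for q=age:20 in the message above.
returned = [
    {"id": "26650337", "age": 50},
    {"id": "26650348", "age": 14},
    {"id": "26650351", "age": 43},
]

bad = docs_not_matching(returned, "age", 20)
print(f"{len(bad)} of {len(returned)} documents do not satisfy age:20")
```

A non-empty result indicates that the handler is not applying the query as expected, which is a different problem from a lower numFound caused by querying a single shard.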
Re: solr export get wrong results
Hi Ahmet,

I use libuuid for the unique id, so I guess there shouldn't be duplicate ids. Also, the results are not just incomplete, they are screwed up.

2014-12-26 20:19 GMT+08:00 Ahmet Arslan iori...@yahoo.com.invalid:

Hi,

Two different things: if you have a unique key defined, documents with the same id overwrite each other within a single shard. Plus, unique IDs are expected to be unique across shards.

Ahmet
Re: solr export get wrong results
Hi,

Do you have any custom Solr components deployed? Maybe a custom response writer?

Ahmet

On Friday, December 26, 2014 3:26 PM, Sandy Ding sandy.ding...@gmail.com wrote:

Hi Ahmet,

I use libuuid for the unique id, so I guess there shouldn't be duplicate ids. Also, the results are not just incomplete, they are screwed up.
Re: solr export get wrong results
I think you missed a very important part of Jack's reply:

bq: I notice that you don't have distrib=false on your select, which would make your select be from all nodes, while export would only be docs from the specific node you sent the request to.

And from the Reference Guide on export:

bq: The initial release treats all queries as non-distributed requests. So the client is responsible for making the calls to each Solr instance and merging the results.

So the export statement you're sending is _only_ exporting the results from the shard on 8983 and completely ignoring the other (6?) shards, whereas the query you're sending is getting the results from all the shards.

As Jack said, add distrib=false to the query, send it to the same shard you send the export command to, and the results should match.

Also, be sure your configuration for the /select handler doesn't have any additional default parameters that might alter the results, but I doubt that's really a problem here.

Best,
Erick

On Fri, Dec 26, 2014 at 7:02 AM, Ahmet Arslan iori...@yahoo.com.invalid wrote:

Hi,

Do you have any custom Solr components deployed? Maybe a custom response writer?

Ahmet
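Erick's suggestion above (run select with distrib=false against the same node as export) is easy to get wrong when typing URLs by hand, so here is a small sketch that builds both requests from one set of parameters. The helper names are hypothetical, and adding rows=0 and wt=json to the select request is an assumption made here so that only numFound needs to be compared.

```python
from urllib.parse import urlencode


def build_url(core_url, handler, params):
    """Join a core URL, a handler name, and URL-encoded query parameters."""
    return f"{core_url.rstrip('/')}/{handler.lstrip('/')}?{urlencode(params)}"


def comparable_urls(core_url, q, fl, sort):
    """Return (select_url, export_url) that query the same single node."""
    common = {"q": q, "fl": fl, "sort": sort}
    # distrib=false keeps /select on the node that receives the request.
    select_params = dict(common, distrib="false", rows="0", wt="json")
    return (
        build_url(core_url, "select", select_params),
        build_url(core_url, "export", common),
    )


select_url, export_url = comparable_urls(
    "http://localhost:8983/solr/pa_info", "*:*", "id", "id desc"
)
print(select_url)
print(export_url)
```

Using `urlencode` also restores the `&` separators and escapes characters such as the space in `id desc`, so the URLs can be passed to curl inside quotes without further editing.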
Re: solr export get wrong results
Hi Sandy,

The export handler should only return documents in JSON format. The results in your second example are in XML format, so something looks to be wrong in the configuration. Can you post what your solrconfig looks like?

Joel

Joel Bernstein
Search Engineer at Heliosearch

On Fri, Dec 26, 2014 at 12:43 PM, Erick Erickson erickerick...@gmail.com wrote:

I think you missed a very important part of Jack's reply:

bq: I notice that you don't have distrib=false on your select, which would make your select be from all nodes, while export would only be docs from the specific node you sent the request to.

And from the Reference Guide on export:

bq: The initial release treats all queries as non-distributed requests. So the client is responsible for making the calls to each Solr instance and merging the results.

So the export statement you're sending is _only_ exporting the results from the shard on 8983 and completely ignoring the other (6?) shards, whereas the query you're sending is getting the results from all the shards.

As Jack said, add distrib=false to the query, send it to the same shard you send the export command to, and the results should match.

Also, be sure your configuration for the /select handler doesn't have any additional default parameters that might alter the results, but I doubt that's really a problem here.

Best,
Erick
Re: solr export get wrong results
Hi Sandy,

I pulled Solr 4.10.3 to see if I could recreate the issue you are seeing with export, and I wasn't able to reproduce the bug.

For example, the following query:

http://localhost:8983/solr/collection1/export?q=join_i:[50 TO 500010]&wt=json&indent=true&sort=join_i+asc&fl=join_i,ShopId_i

brings back the following result:

{"responseHeader": {"status": 0}, "response":{"numFound":11, "docs":[{"join_i":50,"ShopId_i":578917},{"join_i":51,"ShopId_i":294217},{"join_i":52,"ShopId_i":199805},{"join_i":53,"ShopId_i":633461},{"join_i":54,"ShopId_i":472995},{"join_i":55,"ShopId_i":672122},{"join_i":56,"ShopId_i":394637},{"join_i":57,"ShopId_i":446443},{"join_i":58,"ShopId_i":697329},{"join_i":59,"ShopId_i":166988},{"join_i":500010,"ShopId_i":191261}]}}

Notice the join_i values are all within the correct range. If you can post the export handler configuration we should be able to see the issue.

Joel Bernstein
Search Engineer at Heliosearch

On Fri, Dec 26, 2014 at 1:50 PM, Joel Bernstein joels...@gmail.com wrote:

Hi Sandy,

The export handler should only return documents in JSON format. The results in your second example are in XML format, so something looks to be wrong in the configuration. Can you post what your solrconfig looks like?

Joel

Joel Bernstein
Search Engineer at Heliosearch
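Joel's check above, that every join_i value falls inside the requested range and arrives in ascending order, can be written as a short test. This is a sketch with a hypothetical helper name, applied to three of the documents from his response (the first two and the last).

```python
def check_range_and_order(docs, field, low, high):
    """Return (all values within [low, high], values in ascending order)."""
    values = [doc[field] for doc in docs]
    in_range = all(low <= value <= high for value in values)
    ascending = all(a <= b for a, b in zip(values, values[1:]))
    return in_range, ascending


docs = [
    {"join_i": 50, "ShopId_i": 578917},
    {"join_i": 51, "ShopId_i": 294217},
    {"join_i": 500010, "ShopId_i": 191261},
]

print(check_range_and_order(docs, "join_i", 50, 500010))  # (True, True)
```

Either flag coming back False on a real export would point to the kind of configuration problem discussed in this thread rather than to the single-shard behaviour.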