Re: solr export get wrong results

2015-01-03 Thread Sandy Ding
Thanks a lot for your help, Joel.
Just wondering, why does export have such limitations? It uses the same
query handler as select, doesn't it?


Re: solr export get wrong results

2014-12-30 Thread Joel Bernstein
For the initial release only JSON output format is supported with the
/export feature. Also there is no built-in distributed support yet. Both of
these features are likely to follow in future releases.

For the initial release you'll need a client that can handle the JSON
format and distributed logic. The Heliosearch project includes a client
called CloudSolrStream that you can use for this purpose. Here are two
links to get started with CloudSolrStream:

https://github.com/Heliosearch/heliosearch/blob/helio_4_10/solr/solrj/src/java/org/apache/solr/client/solrj/streaming/CloudSolrStream.java
http://heliosearch.org/streaming-aggregation-for-solrcloud/





Joel Bernstein
Search Engineer at Heliosearch
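
The client-side "distributed logic" mentioned above roughly amounts to sending the same sorted /export request to one replica of each shard and merging the sorted streams. Below is a minimal Python sketch of the merge step only. It assumes each shard's docs have already been fetched and parsed, and that every shard was queried with the same ascending sort. `merge_shard_results` and the sample docs are hypothetical; this is not how CloudSolrStream itself is implemented.

```python
import heapq
from operator import itemgetter


def merge_shard_results(shard_docs, sort_field):
    """Merge per-shard doc lists, each already sorted ascending on sort_field.

    shard_docs is a list of lists of dicts, one inner list per shard,
    as parsed from each shard's /export JSON response.
    """
    # heapq.merge performs a lazy k-way merge of already-sorted inputs.
    return list(heapq.merge(*shard_docs, key=itemgetter(sort_field)))


# Hypothetical docs from two shards, each sorted by join_i ascending.
shard_a = [{"join_i": 50}, {"join_i": 53}, {"join_i": 58}]
shard_b = [{"join_i": 51}, {"join_i": 52}, {"join_i": 59}]

merged = merge_shard_results([shard_a, shard_b], "join_i")
print([d["join_i"] for d in merged])
```

The merge only yields a correctly ordered result if each shard's stream is itself sorted on the same field, which is why the sort parameter has to be identical for every shard.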


Re: solr export get wrong results

2014-12-28 Thread Sandy Ding
Hi, Joel

Thanks for your reply.
It seems that the weird export results were caused by my removing the
<str name="wt">xsort</str> invariant of the export request handler in the
default solrconfig.xml to get CSV-format output.
I don't quite understand the meaning of xsort, but I removed it because I
always get a JSON response (as you said) with the xsort invariant.
Is there a way to get CSV output using export?
And also, can I get full results from all shards? (I tried to set
distrib=true but got "SyntaxError: xport RankQuery is required for xsort:
rq={!xport}", and I do have rq={!xport} in the export invariants.)
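
A quick way to check whether a solrconfig.xml still carries the /export invariants discussed here is sketched below in Python. The embedded handler definition is an assumption about what the stock 4.10 example config looks like (rq={!xport}, wt=xsort, distrib=false), so compare it with your own file. `missing_export_invariants` is a hypothetical helper, not part of Solr.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the stock /export handler; verify against your own solrconfig.xml.
SOLRCONFIG_FRAGMENT = """
<config>
  <requestHandler name="/export" class="solr.SearchHandler">
    <lst name="invariants">
      <str name="rq">{!xport}</str>
      <str name="wt">xsort</str>
      <str name="distrib">false</str>
    </lst>
    <arr name="components">
      <str>query</str>
    </arr>
  </requestHandler>
</config>
"""

EXPECTED = {"rq": "{!xport}", "wt": "xsort", "distrib": "false"}


def missing_export_invariants(config_xml):
    """Return the expected invariants that are absent or changed."""
    root = ET.fromstring(config_xml)
    found = {}
    for handler in root.iter("requestHandler"):
        if handler.get("name") != "/export":
            continue
        for lst in handler.findall("lst"):
            if lst.get("name") == "invariants":
                for s in lst.findall("str"):
                    found[s.get("name")] = (s.text or "").strip()
    return {k: v for k, v in EXPECTED.items() if found.get(k) != v}


print(missing_export_invariants(SOLRCONFIG_FRAGMENT))
```

An empty dict means all three invariants are in place. Removing the wt entry, as described in this message, would be reported as `{'wt': 'xsort'}`.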




solr export get wrong results

2014-12-26 Thread Sandy Ding
Hi, all

I've recently set up a solr cluster and found that export returns
different results from select.
And I confirmed that the export results are wrong by manually querying the
results.
Even simple queries such as the following return different results:

curl "http://localhost:8983/solr/pa_info/select?q=*:*&fl=id&sort=id+desc":

<response><lst name="responseHeader"><int name="status">0</int><int
name="QTime">11</int><lst name="params"><str name="sort">id desc</str><str
name="fl">id</str><str name="q">*:*</str></lst></lst><result
name="response" *numFound="1197"* start="0"><doc>...</doc></result>

curl "http://localhost:8983/solr/pa_info/export?q=*:*&fl=id&sort=id+desc":
{*"numFound":172*, "docs":[..]

I don't have a clue why this happens! Can anyone help?

Best,
Sandy


Re: solr export get wrong results

2014-12-26 Thread Jack Krupansky
You neglected to tell us specifically in what way the export result is
incorrect. Is some of the data missing, duplicated, garbled, or... what?
Provide an example and be specific about what you think is wrong in the
results.

Have you modified the default solrconfig file?

I notice that you don't have distrib=false on your select, which would make
your select be from all nodes, while export would only be docs from the
specific node you sent the request to.

Please confirm whether you have read the doc for the Solr export feature:
https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets


-- Jack Krupansky



Re: solr export get wrong results

2014-12-26 Thread Ahmet Arslan
Hi,

Two different things:

If you have a unique key defined, documents with the same id overwrite each
other within a single shard.

Plus, uniqueIDs are expected to be unique across shards.

Ahmet





Re: solr export get wrong results

2014-12-26 Thread Sandy Ding
Thanks for your reply, Jack.

The export result sets are incorrect in the sense that the results don't
match the query at all.
For example, when I query age=20 (age is an int field), the results contain
age=14, 22...
  curl "http://localhost:8983/solr/pa_info/export?q=age:20&fl=id,age" will
get the following result:
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">5</int></lst>
<result name="response" numFound="132309" start="0">
<doc><str name="id">26650337</str><int name="age">50</int></doc>
<doc><str name="id">26650348</str><int name="age">14</int></doc>
<doc><str name="id">26650351</str><int name="age">43</int></doc>
<doc><str name="id">26650353</str><int name="age">59</int></doc>
<doc><str name="id">26650355</str><int name="age">52</int></doc>
<doc><str name="id">26650357</str><int name="age">47</int></doc>
<doc><str name="id">26650361</str><int name="age">6</int></doc>
<doc><str name="id">26650367</str><int name="age">7</int></doc>
<doc><str name="id">26650372</str><int name="age">35</int></doc>
<doc><str name="id">26650374</str><int name="age">22</int></doc>
</result>
</response>

I've read the cwiki document, but I'm still not sure that export would
return partial results, since the doc says: "It's possible to export fully
sorted result sets using a special rank query parser
(https://cwiki.apache.org/confluence/display/solr/Query+Re-Ranking)
and response writer
(https://cwiki.apache.org/confluence/display/solr/Response+Writers)."
But as you can see from the above example, the results are not just
partial, they are simply wrong.



Re: solr export get wrong results

2014-12-26 Thread Sandy Ding
Hi, Ahmet,

I use libuuid for unique id and I guess there shouldn't be duplicate ids.
Also, the results are not just incomplete, they are screwed.



Re: solr export get wrong results

2014-12-26 Thread Ahmet Arslan
Hi,

Do you have any custom Solr components deployed? Maybe a custom response writer?

Ahmet






Re: solr export get wrong results

2014-12-26 Thread Erick Erickson
I think you missed a very important part of Jack's reply:

bq: I notice that you don't have distrib=false on your select, which
would make your select be from all nodes, while export would only be
docs from the specific node you sent the request to.

And from the Reference Guide on export

bq: The initial release treats all queries as non-distributed
requests. So the client is responsible for making the calls to each
Solr instance and merging the results.

So the export statement you're sending is _only_ exporting the results
from the shard on 8983 and completely ignoring the other (6?) shards,
whereas the query you're sending is getting the results from all the
shards.

As Jack said, add distrib=false to the query, send it to the same
shard you send the export command to and the results should match.

Also, be sure your configuration for the /select handler doesn't have
any additional default parameters that might alter the results, but I
doubt that's really a problem here.

Best,
Erick
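
To make the comparison described above concrete, here is a minimal Python sketch that builds the two request URLs for the same node, with distrib=false added only to the select request. The host, collection, and query values are taken from the original post. `build_comparison_urls` is a hypothetical helper, and the sketch only constructs the URLs without sending them.

```python
from urllib.parse import urlencode


def build_comparison_urls(base, collection, q, fl, sort):
    """Return (select_url, export_url) that should hit the same single node."""
    params = {"q": q, "fl": fl, "sort": sort}
    # /select fans out to all shards by default, so pin it to this node.
    select_params = dict(params, distrib="false")
    select_url = f"{base}/{collection}/select?{urlencode(select_params)}"
    # /export is already non-distributed in the initial release.
    export_url = f"{base}/{collection}/export?{urlencode(params)}"
    return select_url, export_url


select_url, export_url = build_comparison_urls(
    "http://localhost:8983/solr", "pa_info", "*:*", "id", "id desc"
)
print(select_url)
print(export_url)
```

Using `urlencode` also takes care of the `&` separators and escaping, so the URLs can be passed to curl inside quotes without the shell splitting them.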



Re: solr export get wrong results

2014-12-26 Thread Joel Bernstein
Hi Sandy,

The export handler should only return documents in JSON format. The results
in your second example are in XML format, so something looks to be wrong
in the configuration. Can you post what your solrconfig looks like?

Joel

Joel Bernstein
Search Engineer at Heliosearch



Re: solr export get wrong results

2014-12-26 Thread Joel Bernstein
Hi Sandy,

I pulled Solr 4.10.3 to see if I could recreate the issue you are seeing
with export, and I wasn't able to reproduce it. For
example the following query:

http://localhost:8983/solr/collection1/export?q=join_i:[50 TO 500010]&wt=json&indent=true&sort=join_i+asc&fl=join_i,ShopId_i


Brings back the following result:


{"responseHeader": {"status": 0}, "response":{"numFound":11,
"docs":[{"join_i":50,"ShopId_i":578917},{"join_i":51,"ShopId_i":294217},{"join_i":52,"ShopId_i":199805},{"join_i":53,"ShopId_i":633461},{"join_i":54,"ShopId_i":472995},{"join_i":55,"ShopId_i":672122},{"join_i":56,"ShopId_i":394637},{"join_i":57,"ShopId_i":446443},{"join_i":58,"ShopId_i":697329},{"join_i":59,"ShopId_i":166988},{"join_i":500010,"ShopId_i":191261}]}}


Notice the join_i values are all within the correct range.

If you can post the export handler configuration we should be able to
see the issue.


Joel Bernstein
Search Engineer at Heliosearch
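
The "values are within the correct range" check above can also be done mechanically. Here is a small Python sketch that assumes the response shape shown in this message (docs under response.docs). The JSON is an abbreviated copy of that result, the second sample reuses two of the age values reported earlier in the thread for q=age:20, and `out_of_range` is a hypothetical helper.

```python
import json


def out_of_range(docs, field, low, high):
    """Return the docs whose field value falls outside [low, high]."""
    return [d for d in docs if not low <= d[field] <= high]


# Abbreviated copy of the response above (3 of the 11 docs).
body = json.loads(
    '{"responseHeader": {"status": 0}, "response": {"numFound": 11, "docs": ['
    '{"join_i": 50, "ShopId_i": 578917},'
    '{"join_i": 59, "ShopId_i": 166988},'
    '{"join_i": 500010, "ShopId_i": 191261}]}}'
)
good = out_of_range(body["response"]["docs"], "join_i", 50, 500010)

# Two of the docs reported earlier in the thread for q=age:20.
reported = [{"id": "26650337", "age": 50}, {"id": "26650348", "age": 14}]
bad = out_of_range(reported, "age", 20, 20)

print(len(good), len(bad))
```

An empty list means every returned document matches the range query. Any entries in the list are documents that should not have been returned.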
