Thanks Zeeshan for the info.

Is there a workaround in the meantime, or is the only option to send queries 
to the individual Solr nodes ourselves?
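
For reference, here is a minimal Python sketch of the two request shapes compared in my original message quoted below: the partition filter sent as a separate `fq` (the shape I used to emulate YZ) versus the same filter folded into `q`. The base URL, the index name `my_index`, and the helper function names are assumptions for illustration only. The partition numbers are copied from the example, and a real workaround that bypasses YZ would have to work out that list (the coverage plan) itself.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; host, port, path and index name are assumptions.
BASE_URL = "http://localhost:8093/internal_solr/my_index/select"


def partition_clause(partitions):
    """Build the `_yz_pn:A OR _yz_pn:B ...` clause for the given partitions."""
    return " OR ".join(f"_yz_pn:{p}" for p in partitions)


def filter_query_params(q, partitions):
    """Request shape used to emulate YZ: partitions go in a separate `fq`."""
    return {"q": q, "fq": partition_clause(partitions), "rows": "0", "wt": "json"}


def main_query_params(q, partitions):
    """Same restriction, but folded into `q` so no filter query is sent."""
    combined = f"{q} AND ({partition_clause(partitions)})"
    return {"q": combined, "rows": "0", "wt": "json"}


# Values copied from the example queries in this thread.
q = "timestamp:[1429579919010 TO 1429579921010]"
partitions = [55, 40, 25, 10]

fq_url = BASE_URL + "?" + urlencode(filter_query_params(q, partitions))
main_url = BASE_URL + "?" + urlencode(main_query_params(q, partitions))

print(fq_url)
print(main_url)
```

Both URLs ask for the same documents; the only difference is where the `_yz_pn` restriction lives, which is the variable I changed between the two timings.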

Is there a planned timeframe for the 2.0.1 release?

Thanks,
Jason

> On 21 Apr 2015, at 16:13, Zeeshan Lakhani <zlakh...@basho.com> wrote:
> 
> Hey Jason,
> 
> We’re working on performance issues with YZ filter queries (e.g. 
> https://github.com/basho/yokozuna/issues/392) and with coverage plan 
> generation/caching, and our CliServ team has started doing a ton of 
> benchmarks as well.
> 
> You can bypass YZ, but then you’d have to generate your own coverage plans 
> and handle the other parts of distributed Solr that YZ gives you. 
> Nonetheless, we’re actively working on the issues you’ve encountered. 
> 
> Zeeshan Lakhani
> programmer | 
> software engineer at @basho | 
> org. member/founder of @papers_we_love | paperswelove.org
> twitter => @zeeshanlakhani
> 
>> On Apr 21, 2015, at 1:06 AM, Jason Campbell <xia...@xiaclo.net> wrote:
>> 
>> Hello,
>> 
>> I'm currently trying to debug slow YZ queries, and I've narrowed down the 
>> issue, but I'm not sure how to solve it.
>> 
>> First off, we have about 80 million records in Riak (and YZ), but the 
>> queries return relatively few (a thousand or so at most).  Our query times 
>> are anywhere from 800ms to 1.5s.
>> 
>> I have been experimenting with queries directly on the Solr node, and it 
>> seems to be a problem with YZ and the way it does vnode filters.
>> 
>> Here is the same query, emulating YZ first:
>> 
>> {
>>  "responseHeader":{
>>    "status":0,
>>    "QTime":958,
>>    "params":{
>>      "q":"timestamp:[1429579919010 TO 1429579921010]",
>>      "indent":"true",
>>      "fq":"_yz_pn:55 OR _yz_pn:40 OR _yz_pn:25 OR _yz_pn:10",
>>      "rows":"0",
>>      "wt":"json"}},
>>  "response":{"numFound":80,"start":0,"docs":[]
>>  }}
>> 
>> And the same query, but including the vnode filter in the main body instead 
>> of using a filter query:
>> 
>> {
>>  "responseHeader":{
>>    "status":0,
>>    "QTime":1,
>>    "params":{
>>      "q":"timestamp:[1429579919010 TO 1429579921010] AND (_yz_pn:55 OR _yz_pn:40 OR _yz_pn:25 OR _yz_pn:10)",
>>      "indent":"true",
>>      "rows":"0",
>>      "wt":"json"}},
>>  "response":{"numFound":80,"start":0,"docs":[]
>>  }}
>> 
>> I understand there is a caching benefit to using filter queries, but a 
>> performance difference of 100x or greater doesn't seem worth it, especially 
>> with a constant data stream.
>> 
>> Is there a way to make YZ do this, or is the only way to query Solr 
>> directly, bypassing YZ?  Does anyone have any other suggestions of how to 
>> make this faster?
>> 
>> The timestamp field is a Solr TrieLongField with default settings, if anyone 
>> is curious.
>> 
>> Thanks,
>> Jason
>> _______________________________________________
>> riak-users mailing list
>> riak-users@lists.basho.com
>> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com