Re: app.config missing?

2016-09-19 Thread Magnus Kessler
On 18 September 2016 at 07:51, Alex De la rosa 
wrote:

> Hi there,
>
> I'm trying to locate the app.config file in Riak 2.1.4-1 to add the
> following:
>
> { kernel, [
> {inet_dist_listen_min, 6000},
> {inet_dist_listen_max, 7999}
>   ]},
>
> as explained at http://docs.basho.com/riak/kv/2.1.4/using/security but I
> can't find it.
>
> Thanks,
> Alex
>


Hi Alex,

With Riak 2.x we recommend using the new configuration mechanism (a.k.a.
cuttlefish). Please use the instructions for using riak.conf on the page
you quoted.

erlang.distribution.port_range.minimum = 6000
erlang.distribution.port_range.maximum = 7999

For more information about Riak's configuration system, please see the
configuration reference documentation [0].

Kind Regards,

Magnus

[0]: http://docs.basho.com/riak/kv/2.1.4/configuring/reference/

 --
Magnus Kessler
Client Services Engineer
Basho Technologies Limited

Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: app.config missing?

2016-09-19 Thread Alex De la rosa
OK, the documentation was confusing; I thought I had to add the data in both
riak.conf and app.config.

Thanks,
Alex

On Mon, Sep 19, 2016 at 11:42 AM, Magnus Kessler  wrote:

> On 18 September 2016 at 07:51, Alex De la rosa 
> wrote:
>
>> Hi there,
>>
>> I'm trying to locate the app.config file in Riak 2.1.4-1 to add the
>> following:
>>
>> { kernel, [
>> {inet_dist_listen_min, 6000},
>> {inet_dist_listen_max, 7999}
>>   ]},
>>
>> as explained at http://docs.basho.com/riak/kv/2.1.4/using/security but I
>> can't find it.
>>
>> Thanks,
>> Alex
>>
>
>
> Hi Alex,
>
> With Riak 2.x we recommend using the new configuration mechanism (a.k.a.
> cuttlefish). Please use the instructions for using riak.conf on the page
> you quoted.
>
> erlang.distribution.port_range.minimum = 6000
> erlang.distribution.port_range.maximum = 7999
>
> For more information about Riak's configuration system, please see the
> configuration reference documentation [0].
>
> Kind Regards,
>
> Magnus
>
> [0]: http://docs.basho.com/riak/kv/2.1.4/configuring/reference/
>
>  --
> Magnus Kessler
> Client Services Engineer
> Basho Technologies Limited
>
> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
>


Re: app.config missing?

2016-09-19 Thread DeadZen
Nope, app.config is actually generated by riak.conf, through an
obscure process known as cuttlefishing ;p
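
To make the relationship concrete, here is a minimal sketch (not cuttlefish itself, which works from Riak's schema files) of how the two riak.conf settings from this thread correspond to the kernel section of the generated app.config. The function name `kernel_section` and the `KERNEL_KEYS` table are hypothetical and exist only for this illustration; the key names are the ones quoted earlier in the thread.

```python
# Hypothetical illustration only: the real translation is performed by
# cuttlefish using Riak's schema files, not by this function.
KERNEL_KEYS = {
    "erlang.distribution.port_range.minimum": "inet_dist_listen_min",
    "erlang.distribution.port_range.maximum": "inet_dist_listen_max",
}

def kernel_section(riak_conf_text):
    """Render the kernel app.config section implied by riak.conf settings."""
    settings = []
    for raw in riak_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in KERNEL_KEYS:
            settings.append("{%s, %d}" % (KERNEL_KEYS[key], int(value)))
    return "{kernel, [\n    %s\n]}" % ",\n    ".join(settings)

conf = """
erlang.distribution.port_range.minimum = 6000
erlang.distribution.port_range.maximum = 7999
"""
print(kernel_section(conf))
```

So the values only need to be set once, in riak.conf; the app.config form is derived from them.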

On Mon, Sep 19, 2016 at 3:44 AM, Alex De la rosa
 wrote:
> OK, the documentation was confusing; I thought I had to add the data in both
> riak.conf and app.config.
>
> Thanks,
> Alex
>
> On Mon, Sep 19, 2016 at 11:42 AM, Magnus Kessler  wrote:
>>
>> On 18 September 2016 at 07:51, Alex De la rosa 
>> wrote:
>>>
>>> Hi there,
>>>
>>> I'm trying to locate the app.config file in Riak 2.1.4-1 to add the
>>> following:
>>>
>>> { kernel, [
>>> {inet_dist_listen_min, 6000},
>>> {inet_dist_listen_max, 7999}
>>>   ]},
>>>
>>> as explained at http://docs.basho.com/riak/kv/2.1.4/using/security but I
>>> can't find it.
>>>
>>> Thanks,
>>> Alex
>>>
>>
>>
>> Hi Alex,
>>
>> With Riak 2.x we recommend using the new configuration mechanism (a.k.a.
>> cuttlefish). Please use the instructions for using riak.conf on the page you
>> quoted.
>>
>> erlang.distribution.port_range.minimum = 6000
>> erlang.distribution.port_range.maximum = 7999
>>
>> For more information about Riak's configuration system, please see the
>> configuration reference documentation [0].
>>
>> Kind Regards,
>>
>> Magnus
>>
>> [0]: http://docs.basho.com/riak/kv/2.1.4/configuring/reference/
>>
>>  --
>> Magnus Kessler
>> Client Services Engineer
>> Basho Technologies Limited
>>
>> Registered Office - 8 Lincoln’s Inn Fields London WC2A 3BP Reg 07970431
>
>
>


Solr search performance

2016-09-19 Thread sean mcevoy
Hi All,

We have an index with ~548,000 entries, ~14,000 of which match one of our
queries.
We read these in a paginated search and the first page (of 100 hits)
returns quickly in ~70ms.
This response time seems to increase exponentially as we walk through the
pages:
the 4th page takes ~200ms,
the 8th page takes ~1200ms
the 12th page takes ~2100ms
the 16th page takes ~6100ms
the 20th page takes ~24000ms

And by the time we're searching for the 22nd page it regularly times out at
the default 60 seconds.

I have a good understanding of Riak KV internals but absolutely nothing of
Lucene which I think is what's most relevant here. If anyone in the know
can point me towards any relevant resource or can explain what's happening
I'd be much obliged :-)
As I would also be if anyone with experience of using Riak/Lucene can tell
me:
- Is 500K a crazy number of entries to put into one index?
- Is 14K a crazy number of entries to expect to be returned?
- Are there any methods we can use to make the search time more constant
across the full search?
I read one blog post on inlining but it was a bit old & not very obvious
how to implement using riakc_pb_socket calls.

And out of curiosity, do we not traverse the full range of hits for each
page? I naively thought that because I'm sorting the returned values we'd
have to get them all first and then sort, but the response times suggest
otherwise. Does Lucene store the data sorted by each field just in case a
query asks for it? Or what other magic is going on?


For the technical details, we use the "_yz_default" schema and all the
fields stored are strings:
- entry_id_s: unique within the DB, the aim of the query is to gather a
list of these
- type_s: has one of 2 values
- sub_category_id_s: in the query described above all 14K hits will match
on this, in the DB of ~500K entries there are ~43K different values for
this field, with each category typically having 2-6 sub categories
- category_id_s: not matched in this query, in the DB of ~500K entries
there are ~13K different values for this field
- status_s: has one of 2 values, in the query described above all hits will
have the value "active"
- user_id_s: unique within the DB but not matched in this query
- first_name_s: almost unique within the DB, this query will sort by this
field
- last_name_s: almost unique within the DB, this query will sort by this
field

This search query looks like:
<<"sub_category_id_s:test_1 AND status_s:active AND type_s:sub_category">>

Our options parameter has the sort directive:
{sort, <<"first_name_s asc, last_name_s asc">>}

The query was run on a 5-node cluster with n_val of 3.

Thanks in advance for any pointers!
//Sean.


Re: Solr search performance

2016-09-19 Thread Fred Dushin
All great questions, Sean.

A few things.  First off, for result sets that are that large, you are probably 
going to want to use Solr cursor marks [1], which are supported in the current 
version of Solr we ship.  Riak allows queries using cursor marks through the 
HTTP interface.  At present, it does not support cursors using the protobuf 
API, due to some internal limitations of the server-side protobuf library, but 
we do hope to fix that in the future.
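
As a rough illustration of why offset-based paging gets more expensive on later pages, the sketch below assumes standard Solr distributed-search behavior, where each shard returns its top (start + rows) sorted entries to the coordinating node for merging. The function name and the shard count of 3 are assumptions for the example, not figures from this cluster, and this linear growth alone may not account for the steepness of the timings reported.

```python
def rows_collected(page, rows_per_page, shards):
    """Sorted entries the coordinator must gather to serve one page (1-based)."""
    start = (page - 1) * rows_per_page
    # Each shard returns its own top (start + rows) entries for merging.
    return shards * (start + rows_per_page)

# 100 hits per page as in the original question; 3 shards is an assumed figure.
for page in (1, 4, 8, 12, 16, 20):
    print(page, rows_collected(page, 100, 3))
```

With a cursor mark, each request only needs the next `rows` entries after the previous position, so the per-page work does not grow with the page number in this way.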

Secondly, we have found sorting with distributed queries to be far more
performant using Solr 4.10.4.  Currently released versions of Riak use Solr
4.7, but as you can see on github [2], Solr 4.10.4 support has been merged into
the develop-2.2 branch, and is in the pipeline for release.  I can't say when
the next version of Riak that ships with this Solr version will be released,
because of indeterminacy around bug triage, but it should not be too long.

I would start to look at using cursor marks and measure their relative 
performance in your scenario.  My guess is that you should see some improvement 
there.
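
A minimal sketch of what a cursor-mark walk over the HTTP interface could look like, using only the Python standard library. The helper names (`walk_cursor`, `http_fetch`), the host, port and index name in the usage comment are assumptions for illustration. Solr requires the sort to include the unique key field as a tiebreaker when using cursor marks; `_yz_id` is assumed here to be that field in the default Riak search schema.

```python
import json
import urllib.parse
import urllib.request

def walk_cursor(fetch, query, sort, rows=100):
    """Yield every document for `query`, following Solr cursor marks."""
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        params = {"wt": "json", "q": query, "sort": sort,
                  "rows": rows, "cursorMark": cursor}
        body = fetch(params)
        yield from body["response"]["docs"]
        next_cursor = body["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged mark means no more results
            return
        cursor = next_cursor

def http_fetch(base_url):
    """Build a fetch function for a search URL (base_url is an assumption)."""
    def fetch(params):
        url = base_url + "?" + urllib.parse.urlencode(params)
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)
    return fetch

# Example usage (hypothetical host, port and index name):
# fetch = http_fetch("http://localhost:8098/search/query/my_index")
# query = "sub_category_id_s:test_1 AND status_s:active AND type_s:sub_category"
# sort = "first_name_s asc, last_name_s asc, _yz_id asc"
# ids = [doc["entry_id_s"] for doc in walk_cursor(fetch, query, sort)]
```

Timing each `fetch` call in such a loop would show whether the per-page latency stays roughly flat compared with the start/rows approach.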

-Fred

[1] https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results
[2] 
https://github.com/basho/yokozuna/commit/f64e19cef107d982082f5b95ed598da96fb419b0


> On Sep 19, 2016, at 4:48 PM, sean mcevoy  wrote:
> 
> Hi All,
> 
> We have an index with ~548,000 entries, ~14,000 of which match one of our 
> queries.
> We read these in a paginated search and the first page (of 100 hits) returns 
> quickly in ~70ms.
> This response time seems to increase exponentially as we walk through the 
> pages:
> the 4th page takes ~200ms,
> the 8th page takes ~1200ms
> the 12th page takes ~2100ms
> the 16th page takes ~6100ms
> the 20th page takes ~24000ms
> 
> And by the time we're searching for the 22nd page it regularly times out at 
> the default 60 seconds.
> 
> I have a good understanding of Riak KV internals but absolutely nothing of 
> Lucene which I think is what's most relevant here. If anyone in the know can 
> point me towards any relevant resource or can explain what's happening I'd be 
> much obliged :-)
> As I would also be if anyone with experience of using Riak/Lucene can tell me:
> - Is 500K a crazy number of entries to put into one index?
> - Is 14K a crazy number of entries to expect to be returned?
> - Are there any methods we can use to make the search time more constant 
> across the full search?
> I read one blog post on inlining but it was a bit old & not very obvious how 
> to implement using riakc_pb_socket calls.
> 
> And out of curiosity, do we not traverse the full range of hits for each 
> page? I naively thought that because I'm sorting the returned values we'd 
> have to get them all first and then sort, but the response times suggest 
> otherwise. Does Lucene store the data sorted by each field just in case a 
> query asks for it? Or what other magic is going on?
> 
> 
> For the technical details, we use the "_yz_default" schema and all the fields 
> stored are strings:
> - entry_id_s: unique within the DB, the aim of the query is to gather a list 
> of these
> - type_s: has one of 2 values
> - sub_category_id_s: in the query described above all 14K hits will match on 
> this, in the DB of ~500K entries there are ~43K different values for this 
> field, with each category typically having 2-6 sub categories
> - category_id_s: not matched in this query, in the DB of ~500K entries there 
> are ~13K different values for this field
> - status_s: has one of 2 values, in the query described above all hits will 
> have the value "active"
> - user_id_s: unique within the DB but not matched in this query
> - first_name_s: almost unique within the DB, this query will sort by this 
> field
> - last_name_s: almost unique within the DB, this query will sort by this field
> 
> This search query looks like:
> <<"sub_category_id_s:test_1 AND status_s:active AND type_s:sub_category">>
> 
> Our options parameter has the sort directive:
> {sort, <<"first_name_s asc, last_name_s asc">>}
> 
> The query was run on a 5-node cluster with n_val of 3.
> 
> Thanks in advance for any pointers!
> //Sean.
> 


Error starting Riak CS + Stanchion on NixOS -- lists:unzip/3 error

2016-09-19 Thread Matthew Daiter
Hey all,
I'm currently having trouble starting Stanchion on NixOS. When I attempt
to start Stanchion as the root user, the system complains that there's no
function clause matching lists:unzip/3.
There's an open issue about this on GitHub, available here:
https://github.com/basho/stanchion/issues/111
My current configuration files are here:
https://gist.github.com/mdaiter/6ce6c00077eaef23ba50820ee1b4a2b3
And I've made Nix expressions for Stanchion, Riak-CS and Basho's version of
Erlang (all of which I'd love to push to nixpkgs!) so my build system's
totally reproducible. I'm using Erlang16R2_basho8.
Other users have been having this issue as well. Any help would be greatly
appreciated!
Best,
Matthew Daiter