Re: High number of Riak buckets

2016-09-30 Thread Vikram Lalit
Hiya Alexander,

Thanks much indeed for the detailed note... very interesting insights...

As you deduced, I omitted some pieces from my email for the sake of
simplicity. I'm leveraging a transient / stateless chat server (ejabberd)
wherein messages get delivered on live sessions / streams without the
client having to do look-ups. So the storage in Riak is a post-delivery
archive, written after the client has received the messages rather than
before. Hence determining the time key for the look-up isn't going to be
an issue unless I run some analytics where I query all keys (which would
be an issue, as I now understand from your comments).

There is of course the question of offline messages, whose delivery would
depend on look-ups, but there ejabberd uses the username (the offline
storage also uses a secondary index on leveldb), and hence the timestamp
is not important. Riak TS sure looks promising there, but I'll check
further whether the change would be justified for only offline messages,
or in case other use cases crop up...

It makes sense that listing all keys in a bucket is expensive, though -
let me see how I can model my data for that!

Thanks again for your inputs... very informative...

Cheers.
Vikram


On Fri, Sep 30, 2016 at 12:23 PM, Alexander Sicular wrote:

> Hi Vikram,
>
> Bucket maximums aside, why are you modeling in this fashion? How will you
> retrieve individual keys if you don't know the timestamp in advance? Do
> you have a lookup somewhere else? That is doable with lookup keys, CRDTs,
> or other systems. Are you relying on listing all keys in a bucket?
> Definitely don't do that.
>
> Yes, there is a better way. Use Riak TS. Create a table with a composite
> primary key of topic and time. You can then retrieve by topic equality and
> time range. You can then cache those results in deterministic keys as
> necessary.
>
> If you don't already know, Riak TS is basically (there are some notable
> differences) Riak KV plus the time series data model. Riak TS makes all
> sorts of time-series-oriented projects easier than modeling them against
> KV. Oh, and you can also leverage KV buckets alongside TS (resource
> limitations notwithstanding).
>
> Would love to hear more,
> Alexander
>
> @siculars
> http://siculars.posthaven.com
>
> Sent from my iRotaryPhone
>
> > On Sep 29, 2016, at 19:42, Vikram Lalit  wrote:
> >
> > Hi - I am creating a messaging platform wherein I am modeling each topic
> > to serve as a separate bucket. That means there can potentially be
> > millions of buckets, with each message from a user becoming a value on a
> > distinct timestamp key.
> >
> > My question: is there any downside to modeling my data in such a manner?
> > Or can folks advise a better way of storing the same in Riak?
> >
> > Secondly, I would like to modify the default bucket properties (n_val) -
> > I understand that such 'custom' buckets have a higher performance
> > overhead due to the extra load on the gossip protocol. Is there a way the
> > default n_val of newly created buckets can be changed so that even if I
> > have the above-said high number of buckets, there is no performance
> > degradation? I believe there was such a config allowed in app.config, but
> > I am not sure that file is leveraged any more after riak.conf was
> > introduced.
> >
> > Thanks much.
> > ___
> > riak-users mailing list
> > riak-users@lists.basho.com
> > http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: High number of Riak buckets

2016-09-30 Thread Vikram Lalit
Hi Luke - many thanks... I was actually planning to give different bucket
types different n_val settings. Or I might end up doing so... the thinking
being that I intend to start my production workloads with fewer replicas,
but as the system matures / stabilizes (and also grows in userbase!), I
would want to increase n_val.

In the testing I did a few weeks ago, each time I tried to increase the
n_val of an existing bucket, I found conflicting results (prior question
here:
http://lists.basho.com/pipermail/riak-users_lists.basho.com/2016-July/018631.html)
- perhaps due to read-repair taking time, I'm not sure. I understood from
various Riak papers that n_val should not be decreased, but I couldn't yet
conclude why increasing it would be an issue...

So to avoid the scenario, I've been thinking that as the system criticality
increases, I would create a new bucket (with a higher n_val) and then start
pushing newer conversations on to that bucket. Still not sure how this
would behave, but let me test further with bucket types as you suggest...
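
(A minimal sketch of the routing idea above, not something from the thread:
the bucket type names chat_n2 and chat_n3, the cutover timestamp, and the
helper bucket_type_for are all hypothetical. It assumes each conversation
records its start time, so that reads and writes for an existing conversation
keep going to the bucket type it started in.)

```python
from bisect import bisect_right

# Hypothetical generations: (cutover in epoch seconds, bucket type name),
# sorted by cutover. Each bucket type would carry its own n_val.
GENERATIONS = [
    (0, "chat_n2"),              # early production, fewer replicas
    (1_480_000_000, "chat_n3"),  # conversations started at or after this cutover
]


def bucket_type_for(conversation_started_at):
    """Pick the bucket type that was current when the conversation began."""
    cutovers = [cutover for cutover, _ in GENERATIONS]
    idx = bisect_right(cutovers, conversation_started_at) - 1
    if idx < 0:
        raise ValueError("conversation predates the first generation")
    return GENERATIONS[idx][1]
```

Because the choice depends only on the stored start time, an old conversation
never moves between bucket types and no existing bucket has its n_val changed.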

Do let me know please if there's something glaring I'm missing, as I am
trying to clarify the thought process to myself as well!

Cheers.

On Fri, Sep 30, 2016 at 12:07 PM, Luke Bakken  wrote:

> Hi Vikram,
>
> If all of your buckets use the same bucket type with your custom
> n_val, there won't be a performance issue. Just be sure to set n_val
> on the bucket type, and that all buckets are part of that bucket type.
>
> http://docs.basho.com/riak/kv/2.1.4/developing/usage/bucket-types/
>
> --
> Luke Bakken
> Engineer
> lbak...@basho.com
>
> On Thu, Sep 29, 2016 at 4:42 PM, Vikram Lalit wrote:
> > Hi - I am creating a messaging platform wherein I am modeling each topic
> > to serve as a separate bucket. That means there can potentially be
> > millions of buckets, with each message from a user becoming a value on a
> > distinct timestamp key.
> >
> > My question: is there any downside to modeling my data in such a manner?
> > Or can folks advise a better way of storing the same in Riak?
> >
> > Secondly, I would like to modify the default bucket properties (n_val) -
> > I understand that such 'custom' buckets have a higher performance
> > overhead due to the extra load on the gossip protocol. Is there a way the
> > default n_val of newly created buckets can be changed so that even if I
> > have the above-said high number of buckets, there is no performance
> > degradation? I believe there was such a config allowed in app.config, but
> > I am not sure that file is leveraged any more after riak.conf was
> > introduced.
> >
> > Thanks much.
>


Re: High number of Riak buckets

2016-09-30 Thread Alexander Sicular
Hi Vikram,

Bucket maximums aside, why are you modeling in this fashion? How will you
retrieve individual keys if you don't know the timestamp in advance? Do you
have a lookup somewhere else? That is doable with lookup keys, CRDTs, or other
systems. Are you relying on listing all keys in a bucket? Definitely don't do
that.

Yes, there is a better way. Use Riak TS. Create a table with a composite 
primary key of topic and time. You can then retrieve by topic equality and time 
range. You can then cache those results in deterministic keys as necessary. 
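
(A minimal sketch of the layout described above, not taken from the thread:
the table name messages, the columns sender and body, the one-day quantum,
and the helper names range_query and cache_key are assumptions for
illustration. It assumes a Riak TS version that accepts a two-field key, and
it only builds the SQL strings and a deterministic cache key; sending them to
a cluster is left to whichever client is in use.)

```python
# Sketch only: table, column, and helper names are hypothetical.
# Only the key layout (topic + time) comes from the thread.

DAY_MS = 24 * 60 * 60 * 1000

# Riak TS DDL: rows are partitioned by topic and a one-day time quantum,
# then sorted by topic and time within each partition.
CREATE_TABLE = """
CREATE TABLE messages (
    topic  VARCHAR   NOT NULL,
    time   TIMESTAMP NOT NULL,
    sender VARCHAR   NOT NULL,
    body   VARCHAR   NOT NULL,
    PRIMARY KEY ((topic, QUANTUM(time, 1, 'd')), topic, time)
)
"""


def range_query(topic, start_ms, end_ms):
    """Build a query using topic equality and a time range (epoch ms)."""
    if "'" in topic:
        # Reject rather than guess at the escaping rules.
        raise ValueError("topic must not contain a single quote")
    if start_ms >= end_ms:
        raise ValueError("start_ms must be before end_ms")
    return (
        "SELECT * FROM messages "
        f"WHERE topic = '{topic}' "
        f"AND time >= {int(start_ms)} AND time < {int(end_ms)}"
    )


def cache_key(topic, ts_ms, window_ms=DAY_MS):
    """Deterministic KV key for the cached results of one time window."""
    window_start = ts_ms - (ts_ms % window_ms)
    return f"{topic}:{window_start}"
```

Any timestamp inside the same window maps to the same cache key, so a reader
can compute the key for a cached result without listing keys or knowing the
exact message timestamps.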

If you don't already know, Riak TS is basically (there are some notable
differences) Riak KV plus the time series data model. Riak TS makes all sorts
of time-series-oriented projects easier than modeling them against KV. Oh, and
you can also leverage KV buckets alongside TS (resource limitations
notwithstanding).

Would love to hear more,
Alexander 

@siculars
http://siculars.posthaven.com

Sent from my iRotaryPhone

> On Sep 29, 2016, at 19:42, Vikram Lalit  wrote:
> 
> Hi - I am creating a messaging platform wherein I am modeling each topic to
> serve as a separate bucket. That means there can potentially be millions of
> buckets, with each message from a user becoming a value on a distinct
> timestamp key.
>
> My question: is there any downside to modeling my data in such a manner? Or
> can folks advise a better way of storing the same in Riak?
>
> Secondly, I would like to modify the default bucket properties (n_val) - I
> understand that such 'custom' buckets have a higher performance overhead due
> to the extra load on the gossip protocol. Is there a way the default n_val of
> newly created buckets can be changed so that even if I have the above-said
> high number of buckets, there is no performance degradation? I believe there
> was such a config allowed in app.config, but I am not sure that file is
> leveraged any more after riak.conf was introduced.
> 
> Thanks much.


Re: High number of Riak buckets

2016-09-30 Thread Luke Bakken
Hi Vikram,

If all of your buckets use the same bucket type with your custom
n_val, there won't be a performance issue. Just be sure to set n_val
on the bucket type, and that all buckets are part of that bucket type.
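
(A minimal sketch of what "all buckets are part of that bucket type" looks
like from the client side, assuming the HTTP interface and a hypothetical
bucket type named chat_n2 that was already created and activated with
riak-admin bucket-type create / activate, with n_val in its props. The helper
object_url and the example host are illustrative only.)

```python
from urllib.parse import quote


def object_url(base_url, bucket_type, bucket, key):
    """Build the HTTP path for an object in a bucket under a bucket type.

    Every bucket addressed through the same type inherits that type's
    properties (including n_val), so no per-bucket custom properties are set.
    """
    parts = (quote(part, safe="") for part in (bucket_type, bucket, key))
    return "{}/types/{}/buckets/{}/keys/{}".format(base_url.rstrip("/"), *parts)


# One topic per bucket, all under the same (hypothetical) bucket type.
url = object_url("http://localhost:8098", "chat_n2", "topic-42", "1475190000")
```

The point of the design is that the topic buckets are only ever named in the
path; the replication setting lives on the bucket type alone.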

http://docs.basho.com/riak/kv/2.1.4/developing/usage/bucket-types/

--
Luke Bakken
Engineer
lbak...@basho.com

On Thu, Sep 29, 2016 at 4:42 PM, Vikram Lalit  wrote:
> Hi - I am creating a messaging platform wherein I am modeling each topic to
> serve as a separate bucket. That means there can potentially be millions of
> buckets, with each message from a user becoming a value on a distinct
> timestamp key.
>
> My question: is there any downside to modeling my data in such a manner? Or
> can folks advise a better way of storing the same in Riak?
>
> Secondly, I would like to modify the default bucket properties (n_val) - I
> understand that such 'custom' buckets have a higher performance overhead due
> to the extra load on the gossip protocol. Is there a way the default n_val
> of newly created buckets can be changed so that even if I have the above-said
> high number of buckets, there is no performance degradation? I believe there
> was such a config allowed in app.config, but I am not sure that file is
> leveraged any more after riak.conf was introduced.
>
> Thanks much.



High number of Riak buckets

2016-09-29 Thread Vikram Lalit
Hi - I am creating a messaging platform wherein I am modeling each topic to
serve as a separate bucket. That means there can potentially be millions of
buckets, with each message from a user becoming a value on a distinct
timestamp key.

My question: is there any downside to modeling my data in such a manner? Or
can folks advise a better way of storing the same in Riak?

Secondly, I would like to modify the default bucket properties (n_val) - I
understand that such 'custom' buckets have a higher performance overhead
due to the extra load on the gossip protocol. Is there a way the default
n_val of newly created buckets can be changed so that even if I have the
above-said high number of buckets, there is no performance degradation? I
believe there was such a config allowed in app.config, but I am not sure that
file is leveraged any more after riak.conf was introduced.

Thanks much.