Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Also,

Using last_write_wins = true, do I always need to send the vclock with
a PUT request? The official documentation says that Riak will look only
at the timestamp of the requests.

Best regards,


On 29 January 2014 10:29, Edgar Veiga edgarmve...@gmail.com wrote:

 Hi Russell,

 No, it doesn't depend. It's always a new value.

 Best regards


 On 29 January 2014 10:10, Russell Brown russell.br...@me.com wrote:


 On 29 Jan 2014, at 09:57, Edgar Veiga edgarmve...@gmail.com wrote:

 tl;dr

 If I guarantee that the same key is only written with a 5 second
 interval, is last_write_wins=true profitable?


 It depends. Does the value you write depend in any way on the value you
 read, or are you always writing a totally new value that
 replaces what is in Riak (regardless of what is in Riak)?



 On 27 January 2014 23:25, Edgar Veiga edgarmve...@gmail.com wrote:

 Hi there everyone!

 I would like to know whether my current application is a good use case for
 setting last_write_wins to true.

 Basically, I have a cluster of node.js workers reading from and writing to
 Riak. Each node.js worker is responsible for a set of keys, so I can
 guarantee some kind of non-distributed cache...
 The real deal here is that the write operation is not run every time an
 object is changed, but every 5 seconds in a batch insertion/update style.
 This guarantees that the same object cannot be written to Riak at
 the same time, not even in the same second; there's always a 5 second
 window between each insertion/update.

 That said, is it worthwhile for me to set last_write_wins to true? I've
 been facing some massive write delays under high load and it would be
 nice to have some way to tune Riak.

 Thanks a lot and keep up the good work!


 ___
 riak-users mailing list
 riak-users@lists.basho.com
 http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com






Re: last_write_wins

2014-01-30 Thread Russell Brown

On 30 Jan 2014, at 10:37, Edgar Veiga edgarmve...@gmail.com wrote:

 Also,
 
 Using last_write_wins = true, do I always need to send the vclock with a 
 PUT request? The official documentation says that Riak will look only at 
 the timestamp of the requests.

OK, from what you’ve said it sounds like you always want to replace what 
is at a key with the new information you are putting. If that is the case, then 
you have the perfect use case for LWW=true. And indeed, you do not need to pass 
a vclock with your put request. It also sounds like there is no need for you to 
fetch-before-put, since that is only needed to get context or resolve siblings. 
I'm curious about your use case, if you can share more.
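
As a rough illustration of the point above, here is a minimal sketch of the two HTTP requests involved: one that sets last_write_wins on a bucket, and one that stores a document without an X-Riak-Vclock header. It only builds the request descriptions and sends nothing. The host and port, the bucket name "documents", and the helper names are assumptions made for this example.

```python
import json

# Assumption: a local node listening on Riak's default HTTP port.
RIAK_HTTP = "http://127.0.0.1:8098"


def bucket_props_request(bucket, last_write_wins=True):
    """Describe the request that sets last_write_wins on a bucket."""
    return {
        "method": "PUT",
        "url": f"{RIAK_HTTP}/buckets/{bucket}/props",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"props": {"last_write_wins": last_write_wins}}),
    }


def store_request(bucket, key, document, vclock=None):
    """Describe a blind PUT; the vclock header is added only if one is given."""
    headers = {"Content-Type": "application/json"}
    if vclock is not None:
        headers["X-Riak-Vclock"] = vclock
    return {
        "method": "PUT",
        "url": f"{RIAK_HTTP}/buckets/{bucket}/keys/{key}",
        "headers": headers,
        "body": json.dumps(document),
    }


props = bucket_props_request("documents")
put = store_request("documents", "doc-42", {"title": "hello"})
```

With last_write_wins set on the bucket, the second request carries no causal context at all, which matches the "no fetch-before-put" flow described above.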

Cheers

Russell


 


Re: last_write_wins

2014-01-30 Thread Russell Brown

On 30 Jan 2014, at 10:58, Guido Medina guido.med...@temetra.com wrote:

 Hi,
 
 Now I'm curious too. According to 
 http://docs.basho.com/riak/latest/ops/advanced/configs/configuration-files/ 
 the default value for the Erlang property last_write_wins is false. Now, if 95% 
 of the buckets/keys have no siblings (or conflict resolution), does that mean 
 that for such buckets last_write_wins is set to true? I'm wondering what 
 the effect is (if any) when allow_mult on a bucket is false.
 
 In other words, could I assume that:
 If allow_mult is true then last_write_wins will be ignored, because the vclock is 
 needed for conflict resolution?
 If allow_mult is false then last_write_wins is true?

They’re independent settings, but allow_mult=true + lww=true makes no sense (in 
reality, in the code, I’m pretty sure the lww=true will be applied.)

allow_mult=false+lww=false means at each vnode there is a read-before-write, 
and causally dominated values are dropped, while sibling values are created, but 
before we write to disk (or return to the user on get) we pick the sibling with 
the highest timestamp. This means that you get _one_ of the causally concurrent 
values, the one with the largest timestamp.

allow_mult=false+lww=true means that at the coordinating vnode we just 
increment whatever vclock the put has (probably none, right?) and write it to 
disk (no read of the local value first), and downstream at the replicas, the 
same thing: just store it. I need to check, but on a get, if there are 
siblings, we just pick the highest timestamp.
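
To make the difference between the two allow_mult=false modes concrete, here is a toy model of a single vnode put. It is a sketch of the behaviour described above, not Riak's actual code: the Version type, the descends helper and the put function are hypothetical names, and vclocks are modelled as plain actor-to-counter dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    vclock: dict      # actor id -> counter (simplified vector clock)
    timestamp: float  # wall-clock time attached to the write


def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def put(stored, incoming, last_write_wins):
    """Toy vnode put for a bucket with allow_mult=false."""
    if last_write_wins or stored is None:
        # lww=true: no read of the local value, just store the new one.
        return incoming
    # lww=false: read-before-write and compare causal history.
    if descends(incoming.vclock, stored.vclock):
        return incoming  # stored value is causally dominated and dropped
    if descends(stored.vclock, incoming.vclock):
        return stored    # incoming write is stale and dropped
    # Causally concurrent: these would be siblings, but allow_mult=false
    # keeps only the one with the highest timestamp.
    return max((stored, incoming), key=lambda v: v.timestamp)


stored = Version("current", {"a": 2}, 10.0)
stale = Version("stale", {"a": 1}, 20.0)
concurrent = Version("other", {"b": 1}, 5.0)
```

In this model a stale write (older vclock, later timestamp) is rejected when lww=false but overwrites the stored value when lww=true, which is why lww=true only suits data that is always replaced wholesale.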

I really think, for Riak, 90% of the time, allow_mult=true is your best choice. 
John Daily did a truly exhaustive set of blog posts on this: 
http://basho.com/understanding-riaks-configurable-behaviors-part-1/ I highly 
recommend it. If your data is always overwritten, maybe LWW makes sense for you. 
If it is write once, read ever after, LWW is perfect.

Cheers

Russell
 Correct me if I'm wrong,
 Again, we have a very similar scenario, where we create/modify keys and we 
 are certain we have the latest version, so for us last_write_wins...
 Regards,
 
 Guido.
 
 

Re: last_write_wins

2014-01-30 Thread Guido Medina

Hi Russell,

Thanks for your response. I understand most of it, and I know LWW=true and 
allow_mult=true won't make any sense, but look at this scenario:


All of our buckets have allow_mult=false except for the one bucket we 
have for CRDT counters. Our application requires a certain level of 
consistency, so we have full control of our reads/writes using a 
fine-grained locking mechanism combined with an in-memory cache. So in our 
case, is LWW=true what we would want? We haven't touched this parameter, 
so it is at its default value.


I'm assuming it will improve performance for our case, but if we set 
LWW=true, will it affect the bucket(s) with allow_mult=true? Is it safe 
to assume that if allow_mult=true, LWW will be ignored? We only modify 
bucket properties using Riak Java client 1.4.x atm.


Also, about safety: does LWW=true use the timestamp and LWW=false the vclock? 
What is the future of both? Should we leave it untouched? We don't really want 
to use something that could jeopardise our data consistency requirement, 
even if it means better performance.


Hopefully I'm enriching the subject and not hijacking it,

Thanks,

Guido.


Re: last_write_wins

2014-01-30 Thread Edgar Veiga
I'll try to explain this the best I can; although it's a simple
architecture, I'm not describing it in my native language :)

I have a set of node.js workers (64 for now) that serve as a
cache/middleware layer for a dozen PHP applications. Each worker deals
with a set of documents (it's not a distributed cache system). Each worker
updates the documents in memory and tags them as dirty (just like the OS file
cache), and from time to time (for now, every 5 seconds)
a persister module persists those dirty documents
to Riak.
If a document isn't in memory, it will be fetched from Riak.

If you want document X, you need to ask the corresponding worker dealing
with it. Two different workers never deal with the same document.
That way we can guarantee that there will be no concurrent writes to Riak.
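
A minimal sketch of the arrangement described above, written in Python rather than node.js and with a plain dictionary standing in for Riak. The class name, method names and the injectable clock are all assumptions made for illustration; the real workers are not shown in this thread.

```python
import time


class WriteBehindCache:
    """Single-owner cache: updates stay in memory, dirty keys are flushed in batches."""

    def __init__(self, store, flush_interval=5.0, clock=time.monotonic):
        self.store = store                  # dict-like stand-in for Riak (assumption)
        self.flush_interval = flush_interval
        self.clock = clock
        self.docs = {}                      # in-memory documents owned by this worker
        self.dirty = set()                  # keys changed since the last flush
        self.last_flush = clock()

    def get(self, key):
        if key not in self.docs:
            # Cache miss: fetch the document from the backing store.
            self.docs[key] = self.store[key]
        return self.docs[key]

    def update(self, key, value):
        # Change the document in memory only and tag it as dirty.
        self.docs[key] = value
        self.dirty.add(key)

    def maybe_flush(self):
        """Persist dirty documents if the interval has elapsed; return the keys written."""
        now = self.clock()
        if now - self.last_flush < self.flush_interval:
            return []
        written = sorted(self.dirty)
        for key in written:
            self.store[key] = self.docs[key]
        self.dirty.clear()
        self.last_flush = now
        return written
```

Because each key has exactly one owning worker and is written at most once per flush, every write simply replaces the stored value, which is the blind-overwrite pattern that last_write_wins=true is meant for.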

Best Regards,






Re: last_write_wins

2014-01-30 Thread Jason Campbell
I'm not sure Riak is the best fit for this.  Riak is great for applications 
where it is the source of data, and has very strong consistency when used in 
this way.  You are using it as a cache, where Riak will be significantly slower 
than other cache solutions.  Especially since you say that each worker will 
have a set of documents it is responsible for.  Something like a local memcache 
or redis would likely suit this use case just as well, but do it much faster 
with less overhead.

Riak will guarantee 3 writes to disk (by default), where something like 
memcache or redis will stay in memory, and if local, won't have network latency 
either.  In the worst case where a node goes offline, the real data can be 
pulled from the backend again, so it isn't a big deal.  It will also simplify 
your application, because node.js can always request from cache and not worry 
about the speed, instead of maintaining its own cache layer.

I'm as happy as the next person on this list to see Riak being used for all 
sorts of uses, but I believe in the right tool for the right job.  Unless there 
is something I don't understand, Riak is probably the wrong tool.  It will 
work, but there is other software that will work much better.

I hope this helps,
Jason Campbell


Re: last_write_wins

2014-01-30 Thread Eric Redmond
Actually people use Riak as a distributed cache all the time. In fact, many 
customers use it exclusively as a cache system. Not all backends write to disk. 
Riak supports a main memory backend[1], complete with size limits and TTL.

Eric

[1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/



Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Hi!

I think there is some confusion here... I'm not using
Riak for cache purposes; it's exactly the opposite! Riak is my end
persistence system. I need to store the documents in a strong, secure,
available and consistent place. That's Riak.

As I've said before, think of the Linux file cache system as an analogy.
The node.js workers simulate that in-memory cache, the PHP applications
write to and read from them, and when something is dirty, it's persisted to
Riak...

Best regards





Re: last_write_wins

2014-01-30 Thread Eric Redmond
For clarity, I was responding to Jason's assertion that Riak shouldn't be used 
as a cache, not to your specific issue, Edgar.

Eric

On Jan 30, 2014, at 2:54 PM, Edgar Veiga edgarmve...@gmail.com wrote:

 Hi!
 
 I think that you are making some kind of confusion here... I'm not using riak 
 for cache purposes, thats exactly the opposite! Riak is my end persistence 
 system, I need to store the documents in a strong, secure, available and 
 consistent place. That's riak.
 
 It's like I've said before, just make an analogy with the linux file cache 
 system. Node.js workers simulate that in-memory cache, php applications write 
 and read from them and when something is dirty, it's persisted to riak...
 
 Best regards
 
 
 
 
 On 30 January 2014 22:26, Eric Redmond eredm...@basho.com wrote:
 Actually people use Riak as a distributed cache all the time. In fact, many 
 customers use it exclusively as a cache system. Not all backends write to 
 disk. Riak supports a main memory backend[1], complete with size limits and 
 TTL.
 
 Eric
 
 [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/
 
 
 On Jan 30, 2014, at 1:48 PM, Jason Campbell xia...@xiaclo.net wrote:
 
 I'm not sure Riak is the best fit for this.  Riak is great for applications 
 where it is the source of data, and has very strong consistency when used in 
 this way.  You are using it as a cache, where Riak will be significantly 
 slower than other cache solutions.  Especially since you say that each 
 worker will have a set of documents it is responsible for.  Something like a 
 local memcache or redis would likely suit this use case just as well, but do 
 it much faster with less overhead.
 
 Riak will guarantee 3 writes to disk (by default), where something like 
 memcache or redis will stay in memory, and if local, won't have network 
 latency either.  In the worst case where a node goes offline, the real data 
 can be pulled from the backend again, so it isn't a big deal.  It will also 
 simplify your application, because node.js can always request from cache and 
 not worry about the speed, instead of maintaining it's own cache layer.
 
 I'm as happy as the next person on this list to see Riak being used for all 
 sorts of uses, but I believe in the right tool for the right job.  Unless 
 there is something I don't understand, Riak is probably the wrong tool.  It 
 will work, but there is other software that will work much better.
 
 I hope this helps,
 Jason Campbell
 
 - Original Message -
 From: Edgar Veiga edgarmve...@gmail.com
 To: Russell Brown russell.br...@me.com
 Cc: riak-users riak-users@lists.basho.com
 Sent: Friday, 31 January, 2014 3:20:42 AM
 Subject: Re: last_write_wins
 
 
 
 I'll try to explain this the best I can, although it's a simples 
 architecture I'm not describing it in my native language :) 
 
 
 I have a set of node.js workers (64 for now) that serve as a 
 cache/middleware layer for a dozen of php applications. Each worker deals 
 with a set of documents (it's not a distributed cache system). Each worker 
 updates the documents in memory, and tags them as dirty (just like OS file 
 cache), and from time to time (for now, it's a 5 seconds window interval), a 
 persister module will deal with the persistence of those dirty documents to 
 riak. 
 If the document isn't in memory, it will be fetched from riak. 
 
 
 If you want document X, you need to ask to the corresponding worker dealing 
 with it. Two different workers, don't deal with the same document. 
 That way we can guarantee that there will be no concurrent writes to riak. 
 
 
 Best Regards, 
 
 
 
 
 
 
 
 On 30 January 2014 10:46, Russell Brown  russell.br...@me.com  wrote: 
 
 
 
 
 
 
 
 On 30 Jan 2014, at 10:37, Edgar Veiga  edgarmve...@gmail.com  wrote: 
 
 
 
 Also, 
 
 
 Using last_write_wins = true, do I need to always send the vclock while on a 
 PUT request? In the official documention it says that riak will look only at 
 the timestamp of the requests. 
 
 
 Ok, from what you’ve said it sounds like you are always wanting to replace 
 what is at a key with the new information you are putting. If that is the 
 case, then you have the perfect use case for LWW=true. And indeed, you do 
 not need to pass a vclock with your put request. And it sounds like there is 
 no need for you to fetch-before-put since that is only to get context 
 /resolve siblings. Curious about your use case if you can share more. 
 
 
 Cheers 
 
 
 Russell 
 
 
 
 
 
 
 
 
 
 
 Best regards, 
 
 
 
 On 29 January 2014 10:29, Edgar Veiga  edgarmve...@gmail.com  wrote: 
 
 
 
 Hi Russel, 
 
 
 No, it doesn't depend. It's always a new value. 
 
 
 Best regards 
 
 
 
 
 
 On 29 January 2014 10:10, Russell Brown  russell.br...@me.com  wrote: 
 
 
 
 
 
 
 
 On 29 Jan 2014, at 09:57, Edgar Veiga  edgarmve...@gmail.com  wrote: 
 
 
 
 tl;dr 
 
 
 If I guarantee that the same key is only written with a 5 second interval, 
 is last_write_wins=true profitable

Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Yes Eric, I understood :)


On 30 January 2014 23:00, Eric Redmond eredm...@basho.com wrote:

 For clarity, I was responding to Jason's assertion that Riak shouldn't be
 used as a cache, not to your specific issue, Edgar.

 Eric

 On Jan 30, 2014, at 2:54 PM, Edgar Veiga edgarmve...@gmail.com wrote:

 Hi!

 I think that you are making some kind of confusion here... I'm not using
 riak for cache purposes, thats exactly the opposite! Riak is my end
 persistence system, I need to store the documents in a strong, secure,
 available and consistent place. That's riak.

 It's like I've said before, just make an analogy with the linux file cache
 system. Node.js workers simulate that in-memory cache, php applications
 write and read from them and when something is dirty, it's persisted to
 riak...

 Best regards




 On 30 January 2014 22:26, Eric Redmond eredm...@basho.com wrote:

 Actually people use Riak as a distributed cache all the time. In fact,
 many customers use it exclusively as a cache system. Not all backends write
 to disk. Riak supports a main memory backend[1], complete with size limits
 and TTL.

 Eric

 [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/


 On Jan 30, 2014, at 1:48 PM, Jason Campbell xia...@xiaclo.net wrote:

 I'm not sure Riak is the best fit for this.  Riak is great for
 applications where it is the source of data, and has very strong
 consistency when used in this way.  You are using it as a cache, where Riak
 will be significantly slower than other cache solutions.  Especially since
 you say that each worker will have a set of documents it is responsible
 for.  Something like a local memcache or redis would likely suit this use
 case just as well, but do it much faster with less overhead.

 Riak will guarantee 3 writes to disk (by default), where something like
 memcache or redis will stay in memory, and if local, won't have network
 latency either.  In the worst case where a node goes offline, the real data
 can be pulled from the backend again, so it isn't a big deal.  It will also
 simplify your application, because node.js can always request from cache
 and not worry about the speed, instead of maintaining it's own cache layer.

 I'm as happy as the next person on this list to see Riak being used for
 all sorts of uses, but I believe in the right tool for the right job.
  Unless there is something I don't understand, Riak is probably the wrong
 tool.  It will work, but there is other software that will work much better.

 I hope this helps,
 Jason Campbell

 - Original Message -
 From: Edgar Veiga edgarmve...@gmail.com
 To: Russell Brown russell.br...@me.com
 Cc: riak-users riak-users@lists.basho.com
 Sent: Friday, 31 January, 2014 3:20:42 AM
 Subject: Re: last_write_wins



 I'll try to explain this the best I can, although it's a simples
 architecture I'm not describing it in my native language :)


 I have a set of node.js workers (64 for now) that serve as a
 cache/middleware layer for a dozen of php applications. Each worker deals
 with a set of documents (it's not a distributed cache system). Each worker
 updates the documents in memory, and tags them as dirty (just like OS file
 cache), and from time to time (for now, it's a 5 seconds window interval),
 a persister module will deal with the persistence of those dirty documents
 to riak.
 If the document isn't in memory, it will be fetched from riak.


 If you want document X, you need to ask to the corresponding worker
 dealing with it. Two different workers, don't deal with the same document.
 That way we can guarantee that there will be no concurrent writes to
 riak.


 Best Regards,







 On 30 January 2014 10:46, Russell Brown  russell.br...@me.com  wrote:







 On 30 Jan 2014, at 10:37, Edgar Veiga  edgarmve...@gmail.com  wrote:



 Also,


 Using last_write_wins = true, do I need to always send the vclock while
 on a PUT request? In the official documention it says that riak will look
 only at the timestamp of the requests.


 Ok, from what you've said it sounds like you are always wanting to
 replace what is at a key with the new information you are putting. If that
 is the case, then you have the perfect use case for LWW=true. And indeed,
 you do not need to pass a vclock with your put request. And it sounds like
 there is no need for you to fetch-before-put since that is only to get
 context /resolve siblings. Curious about your use case if you can share
 more.


 Cheers


 Russell










 Best regards,



 On 29 January 2014 10:29, Edgar Veiga  edgarmve...@gmail.com  wrote:



 Hi Russel,


 No, it doesn't depend. It's always a new value.


 Best regards





 On 29 January 2014 10:10, Russell Brown  russell.br...@me.com  wrote:







 On 29 Jan 2014, at 09:57, Edgar Veiga  edgarmve...@gmail.com  wrote:



 tl;dr


 If I guarantee that the same key is only written with a 5 second
 interval, is last_write_wins=true profitable?

 It depends. Does the value you write depend

Re: last_write_wins

2014-01-30 Thread Edgar Veiga
Here's a (bad) mockup of the solution:

https://cloudup.com/cOMhcPry38U

Hope that this time I've made myself a little more clear :)
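
For anyone who cannot open the mockup, the pattern described in the quoted
messages below (workers keep documents in memory, tag them as dirty, and a
persister flushes the dirty ones every 5 seconds) can be roughly sketched as
follows. This is only an illustration in Python rather than node.js: the class
and method names are made up, and a plain dict stands in for riak.

```python
class WriteBehindCache:
    """In-memory document cache that persists dirty entries in batches."""

    def __init__(self, store: dict):
        self.store = store   # hypothetical stand-in for riak
        self.docs = {}       # documents currently held in memory
        self.dirty = set()   # keys changed since the last flush

    def get(self, key):
        # On a miss, fetch the document from the backing store.
        if key not in self.docs:
            self.docs[key] = self.store.get(key)
        return self.docs[key]

    def put(self, key, value):
        # Update in memory only and tag the document as dirty.
        self.docs[key] = value
        self.dirty.add(key)

    def flush(self) -> int:
        """Persist every dirty document once; meant to run on a ~5 s timer."""
        flushed = 0
        for key in sorted(self.dirty):
            self.store[key] = self.docs[key]
            flushed += 1
        self.dirty.clear()
        return flushed


store = {"doc1": "v0"}
cache = WriteBehindCache(store)
cache.put("doc1", "v1")
cache.put("doc1", "v2")          # two in-memory updates, still one dirty key
before_flush = store["doc1"]     # the store has not been touched yet
written = cache.flush()          # one write reaches the store
```

Because each key belongs to exactly one worker and is written at most once
per flush, the store never sees two writers for the same key.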

Regards


On 30 January 2014 23:04, Edgar Veiga edgarmve...@gmail.com wrote:

 Yes Eric, I understood :)


 On 30 January 2014 23:00, Eric Redmond eredm...@basho.com wrote:

 For clarity, I was responding to Jason's assertion that Riak shouldn't be
 used as a cache, not to your specific issue, Edgar.

 Eric

 On Jan 30, 2014, at 2:54 PM, Edgar Veiga edgarmve...@gmail.com wrote:

 Hi!

 I think that you are making some kind of confusion here... I'm not using
 riak for cache purposes, thats exactly the opposite! Riak is my end
 persistence system, I need to store the documents in a strong, secure,
 available and consistent place. That's riak.

 It's like I've said before, just make an analogy with the linux file
 cache system. Node.js workers simulate that in-memory cache, php
 applications write and read from them and when something is dirty, it's
 persisted to riak...

 Best regards




 On 30 January 2014 22:26, Eric Redmond eredm...@basho.com wrote:

 Actually people use Riak as a distributed cache all the time. In fact,
 many customers use it exclusively as a cache system. Not all backends write
 to disk. Riak supports a main memory backend[1], complete with size limits
 and TTL.

 Eric

 [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/


 On Jan 30, 2014, at 1:48 PM, Jason Campbell xia...@xiaclo.net wrote:

 I'm not sure Riak is the best fit for this.  Riak is great for
 applications where it is the source of data, and has very strong
 consistency when used in this way.  You are using it as a cache, where Riak
 will be significantly slower than other cache solutions.  Especially since
 you say that each worker will have a set of documents it is responsible
 for.  Something like a local memcache or redis would likely suit this use
 case just as well, but do it much faster with less overhead.

 Riak will guarantee 3 writes to disk (by default), where something like
 memcache or redis will stay in memory, and if local, won't have network
 latency either.  In the worst case where a node goes offline, the real data
 can be pulled from the backend again, so it isn't a big deal.  It will also
 simplify your application, because node.js can always request from cache
 and not worry about the speed, instead of maintaining it's own cache layer.

 I'm as happy as the next person on this list to see Riak being used for
 all sorts of uses, but I believe in the right tool for the right job.
  Unless there is something I don't understand, Riak is probably the wrong
 tool.  It will work, but there is other software that will work much better.

 I hope this helps,
 Jason Campbell

 - Original Message -
 From: Edgar Veiga edgarmve...@gmail.com
 To: Russell Brown russell.br...@me.com
 Cc: riak-users riak-users@lists.basho.com
 Sent: Friday, 31 January, 2014 3:20:42 AM
 Subject: Re: last_write_wins



 I'll try to explain this the best I can, although it's a simples
 architecture I'm not describing it in my native language :)


 I have a set of node.js workers (64 for now) that serve as a
 cache/middleware layer for a dozen of php applications. Each worker deals
 with a set of documents (it's not a distributed cache system). Each worker
 updates the documents in memory, and tags them as dirty (just like OS file
 cache), and from time to time (for now, it's a 5 seconds window interval),
 a persister module will deal with the persistence of those dirty documents
 to riak.
 If the document isn't in memory, it will be fetched from riak.


 If you want document X, you need to ask to the corresponding worker
 dealing with it. Two different workers, don't deal with the same document.
 That way we can guarantee that there will be no concurrent writes to
 riak.


 Best Regards,







 On 30 January 2014 10:46, Russell Brown  russell.br...@me.com  wrote:







 On 30 Jan 2014, at 10:37, Edgar Veiga  edgarmve...@gmail.com  wrote:



 Also,


 Using last_write_wins = true, do I need to always send the vclock while
 on a PUT request? In the official documention it says that riak will look
 only at the timestamp of the requests.


 Ok, from what you've said it sounds like you are always wanting to
 replace what is at a key with the new information you are putting. If that
 is the case, then you have the perfect use case for LWW=true. And indeed,
 you do not need to pass a vclock with your put request. And it sounds like
 there is no need for you to fetch-before-put since that is only to get
 context /resolve siblings. Curious about your use case if you can share
 more.


 Cheers


 Russell










 Best regards,



 On 29 January 2014 10:29, Edgar Veiga  edgarmve...@gmail.com  wrote:



 Hi Russel,


 No, it doesn't depend. It's always a new value.


 Best regards





 On 29 January 2014 10:10, Russell Brown  russell.br...@me.com  wrote:







 On 29 Jan 2014, at 09:57

Re: last_write_wins

2014-01-30 Thread Edgar Veiga
No problem Jason, I'm glad you've tried to help :)

I'm using the leveldb backend, and the system has been running in production
for about 6 months. It's been quite an interesting experience, but now that
the load is getting bigger, and the amount of data in riak too, we need to
start tuning these little things.

Best regards!


On 30 January 2014 23:17, Jason Campbell xia...@xiaclo.net wrote:

 Oh, I completely misunderstood, I'm sorry for that.  I was thinking of
 your application as a typical web application which could regenerate the
 data at any time (making that the authoritative source, not Riak).

 In that case, Riak does sound perfect, but I would definitely not use the
 memory backend if that is the only copy of the data.

 Eric, I'm sorry if I made it sound like Riak is a poor cache in all
 situations, I just didn't think it fit here (although I clearly
 misunderstood).  There is a tradeoff between speed and
 consistency/reliability, and the whole application has to take advantage of
 the extra consistency and reliability for it to make sense.

 Sorry again,
 Jason Campbell

 - Original Message -
 From: Edgar Veiga edgarmve...@gmail.com
 To: Eric Redmond eredm...@basho.com
 Cc: Jason Campbell xia...@xiaclo.net, riak-users 
 riak-users@lists.basho.com, Russell Brown russell.br...@me.com
 Sent: Friday, 31 January, 2014 9:54:33 AM
 Subject: Re: last_write_wins


 Hi!


 I think that you are making some kind of confusion here... I'm not using
 riak for cache purposes, thats exactly the opposite! Riak is my end
 persistence system, I need to store the documents in a strong, secure,
 available and consistent place. That's riak.


 It's like I've said before, just make an analogy with the linux file cache
 system. Node.js workers simulate that in-memory cache, php applications
 write and read from them and when something is dirty, it's persisted to
 riak...


 Best regards







 On 30 January 2014 22:26, Eric Redmond  eredm...@basho.com  wrote:




 Actually people use Riak as a distributed cache all the time. In fact,
 many customers use it exclusively as a cache system. Not all backends write
 to disk. Riak supports a main memory backend[1], complete with size limits
 and TTL.


 Eric


 [1]: http://docs.basho.com/riak/latest/ops/advanced/backends/memory/






 On Jan 30, 2014, at 1:48 PM, Jason Campbell  xia...@xiaclo.net  wrote:


 I'm not sure Riak is the best fit for this. Riak is great for applications
 where it is the source of data, and has very strong consistency when used
 in this way. You are using it as a cache, where Riak will be significantly
 slower than other cache solutions. Especially since you say that each
 worker will have a set of documents it is responsible for. Something like a
 local memcache or redis would likely suit this use case just as well, but
 do it much faster with less overhead.

 Riak will guarantee 3 writes to disk (by default), where something like
 memcache or redis will stay in memory, and if local, won't have network
 latency either. In the worst case where a node goes offline, the real data
 can be pulled from the backend again, so it isn't a big deal. It will also
 simplify your application, because node.js can always request from cache
 and not worry about the speed, instead of maintaining it's own cache layer.

 I'm as happy as the next person on this list to see Riak being used for
 all sorts of uses, but I believe in the right tool for the right job.
 Unless there is something I don't understand, Riak is probably the wrong
 tool. It will work, but there is other software that will work much better.

 I hope this helps,
 Jason Campbell

 - Original Message -
 From: Edgar Veiga  edgarmve...@gmail.com 
 To: Russell Brown  russell.br...@me.com 
 Cc: riak-users  riak-users@lists.basho.com 
 Sent: Friday, 31 January, 2014 3:20:42 AM
 Subject: Re: last_write_wins



 I'll try to explain this the best I can, although it's a simples
 architecture I'm not describing it in my native language :)


 I have a set of node.js workers (64 for now) that serve as a
 cache/middleware layer for a dozen of php applications. Each worker deals
 with a set of documents (it's not a distributed cache system). Each worker
 updates the documents in memory, and tags them as dirty (just like OS file
 cache), and from time to time (for now, it's a 5 seconds window interval),
 a persister module will deal with the persistence of those dirty documents
 to riak.
 If the document isn't in memory, it will be fetched from riak.


 If you want document X, you need to ask to the corresponding worker
 dealing with it. Two different workers, don't deal with the same document.
 That way we can guarantee that there will be no concurrent writes to riak.


 Best Regards,







 On 30 January 2014 10:46, Russell Brown  russell.br...@me.com  wrote:







 On 30 Jan 2014, at 10:37, Edgar Veiga  edgarmve...@gmail.com  wrote:



 Also,


 Using last_write_wins = true, do I need to always send

Re: last_write_wins

2014-01-30 Thread John Daily
Replies inline.

(Thanks to Russell for the link to my blog series, but honestly, as I now 
re-read the section on conflict resolution, I’m unhappy with it. It’s a very 
confusing topic and I regret not doing a better job of clarifying it. This 
answer will undoubtedly also be more confusing than I’d like.)

On Jan 30, 2014, at 9:53 AM, Guido Medina guido.med...@temetra.com wrote:

 All of our buckets have allow_mult=false except for the one bucket we have 
 for CRDT counters. Our application requires a certain level of consistency, 
 so we have full control of our reads/writes using a fine-grained locking 
 mechanism combined with an in-memory cache. In our case, is LWW=true what 
 we would want? We haven't touched this parameter, so it is at its 
 default value.

It’s a bit confusing to refer to “LWW” because the “last write wins” strategy 
is often referred to as LWW, and separately we have a last_write_wins 
configuration parameter, and they’re not the same thing. I’m going to stick to 
last_write_wins to be explicit when I’m referring to the parameter, and “last 
write wins” when referring to the strategy. (Informally I often refer to LWW as 
the strategy and lww as the parameter, but I’ll spare you that casual pedantry 
here.)

The “last write wins” strategy comes into play whenever allow_mult is set to 
false, regardless of the value of last_write_wins.

Setting last_write_wins=true when allow_mult=false will optimize Bitcask[1] 
“put” requests so that they do not bother reading any existing value to 
compare vector clocks. However, if servers are offline or there are network 
partitions during the put operation, then when read repair or active 
anti-entropy is invoked later, the vector clock (including server timestamp) 
will be used to guess[2] which version of the object is the “last.”


If you can truly guarantee serialization at the application layer and you can 
guarantee that no two updates to a single value will occur within the 
worst-case clock skew across your cluster, then the “last write wins” strategy 
is reasonable. If you have any doubt about either and data safety is important, 
you really should set allow_mult=true and deal with siblings.
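
To make the clock skew condition concrete, here is a small Python sketch. It 
is a toy model, not Riak code: the Write class, the function names, and the 
numbers are all invented for illustration. It shows an older write beating a 
newer one when the skew exceeds the gap between the two writes.

```python
from dataclasses import dataclass


@dataclass
class Write:
    value: str
    real_time: float     # when the write actually happened (seconds)
    clock_offset: float  # how far the stamping server's clock is off (seconds)

    @property
    def stamped_time(self) -> float:
        # The timestamp as recorded by a server with a skewed clock.
        return self.real_time + self.clock_offset


def last_write_wins(a: Write, b: Write) -> Write:
    """Pick the write with the larger recorded timestamp."""
    return a if a.stamped_time >= b.stamped_time else b


def lww_is_safe(min_write_interval: float, worst_case_skew: float) -> bool:
    """Writes to one key must be spaced further apart than the worst skew."""
    return min_write_interval > worst_case_skew


# Two writes 5 seconds apart; the first is stamped by a server running 7 s fast.
older = Write("v1", real_time=100.0, clock_offset=7.0)
newer = Write("v2", real_time=105.0, clock_offset=0.0)

winner = last_write_wins(older, newer)  # the stale "v1" wins on timestamps
```

With a 5 second write interval, lww_is_safe(5.0, 7.0) is False here, while a 
cluster kept within half a second by NTP would give lww_is_safe(5.0, 0.5) as 
True.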

Unfortunately, worst-case clock skew across a cluster can be pretty bad. NTP 
will typically keep it under control, but it’s all too easy for both NTP and 
your monitoring of NTP to be broken.

 
 I'm assuming it will improve performance for our case, but if we set 
 LWW=true, will it affect the bucket(s) with allow_mult=true? Is it safe to 
 assume that if allow_mult=true, LWW will be ignored? We only modify bucket 
 properties using Riak Java client 1.4.x atm.

No, it’s definitely not safe. If you set your cluster default to 
last_write_wins=true, you should explicitly set your allow_mult=true buckets to 
last_write_wins=false using the Java client. As Russell indicated, the behavior 
if both allow_mult and last_write_wins are set to true is undefined and not 
guaranteed at all to be what you want, regardless of the current state of the 
code.
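
As a hedged illustration of setting both properties explicitly, the sketch 
below builds a bucket-properties request in Python. It assumes the HTTP 
interface (a PUT to /buckets/BUCKET/props with a JSON "props" object, as I 
recall it from the Riak HTTP API docs) rather than the Java client, and the 
host, port, and bucket name are placeholders. Please check it against the 
documentation for your Riak version before relying on it.

```python
import json


def bucket_props_request(host: str, port: int, bucket: str,
                         allow_mult: bool, last_write_wins: bool):
    """Build (method, url, body) for a bucket-properties update.

    Nothing is sent; the caller decides how to issue the request.
    """
    # Refuse the combination whose behavior is undefined.
    if allow_mult and last_write_wins:
        raise ValueError(
            "allow_mult=true with last_write_wins=true is undefined")
    url = f"http://{host}:{port}/buckets/{bucket}/props"
    body = json.dumps({"props": {"allow_mult": allow_mult,
                                 "last_write_wins": last_write_wins}})
    return "PUT", url, body


# Hypothetical sibling-enabled bucket: switch last_write_wins off explicitly.
method, url, body = bucket_props_request(
    "127.0.0.1", 8098, "counters", allow_mult=True, last_write_wins=False)
```

The request would be sent with a Content-Type of application/json.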

 
 Also, about safety: does LWW=true use the timestamp and LWW=false the vclock? 
 What is the future of both? Should we leave it untouched? We don't really 
 want to use something that could jeopardise our data consistency requirement 
 even if it means better performance.

Vector clocks (with embedded server timestamps) are used to help Riak decide 
what to do about data inconsistencies regardless of the configuration settings. 
Riak generates (or updates) vector clocks with each put.
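
For readers unfamiliar with vector clocks, here is a deliberately simplified 
Python model of how two of them are compared. It is not Riak's 
implementation: a clock here is just a dict of actor name to counter, with 
no embedded timestamps, and the function names are my own.

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: dict, b: dict) -> str:
    """Classify the relationship between two vector clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    # Neither has seen all of the other's events: a genuine conflict.
    return "concurrent"


ordered = compare({"node1": 2, "node2": 1}, {"node1": 1, "node2": 1})
conflict = compare({"node1": 2}, {"node2": 1})
```

Only the "concurrent" case needs a tie-breaker. With allow_mult=false that 
tie-breaker is the timestamp, which is where clock skew becomes a risk.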

-John


[1] Why does last_write_wins=true only really impact Bitcask writes? If the 
backend supports 2i (Memory or LevelDB currently) then we have to read the old 
value from disk to determine whether any indexes need to be updated when that 
value is replaced.

[2] You could use the word “determine” here, but given the inherent 
unreliability of server clocks, it’s just as accurate to say “guess.”


___
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com


Re: last_write_wins

2014-01-29 Thread Edgar Veiga
tl;dr

If I guarantee that the same key is written at most once every 5 seconds,
is last_write_wins=true worthwhile?


On 27 January 2014 23:25, Edgar Veiga edgarmve...@gmail.com wrote:

 Hi there everyone!

 I would like to know, if my current application is a good use case to set
 last_write_wins to true.

 Basically I have a cluster of node.js workers reading and writing to riak.
 Each node.js worker is responsible for a set of keys, so I can guarantee
 some kind of non distributed cache...
 The real deal here is that the writing operation is not run evertime an
 object is changed but each 5 seconds in a batch insertion/update style.
 This brings the guarantee that the same object cannot be write to riak at
 the same time, not event at the same seconds, there's always a 5 second
 window between each insertion/update.

 That said, is it profitable to me if I set last_write_wins to true? I've
 been facing some massive writting delays under high loads and it would be
 nice if I have some kind of way to tune riak.

 Thanks a lot and keep up the good work!




Re: last_write_wins

2014-01-29 Thread Russell Brown

On 29 Jan 2014, at 09:57, Edgar Veiga edgarmve...@gmail.com wrote:

 tl;dr
 
 If I guarantee that the same key is only written with a 5 second interval, is 
 last_write_wins=true profitable?

It depends. Does the value you write depend in any way on the value you read, 
or are you always writing a totally new value that replaces what is in Riak 
(regardless of what is in Riak)?
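
The reason the distinction matters can be shown with a toy Python example, 
using a plain dict as a stand-in for a store where the last write simply 
replaces the previous one. The keys and values are invented for 
illustration.

```python
store = {"counter": 10}

# Read-modify-write: two clients read the same value, then both write back.
seen_by_a = store["counter"]
seen_by_b = store["counter"]
store["counter"] = seen_by_a + 1   # client A
store["counter"] = seen_by_b + 1   # client B overwrites A's result

lost_update = store["counter"]     # one of the two increments is lost

# Blind overwrite: each write carries the complete new value, so keeping
# only the latest one is exactly what the writer intended.
store["doc"] = {"rev": "from A"}
store["doc"] = {"rev": "from B"}
```

In the first case an update silently disappears; in the second nothing of 
value is lost, which is why the answer depends on how the value is derived.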

 
 
 On 27 January 2014 23:25, Edgar Veiga edgarmve...@gmail.com wrote:
 Hi there everyone!
 
 I would like to know, if my current application is a good use case to set 
 last_write_wins to true.
 
 Basically I have a cluster of node.js workers reading and writing to riak. 
 Each node.js worker is responsible for a set of keys, so I can guarantee some 
 kind of non distributed cache... 
 The real deal here is that the writing operation is not run evertime an 
 object is changed but each 5 seconds in a batch insertion/update style. 
 This brings the guarantee that the same object cannot be write to riak at the 
 same time, not event at the same seconds, there's always a 5 second window 
 between each insertion/update.
 
 That said, is it profitable to me if I set last_write_wins to true? I've been 
 facing some massive writting delays under high loads and it would be nice if 
 I have some kind of way to tune riak.
 
 Thanks a lot and keep up the good work!
 
 



Re: last_write_wins

2014-01-29 Thread Edgar Veiga
Hi Russell,

No, it doesn't depend. It's always a new value.

Best regards


On 29 January 2014 10:10, Russell Brown russell.br...@me.com wrote:


 On 29 Jan 2014, at 09:57, Edgar Veiga edgarmve...@gmail.com wrote:

 tl;dr

 If I guarantee that the same key is only written with a 5 second interval,
 is last_write_wins=true profitable?


 It depends. Does the value you write depend in anyway on the value you
 read, or is it always that you are just getting a totally new value that
 replaces what is in Riak (regardless what is in Riak)?



 On 27 January 2014 23:25, Edgar Veiga edgarmve...@gmail.com wrote:

 Hi there everyone!

 I would like to know, if my current application is a good use case to set
 last_write_wins to true.

 Basically I have a cluster of node.js workers reading and writing to
 riak. Each node.js worker is responsible for a set of keys, so I can
 guarantee some kind of non distributed cache...
 The real deal here is that the writing operation is not run evertime an
 object is changed but each 5 seconds in a batch insertion/update style.
 This brings the guarantee that the same object cannot be write to riak at
 the same time, not event at the same seconds, there's always a 5 second
 window between each insertion/update.

 That said, is it profitable to me if I set last_write_wins to true? I've
 been facing some massive writting delays under high loads and it would be
 nice if I have some kind of way to tune riak.

 Thanks a lot and keep up the good work!




