It really is not a good idea to use siblings to represent 1-to-many relations. 
That's not what it's intended for, nor what it's optimized for...
Can you tell us exactly why you need Bitcask rather than LevelDB? Secondary 
indexes (2i) would probably do it.
Otherwise, storing a list of items under each key could be a solution, 
depending of course on the number of items per key. (But do perform conflict 
resolution.)
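
A minimal sketch of that list-per-key idea, under some assumptions: each 
sibling has already been fetched and decoded into a list of item ids, and a 
plain union is an acceptable merge for your data. The class and method names 
are hypothetical, and the actual fetch/store calls of the Riak client are left 
out on purpose.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: merges sibling values that each hold a list of item ids. */
final class SiblingListMerge {

    /**
     * Resolves siblings by taking the union of their item lists.
     * Items keep the order of their first appearance; duplicates are dropped.
     * Assumption: items are only ever added. A plain union cannot express
     * removals, so a removed item would reappear after a merge.
     */
    static List<String> resolve(List<List<String>> siblings) {
        Set<String> merged = new LinkedHashSet<>();
        for (List<String> sibling : siblings) {
            merged.addAll(sibling);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Two concurrent writers each appended one item to the same key.
        List<List<String>> siblings = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("b", "c"));
        System.out.println(resolve(siblings)); // prints [a, b, c]
    }
}
```

The merged list would then be written back together with the vector clock 
returned by the fetch, so that the siblings collapse into a single value 
instead of piling up.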
/Erik



-------- Original message --------
From: Viable Nisei <vsni...@gmail.com>
Date:
To: riak-users@lists.basho.com
Subject: May allow_mult cause DoS?


Hi.

Recently we discovered some unexpected behavior. We are using 
Riak 1.4.2 with some buckets set to allow_mult=true.
We tested our app under load and found that concurrent writes into a 
bucket with allow_mult turn Riak into an unresponsive slowpoke and can even 
crash it.

A Core i3 with 4GB RAM performs only 20 writes/sec with 5 client threads 
writing 20 short strings into 20 keys in a bucket with allow_mult=true, 
search=false. With 40 values per 40 keys it performs only 6 writes/sec. 60x60 
causes Riak to crash.
Throughput drops drastically. OK, we did not change the concurrency factor (5) 
and increased our data set 4x, but why does throughput drop?
Even if we increase our dataset linearly (20 strings * 20 keys, 40 strings * 20 
keys, 60 strings * 20 keys...), the results are the same: an exponential 
throughput drop with a crash at the end.

A cluster of five Amazon EC2 cc2.8xlarge nodes becomes unresponsive, with 
throughput of 1-5 writes/sec, at only 80-100 values per 1-10 keys.

So, we think it is very strange.

Here you can check our code sample (in Java) reproducing this behavior: 
https://bitbucket.org/vsnisei/riak-allow_mult_wtf

So, we asked Basho about this, but they said that "we think SQLish" and 
asked us for $5k for a 2-day consultation to resolve our problem.
So, I've decided to ask here whether we are really so stupid that we can't 
understand some simple things, or whether Basho didn't understand us correctly.

Anyway, it looks like a DoS/DDoS attack exploiting this behavior could be 
devised. One would only need to know that some service/application/website is 
using Riak with allow_mult buckets and then provoke concurrent writes into 
them...

Actually, our question to Basho was broader. Our application needs to implement 
1-to-many bindings. According to the documentation, Riak allows the following 
approaches to simulate such bindings:

 1.  Riak Search. But we've found that it's VERY slow (a 20x performance drop 
when search is enabled, even for simple objects like {source_id: xxx, 
target_id: yyy}). We've also found that search does not really scale: adding 
new nodes to the cluster does not increase throughput and even slows the 
cluster down.
 2.  Secondary indexes. But, according to the docs, they work only on 
LevelDB, and we need Bitcask.
 3.  Link walking. But, according to the docs, it's a "rest only operation", 
and in the Java driver it's implemented as a hack.
 4.  allow_mult. But we've found that it's just a nightmare. We told Basho 
about this and gave them a link to our example, but they didn't give us any 
feedback.
 5.  Bucket key enumeration. But, according to the docs, this operation causes 
a full key scan on each node and must not be used in production.
 6.  MapReduce queries. OK, we haven't tried them yet; maybe they really are a 
silver bullet. But according to the docs (and common sense), MapReduce causes a 
full scan (of the bucket at least, or of all keys?) and is an operation with 
unpredictable latency.

So, where are we wrong? Is the behavior I've described expected? Have we 
misunderstood Riak completely, so that we should pay $5k for some 
mind-expansion, or is there no hidden mystical knowledge, and they would not 
tell us anything beyond the approaches listed above?

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
