It really is not a good idea to use siblings to represent 1-to-many relations. That's not what they are intended for, nor what they are optimized for... Can you tell us exactly why you need Bitcask rather than LevelDB? Secondary indexes (2i) would probably do it. Otherwise, storing a list of items under each key could be a solution, depending of course on the number of items per key. (But do perform conflict resolution.)

/Erik
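To make the "list of items under each key, with conflict resolution" suggestion concrete, here is a minimal sketch in plain Java. It assumes each sibling value has already been decoded into a list of target ids; the class name SiblingMerge and the method resolve are hypothetical and not part of any Riak client library. It merges all siblings by set union, so that a single resolved value can be written back with the vector clock from the read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of any Riak client API.
final class SiblingMerge {
    private SiblingMerge() {}

    /**
     * Merges sibling values (each a list of target ids stored under one key)
     * into a single list by set union. Duplicates collapse, and the result is
     * sorted so that every client resolves the same siblings the same way.
     */
    static List<String> resolve(List<List<String>> siblings) {
        Set<String> merged = new TreeSet<>();
        for (List<String> sibling : siblings) {
            merged.addAll(sibling);
        }
        return new ArrayList<>(merged);
    }

    public static void main(String[] args) {
        // Two concurrent writers each appended one id to the same key.
        List<List<String>> siblings = Arrays.asList(
                Arrays.asList("t1", "t2"),
                Arrays.asList("t2", "t3"));
        System.out.println(resolve(siblings)); // prints [t1, t2, t3]
    }
}
```

Note that a plain union cannot express removals: an id deleted in one sibling reappears if another sibling still holds it. If removals matter, the stored value would need tombstones or a similar scheme, which this sketch does not cover.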
-------- Original message --------
From: Viable Nisei <vsni...@gmail.com>
Date:
To: riak-users@lists.basho.com
Subject: May allow_mult cause DoS?

Hi.

Recently we've discovered that something is behaving unexpectedly. We are using Riak 1.4.2 with some buckets set to allow_mult=true. We tested our app under load and found that concurrent writes into a bucket with allow_mult turn Riak into an unresponsive slowpoke and can even crash it.

A Core i3 with 4 GB RAM performs only 20 writes/sec with 5 client threads writing 20 short strings into 20 keys in a bucket with allow_mult=true, search=false. With 40 values per 40 keys it performs only 6 writes/sec. 60x60 causes Riak to crash.

Throughput drops drastically. OK, we did not change the concurrency factor (5) and increased our data set 4x, but why does throughput drop? OK, we increase our data set linearly: 20 strings * 20 keys, 40 strings * 20 keys, 60 strings * 20 keys... The result is the same: an exponential throughput drop with a crash at the end.

A cluster of five Amazon EC2 cc2.8xlarge nodes becomes unresponsive, with throughput of 1-5 writes/sec, at only 80-100 values per 1-10 keys.

So, we think this is very strange. Here you can check our code sample (in Java) reproducing this behavior: https://bitbucket.org/vsnisei/riak-allow_mult_wtf

We asked Basho about this, but they said that "we think SQLish" and asked us for $5k for a 2-day consultation to resolve our problem. So I've decided to ask here: are we really so stupid that we can't understand some simple things, or did Basho not understand us correctly?

Anyway, it looks like a DoS/DDoS attack exploiting this behavior could be devised. An attacker would only need to know that some service/application/website uses Riak with allow_mult buckets, and then provoke concurrent writes into them...

Actually, our question to Basho was broader. Our application needs to implement 1-to-many bindings. According to the documentation, Riak allows the following approaches to simulate such bindings:

1. Riak Search. But we've found that it's VERY slow (a 20x performance drop when search is enabled, even for simple objects like {source_id: xxx, target_id: yyy}). We've also found that search is not really scalable: adding new nodes to the cluster does not increase throughput, and even slows the cluster down...
2. Secondary indexes. But, according to the docs, they work only on LevelDB, and we need Bitcask.
3. Link walking. But, according to the docs, it's a "REST only operation", and in the Java driver it's implemented as a hack.
4. allow_mult. But we've found that it's just a nightmare. We told Basho about this and gave them a link to our example, but they didn't give us any feedback.
5. Bucket key enumeration. But, according to the docs, this operation causes a full key scan on each node and must not be used in production.
6. MapReduce queries. OK, we haven't tried them yet; maybe they really are the silver bullet. But according to the docs (and common sense), MapReduce causes a full scan (of the bucket at least; or of all keys?) and is an operation with unpredictable latency.

So, where are we wrong? Is the behavior I've described expected? Have we misunderstood Riak completely and should pay $5k for some mind expansion, or is there no hidden mystical knowledge, and they will tell us nothing beyond the approaches listed above?

_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com