Re: Getting all the Keys

2011-01-22 Thread Jeremiah Peschka
If you ever want to think about putting indexes in Riak, I played a little thought game and wrote it out on my blog: http://facility9.com/2010/12/16/secondary-indexes-how-would-you-do-it Otherwise - reverse indexes/roll your own b-tree. As an aside, thanks for asking your questions, it prompted m

Re: Getting all the Keys

2011-01-22 Thread Thomas Burdick
I mistakenly didn't send a reply to the whole list, but given what everyone is saying I think I "get it" now and the reasoning. Given all of that it seems pretty clear that if I wanted to do what I'm talking about purely in the context of riak using links might work or a bucket containing keys and

Re: Getting all the Keys

2011-01-22 Thread Sean Cribbs
On Jan 22, 2011, at 4:15 PM, Thomas Burdick wrote: > * Why is key listing so slow? It is slow because, even if the keys are in RAM, you have to scan roughly all of the keys in the cluster to get a listing for a single bucket. As a certain person is fond of saying, "full table scan is full tabl
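
The "full table scan" point can be illustrated with a toy model. This is only a sketch, not Riak's storage code: it assumes a node keeps one flat index keyed by (bucket, key) pairs, and `keydir` and `list_keys` are hypothetical names chosen for the example.

```python
# Toy model (assumption): one flat index holds the keys of every bucket.
keydir = {
    ("users", "alice"): "...",
    ("users", "bob"): "...",
    ("logs", "2011-01-22"): "...",
}


def list_keys(index, bucket):
    """Return the keys in `bucket` and the number of entries examined."""
    examined = 0
    found = []
    for (b, k) in index:  # every entry is visited, whatever its bucket
        examined += 1
        if b == bucket:
            found.append(k)
    return found, examined


keys, examined = list_keys(keydir, "logs")
print(keys, examined)
```

Under this assumption the cost of listing one bucket grows with the total number of keys stored, not with the size of the bucket being listed.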

Re: Short read error on .store() with protobufs using Python client

2011-01-22 Thread Bob Feldbauer
Thanks, Nico. I forked riak-python-client and created a pull request for the patch you described (removing the three lines for the length check in recv_pkt). - Bob Feldbauer On 1/22/2011 3:23 AM, Nico Meyer wrote: Hi, let me clarify the situation somewhat. The problem is not intermittent in

Re: Getting all the Keys

2011-01-22 Thread Ryan Zezeski
I think it's worth mentioning that riak is based on Amazon Dynamo and if you read the paper you'll see Dynamo's use case is lookup by primary key. -Ryan [Sent from my iPhone] On Jan 22, 2011, at 5:43 PM, Neville Burnell wrote: > >As of Riak 0.14 your m/r can filter on key name. I would highly

Re: Getting all the Keys

2011-01-22 Thread Neville Burnell
>As of Riak 0.14 your m/r can filter on key name. I would highly recommend that your data architecture take this into account by using keys that have meaningful names. >>This will allow you to not scan every key in your cluster. Is this part true? I understood that key filtering just means yo

Re: Getting all the Keys

2011-01-22 Thread Thomas Burdick
No one seems to have really answered either of my questions in any great detail other than "don't do that" or "use redis" which to me just adds another layer of complexity and potential bugginess to my end application or fails to really describe what the problem is. So really my questions can be b

Re: Getting all the Keys

2011-01-22 Thread Les Mikesell
On 1/22/11 1:45 PM, Gary William Flake wrote: Riak is bad at enumerating keys. If the key isn't something that you can use to retrieve the items you want, what's the point of having it? -- Les Mikesell lesmikes...@gmail.com

Re: Getting all the Keys

2011-01-22 Thread Justin Sheehy
On Sat, Jan 22, 2011 at 3:18 PM, Alexander Sicular wrote: > I'll drop a phat tangent and just mention that I watched @rk's talk at Qcon > SF 2010 the other day and am kinda crushing on how they implemented > distributed counters in cassandra (mainlined in 0.7.1 me thinks) which, > imho, is so cho

Re: Getting all the Keys

2011-01-22 Thread Alexander Sicular
I don't think it is a flaw at all. Rather I am of the opinion that riak was never meant to do the things we are all talking about in this thread. When I need to do these things I specifically use redis because, as noted, it has tremendous support for specific data structures. When I need

Re: Getting all the Keys

2011-01-22 Thread Gary William Flake
This is a really big pain point for me as well and -- at the risk of prematurely being overly critical of Riak's overall design -- I think it points to a major flaw of Riak in its current state. Let me explain. Riak is bad at enumerating keys. We know that. I am happy to manage a list of keys

Re: Getting all the Keys

2011-01-22 Thread Alexander Staubo
On Sat, Jan 22, 2011 at 19:34, Alexander Staubo wrote: > On Sat, Jan 22, 2011 at 18:23, Thomas Burdick > wrote: >> So really what's the solution to just having a list of like 50k keys that can >> quickly be appended to without taking seconds to then retrieve later on. Or >> is this just not a vali

Re: Getting all the Keys

2011-01-22 Thread Alexander Staubo
On Sat, Jan 22, 2011 at 18:23, Thomas Burdick wrote: > So really what's the solution to just having a list of like 50k keys that can > quickly be appended to without taking seconds to then retrieve later on. Or > is this just not a valid use case for riak at all? That would suck cause > again, I re

Re: Getting all the Keys

2011-01-22 Thread Jeremiah Peschka
If you're looking for a fast, in-memory store that has support for ordered lists you should probably give Redis a look-see. It's an in-memory key-value store but it has support for lists as a native data type: http://redis.io/commands#list You could do the same thing in Riak, but you'd be storing

Re: Getting all the Keys

2011-01-22 Thread Thomas Burdick
I guess I'm left even more baffled now, if the keys are all in memory and I only have 1 real node in my cluster, why would it take half a second to obtain all the keys from a completely empty database? If it takes half a second to just list the keys out like that how could a map/reduce ever take le

Re: Getting all the Keys

2011-01-22 Thread Jeremiah Peschka
I was going to respond, but I think Alex answered it well with much more humor than I can muster on a good day. All I can add is: - Make sure you're on Riak 0.14. - Take a look at the filter documentation and see how you can clean up your queries

Re: Memory Requirements for Riak Search index (merge_index)

2011-01-22 Thread Alexander Sicular
Without knowing exactly, I'm gonna go with yes. I happen to be under the impression that the merge_index_backend is a bitcask derivative. But I would love to hear otherwise from someone @basho. -Alexander Sicular @siculars On Jan 22, 2011, at 9:50 AM, Gordon Tillman wrote: > Greetings All, >

Re: Getting all the Keys

2011-01-22 Thread Alexander Sicular
Hi Thomas, This is a topic that has come up many times. Lemme just hit a couple of high notes in no particular order: - If you must do a list keys op on a bucket, you must must must use "?keys=stream". True will block on the coordinating node until all nodes return their keys. Stream will star
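
A minimal sketch of consuming a streamed key listing as it arrives, rather than waiting for the whole response. It assumes the stream is a sequence of back-to-back JSON objects of the form {"keys": [...]} that may be split at arbitrary points by the transport. That wire format, and the name `iter_streamed_keys`, are assumptions made for this example and are not taken from the thread.

```python
import json


def iter_streamed_keys(chunks):
    """Yield keys as soon as each complete JSON object has arrived.

    `chunks` is any iterable of text fragments, for example the pieces of a
    chunked HTTP response body (assumed format: concatenated JSON objects).
    """
    decoder = json.JSONDecoder()
    buf = ""
    for chunk in chunks:
        buf += chunk
        while buf:
            buf = buf.lstrip()
            try:
                obj, end = decoder.raw_decode(buf)
            except ValueError:
                break  # incomplete object, wait for more data
            buf = buf[end:]
            for key in obj.get("keys", []):
                yield key


# Usage: the second object is split across two fragments.
fragments = ['{"keys":["a","b"]}{"ke', 'ys":["c"]}']
for key in iter_streamed_keys(fragments):
    print(key)
```

Because the function is a generator, the caller can start processing the first keys while later fragments are still in flight.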

Getting all the Keys

2011-01-22 Thread Thomas Burdick
I've been playing around with riak lately as really my first usage of a distributed key/value store. I quite like many of the concepts and possibilities of Riak and what it may deliver, however I'm really stuck on an issue. Doing the equivalent of a select * from sometable in riak is seemingly slo

Memory Requirements for Riak Search index (merge_index)

2011-01-22 Thread Gordon Tillman
Greetings All, It is my understanding that the only backend that is compatible with Riak Search indexes is the merge_index_backend. I am wondering if merge_index backend has a memory footprint similar to bitcask's; i.e., must the keydir structure for merge_index fit entirely in RAM as is the case wit

Re: Short read error on .store() with protobufs using Python client

2011-01-22 Thread Nico Meyer
Hi, let me clarify the situation somewhat. The problem is not intermittent in the sense that it occurs randomly. It depends mainly on the size of the answer that riak sends. If the answer is bigger than the network MTU the error will occur most of the time. If it's smaller it almost never occurs. T
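
For context, a common cause of size-dependent short reads is that a single recv() call on a TCP socket may return fewer bytes than requested when a reply spans several packets. The sketch below shows the usual defensive pattern of looping until the expected number of bytes has arrived. It is not the patch discussed in this thread, and the framing it assumes (a 4-byte big-endian length prefix) as well as the names `recv_exactly` and `recv_message` are assumptions made for the example.

```python
import struct


def recv_exactly(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    parts = []
    remaining = n
    while remaining > 0:
        data = sock.recv(remaining)
        if not data:
            raise EOFError(
                "connection closed with %d bytes still expected" % remaining
            )
        parts.append(data)
        remaining -= len(data)
    return b"".join(parts)


def recv_message(sock):
    """Read one length-prefixed message (assumed 4-byte big-endian length)."""
    (length,) = struct.unpack("!I", recv_exactly(sock, 4))
    return recv_exactly(sock, length)
```

Any object with a recv(n) method works here, so the helpers can be exercised with a stub socket that hands out data a few bytes at a time.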