Thanks Alex,
I'm thinking I might have to start with 16 nodes to reduce the number of
vnodes per node, i.e. with a partition size of 1024, and then scale up to 64
nodes. I'm not sure what I'll do when I move away from Cloud Servers to a
setup where I can use far fewer servers with more RAM and Fusion IO storage,
since I can't change the partition size.
I guess I'd stick with a partition size of 512 and, at 32 nodes, move away
from Cloud Servers to dedicated storage, which might end up being 8 nodes in
the beginning because of the superior RAM and storage (64 vnodes per node). So
even if I get to 32 nodes on that (> 6TB RAM, > 320TB HDD), I'd still be above
10 vnodes per node (512 / 32 = 16). That seems workable to me.
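For reference, the arithmetic behind those numbers can be sketched in a few lines of Python. This is only a rough illustration: it assumes vnodes are spread evenly across physical nodes, and the function name vnodes_per_node is made up for this example rather than anything from Riak.

```python
def vnodes_per_node(ring_size: int, nodes: int) -> float:
    """Average vnodes per physical node, assuming an even spread (an assumption)."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return ring_size / nodes


# (partition size, node count) pairs taken from the scenarios in this message.
scenarios = [(1024, 16), (1024, 64), (512, 8), (512, 32)]

for ring_size, nodes in scenarios:
    per_node = vnodes_per_node(ring_size, nodes)
    print(f"ring={ring_size} nodes={nodes} -> {per_node:g} vnodes per node")
```

Both partition sizes land on 16 vnodes per node at their largest planned cluster size, which is where the "above 10" margin comes from.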
When moving beyond 32 nodes to 64 or more, is it possible to set up another
cluster of servers with a larger partition size, say 1024, move all of the
data to this new cluster, and then re-provision the old servers to become part
of it?
I believe Riak with 2i is a very good fit for what we need to do with the
data. Also, we don't plan to have more than five indexes on an object at any
one time.
Cheers.
-----Original Message-----
From: "Alexander Sicular" <sicul...@gmail.com>
Sent: Thursday, 23 February, 2012 4:25am
To: "Jeremiah Peschka" <jeremiah.pesc...@gmail.com>
Cc: char...@contentomni.com, "riak-users" <riak-users@lists.basho.com>
Subject: Re: Riak for Messaging Project Question
Actually, I think my math was faulty; it's actually 800 billion keys, so
that's many more servers than expected.
A few billion? Pssh. You know what's cool? A Trillion.
A few things jump off the page from the comments on this thread:
Static allocation of ring_creation_size. Once you set it you cannot change it,
although I'm sure that is on the drawing board. As Jeremiah pointed out, that
means you will have a ton of vnodes on each physical machine while your cluster
is still small.
Data migration. As you add nodes to the cluster some fraction of the data will
have to move around the ring. This could be a lot of data in your case.
N-val. I'm not entirely sure about this, but if you think about how Riak does
its map/reduce, search and index querying, you have to think about how wide a
net Riak casts to get results. Is querying more nodes good or bad? How good or
bad is it at different cluster sizes? And how does that change with the number
of vnodes on each physical node? I'm not sure, but I think it should be
explored.
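To put rough numbers on the data migration and N-val points, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration, not a description of Riak's actual claim or query code: fraction_moved assumes partitions are owned evenly and that only the minimum number of partitions changes hands when nodes are added, and coverage_vnodes assumes an index-style query needs about ring_size / n_val vnodes to see every key once. Both function names are hypothetical.

```python
import math


def fraction_moved(old_nodes: int, new_nodes: int) -> float:
    """Idealized share of partitions that change owner when the cluster grows.

    Assumes even ownership and minimal movement (an assumption; a real
    cluster may move more data than this lower bound).
    """
    if old_nodes <= 0 or new_nodes < old_nodes:
        raise ValueError("need 0 < old_nodes <= new_nodes")
    return (new_nodes - old_nodes) / new_nodes


def coverage_vnodes(ring_size: int, n_val: int) -> int:
    """Rough count of vnodes a full-keyspace query must touch.

    Assumes each key lives on n_val vnodes, so about ring_size / n_val
    vnodes are enough to see every key once (an assumption).
    """
    if ring_size <= 0 or n_val <= 0:
        raise ValueError("ring_size and n_val must be positive")
    return math.ceil(ring_size / n_val)


# Growth steps mentioned in this thread.
for old, new in [(8, 32), (16, 64), (32, 64)]:
    print(f"{old} -> {new} nodes: ~{fraction_moved(old, new):.0%} of partitions move")

# n_val = 3 is used here purely as an example value.
for ring in (512, 1024):
    print(f"ring={ring}, n_val=3: ~{coverage_vnodes(ring, 3)} vnodes per coverage query")
```

Under these assumptions, doubling the cluster moves about half the partitions, and doubling the ring size doubles the number of vnodes a wide query has to consult, which is the trade-off worth measuring.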
Frankly, the hardware at play does not really factor into my decision on
whether or not to use Riak. I think the only real question is how Riak as a
system accommodates the kind of data you want to collect and the ways in which
you want to access it. The physical implementation is simply some scale
function with regard to performance, i.e. more money = more performance.
Would love to hear where you go with this.
Cheers,
Alexander
_______________________________________________
riak-users mailing list
riak-users@lists.basho.com
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com