How to find what node a key is on

2011-03-23 Thread Sameer Farooqui
Does anybody know if it's possible to find out what node a specific key/row
lives on?

We have a 30-node cluster, and I'm curious how much faster reads would be if
we sent them directly to the node that stores the data.

We're using RandomPartitioner, by the way.


Sameer Farooqui
Accenture Technology Labs


Re: How to find what node a key is on

2011-03-23 Thread aaron morton
Each row is stored on RF (replication factor) nodes, and your read is sent to
as many of them as the consistency level (CL) requires. Messages take only a
single hop from the coordinator to each node the read is performed on, so the
networking overhead varies with the number of nodes involved in the request.
There are many factors other than networking that influence the speed of a
read request.

There are features available to determine which nodes hold replicas for a
particular key. AFAIK they are not intended for use by clients.

Are you currently having problems with read performance?

Hope that helps.
Aaron

 
Re: How to find what node a key is on

2011-03-23 Thread Sameer Farooqui
No problems with read performance. I'm just curious how much overhead is
being added, because we're running read tests.

If it's easy to figure out where the row is stored, I'd be interested in
trying it. If not, don't worry about it.

- Sameer


Re: How to find what node a key is on

2011-03-23 Thread Robert Coli
On Wed, Mar 23, 2011 at 4:31 PM, aaron morton aa...@thelastpickle.com wrote:
 There are features available to determine which nodes holds replicas for a
 particular key. AFAIK they are not intended for use by clients.

Specifically:

http://wiki.apache.org/cassandra/JmxInterface#org.apache.cassandra.service.StorageService.Operations.getNaturalEndpoints

Which, I think, is what Sameer was asking for all along. :)

=Rob


Re: How to find what node a key is on

2011-03-23 Thread Narendra Sharma
The logic to find the node is not complicated. Compute the MD5 hash of the
key; under RandomPartitioner this is the key's token. Create a sorted list of
the tokens assigned to the nodes in the ring. Find the first token greater
than or equal to the hash, wrapping around to the smallest token if the hash
is larger than all of them. The node that owns that token holds the first
replica. The next RF - 1 nodes in the list hold the remaining replicas. This
is simple only because it assumes SimpleStrategy for replica placement. For
other strategies, finding the replicas is more involved.
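
The lookup described above can be sketched in Python as follows. This is only
a rough illustration, not Cassandra's own code: the node names and tokens are
made up, the function names are hypothetical, and the way the MD5 digest is
turned into a token (signed big-endian integer, then absolute value) is an
assumption about RandomPartitioner that you should verify against your
version. It also assumes SimpleStrategy and one token per node.

```python
import hashlib
from bisect import bisect_left


def random_partitioner_token(key: bytes) -> int:
    """Map a row key to a token.

    Assumption: the token is the MD5 digest read as a signed big-endian
    integer, made non-negative with abs().
    """
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def replicas_for_token(key_token: int, ring: dict, rf: int) -> list:
    """Return the nodes holding replicas for a token (SimpleStrategy only).

    ring maps each node's token to a node name (hypothetical layout).
    """
    tokens = sorted(ring)
    # First node token >= key token; wrap to the start of the ring if none.
    start = bisect_left(tokens, key_token)
    if start == len(tokens):
        start = 0
    # Walk the ring clockwise; never return more replicas than nodes.
    count = min(rf, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


def replicas_for_key(key: bytes, ring: dict, rf: int) -> list:
    """Convenience wrapper: hash the key, then look up its replicas."""
    return replicas_for_token(random_partitioner_token(key), ring, rf)


if __name__ == "__main__":
    # Hypothetical 4-node ring with evenly spaced tokens.
    example_ring = {
        0: "node-a",
        2**125: "node-b",
        2**126: "node-c",
        3 * 2**125: "node-d",
    }
    print(replicas_for_key(b"some-row-key", example_ring, rf=2))
```

To compare against what the cluster itself reports, the getNaturalEndpoints
JMX operation mentioned earlier in the thread is the authoritative source.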

Cassandra is a distributed database. Each node is aware of the state of the
cluster and the token distribution. Moving this logic into the client is
possible, but the benefit is small compared to the pain, and the pain grows
with the size of the cluster.

I would discourage you from going that route.

Thanks,
Naren
