RE: about the data directory

2011-01-14 Thread raoyixuan (Shandy)
Thanks very much

-Original Message-
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Friday, January 14, 2011 4:40 PM
To: user@cassandra.apache.org
Subject: Re: about the data directory

> As an administrator, I want to know why I can read the data from any node,
> given that the data is kept only on the replicas. Can you tell me? Thanks in advance.

It's part of the point of Cassandra. You talk to the cluster, period.
It's Cassandra's job to keep track of where data lives, and client
applications don't care. This is a fundamental design goal.

-- 
/ Peter Schuller


Re: about the data directory

2011-01-14 Thread Peter Schuller
> As an administrator, I want to know why I can read the data from any node,
> given that the data is kept only on the replicas. Can you tell me? Thanks in advance.

It's part of the point of Cassandra. You talk to the cluster, period.
It's Cassandra's job to keep track of where data lives, and client
applications don't care. This is a fundamental design goal.

-- 
/ Peter Schuller


RE: about the data directory

2011-01-13 Thread raoyixuan (Shandy)
As an administrator, I want to know why I can read the data from any node,
given that the data is kept only on the replicas. Can you tell me? Thanks in advance.

-Original Message-
From: Edward Capriolo [mailto:edlinuxg...@gmail.com] 
Sent: Friday, January 14, 2011 9:44 AM
To: user@cassandra.apache.org
Subject: Re: about the data directory

On Thu, Jan 13, 2011 at 7:56 PM, raoyixuan (Shandy)
 wrote:
> I am a bit confused: why can users read the data from all nodes? I mean,
> the data is kept only on the replicas, so how is this achieved?
>
> -Original Message-
> From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
> Sent: Friday, January 14, 2011 1:19 AM
> To: user@cassandra.apache.org
> Subject: Re: about the data directory
>
>> So you mean only the replica nodes' sstables will be changed, right?
>
> The data will only be written to the nodes that are part of the
> replica set for the row (with the exception of hinted handoff, but
> that's a different sstable).
>
>> If all the replica nodes are down, can users still read the data?
>
> If *all* nodes in the replica set for a particular row are down, then
> you won't be able to read that row, no.
>
> --
> / Peter Schuller
>

It does not matter which node you connect to. The node you connect to
computes the hash of the key (or uses the key itself when using the
Order Preserving Partitioner) to determine which node or nodes the
data should be on. If the data is on that node, it returns it directly
to the client. If not, it fetches the data from another node and then
returns it. The client is unaware of, and does not need to be
concerned with, where the data came from.


Re: about the data directory

2011-01-13 Thread Edward Capriolo
On Thu, Jan 13, 2011 at 7:56 PM, raoyixuan (Shandy)
 wrote:
> I am a bit confused: why can users read the data from all nodes? I mean,
> the data is kept only on the replicas, so how is this achieved?
>
> -Original Message-
> From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
> Sent: Friday, January 14, 2011 1:19 AM
> To: user@cassandra.apache.org
> Subject: Re: about the data directory
>
>> So you mean only the replica nodes' sstables will be changed, right?
>
> The data will only be written to the nodes that are part of the
> replica set for the row (with the exception of hinted handoff, but
> that's a different sstable).
>
>> If all the replica nodes are down, can users still read the data?
>
> If *all* nodes in the replica set for a particular row are down, then
> you won't be able to read that row, no.
>
> --
> / Peter Schuller
>

It does not matter which node you connect to. The node you connect to
computes the hash of the key (or uses the key itself when using the
Order Preserving Partitioner) to determine which node or nodes the
data should be on. If the data is on that node, it returns it directly
to the client. If not, it fetches the data from another node and then
returns it. The client is unaware of, and does not need to be
concerned with, where the data came from.
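
The routing described above can be sketched in a few lines of Python.
This is only a toy model under stated assumptions: the node names, the
evenly spaced tokens, and the helper functions are hypothetical, and
only the hashed (random partitioner style) case is modelled, not the
Order Preserving Partitioner.

```python
import bisect
import hashlib

# Hypothetical 4-node ring. Tokens are evenly spaced over the MD5 range;
# both the names and the spacing are assumptions made for illustration.
RING_SIZE = 2 ** 128
NODES = ["node1", "node2", "node3", "node4"]
TOKENS = [(i + 1) * RING_SIZE // len(NODES) - 1 for i in range(len(NODES))]


def token_for(key):
    """Hash the row key (bytes) onto the ring."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def replicas_for(key, replication_factor=1):
    """Find the first node whose token covers the key, then take RF nodes clockwise."""
    start = bisect.bisect_left(TOKENS, token_for(key)) % len(NODES)
    return [NODES[(start + i) % len(NODES)] for i in range(replication_factor)]


def read(coordinator, key, replication_factor=1):
    """Describe how the node the client connected to would serve the read."""
    owners = replicas_for(key, replication_factor)
    if coordinator in owners:
        return f"{coordinator} serves {key!r} locally"
    return f"{coordinator} forwards {key!r} to {owners[0]}"


if __name__ == "__main__":
    # Whichever node the client talks to, the same owner ends up serving the row.
    for node in NODES:
        print(read(node, b"some-key"))
```

Every node can run the same calculation, which is why the client does
not need to know where the row lives.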


RE: about the data directory

2011-01-13 Thread raoyixuan (Shandy)
I am a bit confused: why can users read the data from all nodes? I mean,
the data is kept only on the replicas, so how is this achieved?

-Original Message-
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Friday, January 14, 2011 1:19 AM
To: user@cassandra.apache.org
Subject: Re: about the data directory

> So you mean only the replica nodes' sstables will be changed, right?

The data will only be written to the nodes that are part of the
replica set for the row (with the exception of hinted handoff, but
that's a different sstable).

> If all the replica nodes are down, can users still read the data?

If *all* nodes in the replica set for a particular row are down, then
you won't be able to read that row, no.

-- 
/ Peter Schuller


Re: about the data directory

2011-01-13 Thread Peter Schuller
> So you mean only the replica nodes' sstables will be changed, right?

The data will only be written to the nodes that are part of the
replica set for the row (with the exception of hinted handoff, but
that's a different sstable).

> If all the replica nodes are down, can users still read the data?

If *all* nodes in the replica set for a particular row are down, then
you won't be able to read that row, no.
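
As a minimal illustration of that rule, the check below treats a row as
readable when at least one node in its replica set is up. It is a
sketch under an assumption not stated in the thread, namely that reads
need only a single replica; the function and node names are
hypothetical.

```python
def readable(replica_set, down_nodes):
    """Return True if at least one replica of the row is still up.

    Assumes a read needs only one live replica; stricter read
    requirements would need more replicas to be reachable.
    """
    return any(node not in down_nodes for node in replica_set)


if __name__ == "__main__":
    replicas = {"node2", "node3"}                   # hypothetical replica set for one row
    print(readable(replicas, {"node2"}))            # one replica left: readable
    print(readable(replicas, {"node1", "node4"}))   # only non-replicas down: readable
    print(readable(replicas, {"node2", "node3"}))   # whole replica set down: not readable
```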

-- 
/ Peter Schuller


RE: about the data directory

2011-01-13 Thread raoyixuan (Shandy)
So you mean only the replica nodes' sstables will be changed, right?

If all the replica nodes are down, can users still read the data?

-Original Message-
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Thursday, January 13, 2011 4:32 PM
To: user@cassandra.apache.org
Subject: Re: about the data directory

> I agree with you totally, but I want to know which node the data is kept on. I
> mean, how can I find out where the actual data is stored?

If you're just doing testing, you might 'nodetool flush' each host and
then look for the sstable being written. Prior to a flush, it's going
to sit in a memtable in memory (and otherwise only in the commit log),
up until the configurable time period.

For real use-cases, you would normally not care which node has a
particular row, except I suppose for debugging purposes or similar. I
realize I don't know off hand of a simple way, from the perspective of
the command line, to answer that question for a particular key.

-- 
/ Peter Schuller


Re: about the data directory

2011-01-13 Thread Peter Schuller
> I agree with you totally, but I want to know which node the data is kept on. I
> mean, how can I find out where the actual data is stored?

If you're just doing testing, you might 'nodetool flush' each host and
then look for the sstable being written. Prior to a flush, it's going
to sit in a memtable in memory (and otherwise only in the commit log),
up until the configurable time period.
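
The write path described above can be pictured with a toy model. The
class below is a hypothetical simplification, not Cassandra's storage
engine: a write goes to the commit log and the memtable, and only a
flush (roughly what 'nodetool flush' triggers) turns the memtable into
an sstable.

```python
class ToyNode:
    """Toy model of a single node's write path (illustrative only)."""

    def __init__(self):
        self.commit_log = []  # append-only record of writes, kept for durability
        self.memtable = {}    # in-memory writes that have not been flushed yet
        self.sstables = []    # immutable flushed tables, newest last

    def write(self, key, value):
        """Record the write in the commit log and the memtable only."""
        self.commit_log.append((key, value))
        self.memtable[key] = value

    def flush(self):
        """Turn the current memtable into a new sstable, sorted by key."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # flushed writes no longer need replaying


if __name__ == "__main__":
    node = ToyNode()
    node.write("row1", "value1")
    print(node.sstables)  # nothing on "disk" before the flush
    node.flush()
    print(node.sstables)  # the row now sits in an sstable
```

This is why an sstable file may not appear on the replica until a flush
has happened.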

For real use-cases, you would normally not care which node has a
particular row, except I suppose for debugging purposes or similar. I
realize I don't know off hand of a simple way, from the perspective of
the command line, to answer that question for a particular key.

-- 
/ Peter Schuller


RE: about the data directory

2011-01-13 Thread raoyixuan (Shandy)
I agree with you totally, but I want to know which node the data is kept on. I
mean, how can I find out where the actual data is stored?

-Original Message-
From: sc...@scode.org [mailto:sc...@scode.org] On Behalf Of Peter Schuller
Sent: Thursday, January 13, 2011 4:20 PM
To: user@cassandra.apache.org
Subject: Re: about the data directory

> I have 4 nodes. I created one keyspace (such as FOO) with replication
> factor = 1 and inserted some data. Why can I see the directory
> /var/lib/Cassandra/data/FOO on every node? As far as I know, I have just one replica.

The schema (keyspaces and column families) is global across the
cluster (anything else would not make a lot of sense, I think). The
replication factor determines the number of replicas of the actual data,
based on row key. Given a replication factor of one, the data should only
show up on one node (assuming a single row), but all nodes will be
aware of your keyspace/column family.

--
/ Peter Schuller


Re: about the data directory

2011-01-13 Thread Peter Schuller
> I have 4 nodes. I created one keyspace (such as FOO) with replication
> factor = 1 and inserted some data. Why can I see the directory
> /var/lib/Cassandra/data/FOO on every node? As far as I know, I have just one replica.

The schema (keyspaces and column families) is global across the
cluster (anything else would not make a lot of sense, I think). The
replication factor determines the number of replicas of the actual data,
based on row key. Given a replication factor of one, the data should only
show up on one node (assuming a single row), but all nodes will be
aware of your keyspace/column family.
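
The difference between schema and data can be shown with a small toy
cluster. Everything here is a hypothetical sketch: the class, the node
names, and the hash-modulo placement are stand-ins chosen for brevity,
not how Cassandra assigns rows to nodes.

```python
import hashlib


class ToyCluster:
    """Toy cluster: the schema goes to every node, rows only to their replicas."""

    def __init__(self, node_names):
        self.order = list(node_names)
        # node name -> {keyspace name -> {row key -> value}}
        self.nodes = {name: {} for name in self.order}
        self.replication_factor = {}

    def create_keyspace(self, keyspace, replication_factor):
        """Schema change: every node learns about the keyspace."""
        self.replication_factor[keyspace] = replication_factor
        for keyspaces in self.nodes.values():
            keyspaces[keyspace] = {}

    def insert(self, keyspace, key, value):
        """Data write: only the replicas for this row key store the value."""
        digest = hashlib.md5(key.encode()).digest()
        start = int.from_bytes(digest, "big") % len(self.order)  # simplified placement
        for i in range(self.replication_factor[keyspace]):
            node = self.order[(start + i) % len(self.order)]
            self.nodes[node][keyspace][key] = value


if __name__ == "__main__":
    cluster = ToyCluster(["node1", "node2", "node3", "node4"])
    cluster.create_keyspace("FOO", replication_factor=1)
    cluster.insert("FOO", "row1", "value1")
    for name, keyspaces in cluster.nodes.items():
        print(name, "knows FOO:", "FOO" in keyspaces, "has row1:", "row1" in keyspaces["FOO"])
```

All four nodes report that they know FOO, but only one of them holds
the row.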

--
/ Peter Schuller


RE: about the data directory

2011-01-13 Thread raoyixuan (Shandy)
Not exactly. Do you mean that each piece of data is placed on one of the four
nodes, each of which holds 25%? If so, what about with two replicas?

From: Viktor Jevdokimov [mailto:viktor.jevdoki...@adform.com]
Sent: Thursday, January 13, 2011 2:59 PM
To: user@cassandra.apache.org
Subject: RE: about the data directory

> I have 4 nodes. I created one keyspace (such as FOO) with replication
> factor = 1 and inserted some data. Why can I see the directory
> /var/lib/Cassandra/data/FOO on every node? As far as I know, I have just one replica.

So why have you installed 4 nodes, not 1?

They are there so that your data is distributed across the 4 nodes, with the
single copy of each row on one of them. That is, out of 100% of the data, each
node will hold about 25% (with random partitioning).


Viktor.

Best regards / Pagarbiai

Viktor Jevdokimov
Senior Developer
Email: viktor.jevdoki...@adform.com
Phone: +370 5 212 3063
Fax: +370 5 261 0453
Konstitucijos pr. 23, LT-08105 Vilnius, Lithuania



RE: about the data directory

2011-01-12 Thread Viktor Jevdokimov
> I have 4 nodes. I created one keyspace (such as FOO) with replication
> factor = 1 and inserted some data. Why can I see the directory
> /var/lib/Cassandra/data/FOO on every node? As far as I know, I have just one replica.

So why have you installed 4 nodes, not 1?

They are there so that your data is distributed across the 4 nodes, with the
single copy of each row on one of them. That is, out of 100% of the data, each
node will hold about 25% (with random partitioning).
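
To make the 25% figure concrete, the sketch below hashes a batch of
made-up row keys onto four hypothetical nodes and reports each node's
share. The hash-modulo placement is a simplification assumed for
illustration; it stands in for random partitioning with evenly balanced
nodes.

```python
import hashlib
from collections import Counter

NODES = ["node1", "node2", "node3", "node4"]  # hypothetical node names


def owners(key, replication_factor):
    """Pick the replicas for a key: a hashed start node plus the next RF - 1 nodes."""
    digest = hashlib.md5(key.encode()).digest()
    start = int.from_bytes(digest, "big") % len(NODES)  # simplified placement
    return [NODES[(start + i) % len(NODES)] for i in range(replication_factor)]


def share_per_node(num_keys, replication_factor):
    """Return, per node, the fraction of all rows that the node stores."""
    counts = Counter()
    for i in range(num_keys):
        counts.update(owners(f"row{i}", replication_factor))
    return {node: counts[node] / num_keys for node in NODES}


if __name__ == "__main__":
    print(share_per_node(10000, replication_factor=1))  # each share is close to 0.25
    print(share_per_node(10000, replication_factor=2))  # each share is close to 0.50
```

With a replication factor of 1 each node stores roughly a quarter of the
rows. With a replication factor of 2 every row is stored twice, so in
this model each node ends up holding roughly half of the rows.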


Viktor.
