[Cassandra Wiki] Update of "StorageConfiguration" by JonHermes

Apache Wiki Tue, 24 Aug 2010 14:36:20 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for 
change notification.


The "StorageConfiguration" page has been changed by JonHermes.
http://wiki.apache.org/cassandra/StorageConfiguration?action=diff&rev1=26&rev2=27

--------------------------------------------------

  
  == per-Cluster (Global) Settings ==
  === authenticator ===
+ Allows for pluggable authentication of users, which defines whether it is 
necessary to call the Thrift 'login' method, and which parameters are required 
to log in. The default '!AllowAllAuthenticator' does not require users to call 
'login': any user can perform any operation. The other built-in option is 
'!SimpleAuthenticator', which requires users and passwords to be defined in 
property files, and for users to call 'login' with a valid combination.
+ 
- org.apache.cassandra.auth.AllowAllAuthenticator
+ Default is: 'org.apache.cassandra.auth.AllowAllAuthenticator', a no-op.
  
  === auto_bootstrap ===
- false
+ Set to 'true' to make new [non-seed] nodes automatically migrate the right 
data to themselves. (If no InitialToken is specified, they will pick one such 
that they will get half the range of the most-loaded node.) If a node starts up 
without bootstrapping, it will mark itself bootstrapped so that you can't 
subsequently accidentally bootstrap a node with data on it. (You can reset this 
by wiping your data and commitlog directories.)
+ 
+ Off by default so that new clusters don't bootstrap immediately.  You should 
turn this on when you start adding new nodes to a cluster that already has data 
on it.
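
The token choice described above ("half the range of the most-loaded node") can be illustrated with a small Python sketch. This is not Cassandra's implementation: it assumes integer tokens on a ring of size 2**127, and the function name `pick_bootstrap_token`, the constant `RING_SIZE`, and the `loads` mapping are hypothetical names used only for illustration.

```python
# Illustrative sketch only; names and the token space are assumptions.
RING_SIZE = 2 ** 127  # assumed size of the token ring


def pick_bootstrap_token(loads):
    """Return a token that splits the most-loaded node's range in half.

    `loads` maps each existing node's token to a load figure. A node with
    token T is taken to own the range (previous token, T], wrapping around
    the ring.
    """
    tokens = sorted(loads)
    busiest = max(tokens, key=lambda t: loads[t])
    # Index -1 wraps to the last token, which precedes the first on the ring.
    prev = tokens[tokens.index(busiest) - 1]
    # A width of 0 means a single node that owns the whole ring.
    width = (busiest - prev) % RING_SIZE or RING_SIZE
    return (prev + width // 2) % RING_SIZE


# The node at token 2**125 carries the most load and owns (0, 2**125],
# so the new node would take the midpoint of that range.
example_token = pick_bootstrap_token({0: 10, 2 ** 125: 50, 2 ** 126: 20})
```

Under these assumptions the new node takes over the lower half of the busiest range, and the existing node keeps the upper half.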
  
  === cluster_name ===
- Cluster
+ The name of this cluster.  This is mainly used to prevent machines in one 
logical cluster from joining another.
  
  === commitlog_directory ===
  /var/lib/cassandra/commitlog
@@ -51, +55 @@

  false
  
  === endpoint_snitch ===
- org.apache.cassandra.locator.RackInferringSnitch
+ !EndPointSnitch: Set this to a class that implements {{{IEndPointSnitch}}}, 
which determines whether two endpoints are in the same data center or on the 
same rack. Out of the box, Cassandra provides 
{{{org.apache.cassandra.locator.RackInferringSnitch}}}.
  
- === hinted_handoff_enabled ===
- true
+ Note: this class works on hosts' IP addresses only. There is no configuration 
parameter to tell Cassandra that a node is in rack ''R'' and in datacenter 
''D''. The current rules are based on two methods:
+ 
+  * isOnSameRack: Look at the IP address of the two hosts and compare the 3rd 
octet. If they are the same, the hosts are in the same rack; otherwise they are 
in different racks.
+ 
+  * isInSameDataCenter: Look at the IP address of the two hosts and compare the 
2nd octet. If they are the same, the hosts are in the same datacenter; 
otherwise they are in different datacenters.
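
The two octet rules above can be sketched in Python as follows. This is an illustration of the rules as stated here, not the Java code of the snitch; it assumes IPv4 dotted-quad addresses, and the helper `octets` and the snake_case function names are hypothetical.

```python
# Illustrative sketch of the octet rules; not Cassandra's Java code.
def octets(ip):
    """Split an IPv4 dotted-quad string into its four integer octets."""
    parts = [int(p) for p in ip.split(".")]
    if len(parts) != 4:
        raise ValueError("expected an IPv4 address, got %r" % ip)
    return parts


def is_in_same_datacenter(ip_a, ip_b):
    """Same datacenter when the 2nd octets match."""
    return octets(ip_a)[1] == octets(ip_b)[1]


def is_on_same_rack(ip_a, ip_b):
    """Same rack when the 3rd octets match."""
    return octets(ip_a)[2] == octets(ip_b)[2]


# 10.20.30.40 and 10.20.31.41 share the 2nd octet but not the 3rd.
same_dc = is_in_same_datacenter("10.20.30.40", "10.20.31.41")
same_rack = is_on_same_rack("10.20.30.40", "10.20.31.41")
```

Because the rack rule as written looks only at the 3rd octet, an addressing plan has to keep 3rd octets distinct across racks for the inference to be meaningful.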
  
  === memtable_flush_after_mins ===
  === memtable_operations_in_millions ===
@@ -68, +75 @@

  10000
  
  === seeds ===
- LIST {"127.0.0.1",}
+ Never use a node's own address as a seed if you are bootstrapping it by 
setting auto_bootstrap to true!
  
  === thrift_framed_transport_size_in_mb ===
  15 by default. Setting this to 0 is how to denote using unframed transport.
@@ -76, +83 @@

  == per-Keyspace Settings ==
  == per-ColumnFamily Settings ==
  == per-Column Settings ==
- == AutoBootstrap ==
- ''[New in 0.5:''
  
- Turn on to make new [non-seed] nodes automatically migrate the right data  to 
themselves.  (If no InitialToken is specified, they will pick one  such that 
they will get half the range of the most-loaded node.) If a node starts up 
without bootstrapping, it will mark itself bootstrapped so that you can't 
subsequently accidently bootstrap a node with data on it.  (You can reset this 
by wiping your data and commitlog directories.)
  
- Off by default so that new clusters and upgraders from 0.4 don't bootstrap 
immediately.  You should turn this on when you start adding new nodes to a 
cluster that already has data on it.  (If you are upgrading from 0.4, start 
your cluster with it off once before changing it to true. Otherwise, no data 
will be lost but you will incur a lot of unnecessary I/O before your cluster 
starts up.)
  
- {{{
-   <AutoBootstrap>false</AutoBootstrap>
- }}}
- '']''
- 
- == Cluster Name ==
- The name of this cluster.  This is mainly used to prevent machines in one 
logical cluster from joining another.
- 
- Example:
- 
- {{{
- <ClusterName>Test Cluster</ClusterName>
- }}}
- == Authenticator ==
- ''[New in 0.6:''
- 
- Allows for pluggable authentication of users, which defines whether it is 
necessary to call the Thrift 'login' method, and which parameters are required 
to login. The default '!AllowAllAuthenticator' does not require users to call 
'login': any user can perform any operation. The other built in option is 
'!SimpleAuthenticator', which requires users and passwords to be defined in 
property files, and for users to call login with a valid combo.
- 
- Example:
- 
- {{{
- <Authenticator>org.apache.cassandra.auth.SimpleAuthenticator</Authenticator>
- }}}
- '']''
  
  == Keyspaces and ColumnFamilies ==
  Keyspaces and {{{ColumnFamilies}}}: A {{{ColumnFamily}}} is the Cassandra 
concept closest to a relational table.  {{{Keyspaces}}} are separate groups of 
{{{ColumnFamilies}}}.  Except in very unusual circumstances you will have one 
Keyspace per application.
@@ -127, +106 @@

  '']''
  
  ''[New in 0.6: !EndPointSnitch, !ReplicaPlacementStrategy and 
!ReplicationFactor became configurable per keyspace.  Prior to that they were 
global settings.]''
- 
- === EndPointSnitch ===
- !EndPointSnitch: Setting this to the class that implements 
{{{IEndPointSnitch}}} which will see if two endpoints are in the same data 
center or on the same rack. Out of the box, Cassandra provides 
{{{org.apache.cassandra.locator.EndPointSnitch}}}
- 
- {{{
- <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
- }}}
- Note: this class will work on hosts' IPs only. There is no configuration 
parameter to tell Cassandra that a node is in rack ''R'' and in datacenter 
''D''. The current rules are based on the two methods: (see 
[[http://svn.apache.org/viewvc/incubator/cassandra/trunk/src/java/org/apache/cassandra/locator/EndPointSnitch.java?view=markup|EndPointSnitch.java]]):
- 
-  * isOnSameRack: Look at the IP Address of the two hosts. Compare the 3rd 
octet. If they are the same then the hosts are in the same rack else different 
racks.
- 
-  * isInSameDataCenter: Look at the IP Address of the two hosts. Compare the 
2nd octet. If they are the same then the hosts are in the same datacenter else 
different datacenter.
  
  === ReplicaPlacementStrategy and ReplicationFactor ===
  Strategy: Setting this to the class that implements 
{{{IReplicaPlacementStrategy}}} will change the way the node picker works. Out 
of the box, Cassandra provides 
{{{org.apache.cassandra.locator.RackUnawareStrategy}}} and 
{{{org.apache.cassandra.locator.RackAwareStrategy}}} (place one replica in a 
different datacenter, and the others on different racks in the same one.)
@@ -205, +172 @@

  
  With {{{OrderPreservingPartitioner}}} the keys themselves are used to place 
rows on the ring. One potential drawback of this approach is that if rows are 
inserted with sequential keys, all the write load will go to the same node.
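
The hot-spot effect described above can be illustrated with a toy ring in Python. This is a sketch under stated assumptions, not Cassandra code: the node tokens, the key names, the `owner` and `md5_token` helpers, and the 2**127 ring size are all hypothetical, and MD5 merely stands in for a hashing partitioner to provide a contrast.

```python
# Toy ring model; tokens, keys and helper names are assumptions.
import bisect
import hashlib


def owner(node_tokens, key_token):
    """Index of the first node whose token is >= key_token, wrapping around.

    `node_tokens` must be sorted in ascending order.
    """
    return bisect.bisect_left(node_tokens, key_token) % len(node_tokens)


keys = ["user%04d" % n for n in range(1, 101)]  # sequential row keys

# Order-preserving placement: the key itself is compared with node tokens.
opp_tokens = ["g", "n", "t", "z"]
opp_owners = {owner(opp_tokens, k) for k in keys}

# Hash-based placement for contrast: the MD5 of the key is the token.
RING_SIZE = 2 ** 127
hash_tokens = [RING_SIZE * i // 4 for i in range(1, 5)]


def md5_token(key):
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


hash_owners = {owner(hash_tokens, md5_token(k)) for k in keys}
```

In this toy ring every sequential key sorts between "t" and "z", so `opp_owners` contains a single node index, while the hashed tokens spread across more than one node.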
  
- == Directories ==
- Directories: Specify where Cassandra should store different data on disk.  
Keep the data disks and the {{{CommitLog}}} disks separate for best 
performance. See also [[FAQ#what_kind_of_hardware_should_i_use|what kind of 
hardware should I use?]]
- 
- {{{
- <CommitLogDirectory>/var/lib/cassandra/commitlog</CommitLogDirectory>
- <DataFileDirectories>
-       <DataFileDirectory>/var/lib/cassandra/data</DataFileDirectory>
- </DataFileDirectories>
- }}}
- == Seeds ==
- Addresses of hosts that are deemed contact points. Cassandra nodes use this 
list of hosts to find each other and learn the topology of the ring. You must 
change this if you are running multiple nodes!
- 
- {{{
- <Seeds>
-  <Seed>127.0.0.1</Seed>
- </Seeds>
- }}}
- Never use a node's own address as a seed if you are bootstrapping it by 
setting AutoBootstrap to true.
  
  == Miscellaneous ==
  Time to wait for a reply from other nodes before failing the command
