[Cassandra Wiki] Update of "StorageConfiguration" by JonHermes

Apache Wiki Tue, 24 Aug 2010 16:05:10 -0700

The "StorageConfiguration" page has been changed by JonHermes.
http://wiki.apache.org/cassandra/StorageConfiguration?action=diff&rev1=31&rev2=32

--------------------------------------------------

   * commitlog_directory and data_file_directories
  /var/lib/cassandra/commitlog
  
-  * concurrent_reads and concurrent_writes
- 8
- 32
+  * concurrent_reads and concurrent_writes, commitlog_sync and 
commitlog_sync_period_in_ms
+ Unlike most systems, in Cassandra writes are faster than reads, so you can 
afford more of those in parallel.  A good rule of thumb is 4 concurrent_reads 
per processor core.  Increase {{{concurrent_writes}}} to the number of clients 
writing at once if you use commitlog_sync.
+ 
+ {{{CommitLogSync}}} may be either "periodic" or "batch."  When in batch mode, 
Cassandra won't ack writes until the commit log has been fsynced to disk.  It 
will wait up to {{{CommitLogSyncBatchWindowInMS}}} milliseconds for other 
writes, before performing the sync.
+ 
+ This is less necessary in Cassandra than in traditional databases since 
replication reduces the odds of losing data from a failure after writing the 
log entry but before it actually reaches the disk. So the other option is 
"periodic," where writes may be acked immediately and the {{{CommitLog}}} is 
simply synced every {{{CommitLogSyncPeriodInMS}}} milliseconds.
+ 
+ Interval at which to perform syncs of the {{{CommitLog}}} in periodic mode. 
Usually the default of 1000ms is fine; increase it only if the CommitLog 
PendingTasks backlog in jmx shows that you are frequently scheduling a second 
sync while the first has not yet been processed.
+ 
+ Defaults are: '8' c. reads, '32' c. writes, 'periodic' sync, '10000' ms 
between syncs.
  
   * disk_access_mode
  The options are: 'auto', 'mmap', 'mmap_index_only', and 'standard'.
@@ -110, +117 @@

    * compare_with
  The {{{CompareWith}}} attribute tells Cassandra how to sort the columns for 
slicing operations.  The default is {{{BytesType}}}, which is a straightforward 
lexical comparison of the bytes in each column. Other options are 
{{{AsciiType}}}, {{{UTF8Type}}}, {{{LexicalUUIDType}}}, {{{TimeUUIDType}}}, and 
{{{LongType}}}.  You can also specify the fully-qualified class name to a class 
of your choice extending {{{org.apache.cassandra.db.marshal.AbstractType}}}.
  
-  * {{{SuperColumns}}} have a similar {{{CompareSubcolumnsWith}}} attribute.
+  a. {{{SuperColumns}}} have a similar {{{CompareSubcolumnsWith}}} attribute.
-  * {{{BytesType}}}: Simple sort by byte value.  No validation is performed.
+  a. {{{BytesType}}}: Simple sort by byte value.  No validation is performed.
-  * {{{AsciiType}}}: Like {{{BytesType}}}, but validates that the input can be 
parsed as US-ASCII.
+  a. {{{AsciiType}}}: Like {{{BytesType}}}, but validates that the input can 
be parsed as US-ASCII.
-  * {{{UTF8Type}}}: A string encoded as UTF8
+  a. {{{UTF8Type}}}: A string encoded as UTF8
-  * {{{LongType}}}: A 64bit long
+  a. {{{LongType}}}: A 64bit long
-  * {{{LexicalUUIDType}}}: A 128bit UUID, compared lexically (by byte value)
+  a. {{{LexicalUUIDType}}}: A 128bit UUID, compared lexically (by byte value)
-  * {{{TimeUUIDType}}}: a 128bit version 1 UUID, compared by timestamp
+  a. {{{TimeUUIDType}}}: a 128bit version 1 UUID, compared by timestamp
- 
- (To get the closest approximation to 0.3-style {{{supercolumns}}}, you would 
use {{{CompareWith=UTF8Type CompareSubcolumnsWith=LongType}}}.)
  
    * gc_grace_seconds
  
@@ -199, +204 @@

  }}}
  '']''
  
- Unlike most systems, in Cassandra writes are faster than reads, so you can 
afford more of those in parallel.  A good rule of thumb is 2 concurrent reads 
per processor core.  Increase {{{ConcurrentWrites}}} to the number of clients 
writing at once if you enable {{{CommitLogSync + CommitLogSyncDelay}}}.
- 
- {{{
- <ConcurrentReads>8</ConcurrentReads>
- <ConcurrentWrites>32</ConcurrentWrites>
- }}}
- {{{CommitLogSync}}} may be either "periodic" or "batch."  When in batch mode, 
Cassandra won't ack writes until the commit log has been fsynced to disk.  It 
will wait up to {{{CommitLogSyncBatchWindowInMS}}} milliseconds for other 
writes, before performing the sync.
- 
- This is less necessary in Cassandra than in traditional databases since 
replication reduces the odds of losing data from a failure after writing the 
log entry but before it actually reaches the disk. So the other option is 
"timed," where writes may be acked immediately and the {{{CommitLog}}} is 
simply synced every {{{CommitLogSyncPeriodInMS}}} milliseconds.
- 
- {{{
- <CommitLogSync>periodic</CommitLogSync>
- }}}
- Interval at which to perform syncs of the {{{CommitLog}}} in periodic mode. 
Usually the default of 1000ms is fine; increase it only if the CommitLog 
PendingTasks backlog in jmx shows that you are frequently scheduling a second 
sync while the first has not yet been processed.
- 
- {{{
- <CommitLogSyncPeriodInMS>1000</CommitLogSyncPeriodInMS>
- }}}
- Delay (in milliseconds) during which additional commit log entries may be 
written before fsync in batch mode.  This will increase latency slightly, but 
can vastly improve throughput where there are many writers.  Set to zero to 
disable (each entry will be synced individually).  Reasonable values range from 
a minimal 0.1 to 10 or even more if throughput matters more than latency.
- 
- {{{
- <!-- <CommitLogSyncBatchWindowInMS>1</CommitLogSyncBatchWindowInMS> -->
- }}}
  Time to wait before garbage-collection deletion markers.  Set this to a large 
enough value that you are confident that the deletion marker will be propagated 
to all replicas by the time this many seconds has elapsed, even in the face of 
hardware failures.  The default value is ten days.
  
  {{{
