[Cassandra Wiki] Update of "FAQ" by JonathanEllis

Apache Wiki Wed, 09 Mar 2011 09:36:11 -0800


The "FAQ" page has been changed by JonathanEllis.
The comment on this change is: add mmap.
http://wiki.apache.org/cassandra/FAQ?action=diff&rev1=104&rev2=105

--------------------------------------------------

   * [[#bulkloading|How do I bulk load data into Cassandra?]]
   * [[#range_rp|Why aren't range slices/sequential scans giving me the 
expected results?]]
   * [[#unsubscribe|How do I unsubscribe from the email list?]]
-  * [[#cleaning_compacted_tables|I compacted, so why do I still have all my 
SSTables?]]
+  * [[#cleaning_compacted_tables|I compacted, so why did space used not 
decrease?]]
+  * [[#mmap|Why does top report that Cassandra is using a lot more memory than 
the Java heap max?]]
  
  <<Anchor(cant_listen_on_ip_any)>>
  
@@ -369, +370 @@

  See BulkLoading
  
  <<Anchor(range_rp)>>
+ 
  == Why aren't range slices/sequential scans giving me the expected results? ==
- 
  You're probably using the RandomPartitioner.  This is the default because it 
avoids hotspots, but it means your rows are ordered by the md5 of the row key 
rather than lexicographically by the raw key bytes.
  
+ You '''can''' start out with a start key and end key of [empty] and use the 
row count argument instead, if your goal is paging the rows.  To get the next 
page, start from the last key you got in the previous page.
- You '''can''' start out with a start key and end key of [empty] and use the 
row count argument instead, if
- your goal is paging the rows.  To get the next page, start from the last key 
you got in the
- previous page.
  
  You can also use intra-row ordering of column names to get ordered results 
'''within''' a row; with appropriate row 'bucketing,' you often don't need the 
rows themselves to be ordered.
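
The paging approach described above (empty start key, a row count, then restarting from the last key seen) can be sketched roughly as follows. This is an in-memory simulation only, not the Cassandra client API: `RangePagingSketch`, `rangeSlice` and `scanAll` are hypothetical names, the unsigned-MD5 token is a simplification of what RandomPartitioner does, and the sketch assumes the start key is inclusive, so each later page repeats the previous page's last row and the caller drops it.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class RangePagingSketch {
    // Rows kept in token order (MD5 of the key), not in key order.
    private final TreeMap<BigInteger, String> rowsByToken = new TreeMap<>();

    // Simplified token: the MD5 digest read as an unsigned integer (assumption).
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    void insert(String key) {
        rowsByToken.put(token(key), key);
    }

    // Hypothetical stand-in for a range slice: an empty startKey means
    // "from the beginning"; otherwise the start key is inclusive.
    List<String> rangeSlice(String startKey, int count) {
        Iterable<String> source = startKey.isEmpty()
                ? rowsByToken.values()
                : rowsByToken.tailMap(token(startKey), true).values();
        List<String> page = new ArrayList<>();
        for (String key : source) {
            if (page.size() == count) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    // Pages through every row by restarting from the last key of each page.
    List<String> scanAll(int pageSize) {
        if (pageSize < 2) {
            // With an inclusive start key, a page of 1 would never advance.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> all = new ArrayList<>();
        String start = "";
        while (true) {
            List<String> page = rangeSlice(start, pageSize);
            // On later pages the first row is the previous page's last row.
            int from = start.isEmpty() ? 0 : 1;
            all.addAll(page.subList(from, page.size()));
            if (page.size() < pageSize) {
                break; // short page: nothing left to read
            }
            start = page.get(page.size() - 1);
        }
        return all;
    }
}
```

The rows come back in token order, which is why the result looks shuffled relative to the raw keys, yet every row is still visited exactly once.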
  
@@ -386, +385 @@

  
  <<Anchor(cleaning_compacted_tables)>>
  
- == I compacted, so why do I still have all my SSTables? ==
+ == I compacted, so why did space used not decrease? ==
  SSTables that are obsoleted by a compaction are deleted asynchronously when 
the JVM performs a GC. You can force a GC from jconsole if necessary, but 
Cassandra will force one itself if it detects that it is low on space. A 
compaction marker is also added to obsolete sstables so they can be deleted on 
startup if the server does not perform a GC before being restarted. Read more 
on this subject [[http://wiki.apache.org/cassandra/MemtableSSTable|here]].
  
+ <<Anchor(mmap)>>
+ 
+ == Why does top report that Cassandra is using a lot more memory than the 
Java heap max? ==
+ Cassandra uses mmap to do zero-copy reads. That is, we use the operating 
system's virtual memory system to map the sstable data files into the Cassandra 
process's address space. This "uses" virtual memory, i.e. address space, and 
tools like top report it accordingly, but on 64-bit systems virtual address 
space is effectively unlimited, so you should not worry about it.
+ 
+ What matters for "memory use" in the sense normally meant is the amount of 
data allocated via brk() or an mmap of /dev/zero (i.e. anonymous memory), which 
represents real memory used.  The key point is that for an mmap'd file, the 
data never needs to stay resident in physical memory. Thus, whatever does stay 
resident is essentially just a cache, in the same way that normal I/O causes 
the kernel page cache to retain data that you read or write.
+ 
+ The difference between normal I/O and mmap() is that in the mmap() case the 
memory is actually mapped into the process, which affects the virtual size 
reported by top. The main argument for using mmap() instead of standard I/O is 
that reading entails just touching memory: if the page is already resident, 
you simply read it, without even taking a page fault (so there is no overhead 
from entering the kernel and doing a semi-context switch). This is covered in 
more detail 
[[http://www.varnish-cache.org/trac/wiki/ArchitectNotes|here]].
+
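
As a rough illustration of the mmap answer above, the following sketch maps a file read-only with the standard `java.nio` API and reads a byte straight out of the mapping. It is not taken from the Cassandra source; the class name `MmapReadSketch`, the helper `readByteAt` and the temporary file are made up for the example.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapReadSketch {
    // Maps the whole file read-only and returns the byte at the given offset.
    // The mapping adds to the process's virtual size, not to the Java heap.
    static byte readByteAt(Path file, int offset) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            // Reading is a plain memory access; the kernel pages data in on demand.
            return mapped.get(offset);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a data file.
        Path tmp = Files.createTempFile("mmap-sketch", ".db");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "hello sstable".getBytes(StandardCharsets.US_ASCII));
        System.out.println((char) readByteAt(tmp, 6));
    }
}
```

Mapping a large file this way raises the virtual size that top reports without raising heap usage, which is the effect the FAQ entry describes.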
