[jira] [Commented] (CASSANDRA-7518) The In-Memory option

2014-07-28 Thread Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077227#comment-14077227
 ] 

Hanson commented on CASSANDRA-7518:
---

I do not have access to DataStax JIRA.

My other post on Stack Overflow:
http://stackoverflow.com/questions/24719276/cassandra-in-memory-option

It mentions that the DataStax white paper says the amount of memory for a 
single node will increase in an upcoming version, probably via JNA, but with 
no firm timeline.


 The In-Memory option
 

 Key: CASSANDRA-7518
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7518
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Hanson
 Fix For: 2.1.0


 The commercial version of Cassandra, DataStax Enterprise 4.0, introduced an 
 In-Memory option:
 http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/inMemory.html
 However, an in-memory table is limited to 1GB in size.
 It would be great if the In-Memory option were available in the community 
 version of Cassandra and extended to support large in-memory tables, such as 
 64GB.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7549) Heavy Disk Read I/O

2014-07-16 Thread Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063282#comment-14063282
 ] 

Hanson commented on CASSANDRA-7549:
---

When a query needs to access an SSTable, the OS caches that SSTable. But the 
read block size is too small (~40KB as observed with “iostat”), so it takes a 
long time to load the SSTable into memory (my machine has plenty of memory). 
While the cache is warming, queries are very slow (only hundreds of TPS); once 
caching is done, queries are very fast (thousands of TPS).

You mentioned “8Kb read ahead”. Is it handled by Cassandra or by an OS disk 
driver parameter?
I have tried:
cat /sys/block/sdb/queue/read_ahead_kb
128
echo 1024 > /sys/block/sdb/queue/read_ahead_kb
But it had no impact on the disk read I/O.
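
The arithmetic behind the slow warm-up can be sketched as follows. This is a 
rough illustration only: the function names and the sample rates (8000 KB/s, 
200 reads/s, a 1024MB SSTable) are hypothetical, and only the ~40KB read size 
comes from the observation above.

```python
def avg_read_size_kb(read_kb_per_s, reads_per_s):
    """Average size of one read request, derived from iostat-style rates."""
    if reads_per_s <= 0:
        return 0.0
    return read_kb_per_s / reads_per_s


def seconds_to_cache(file_mb, read_size_kb, reads_per_s):
    """Time needed to read a file of file_mb megabytes at a fixed request rate."""
    if read_size_kb <= 0 or reads_per_s <= 0:
        raise ValueError("read size and read rate must be positive")
    return (file_mb * 1024) / (read_size_kb * reads_per_s)


if __name__ == "__main__":
    # Hypothetical sample values, not measurements from this ticket.
    size_kb = avg_read_size_kb(read_kb_per_s=8000.0, reads_per_s=200.0)
    print("average read size (KB):", size_kb)
    print("seconds to cache a 1024MB file:", seconds_to_cache(1024, size_kb, 200.0))
```

At a fixed request rate, the time to cache a file shrinks in proportion to the 
read size, which is why a larger read size is being asked for.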


 Heavy Disk Read I/O
 ---

 Key: CASSANDRA-7549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7549
 Project: Cassandra
  Issue Type: Improvement
 Environment: Cassandra 2.0.6
Reporter: Hanson

 We observed heavy disk read I/O, sometimes close to 100% disk I/O %util. 
 According to “iostat”, the block size for disk reads seems too small: 
   - DB query: ~40KB per read
   - SSTable compaction: ~120KB per read
 Could a larger block size be used for disk reads, through either Cassandra 
 or OS disk driver tuning?





[jira] [Comment Edited] (CASSANDRA-7549) Heavy Disk Read I/O

2014-07-16 Thread Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063282#comment-14063282
 ] 

Hanson edited comment on CASSANDRA-7549 at 7/16/14 8:37 AM:


When a query needs to access an SSTable, the OS caches that SSTable. But the 
read block size is too small (~40KB as observed with “iostat”), so it takes a 
long time to load the SSTable into memory (my machine has plenty of memory). 
While the cache is warming, queries are very slow (only hundreds of TPS); once 
caching is done, queries are very fast (several thousand TPS).

You mentioned “8Kb read ahead”. Is it handled by Cassandra or by an OS disk 
driver parameter?
I have tried:
cat /sys/block/sdb/queue/read_ahead_kb
128
echo 1024 > /sys/block/sdb/queue/read_ahead_kb
But it had no impact on the disk read I/O per iostat.



was (Author: he902):
When a query needs to access an SSTable, the OS caches that SSTable. But the 
read block size is too small (~40KB as observed with “iostat”), so it takes a 
long time to load the SSTable into memory (my machine has plenty of memory). 
While the cache is warming, queries are very slow (only hundreds of TPS); once 
caching is done, queries are very fast (thousands of TPS).

You mentioned “8Kb read ahead”. Is it handled by Cassandra or by an OS disk 
driver parameter?
I have tried:
cat /sys/block/sdb/queue/read_ahead_kb
128
echo 1024 > /sys/block/sdb/queue/read_ahead_kb
But it had no impact on the disk read I/O.


 Heavy Disk Read I/O
 ---

 Key: CASSANDRA-7549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7549
 Project: Cassandra
  Issue Type: Improvement
 Environment: Cassandra 2.0.6
Reporter: Hanson

 We observed heavy disk read I/O, sometimes close to 100% disk I/O %util. 
 According to “iostat”, the block size for disk reads seems too small: 
   - DB query: ~40KB per read
   - SSTable compaction: ~120KB per read
 Could a larger block size be used for disk reads, through either Cassandra 
 or OS disk driver tuning?





[jira] [Created] (CASSANDRA-7548) Disk I/O priority control for Compaction and Flusher

2014-07-15 Thread Hanson (JIRA)
Hanson created CASSANDRA-7548:
-

 Summary: Disk I/O priority control for Compaction and Flusher
 Key: CASSANDRA-7548
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7548
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Hanson


Disk I/O priority: the memtable flusher should have higher priority than 
compaction. This is to avoid DB insert/update hangs (spikes) during SSTable 
compaction. Compaction should be able to detect that a memtable flush is in 
progress and throttle its own disk I/O.
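
One way to picture the request is the minimal sketch below. It is not 
Cassandra code: `DiskIoCoordinator`, `compact`, and the back-off interval are 
hypothetical names and values chosen for illustration. Compaction checks for 
an in-progress flush before each chunk of work and backs off while one is 
running.

```python
import threading
import time


class DiskIoCoordinator:
    """Tracks in-progress memtable flushes so compaction can yield to them."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active_flushes = 0

    def flush_started(self):
        with self._lock:
            self._active_flushes += 1

    def flush_finished(self):
        with self._lock:
            self._active_flushes = max(0, self._active_flushes - 1)

    def flush_in_progress(self):
        with self._lock:
            return self._active_flushes > 0


def compact(chunks, coordinator, backoff_s=0.01, sleep=time.sleep):
    """Process compaction chunks, pausing before each one while a flush runs.

    Returns the processed chunks and the number of back-off pauses taken.
    """
    done = []
    pauses = 0
    for chunk in chunks:
        while coordinator.flush_in_progress():
            pauses += 1
            sleep(backoff_s)  # give the flusher the disk
        done.append(chunk)  # stand-in for the real compaction I/O
    return done, pauses
```

A real implementation would more likely lower compaction's I/O rate than stop 
it completely; the full pause here only keeps the sketch short.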






[jira] [Created] (CASSANDRA-7549) Heavy Disk Read I/O

2014-07-15 Thread Hanson (JIRA)
Hanson created CASSANDRA-7549:
-

 Summary: Heavy Disk Read I/O
 Key: CASSANDRA-7549
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7549
 Project: Cassandra
  Issue Type: Improvement
 Environment: Cassandra 2.0.6
Reporter: Hanson


We observed heavy disk read I/O, sometimes close to 100% disk I/O %util. 
According to “iostat”, the block size for disk reads seems too small: 
- DB query: ~40KB per read
- SSTable compaction: ~120KB per read

Could a larger block size be used for disk reads, through either Cassandra or 
OS disk driver tuning?






[jira] [Created] (CASSANDRA-7550) Reduce tombstones

2014-07-15 Thread Hanson (JIRA)
Hanson created CASSANDRA-7550:
-

 Summary: Reduce tombstones
 Key: CASSANDRA-7550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7550
 Project: Cassandra
  Issue Type: Improvement
Reporter: Hanson


If a record is still in the memtable, a delete of that record should be done 
in memory (i.e. do the compaction in memory) instead of placing a tombstone. 
This would save a lot of disk I/O and space for records with a short lifecycle.





[jira] [Commented] (CASSANDRA-7550) Reduce tombstones

2014-07-15 Thread Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061974#comment-14061974
 ] 

Hanson commented on CASSANDRA-7550:
---

I mean the case where the record is newly inserted and has not been flushed 
into an SSTable yet, so it exists only in memory at the moment it is deleted. 
It should then be safe to do the delete in memory without writing a tombstone.
In our use case, records stay in the DB for only hours (a random duration) and 
are then deleted.


 Reduce tombstones
 -

 Key: CASSANDRA-7550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7550
 Project: Cassandra
  Issue Type: Improvement
Reporter: Hanson

 If a record is still in the memtable, a delete of that record should be done 
 in memory (i.e. do the compaction in memory) instead of placing a tombstone. 
 This would save a lot of disk I/O and space for records with a short lifecycle.





[jira] [Commented] (CASSANDRA-7550) Reduce tombstones

2014-07-15 Thread Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062993#comment-14062993
 ] 

Hanson commented on CASSANDRA-7550:
---

There could be a per-record flag in the memtable to indicate that the record 
is newly inserted and has not been flushed into an SSTable yet.
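
A toy version of such a flag might look like the sketch below. It is only an 
illustration of the proposal, not Cassandra's memtable: the `Memtable` class 
and the `known_new` argument are hypothetical, it models a single node, and it 
assumes the caller somehow knows that no older copy of the key exists in any 
SSTable.

```python
TOMBSTONE = object()  # marker written when a delete must shadow older data


class Memtable:
    """Toy memtable with a per-record 'newly inserted, never flushed' flag."""

    def __init__(self):
        self.rows = {}         # key -> value or TOMBSTONE
        self.new_keys = set()  # keys flagged as new since the last flush

    def insert(self, key, value, known_new=False):
        # known_new is an assumption supplied by the caller: no SSTable
        # holds an older version of this key.
        self.rows[key] = value
        if known_new:
            self.new_keys.add(key)

    def delete(self, key):
        if key in self.new_keys:
            # The record only ever lived in memory, so drop it outright.
            del self.rows[key]
            self.new_keys.discard(key)
        else:
            # An older copy may exist on disk, so a tombstone is needed.
            self.rows[key] = TOMBSTONE

    def flush(self):
        """Return the contents as a new 'SSTable' and reset the memtable."""
        sstable = dict(self.rows)
        self.rows.clear()
        self.new_keys.clear()
        return sstable
```

With this flag, a record that is inserted and deleted between two flushes 
never reaches disk, while a delete of any other record still writes a 
tombstone. The sketch ignores replication, where tombstones also carry the 
delete to other replicas.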

 Reduce tombstones
 -

 Key: CASSANDRA-7550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7550
 Project: Cassandra
  Issue Type: Improvement
Reporter: Hanson

 If a record is still in the memtable, a delete of that record should be done 
 in memory (i.e. do the compaction in memory) instead of placing a tombstone. 
 This would save a lot of disk I/O and space for records with a short lifecycle.





[jira] [Commented] (CASSANDRA-1657) support in-memory column families

2014-07-09 Thread Hanson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055955#comment-14055955
 ] 

Hanson commented on CASSANDRA-1657:
---

 I wonder if AOF from redis fits our needs well. It is durable 
 http://redis.io/topics/persistence
It should be AOF + RDB from Redis for durability.
This is similar to the In-Memory option provided in the commercial version of 
Cassandra, DataStax Enterprise 4.0:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/inMemory.html
which uses commitLogs + SSTables as snapshots for durability.
The memTables are kept in memory even after they are flushed into SSTables as 
snapshots, and old commitLogs can be removed once the corresponding memTables 
have been flushed into SSTables.
That means compaction is done in memory (a new compaction strategy, 
“MemoryOnlyStrategy”, was introduced) instead of compacting SSTables on disk.

TimesTen In-Memory DB does something similar:
todoLogs + Checkpoints for durability,
where todoLogs correspond to Cassandra’s commitLogs, and Checkpoints correspond 
to Cassandra’s SSTables as snapshots of memTables.
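
The log-plus-snapshot pattern common to all three systems can be sketched 
roughly as below. This is a generic illustration, not the DataStax, Redis, or 
TimesTen implementation: the `InMemoryTable` class, the file names, and the 
JSON encoding are assumptions made for the example, and keys must be strings.

```python
import json
import os


class InMemoryTable:
    """Data is served from memory; a log and a snapshot give durability."""

    def __init__(self, directory):
        self.log_path = os.path.join(directory, "commit.log")
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.data = {}
        self._recover()
        self._log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the last snapshot, then replay writes logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Log first, then apply in memory.
        self._log.write(json.dumps([key, value]) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self.data[key] = value

    def checkpoint(self):
        # Write a full snapshot, then drop the log entries it covers.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)
        self._log.close()
        self._log = open(self.log_path, "w", encoding="utf-8")  # truncate

    def close(self):
        self._log.close()
```

The data stays in memory after `checkpoint()`, just as the memTables are kept 
after a flush; only the log is cut back.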




 support in-memory column families
 -

 Key: CASSANDRA-1657
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1657
 Project: Cassandra
  Issue Type: Improvement
Reporter: Peter Schuller
Assignee: Edward Capriolo
Priority: Minor

 Some workloads are such that you absolutely depend on column families being 
 in-memory for performance, yet you most definitely want all the things that 
 Cassandra offers in terms of replication, consistency, durability etc.
 In order to semi-deterministically ensure acceptable performance for such 
 data, Cassandra could support in-memory column families. Such an in-memory 
 column family would imply that mlock() be used on sstables for this column 
 family. On start-up and on compaction completion, they could be mmap():ed 
 with MAP_POPULATE (Linux specific) or else just mmap():ed + mlock():ed in 
 such a way as to otherwise guarantee it is in-memory (such as userland 
 traversal of the entire file).
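
 The "userland traversal" fallback mentioned above could look roughly like 
 this sketch, which maps a file read-only and touches one byte per page so 
 the OS loads it into the page cache. The function name `prefault_file` is 
 hypothetical, and the sketch does not call mlock(), so the pages are not 
 pinned and the OS remains free to evict them.

```python
import mmap
import os


def prefault_file(path):
    """Map a file read-only and touch every page; return the pages touched."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return 0  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            pages = 0
            for offset in range(0, len(mapped), mmap.PAGESIZE):
                _ = mapped[offset]  # reading one byte faults the page in
                pages += 1
            return pages
```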



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7518) The In-Memory option

2014-07-08 Thread Hanson (JIRA)
Hanson created CASSANDRA-7518:
-

 Summary: The In-Memory option
 Key: CASSANDRA-7518
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7518
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Hanson
 Fix For: 2.1.0


The commercial version of Cassandra, DataStax Enterprise 4.0, introduced an 
In-Memory option:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/inMemory.html
However, an in-memory table is limited to 1GB in size.

It would be great if the In-Memory option were available in the community 
version of Cassandra and extended to support large in-memory tables, such as 
64GB.



