CF that is like a non-clustered index, are key lookups that fast?
If you store only the key mappings in a column family, for custom ordering of rows etc., for things like: friends = { user_id : { friendid1, friendid2, ... } } or topForumPosts = { forum_id1 : { post2343, post32343, post32223, ... } }, then on the friends page or on the top_forum_posts page you will get back a list of post_ids, and you will then have to perform lookups on the main 'posts' CF to get the actual data. So if a page is displaying 10, 25, or 50 posts you will have 10, 25, or 50 key-based lookups for each page view. Is this the suggested way? I.e., a lookup based on a slice to get a list of post_ids, then a separate call to actually fetch the data for the given entity. Or is Cassandra so fast that 50 key-based calls are no reason to worry?
How to get previous / next data?
Hello, We want to use Cassandra to store and retrieve time-related data. Storing the time-value pairs is easy and works perfectly. The problem arrives at retrieving the data. We do not only want to retrieve data from within a time range, but also be able to get the previous and/or next data sample from a specific point in time. The next in time I can retrieve by asking for the range timestamp...maxtime and requesting 2 items, which returns the timestamp (if available) and the next timestamp-value. Does anybody know how I can retrieve the previous timestamp? The columns are sorted on the key (timestamp), so the previous/next request should not be difficult to perform. Any suggestions are welcome! Thanks, Bram van der Waaij
Re: How to get previous / next data?
You want to use 'reversed' in SliceRange (and a start with whatever you want and a count of 2). -- Sylvain On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij bramat...@gmail.com wrote: Hello, We want to use Cassandra to store and retrieve time-related data. Storing the time-value pairs is easy and works perfectly. The problem arrives at retrieving the data. We do not only want to retrieve data from within a time range, but also be able to get the previous and/or next data sample from a specific point in time. The next in time I can retrieve by asking for the range timestamp...maxtime and requesting 2 items, which returns the timestamp (if available) and the next timestamp-value. Does anybody know how I can retrieve the previous timestamp? The columns are sorted on the key (timestamp), so the previous/next request should not be difficult to perform. Any suggestions are welcome! Thanks, Bram van der Waaij
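As an illustration of Sylvain's advice, here is a minimal sketch of both slice directions against the 0.6 Thrift API. It is only a sketch: the keyspace, column family, and row key names are invented, and it assumes the CF's comparator sorts column names as big-endian 8-byte longs (e.g. LongType).

    import java.nio.ByteBuffer;
    import java.util.List;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class PrevNextSample {
        // big-endian encoding matches a LongType column comparator
        static byte[] ts(long t) {
            return ByteBuffer.allocate(8).putLong(t).array();
        }

        public static void main(String[] args) throws Exception {
            TSocket socket = new TSocket("localhost", 9160);
            socket.open();
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));

            ColumnParent parent = new ColumnParent("TimeSeries");  // assumed CF name
            long ref = System.currentTimeMillis();                 // reference point in time

            // next: slice forwards from ref; returns ref itself (if present) plus the following sample
            SlicePredicate nextPred = new SlicePredicate();
            nextPred.setSlice_range(new SliceRange(ts(ref), new byte[0], false, 2));

            // previous: same idea with reversed=true, so the slice walks backwards from ref
            SlicePredicate prevPred = new SlicePredicate();
            prevPred.setSlice_range(new SliceRange(ts(ref), new byte[0], true, 2));

            List<ColumnOrSuperColumn> next = client.get_slice(
                    "Keyspace1", "sensor1", parent, nextPred, ConsistencyLevel.ONE);
            List<ColumnOrSuperColumn> prev = client.get_slice(
                    "Keyspace1", "sensor1", parent, prevPred, ConsistencyLevel.ONE);

            System.out.println("next candidates: " + next.size()
                    + ", prev candidates: " + prev.size());
            socket.close();
        }
    }

With reversed=true the slice walks from start toward the beginning of the row, so asking for 2 columns starting at the reference timestamp yields the reference itself (if present) plus the sample just before it.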
Re: JVM Options for Production
On Mon, 14 Jun 2010 16:01:57 -0700 Anthony Molinaro antho...@alumni.caltech.edu wrote:
AM> Now I would assume that for 'production' you want to remove
AM> -ea
AM> and
AM> -XX:+HeapDumpOnOutOfMemoryError
AM> as well as adjust -Xms and -Xmx accordingly, but are there any others
AM> which should be tweaked? Is there actually a recommended production
AM> set of values or does it vary greatly from installation to installation?
I brought this up as well here: http://thread.gmane.org/gmane.comp.db.cassandra.user/2083/focus=2093 Ted
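There is no single blessed set of values; as a purely illustrative sketch, one production-leaning variant of the stock bin/cassandra.in.sh options might drop the two flags Anthony mentions and pin the heap. The sizes below are placeholders to tune per machine, not recommendations:

    # illustrative JVM_OPTS for bin/cassandra.in.sh: no -ea, no heap-dump flag,
    # heap pinned so the JVM never resizes it at runtime
    JVM_OPTS=" \
            -Xms4G \
            -Xmx4G \
            -XX:+UseParNewGC \
            -XX:+UseConcMarkSweepGC \
            -XX:+CMSParallelRemarkEnabled \
            -XX:SurvivorRatio=8 \
            -XX:MaxTenuringThreshold=1"

Setting -Xms equal to -Xmx avoids heap-resize pauses under load.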
Re: CF that is like a non-clustered index, are key lookups that fast?
Well, it won't be a range; it will be random key lookups. On Tue, Jun 15, 2010 at 8:44 AM, Gary Dusbabek gdusba...@gmail.com wrote: On Tue, Jun 15, 2010 at 04:29, S Ahmed sahmed1...@gmail.com wrote: If you store only the key mappings in a column family, for custom ordering of rows etc., for things like: friends = { user_id : { friendid1, friendid2, ... } } or topForumPosts = { forum_id1 : { post2343, post32343, post32223, ... } }, then on the friends page or on the top_forum_posts page you will get back a list of post_ids, and you will then have to perform lookups on the main 'posts' CF to get the actual data. So if a page is displaying 10, 25, or 50 posts you will have 10, 25, or 50 key-based lookups for each page view. Is this the suggested way? I.e., a lookup based on a slice to get a list of post_ids, then a separate call to actually fetch the data for the given entity. Or is Cassandra so fast that 50 key-based calls are no reason to worry? You should look at using either multiget_slice or get_range_slices. You'll save on network trips and the amount of work required of the cluster. Gary.
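To make Gary's suggestion concrete, here is a minimal sketch against the 0.6 Thrift API that fetches all of the listed posts in one round trip. The host, keyspace, column family, and post ids are made-up placeholders:

    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ColumnParent;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.SlicePredicate;
    import org.apache.cassandra.thrift.SliceRange;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class MultigetPosts {
        public static void main(String[] args) throws Exception {
            TSocket socket = new TSocket("localhost", 9160);
            socket.open();
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));

            // post ids previously sliced out of the topForumPosts index row
            List<String> postIds = Arrays.asList("post2343", "post32343", "post32223");

            // ask for up to 100 columns of every post row, all in a single RPC
            SlicePredicate predicate = new SlicePredicate();
            predicate.setSlice_range(new SliceRange(new byte[0], new byte[0], false, 100));

            Map<String, List<ColumnOrSuperColumn>> rows = client.multiget_slice(
                    "Keyspace1", postIds, new ColumnParent("posts"), predicate,
                    ConsistencyLevel.ONE);

            for (Map.Entry<String, List<ColumnOrSuperColumn>> row : rows.entrySet()) {
                System.out.println(row.getKey() + " -> " + row.getValue().size() + " columns");
            }
            socket.close();
        }
    }

Note that whether the page issues 50 gets or 1 multiget, the cluster performs the same row reads; the saving is in round trips and per-request overhead.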
Re: How to get previous / next data?
Perfect! Thanks :-) 2010/6/15 Sylvain Lebresne sylv...@yakaz.com You want to use 'reversed' in SliceRange (and a start with whatever you want and a count of 2). -- Sylvain On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij bramat...@gmail.com wrote: Hello, We want to use Cassandra to store and retrieve time-related data. Storing the time-value pairs is easy and works perfectly. The problem arrives at retrieving the data. We do not only want to retrieve data from within a time range, but also be able to get the previous and/or next data sample from a specific point in time. The next in time I can retrieve by asking for the range timestamp...maxtime and requesting 2 items, which returns the timestamp (if available) and the next timestamp-value. Does anybody know how I can retrieve the previous timestamp? The columns are sorted on the key (timestamp), so the previous/next request should not be difficult to perform. Any suggestions are welcome! Thanks, Bram van der Waaij
Re: java.lang.OutOfMemoryError: Java heap space
if you are reading 500MB per thrift request from each of 3 threads, then yes, simple arithmetic indicates that 1GB heap is not enough. On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 caribbean...@gmail.com wrote: Hi, I wrote 200k records to the db, each record 5MB. I get this error when I use 3 threads (each thread tries to read all 200k records, 100 records at a time) to read data from the db. The write is OK; the error comes from the read. Right now the Xmx of the JVM is 1GB. I changed it to 2GB, still not working. If the record size is under 4K, I will not get this error. Any clues to avoid this error? Thx -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
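Spelling the arithmetic out: 100 records/request x 5 MB/record = 500 MB of column data per read call, and 500 MB x 3 reader threads = 1.5 GB potentially in memory at once, which no 1 GB heap (nor, once request overhead is counted, a 2 GB heap) can hold.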
Cassandra timeouts under low load
Hi, I'm running Cassandra 0.6.2 on a dedicated 4-node cluster and I also have a dedicated 4-node Hadoop cluster. I'm trying to run a simple map reduce job against a single column family, and it only takes 32 map tasks before I get floods of Thrift timeouts. That would make sense to me if Cassandra was stressing the hardware or the network, but it's not. Each box has 8 cores/16G RAM. During the job CPU averages 150-250% (1/5 utilization on 8 cores), network IO hovers around 15% throughput, iostat 15%. The Hadoop machines are taking even less of a beating. The simpler I make the job, the faster it hits Cassandra and the faster it throws timeouts, and vice versa. I'm guessing there's a software/config related bottleneck I'm hitting well before tapping out the hardware. Any idea what that might be?

java.lang.RuntimeException: TimedOutException()
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:174)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101)
    at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
    at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
    at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11015)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623)
    at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:151)
    ... 11 more
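Two knobs commonly implicated in this symptom, shown as a hedged sketch (the names are 0.6-era; verify them against your release). Server side, the RPC timeout in conf/storage-conf.xml:

    <!-- give long get_range_slices scans more time before the server
         gives up; 30000 is only an illustrative value -->
    <RpcTimeoutInMillis>30000</RpcTimeoutInMillis>

Client side, the Hadoop input format pulls rows from get_range_slices in large batches; some 0.6.x builds expose a property to shrink the batch so each call finishes inside the timeout. Treat the property name as an assumption if your build predates it:

    // Hadoop job setup: fewer rows per thrift call per map task
    job.getConfiguration().setInt("cassandra.range.batch.size", 256);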
RE: java.lang.OutOfMemoryError: Java heap space
Sorry, the record size should be 5KB not 5MB, because 4KB is still OK. I will try Benjamin's suggestion. -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Tuesday, June 15, 2010 8:09 AM To: user@cassandra.apache.org Subject: Re: java.lang.OutOfMemoryError: Java heap space if you are reading 500MB per thrift request from each of 3 threads, then yes, simple arithmetic indicates that 1GB heap is not enough. On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 caribbean...@gmail.com wrote: Hi, I wrote 200k records to the db, each record 5MB. I get this error when I use 3 threads (each thread tries to read all 200k records, 100 records at a time) to read data from the db. The write is OK; the error comes from the read. Right now the Xmx of the JVM is 1GB. I changed it to 2GB, still not working. If the record size is under 4K, I will not get this error. Any clues to avoid this error? Thx -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: java.lang.OutOfMemoryError: Java heap space
You should only have to restart once per node to pick up config changes. On Tue, Jun 15, 2010 at 9:41 AM, caribbean410 caribbean...@gmail.com wrote: Today I retried the 2GB heap and now it's working. No more out-of-memory error. Looks like I have to restart Cassandra several times before the new changes take effect. -Original Message- From: Benjamin Black [mailto:b...@b3k.us] Sent: Monday, June 14, 2010 7:46 PM To: user@cassandra.apache.org Subject: Re: java.lang.OutOfMemoryError: Java heap space My guess: you are outrunning your disk I/O. Each of those 5MB rows gets written to the commitlog, and the memtable is flushed when it hits the configured limit, which you've probably left at 128MB. Every 25 rows or so you are getting a memtable flushed to disk. Until these things complete, they are in RAM. If this is actually representative of your production use, you need a dedicated commitlog disk, several drives in RAID0 or RAID10 for data, a lot more RAM, and a much larger memtable flush size. b On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 caribbean...@gmail.com wrote: Hi, I wrote 200k records to the db, each record 5MB. I get this error when I use 3 threads (each thread tries to read all 200k records, 100 records at a time) to read data from the db. The write is OK; the error comes from the read. Right now the Xmx of the JVM is 1GB. I changed it to 2GB, still not working. If the record size is under 4K, I will not get this error. Any clues to avoid this error? Thx
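For scale: a 128 MB memtable threshold divided by 5 MB rows is a flush roughly every 25 rows, exactly as Benjamin describes. The 0.6-era setting lives in conf/storage-conf.xml; the value below is illustrative only, not a recommendation:

    <MemtableThroughputInMB>512</MemtableThroughputInMB>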
Re: Replication Factor and Data Centers
(moving to user@) On Mon, Jun 14, 2010 at 10:43 PM, Masood Mortazavi masoodmortaz...@gmail.com wrote: Is a clearer interpretation of this statement (in conf/datacenters.properties) given anywhere else?

# The sum of all the datacenter replication factor values should equal
# the replication factor of the keyspace (i.e. sum(dc_rf) = RF)
# keyspace\:datacenter=replication factor
Keyspace1\:DC1=3
Keyspace1\:DC2=2
Keyspace1\:DC3=1

Does the above example configuration imply that Keyspace1 has an RF of 6, and that of these, 3 will go to DC1, 2 to DC2, and 1 to DC3? Yes. What will happen if datacenters.properties and cassandra-rack.properties are simply empty? You have an illegal configuration. https://issues.apache.org/jira/browse/CASSANDRA-1191 is open to have Cassandra raise an error under this condition and others. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: help for designing a cassandra
http://wiki.apache.org/cassandra/ArticlesAndPresentations might help. On Mon, Jun 14, 2010 at 1:13 PM, Johannes Weissensel whitesensl...@googlemail.com wrote: Hi everyone, I am new to NoSQL databases and especially column-oriented databases like Cassandra. I am a student of information systems and I am evaluating a fitting NoSQL database for a web analytics system. The use case is data like a webserver logfile: in an RDBMS it would be a row for every hit, and then endless grouping and counting on the data to get the metrics you want. Is there anyone who has experience with data like that in Hypertable? How should I design the database? A single row for every hit, or maybe an aggregated version of the data for every session, or a single aggregated version for every day and every page? Maybe someone has an idea how to design the database? Just like a typical non-normalized SQL database? Hope you have some ideas :) Johannes -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
I am running a 10-node Cassandra 0.6.1 cluster with a replication factor of 3. To populate the database to perform my read benchmarking, I have 8 applications using Thrift, each connecting to a different Cassandra server and writing 100,000 rows of data (100 KB each row), using a consistencyLevel of ALL. My server nodes are ec2-smalls (1.7GB memory, 100GB disk). With consistency set to ALL, it takes 5-6 minutes for each app to write 10,000 (100 KB) rows. When each of my 8 writing apps reaches about 90,000 rows written, I start seeing write timeouts, but my app retries twice and all data appears to get written. It appears to take about 1hr 45min for all compacting to complete. Coinciding with my write timeouts, all 10 of my Cassandra servers are getting the following exception written to system.log:

INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
    at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,146 CassandraDaemon.java (line 78) Fatal exception in thread Thread[MESSAGE-STREAMING-POOL:1,5,main]
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
    at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more

On 8 out of 10 servers, I see this just before the exception:

INFO [AE-SERVICE-STAGE:1] 2010-06-15 13:41:36,292 StreamOut.java (line 66) Sending a stream initiate message to /10.210.34.212
...
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:43:32,956 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor

On the other 2 servers, I see the AE-SERVICE stream initiate message about 6-9 minutes prior to the exception. Another thing that is odd is that even when the server nodes are quiescent because compacting is complete, I am still seeing CPU usage stay at about 40%. Even after several hours with no reading or writing to the database and all compactions complete, the CPU usage is staying around 40%. Thank you for your help and advice, Julie
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
You are likely exhausting your heap space (probably still at the very small 1G default?), and maximizing the amount of resource consumption by using CL.ALL. Why are you using ALL? On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.su...@nextcentury.com wrote: I am running a 10-node Cassandra 0.6.1 cluster with a replication factor of 3. To populate the database to perform my read benchmarking, I have 8 applications using Thrift, each connecting to a different Cassandra server and writing 100,000 rows of data (100 KB each row), using a consistencyLevel of ALL. My server nodes are ec2-smalls (1.7GB memory, 100GB disk). With consistency set to ALL, it takes 5-6 minutes for each app to write 10,000 (100 KB) rows. When each of my 8 writing apps reaches about 90,000 rows written, I start seeing write timeouts, but my app retries twice and all data appears to get written. It appears to take about 1hr 45min for all compacting to complete. Coinciding with my write timeouts, all 10 of my Cassandra servers are getting the following exception written to system.log:

INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
    at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,146 CassandraDaemon.java (line 78) Fatal exception in thread Thread[MESSAGE-STREAMING-POOL:1,5,main]
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
    at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more

On 8 out of 10 servers, I see this just before the exception:

INFO [AE-SERVICE-STAGE:1] 2010-06-15 13:41:36,292 StreamOut.java (line 66) Sending a stream initiate message to /10.210.34.212
...
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:43:32,956 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor

On the other 2 servers, I see the AE-SERVICE stream initiate message about 6-9 minutes prior to the exception. Another thing that is odd is that even when the server nodes are quiescent because compacting is complete, I am still seeing CPU usage stay at about 40%. Even after several hours with no reading or writing to the database and all compactions complete, the CPU usage is staying around 40%. Thank you for your help and advice, Julie
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
Benjamin Black b at b3k.us writes: You are likely exhausting your heap space (probably still at the very small 1G default?), and maximizing the amount of resource consumption by using CL.ALL. Why are you using ALL? On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.sugar at nextcentury.com wrote: ... Coinciding with my write timeouts, all 10 of my Cassandra servers are getting the following exception written to system.log:

INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
    at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
...

Thanks for your reply. Yes, my heap space is 1G. My VMs have only 1.7G of memory, so I hesitate to use more. I am using ALL because I was crashing Cassandra with a heap space error when I used ZERO (see my posting from a few days ago), so it was recommended that I use ALL instead. I also tried using ONE but got even more write timeouts, so I thought it would be safer to just wait for ALL replications to be written before trying to write more rows. Thank you for your help.
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
How are you doing your inserts? I draw a clear line between 1) bootstrapping a cluster with data and 2) simulating expected/projected read/write behavior. If you are bootstrapping, then I would look into the batch_mutate APIs. They allow you to improve your performance on writes dramatically. If you are read/write testing on a populated cluster, insert and batch_insert (for super columns) are the way to go. As Ben has pointed out to me in numerous threads ... think carefully about your replication factor. Do you want the data on all nodes? Or sufficiently replicated so that you can recover? Do you want consistency at the time of write? Or eventually? Cassandra has a bunch of knobs that you can turn ... but that flexibility requires that you think about your expected usage patterns and operational policies. -phil On Jun 15, 2010, at 4:40 PM, Julie wrote: Benjamin Black b at b3k.us writes: You are likely exhausting your heap space (probably still at the very small 1G default?), and maximizing the amount of resource consumption by using CL.ALL. Why are you using ALL? On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.sugar at nextcentury.com wrote: ... Coinciding with my write timeouts, all 10 of my Cassandra servers are getting the following exception written to system.log:

INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162) Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
    at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
    at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
    at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
    at org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
    at org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
...

Thanks for your reply. Yes, my heap space is 1G. My VMs have only 1.7G of memory, so I hesitate to use more. I am using ALL because I was crashing Cassandra with a heap space error when I used ZERO (see my posting from a few days ago), so it was recommended that I use ALL instead. I also tried using ONE but got even more write timeouts, so I thought it would be safer to just wait for ALL replications to be written before trying to write more rows. Thank you for your help.
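For readers following the thread, a hedged sketch of what one 100-row batch_mutate call looks like against the 0.6 Thrift API. The keyspace, column family, column name, and payload here are invented for illustration, and the snippet assumes an open Cassandra.Client as in the earlier examples on this list:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.Column;
    import org.apache.cassandra.thrift.ColumnOrSuperColumn;
    import org.apache.cassandra.thrift.ConsistencyLevel;
    import org.apache.cassandra.thrift.Mutation;

    public class BatchWriter {
        // writes one batch of 100 rows, each row a single 100 KB column
        static void writeBatch(Cassandra.Client client, int batchNo) throws Exception {
            byte[] payload = new byte[100 * 1024];
            Map<String, Map<String, List<Mutation>>> mutationMap =
                    new HashMap<String, Map<String, List<Mutation>>>();
            for (int i = 0; i < 100; i++) {
                Column col = new Column("body".getBytes("UTF-8"), payload,
                        System.currentTimeMillis());
                ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
                cosc.setColumn(col);
                Mutation m = new Mutation();
                m.setColumn_or_supercolumn(cosc);
                mutationMap.put("row-" + batchNo + "-" + i,
                        Collections.singletonMap("Standard1",
                                Collections.singletonList(m)));
            }
            // one RPC carries all 100 rows; the CL governs how many replicas must ack
            client.batch_mutate("Keyspace1", mutationMap, ConsistencyLevel.QUORUM);
        }
    }

QUORUM is shown only as a middle ground between the ZERO and ALL extremes discussed in this thread; with RF=3 it waits for 2 of the 3 replicas.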
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
On Tue, Jun 15, 2010 at 1:40 PM, Julie julie.su...@nextcentury.com wrote: Thanks for your reply. Yes, my heap space is 1G. My vms have only 1.7G of memory so I hesitate to use more. Then write slower. There is no free lunch. b
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
On Tue, Jun 15, 2010 at 1:58 PM, Julie julie.su...@nextcentury.com wrote: Coinciding with my write timeouts, all 10 of my Cassandra servers are getting the following exception written to system.log: Value too large for defined data type looks like a bug found in older JREs. Upgrade to u19 or later. Another thing that is odd is that even when the server nodes are quiescent because compacting is complete, I am still seeing CPU usage stay at about 40%. Even after several hours with no reading or writing to the database and all compactions complete, the CPU usage is staying around 40%. Possibly this is Hinted Handoff scanning going on. You can rm data/system/Hint* (while the node is shut down) if you want to take a shot in the dark. Otherwise you'll want to follow http://publib.boulder.ibm.com/infocenter/javasdk/tools/index.jsp?topic=/com.ibm.java.doc.igaa/_1vg0001475cb4a-1190e2e0f74-8000_1007.html to figure out which thread is actually consuming the CPU. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
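For the Sun JDK on Linux, the diagnosis Jonathan links (written for the IBM JDK) looks roughly like this. This is only a sketch; top's header length and column layout vary by distribution, and it assumes a single Cassandra process on the box:

    pid=$(pgrep -f CassandraDaemon)       # find the Cassandra JVM
    top -b -H -n 1 -p "$pid" | head -25   # per-thread CPU; note the busiest thread id
    printf '0x%x\n' 12345                 # convert that decimal id to hex (12345 is a stand-in)
    jstack "$pid" | grep -A 15 'nid=0x3039'   # locate the matching nid= in the thread dump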
stalled streaming
Hello,
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
Phil Stanhope pstanhope at wimba.com writes: How are you doing your inserts? I draw a clear line between 1) bootstrapping a cluster with data and 2) simulating expected/projected read/write behavior. If you are bootstrapping, then I would look into the batch_mutate APIs. They allow you to improve your performance on writes dramatically. If you are read/write testing on a populated cluster, insert and batch_insert (for super columns) are the way to go. As Ben has pointed out to me in numerous threads ... think carefully about your replication factor. Do you want the data on all nodes? Or sufficiently replicated so that you can recover? Do you want consistency at the time of write? Or eventually? Cassandra has a bunch of knobs that you can turn ... but that flexibility requires that you think about your expected usage patterns and operational policies. -phil

My inserts are being done 100 rows at a time using batch_mutate(). I bring up all 10 nodes in my Cassandra cluster at once (no live bootstrapping of nodes). Once they are up, I begin populating the database, running 8 write clients (on 8 different VMs), each writing 100 rows at a time. As mentioned earlier, each client writes to a different Cassandra server node so no one server node is fielding all the writes simultaneously. I have a replication factor of 3 because I need to be able to survive 2 out of 10 nodes going down at once. I am baffled by all the 'Value too large' exceptions that are occurring on every one of my 10 servers:

ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-14 19:30:24,471 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

It seems to be happening just after this is logged:

INFO [AE-SERVICE-STAGE:1] 2010-06-14 19:28:39,851 StreamOut.java

I'm also baffled that after all compactions are done on every one of the 10 servers, about 5 out of 10 servers are still at 40% CPU usage, although they are doing 0 disk IO. I am not running anything else on these server nodes except Cassandra. The compactions have been done for over an hour. The last write took place 5 hours ago. Thank you for any help, Julie
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
On Tue, Jun 15, 2010 at 5:15 PM, Julie julie.su...@nextcentury.com wrote: I'm also baffled that after all compactions are done on every one of the 10 servers, about 5 out of 10 servers are still at 40% CPU usage, although they are doing 0 disk IO. I am not running anything else on these server nodes except Cassandra. The compactions have been done for over an hour. The last write took place 5 hours ago. That actually sounds like https://issues.apache.org/jira/browse/CASSANDRA-1169, which is fixed for 0.6.3 -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
[OT] Real Time Open source solutions for aggregation and stream processing
Firstly, my apologies for the off-topic message, but I thought most people on this list would be knowledgeable about and interested in this kind of thing. We are looking to find an open source, scalable solution to do RT aggregation and stream processing (similar to what the 'hop' project http://code.google.com/p/hop/ set out to do) for large(ish) click-stream logs. My first thought was something like Esper, but in our testing it kind of hits the wall at around 10,000 rules per JVM. I was wondering if any of you had experience in this area, and what your favorite toolsets are. Currently we are using Cassandra and Redis with home-grown software to do the aggregation, but I'd love to use a common package if there is one. And again, apologies for the off-topic message and the x-posting. regards Ian
stalled streaming
Hello, I have a 4-node Cassandra cluster with 0.6.1 installed. We've been running a mixed read/write workload to test how it works in our environment; we run about 4M batch mutations and 40M get_range_slice requests over 6 to 8 hours that load about 10 to 15 GB of data. Yesterday while there was no activity I noticed 2 nodes sitting at 200% CPU on an 8-core machine. Thought nothing of it. Checked again this morning and they are still sitting at that level of activity with no requests going into them. Checking the streams using nodetool, I see node 3 is streaming to nodes 0 and 2, and it appears to have stalled. The information in the JMX console for streams matches the info below. I cannot see any errors in the logs. This is just a test system, so I am happy to bounce the JVMs. Before I do, is there anything else I should be looking for to understand why this happened? Also, sorry for the previous empty email.

Node 0
Mode: Normal
Nothing streaming to /192.168.34.27
Nothing streaming to /192.168.34.28
Nothing streaming to /192.168.34.29
Streaming from: /192.168.34.29
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db 0/22765
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db 0/10750717
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Index.db 0/58
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Filter.db 0/325
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Data.db 0/695
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Index.db 0/58
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Filter.db 0/325
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Data.db 0/695
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Index.db 0/587164
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Filter.db 0/22765
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Data.db 0/5159652
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-124-Data.db 22765/4966927
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Index.db 22765/1223053

Node 1
Mode: Normal
Nothing streaming to /192.168.34.26
Nothing streaming to /192.168.34.28
Nothing streaming to /192.168.34.29
Not receiving any streams.

Node 2
Mode: Normal
Nothing streaming to /192.168.34.26
Nothing streaming to /192.168.34.27
Nothing streaming to /192.168.34.29
Streaming from: /192.168.34.29
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db 0/22765
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db 0/2161660
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Index.db 0/787524
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Filter.db 0/22765
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Data.db 0/6917064
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Index.db 0/58
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Filter.db 0/565
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Data.db 0/695
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Index.db 0/581779
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Filter.db 0/22765
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Data.db 0/5111887
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Data.db 361367/3173057
 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Index.db 695/361367

Node 3
Mode: Normal
Streaming to: /192.168.34.26
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Filter.db 22765/22765
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Data.db 0/4966927
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Index.db 0/58
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Filter.db 0/325
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Data.db 0/695
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Index.db 0/1223053
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Filter.db 0/22765
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Data.db 0/10750717
 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-52-Index.db 0/58
Re: stalled streaming
Known bug, fixed in the latest 0.6 release. On Tue, Jun 15, 2010 at 3:29 PM, aaron aa...@thelastpickle.com wrote: Hello, I have a 4-node Cassandra cluster with 0.6.1 installed. We've been running a mixed read/write workload to test how it works in our environment; we run about 4M batch mutations and 40M get_range_slice requests over 6 to 8 hours that load about 10 to 15 GB of data. Yesterday while there was no activity I noticed 2 nodes sitting at 200% CPU on an 8-core machine. Thought nothing of it. Checked again this morning and they are still sitting at that level of activity with no requests going into them. Checking the streams using nodetool, I see node 3 is streaming to nodes 0 and 2, and it appears to have stalled. The information in the JMX console for streams matches the info below. I cannot see any errors in the logs. This is just a test system, so I am happy to bounce the JVMs. Before I do, is there anything else I should be looking for to understand why this happened? Also, sorry for the previous empty email. Node 0 Mode: Normal Nothing streaming to /192.168.34.27 Nothing streaming to /192.168.34.28 Nothing streaming to /192.168.34.29 Streaming from: /192.168.34.29 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db 0/10750717 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Index.db 0/58 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Filter.db 0/325 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Data.db 0/695 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Index.db 0/58 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Filter.db 0/325 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Data.db 0/695 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Index.db 0/587164 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Data.db 0/5159652 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-124-Data.db 22765/4966927 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Index.db 22765/1223053 Node 1 Mode: Normal Nothing streaming to /192.168.34.26 Nothing streaming to /192.168.34.28 Nothing streaming to /192.168.34.29 Not receiving any streams.
Node 2 Mode: Normal Nothing streaming to /192.168.34.26 Nothing streaming to /192.168.34.27 Nothing streaming to /192.168.34.29 Streaming from: /192.168.34.29 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db 0/2161660 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Index.db 0/787524 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Data.db 0/6917064 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Index.db 0/58 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Filter.db 0/565 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Data.db 0/695 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Index.db 0/581779 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Data.db 0/5111887 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Data.db 361367/3173057 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Index.db 695/361367 Node 3 Mode: Normal Streaming to: /192.168.34.26 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Filter.db 22765/22765 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Data.db 0/4966927 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Index.db 0/58 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Filter.db 0/325 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Data.db 0/695 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Index.db 0/1223053 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Filter.db
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
Benjamin Black b at b3k.us writes: Then write slower. There is no free lunch. b Are you implying that clients need to throttle their collective load on the server to avoid causing the server to fail? That seems undesirable. Is this a side effect of a server bug, or is it part of the intended design? Regards -- Charlie
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
On Tue, Jun 15, 2010 at 3:55 PM, Charles Butterfield charles.butterfi...@nextcentury.com wrote: Benjamin Black b at b3k.us writes: Then write slower. There is no free lunch. b Are you implying that clients need to throttle their collective load on the server to avoid causing the server to fail? That seems undesirable. Is this a side effect of a server bug, or is it part of the intended design? I am only saying something obvious: if you don't have sufficient resources to handle the demand, you should reduce demand, increase resources, or expect errors. Doing lots of writes without much heap space is such a situation (whether or not it is happening in this instance), but there are many others. This constraint is not specific to Cassandra. Hence, there is no free lunch. b
Re: stalled streaming
Thanks, will move to 0.6.2. Aaron On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote: Known bug, fixed in the latest 0.6 release. On Tue, Jun 15, 2010 at 3:29 PM, aaron aa...@thelastpickle.com wrote: Hello, I have a 4-node Cassandra cluster with 0.6.1 installed. We've been running a mixed read/write workload to test how it works in our environment; we run about 4M batch mutations and 40M get_range_slice requests over 6 to 8 hours that load about 10 to 15 GB of data. Yesterday while there was no activity I noticed 2 nodes sitting at 200% CPU on an 8-core machine. Thought nothing of it. Checked again this morning and they are still sitting at that level of activity with no requests going into them. Checking the streams using nodetool, I see node 3 is streaming to nodes 0 and 2, and it appears to have stalled. The information in the JMX console for streams matches the info below. I cannot see any errors in the logs. This is just a test system, so I am happy to bounce the JVMs. Before I do, is there anything else I should be looking for to understand why this happened? Also, sorry for the previous empty email. Node 0 Mode: Normal Nothing streaming to /192.168.34.27 Nothing streaming to /192.168.34.28 Nothing streaming to /192.168.34.29 Streaming from: /192.168.34.29 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db 0/10750717 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Index.db 0/58 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Filter.db 0/325 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Data.db 0/695 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Index.db 0/58 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Filter.db 0/325 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Data.db 0/695 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Index.db 0/587164 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Data.db 0/5159652 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-124-Data.db 22765/4966927 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Index.db 22765/1223053 Node 1 Mode: Normal Nothing streaming to /192.168.34.26 Nothing streaming to /192.168.34.28 Nothing streaming to /192.168.34.29 Not receiving any streams.
Node 2 Mode: Normal Nothing streaming to /192.168.34.26 Nothing streaming to /192.168.34.27 Nothing streaming to /192.168.34.29 Streaming from: /192.168.34.29 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db 0/2161660 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Index.db 0/787524 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Data.db 0/6917064 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Index.db 0/58 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Filter.db 0/565 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Data.db 0/695 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Index.db 0/581779 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Filter.db 0/22765 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Data.db 0/5111887 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Data.db 361367/3173057 junkbox.mycompany: /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Index.db 695/361367 Node 3 Mode: Normal Streaming to: /192.168.34.26 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Filter.db 22765/22765 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Data.db 0/4966927 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Index.db 0/58 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Filter.db 0/325 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Data.db 0/695
RE: read operation is slow
Thanks for your updates, good to know that your performance is better now. Actually, if the user asks for one record at a time, it will usually be done with multi-threading, since most likely the requests come from different users. If a single user wants 200k, there is no difference between getting 1 at a time and getting 100 at a time, since the result set is exactly the same. By the way: the Jassandra you are using has been updated for the issue you raised. Now the result set from select is sorted. Thanks, Regards, dop

From: Caribbean410 [mailto:caribbean...@gmail.com] Sent: Tuesday, June 15, 2010 9:16 AM To: user@cassandra.apache.org Subject: Re: read operation is slow

Now I read 100 records each time, and the total time to read 200k records (1M each) reduces to 10s. Looks good. But I am still curious how to handle the case where users read one record each time.

On Fri, Jun 11, 2010 at 6:05 PM, Dop Sun su...@dopsun.com wrote: And also, you are only selecting 1 key and 10 columns? criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10); Then, if you have 200k keys, you have 200k Thrift calls. If this is the case, you may need to optimize the way you do the query (to combine multiple keys into a single query), and to reduce the number of calls.

From: Dop Sun [mailto:su...@dopsun.com] Sent: Saturday, June 12, 2010 8:57 AM To: user@cassandra.apache.org Subject: RE: read operation is slow

You mean after you removed some unnecessary column families and changed the size of the rowcache and keycache, the latency changed from 0.25ms to 0.09ms? In essence 0.09ms * 200k = 18s, yet it still takes 400 seconds to return?

From: Caribbean410 [mailto:caribbean...@gmail.com] Sent: Saturday, June 12, 2010 8:48 AM To: user@cassandra.apache.org Subject: Re: read operation is slow

Hi, do you mean this one should not introduce much extra delay? To read a record I need a select here; not sure where the extra delay comes from.

On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun su...@dopsun.com wrote: Jassandra is used here: Map<String, List<IColumn>> map = criteria.select(); The select here is basically a call to the Thrift API get_range_slices.

From: Caribbean410 [mailto:caribbean...@gmail.com] Sent: Saturday, June 12, 2010 8:00 AM To: user@cassandra.apache.org Subject: Re: read operation is slow

I removed some unnecessary column families and changed the size of the rowcache and keycache; now the latency changes from 0.25ms to 0.09ms. In essence 0.09ms * 200k = 18s. I don't know why it takes more than 400s total. Here is the client code and cfstats. There are not many operations here; why is the extra time so large?

long start = System.currentTimeMillis();
for (int j = 0; j < 1; j++) {
    for (int i = 0; i < numOfRecords; i++) {
        int n = random.nextInt(numOfRecords);
        ICriteria criteria = cf.createCriteria();
        userName = keySet[n];
        criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10);
        Map<String, List<IColumn>> map = criteria.select();
        List<IColumn> list = map.get(userName);
        // ByteArray bloc = list.get(0).getValue();
        // byte[] byteArrayloc = bloc.toByteArray();
        // loc = new String(byteArrayloc);
        // readBytes = readBytes + loc.length();
        readBytes = readBytes + blobSize;
    }
}
long finish = System.currentTimeMillis();
float totalTime = (finish - start) / 1000;

Keyspace: Keyspace1
    Read Count: 60
    Read Latency: 0.090530067 ms.
    Write Count: 20
    Write Latency: 0.01504989 ms.
    Pending Tasks: 0
        Column Family: Standard2
        SSTable count: 3
        Space used (live): 265990358
        Space used (total): 265990358
        Memtable Columns Count: 2615
        Memtable Data Size: 2667300
        Memtable Switch Count: 3
        Read Count: 60
        Read Latency: 0.091 ms.
        Write Count: 20
        Write Latency: 0.015 ms.
        Pending Tasks: 0
        Key cache capacity: 1000
        Key cache size: 187465
        Key cache hit rate: 0.0
        Row cache capacity: 1000
        Row cache size: 189990
        Row cache hit rate: 0.68335
        Compacted row minimum size: 0
        Compacted row maximum size: 0
        Compacted row mean size: 0

Keyspace: system
    Read Count: 1
    Read Latency: 10.954 ms.
    Write Count: 4
    Write Latency: 0.28075 ms.
    Pending Tasks: 0
        Column Family: HintsColumnFamily
        SSTable count: 0
        Space used (live): 0
        Space used (total): 0
        Memtable Columns Count: 0
        Memtable Data Size: 0
        Memtable Switch
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
Benjamin Black b at b3k.us writes: I am only saying something obvious: if you don't have sufficient resources to handle the demand, you should reduce demand, increase resources, or expect errors. Doing lots of writes without much heap space is such a situation (whether or not it is happening in this instance), but there are many others. This constraint is not specific to Cassandra. Hence, there is no free lunch. b I guess my point is that I have rarely run across database servers that die from either too many client connections or too rapid client requests. They generally stop accepting incoming connections when there are too many connection requests, and further they do not queue and acknowledge an unbounded number of client requests on any given connection. In the example at hand, Julie has 8 clients, each of which is in a loop that writes 100 rows at a time (via batch_mutate), waits for successful completion, then writes another bunch of 100, until it completes all of the rows it is supposed to write (typically 100,000). So at any one time, each client should have about 10 MB of request (100 rows x 100 KB/row), times 8 clients, for a max pending request of no more than 80 MB. Further, each request is running with CL=ALL, so in theory the request should not complete until each row has been handed off to the ultimate destination node, and perhaps written to the commit log (that part is not clear to me). It sounds like something else must be gobbling up either an unbounded amount of heap, or alternatively, a bounded but large amount of heap. In the former case it is unclear how to make the application robust. In the latter, it would be helpful to understand what the heap usage upper bound is, and what parameters might have a significant effect on that value. To clarify the history here -- initially we were writing with CL=0 and had great performance but ended up killing the server. It was pointed out that we were really asking the server to accept and acknowledge an unbounded number of requests without waiting for any final disposition of the rows. So we had a doh! moment. That is why we went to the other extreme of CL=ALL, to let the server fully dispose of each request before acknowledging it and getting the next. TIA -- Charlie
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
Actually, you shouldn't expect errors in the general case, unless you are simply trying to use data that can't fit in available heap. There are some practical limitations, as always. If there aren't enough resources on the server side to service the clients, the expectation should be that the servers have a graceful performance degradation, or in the worst case throw an error specific to resource exhaustion or explicit resource throttling. The fact that Cassandra does some background processing complicates this a bit. There are things which can cause errors after the fact, but these are generally considered resource tuning issues and are somewhat clear cut. There are specific changes in the works to bring background load exceptions into view of a client session, where users normally expect them. @see https://issues.apache.org/jira/browse/CASSANDRA-685 But otherwise, users shouldn't be expecting that simply increasing client load can blow up their Cassandra cluster. Any time this happens, it should be considered a bug or a misfeature. Devs please correct me here if I'm wrong. Jonathan On Tue, Jun 15, 2010 at 6:44 PM, Charles Butterfield charles.butterfi...@nextcentury.com wrote: Benjamin Black b at b3k.us writes: I am only saying something obvious: if you don't have sufficient resources to handle the demand, you should reduce demand, increase resources, or expect errors. Doing lots of writes without much heap space is such a situation (whether or not it is happening in this instance), but there are many others. This constraint is not specific to Cassandra. Hence, there is no free lunch. b I guess my point is that I have rarely run across database servers that die from either too many client connections or too rapid client requests. They generally stop accepting incoming connections when there are too many connection requests, and further they do not queue and acknowledge an unbounded number of client requests on any given connection. In the example at hand, Julie has 8 clients, each of which is in a loop that writes 100 rows at a time (via batch_mutate), waits for successful completion, then writes another bunch of 100, until it completes all of the rows it is supposed to write (typically 100,000). So at any one time, each client should have about 10 MB of request (100 rows x 100 KB/row), times 8 clients, for a max pending request of no more than 80 MB. Further, each request is running with CL=ALL, so in theory the request should not complete until each row has been handed off to the ultimate destination node, and perhaps written to the commit log (that part is not clear to me). It sounds like something else must be gobbling up either an unbounded amount of heap, or alternatively, a bounded but large amount of heap. In the former case it is unclear how to make the application robust. In the latter, it would be helpful to understand what the heap usage upper bound is, and what parameters might have a significant effect on that value. To clarify the history here -- initially we were writing with CL=0 and had great performance but ended up killing the server. It was pointed out that we were really asking the server to accept and acknowledge an unbounded number of requests without waiting for any final disposition of the rows. So we had a doh! moment. That is why we went to the other extreme of CL=ALL, to let the server fully dispose of each request before acknowledging it and getting the next. TIA -- Charlie
Some questions about using Cassandra
We are currently looking at a distributed database option and so far Cassandra ticks all the boxes. However, I still have some questions. Is there any need for archiving of Cassandra and what backup options are available? As it is a no-data-loss system I'm guessing archiving is not exactly relevant. Is there any concept of Listeners such that when data is added to Cassandra we can fire off another process to do something with that data? E.g. create a copy in a secondary database for Business Intelligence reports? Send the data to an LDAP server? Anthony Ikeda Java Analyst/Programmer Cardlink Services Limited Level 4, 3 Rider Boulevard Rhodes NSW 2138 Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283
Re: Some questions about using Cassandra
There is JSON import and export, if you want a form of external backup. No, you can't hook event subscribers into the storage engine. You can modify it to do this, however. It may not be trivial. An easier way to do this would be to have a boundary system (or dedicated thread, for example) consume data in small amounts, using some temporal criterion, with a checkpoint. If the results of consuming the data are idempotent, you don't have to use a checkpoint, necessarily, but some cyclic rework may occur. If your storage layout includes temporal names, it should be straightforward. The details of exactly how would depend on your storage layout, but it is not unusual as far as requirements go. On Tue, Jun 15, 2010 at 7:49 PM, Anthony Ikeda anthony.ik...@cardlink.com.au wrote: We are currently looking at a distributed database option and so far Cassandra ticks all the boxes. However, I still have some questions. Is there any need for archiving of Cassandra and what backup options are available? As it is a no-data-loss system I'm guessing archiving is not exactly relevant. Is there any concept of Listeners such that when data is added to Cassandra we can fire off another process to do something with that data? E.g. create a copy in a secondary database for Business Intelligence reports? Send the data to an LDAP server? Anthony Ikeda Java Analyst/Programmer Cardlink Services Limited Level 4, 3 Rider Boulevard Rhodes NSW 2138 Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283
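A minimal sketch of that boundary-consumer pattern, against the 0.6-era Thrift API. The column family, row key, and the checkpoint/forwarding helpers are hypothetical names for illustration only:

    import java.nio.ByteBuffer;
    import java.util.List;
    import org.apache.cassandra.thrift.*;

    // Poll a time-ordered row since the last checkpoint, hand each column to
    // a downstream system (e.g. a BI database), then advance the checkpoint.
    public class BoundaryConsumer {
        static byte[] tsBytes(long ts) {           // timestamps as 8-byte column names
            return ByteBuffer.allocate(8).putLong(ts).array();
        }

        static long pollOnce(Cassandra.Client client, long checkpoint) throws Exception {
            SliceRange range = new SliceRange(tsBytes(checkpoint + 1),
                                              tsBytes(Long.MAX_VALUE), false, 1000);
            SlicePredicate pred = new SlicePredicate();
            pred.setSlice_range(range);
            ColumnParent parent = new ColumnParent();
            parent.setColumn_family("EventLog");   // hypothetical CF of temporal names
            List<ColumnOrSuperColumn> cols = client.get_slice(
                    "Keyspace1", "events", parent, pred, ConsistencyLevel.ONE);
            for (ColumnOrSuperColumn cosc : cols) {
                // forwardToSecondaryStore(cosc.column);  // hypothetical, idempotent
                checkpoint = ByteBuffer.wrap(cosc.column.name).getLong();
            }
            // saveCheckpoint(checkpoint);  // hypothetical; bounds rework on restart
            return checkpoint;
        }
    }

If the handoff is idempotent, a lost checkpoint only causes some columns to be re-consumed, which matches the cyclic-rework caveat above.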
Re: Some questions about using Cassandra
Doh! Replace of with if in the top line. On Tue, Jun 15, 2010 at 7:57 PM, Jonathan Shook jsh...@gmail.com wrote: There is JSON import and export, of you want a form of external backup. No, you can't hook event subscribers into the storage engine. You can modify it to do this, however. It may not be trivial. An easier way to do this would be to have a boundary system (or dedicated thread, for example) consume data in small amounts, using some temporal criterion, with a checkpoint. If the results of consuming the data are idempotent, you don't have to use a checkpoint, necessarily, but some cyclic rework may occur. If your storage layout includes temporal names, it should be straightforward. The details of exactly how would depend on your storage layout, but it is not unusual as far as requirements go.
RE: Some questions about using Cassandra
Thanks Jonathan, I was only asking about the event listeners because an alternative we are considering is TIBCO Active Spaces which draws quite a lot of parallels to Cassandra. I guess it would be interesting to find out how other people use Cassandra, i.e., is it your one stop shop for data storage or do you also store to an RDBMS to re-use the data? One factor I need to consider is our Business Intelligence platform that will need to use the data stored for reporting purposes. We are looking at using Cassandra for our real-time layer for Active-Active data centre use and perhaps have Oracle installed alongside for non-real-time use such that data is mediated to the Oracle database for other uses. Anthony From: Jonathan Shook [mailto:jsh...@gmail.com] Sent: Wednesday, 16 June 2010 10:58 AM To: user@cassandra.apache.org Subject: Re: Some questions about using Cassandra
Re: stalled streaming
This is not the bug to which I was referring. I don't recall the number, perhaps someone else can assist on that front? I just know I specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the fix (and it worked). b On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli rc...@digg.com wrote: On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote: Known bug, fixed in latest 0.6 release. On 6/15/10 4:06 PM, aaron wrote: Thanks, will move to 0.6.2. I believe that this thread refers to CASSANDRA-1169, and fix version for that is the (unreleased) cassandra 0.6.3, not (the latest 0.6 release) 0.6.2. https://issues.apache.org/jira/browse/CASSANDRA-1169 =Rob
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield charles.butterfi...@nextcentury.com wrote: I guess my point is that I have rarely run across database servers that die from either too many client connections, or too rapid client requests. They generally stop accepting incoming connections when there are too many connection requests, and further they do not queue and acknowledge an unbounded number of client requests on any given connection. Not what I am suggesting. Instead, I am saying things can be tuned to behave in various ways: you can arbitrarily back up client requests such that they start timing out, you can accept them and run out of memory, you can start swapping and go into a GC death spiral and have nodes drop off the ring, etc. There just isn't a situation in which you can satisfy an arbitrary load in limited time with insufficient resources. The exact same constraints apply to other databases, like MySQL, and the expectation to tune for your needs is the same. b
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield charles.butterfi...@nextcentury.com wrote: To clarify the history here -- initially we were writing with CL=0 and had great performance but ended up killing the server. It was pointed out that we were really asking the server to accept and acknowledge an unbounded number of requests without waiting for any final disposition of the rows. So we had a doh! moment. That is why we went to the other extreme of CL=ALL, to let the server fully dispose of each request before acknowledging it and getting the next. CL.ALL (and CL.QUORUM) go through the strongly consistent write path, while CL.ONE (and CL.ANY) go through the weakly consistent write path. Using CL.ALL you aren't _letting_ the server fully dispose of each request, you are _requiring_ it to hold on to resources until _all_ replicas confirm the write. With CL.ONE, the writes are asynchronous, with only a single success required before results are returned to the client, consuming fewer server resources. CL.ONE is what I recommend you use unless you have specific needs to the contrary. Unfortunately, the best documentation for this is the code, though it is mentioned briefly here: http://wiki.apache.org/cassandra/ArchitectureOverview b
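To make the difference concrete, a small sketch against the 0.6-era Thrift API (the keyspace, column family, and keys below are illustrative assumptions):

    import org.apache.cassandra.thrift.*;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;

    public class ConsistencyDemo {
        public static void main(String[] args) throws Exception {
            TSocket socket = new TSocket("localhost", 9160);
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));
            socket.open();

            ColumnPath path = new ColumnPath();
            path.setColumn_family("Standard1");
            path.setColumn("payload".getBytes());
            long ts = System.currentTimeMillis() * 1000;
            byte[] value = new byte[100 * 1024];

            // CL.ONE: returns once a single replica has acknowledged; the
            // remaining replicas are updated asynchronously.
            client.insert("Keyspace1", "row1", path, value, ts, ConsistencyLevel.ONE);

            // CL.ALL: blocks until every replica acknowledges, so the
            // coordinator holds on to resources for the slowest replica.
            client.insert("Keyspace1", "row2", path, value, ts, ConsistencyLevel.ALL);

            socket.close();
        }
    }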
Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
On Tue, Jun 15, 2010 at 4:58 PM, Jonathan Shook jsh...@gmail.com wrote: If there aren't enough resources on the server side to service the clients, the expectation should be that the servers have a graceful performance degradation, or in the worst case throw an error specific to resource exhaustion or explicit resource throttling. The fact that Cassandra does some background processing complicates this a bit. This is actually one of the most significant complications: graceful performance degradation can quickly lead to GC pauses long enough for a node to be marked down by the rest of the cluster. You could try to trade a lot of CPU for limited memory by tuning GC parameters to be _really_ aggressive, I suppose, but that doesn't strike me as a great strategy. The nature of distributed systems like this presents challenges that are simply not present in centralized databases (or that show up there in even uglier forms). I agree completely with the need to have the server engage in self-preserving activity should it start nearing limits, and to signal that change to clients and in logs. Definitely room for improvement, regardless of state of tune. b
Re: Some questions about using Cassandra
On 6/15/10 6:35 PM, Benjamin Black wrote: jmhodges contributed a patch (I remain incompetent at Jira searches) for 'coprocessors' to do what you want. That'd be where I'd start looking. https://issues.apache.org/jira/browse/CASSANDRA-1016 =Rob
RE: Some questions about using Cassandra
Thanks Benjamin. Looking at the 'plugins' now :) -Original Message- From: Benjamin Black [mailto:b...@b3k.us] Sent: Wednesday, 16 June 2010 11:35 AM To: user@cassandra.apache.org Subject: Re: Some questions about using Cassandra On Tue, Jun 15, 2010 at 6:07 PM, Anthony Ikeda anthony.ik...@cardlink.com.au wrote: Thanks Jonathan, I was only asking about the event listeners because an alternative we are considering is TIBCO Active Spaces which draws quite a lot of parallels to Cassandra. Based on painful production experience, I would not consider TIBCO for anything requiring reliable delivery. The failure modes and edge cases are too numerous and unpleasant to enumerate. jmhodges contributed a patch (I remain incompetent at Jira searches) for 'coprocessors' to do what you want. That'd be where I'd start looking. b
Re: stalled streaming
I think the one you're referring to is https://issues.apache.org/jira/browse/CASSANDRA-1076 On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black b...@b3k.us wrote: This is not the bug to which I was referring. I don't recall the number, perhaps someone else can assist on that front? I just know I specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the fix (and it worked). b On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli rc...@digg.com wrote: On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote: Known bug, fixed in latest 0.6 release. On 6/15/10 4:06 PM, aaron wrote: Thanks, will move to 0.6.2. I believe that this thread refers to CASSANDRA-1169, and fix version for that is the (unreleased) cassandra 0.6.3, not (the latest 0.6 release) 0.6.2. https://issues.apache.org/jira/browse/CASSANDRA-1169 =Rob -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: stalled streaming
Yes! On Tue, Jun 15, 2010 at 6:44 PM, Jonathan Ellis jbel...@gmail.com wrote: I think the one you're referring to is https://issues.apache.org/jira/browse/CASSANDRA-1076 On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black b...@b3k.us wrote: This is not the bug to which I was referring. I don't recall the number, perhaps someone else can assist on that front? I just know I specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the fix (and it worked). b On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli rc...@digg.com wrote: On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote: Known bug, fixed in latest 0.6 release. On 6/15/10 4:06 PM, aaron wrote: Thanks, will move to 0.6.2. I believe that this thread refers to CASSANDRA-1169, and fix version for that is the (unreleased) cassandra 0.6.3, not (the latest 0.6 release) 0.6.2. https://issues.apache.org/jira/browse/CASSANDRA-1169 =Rob -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: JVM Options for Production
The main change you'd commonly make is decreasing the max new gen size on large heaps (say to 2GB) from the default of 1/3 of the heap. IMO keeping heap dump on OOM around is a good idea in production; it doesn't cost much (you're already screwed at the point where it starts writing a dump, so why not) and it can be useful. On Mon, Jun 14, 2010 at 6:01 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: Hi, I was updating to a newer 0.6.3 and happened to remember that I noticed back in 0.6.2 there's this change in CHANGES.txt * improve default JVM GC options (CASSANDRA-1014) Looking at that ticket, I don't actually see the options listed or a reason for why they changed. Also, I'm not certain which options are now recommended for a production system versus what's in the distribution. The distribution (well svn) for 0.6.x currently has

    JVM_OPTS= \
            -ea \
            -Xms256M \
            -Xmx1G \
            -XX:+UseParNewGC \
            -XX:+UseConcMarkSweepGC \
            -XX:+CMSParallelRemarkEnabled \
            -XX:SurvivorRatio=8 \
            -XX:MaxTenuringThreshold=1 \
            -XX:+HeapDumpOnOutOfMemoryError \
            -Dcom.sun.management.jmxremote.port=8080 \
            -Dcom.sun.management.jmxremote.ssl=false \
            -Dcom.sun.management.jmxremote.authenticate=false

Now I would assume that for 'production' you want to remove -ea and -XX:+HeapDumpOnOutOfMemoryError as well as adjust -Xms and -Xmx accordingly, but are there any others which should be tweaked? Is there actually a recommended production set of values or does it vary greatly from installation to installation? Thanks, -Anthony -- Anthony Molinaro antho...@alumni.caltech.edu -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
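Following that advice, a production variant of the options above might look like this (the 8G heap is an illustrative assumption, not a recommendation from this thread; -Xmn caps the new gen at 2G, -ea is dropped, and the heap dump flag stays per the comment above):

    JVM_OPTS= \
            -Xms8G \
            -Xmx8G \
            -Xmn2G \
            -XX:+UseParNewGC \
            -XX:+UseConcMarkSweepGC \
            -XX:+CMSParallelRemarkEnabled \
            -XX:SurvivorRatio=8 \
            -XX:MaxTenuringThreshold=1 \
            -XX:+HeapDumpOnOutOfMemoryError \
            -Dcom.sun.management.jmxremote.port=8080 \
            -Dcom.sun.management.jmxremote.ssl=false \
            -Dcom.sun.management.jmxremote.authenticate=false

Pinning -Xms to -Xmx avoids heap resizing at runtime; that too is common practice rather than something stated in this thread.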
RE: read operation is slow
Thank you for the update. For the select issue, right now we just focus on read and write; later we may test the delete operation, which needs to query all keys. From: Dop Sun [mailto:su...@dopsun.com] Sent: Tuesday, June 15, 2010 4:14 PM To: user@cassandra.apache.org Subject: RE: read operation is slow Thanks for your updates, good to know that your performance is better now. Actually, if users ask for one record at a time, it will usually be done with multi-threading, since most likely the requests come from different users. If a single user wants 200k records, there is no difference between getting 1 at a time or 100 at a time, since the result set is exactly the same. By the way: the Jassandra you are using is updated for the issue you raised. Now, the result set from select is sorted. Thanks, Regards, dop From: Caribbean410 [mailto:caribbean...@gmail.com] Sent: Tuesday, June 15, 2010 9:16 AM To: user@cassandra.apache.org Subject: Re: read operation is slow Now I read 100 records each time, and the total time to read 200k records (1M each) reduces to 10s. Looks good. But I am still curious how to handle the case where users read one record each time. On Fri, Jun 11, 2010 at 6:05 PM, Dop Sun su...@dopsun.com wrote: And also, you are only selecting 1 key and 10 columns? criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10); Then, if you have 200k keys, you have 200k Thrift calls. If this is the case, you may need to optimize the way you do the query (to combine multiple keys into a single query), and to reduce the number of calls. From: Dop Sun [mailto:su...@dopsun.com] Sent: Saturday, June 12, 2010 8:57 AM To: user@cassandra.apache.org Subject: RE: read operation is slow You mean after you removed some unnecessary column families and changed the size of the row cache and key cache, the latency changed from 0.25ms to 0.09ms? In essence 0.09ms * 200k = 18s, yet it still takes 400 seconds to return? From: Caribbean410 [mailto:caribbean...@gmail.com] Sent: Saturday, June 12, 2010 8:48 AM To: user@cassandra.apache.org Subject: Re: read operation is slow Hi, do you mean this one should not introduce much extra delay? To read a record, I need a select here; not sure where the extra delay comes from. On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun su...@dopsun.com wrote: Jassandra is used here: Map<String, List<IColumn>> map = criteria.select(); The select here is basically a call to the Thrift API get_range_slices. From: Caribbean410 [mailto:caribbean...@gmail.com] Sent: Saturday, June 12, 2010 8:00 AM To: user@cassandra.apache.org Subject: Re: read operation is slow I removed some unnecessary column families and changed the size of the rowcache and keycache; now the latency changes from 0.25ms to 0.09ms. In essence 0.09ms * 200k = 18s. I don't know why it takes more than 400s total. Here is the client code and cfstats. There are not many operations here, why is the extra time so large?
    long start = System.currentTimeMillis();
    for (int j = 0; j < 1; j++) {
        for (int i = 0; i < numOfRecords; i++) {
            int n = random.nextInt(numOfRecords);
            ICriteria criteria = cf.createCriteria();
            userName = keySet[n];
            criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10);
            Map<String, List<IColumn>> map = criteria.select();
            List<IColumn> list = map.get(userName);
            // ByteArray bloc = list.get(0).getValue();
            // byte[] byteArrayloc = bloc.toByteArray();
            // loc = new String(byteArrayloc);
            // readBytes = readBytes + loc.length();
            readBytes = readBytes + blobSize;
        }
    }
    long finish = System.currentTimeMillis();
    float totalTime = (finish - start) / 1000f;  // float division keeps sub-second precision

Keyspace: Keyspace1
  Read Count: 60
  Read Latency: 0.090530067 ms.
  Write Count: 20
  Write Latency: 0.01504989 ms.
  Pending Tasks: 0
    Column Family: Standard2
    SSTable count: 3
    Space used (live): 265990358
    Space used (total): 265990358
    Memtable Columns Count: 2615
    Memtable Data Size: 2667300
    Memtable Switch Count: 3
    Read Count: 60
    Read Latency: 0.091 ms.
    Write Count: 20
    Write Latency: 0.015 ms.
    Pending Tasks: 0
    Key cache capacity: 1000
    Key cache size: 187465
    Key cache hit rate: 0.0
    Row cache capacity: 1000
    Row cache size: 189990
    Row cache hit rate: 0.68335
    Compacted row minimum size: 0
    Compacted row maximum size: 0
    Compacted row mean size: 0
Keyspace: system
  Read Count: 1
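For reference, the batching Dop suggests would look something like this with the same Jassandra criteria API used above (reusing cf, keySet, random, numOfRecords, and nameFirst from the snippet; the batch size of 100 is an illustrative assumption):

    List<String> keys = new ArrayList<String>();
    for (int i = 0; i < 100; i++) {
        keys.add(keySet[random.nextInt(numOfRecords)]);
    }
    ICriteria criteria = cf.createCriteria();
    criteria.keyList(keys).columnRange(nameFirst, nameFirst, 10);
    Map<String, List<IColumn>> map = criteria.select();  // one round trip, 100 rows

One select() over 100 keys replaces 100 Thrift round trips, which is consistent with per-call overhead, rather than server-side read latency, dominating the 400-second total.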