CF that is like a non-clustered index, are key lookups that fast?

2010-06-15 Thread S Ahmed
If you store only the key mappings in a column family, for custom ordering
of rows etc. for things like:

friends = {

   user_id : { friendid1, friendid2, }

}

or

topForumPosts = {

forum_id1 : { post2343, post32343, post32223, ...}

}


Now on friends page or on the top_forum_posts page you will get back a list
of post_ids, you will then have to perform lookups on the main 'posts' CF to
get the actual data.  So if a page is displaying 10, 25, or 50 posts you
will have 10, 25 or 50 key based lookups for each page view.

Is this the suggested way?  i.e. a lookup based on a slice to get a list of
post_id's, then a separate call to actually fetch the data for the given
entity.

Or is cassandra so fast that 50 key based calls is no reason to worry?


How to get previous / next data?

2010-06-15 Thread Bram van der Waaij
Hello,

We want to use cassandra to store and retrieve time related data. Storing
the time-value pairs is easy and works perfectly. The problem arrives at
retrieving the data. We do not only want to retrieve data from within a time
range, but also be able to get the previous and/or next data sample from a
specific point in time.

The next in time I can retrieve by asking for the range
timestamp...maxtime and requesting 2 items, which returns the timestamp (if
available) and the next timestamp-value.

Does anybody know how I can retrieve the previous timestamp? The columns are
sorted on the key (timestamp), so the previous/next request should not be
difficult to perform.

Any suggestions are welcome!

Thanks,

Bram van der Waaij


Re: How to get previous / next data?

2010-06-15 Thread Sylvain Lebresne
You want to use 'reversed' in SliceRange (and a start with whatever
you want and a count of 2).

--
Sylvain

On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij
bramat...@gmail.com wrote:
 Hello,

 We want to use cassandra to store and retrieve time related data. Storing
 the time-value pairs is easy and works perfectly. The problem arrives at
 retrieving the data. We do not only want to retrieve data from within a time
 range, but also be able to get the previous and/or next data sample from a
 specific point in time.

 The next in time i can retrieve by giving asking for the range
 timestamp...maxtime and request for 2 items. Which returns timestamp (if
 available) and the next timestamp-value.

 Does anybody know how i can retrieve the previous timestamp? The columns are
 sorted on the key(timestamp), so the previous request/next should not be a
 difficult to perform.

 Any suggestions are welcome!

 Thanks,

 Bram van der Waaij
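
A minimal sketch of the lookup Sylvain describes, written against the 0.6-era
Thrift API (the keyspace and column family names, the 'client' connection, and
the timestamp encoding are assumptions for illustration; check the generated
method signatures against your own bindings):

import java.util.List;
import org.apache.cassandra.thrift.*;

// Slice backwards from a point in time: reversed=true with count=2 returns the
// column at 'pointInTime' (if it exists) followed by the one just before it.
// 'pointInTime' must be encoded the same way the timestamp column names were written.
static List<ColumnOrSuperColumn> atAndBefore(Cassandra.Client client,
        String rowKey, byte[] pointInTime) throws Exception {
    SliceRange range = new SliceRange();
    range.setStart(pointInTime);      // begin at the reference timestamp
    range.setFinish(new byte[0]);     // empty finish = go back as far as the oldest column
    range.setReversed(true);          // walk the columns in descending order
    range.setCount(2);                // current sample (if present) plus the previous one

    SlicePredicate predicate = new SlicePredicate();
    predicate.setSlice_range(range);

    ColumnParent parent = new ColumnParent();
    parent.setColumn_family("TimeSeries");   // assumed column family name

    return client.get_slice("Keyspace1", rowKey, parent, predicate, ConsistencyLevel.ONE);
}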


Re: JVM Options for Production

2010-06-15 Thread Ted Zlatanov
On Mon, 14 Jun 2010 16:01:57 -0700 Anthony Molinaro 
antho...@alumni.caltech.edu wrote: 

AM Now I would assume that for 'production' you want to remove
AM-ea
AM and
AM-XX:+HeapDumpOnOutOfMemoryError

AM as well as adjust -Xms and -Xmx accordingly, but are there any others
AM which should be tweaked?  Is there actually a recommended production
AM set of values or does it vary greatly from installation to installation?

I brought this up as well here:

http://thread.gmane.org/gmane.comp.db.cassandra.user/2083/focus=2093

Ted



Re: CF that is like a non-clustered index, are key lookups that fast?

2010-06-15 Thread S Ahmed
Well, it won't be a range; it will be random key lookups.

On Tue, Jun 15, 2010 at 8:44 AM, Gary Dusbabek gdusba...@gmail.com wrote:

 On Tue, Jun 15, 2010 at 04:29, S Ahmed sahmed1...@gmail.com wrote:
  If you store only the key mappings in a column family, for custom
 ordering
  of rows etc. for things like:
  friends = {
 
 user_id : { friendid1, friendid2, }
  }
  or
  topForumPosts = {
 
  forum_id1 : { post2343, post32343, post32223, ...}
  }
 
  Now on friends page or on the top_forum_posts page you will get back a
 list
  of post_ids, you will then have to perform lookups on the main 'posts' CF
 to
  get the actual data.  So if a page is displaying 10, 25, or 50 posts you
  will have 10, 25 or 50 key based lookups for each page view.
  Is this the suggested way?  i.e. a look based on a slice to get a list of
  post_id's, then a seperate call to actually fetch the data for the given
  entity.
  Or is cassandra so fast that 50 key based calls is no reason to worry?

 You should look at using either multi_get_slice or get_range_slices.
 You'll save on network trips and the amount of work required of the
 cluster.

 Gary.
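
A minimal sketch of the batched lookup Gary suggests, against the 0.6-era
Thrift API (the keyspace and column family names are assumptions, as is the
idea that post ids are used directly as row keys of the 'posts' CF; verify the
generated signatures against your own bindings):

import java.util.List;
import java.util.Map;
import org.apache.cassandra.thrift.*;

// Fetch a page worth of posts in one round trip instead of one get per post id.
// 'client' is an open Cassandra.Client Thrift connection; 'postIds' holds the
// 10/25/50 row keys sliced out of the topForumPosts (or friends) index row.
static Map<String, List<ColumnOrSuperColumn>> fetchPosts(Cassandra.Client client,
        List<String> postIds) throws Exception {
    SliceRange range = new SliceRange();
    range.setStart(new byte[0]);   // empty start/finish = every column in each row
    range.setFinish(new byte[0]);
    range.setReversed(false);
    range.setCount(1000);          // upper bound on columns returned per post row

    SlicePredicate predicate = new SlicePredicate();
    predicate.setSlice_range(range);

    ColumnParent parent = new ColumnParent();
    parent.setColumn_family("posts");   // assumed column family name

    return client.multiget_slice("Keyspace1", postIds, parent, predicate,
            ConsistencyLevel.ONE);
}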



Re: How to get previous / next data?

2010-06-15 Thread Bram van der Waaij
Perfect! Thanks :-)

2010/6/15 Sylvain Lebresne sylv...@yakaz.com

 You want to use 'reversed' in SliceRange (and a start with whatever
 you want and a count of 2).

 --
 Sylvain

 On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij
 bramat...@gmail.com wrote:
  Hello,
 
  We want to use cassandra to store and retrieve time related data. Storing
  the time-value pairs is easy and works perfectly. The problem arrives at
  retrieving the data. We do not only want to retrieve data from within a
 time
  range, but also be able to get the previous and/or next data sample from
 a
  specific point in time.
 
  The next in time i can retrieve by giving asking for the range
  timestamp...maxtime and request for 2 items. Which returns timestamp (if
  available) and the next timestamp-value.
 
  Does anybody know how i can retrieve the previous timestamp? The columns
 are
  sorted on the key(timestamp), so the previous request/next should not be
 a
  difficult to perform.
 
  Any suggestions are welcome!
 
  Thanks,
 
  Bram van der Waaij



Re: java.lang.OutofMemoryerror: Java heap space

2010-06-15 Thread Jonathan Ellis
if you are reading 500MB per thrift request from each of 3 threads,
then yes, simple arithmetic indicates that 1GB heap is not enough.

On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 caribbean...@gmail.com wrote:
 Hi,

 I wrote 200k records to db with each record 5MB. Get this error when I uses
 3 threads (each thread tries to read 200k record totally, 100 records a
 time) to read data from db. The write is OK, the error comes from read.
 Right now the Xmx of JVM is 1GB. I changed it to 2GB, still not working. If
 the record size is under 4K, I will not get this error. Any clues to avoid
 this error?

 Thx




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Cassandra timeouts under low load

2010-06-15 Thread Drew Dahlke
Hi, I'm running Cassandra 0.6.2 on a dedicated 4 node cluster and I
also have a dedicated 4 node hadoop cluster. I'm trying to run a
simple map reduce job against a single column family and it only takes
32 map tasks before I get floods of thrift timeouts. That would make
sense to me if Cassandra was stressing the hardware or the
network, but it's not. Each box has 8 cores/16G ram. During the job
CPU averages 150-250% (1/5 utilization on 8 cores), network IO hovers
around 15% throughput, iostat < 15%.

The hadoop machines are taking even less of a beating. The simpler I
make the job, the faster it hits cassandra, the faster it throws
timeouts, and vice versa. I'm guessing there's a software/config related
bottleneck I'm hitting well before tapping out the hardware. Any idea
what that might be?

java.lang.RuntimeException: TimedOutException()
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:174)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95)
at 
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
at 
org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: TimedOutException()
at 
org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11015)
at 
org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623)
at 
org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597)
at 
org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:151)
... 11 more


RE: java.lang.OutofMemoryerror: Java heap space

2010-06-15 Thread caribbean410
Sorry, the record size should be 5KB not 5MB. Coz 4KB is still OK. I will
try Benjamin's suggestion.

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Tuesday, June 15, 2010 8:09 AM
To: user@cassandra.apache.org
Subject: Re: java.lang.OutofMemoryerror: Java heap space

if you are reading 500MB per thrift request from each of 3 threads,
then yes, simple arithmetic indicates that 1GB heap is not enough.

On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 caribbean...@gmail.com
wrote:
 Hi,

 I wrote 200k records to db with each record 5MB. Get this error when I
uses
 3 threads (each thread tries to read 200k record totally, 100 records a
 time) to read data from db. The write is OK, the error comes from read.
 Right now the Xmx of JVM is 1GB. I changed it to 2GB, still not working.
If
 the record size is under 4K, I will not get this error. Any clues to avoid
 this error?

 Thx




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com



Re: java.lang.OutofMemoryerror: Java heap space

2010-06-15 Thread Benjamin Black
You should only have to restart once per node to pick up config changes.

On Tue, Jun 15, 2010 at 9:41 AM, caribbean410 caribbean...@gmail.com wrote:
 Today I retried with the 2GB heap and now it's working. No more out of memory
 error. Looks like I have to restart Cassandra several times before the new
 changes take effect.

 -Original Message-
 From: Benjamin Black [mailto:b...@b3k.us]
 Sent: Monday, June 14, 2010 7:46 PM
 To: user@cassandra.apache.org
 Subject: Re: java.lang.OutofMemoryerror: Java heap space

 My guess: you are outrunning your disk I/O.  Each of those 5MB rows
 gets written to the commitlog, and the memtable is flushed when it
 hits the configured limit, which you've probably left at 128MB.  Every
 25 rows or so you are getting memtable flushed to disk.  Until these
 things complete, they are in RAM.

 If this is actually representative of your production use, you need a
 dedicated commitlog disk, several drives in RAID0 or RAID10 for data,
 a lot more RAM, and much larger memtable flush size.


 b

 On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 caribbean...@gmail.com
 wrote:
 Hi,

 I wrote 200k records to db with each record 5MB. Get this error when I
 uses
 3 threads (each thread tries to read 200k record totally, 100 records a
 time) to read data from db. The write is OK, the error comes from read.
 Right now the Xmx of JVM is 1GB. I changed it to 2GB, still not working.
 If
 the record size is under 4K, I will not get this error. Any clues to avoid
 this error?

 Thx
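
For reference, the commit log and memtable settings Benjamin mentions are
configured in storage-conf.xml in the 0.6 series. A hedged sketch of the
relevant elements, with element names as recalled for 0.6 and purely
illustrative paths and values (verify against the storage-conf.xml shipped
with your release):

<!-- Put the commit log on its own disk, separate from the data directories. -->
<CommitLogDirectory>/mnt/commitlog/cassandra</CommitLogDirectory>
<DataFileDirectories>
    <DataFileDirectory>/mnt/data/cassandra</DataFileDirectory>
</DataFileDirectories>

<!-- Raise the memtable flush threshold so that large rows do not force a
     flush to disk every few dozen writes. -->
<MemtableThroughputInMB>256</MemtableThroughputInMB>
<MemtableOperationsInMillions>0.3</MemtableOperationsInMillions>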





Re: Replication Factor and Data Centers

2010-06-15 Thread Jonathan Ellis
(moving to user@)

On Mon, Jun 14, 2010 at 10:43 PM, Masood Mortazavi
masoodmortaz...@gmail.com wrote:
 Is the clearer interpretation of this statement (in
 conf/datacenters.properties) given anywhere else?

 # The sum of all the datacenter replication factor values should equal
 # the replication factor of the keyspace (i.e. sum(dc_rf) = RF)

 # keyspace\:datacenter=replication factor
 Keyspace1\:DC1=3
 Keyspace1\:DC2=2
 Keyspace1\:DC3=1

 Does the above example configuration imply that Keyspace1 has a RF of 6, and
 that of these 3 will go to DC1, 2 to DC2 and 1 to DC3?

Yes.

 What will happen if datacenters.properties and cassandra-rack.properties are
 simply empty?

You have an illegal configuration.
https://issues.apache.org/jira/browse/CASSANDRA-1191 is open to have
Cassandra raise an error under this condition and others.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: help for designing a cassandra

2010-06-15 Thread Jonathan Ellis
http://wiki.apache.org/cassandra/ArticlesAndPresentations might help.

On Mon, Jun 14, 2010 at 1:13 PM, Johannes Weissensel
whitesensl...@googlemail.com wrote:
 Hi everyone,
 i am new to nosql databases and especially column-oriented Databases
 like cassandra.
 I am a student on information-systems and i evaluate a fitting no-sql
 database for a web analytics system. Got the use-case of data like
 webserver-logfile.
 in an RDBMS it would be for every hit a row in the database, and than
 endless grouping and counting on the data for getting the metrics you
 want.
 Is there anyone who has experiences with data like that in hypertable,
 how should i design the database?
 Also for every hit a single row, or maybe for every session an
 aggregated version of the data, or for every day and every page a
 single aggregated version.
 Maybe some has an idea, how to design the database? Just like an
 typical not normalized sql database?
 Hope you have some ideas :)
 Johannes




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Julie
I am running a 10 node cassandra 0.6.1 cluster with a replication factor of 3. 

To populate the database to perform my read benchmarking, I have 8 applications 
using Thrift, each connecting to a different cassandra server and writing 
100,000 rows of data (100 KB each row), using a consistencyLevel of ALL. My 
server nodes are ec2-smalls (1.7GB memory, 100GB disk).

With consistency set to ALL, it takes 5-6 minutes for each app to write 
10,000 (100 KB) rows.  When each of my 8 writing apps reaches about 90,000 rows 
written, I start seeing write timeouts but my app retries twice and all data 
appears to get written.

It appears to take about 1hr 45min for all compacting to complete.

Coinciding with my write timeouts, all 10 of my cassandra servers are getting 
the following exception written to system.log:


 INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162) 
Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145 
DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined 
data type 
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) 
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask
(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at 
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
at 
org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
at 
org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,146 CassandraDaemon.java 
(line 78) Fatal exception in thread Thread[MESSAGE-STREAMING-POOL:1,5,main]
java.lang.RuntimeException: java.io.IOException: Value too large for defined 
data type
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask
(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.IOException: Value too large for defined data type
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at 
sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
at 
org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
at 
org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more

On 8 out of 10 servers, I see this just before the exception:

 INFO [AE-SERVICE-STAGE:1] 2010-06-15 13:41:36,292 StreamOut.java (line 66)
Sending a stream initiate message to /10.210.34.212 ...
ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:43:32,956
DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor

On the other 2 servers, I see the AE-SERVICE stream initiate message about 6-9
minutes prior to the exception.

Another thing that is odd is that even when the server nodes are quiescent 
because compacting is complete, I am still seeing CPU usage stay at 
about 40%.  Even after several hours with no reading or writing to the database 
and all compactions complete, the CPU usage is staying around 40%.

Thank you for your help and advice,
Julie



Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Benjamin Black
You are likely exhausting your heap space (probably still at the very
small 1G default?), and maximizing the amount of resource consumption
by using CL.ALL.  Why are you using ALL?

On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.su...@nextcentury.com wrote:
 I am running a 10 node cassandra 0.6.1 cluster with a replication factor of 3.

 To populate the database to perform my read benchmarking, I have 8 
 applications
 using Thrift, each connecting to a different cassandra server and writing
 100,000 rows of data (100 KB each row), using a consistencyLevel of ALL. My
 server nodes are ec2-smalls (1.7GB memory, 100GB disk).

 With consistency set to ALL, it takes 5-6 minutes for each app to write
 10,000 (100 KB) rows.  When each of my 8 writing apps reaches about 90,000 
 rows
 written, I start seeing write timeouts but my app retries twice and all data
 appears to get written.

 It sppears to take about 1hr 45min for all compacting to complete.

 Coinciding with my write timeouts, all 10 of my cassandra servers are getting
 the following exception written to system.log:


  INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162)
 Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
 ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145
 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
 java.lang.RuntimeException: java.io.IOException: Value too large for defined
 data type
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask
 (ThreadPoolExecutor.java:886)
        at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: Value too large for defined data type
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at
 sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
        at
 org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
        at
 org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
        at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more
 ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,146 CassandraDaemon.java
 (line 78) Fatal exception in thread Thread[MESSAGE-STREAMING-POOL:1,5,main]
 java.lang.RuntimeException: java.io.IOException: Value too large for defined
 data type
        at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask
 (ThreadPoolExecutor.java:886)
        at
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: Value too large for defined data type
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at
 sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
        at
 org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
        at
 org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
        at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
        ... 3 more

 On 8 out of 10 servers, I see this just before the exception:

  INFO [AE-SERVICE-STAGE:1] 2010-06-15 13:41:36,292 StreamOut.java (line 66)
 Sending a stream initiate message to /10.210.34.212 ...
 ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:43:32,956
 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor

 On the other 2 servers, I see the AE-SERVICE stream initiate message about 6-9
 minutes prior to the exception.

 Another thing that is odd is that even when the server nodes are quiescent
 because compacting is complete, I am still seeing cpu usage stay at
 about 40% .  Even after several hours, no reading or writing to the database
 and all compactions complete, the cpu usage is staying around 40%.

 Thank you for your help and advice,
 Julie




Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Julie
Benjamin Black b at b3k.us writes:

 
 You are likely exhausting your heap space (probably still at the very
 small 1G default?), and maximizing the amount of resource consumption
 by using CL.ALL.  Why are you using ALL?
 
 On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.sugar at nextcentury.com 
wrote:
...
  Coinciding with my write timeouts, all 10 of my cassandra servers are 
getting
  the following exception written to system.log:
 
 
   INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162)
  Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
  ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145
  DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
  java.lang.RuntimeException: java.io.IOException: Value too large for defined
  data type
  at
  org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
  at
  java.util.concurrent.ThreadPoolExecutor$Worker.runTask
  (ThreadPoolExecutor.java:886)
         at
  java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:908)
         at java.lang.Thread.run(Thread.java:619)
  Caused by: java.io.IOException: Value too large for defined data type
         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
         at
  sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
         at
  org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
         at
  org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
         at
  org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
         ... 3 more
...


Thanks for your reply.  Yes, my heap space is 1G.  My VMs have only 1.7G of 
memory, so I hesitate to use more.  I am using ALL because I was crashing 
Cassandra with a heap space error when I used ZERO (posted a few days ago), so 
it was recommended that I use ALL instead.  I also tried using ONE but got even 
more write timeouts, so I thought it would be safer to just wait for all 
replicas to be written before trying to write more rows. 

Thank you for your help.





Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Phil Stanhope
How are you doing your inserts?

I draw a clear line between 1) bootstrapping a cluster with data and 2) 
simulating expected/projected read/write behavior.

If you are bootstrapping then I would look into the batch_mutate APIs. They 
allow you to improve your performance on writes dramatically.

If you are read/write testing on a populated cluster, insert and batch_insert 
(for super columns) are the way to go.

As Ben has pointed out to me in numerous threads ... think carefully about your 
replication factor. Do you want the data on all nodes? Or sufficiently 
replicated so that you can recover? Do you want consistency at the time of 
write? Or eventually?

Cassandra has a bunch of knobs that you can turn ... but that flexibility 
requires that you think about your expected usage patterns and operational 
policies.

-phil
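
For reference, a rough sketch of the batch_mutate pattern Phil points to,
written against the 0.6-era Thrift API. The keyspace, column family, and
column names are assumptions for illustration, and the generated constructors
and signatures should be checked against your own Thrift bindings:

import java.nio.charset.Charset;
import java.util.*;
import org.apache.cassandra.thrift.*;

// Insert one column into many rows with a single Thrift call instead of one
// insert() per row. 'client' is an open Cassandra.Client connection; 'rows'
// maps row keys to the values to store.
static void bulkInsert(Cassandra.Client client, Map<String, byte[]> rows) throws Exception {
    Charset utf8 = Charset.forName("UTF-8");
    Map<String, Map<String, List<Mutation>>> mutationMap =
            new HashMap<String, Map<String, List<Mutation>>>();
    for (Map.Entry<String, byte[]> row : rows.entrySet()) {
        Column col = new Column("payload".getBytes(utf8),      // column name (assumed)
                row.getValue(),                                // column value
                System.currentTimeMillis() * 1000);            // microsecond timestamp
        ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
        cosc.setColumn(col);
        Mutation m = new Mutation();
        m.setColumn_or_supercolumn(cosc);
        mutationMap.put(row.getKey(),
                Collections.singletonMap("Standard1", Arrays.asList(m)));  // CF name assumed
    }
    client.batch_mutate("Keyspace1", mutationMap, ConsistencyLevel.QUORUM);
}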

On Jun 15, 2010, at 4:40 PM, Julie wrote:

 Benjamin Black b at b3k.us writes:
 
 
 You are likely exhausting your heap space (probably still at the very
 small 1G default?), and maximizing the amount of resource consumption
 by using CL.ALL.  Why are you using ALL?
 
 On Tue, Jun 15, 2010 at 11:58 AM, Julie julie.sugar at nextcentury.com 
 wrote:
 ...
 Coinciding with my write timeouts, all 10 of my cassandra servers are 
 getting
 the following exception written to system.log:
 
 
  INFO [FLUSH-WRITER-POOL:1] 2010-06-15 13:13:54,411 Memtable.java (line 162)
 Completed flushing /var/lib/cassandra/data/Keyspace1/Standard1-359-Data.db
 ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-15 13:13:59,145
 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
 java.lang.RuntimeException: java.io.IOException: Value too large for defined
 data type
 at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
 at
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask
 (ThreadPoolExecutor.java:886)
at
 java.util.concurrent.ThreadPoolExecutor$Worker.run
 (ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
 Caused by: java.io.IOException: Value too large for defined data type
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at
 sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:415)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:516)
at
 org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:95)
at
 org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
at
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
 ...
 
 
 Thanks for your reply.  Yes, my heap space is 1G.  My vms have only 1.7G of 
 memory so I hesitate to use more.  I am using ALL because I was crashing 
 cassandra when I used ZERO (posting from a few days ago) with a heap space 
 error so it was recommended that I use ALL instead.  I also tried using ONE 
 but 
 got even more write timeouts so I thought it would be safer to just wait for 
 ALL replications to be written before trying to write more rows. 
 
 Thank you for your help.
 
 
 



Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Benjamin Black
On Tue, Jun 15, 2010 at 1:40 PM, Julie julie.su...@nextcentury.com wrote:
 Thanks for your reply.  Yes, my heap space is 1G.  My vms have only 1.7G of
 memory so I hesitate to use more.

Then write slower.  There is no free lunch.


b


Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Jonathan Ellis
On Tue, Jun 15, 2010 at 1:58 PM, Julie julie.su...@nextcentury.com wrote:
 Coinciding with my write timeouts, all 10 of my cassandra servers are getting
 the following exception written to system.log:

Value too large for defined data type looks like a bug found in
older JREs.  Upgrade to u19 or later.

 Another thing that is odd is that even when the server nodes are quiescent
 because compacting is complete, I am still seeing cpu usage stay at
 about 40% .  Even after several hours, no reading or writing to the database
 and all compactions complete, the cpu usage is staying around 40%.

Possibly this is Hinted Handoff scanning going on.  You can rm
data/system/Hint* (while the node is shut down) if you want to take a
shot in the dark.  Otherwise you'll want to follow
http://publib.boulder.ibm.com/infocenter/javasdk/tools/index.jsp?topic=/com.ibm.java.doc.igaa/_1vg0001475cb4a-1190e2e0f74-8000_1007.html
to figure out which thread is actually consuming the CPU.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


stalled streaming

2010-06-15 Thread aaron
Hello, 



Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Julie
Phil Stanhope pstanhope at wimba.com writes:

 
 How are you doing your inserts?
 
 I draw a clear line between 1) bootstrapping a cluster with data and 2)
simulating expected/projected
 read/write behavior.
 
 If you are bootstrapping then I would look into the batch_mutate APIs. They
allow you to improve your
 performance on writes dramatically.
 
 If you are read/write testing on a populated cluster, insert and batch_insert
(for super columns) are the
 way to go.
 
 As Ben has pointed to me in numerous threads ... think carefully about your
replication factor. Do you want
 the data on all nodes? Or sufficiently replicated so that you can recover? Do
you want consistency at the
 time of write? Or eventually?
 
 Cassandra has a bunch of knobs that you can turn ... but that flexibility
requires that you think about your
 expected usage patterns and operational policies.
 
 -phil
 

My inserts are being done 100 rows at a time using batch_mutate().
I bring up all 10 nodes in my cassandra cluster at once (no live bootstrapping 
of nodes).  Once they are up, I begin populating the database running 8 write 
clients (on 8 different VMs), each writing 100 rows at a time.  As mentioned 
earlier, each client writes to a different cassandra server node so no one 
server node is fielding all the writes simultaneously.

I have a replication factor of 3 because I need to be able to survive 2 out of 
10 nodes going down at once.

I am baffled by all the "Value too large" exceptions that are occurring on 
every one of my 10 servers:
 ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-14 19:30:24,471 
DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined
data type
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask
(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run
(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

It seems to be happening just after this is logged:
INFO [AE-SERVICE-STAGE:1]  2010-06-14 19:28:39,851 StreamOut.java

I'm also baffled that after all compactions are done on every one of the 10 
servers, about 5 out of 10 servers are still at 40% CPU usage, although they 
are doing 0 disk IO. I am not running anything else on these server 
nodes except for Cassandra.  The compactions have been done for over an hour.
The last write took place 5 hours ago.

Thank you for any help,
Julie






Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Jonathan Ellis
On Tue, Jun 15, 2010 at 5:15 PM, Julie julie.su...@nextcentury.com wrote:
 I'm also baffled that after all compactions are done on every one of the 10
 servers, about 5 out of 10 servers are still at 40% CPU usage, although they
 are doing 0 disk IO. I am not running anything else running on these server
 nodes except for cassandra.  The compactions have been done for over an hour.
 The last write took place 5 hours ago.

That actually sounds like
https://issues.apache.org/jira/browse/CASSANDRA-1169, which is fixed
for 0.6.3

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


[OT] Real Time Open source solutions for aggregation and stream processing

2010-06-15 Thread Ian Holsman

firstly, my apologies for the off-topic message,
but I thought most people on this list would be knowledgeable and 
interested in this kind of thing.


We are looking to find an open source, scalable solution to do RT 
aggregation and stream processing (similar to what the 'hop' project 
http://code.google.com/p/hop/ set out to do) for large(ish) click-stream 
logs.


My first thought was something like esper, but in our testing it kind of 
hits the wall at around 10,000 rules per JVM.


I was wondering if any of you guys had some experiences in this area, 
and what your favorite toolsets are around this.


currently we are using cassandra and redis with home grown software to 
do the aggregation, but I'd love to use a common package if there is one.


and again.. apologies for the off-topic message and the x-posting.

regards
Ian


stalled streaming

2010-06-15 Thread aaron
hello, 

I have a 4 node cassandra cluster with 0.6.1 installed. We've been running
a mixed read / write workload to test how it works in our environment; we run
about 4M batch mutations and 40M get_range_slice requests over 6 to 8 hours
that load about 10 to 15 GB of data. 

Yesterday while there was no activity I noticed 2 nodes sitting at 200%
CPU on an 8 core machine. Thought nothing of it. Checked again this morning
and they are still sitting at that level of activity with no requests going
into them. Checking the streams using nodetool I see node 3 is streaming
to nodes 0 and 2, and appears to have stalled. The information in the JMX
console for streams matches the info below.

I cannot see any errors in the logs. 

This is just a test system, so am happy to bounce the JVM's. Before I do
is there anything else I should be looking for to understand why this
happened?

Also, sorry for the previous empty email. 

Node 0
Mode: Normal
 Nothing streaming to /192.168.34.27
 Nothing streaming to /192.168.34.28
 Nothing streaming to /192.168.34.29
Streaming from: /192.168.34.29
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db
0/22765
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db
0/10750717
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Index.db
0/58
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Filter.db
0/325
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Data.db
0/695
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Index.db
0/58
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Filter.db
0/325
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Data.db
0/695
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Index.db
0/587164
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Filter.db
0/22765
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Data.db
0/5159652
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-124-Data.db
22765/4966927
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Index.db
22765/1223053

Node 1
Mode: Normal
 Nothing streaming to /192.168.34.26
 Nothing streaming to /192.168.34.28
 Nothing streaming to /192.168.34.29
Not receiving any streams.

Node 2
Mode: Normal
 Nothing streaming to /192.168.34.26
 Nothing streaming to /192.168.34.27
 Nothing streaming to /192.168.34.29
Streaming from: /192.168.34.29
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db
0/22765
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db
0/2161660
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Index.db
0/787524
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Filter.db
0/22765
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Data.db
0/6917064
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Index.db
0/58
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Filter.db
0/565
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Data.db
0/695
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Index.db
0/581779
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Filter.db
0/22765
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Data.db
0/5111887
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Data.db
361367/3173057
   junkbox.mycompany:
/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Index.db
695/361367

Node 3
Mode: Normal
Streaming to: /192.168.34.26
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Filter.db
22765/22765
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Data.db
0/4966927
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Index.db
0/58
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Filter.db
0/325
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Data.db
0/695
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Index.db
0/1223053
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Filter.db
0/22765
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Data.db
0/10750717
  
/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-52-Index.db
0/58
  

Re: stalled streaming

2010-06-15 Thread Benjamin Black
Known bug, fixed in latest 0.6 release.

On Tue, Jun 15, 2010 at 3:29 PM, aaron aa...@thelastpickle.com wrote:
 hello,

 I have a 4 node cassandra cluster with 0.6.1 installed. We've been running
 a mixed read / write workload test how it works in our environment, we run
 about 4M bath mutations and 40M get_range_slice requests over 6 to 8 hours
 that load about 10 to 15 GB of data.

 Yesterday while there was no activity I noticed 2 nodes sitting at 200%
 CPU on 8 Core machine. Thought nothing of it. Checked again this morning
 and they are still sitting at that level of activity with no requests going
 into them. Checking the streams using node tool I see node 3 is streaming
 to node 0 and 2, and appears to have stalled. The information in the JMX
 console for streams matches the info below.

 I cannot see any errors in the logs.

 This is just a test system, so am happy to bounce the JVM's. Before I do
 is there anything else I should be looking for to understand why this
 happened?

 Also, sorry for the previous empty email.

 Node 0
 Mode: Normal
  Nothing streaming to /192.168.34.27
  Nothing streaming to /192.168.34.28
  Nothing streaming to /192.168.34.29
 Streaming from: /192.168.34.29
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db
 0/22765
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db
 0/10750717
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Index.db
 0/58
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Filter.db
 0/325
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Data.db
 0/695
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Index.db
 0/58
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Filter.db
 0/325
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Data.db
 0/695
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Index.db
 0/587164
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Filter.db
 0/22765
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Data.db
 0/5159652
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-124-Data.db
 22765/4966927
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Index.db
 22765/1223053

 Node 1
 Mode: Normal
  Nothing streaming to /192.168.34.26
  Nothing streaming to /192.168.34.28
  Nothing streaming to /192.168.34.29
 Not receiving any streams.

 Node 2
 Mode: Normal
  Nothing streaming to /192.168.34.26
  Nothing streaming to /192.168.34.27
  Nothing streaming to /192.168.34.29
 Streaming from: /192.168.34.29
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db
 0/22765
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db
 0/2161660
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Index.db
 0/787524
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Filter.db
 0/22765
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Data.db
 0/6917064
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Index.db
 0/58
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Filter.db
 0/565
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Data.db
 0/695
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Index.db
 0/581779
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Filter.db
 0/22765
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Data.db
 0/5111887
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Data.db
 361367/3173057
   junkbox.mycompany:
 /local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Index.db
 695/361367

 Node 3
 ode: Normal
 Streaming to: /192.168.34.26

 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Filter.db
 22765/22765

 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Data.db
 0/4966927

 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Index.db
 0/58

 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Filter.db
 0/325

 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Data.db
 0/695

 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Index.db
 0/1223053

 /local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-82-Filter.db
 

Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Charles Butterfield
Benjamin Black b at b3k.us writes:

 
 Then write slower.  There is no free lunch.
 
 b

Are you implying that clients need to throttle their collective load on the
server to avoid causing the server to fail?  That seems undesirable.  Is this a
side effect of a server bug, or is it part of the intended design?

Regards
-- Charlie






Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Benjamin Black
On Tue, Jun 15, 2010 at 3:55 PM, Charles Butterfield
charles.butterfi...@nextcentury.com wrote:
 Benjamin Black b at b3k.us writes:


 Then write slower.  There is no free lunch.

 b

 Are you implying that clients need to throttle their collective load on the
 server to avoid causing the server to fail?  That seems undesirable.  Is this 
 a
 side effect of a server bug, or is it part of the intended design?


I am only saying something obvious: if you don't have sufficient
resources to handle the demand, you should reduce demand, increase
resources, or expect errors.  Doing lots of writes without much heap
space is such a situation (whether or not it is happening in this
instance), but there are many others.  This constraint is not specific
to Cassandra.  Hence, there is no free lunch.


b


Re: stalled streaming

2010-06-15 Thread aaron
Thanks, will move to 0.6.2. 

Aaron

On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote:
 Known bug, fixed in latest 0.6 release.
 
 On Tue, Jun 15, 2010 at 3:29 PM, aaron aa...@thelastpickle.com wrote:
 hello,

 I have a 4 node cassandra cluster with 0.6.1 installed. We've been
 running
 a mixed read / write workload test how it works in our environment, we
 run
 about 4M bath mutations and 40M get_range_slice requests over 6 to 8
 hours
 that load about 10 to 15 GB of data.

 Yesterday while there was no activity I noticed 2 nodes sitting at 200%
 CPU on 8 Core machine. Thought nothing of it. Checked again this
morning
 and they are still sitting at that level of activity with no requests
 going
 into them. Checking the streams using node tool I see node 3 is
streaming
 to node 0 and 2, and appears to have stalled. The information in the
JMX
 console for streams matches the info below.

 I cannot see any errors in the logs.

 This is just a test system, so am happy to bounce the JVM's. Before I
do
 is there anything else I should be looking for to understand why this
 happened?

 Also, sorry for the previous empty email.

 Node 0
 Mode: Normal
  Nothing streaming to /192.168.34.27
  Nothing streaming to /192.168.34.28
  Nothing streaming to /192.168.34.29
 Streaming from: /192.168.34.29
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db
 0/22765
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db
 0/10750717
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Index.db
 0/58
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Filter.db
 0/325
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-108-Data.db
 0/695
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Index.db
 0/58
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Filter.db
 0/325
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-119-Data.db
 0/695
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Index.db
 0/587164
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Filter.db
 0/22765
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-163-Data.db
 0/5159652
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-124-Data.db
 22765/4966927
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Index.db
 22765/1223053

 Node 1
 Mode: Normal
  Nothing streaming to /192.168.34.26
  Nothing streaming to /192.168.34.28
  Nothing streaming to /192.168.34.29
 Not receiving any streams.

 Node 2
 Mode: Normal
  Nothing streaming to /192.168.34.26
  Nothing streaming to /192.168.34.27
  Nothing streaming to /192.168.34.29
 Streaming from: /192.168.34.29
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Filter.db
 0/22765
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-137-Data.db
 0/2161660
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Index.db
 0/787524
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Filter.db
 0/22765
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-147-Data.db
 0/6917064
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Index.db
 0/58
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Filter.db
 0/565
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Databases-tmp-130-Data.db
 0/695
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Index.db
 0/581779
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Filter.db
 0/22765
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-168-Data.db
 0/5111887
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Data.db
 361367/3173057
   junkbox.mycompany:

/local1/junkbox/cassandra/data/junkbox.mycompany/Buckets-tmp-125-Index.db
 695/361367

 Node 3
 ode: Normal
 Streaming to: /192.168.34.26


/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Filter.db
 22765/22765


/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Buckets-69-Data.db
 0/4966927


/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Index.db
 0/58


/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Filter.db
 0/325


/local1/junkbox/cassandra/data/junkbox.mycompany/stream/Databases-42-Data.db
 0/695



RE: read operation is slow

2010-06-15 Thread Dop Sun
Thanks for your updates, good to know that your performance is better now.

 

Actually, if users ask for one record at a time, it will usually be done with
multi-threading, since the requests most likely come from different users.

 

If a single user wants 200k records, it makes no difference whether you get 1
at a time or 100 at a time, since the result set is exactly the same.

 

By the way, the Jassandra you are using has been updated for the issue you
raised; the result set from select is now sorted.

 

Thanks,

Regards,

dop
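
A rough way to read the numbers quoted further down this thread: the 0.09 ms in
cfstats is server-side read latency only, so 200k reads account for roughly
18 s of server time, and the rest of the ~400 s wall clock in the
one-record-at-a-time run is per-call overhead (Thrift round trip, connection
and client work), on the order of 2 ms per call. Batching 100 keys per call
turns 200,000 calls into about 2,000, which matches the ~10 s result. As a
sketch only, reusing the variable names from the client code quoted below (the
100-key batch size and this exact batching shape are illustrative, not a tested
change):

List<String> batch = new ArrayList<String>();
for (int i = 0; i < numOfRecords; i++) {
    batch.add(keySet[random.nextInt(numOfRecords)]);
    if (batch.size() == 100) {                        // 100 keys per Thrift call
        ICriteria criteria = cf.createCriteria();
        criteria.keyList(batch).columnRange(nameFirst, nameFirst, 10);
        Map<String, List<IColumn>> map = criteria.select();   // one call returns all 100 rows
        batch = new ArrayList<String>();
    }
}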

 

From: Caribbean410 [mailto:caribbean...@gmail.com] 
Sent: Tuesday, June 15, 2010 9:16 AM
To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

Now I read 100 records each time, and the total time to read 200k records
(1M each) reduces to 10s. Looks good. But I am still curious how to handle
the case where users read one record at a time.

On Fri, Jun 11, 2010 at 6:05 PM, Dop Sun su...@dopsun.com wrote:

And also, you are only selecting 1 key and 10 columns?

 

criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst,
nameFirst, 10);

 

Then, if you have 200k keys, you have 200k Thrift calls.  If this is the
case, you may need to optimize the way you do the query (to combine multiple
keys into a single query), and to reduce the number of calls.

 

From: Dop Sun [mailto:su...@dopsun.com] 
Sent: Saturday, June 12, 2010 8:57 AM


To: user@cassandra.apache.org

Subject: RE: read operation is slow

 

You mean after you removed some unnecessary column families and changed the
size of rowcache and keycache, the latency changed from 0.25ms to 0.09ms? In
essence 0.09ms*200k=18s, yet it still takes 400 seconds to return?

 

From: Caribbean410 [mailto:caribbean...@gmail.com] 
Sent: Saturday, June 12, 2010 8:48 AM
To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

Hi, do you mean this one should not introduce much extra delay? To read a
record, I need to select here; not sure where the extra delay comes from.

On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun su...@dopsun.com wrote:

Jassandra is used here:

 

Map<String, List<IColumn>> map = criteria.select();

 

The select here basically is a call to Thrift API: get_range_slices

 

 

From: Caribbean410 [mailto:caribbean...@gmail.com] 
Sent: Saturday, June 12, 2010 8:00 AM


To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

I removed some unnecessary column families and changed the size of rowcache and
keycache; now the latency changes from 0.25ms to 0.09ms. In essence
0.09ms*200k=18s, so I don't know why it takes more than 400s total. Here is the
client code and cfstats. There are not many operations here; why is the
extra time so large?



long start = System.currentTimeMillis();
for (int j = 0; j < 1; j++) {
    for (int i = 0; i < numOfRecords; i++) {
        int n = random.nextInt(numOfRecords);
        ICriteria criteria = cf.createCriteria();
        userName = keySet[n];
        criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst, nameFirst, 10);
        Map<String, List<IColumn>> map = criteria.select();
        List<IColumn> list = map.get(userName);
//      ByteArray bloc = list.get(0).getValue();
//      byte[] byteArrayloc = bloc.toByteArray();
//      loc = new String(byteArrayloc);
//      readBytes = readBytes + loc.length();
        readBytes = readBytes + blobSize;
    }
}

long finish = System.currentTimeMillis();

float totalTime = (finish - start) / 1000;


Keyspace: Keyspace1
Read Count: 60
Read Latency: 0.090530067 ms.
Write Count: 20
Write Latency: 0.01504989 ms.
Pending Tasks: 0
Column Family: Standard2
SSTable count: 3
Space used (live): 265990358
Space used (total): 265990358
Memtable Columns Count: 2615
Memtable Data Size: 2667300
Memtable Switch Count: 3
Read Count: 60
Read Latency: 0.091 ms.
Write Count: 20
Write Latency: 0.015 ms.
Pending Tasks: 0
Key cache capacity: 1000
Key cache size: 187465
Key cache hit rate: 0.0
Row cache capacity: 1000
Row cache size: 189990
Row cache hit rate: 0.68335
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0


Keyspace: system
Read Count: 1
Read Latency: 10.954 ms.
Write Count: 4
Write Latency: 0.28075 ms.
Pending Tasks: 0
Column Family: HintsColumnFamily
SSTable count: 0
Space used (live): 0
Space used (total): 0
Memtable Columns Count: 0
Memtable Data Size: 0
Memtable Switch 

Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Charles Butterfield
Benjamin Black b at b3k.us writes:

 
 I am only saying something obvious: if you don't have sufficient
 resources to handle the demand, you should reduce demand, increase
 resources, or expect errors.  Doing lots of writes without much heap
 space is such a situation (whether or not it is happening in this
 instance), but there are many others.  This constraint it not specific
 to Cassandra.  Hence, there is no free lunch.
 
 b

I guess my point is that I have rarely run across database servers that die
from either too many client connections, or too rapid client requests.  They
generally stop accepting incoming connections when there are too many connection
requests, and further they do not queue and acknowledge an unbounded number of
client requests on any given connection.

In the example at hand, Julie has 8 clients, each of which is in a loop that
writes 100 rows at a time (via batch_mutate), waits for successful completion,
then writes another bunch of 100, until it completes all of the rows it is
supposed to write (typically 100,000).  So at any one time, each client should
have about 10 MB of request (100 rows x 100 KB/row), times 8 clients, for a max
pending request of no more than 80 MB.

Further each request is running with a CL=ALL, so in theory, the request should
not complete until each row has been handed off to the ultimate destination
node, and perhaps written to the commit log (that part is not clear to me).

It sounds like something else must be gobbling up either an unbounded amount
of heap, or alternatively, a bounded but large amount of heap.  In the former
case it is unclear how to make the application robust.  In the latter, it would
be helpful to understand what the heap usage upper bound is, and what
parameters might have a significant effect on that value.

To clarify the history here -- initially we were writing with CL=0 and had
great performance but ended up killing the server.  It was pointed out that
we were really asking the server to accept and acknowledge an unbounded
number of requests without waiting for any final disposition of the rows.
So we had a doh! moment.  That is why we went to the other extreme of
CL=ALL, to let the server fully dispose of each request before acknowledging
it and getting the next.

TIA
-- Charlie






Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Jonathan Shook
Actually, you shouldn't expect errors in the general case, unless you
are simply trying to use data that can't fit in available heap. There
are some practical limitations, as always.

If there aren't enough resources on the server side to service the
clients, the expectation should be that the servers have a graceful
performance degradation, or in the worst case throw an error specific
to resource exhaustion or explicit resource throttling. The fact that
Cassandra does some background processing complicates this a bit.
There are things which can cause errors after the fact, but these are
generally considered resource tuning issues and are somewhat clear
cut. There are specific changes in the works to bring background load
exceptions into view of a client session, where users normally expect
them.

@see https://issues.apache.org/jira/browse/CASSANDRA-685

But otherwise, users shouldn't be expecting that simply increasing
client load can blow up their Cassandra cluster. Any time this
happens, it should be considered a bug or a misfeature. Devs please
correct me here if I'm wrong.

Jonathan


On Tue, Jun 15, 2010 at 6:44 PM, Charles Butterfield
charles.butterfi...@nextcentury.com wrote:
 Benjamin Black b at b3k.us writes:


 I am only saying something obvious: if you don't have sufficient
 resources to handle the demand, you should reduce demand, increase
 resources, or expect errors.  Doing lots of writes without much heap
 space is such a situation (whether or not it is happening in this
 instance), but there are many others.  This constraint is not specific
 to Cassandra.  Hence, there is no free lunch.

 b

 I guess my point is that I have rarely run across database servers that die
 from either too many client connections, or too rapid client requests.  They
 generally stop accepting incoming connections when there are too many 
 connection
 requests, and further they do not queue and acknowledge an unbounded number of
 client requests on any given connection.

 In the example at hand, Julie has 8 clients, each of which is in a loop that
 writes 100 rows at a time (via batch_mutate), waits for successful completion,
 then writes another bunch of 100, until it completes all of the rows it is
 supposed to write (typically 100,000).  So at any one time, each client should
 have about 10 MB of request (100 rows x 100 KB/row), times 8 clients, for a 
 max
 pending request of no more than 80 MB.

 Further each request is running with a CL=ALL, so in theory, the request 
 should
 not complete until each row has been handed off to the ultimate destination
 node, and perhaps written to the commit log (that part is not clear to me).

 It sounds like something else must be gobbling up either an unbounded amount
 of heap, or alternatively, a bounded but large amount of heap.  In the former
 case it is unclear how to make the application robust.  In the latter, it would
 be helpful to understand what the heap usage upper bound is, and what
 parameters might have a significant effect on that value.

 To clarify the history here -- initially we were writing with CL=0 and had
 great performance but ended up killing the server.  It was pointed out that
 we were really asking the server to accept and acknowledge an unbounded
 number of requests without waiting for any final disposition of the rows.
 So we had a doh! moment.  That is why we went to the other extreme of
 CL=ALL, to let the server fully dispose of each request before acknowledging
 it and getting the next.

 TIA
 -- Charlie







Some questions about using Cassandra

2010-06-15 Thread Anthony Ikeda
We are currently looking at a distributed database option and so far
Cassandra ticks all the boxes. However, I still have some questions.

 

Is there any need for archiving of Cassandra and what backup options are
available? As it is a no-data-loss system I'm guessing archiving is not
exactly relevant.

 

Is there any concept of Listeners such that when data is added to
Cassandra we can fire off another process to do something with that
data? E.g. create a copy in a secondary database for Business
Intelligence reports? Send the data to an LDAP server?

 

 

Anthony Ikeda

Java Analyst/Programmer

Cardlink Services Limited

Level 4, 3 Rider Boulevard

Rhodes NSW 2138

 

Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283

 

 



Re: Some questions about using Cassandra

2010-06-15 Thread Jonathan Shook
There is JSON import and export, of you want a form of external backup.
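
As a concrete, hedged example: the 0.6 distribution ships bin/sstable2json and
bin/json2sstable, so something like

  bin/sstable2json /var/lib/cassandra/data/Keyspace1/Standard2-1-Data.db > Standard2.json

dumps a column family's SSTable to JSON (the path is illustrative, and the
memtables should be flushed to disk first); json2sstable goes the other way.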

No, you can't hook event subscribers into the storage engine. You can modify
it to do this, however. It may not be trivial.

An easier way to do this would be to have a boundary system (or dedicated
thread, for example) consume data in small amounts, using some temporal
criterion, with a checkpoint. If the results of consuming the data are
idempotent, you don't have to use a checkpoint, necessarily, but some cyclic
rework may occur.

If your storage layout includes temporal names, it should be
straightforward. The details of exactly how would depend on your storage
layout, but it is not unusual as far as requirements go.
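
A hedged sketch of such a consumer against the 0.6 Thrift API, assuming
columns are named by timestamp under a well-known row (the keyspace, row key,
column family and all helper methods here are illustrative):

// Poll for columns written since the last checkpoint and forward them on.
Cassandra.Client client = openThriftConnection();      // hypothetical helper
long checkpoint = loadCheckpoint();                     // persisted elsewhere
SlicePredicate predicate = new SlicePredicate();
predicate.setSlice_range(
    new SliceRange(toBytes(checkpoint), new byte[0], false, 100));
List<ColumnOrSuperColumn> columns = client.get_slice(
    "Keyspace1", "events", new ColumnParent("Standard2"),
    predicate, ConsistencyLevel.ONE);
for (ColumnOrSuperColumn cosc : columns) {
    forwardToSecondaryStore(cosc.column);               // e.g. BI store, LDAP
    checkpoint = fromBytes(cosc.column.name);           // advance checkpoint
}
saveCheckpoint(checkpoint);
// The start of a slice is inclusive, so the checkpoint column is re-read on
// the next pass -- that is the cyclic rework mentioned above.

Run that on a timer or in a dedicated thread and the secondary system stays a
bounded distance behind the writes.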


On Tue, Jun 15, 2010 at 7:49 PM, Anthony Ikeda 
anthony.ik...@cardlink.com.au wrote:

  We are currently looking at a distributed database option and so far
 Cassandra ticks all the boxes. However, I still have some questions.



 Is there any need for archiving of Cassandra and what backup options are
 available? As it is a no-data-loss system I’m guessing archiving is not
 exactly relevant.



 Is there any concept of Listeners such that when data is added to Cassandra
 we can fire off another process to do something with that data? E.g. create
 a copy in a secondary database for Business Intelligence reports? Send the
 data to an LDAP server?





 Anthony Ikeda

 Java Analyst/Programmer

 Cardlink Services Limited

 Level 4, 3 Rider Boulevard

 Rhodes NSW 2138



 Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283





Re: Some questions about using Cassandra

2010-06-15 Thread Jonathan Shook
Doh! Replace of with if in the top line.

On Tue, Jun 15, 2010 at 7:57 PM, Jonathan Shook jsh...@gmail.com wrote:

 There is JSON import and export, of you want a form of external backup.

 No, you can't hook event subscribers into the storage engine. You can
 modify it to do this, however. It may not be trivial.

 An easier way to do this would be to have a boundary system (or dedicated
 thread, for example) consume data in small amounts, using some temporal
 criterion, with a checkpoint. If the results of consuming the data are
 idempotent, you don't have to use a checkpoint, necessarily, but some cyclic
 rework may occur.

 If your storage layout includes temporal names, it should be
  straightforward. The details of exactly how would depend on your storage
 layout, but it is not unusual as far as requirements go.



 On Tue, Jun 15, 2010 at 7:49 PM, Anthony Ikeda 
 anthony.ik...@cardlink.com.au wrote:

  We are currently looking at a distributed database option and so far
 Cassandra ticks all the boxes. However, I still have some questions.



 Is there any need for archiving of Cassandra and what backup options are
 available? As it is a no-data-loss system I’m guessing archiving is not
 exactly relevant.



 Is there any concept of Listeners such that when data is added to
 Cassandra we can fire off another process to do something with that data?
 E.g. create a copy in a secondary database for Business Intelligence
 reports? Send the data to an LDAP server?





 Anthony Ikeda

 Java Analyst/Programmer

 Cardlink Services Limited

 Level 4, 3 Rider Boulevard

 Rhodes NSW 2138



 Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283





RE: Some questions about using Cassandra

2010-06-15 Thread Anthony Ikeda
Thanks Jonathan, I was only asking about the event listeners because an
alternative we are considering is TIBCO Active Spaces, which draws quite
a lot of parallels to Cassandra.

 

I guess it would be interesting to find out how other people use
Cassandra, i.e., is it your one-stop shop for data storage or do you
also store to an RDBMS to re-use the data? One factor I need to consider
is our Business Intelligence platform that will need to use the data
stored for reporting purposes.

 

We are looking at using Cassandra for our real-time layer for
Active-Active data centre use and perhaps have Oracle installed
alongside for non-real-time use such that data is mediated to the Oracle
database for other uses.

 

Anthony

 

 

 

 

From: Jonathan Shook [mailto:jsh...@gmail.com] 
Sent: Wednesday, 16 June 2010 10:58 AM
To: user@cassandra.apache.org
Subject: Re: Some questions about using Cassandra

 

Doh! Replace of with if in the top line.

On Tue, Jun 15, 2010 at 7:57 PM, Jonathan Shook jsh...@gmail.com
wrote:

There is JSON import and export, of you want a form of external backup.

No, you can't hook event subscribers into the storage engine. You can
modify it to do this, however. It may not be trivial.

An easier way to do this would be to have a boundary system (or
dedicated thread, for example) consume data in small amounts, using some
temporal criterion, with a checkpoint. If the results of consuming the
data are idempotent, you don't have to use a checkpoint, necessarily,
but some cyclic rework may occur.

If your storage layout includes temporal names, it should be
straightforward. The details of exactly how would depend on your
storage layout, but it is not unusual as far as requirements go.





On Tue, Jun 15, 2010 at 7:49 PM, Anthony Ikeda 
anthony.ik...@cardlink.com.au wrote:

We are currently looking at a distributed database option and so far
Cassandra ticks all the boxes. However, I still have some questions.

 

Is there any need for archiving of Cassandra and what backup options are
available? As it is a no-data-loss system I'm guessing archiving is not
exactly relevant.

 

Is there any concept of Listeners such that when data is added to
Cassandra we can fire off another process to do something with that
data? E.g. create a copy in a secondary database for Business
Intelligence reports? Send the data to an LDAP server?

 

 

Anthony Ikeda

Java Analyst/Programmer

Cardlink Services Limited

Level 4, 3 Rider Boulevard

Rhodes NSW 2138

 

Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283

 

 



 




Re: stalled streaming

2010-06-15 Thread Benjamin Black
This is not the bug to which I was referring.  I don't recall the
number, perhaps someone else can assist on that front?  I just know I
specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the
fix (and it worked).


b

On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli rc...@digg.com wrote:

 On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote:
 Known bug, fixed in latest 0.6 release.

 On 6/15/10 4:06 PM, aaron wrote:
 Thanks, will move to 0.6.2.

 I believe that this thread refers to CASSANDRA-1169, and fix version for
 that is the (unreleased) cassandra 0.6.3, not (the latest 0.6 release)
 0.6.2.

 https://issues.apache.org/jira/browse/CASSANDRA-1169

 =Rob



Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Benjamin Black
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield
charles.butterfi...@nextcentury.com wrote:

 I guess my point is that I have rarely run across database servers that die
 from either too many client connections, or too rapid client requests.  They
 generally stop accepting incoming connections when there are too many 
 connection
 requests, and further they do not queue and acknowledge an unbounded number of
 client requests on any given connection.


Not what I am suggesting.  Instead, I am saying things can be tuned to
behave in various ways: you can arbitrarily back up client requests
such that they start timing out, you can accept them and run out of
memory, you can start swapping and go into a GC death spiral and have
nodes drop off the ring, etc.  There just isn't a situation in which
you can satisfy an arbitrary load in limited time with insufficient
resources.  The exact same constraints apply to other databases, like
MySQL, and the expectation to tune for your needs is the same.


b


Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Benjamin Black
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield
charles.butterfi...@nextcentury.com wrote:
 To clarify the history here -- initially we were writing with CL=0 and had
 great performance but ended up killing the server.  It was pointed out that
 we were really asking the server to accept and acknowledge an unbounded
 number of requests without waiting for any final disposition of the rows.
 So we had a doh! moment.  That is why we went to the other extreme of
 CL=ALL, to let the server fully dispose of each request before acknowledging
 it and getting the next.


CL.ALL (and CL.QUORUM) go through the strongly consistent write path,
while CL.ONE (and CL.ANY) go through the weakly consistent write path.
 Using CL.ALL you aren't _letting_ the server fully dispose of each
request, you are _requiring_ it to hold on to resources until _all_
replicas confirm the write.  With CL.ONE, the writes are asynchronous,
with only a single success required before results are returned to the
client, consuming fewer server resources.  CL.ONE is what I recommend
you use unless you have specific needs to the contrary.
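
A tiny illustration (column family, key, value and timestamp are placeholders;
'client' is an open Thrift connection): the only difference on the client side
is the last argument, but the blocking behavior changes completely:

ColumnPath path = new ColumnPath("Standard2");
path.setColumn("data".getBytes());
// returns as soon as a single replica has acknowledged the write
client.insert("Keyspace1", "row1", path, value, timestamp, ConsistencyLevel.ONE);
// returns only after every replica has acknowledged the write
client.insert("Keyspace1", "row1", path, value, timestamp, ConsistencyLevel.ALL);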

Unfortunately, the best documentation for this is the code, though it
is mentioned briefly here:
http://wiki.apache.org/cassandra/ArchitectureOverview


b


Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-15 Thread Benjamin Black
On Tue, Jun 15, 2010 at 4:58 PM, Jonathan Shook jsh...@gmail.com wrote:
 If there aren't enough resources on the server side to service the
 clients, the expectation should be that the servers have a graceful
 performance degradation, or in the worst case throw an error specific
 to resource exhaustion or explicit resource throttling. The fact that
 Cassandra does some background processing complicates this a bit.

This is actually one of the most significant complications: graceful
performance degradation can quickly lead to GC pauses long enough for
a node to be marked down by the rest of the cluster.  You could try to
trade a lot of CPU for limited memory by tuning GC parameters to be
_really_ aggressive, I suppose, but that doesn't strike me as a great
strategy.  The nature of distributed systems like this presents
challenges simply not present (or even uglier) in centralized
databases.

I agree completely with the need to have the server engage in
self-preserving activity should it start nearing limits, and to signal
that change to clients and in logs.  Definitely room for improvement,
regardless of state of tune.


b


Re: Some questions about using Cassandra

2010-06-15 Thread Rob Coli

On 6/15/10 6:35 PM, Benjamin Black wrote:

jmhodges contributed a patch (I remain incompetent at Jira searches)
for 'coprocessors' to do what you want.  That'd be where I'd start
looking.


https://issues.apache.org/jira/browse/CASSANDRA-1016

=Rob


RE: Some questions about using Cassandra

2010-06-15 Thread Anthony Ikeda
Thanks Benjamin. Looking at the 'plugins' now :)

-Original Message-
From: Benjamin Black [mailto:b...@b3k.us] 
Sent: Wednesday, 16 June 2010 11:35 AM
To: user@cassandra.apache.org
Subject: Re: Some questions about using Cassandra

On Tue, Jun 15, 2010 at 6:07 PM, Anthony Ikeda
anthony.ik...@cardlink.com.au wrote:

 Thanks Jonathan, I was only asking about the event listeners because
an alternative we are considering is TIBCO Active Spaces which draws
quite a lot of parallels to Cassandra.



Based on painful production experience, I would not consider TIBCO for
anything requiring reliable delivery.  The failure modes and edge
cases are too numerous and unpleasant to enumerate.

jmhodges contributed a patch (I remain incompetent at Jira searches)
for 'coprocessors' to do what you want.  That'd be where I'd start
looking.


b



Re: stalled streaming

2010-06-15 Thread Jonathan Ellis
I think the one you're referring to is
https://issues.apache.org/jira/browse/CASSANDRA-1076

On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black b...@b3k.us wrote:
 This is not the bug to which I was referring.  I don't recall the
 number, perhaps someone else can assist on that front?  I just know I
 specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the
 fix (and it worked).


 b

 On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli rc...@digg.com wrote:

  On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote:
 Known bug, fixed in latest 0.6 release.

 On 6/15/10 4:06 PM, aaron wrote:
 Thanks, will move to 0.6.2.

 I believe that this thread refers to CASSANDRA-1169, and fix version for
 that is the (unreleased) cassandra 0.6.3, not (the latest 0.6 release)
 0.6.2.

 https://issues.apache.org/jira/browse/CASSANDRA-1169

 =Rob





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: stalled streaming

2010-06-15 Thread Benjamin Black
Yes!

On Tue, Jun 15, 2010 at 6:44 PM, Jonathan Ellis jbel...@gmail.com wrote:
 I think the one you're referring to is
 https://issues.apache.org/jira/browse/CASSANDRA-1076

 On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black b...@b3k.us wrote:
 This is not the bug to which I was referring.  I don't recall the
 number, perhaps someone else can assist on that front?  I just know I
 specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the
 fix (and it worked).


 b

 On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli rc...@digg.com wrote:

  On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black b...@b3k.us wrote:
 Known bug, fixed in latest 0.6 release.

 On 6/15/10 4:06 PM, aaron wrote:
 Thanks, will move to 0.6.2.

 I believe that this thread refers to CASSANDRA-1169, and fix version for
 that is the (unreleased) cassandra 0.6.3, not (the latest 0.6 release)
 0.6.2.

 https://issues.apache.org/jira/browse/CASSANDRA-1169

 =Rob





 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of Riptano, the source for professional Cassandra support
 http://riptano.com



Re: JVM Options for Production

2010-06-15 Thread Jonathan Ellis
The main change you'd commonly make is decreasing the max new gen size
on large heaps (say to 2GB) from the default of 1/3 of the heap.

IMO keeping heap dump on OOM around is a good idea in production; it
doesn't cost much (you're already screwed at the point where it starts
writing a dump, so why not) and it can be useful.
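
For example, on an 8GB heap that might look like the stock options with the
heap sizes bumped and the new gen capped via -Xmn (just a sketch; whether you
drop -ea for production is your call):

JVM_OPTS= \
        -Xms8G \
        -Xmx8G \
        -Xmn2G \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:+CMSParallelRemarkEnabled \
        -XX:SurvivorRatio=8 \
        -XX:MaxTenuringThreshold=1 \
        -XX:+HeapDumpOnOutOfMemoryError \
        -Dcom.sun.management.jmxremote.port=8080 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false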

On Mon, Jun 14, 2010 at 6:01 PM, Anthony Molinaro
antho...@alumni.caltech.edu wrote:
 Hi,

  I was updating to a newer 0.6.3 and happened to remember that I noticed
 back in 0.6.2 there's this change in CHANGES.txt

  * improve default JVM GC options (CASSANDRA-1014)

 Looking at that ticket, I don't actually see the options listed or a
 reason for why they changed.  Also, I'm not certain which options are
 now recommended for a production system versus what's in the distribution.

 The distribution (well svn) for 0.6.x currently has

 JVM_OPTS= \
        -ea \
        -Xms256M \
        -Xmx1G \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:+CMSParallelRemarkEnabled \
        -XX:SurvivorRatio=8 \
        -XX:MaxTenuringThreshold=1 \
        -XX:+HeapDumpOnOutOfMemoryError \
        -Dcom.sun.management.jmxremote.port=8080 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false

 Now I would assume that for 'production' you want to remove
   -ea
 and
   -XX:+HeapDumpOnOutOfMemoryError

 as well as adjust -Xms and -Xmx accordingly, but are there any others
 which should be tweaked?  Is there actually a recommended production
 set of values or does it vary greatly from installation to installation?

 Thanks,

 -Anthony

 --
 
 Anthony Molinaro                           antho...@alumni.caltech.edu




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


RE: read operation is slow

2010-06-15 Thread caribbean410
Thank you for the update. For the select issue, right now we are just focusing
on read and write; later we may test the delete operation, which needs to query
all keys.

 

From: Dop Sun [mailto:su...@dopsun.com]
Sent: Tuesday, June 15, 2010 4:14 PM
To: user@cassandra.apache.org
Subject: RE: read operation is slow

 

Thanks for your updates, good to know that your performance is better now.

 

Actually, if users ask for one record at a time, it will usually be done with
multi-threading, since most likely the requests are coming from different users.

 

If a single user wants 200k, there is no difference between getting 1 at a time
and getting 100 at a time, since the result set is exactly the same.

 

By the way: the Jassandra you are using is updated for the issue you raised.
Now, the result set from select is sorted.

 

Thanks,

Regards,

dop

 

From: Caribbean410 [mailto:caribbean...@gmail.com] 
Sent: Tuesday, June 15, 2010 9:16 AM
To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

Now I read 100 records each time, and the total time to read 200k records
(1M each) is reduced to 10s. Looks good. But I am still curious how to handle
the case where users read one record each time.

On Fri, Jun 11, 2010 at 6:05 PM, Dop Sun su...@dopsun.com wrote:

And also, you are only selecting 1 key and 10 columns?

 

criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst,
nameFirst, 10);

 

Then, if you have 200k keys, you have 200k Thrift calls.  If this is the
case, you may need to optimize the way you do the query (to combine multiple
keys into a single query), and to reduce the number of calls.
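
As a hedged sketch of what that batching might look like with the same
Jassandra calls shown below (assuming keyList simply takes more keys, since it
already accepts a List):

// Illustrative only: one Thrift round trip fetches 100 rows instead of 1.
List<String> batchOfKeys = nextHundredKeys();        // hypothetical helper
ICriteria criteria = cf.createCriteria();
criteria.keyList(batchOfKeys).columnRange(nameFirst, nameFirst, 10);
Map<String, List<IColumn>> map = criteria.select();
for (String key : batchOfKeys) {
    List<IColumn> columns = map.get(key);
    // process the columns for this key
}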

 

From: Dop Sun [mailto:su...@dopsun.com] 
Sent: Saturday, June 12, 2010 8:57 AM


To: user@cassandra.apache.org

Subject: RE: read operation is slow

 

You mean after you removed some unnecessary column families and changed the
size of the row cache and key cache, the latency changed from 0.25ms to
0.09ms? In essence 0.09ms * 200k = 18s, yet it still takes 400 seconds to
return?

 

From: Caribbean410 [mailto:caribbean...@gmail.com] 
Sent: Saturday, June 12, 2010 8:48 AM
To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

Hi, do you mean this one should not introduce much extra delay? To read a
record, I need the select here; I am not sure where the extra delay comes from.

On Fri, Jun 11, 2010 at 5:29 PM, Dop Sun su...@dopsun.com wrote:

Jassandra is used here:

 

Map<String, List<IColumn>> map = criteria.select();

 

The select here basically is a call to Thrift API: get_range_slices

 

 

From: Caribbean410 [mailto:caribbean...@gmail.com] 
Sent: Saturday, June 12, 2010 8:00 AM


To: user@cassandra.apache.org
Subject: Re: read operation is slow

 

I removed some unnecessary column families and changed the size of the row cache
and key cache; now the latency has changed from 0.25ms to 0.09ms. In essence
0.09ms * 200k = 18s. I don't know why it takes more than 400s total. Here is the
client code and cfstats. There are not many operations here, so why is the
extra time so large?



  long start = System.currentTimeMillis();
  for (int j = 0; j < 1; j++) {
  for (int i = 0; i < numOfRecords; i++) {
  int n = random.nextInt(numOfRecords);
  ICriteria criteria = cf.createCriteria();
  userName = keySet[n];
 
criteria.keyList(Lists.newArrayList(userName)).columnRange(nameFirst,
nameFirst, 10); 
  Map<String, List<IColumn>> map = criteria.select();
  List<IColumn> list = map.get(userName);
//  ByteArray bloc = list.get(0).getValue();
//  byte[] byteArrayloc = bloc.toByteArray();
//  loc = new String(byteArrayloc);

//  readBytes = readBytes + loc.length();
  readBytes = readBytes + blobSize;
  }
  }

long finish=System.currentTimeMillis();

float totalTime=(finish-start)/1000;


Keyspace: Keyspace1
Read Count: 60
Read Latency: 0.090530067 ms.
Write Count: 20
Write Latency: 0.01504989 ms.
Pending Tasks: 0
Column Family: Standard2
SSTable count: 3
Space used (live): 265990358
Space used (total): 265990358
Memtable Columns Count: 2615
Memtable Data Size: 2667300
Memtable Switch Count: 3
Read Count: 60
Read Latency: 0.091 ms.
Write Count: 20
Write Latency: 0.015 ms.
Pending Tasks: 0
Key cache capacity: 1000
Key cache size: 187465
Key cache hit rate: 0.0
Row cache capacity: 1000
Row cache size: 189990
Row cache hit rate: 0.68335
Compacted row minimum size: 0
Compacted row maximum size: 0
Compacted row mean size: 0


Keyspace: system
Read Count: 1