Re: CQL and schema-less column family

2011-09-07 Thread Eric Evans
On Tue, Sep 6, 2011 at 12:22 PM, osishkin osishkin osish...@gmail.com wrote:
 Sorry for the newbie question but I failed to find a clear answer.
 Can CQL be used to query a schema-less column family? Can such columns be indexed?
 That is, query for column names that do not necessarily exist in all
 rows, and were not defined in advance when the column family was
 created.

Absolutely, yes.

If you don't declare schema for the columns, then their type will simply be
the default validation class for that column family.
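
To make that concrete, here is a minimal sketch in Java against the raw Thrift
API that ships with 0.8.x (the keyspace, column family and row key below are
made-up placeholders, and the SELECT syntax is my recollection of the 0.8-era
CQL dialect): selecting a whole row returns whatever columns happen to exist,
decoded with the CF's default_validation_class, whether or not they were ever
declared in column_metadata.

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.Compression;
import org.apache.cassandra.thrift.CqlResult;
import org.apache.cassandra.thrift.CqlRow;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class SchemalessCqlSketch {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("MyKeyspace");   // placeholder keyspace

        // Select the whole row; columns never declared in column_metadata come back
        // interpreted via the CF's default_validation_class.
        String cql = "SELECT * FROM MyColumnFamily WHERE KEY = 'row1'";
        CqlResult result = client.execute_cql_query(
                ByteBuffer.wrap(cql.getBytes("UTF-8")), Compression.NONE);

        for (CqlRow row : result.getRows()) {
            for (Column col : row.getColumns()) {
                System.out.printf("%s = %s%n",
                        new String(col.getName(), "UTF-8"),
                        new String(col.getValue(), "UTF-8"));
            }
        }
        transport.close();
    }
}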

-- 
Eric Evans
Acunu | http://www.acunu.com | @acunu


Task's map reading more record than CFIF's inputSplitSize

2011-09-07 Thread Mck
Cassandra-0.8.4 w/ ByteOrderedPartitioner

CFIF's inputSplitSize=196608

3 map tasks (out of 4013) are still running after reading 25 million rows.

Can this be a bug in StorageService.getSplits(..) ?

With this data I've had general headaches with tokens that are longer than
usual (and with trying to move nodes around to balance the ring).

 nodetool ring gives
Address         Status State   Load            Owns    Token
                                                       Token(bytes[76118303760208547436305468318170713656])
152.90.241.22   Up     Normal  270.46 GB       33.33%  Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
152.90.241.24   Up     Normal  247.89 GB       33.33%  Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
152.90.241.23   Up     Normal  1.1 TB          33.33%  Token(bytes[76118303760208547436305468318170713656])


~mck
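
For reference, a rough sketch of where that split size is normally set in a
Hadoop job driver (Java; the ConfigHelper method names are as I remember them
for 0.8.x, and the job name is a placeholder). setRangeBatchSize is included
because it controls how many rows the record reader pages through per Thrift
call.

import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SplitSizeSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "cassandra-split-size-sketch");
        job.setInputFormatClass(ColumnFamilyInputFormat.class);

        // Target number of rows per input split; each map task should see roughly
        // this many rows (196608 matches the value quoted above).
        ConfigHelper.setInputSplitSize(job.getConfiguration(), 196608);

        // Rows fetched per get_range_slices page inside the record reader.
        ConfigHelper.setRangeBatchSize(job.getConfiguration(), 4096);
    }
}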



Secondary index update issue

2011-09-07 Thread Thamizh
Hi All,
I have created a KS & CF using cassandra-0.7.8 and inserted some
rows and column values (around 1000 rows). Later, I wanted to index 2
columns, so I issued an 'update column family ...' command. Afterwards,
when I query based on an indexed value, it says the row is not found.
After indexing I (1) issued nodetool flush and (2) restarted Cassandra once,
but the result is still the same. However, I can see some XXX-Index.db files
in the Cassandra data directory.
What am I missing?

Here are CF details,

create column family ipinfo with column_type=Standard and 
default_validation_class =   UTF8Type and comparator=UTF8Type and
keys_cached=25000 and rows_cached=5000 and column_metadata=[
{ column_name : country, validation_class : UTF8Type},
{ column_name : ip, validation_class : LongType},
{ column_name : domain, validation_class : UTF8Type },
];

update column family ip with column_type=Standard and 
default_validation_class = UTF8Type and comparator=UTF8Type and
keys_cached=25000 and rows_cached=5000 and column_metadata=[
{column_name : country, validation_class : UTF8Type },
{column_name : domain, validation_class : UTF8Type, index_type: KEYS},
{column_name : ip, validation_class : LongType, index_type: KEYS}
];


Any suggestions would be appreciated

Regards,

  Thamizhannal P

Re: Cassandra 0.8.4 - doesn't support defining keyspaces in cassandra.yaml?

2011-09-07 Thread Jonathan Ellis
No, the load from yaml was only supported for upgrading from 0.6.
You'd need to create the schema programmatically instead.
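
For example, a minimal Hector sketch of creating the schema programmatically
(assuming Hector 0.8.x; the cluster, keyspace and column family names are
placeholders):

import java.util.Arrays;

import me.prettyprint.cassandra.service.ThriftKsDef;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

public class CreateSchemaSketch {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");

        // Define one column family and a keyspace using SimpleStrategy with RF=1,
        // which is usually enough for an embedded/test instance.
        ColumnFamilyDefinition cfDef =
                HFactory.createColumnFamilyDefinition("MyKeyspace", "MyColumnFamily");
        KeyspaceDefinition ksDef = HFactory.createKeyspaceDefinition(
                "MyKeyspace", ThriftKsDef.DEF_STRATEGY_CLASS, 1, Arrays.asList(cfDef));

        // Create it only if it does not exist yet.
        if (cluster.describeKeyspace("MyKeyspace") == null) {
            cluster.addKeyspace(ksDef);
        }
    }
}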

On Wed, Sep 7, 2011 at 12:27 AM, Roshan Dawrani roshandawr...@gmail.com wrote:
 Hi,
 I have just started the process of upgrading Cassandra from 0.7.2 to 0.8.4,
 and I am facing some issues with embedded cassandra that we utilize in our
 application.
 With 0.7.2, we define our keyspace in cassandra.yaml and use Hector to give
 us an embedded cassandra instance loaded with schema from cassandra.yaml. Is
 it not possible to do the same with Cassandra / Hector 0.8.x?
 Can someone throw some light please?
 Thanks.
 --
 Roshan
 Blog: http://roshandawrani.wordpress.com/
 Twitter: @roshandawrani
 Skype: roshandawrani





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Secondary index update issue

2011-09-07 Thread Jonathan Ellis
My guess would be you're querying using a different encoding and there
really is no data for your query as given.  Hard to say without more
details.
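
To illustrate the encoding point, here is a rough Hector sketch (Hector 0.8.x
API; the Keyspace object is assumed to exist, and the CF layout is the one from
the message quoted below) of querying the LongType-validated 'ip' index. The
expression value has to be serialized as an 8-byte long; sending the digits as
a UTF-8 string produces different bytes and matches nothing.

import me.prettyprint.cassandra.model.IndexedSlicesQuery;
import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.OrderedRows;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.QueryResult;

public class IndexedQuerySketch {
    // Row keys and column names are UTF8; the indexed 'ip' column is LongType.
    public static OrderedRows<String, String, Long> findByIp(Keyspace keyspace, long ip) {
        IndexedSlicesQuery<String, String, Long> query = HFactory.createIndexedSlicesQuery(
                keyspace, StringSerializer.get(), StringSerializer.get(), LongSerializer.get());
        query.setColumnFamily("ip");
        // The value serializer (LongSerializer) is what makes the bytes match the
        // LongType-validated index; a query on the UTF8 'domain' column would use
        // StringSerializer for the value instead.
        query.addEqualsExpression("ip", ip);
        query.setRange("", "", false, 10);   // return up to 10 columns per matching row
        QueryResult<OrderedRows<String, String, Long>> result = query.execute();
        return result.get();
    }
}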

On Wed, Sep 7, 2011 at 8:13 AM, Thamizh tceg...@yahoo.co.in wrote:

 Hi All,

 I have created KS  CF using cassandra-0.7.8 and inserted some rows and 
 column values(around 1000 rows). Later, I wanted to index 2 column values. 
 So, I issued 'update column family..' command. After, when I query based on 
 indexed value it says Row does not found. After indexing 1. Issued nodetool 
 flush 2.restarted Cassandra once. Though it is same. But, I could see some 
 XXX-Index.db file on cassandra data directory. What am I missing?

 Here are CF details,

 create column family ipinfo with column_type=Standard and
 default_validation_class =   UTF8Type and comparator=UTF8Type and
 keys_cached=25000 and rows_cached=5000 and column_metadata=[
 { column_name : country, validation_class : UTF8Type},
 { column_name : ip, validation_class : LongType},
 { column_name : domain, validation_class : UTF8Type },
 ];

 update column family ip with column_type=Standard and
 default_validation_class = UTF8Type and comparator=UTF8Type and
 keys_cached=25000 and rows_cached=5000 and column_metadata=[
 {column_name : country, validation_class : UTF8Type },
 {column_name : domain, validation_class : UTF8Type, index_type: KEYS},
 {column_name : ip, validation_class : LongType, index_type: KEYS}
 ];

 Any suggestions would be appreciated

 Regards,
 Thamizhannal P


--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Cassandra 0.8.4 - doesn't support defining keyspaces in cassandra.yaml?

2011-09-07 Thread Roshan Dawrani
On Wed, Sep 7, 2011 at 7:27 PM, Jonathan Ellis jbel...@gmail.com wrote:

 No, the load from yaml was only supported for upgrading from 0.6.
 You'd need to create the schema programatically instead.


Thanks for confirming. I am now creating my keyspace programmatically, but
running into another small cassandra issue with my embedded server.

In my embedded server setup, I use
cassandra-javautils's CassandraServiceDataCleaner to clean up the data
directories, which in turn uses DatabaseDescriptor.getAllDataFileLocations()
to get the various directories configured in cassandra.yaml.

Now the problem is that DatabaseDescriptor uses CassandraDaemon to do a
check on allowed rpc_server_types, which forces the static initializer of its
parent AbstractCassandraDaemon to be executed. AbstractCassandraDaemon's
static initializer fails if it does not find log4j-server.properties.

Ours is a Grails application, and log4j configuration is initialized in a
different way, and I do not want to feed the embedded server a dummy
log4j.properties file just to satisfy the chain above. Is there any way I
can avoid it?
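
One possible workaround, sketched only under the assumption that the static
initializer resolves its config through the standard log4j.configuration
system property (that is how I read the 0.8 code, but verify against your
version), is to point that property at the log4j config already on your
classpath before anything touches DatabaseDescriptor:

public class EmbeddedCassandraBootstrap {
    public static void main(String[] args) throws Exception {
        // Assumption: AbstractCassandraDaemon looks up the log4j.configuration system
        // property and falls back to log4j-server.properties. The resource name below
        // is a placeholder for whatever config the application already ships.
        System.setProperty("log4j.configuration", "log4j.properties");

        // ... from here it should be safe to call
        // DatabaseDescriptor.getAllDataFileLocations() and start the embedded server.
    }
}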

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani http://twitter.com/roshandawrani
Skype: roshandawrani


Any tentative date for 0.8.5 release?

2011-09-07 Thread Roshan Dawrani
Hi,

Quick check: is there a tentative date for release of Cassandra 0.8.5?

Thanks.

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani http://twitter.com/roshandawrani
Skype: roshandawrani


Re: Any tentative date for 0.8.5 release?

2011-09-07 Thread Jeremy Hanna
The voting started on Monday and is a 72-hour vote.  So if there aren't any 
problems that people find, it should be released sometime Thursday (8 
September).

On Sep 7, 2011, at 10:41 AM, Roshan Dawrani wrote:

 Hi,
 
 Quick check: is there a tentative date for release of Cassandra 0.8.5?
 
 Thanks.
 
 -- 
 Roshan
 Blog: http://roshandawrani.wordpress.com/
 Twitter: @roshandawrani
 Skype: roshandawrani
 



Re: Any tentative date for 0.8.5 release?

2011-09-07 Thread Roshan Dawrani
On Wed, Sep 7, 2011 at 9:15 PM, Jeremy Hanna jeremy.hanna1...@gmail.comwrote:

 The voting started on Monday and is a 72 hour vote.  So if there aren't any
 problems that people find, it should be released sometime Thursday (7
 September).


Great, thanks for the quick info. Looking forward to it.

-- 
Roshan
Blog: http://roshandawrani.wordpress.com/
Twitter: @roshandawrani http://twitter.com/roshandawrani
Skype: roshandawrani


Re: Task's map reading more record than CFIF's inputSplitSize

2011-09-07 Thread Jonathan Ellis
getSplits looks pretty foolproof to me but I guess we'd need to add
more debug logging to rule out a bug there for sure.

I guess the main alternative would be a bug in the recordreader paging.

On Wed, Sep 7, 2011 at 6:35 AM, Mck m...@apache.org wrote:
 Cassandra-0.8.4 w/ ByteOrderedPartitioner

 CFIF's inputSplitSize=196608

 3 map tasks (from 4013) is still running after read 25 million rows.

 Can this be a bug in StorageService.getSplits(..) ?

 With this data I've had general headache with using tokens that are
 longer than usual (and trying to move nodes around to balance the ring).

  nodetool ring gives
 Address         Status State   Load            Owns    Token
                                                       
 Token(bytes[76118303760208547436305468318170713656])
 152.90.241.22   Up     Normal  270.46 GB       33.33%  
 Token(bytes[30303030303031333131313739353337303038d4e7f72db2ed11e09d7c68b59973a5d8])
 152.90.241.24   Up     Normal  247.89 GB       33.33%  
 Token(bytes[303030303030313331323631393735313231381778518cc00711e0acb968b59973a5d8])
 152.90.241.23   Up     Normal  1.1 TB          33.33%  
 Token(bytes[76118303760208547436305468318170713656])


 ~mck





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Calculate number of nodes required based on data

2011-09-07 Thread Adi
On Tue, Sep 6, 2011 at 3:53 PM, Hefeng Yuan hfy...@rhapsody.com wrote:

 Hi,

 Is there any suggested way of calculating number of nodes needed based on
 data?


We currently have 6 nodes (each has 8G memory) with RF5 (because we want to
 be able to survive loss of 2 nodes).
 The flush of memtable happens around every 30 min (while not doing
 compaction), with ~9m serialized bytes.

 The problem is that we see more than 3 nodes doing compaction at the same
 time, which slows down the application.
 (tried to increase/decrease compaction_throughput_mb_per_sec, not helping
 much)

 So I'm thinking probably we should add more nodes, but not sure how many
 more to add.
 Based on the data rate, is there any suggested way of calculating number of
 nodes required?

 Thanks,
 Hefeng



What is the total amount of data?
What is the total amount in the biggest column family?

There is no hard limit per node. Cassandra gurus like more nodes :-). One
number for 'happy cassandra users' I have seen mentioned in discussions is
around 250-300 GB per node. But you could store more per node by having
multiple column families, each storing around 250-300 GB. The main problem
is that repairs, compactions and similar operations take longer and require
much more spare disk space.

As for the slowdown in the application during compaction, I was wondering:
what CL are you using for reads and writes?
Also make sure it is not a client issue - is your client hitting all nodes in
round-robin or some other fashion?

-Adi


Re: Calculate number of nodes required based on data

2011-09-07 Thread Hefeng Yuan
Adi,

The reason we're attempting to add more nodes is to solve the
long/simultaneous compactions, i.e. the performance issue, not a storage
issue yet.
We have RF 5 and CL QUORUM for reads and writes, and we currently have 6 nodes;
when 4 nodes are doing compaction in the same period we're in trouble, especially
on reads, since a quorum will cover at least one of the compacting nodes anyway.
My assumption is that if we add more nodes, each node will have less load and
therefore need less compaction, will probably compact faster, and we will
eventually avoid 4+ nodes doing compaction simultaneously.

Any suggestion on how to calculate how many more nodes to add? Or, more
generally, how to plan the number of nodes required from a performance perspective?

Thanks,
Hefeng

On Sep 7, 2011, at 9:56 AM, Adi wrote:

 On Tue, Sep 6, 2011 at 3:53 PM, Hefeng Yuan hfy...@rhapsody.com wrote:
 Hi,
 
 Is there any suggested way of calculating number of nodes needed based on 
 data?
  
 We currently have 6 nodes (each has 8G memory) with RF5 (because we want to 
 be able to survive loss of 2 nodes).
 The flush of memtable happens around every 30 min (while not doing 
 compaction), with ~9m serialized bytes.
 
 The problem is that we see more than 3 nodes doing compaction at the same 
 time, which slows down the application.
 (tried to increase/decrease compaction_throughput_mb_per_sec, not helping 
 much)
 
 So I'm thinking probably we should add more nodes, but not sure how many more 
 to add.
 Based on the data rate, is there any suggested way of calculating number of 
 nodes required?
 
 Thanks,
 Hefeng
 
 
 What is the total  amount of data?
 What is the total amount in the biggest column family?
 
 There is no hard limit per node. Cassandra gurus like more nodes :-). One 
 number for 'happy cassandra users'  I have seen mentioned in discussions is 
 around 250-300 GB per node. But you could store more per node by having 
 multiple column families each storing around 250-300 GB per column family. 
 The main problem being repair/compactions and such operations taking longer 
 and requiring much more spare disk space.
 
 As for slow down in application during compaction I was wondering 
 what is the CL you are using for read and writes?
 Make sure it is not a client issue - Is your client hitting all nodes in 
 round-robin or some other fashion?
 
 -Adi



Re: Task's map reading more record than CFIF's inputSplitSize

2011-09-07 Thread Mick Semb Wever

  3 map tasks (from 4013) is still running after read 25 million rows.
  Can this be a bug in StorageService.getSplits(..) ? 

 getSplits looks pretty foolproof to me but I guess we'd need to add
 more debug logging to rule out a bug there for sure.
 
 I guess the main alternative would be a bug in the recordreader paging.

Entered https://issues.apache.org/jira/browse/CASSANDRA-3150

~mck

-- 
“People only see what they're prepared to see.” - Ralph Waldo Emerson 

| http://semb.wever.org | http://sesat.no |
| http://tech.finn.no   | Java XSS Filter |




Re: Secondary index update issue

2011-09-07 Thread Thamizh
Hi,

Here is a sample row from CF 'ip'. I want to execute the 2 queries below:
1. get ip where ip=19268678;
2. get ip where domain='google.com';

Both ip & domain have secondary indexes.

RowKey: 19268678
=> (column=country, value=in, timestamp=1315398995980)
=> (column=domain, value=google.com, timestamp=1315398995980)
=> (column=ip, value=19268678, timestamp=1315398995980)

What encoding format should I use? Here I defined ip as LongType & domain as
UTF8Type.

When I upload the data with the above-mentioned index types already in the CF
definition, then the above queries work successfully. Why?


Regards,

  Thamizhannal P

--- On Wed, 7/9/11, Jonathan Ellis jbel...@gmail.com wrote:

From: Jonathan Ellis jbel...@gmail.com
Subject: Re: Secondary index update issue
To: user@cassandra.apache.org
Date: Wednesday, 7 September, 2011, 7:29 PM

My guess would be you're querying using a different encoding and there
really is no data for your query as given.  Hard to say without more
details.

On Wed, Sep 7, 2011 at 8:13 AM, Thamizh tceg...@yahoo.co.in wrote:

 Hi All,

 I have created KS  CF using cassandra-0.7.8 and inserted some rows and 
 column values(around 1000 rows). Later, I wanted to index 2 column values. 
 So, I issued 'update column family..' command. After, when I query based on 
 indexed value it says Row does not found. After indexing 1. Issued nodetool 
 flush 2.restarted Cassandra once. Though it is same. But, I could see some 
 XXX-Index.db file on cassandra data directory. What am I missing?

 Here are CF details,

 create column family ipinfo with column_type=Standard and
 default_validation_class =   UTF8Type and comparator=UTF8Type and
 keys_cached=25000 and rows_cached=5000 and column_metadata=[
 { column_name : country, validation_class : UTF8Type},
 { column_name : ip, validation_class : LongType},
 { column_name : domain, validation_class : UTF8Type },
 ];

 update column family ip with column_type=Standard and
 default_validation_class = UTF8Type and comparator=UTF8Type and
 keys_cached=25000 and rows_cached=5000 and column_metadata=[
 {column_name : country, validation_class : UTF8Type },
 {column_name : domain, validation_class : UTF8Type, index_type: KEYS},
 {column_name : ip, validation_class : LongType, index_type: KEYS}
 ];

 Any suggestions would be appreciated

 Regards,
 Thamizhannal P


--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Calculate number of nodes required based on data

2011-09-07 Thread Adi
On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan hfy...@rhapsody.com wrote:

 Adi,

 The reason we're attempting to add more nodes is trying to solve the
 long/simultaneous compactions, i.e. the performance issue, not the storage
 issue yet.
 We have RF 5 and CL QUORUM for read and write, we have currently 6 nodes,
 and when 4 nodes doing compaction at the same period, we're screwed,
 especially on read, since it'll cover one of the compaction node anyways.
 My assumption is that if we add more nodes, each node will have less load,
 and therefore need less compaction, and probably will compact faster,
 eternally avoid 4+ nodes doing compaction simultaneously.

 Any suggestion on how to calculate how many more nodes to add? Or,
 generally how to plan for number of nodes required, from a performance
 perspective?

 Thanks,
 Hefeng



Adding nodes to delay and reduce compaction is an interesting performance
use case :-) I am thinking you can find a smarter/cheaper way to manage
that.
Have you looked at:

a) increasing memtable throughput
What is the nature of your writes? Is it mostly inserts, or does it also have a
lot of quick updates to recently inserted data? Increasing memtable_throughput
can delay and maybe reduce the compaction cost if you have lots of updates to
the same data. You will have to provide the extra memory if you try this.
When you mention ~9m serialized bytes, is that the memtable throughput?
That is quite a low threshold and will result in a large number of SSTables
needing to be compacted. I think the default is 256 MB, and the lowest values
I have seen used are 64 MB or maybe 32 MB.


b) tweaking min_compaction_threshold and max_compaction_threshold
- increasing min_compaction_threshold will delay compactions
- decreasing max_compaction_threshold will reduce the number of SSTables per
compaction cycle
Are you using the defaults (4-32) or have you tried different values? (A sketch
of adjusting these programmatically follows at the end of this message.)

c) splitting column families
Splitting column families can also help, because compactions occur
serially, one CF at a time, which spreads your compaction cost over
time and across column families. It requires a change in app logic, though.

-Adi
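
As referenced under (b) above, here is a rough sketch of adjusting the
compaction thresholds programmatically through Hector (0.8.x; the setter names
are from memory and the cluster/keyspace/CF names are placeholders). The same
change can be made from cassandra-cli with an 'update column family ... with
min_compaction_threshold=... and max_compaction_threshold=...;' statement.

import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
import me.prettyprint.hector.api.factory.HFactory;

public class CompactionTuningSketch {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");
        KeyspaceDefinition ksDef = cluster.describeKeyspace("MyKeyspace");

        for (ColumnFamilyDefinition cfDef : ksDef.getCfDefs()) {
            if ("MyColumnFamily".equals(cfDef.getName())) {
                // Option (b): delay compactions and cap how many SSTables one cycle touches.
                cfDef.setMinCompactionThreshold(6);    // default is 4
                cfDef.setMaxCompactionThreshold(16);   // default is 32
                cluster.updateColumnFamily(cfDef);
            }
        }
    }
}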


Re: Calculate number of nodes required based on data

2011-09-07 Thread Hefeng Yuan
We didn't change MemtableThroughputInMB / min/maxCompactionThreshold; they're
499/4/32.
As for why we're flushing at ~9m, I guess it has to do with this:
http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
The only parameter I tried to play with is compaction_throughput_mb_per_sec;
I tried cutting it in half and doubling it, and neither helped avoid the
simultaneous compactions on the nodes.

I agree that we don't necessarily need to add nodes, as long as we have a way
to avoid simultaneous compaction on 4+ nodes.

Thanks,
Hefeng

On Sep 7, 2011, at 10:51 AM, Adi wrote:

 
 On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan hfy...@rhapsody.com wrote:
 Adi,
 
 The reason we're attempting to add more nodes is trying to solve the 
 long/simultaneous compactions, i.e. the performance issue, not the storage 
 issue yet.
 We have RF 5 and CL QUORUM for read and write, we have currently 6 nodes, and 
 when 4 nodes doing compaction at the same period, we're screwed, especially 
 on read, since it'll cover one of the compaction node anyways. 
 My assumption is that if we add more nodes, each node will have less load, 
 and therefore need less compaction, and probably will compact faster, 
 eternally avoid 4+ nodes doing compaction simultaneously.
 
 Any suggestion on how to calculate how many more nodes to add? Or, generally 
 how to plan for number of nodes required, from a performance perspective?
 
 Thanks,
 Hefeng
 
 
 
 Adding nodes to delay and reduce compaction is an interesting performance use 
 case :-)  I am thinking you can find a smarter/cheaper way to manage that.
 Have you looked at 
 a) increasing memtable througput
 What is the nature of your writes?  Is it mostly inserts or also has lot of 
 quick updates of recently inserted data. Increasing memtable_throughput can 
 delay and maybe reduce the compaction cost if you have lots of updates to 
 same data.You will have to provide for memory if you try this. 
 When mentioned with ~9m serialized bytes is that the memtable throughput? 
 That is quite a low threshold which will result in large number of SSTables 
 needing to be compacted. I think the default is 256 MB and on the lower end 
 values I have seen are 64 MB or maybe 32 MB.
 
 
 b) tweaking min_compaction_threshold and max_compaction_threshold
 - increasing min_compaction_threshold will delay compactions
 - decreasing max_compaction_threshold will reduce number of sstables per 
 compaction cycle
 Are you using the defaults 4-32 or are trying some different values
 
 c) splitting column families
 Again splitting column families can also help because compactions occur 
 serially one CF at a time and that spreads out your compaction cost over time 
 and column families. It requires change in app logic though.
 
 -Adi
 



Re: Calculate number of nodes required based on data

2011-09-07 Thread Adi
On Wed, Sep 7, 2011 at 2:09 PM, Hefeng Yuan hfy...@rhapsody.com wrote:

 We didn't change MemtableThroughputInMB/min/maxCompactionThreshold, they're
 499/4/32.
 As for why we're flushing at ~9m, I guess it has to do with this:
 http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
 The only parameter I tried to play with is the *
 compaction_throughput_mb_per_sec*, tried cutting it in half and doubled,
 seems none of them helps avoiding the simultaneous compactions on nodes.

 I agree that we don't necessarily need to add node, as long as we have a
 way to avoid simultaneous compaction on 4+ nodes.

 Thanks,
 Hefeng



Can you check in the logs for something like this:
.. Memtable.java (line 157) Writing
Memtable-ColumnFamilyName@1151031968(67138588 bytes, 47430 operations)
to see the bytes/operations at which the column family gets flushed. In case
you are hitting the operations threshold, you can try increasing that to a
high number. The operations threshold is being hit at less than 2% of the
size threshold. I would try bumping up memtable_operations substantially.
The default is 1.1624 (in millions). Try 10 or 20 and see if your CF
flushes at a higher size. Keep adjusting it until the frequency/size of
flushing becomes satisfactory and hopefully reduces the compaction overhead.

-Adi







 On Sep 7, 2011, at 10:51 AM, Adi wrote:


 On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan hfy...@rhapsody.com wrote:

 Adi,

 The reason we're attempting to add more nodes is trying to solve the
 long/simultaneous compactions, i.e. the performance issue, not the storage
 issue yet.
 We have RF 5 and CL QUORUM for read and write, we have currently 6 nodes,
 and when 4 nodes doing compaction at the same period, we're screwed,
 especially on read, since it'll cover one of the compaction node anyways.
 My assumption is that if we add more nodes, each node will have less load,
 and therefore need less compaction, and probably will compact faster,
 eternally avoid 4+ nodes doing compaction simultaneously.

 Any suggestion on how to calculate how many more nodes to add? Or,
 generally how to plan for number of nodes required, from a performance
 perspective?

 Thanks,
 Hefeng



 Adding nodes to delay and reduce compaction is an interesting performance
 use case :-)  I am thinking you can find a smarter/cheaper way to manage
 that.
 Have you looked at
 a) increasing memtable througput
 What is the nature of your writes?  Is it mostly inserts or also has lot of
 quick updates of recently inserted data. Increasing memtable_throughput can
 delay and maybe reduce the compaction cost if you have lots of updates to
 same data.You will have to provide for memory if you try this.
 When mentioned with ~9m serialized bytes is that the memtable
 throughput? That is quite a low threshold which will result in large number
 of SSTables needing to be compacted. I think the default is 256 MB and on
 the lower end values I have seen are 64 MB or maybe 32 MB.


 b) tweaking min_compaction_threshold and max_compaction_threshold
 - increasing min_compaction_threshold will delay compactions
 - decreasing max_compaction_threshold will reduce number of sstables per
 compaction cycle
 Are you using the defaults 4-32 or are trying some different values

 c) splitting column families
 Again splitting column families can also help because compactions occur
 serially one CF at a time and that spreads out your compaction cost over
 time and column families. It requires change in app logic though.

 -Adi





Re: Calculate number of nodes required based on data

2011-09-07 Thread Hefeng Yuan
Adi, just to make sure my calculation is correct: the configured ops threshold
is ~2m and we have 6 nodes - does that mean each node's threshold is around 300k?
I do see that when flushing happens, ops is about 300k, with several at 500k.
Seems like the ops threshold is throttling us.

On Sep 7, 2011, at 11:31 AM, Adi wrote:

 On Wed, Sep 7, 2011 at 2:09 PM, Hefeng Yuan hfy...@rhapsody.com wrote:
 We didn't change MemtableThroughputInMB/min/maxCompactionThreshold, they're 
 499/4/32.
 As for why we're flushing at ~9m, I guess it has to do with this: 
 http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
 The only parameter I tried to play with is the 
 compaction_throughput_mb_per_sec, tried cutting it in half and doubled, seems 
 none of them helps avoiding the simultaneous compactions on nodes.
 
 I agree that we don't necessarily need to add node, as long as we have a way 
 to avoid simultaneous compaction on 4+ nodes.
 
 Thanks,
 Hefeng
 
 
 
 Can you check in the logs for something like this 
 .. Memtable.java (line 157) Writing 
 Memtable-ColumnFamilyName@1151031968(67138588 bytes, 47430 operations)
 to see the bytes/operations at which the column family gets flushed. In case 
 you are hitting the operations threshold you can try increasing that to a 
 high number. The operations threshold is getting hit at  less than 2% of size 
 threshold. I would try bumping up the memtable_operations substantially. 
 Default is 1.1624(in millions).  Try 10 or 20 and see if your CF 
 flushes at higher size. Keep adjusting it until the frequency/size of 
 flushing becomes satisfactory and hopefully reduces the compaction overhead.
 
 -Adi
 
 
 
 
 
  
 On Sep 7, 2011, at 10:51 AM, Adi wrote:
 
 
 On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan hfy...@rhapsody.com wrote:
 Adi,
 
 The reason we're attempting to add more nodes is trying to solve the 
 long/simultaneous compactions, i.e. the performance issue, not the storage 
 issue yet.
 We have RF 5 and CL QUORUM for read and write, we have currently 6 nodes, 
 and when 4 nodes doing compaction at the same period, we're screwed, 
 especially on read, since it'll cover one of the compaction node anyways. 
 My assumption is that if we add more nodes, each node will have less load, 
 and therefore need less compaction, and probably will compact faster, 
 eternally avoid 4+ nodes doing compaction simultaneously.
 
 Any suggestion on how to calculate how many more nodes to add? Or, generally 
 how to plan for number of nodes required, from a performance perspective?
 
 Thanks,
 Hefeng
 
 
 
 Adding nodes to delay and reduce compaction is an interesting performance 
 use case :-)  I am thinking you can find a smarter/cheaper way to manage 
 that.
 Have you looked at 
 a) increasing memtable througput
 What is the nature of your writes?  Is it mostly inserts or also has lot of 
 quick updates of recently inserted data. Increasing memtable_throughput can 
 delay and maybe reduce the compaction cost if you have lots of updates to 
 same data.You will have to provide for memory if you try this. 
 When mentioned with ~9m serialized bytes is that the memtable throughput? 
 That is quite a low threshold which will result in large number of SSTables 
 needing to be compacted. I think the default is 256 MB and on the lower end 
 values I have seen are 64 MB or maybe 32 MB.
 
 
 b) tweaking min_compaction_threshold and max_compaction_threshold
 - increasing min_compaction_threshold will delay compactions
 - decreasing max_compaction_threshold will reduce number of sstables per 
 compaction cycle
 Are you using the defaults 4-32 or are trying some different values
 
 c) splitting column families
 Again splitting column families can also help because compactions occur 
 serially one CF at a time and that spreads out your compaction cost over 
 time and column families. It requires change in app logic though.
 
 -Adi
 
 
 



SIGSEGV during compaction?

2011-09-07 Thread Yang
I started a compaction using nodetool, and then, always reproducibly, I get a
SIGSEGV in some code that I added to the Cassandra code base, which simply
calls get_slice().

Has anyone seen a SIGSEGV associated with compaction? Can anyone suggest a
route for debugging this?

I filed a bug on the Sun website; right now the only other approach I
can try is to use another JDK.


Thanks
Yang


Re: SIGSEGV during compaction?

2011-09-07 Thread Yang
Some info from the error report file that the JVM exported:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x2b37cbfa, pid=7236, tid=1179806016
#
# JRE version: 6.0_27-b07
# Java VM: Java HotSpot(TM) 64-Bit Server VM (20.2-b06 mixed mode
linux-amd64 compressed oops)
# Problematic frame:
# J  
com.cgm.whisky.filter.WithinLimitIpFrequencyCap.isValid(Lorg/apache/avro/specific/SpecificRecord;)Lcom/cgm/whisky/EventsFilter$ValidityCode;
#
# If you would like to submit a bug report, please visit:
#   http://java.sun.com/webapps/bugreport/crash.jsp
#

---  T H R E A D  ---

Current thread (0x2aaab80e2800):  JavaThread pool-3-thread-8
[_thread_in_Java, id=7669,
stack(0x46426000,0x46527000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR),
si_addr=0x2aaabc00

Registers:
RAX=0x0007914355e8, RBX=0x058a,
RCX=0x000791461b38, RDX=0x
RSP=0x465259f0, RBP=0xf222b894,
RSI=0x000791433f20, RDI=0x2b37ca60
R8 =0xd0931f61, R9 =0xf2286ab2,
R10=0x, R11=0x2aaabc00
R12=0x, R13=0x465259f0,
R14=0x0002, R15=0x2aaab80e2800
RIP=0x2b37cbfa, EFLAGS=0x00010202,
CSGSFS=0x0133, ERR=0x0004
  TRAPNO=0x000e

Top of Stack: (sp=0x465259f0)
0x465259f0:   00068a828dc8 000791433f20
0x46525a00:   00079145ee60 0589058a


On Wed, Sep 7, 2011 at 6:21 PM, Yang tedd...@gmail.com wrote:
 I started compaction using nodetool,
 then always reproducibly, I get a SEGV in a code that I added to the
 Cassandra code, which simply calls get_slice().

 have you seen SEGV associated with compaction? anyone could suggest a
 route on how to debug this?

 I filed a bug on sun website, right now the only possible approach I
 can try is to use another JDK


 Thanks
 Yang



Re: SIGSEGV during compaction?

2011-09-07 Thread Yang
Unfortunately I tried Java 7 too - same result.

On Wed, Sep 7, 2011 at 6:22 PM, Yang tedd...@gmail.com wrote:
 some info in the debug file that JVM exported:

 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x2b37cbfa, pid=7236, tid=1179806016
 #
 # JRE version: 6.0_27-b07
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.2-b06 mixed mode
 linux-amd64 compressed oops)
 # Problematic frame:
 # J  
 com.cgm.whisky.filter.WithinLimitIpFrequencyCap.isValid(Lorg/apache/avro/specific/SpecificRecord;)Lcom/cgm/whisky/EventsFilter$ValidityCode;
 #
 # If you would like to submit a bug report, please visit:
 #   http://java.sun.com/webapps/bugreport/crash.jsp
 #

 ---  T H R E A D  ---

 Current thread (0x2aaab80e2800):  JavaThread pool-3-thread-8
 [_thread_in_Java, id=7669,
 stack(0x46426000,0x46527000)]

 siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR),
 si_addr=0x2aaabc00

 Registers:
 RAX=0x0007914355e8, RBX=0x058a,
 RCX=0x000791461b38, RDX=0x
 RSP=0x465259f0, RBP=0xf222b894,
 RSI=0x000791433f20, RDI=0x2b37ca60
 R8 =0xd0931f61, R9 =0xf2286ab2,
 R10=0x, R11=0x2aaabc00
 R12=0x, R13=0x465259f0,
 R14=0x0002, R15=0x2aaab80e2800
 RIP=0x2b37cbfa, EFLAGS=0x00010202,
 CSGSFS=0x0133, ERR=0x0004
  TRAPNO=0x000e

 Top of Stack: (sp=0x465259f0)
 0x465259f0:   00068a828dc8 000791433f20
 0x46525a00:   00079145ee60 0589058a


 On Wed, Sep 7, 2011 at 6:21 PM, Yang tedd...@gmail.com wrote:
 I started compaction using nodetool,
 then always reproducibly, I get a SEGV in a code that I added to the
 Cassandra code, which simply calls get_slice().

 have you seen SEGV associated with compaction? anyone could suggest a
 route on how to debug this?

 I filed a bug on sun website, right now the only possible approach I
 can try is to use another JDK


 Thanks
 Yang




Re: SIGSEGV during compaction?

2011-09-07 Thread Jonathan Ellis
You should report a bug to Oracle.

In the meantime you could try turning off compressed oops -- that's
been a source of a lot of GC bugs in the past.

On Wed, Sep 7, 2011 at 8:22 PM, Yang tedd...@gmail.com wrote:
 some info in the debug file that JVM exported:

 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x2b37cbfa, pid=7236, tid=1179806016
 #
 # JRE version: 6.0_27-b07
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.2-b06 mixed mode
 linux-amd64 compressed oops)
 # Problematic frame:
 # J  
 com.cgm.whisky.filter.WithinLimitIpFrequencyCap.isValid(Lorg/apache/avro/specific/SpecificRecord;)Lcom/cgm/whisky/EventsFilter$ValidityCode;
 #
 # If you would like to submit a bug report, please visit:
 #   http://java.sun.com/webapps/bugreport/crash.jsp
 #

 ---  T H R E A D  ---

 Current thread (0x2aaab80e2800):  JavaThread pool-3-thread-8
 [_thread_in_Java, id=7669,
 stack(0x46426000,0x46527000)]

 siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR),
 si_addr=0x2aaabc00

 Registers:
 RAX=0x0007914355e8, RBX=0x058a,
 RCX=0x000791461b38, RDX=0x
 RSP=0x465259f0, RBP=0xf222b894,
 RSI=0x000791433f20, RDI=0x2b37ca60
 R8 =0xd0931f61, R9 =0xf2286ab2,
 R10=0x, R11=0x2aaabc00
 R12=0x, R13=0x465259f0,
 R14=0x0002, R15=0x2aaab80e2800
 RIP=0x2b37cbfa, EFLAGS=0x00010202,
 CSGSFS=0x0133, ERR=0x0004
  TRAPNO=0x000e

 Top of Stack: (sp=0x465259f0)
 0x465259f0:   00068a828dc8 000791433f20
 0x46525a00:   00079145ee60 0589058a


 On Wed, Sep 7, 2011 at 6:21 PM, Yang tedd...@gmail.com wrote:
 I started compaction using nodetool,
 then always reproducibly, I get a SEGV in a code that I added to the
 Cassandra code, which simply calls get_slice().

 have you seen SEGV associated with compaction? anyone could suggest a
 route on how to debug this?

 I filed a bug on sun website, right now the only possible approach I
 can try is to use another JDK


 Thanks
 Yang





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: SIGSEGV during compaction?

2011-09-07 Thread Yang
Thanks, Jonathan.

I tried OpenJDK too - same result; I filed bugs with both Oracle and OpenJDK.

I also tried -XX:-UseCompressedOops - same SIGSEGV.

The Oracle bug site asks whether it appears with -server and -Xint; I tried
those options, and so far no SIGSEGV yet - maybe slower, but I haven't
measured exactly.



On Wed, Sep 7, 2011 at 8:56 PM, Jonathan Ellis jbel...@gmail.com wrote:
 You should report a bug to Oracle.

 In the meantime you could try turning off compressed oops -- that's
 been a source of a lot of GC bugs in the past.

 On Wed, Sep 7, 2011 at 8:22 PM, Yang tedd...@gmail.com wrote:
 some info in the debug file that JVM exported:

 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x2b37cbfa, pid=7236, tid=1179806016
 #
 # JRE version: 6.0_27-b07
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.2-b06 mixed mode
 linux-amd64 compressed oops)
 # Problematic frame:
 # J  
 com.cgm.whisky.filter.WithinLimitIpFrequencyCap.isValid(Lorg/apache/avro/specific/SpecificRecord;)Lcom/cgm/whisky/EventsFilter$ValidityCode;
 #
 # If you would like to submit a bug report, please visit:
 #   http://java.sun.com/webapps/bugreport/crash.jsp
 #

 ---  T H R E A D  ---

 Current thread (0x2aaab80e2800):  JavaThread pool-3-thread-8
 [_thread_in_Java, id=7669,
 stack(0x46426000,0x46527000)]

 siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR),
 si_addr=0x2aaabc00

 Registers:
 RAX=0x0007914355e8, RBX=0x058a,
 RCX=0x000791461b38, RDX=0x
 RSP=0x465259f0, RBP=0xf222b894,
 RSI=0x000791433f20, RDI=0x2b37ca60
 R8 =0xd0931f61, R9 =0xf2286ab2,
 R10=0x, R11=0x2aaabc00
 R12=0x, R13=0x465259f0,
 R14=0x0002, R15=0x2aaab80e2800
 RIP=0x2b37cbfa, EFLAGS=0x00010202,
 CSGSFS=0x0133, ERR=0x0004
  TRAPNO=0x000e

 Top of Stack: (sp=0x465259f0)
 0x465259f0:   00068a828dc8 000791433f20
 0x46525a00:   00079145ee60 0589058a


 On Wed, Sep 7, 2011 at 6:21 PM, Yang tedd...@gmail.com wrote:
 I started compaction using nodetool,
 then always reproducibly, I get a SEGV in a code that I added to the
 Cassandra code, which simply calls get_slice().

 have you seen SEGV associated with compaction? anyone could suggest a
 route on how to debug this?

 I filed a bug on sun website, right now the only possible approach I
 can try is to use another JDK


 Thanks
 Yang





 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



Re: SIGSEGV during compaction?

2011-09-07 Thread Yang
Hmm - all other things remaining the same, I put jna.jar onto the classpath,
and now it successfully completed a compaction without problems.

On Wed, Sep 7, 2011 at 10:06 PM, Yang tedd...@gmail.com wrote:
 thanks Jonathan.

 I tried openJdk too, same , filed bug to both Oracle and openJdk


 tried -XX:-UseCompressedOops , same SEGV

 Oracle bug site asks does it appear with -server and -Xint, I tried
 these options, so far no SEGV yet, maybe slower, but haven't measured
 exactly



 On Wed, Sep 7, 2011 at 8:56 PM, Jonathan Ellis jbel...@gmail.com wrote:
 You should report a bug to Oracle.

 In the meantime you could try turning off compressed oops -- that's
 been a source of a lot of GC bugs in the past.

 On Wed, Sep 7, 2011 at 8:22 PM, Yang tedd...@gmail.com wrote:
 some info in the debug file that JVM exported:

 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x2b37cbfa, pid=7236, tid=1179806016
 #
 # JRE version: 6.0_27-b07
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.2-b06 mixed mode
 linux-amd64 compressed oops)
 # Problematic frame:
 # J  
 com.cgm.whisky.filter.WithinLimitIpFrequencyCap.isValid(Lorg/apache/avro/specific/SpecificRecord;)Lcom/cgm/whisky/EventsFilter$ValidityCode;
 #
 # If you would like to submit a bug report, please visit:
 #   http://java.sun.com/webapps/bugreport/crash.jsp
 #

 ---  T H R E A D  ---

 Current thread (0x2aaab80e2800):  JavaThread pool-3-thread-8
 [_thread_in_Java, id=7669,
 stack(0x46426000,0x46527000)]

 siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR),
 si_addr=0x2aaabc00

 Registers:
 RAX=0x0007914355e8, RBX=0x058a,
 RCX=0x000791461b38, RDX=0x
 RSP=0x465259f0, RBP=0xf222b894,
 RSI=0x000791433f20, RDI=0x2b37ca60
 R8 =0xd0931f61, R9 =0xf2286ab2,
 R10=0x, R11=0x2aaabc00
 R12=0x, R13=0x465259f0,
 R14=0x0002, R15=0x2aaab80e2800
 RIP=0x2b37cbfa, EFLAGS=0x00010202,
 CSGSFS=0x0133, ERR=0x0004
  TRAPNO=0x000e

 Top of Stack: (sp=0x465259f0)
 0x465259f0:   00068a828dc8 000791433f20
 0x46525a00:   00079145ee60 0589058a


 On Wed, Sep 7, 2011 at 6:21 PM, Yang tedd...@gmail.com wrote:
 I started compaction using nodetool,
 then always reproducibly, I get a SEGV in a code that I added to the
 Cassandra code, which simply calls get_slice().

 have you seen SEGV associated with compaction? anyone could suggest a
 route on how to debug this?

 I filed a bug on sun website, right now the only possible approach I
 can try is to use another JDK


 Thanks
 Yang





 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com