RE: Compacted_at timestamp

2015-02-08 Thread Andreas Finke
I recently created a small script that converts this timestamp into a
human-readable string and sorts all entries in ascending order.

nodetool compactionhistory | awk '{timestamp = strftime("%a %b %e %H:%M:%S %Z %Y", $4 / 1000); in_m=$5/1024/1024; out_m=$6/1024/1024; printf("%s\t%s\t%s\t%s\t%dM\t%dM\n", $4, timestamp, $2, $3, in_m, out_m)}' | sort -n
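
For anyone who prefers Python over awk, here is a rough equivalent as a sketch only; it assumes the column order of nodetool compactionhistory output in 2.0.x (id, keyspace, table, compacted_at, bytes in, bytes out), the same fields the awk one-liner above uses:

import subprocess
from datetime import datetime

out = subprocess.check_output(['nodetool', 'compactionhistory']).decode()
rows = []
for line in out.splitlines():
    parts = line.split()
    if len(parts) < 6 or not parts[3].isdigit():
        continue  # skip the title, header and blank lines
    compacted_at = int(parts[3])                        # epoch milliseconds
    ts = datetime.fromtimestamp(compacted_at / 1000.0)  # human-readable time
    in_m = int(parts[4]) / 1024 / 1024
    out_m = int(parts[5]) / 1024 / 1024
    rows.append((compacted_at, ts, parts[1], parts[2], in_m, out_m))

# Sort ascending by the raw compacted_at value, like the `sort -n` above.
for compacted_at, ts, ks, cf, in_m, out_m in sorted(rows):
    print('%s\t%s\t%s\t%s\t%dM\t%dM' % (compacted_at, ts, ks, cf, in_m, out_m))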

Regards
Andi

From: Mark Reddy [mark.l.re...@gmail.com]
Sent: 08 February 2015 21:55
To: user@cassandra.apache.org
Subject: Re: Compacted_at timestamp

Hi Santo,

If you are seeing the compacted_at value as a timestamp and want to convert it to
a human-readable date, this is not possible via nodetool. You will need to write
a script that makes the compactionhistory call and then converts the output (the
fourth column - compacted_at) to a readable date.

If you are seeing something other than an expected timestamp value, can you 
post an example of what you are getting?


Regards,
Mark

On 8 February 2015 at 13:20, Havere Shanmukhappa, Santhosh
 wrote:
When I run the nodetool compactionhistory command, it displays the
'compacted_at' timestamp in a non-readable format. Is there any way to read that
column in a readable format? I am using C* version 2.0.11.

Thanks,
Santo



Adding more nodes causes performance problem

2015-02-08 Thread C . B .
I have a cluster with 3 nodes; the only keyspace has a replication factor of 3,
and the application reads/writes UUID-keyed data. I use CQL (cassandra-python),
most writes are done with execute_async, most reads are done with a consistency
level of ONE, and overall performance in this setup is better than I expected.

Then I tested a 6-node cluster and a 9-node cluster. The performance (both read
and write) got worse and worse. Roughly speaking, 6 nodes is about 2~3 times
slower than 3 nodes, and 9 nodes is about 5~6 times slower than 3 nodes. All
tests were run multiple times with the same data set, the same test program, and
the same client machines. I'm running Cassandra 2.1.2 with the default
configuration.

What I observed is that with 6 and 9 nodes, the Cassandra servers were doing OK
on IO, but CPU utilization was about 60%~70% higher than with 3 nodes.

I'd like suggestions on how to troubleshoot this, as it is totally against what
I have read, namely that Cassandra scales linearly.
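
For anyone trying to reproduce the test, a minimal sketch of the access pattern described above (async prepared writes plus consistency ONE reads) with the Python driver; the contact point, keyspace and table names are hypothetical:

import uuid
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['10.0.0.1'])            # hypothetical contact point
session = cluster.connect('my_keyspace')   # hypothetical keyspace

# Fire a batch of asynchronous writes, then wait for all of them so that
# driver-side errors are not silently dropped.
insert = session.prepare("INSERT INTO my_table (id, payload) VALUES (?, ?)")
futures = [session.execute_async(insert, (uuid.uuid4(), 'payload-%d' % i))
           for i in range(1000)]
for f in futures:
    f.result()

# Read back a single key at consistency level ONE.
read = SimpleStatement("SELECT payload FROM my_table WHERE id = %s",
                       consistency_level=ConsistencyLevel.ONE)
some_id = uuid.uuid4()                     # placeholder key
for row in session.execute(read, (some_id,)):
    print(row.payload)

This is only to make the workload concrete; it does not by itself explain the slowdown.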




Re: High GC activity on node with 4TB on data

2015-02-08 Thread Francois Richard
Hi Jiri,
We do run multiple nodes with 2TB to 4TB of data, and we usually see GC
pressure when we create a lot of tombstones.
With Cassandra 2.0.x you would see a log entry with the following pattern:
WARN [ReadStage:7] 2015-02-08 22:55:09,621 SliceQueryFilter.java (line 225) 
Read 939 live and 1017 tombstoned cells in SyncCore.ContactInformation (see 
tombstone_warn_threshold). 1000 columns was requested, slices=[-], 
delInfo={deletedAt=-9223372036854775808, localDeletion=2147483647}
This basically indicates that you have some major deletions for a given row.
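
To make that concrete, here is a toy sketch (Python driver; the keyspace and table names are hypothetical, not the SyncCore.ContactInformation table from the log above) of the kind of workload that produces such a warning: many per-row deletes inside one partition, followed by a wide slice read that has to skip the resulting tombstones.

from cassandra.cluster import Cluster

cluster = Cluster(['10.0.0.1'])            # hypothetical contact point
session = cluster.connect('my_keyspace')   # hypothetical keyspace

session.execute("""
    CREATE TABLE IF NOT EXISTS contact_information (
        user_id int,
        field   text,
        value   text,
        PRIMARY KEY (user_id, field)
    )""")

# Insert a few thousand clustering rows into one partition, then delete most
# of them, leaving one tombstone per deleted row.
for i in range(2000):
    session.execute("INSERT INTO contact_information (user_id, field, value) "
                    "VALUES (1, %s, %s)", ('field-%05d' % i, 'x'))
for i in range(1500):
    session.execute("DELETE FROM contact_information "
                    "WHERE user_id = 1 AND field = %s", ('field-%05d' % i,))

# Until compaction purges them (after gc_grace_seconds), this slice has to
# scan ~1500 tombstones to find its live rows, which is what the warning and
# tombstone_warn_threshold are about.
rows = session.execute("SELECT * FROM contact_information "
                       "WHERE user_id = 1 LIMIT 1000")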
Thanks,

FR
 From: Mark Reddy 
 To: user@cassandra.apache.org 
Cc: cassandra-u...@apache.org; FF Systems  
 Sent: Sunday, February 8, 2015 1:32 PM
 Subject: Re: High GC activity on node with 4TB on data
   
Hey Jiri, 
While I don't have any experience running 4TB nodes (yet), I would recommend 
taking a look at a presentation by Aaron Morton on large nodes: 
http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/
 to see if you can glean anything from that.
I would note that at the start of his talk he mentions that in version 1.2 we 
can now talk about nodes around 1 - 3 TB in size, so if you are storing 
anything more than that you are getting into very specialised use cases.
If you could provide us with some more information about your cluster setup 
(No. of CFs, read/write patterns, do you delete / update often, etc.) that may 
help in getting you to a better place.

Regards,
Mark


On 8 February 2015 at 21:10, Kevin Burton  wrote:

Do you have a lot of individual tables?  Or lots of small compactions?
I think the general consensus is that (at least for Cassandra), 8GB heaps are 
ideal.  
If you have lots of small tables it’s a known anti-pattern (I believe) because 
the Cassandra internals could do a better job of handling the in-memory 
metadata representation.
I think this has been improved in 2.0 and 2.1 though, so the fact that you’re on 
1.2.18 could exacerbate the issue.  You might want to consider an upgrade 
(though that has its own issues as well).
On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky  wrote:

Hi all,

we are seeing quite high GC pressure (in old space by CMS GC Algorithm)
on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory
(2G for new space). The node runs fine for a couple of days until the GC
activity starts to rise and reaches about 15% of the C* activity, which
causes dropped messages and other problems.

Taking a look at heap dump, there is about 8G used by SSTableReader
classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.

Is this something expected and we have just reached the limit of how
much data a single Cassandra instance can handle, or is it possible to
tune it better?

Regards
Jiri Horky




-- 
Founder/CEO Spinn3r.com
Location: San Francisco, CA
blog: http://burtonator.wordpress.com
… or check out my Google+ profile



  

Re: High GC activity on node with 4TB on data

2015-02-08 Thread Colin
The most data I put on a node with spinning disk is 1TB.

What are the machine specs (CPU, memory, etc.)? What is the read/write
pattern (heavy ingest rate / heavy read rate), and how long do you keep data in
the cluster?

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

> On Feb 8, 2015, at 2:44 PM, Jiri Horky  wrote:
> 
> Hi all,
> 
> we are seeing quite high GC pressure (in old space by CMS GC Algorithm)
> on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory
> (2G for new space). The node runs fine for a couple of days until the GC
> activity starts to rise and reaches about 15% of the C* activity, which
> causes dropped messages and other problems.
> 
> Taking a look at heap dump, there is about 8G used by SSTableReader
> classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.
> 
> Is this something expected and we have just reached the limit of how
> much data a single Cassandra instance can handle, or is it possible to
> tune it better?
> 
> Regards
> Jiri Horky


Re: High GC activity on node with 4TB on data

2015-02-08 Thread Mark Reddy
Hey Jiri,

While I don't have any experience running 4TB nodes (yet), I would
recommend taking a look at a presentation by Aaron Morton on large nodes:
http://planetcassandra.org/blog/cassandra-community-webinar-videoslides-large-nodes-with-cassandra-by-aaron-morton/
to see if you can glean anything from that.

I would note that at the start of his talk he mentions that in version 1.2
we can now talk about nodes around 1 - 3 TB in size, so if you are storing
anything more than that you are getting into very specialised use cases.

If you could provide us with some more information about your cluster setup
(No. of CFs, read/write patterns, do you delete / update often, etc.) that
may help in getting you to a better place.


Regards,
Mark

On 8 February 2015 at 21:10, Kevin Burton  wrote:

> Do you have a lot of individual tables?  Or lots of small compactions?
>
> I think the general consensus is that (at least for Cassandra), 8GB heaps
> are ideal.
>
> If you have lots of small tables it’s a known anti-pattern (I believe)
> because the Cassandra internals could do a better job of handling the
> in-memory metadata representation.
>
> I think this has been improved in 2.0 and 2.1 though, so the fact that
> you’re on 1.2.18 could exacerbate the issue.  You might want to consider an
> upgrade (though that has its own issues as well).
>
> On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky  wrote:
>
>> Hi all,
>>
>> we are seeing quite high GC pressure (in old space by CMS GC Algorithm)
>> on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory
>> (2G for new space). The node runs fine for a couple of days until the GC
>> activity starts to rise and reaches about 15% of the C* activity, which
>> causes dropped messages and other problems.
>>
>> Taking a look at heap dump, there is about 8G used by SSTableReader
>> classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.
>>
>> Is this something expected and we have just reached the limit of how
>> much data a single Cassandra instance can handle, or is it possible to
>> tune it better?
>>
>> Regards
>> Jiri Horky
>>
>
>
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
> 
>
>


Re: High GC activity on node with 4TB on data

2015-02-08 Thread Kevin Burton
Do you have a lot of individual tables?  Or lots of small compactions?

I think the general consensus is that (at least for Cassandra), 8GB heaps
are ideal.

If you have lots of small tables it’s a known anti-pattern (I believe)
because the Cassandra internals could do a better job of handling the
in-memory metadata representation.

I think this has been improved in 2.0 and 2.1 though, so the fact that
you’re on 1.2.18 could exacerbate the issue.  You might want to consider an
upgrade (though that has its own issues as well).

On Sun, Feb 8, 2015 at 12:44 PM, Jiri Horky  wrote:

> Hi all,
>
> we are seeing quite high GC pressure (in old space by CMS GC Algorithm)
> on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory
> (2G for new space). The node runs fine for a couple of days until the GC
> activity starts to rise and reaches about 15% of the C* activity, which
> causes dropped messages and other problems.
>
> Taking a look at heap dump, there is about 8G used by SSTableReader
> classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.
>
> Is this something expected and we have just reached the limit of how
> much data a single Cassandra instance can handle, or is it possible to
> tune it better?
>
> Regards
> Jiri Horky
>



-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile




Re: Compacted_at timestamp

2015-02-08 Thread Mark Reddy
Hi Santo,

If you are seeing the compacted_at value as a timestamp and want to convert
it to a human-readable date, this is not possible via nodetool. You will need
to write a script that makes the compactionhistory call and then converts
the output (the fourth column - compacted_at) to a readable date.

If you are seeing something other than an expected timestamp value, can you
post an example of what you are getting?


Regards,
Mark

On 8 February 2015 at 13:20, Havere Shanmukhappa, Santhosh <
santhosh_havereshanmukha...@intuit.com> wrote:

>  When I run the nodetool compactionhistory command, it displays the
> 'compacted_at' timestamp in a non-readable format. Is there any way to read
> that column in a readable format? I am using C* version 2.0.11.
>
>  Thanks,
> Santo
>


Fastest way to map/parallel read all values in a table?

2015-02-08 Thread Kevin Burton
What’s the fastest way to map/parallel read all values in a table?

Kind of like a mini map-only job.

I’m doing this to compute stats across our entire corpus.

What I did to begin with was use token() and then split it into the number
of splits I needed.

So I just took the total key range, which is -2^63 to 2^63 - 1, and
broke it into N parts.

Then the queries come back as:

select * from mytable where token(primaryKey) >= x and token(primaryKey) < y

From reading on this list I thought this was the correct way to handle this
problem.

However, I’m seeing horrible performance doing this.  After about 1% it
just flat out locks up.

Could it be that I need to randomize the token order so that it’s not
contiguous?  Maybe it’s all mapping on the first box to begin with.
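
For what it's worth, a minimal sketch of that splitting approach with the Python driver; the contact point, keyspace and table/column names are hypothetical, and it assumes the full Murmur3 token range of -2^63 to 2^63 - 1 mentioned above:

from cassandra.cluster import Cluster

MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1
N_SPLITS = 256                              # arbitrary number of splits

def token_ranges(n):
    """Yield n contiguous (start, end) slices covering the token range."""
    step = (MAX_TOKEN - MIN_TOKEN) // n
    for i in range(n):
        start = MIN_TOKEN + i * step
        # Note: the strict '<' upper bound below skips the single max token.
        end = MAX_TOKEN if i == n - 1 else start + step
        yield start, end

cluster = Cluster(['10.0.0.1'])             # hypothetical contact point
session = cluster.connect('my_keyspace')    # hypothetical keyspace

query = ("SELECT * FROM mytable "
         "WHERE token(primaryKey) >= %s AND token(primaryKey) < %s")
for start, end in token_ranges(N_SPLITS):
    for row in session.execute(query, (start, end)):
        pass                                # accumulate stats here

Shuffling the list of splits before processing them (random.shuffle) is a cheap way to test the "everything maps to the first box to begin with" theory above.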



-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile




High GC activity on node with 4TB on data

2015-02-08 Thread Jiri Horky
Hi all,

we are seeing quite high GC pressure (in old space by CMS GC Algorithm)
on a node with 4TB of data. It runs C* 1.2.18 with 12G of heap memory
(2G for new space). The node runs fine for a couple of days until the GC
activity starts to rise and reaches about 15% of the C* activity, which
causes dropped messages and other problems.

Taking a look at heap dump, there is about 8G used by SSTableReader
classes in org.apache.cassandra.io.compress.CompressedRandomAccessReader.

Is this something expected and we have just reached the limit of how
much data a single Cassandra instance can handle, or is it possible to
tune it better?

Regards
Jiri Horky


Re: Mutable primary key in a table

2015-02-08 Thread Colin Clark
No need for CAS in my suggestion - I would try to avoid the use of CAS if at 
all possible.  

It’s better in a distributed environment to reduce dimensionality and isolate 
write/read paths (event sourcing and CQRS patterns).

Also, just in general, changing the primary key on an update is usually 
considered a bad idea and is simply not even permitted by most RDBMSs.
—
Colin Clark
co...@clark.ws
+1 320-221-9531
skype colin.p.clark

> On Feb 8, 2015, at 4:16 PM, Eric Stevens  wrote:
> 
> It sounds like changing user names is the kind of thing which doesn't happen 
> often, in which case you probably don't have to worry too much about the 
> additional overhead of using logged batches (not like you're going to be 
> doing hundreds to thousands of these per second).  You probably also want to 
> look into conditional updates (search for Compare And Set - CAS) to help 
> avoid collisions when creating or renaming users.
> 
> Colin's suggestion of using a surrogate key for the primary key on the user 
> table is also a good idea, but you'll still want to use CAS to help maintain 
> the integrity of your data.  Note that CAS has a similar overhead to logged 
> batches in that it also involves a Paxos round.  So keep the number of 
> statements in either CAS or logged batches as minimal as possible.
> 
> On Sun, Feb 8, 2015 at 7:17 AM, Colin  wrote:
> Another way to do this is to use a time based uuid for the primary key 
> (partition key) and to store the user name with that uuid.
> 
> In addition, you'll need 2 additional tables, one that is used to get the uuid 
> by user name and another to track user name changes over time which would be 
> organized by uuid, and user name (cluster on the name).
> 
> This pattern is referred to as an inverted index and provides a lot of power 
> and flexibility once mastered.  I use it all the time with cassandra - in 
> fact, to be successful with cassandra, it might actually be a requirement!
> 
> --
> Colin Clark 
> +1 612 859 6129 
> Skype colin.p.clark
> 
> On Feb 8, 2015, at 8:08 AM, Jack Krupansky  wrote:
> 
>> What is your full primary key? Specifically, what is the partition key, as 
>> opposed to clustering columns?
>> 
>> The point is that the partition key for a row is hashed to determine the 
>> token for the partition, which in turn determines which node of the cluster 
>> owns that partition. Changing the partition key means that potentially the 
>> partition would need to be "moved" to another node, which is clearly not 
>> something that Cassandra would do since the core design of Cassandra is that 
>> all operations should be blazingly fast and to refrain from offering slow 
>> features.
>> 
>> I would recommend that your application:
>> 
>> 1. Read the existing user data
>> 2. Create a new user, using the existing user data.
>> 3. Update the old user row to indicate that it is no longer a valid user. 
>> Actually, you will have to decide on an application policy for old user 
>> names. For example, can they be reused, or are they locked, or... whatever.
>> 
>> 
>> -- Jack Krupansky
>> 
>> On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal  wrote:
>> 
>> On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens  wrote:
>> I'm struggling to think of a model where it makes sense to update a primary 
>> key as a typical operation.  It suggests, as Adil said, that you may be 
>> reasoning wrong about your data model.  Maybe you can explain your problem 
>> in more detail - what kind of thing has you updating your PK on a regular 
>> basis?
>> 
>> I have a 'user' table which has a column called 'user_name' and other 
>> columns like name, city etc. The application requires that user_name be 
>> unique and user should be searchable by 'user_name'. The only way to do this 
>> in C* would be to make user_name column primary key. Things get trickier 
>> when there is a requirement which says that user_name can be changed by the 
>> users of the application. This a distributed application which mean that it 
>> runs on multiple nodes. If I have to change user_name atomically then either 
>> I need to implement distributed locking or use something C* provides.   
>> 
>> 
> 





Re: Mutable primary key in a table

2015-02-08 Thread Eric Stevens
It sounds like changing user names is the kind of thing which doesn't
happen often, in which case you probably don't have to worry too much about
the additional overhead of using logged batches (not like you're going to
be doing hundreds to thousands of these per second).  You probably also
want to look into conditional updates (search for Compare And Set - CAS) to
help avoid collisions when creating or renaming users.

Colin's suggestion of using a surrogate key for the primary key on the user
table is also a good idea, but you'll still want to use CAS to help
maintain the integrity of your data.  Note that CAS has a similar overhead
to logged batches in that it also involves a Paxos round.  So keep the
number of statements in either CAS or logged batches as minimal as possible.
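
A minimal sketch of the CAS part with the Python driver, assuming a hypothetical users_by_name lookup table (user_name text PRIMARY KEY, user_id timeuuid) along the lines Colin describes below; the INSERT ... IF NOT EXISTS runs a Paxos round and only applies if nobody else holds the name:

import uuid
from cassandra.cluster import Cluster

cluster = Cluster(['10.0.0.1'])             # hypothetical contact point
session = cluster.connect('my_keyspace')    # hypothetical keyspace

new_name = 'santo'                          # name we want to reserve
user_id = uuid.uuid1()                      # timeuuid surrogate key

rows = list(session.execute(
    "INSERT INTO users_by_name (user_name, user_id) "
    "VALUES (%s, %s) IF NOT EXISTS",
    (new_name, user_id)))

# The first column of a conditional statement's result is the [applied] flag.
if not rows[0][0]:
    raise RuntimeError("user_name %r is already taken" % new_name)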

On Sun, Feb 8, 2015 at 7:17 AM, Colin  wrote:

> Another way to do this is to use a time based uuid for the primary key
> (partition key) and to store the user name with that uuid.
>
> In addition, you'll need 2 additional tables, one that is used to get the
> uuid by user name and another to track user name changes over time which
> would be organized by uuid, and user name (cluster on the name).
>
> This pattern is referred to as an inverted index and provides a lot of
> power and flexibility once mastered.  I use it all the time with cassandra
> - in fact, to be successful with cassandra, it might actually be a
> requirement!
>
> --
> *Colin Clark*
> +1 612 859 6129
> Skype colin.p.clark
>
> On Feb 8, 2015, at 8:08 AM, Jack Krupansky 
> wrote:
>
> What is your full primary key? Specifically, what is the partition key, as
> opposed to clustering columns?
>
> The point is that the partition key for a row is hashed to determine the
> token for the partition, which in turn determines which node of the cluster
> owns that partition. Changing the partition key means that potentially the
> partition would need to be "moved" to another node, which is clearly not
> something that Cassandra would do since the core design of Cassandra is
> that all operations should be blazingly fast and to refrain from offering
> slow features.
>
> I would recommend that your application:
>
> 1. Read the existing user data
> 2. Create a new user, using the existing user data.
> 3. Update the old user row to indicate that it is no longer a valid user.
> Actually, you will have to decide on an application policy for old user
> names. For example, can they be reused, or are they locked, or... whatever.
>
>
> -- Jack Krupansky
>
> On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal  wrote:
>
>>
>> On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens  wrote:
>>
>>> I'm struggling to think of a model where it makes sense to update a
>>> primary key as a typical operation.  It suggests, as Adil said, that you
>>> may be reasoning wrong about your data model.  Maybe you can explain your
>>> problem in more detail - what kind of thing has you updating your PK on a
>>> regular basis?
>>>
>>> I have a 'user' table which has a column called 'user_name' and other
>> columns like name, city etc. The application requires that user_name be
>> unique and a user should be searchable by 'user_name'. The only way to do
>> this in C* would be to make the user_name column the primary key. Things get
>> trickier when there is a requirement which says that user_name can be
>> changed by the users of the application. This is a distributed application,
>> which means that it runs on multiple nodes. If I have to change user_name
>> atomically then either I need to implement distributed locking or use
>> something C* provides.
>>
>>
>


Re: Mutable primary key in a table

2015-02-08 Thread Colin
Another way to do this is to use a time based uuid for the primary key 
(partition key) and to store the user name with that uuid.

In addition, you'll need 2 additional tables, one that is used to get the uuid 
by user name and another to track user name changes over time which would be 
organized by uuid, and user name (cluster on the name).

This pattern is referred to as an inverted index and provides a lot of power 
and flexibility once mastered.  I use it all the time with cassandra - in fact, 
to be successful with cassandra, it might actually be a requirement!
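
To make the layout concrete, here is a hedged sketch of the three tables described above, via the Python driver; all keyspace, table and column names are hypothetical:

from cassandra.cluster import Cluster

cluster = Cluster(['10.0.0.1'])             # hypothetical contact point
session = cluster.connect('my_keyspace')    # hypothetical keyspace

# Main table, keyed by an immutable timeuuid surrogate key.
session.execute("""
    CREATE TABLE IF NOT EXISTS users (
        user_id   timeuuid PRIMARY KEY,
        user_name text,
        city      text
    )""")

# Inverted index: resolve a user_name to the surrogate key.
session.execute("""
    CREATE TABLE IF NOT EXISTS users_by_name (
        user_name text PRIMARY KEY,
        user_id   timeuuid
    )""")

# History of name changes per user, clustered by user_name.
session.execute("""
    CREATE TABLE IF NOT EXISTS user_name_history (
        user_id    timeuuid,
        user_name  text,
        changed_at timestamp,
        PRIMARY KEY (user_id, user_name)
    )""")

A rename then touches users, users_by_name (delete the old entry, insert the new one) and user_name_history, while user_id itself never changes.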

--
Colin Clark 
+1 612 859 6129
Skype colin.p.clark

> On Feb 8, 2015, at 8:08 AM, Jack Krupansky  wrote:
> 
> What is your full primary key? Specifically, what is the partition key, as 
> opposed to clustering columns?
> 
> The point is that the partition key for a row is hashed to determine the 
> token for the partition, which in turn determines which node of the cluster 
> owns that partition. Changing the partition key means that potentially the 
> partition would need to be "moved" to another node, which is clearly not 
> something that Cassandra would do since the core design of Cassandra is that 
> all operations should be blazingly fast and to refrain from offering slow 
> features.
> 
> I would recommend that your application:
> 
> 1. Read the existing user data
> 2. Create a new user, using the existing user data.
> 3. Update the old user row to indicate that it is no longer a valid user. 
> Actually, you will have to decide on an application policy for old user 
> names. For example, can they be reused, or are they locked, or... whatever.
> 
> 
> -- Jack Krupansky
> 
>> On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal  wrote:
>> 
>>> On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens  wrote:
>>> I'm struggling to think of a model where it makes sense to update a primary 
>>> key as a typical operation.  It suggests, as Adil said, that you may be 
>>> reasoning wrong about your data model.  Maybe you can explain your problem 
>>> in more detail - what kind of thing has you updating your PK on a regular 
>>> basis?
>> 
>> I have a 'user' table which has a column called 'user_name' and other 
>> columns like name, city etc. The application requires that user_name be 
>> unique and a user should be searchable by 'user_name'. The only way to do this 
>> in C* would be to make the user_name column the primary key. Things get trickier 
>> when there is a requirement which says that user_name can be changed by the 
>> users of the application. This is a distributed application, which means that it 
>> runs on multiple nodes. If I have to change user_name atomically then either 
>> I need to implement distributed locking or use something C* provides.   
> 


Re: Mutable primary key in a table

2015-02-08 Thread Jack Krupansky
What is your full primary key? Specifically, what is the partition key, as
opposed to clustering columns?

The point is that the partition key for a row is hashed to determine the
token for the partition, which in turn determines which node of the cluster
owns that partition. Changing the partition key means that potentially the
partition would need to be "moved" to another node, which is clearly not
something that Cassandra would do since the core design of Cassandra is
that all operations should be blazingly fast and to refrain from offering
slow features.

I would recommend that your application:

1. Read the existing user data
2. Create a new user, using the existing user data.
3. Update the old user row to indicate that it is no longer a valid user.
Actually, you will have to decide on an application policy for old user
names. For example, can they be reused, or are they locked, or... whatever.
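
As a hedged sketch of those three steps with the Python driver, assuming Ajaya's user_name-keyed users table plus a hypothetical city column and a hypothetical active flag, and using a logged batch (discussed elsewhere in this thread) to keep steps 2 and 3 together:

from cassandra.cluster import Cluster
from cassandra.query import BatchStatement

cluster = Cluster(['10.0.0.1'])             # hypothetical contact point
session = cluster.connect('my_keyspace')    # hypothetical keyspace

old_name, new_name = 'santo', 'santhosh'    # example values

# 1. Read the existing user data.
row = list(session.execute(
    "SELECT city FROM users WHERE user_name = %s", (old_name,)))[0]

# 2. Create the new user row and 3. retire the old one, in one logged batch.
batch = BatchStatement()
batch.add("INSERT INTO users (user_name, city, active) VALUES (%s, %s, true)",
          (new_name, row.city))
batch.add("UPDATE users SET active = false WHERE user_name = %s", (old_name,))
session.execute(batch)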


-- Jack Krupansky

On Sun, Feb 8, 2015 at 1:48 AM, Ajaya Agrawal  wrote:

>
> On Sun, Feb 8, 2015 at 5:03 AM, Eric Stevens  wrote:
>
>> I'm struggling to think of a model where it makes sense to update a
>> primary key as a typical operation.  It suggests, as Adil said, that you
>> may be reasoning wrong about your data model.  Maybe you can explain your
>> problem in more detail - what kind of thing has you updating your PK on a
>> regular basis?
>>
>> I have a 'user' table which has a column called 'user_name' and other
> columns like name, city etc. The application requires that user_name be
> unique and a user should be searchable by 'user_name'. The only way to do
> this in C* would be to make the user_name column the primary key. Things get
> trickier when there is a requirement which says that user_name can be
> changed by the users of the application. This is a distributed application,
> which means that it runs on multiple nodes. If I have to change user_name
> atomically then either I need to implement distributed locking or use
> something C* provides.
>
>


Compacted_at timestamp

2015-02-08 Thread Havere Shanmukhappa, Santhosh
When I run the nodetool compactionhistory command, it displays the
'compacted_at' timestamp in a non-readable format. Is there any way to read that
column in a readable format? I am using C* version 2.0.11.

Thanks,
Santo