cassandra-stress on 3.0 with column widths benchmark.

2015-09-13 Thread Kevin Burton
I’m trying to benchmark two scenarios…

10 columns with 150 bytes each

vs

150 columns with 10 bytes each.

The total row “size” would be 1500 bytes (ignoring overhead).

Our app uses 150 columns so I’m trying to see if packing it into a JSON
structure using one column would improve performance.
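The packing idea can be sketched in Python (field names and sizes below are hypothetical; the real app's schema isn't shown here):

```python
import json

# Hypothetical illustration: 150 fields of ~10 bytes each, stored either
# as 150 separate columns or packed into one JSON text column.
fields = {"field_%03d" % i: "x" * 10 for i in range(150)}

# One-column variant: a single JSON blob holding all 150 values.
packed = json.dumps(fields)

# The raw payload is identical either way: 150 values * 10 bytes = 1500 bytes.
raw_bytes = sum(len(v) for v in fields.values())
print(raw_bytes)                # 1500
print(len(packed) > raw_bytes)  # True: JSON adds key/quoting overhead
```

The trade-off being measured: one fat column means one cell of per-column storage overhead instead of 150, at the cost of losing per-field reads and JSON serialization overhead.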

I seem to have confirmed my hypothesis.

I’m running two tests:

./tools/bin/cassandra-stress write -insert -col n=FIXED\(10\) size=FIXED\(150\) | tee cassandra-stress-10-150.log

time ./tools/bin/cassandra-stress write -insert -col n=FIXED\(150\) size=FIXED\(10\) | tee cassandra-stress-150-10.log

This shows that the "op rate" is much lower when running with 150
columns:

root@util0063 ~/apache-cassandra-3.0.0-beta2 # grep "op rate"
> cassandra-stress-10-150.log
> op rate   : 7632 [WRITE:7632]
> op rate   : 11851 [WRITE:11851]
> op rate   : 31967 [WRITE:31967]
> op rate   : 41798 [WRITE:41798]
> op rate   : 51251 [WRITE:51251]
> op rate   : 58057 [WRITE:58057]
> op rate   : 62977 [WRITE:62977]
> op rate   : 65398 [WRITE:65398]
> op rate   : 67673 [WRITE:67673]
> op rate   : 69198 [WRITE:69198]
> op rate   : 70402 [WRITE:70402]
> op rate   : 71019 [WRITE:71019]
> op rate   : 71574 [WRITE:71574]
> root@util0063 ~/apache-cassandra-3.0.0-beta2 # grep "op rate"
> cassandra-stress-150-10.log
> op rate   : 2570 [WRITE:2570]
> op rate   : 5144 [WRITE:5144]
> op rate   : 10906 [WRITE:10906]
> op rate   : 11832 [WRITE:11832]
> op rate   : 12471 [WRITE:12471]
> op rate   : 12915 [WRITE:12915]
> op rate   : 13620 [WRITE:13620]
> op rate   : 13456 [WRITE:13456]
> op rate   : 13916 [WRITE:13916]
> op rate   : 14029 [WRITE:14029]
> op rate   : 13915 [WRITE:13915]


… what’s weird here is that both tests take about 10 minutes, yet the op
rate reported for the second one is much lower. Why would that be? That
doesn’t make much sense…
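For what it's worth, cassandra-stress reports op rate as operations per second, so two runs of equal wall-clock duration can legitimately show very different rates if they complete different numbers of (differently sized) operations. A sketch with made-up op totals:

```python
# op rate = completed operations / elapsed seconds.  Both runs lasted
# about 10 minutes; the totals below are invented for illustration only
# (they are NOT taken from the logs above).
elapsed_s = 10 * 60

ops_run1 = 42_000_000   # 10 columns x 150 bytes: many small-cell-count ops
ops_run2 = 8_400_000    # 150 columns x 10 bytes: each op writes 15x the cells

rate1 = ops_run1 / elapsed_s
rate2 = ops_run2 / elapsed_s

# Same duration, different op counts -> different op rates.
print(rate1, rate2)  # 70000.0 14000.0
```

In other words, a lower op rate over the same duration just means fewer operations completed, which is expected when each operation touches 15x as many cells.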

-- 

Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile



Re: Best strategy for hiring from OSS communities.

2015-09-13 Thread Kevin Burton
I think j...@apache.org is dead…

I saw this:

http://mail-archives.apache.org/mod_mbox/community-dev/201304.mbox/%3CCAKQbXgAgO_3SzLMR0L4p_qkSALQzE=ehpnbmjndccu6dtm-...@mail.gmail.com%3E

And can’t find any documentation on a j...@apache.org

I think it would be valuable to create one.  Maybe I should post to general@
…

On Fri, Sep 11, 2015 at 5:34 PM, Otis Gospodnetić <
otis.gospodne...@gmail.com> wrote:

> Hey Kevin - I think there is j...@apache.org
>
> Otis
> --
> Monitoring * Alerting * Anomaly Detection * Centralized Log Management
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Thu, Aug 13, 2015 at 6:02 PM, Kevin Burton  wrote:
>
>> Mildly off topic but we are looking to hire someone with Cassandra
>> experience..
>>
>> I don’t necessarily want to spam the list though.  We’d like someone from
>> the community who contributes to Open Source, etc.
>>
>> Are there forums for Apache / Cassandra, etc. for jobs? I couldn’t find
>> one.
>>



Re: Using DTCS, TTL but old SSTables not being removed

2015-09-13 Thread Phil Budne
Jeff Jirsa wrote:
> 2.2.1 has a pretty significant bug in compaction: 
> https://issues.apache.org/jira/browse/CASSANDRA-10270
>
> That prevents it from compacting files after 60 minutes. It may or
> may not be the cause of the problem you're seeing, but it seems like
> it may be possibly related, and you can try the workaround in that
> ticket to see if it helps.

Thanks! So far, so good.

I've tweaked index_summary_resize_interval_in_minutes
from 60 to -1, restarted, and I'm continuing to see:

. CompactionController.java:153 - Dropping expired SSTable 

debug messages(*) more than an hour after the restart.

I'll know better as time goes by

(*) I've set the logging level for
org.apache.cassandra.db.compaction.DateTieredCompactionStrategy and
org.apache.cassandra.db.compaction.CompactionController to DEBUG
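For reference, the logger settings from the footnote can be made in conf/logback.xml (a sketch; the surrounding file layout may differ between Cassandra versions):

```xml
<!-- conf/logback.xml: enable DEBUG for the compaction classes named above -->
<logger name="org.apache.cassandra.db.compaction.DateTieredCompactionStrategy" level="DEBUG"/>
<logger name="org.apache.cassandra.db.compaction.CompactionController" level="DEBUG"/>
```

On 2.1+ it should also be possible to change these at runtime, without a restart, via `nodetool setlogginglevel <fully.qualified.Class> DEBUG`.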


Re: Using DTCS, TTL but old SSTables not being removed

2015-09-13 Thread Jeff Jirsa
2.2.1 has a pretty significant bug in compaction: 
https://issues.apache.org/jira/browse/CASSANDRA-10270

That prevents it from compacting files after 60 minutes. It may or may not be 
the cause of the problem you’re seeing, but it seems like it may be possibly 
related, and you can try the workaround in that ticket to see if it helps.





On 9/13/15, 10:54 AM, "Phil Budne"  wrote:

>Running Cassandra 2.2.1 on 3 nodes (on EC2, from Datastax AMI, then
>upgraded).  Inserting time-series data; All entries with TTL to expire
>3 hours after the "actual_time" of the observation.  Entries arrive
>with varied delay, and often in duplicate. Data is expiring (no longer
>visible from CQL), but old SSTables are not being removed (except on
>restart).

Re: Upgrade Limitations Question

2015-09-13 Thread Vasileios Vlachos
Any thoughts anyone?
On 9 Sep 2015 20:09, "Vasileios Vlachos"  wrote:

> Hello All,
>
> I've asked this on the Cassandra IRC channel earlier, but I am asking the
> list as well so that I get feedback from more people.
>
> We have recently upgraded from Cassandra 1.2.19 to 2.0.16 and we are
> currently in the stage where all boxes are running 2.0.16 but nodetool
> upgradesstables has not yet been performed on all of them. Reading the
> DataStax docs [1] :
>
>- Do not issue these types of queries during a rolling restart: DDL,
>TRUNCATE
>
> In our case the restart bit has already been done. Do you know if it would
> be a bad idea to create a new KS before all nodes have upgraded their
> SSTables? Our concern is the time it takes to go through every single node,
> run the upgradesstables and wait until it's all done. We think creating a
> new KS wouldn't be a problem (someone on the channel said the same thing,
> but recommended that we play safe and wait until it's all done). But if
> anyone has any catastrophic experiences in doing so we would appreciate
> their input.
>
> Many thanks,
> Vasilis
>
> [1]
> http://docs.datastax.com/en/upgrade/doc/upgrade/cassandra/upgradeCassandraDetails.html
>


Using DTCS, TTL but old SSTables not being removed

2015-09-13 Thread Phil Budne
Running Cassandra 2.2.1 on 3 nodes (on EC2, from Datastax AMI, then
upgraded).  Inserting time-series data; All entries with TTL to expire
3 hours after the "actual_time" of the observation.  Entries arrive
with varied delay, and often in duplicate. Data is expiring (no longer
visible from CQL), but old SSTables are not being removed (except on
restart).
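Since the TTL is pegged to "actual_time" rather than insert time, each write presumably computes a remaining TTL along these lines (a sketch of the idea, not the actual ingest code):

```python
import time

def remaining_ttl(actual_time_epoch, now=None, lifetime_s=3 * 3600):
    """Seconds of TTL left so the row expires 3 hours after actual_time."""
    if now is None:
        now = time.time()
    # Late-arriving entries get a shorter TTL; anything already past
    # actual_time + 3h gets 0 (i.e. should not be written at all).
    return max(0, int(actual_time_epoch + lifetime_s - now))

# Example with the epoch shown later in this post: an observation from
# 2 hours ago gets a 1-hour TTL.
now = 1442166347
print(remaining_ttl(now - 2 * 3600, now=now))  # 3600
```

Duplicates arriving with different delays would thus carry different TTLs for the same row, which is consistent with the heavily tombstoned SSTables below.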

CREATE KEYSPACE thing
WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '2'}
AND durable_writes = true;

CREATE TABLE thing.thing_ia (
    id int,
    actual_time timestamp,
    data text,
    PRIMARY KEY (id, actual_time)
) WITH CLUSTERING ORDER BY (actual_time ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
    AND comment = ''
    AND compaction = {'tombstone_threshold': '0.1', 'tombstone_compaction_interval': '600', 'class': 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 60
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99.0PERCENTILE';

All times shown in UTC:

$ python -c 'import time; print int(time.time())'
1442166347

$ date
Sun Sep 13 17:46:19 UTC 2015

$ cat ~/mmm.sh
for x in la-*Data.db; do
    ls -l $x
    ~/meta.sh $x >/tmp/mmm/$x
    head < /tmp/mmm/$x
    echo
    grep Ances /tmp/mmm/$x
    echo ''
done

$ sh ~/mmm.sh 
-rw-r--r-- 1 cassandra cassandra 31056032 Sep 12 05:41 la-203-big-Data.db
SSTable: /raid0/cassandra/data/thing.thing_ia-.../la-203-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1442025790163000
Maximum timestamp: 1442034620451000
SSTable max local deletion time: 1442045239
Compression ratio: -1.0
Estimated droppable tombstones: 0.946418951062831
SSTable Level: 0
Repaired at: 0

Ancestors: [202]
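A quick check using the numbers just shown for la-203 confirms that this SSTable should be fully droppable: its max local deletion time plus gc_grace_seconds is well in the past.

```python
# Values taken from the la-203 metadata above and the table definition.
now = 1442166347                  # current epoch from `python -c ...` above
max_local_deletion = 1442045239   # la-203-big-Data.db
gc_grace_seconds = 60             # from the table definition

fully_expired = max_local_deletion + gc_grace_seconds < now
hours_past = (now - max_local_deletion) / 3600.0

print(fully_expired)         # True
print(round(hours_past, 1))  # 33.6 -- i.e. expired well over a day ago
```

So the data is long past droppable; the question is why compaction never gets around to dropping it, which is where the CASSANDRA-10270 workaround comes in.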

-rw-r--r-- 1 cassandra cassandra 23647585 Sep 12 06:09 la-204-big-Data.db
SSTable: /raid0/cassandra/data/thing.thing_ia-.../la-204-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1442034620472000
Maximum timestamp: 1442038188419002
SSTable max local deletion time: 1442073136
Compression ratio: -1.0
Estimated droppable tombstones: 0.9163514458998852
SSTable Level: 0
Repaired at: 0

Ancestors: []

-rw-r--r-- 1 cassandra cassandra 23456946 Sep 12 07:25 la-205-big-Data.db
SSTable: /raid0/cassandra/data/thing.thing_ia-.../la-205-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1442038188472000
Maximum timestamp: 1442042703834001
SSTable max local deletion time: 1442053303
Compression ratio: -1.0
Estimated droppable tombstones: 0.9442594560554178
SSTable Level: 0
Repaired at: 0

Ancestors: []

-rw-r--r-- 1 cassandra cassandra 23331024 Sep 12 08:11 la-206-big-Data.db
SSTable: /raid0/cassandra/data/thing.thing_ia-.../la-206-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1442042703845000
Maximum timestamp: 1442045482391000
SSTable max local deletion time: 1442056194
Compression ratio: -1.0
Estimated droppable tombstones: 0.922422134865437
SSTable Level: 0
Repaired at: 0

Ancestors: []

-rw-r--r-- 1 cassandra cassandra 23699494 Sep 12 09:11 la-207-big-Data.db
SSTable: /raid0/cassandra/data/thing.thing_ia-.../la-207-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1442045482398001
Maximum timestamp: 144204909216
SSTable max local deletion time: 1442059681
Compression ratio: -1.0
Estimated droppable tombstones: 0.9327568753815364
SSTable Level: 0
Repaired at: 0

Ancestors: []

-rw-r--r-- 1 cassandra cassandra 23900518 Sep 12 10:11 la-208-big-Data.db
SSTable: /raid0/cassandra/data/thing.thing_ia-.../la-208-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1442049092164001
Maximum timestamp: 1442052684468000
SSTable max local deletion time: 1442063293
Compression ratio: -1.0
Estimated droppable tombstones: 0.9249749035769007
SSTable Level: 0
Repaired at: 0

Ancestors: []

-rw-r--r-- 1 cassandra cassandra 24471823 Sep 12 11:08 la-209-big-Data.db
SSTable: /raid0/cassandra/data/thing.thing_ia-.../la-209-big
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Bloom Filter FP chance: 0.01
Minimum timestamp: 1442052684479001
Maximum timestamp: 144205610535
SSTable max local deletion time: 1442066673
Compression ratio: -1.0
Estimated droppable tombstones: 0.8992881460848035
SSTable Level: 0
Repaired at: 0

Ancestors: []

-rw-r--r-- 1 cassandra cassa