Is reducing the number of vnodes to 64/32 likely to help our situation?
with just 3 nodes per datacenter reduce vnodes to 1.
What options do I have for achieving this in a live cluster?
You need to remove the node, move its data to the other two, and add it back with
a different vnode count.
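A rough sketch of that per-node cycle, assuming default data paths and an init
script named "cassandra" (node names, paths, and service commands are
placeholders; adjust to your install), repeated node by node with a repair at
the end:

  nodetool -h node3 decommission      # streams node3's ranges to the other two
  # on node3, once decommission finishes:
  sudo service cassandra stop
  rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/*
  # edit cassandra.yaml: set num_tokens: 1 (or a single initial_token)
  sudo service cassandra start        # node bootstraps back in with the new token count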
A workable configuration depends on your requirements. You need to develop
your own testing procedure.
How much data will you have?
What is your 95th percentile response time target?
Size of rows
Number of columns per row
Data growth rate
Data rewrite rate
Whether TTL expiration is used
Never aim for the "minimum"; Cassandra has huge memory consumption spikes.
With 2 GB RAM, be prepared for crashes, because it can hardly handle
peaks of increased memory consumption from compaction, validation, etc.
KVM works well only if you are using a recent version with virtio drivers
and the provider is not overselling memory. At a shared hosting provider you will not
be able
Basic OSGi integration is easy:
you need to get an OSGi-compatible container and hook it up to the Cassandra
daemon. It's very easy to do, about 5 lines.
The OSGi container can be accessed over the network; you need to deploy your
application into the container on each node and start it up. Then use some
RPC mecha
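For the "hook it up" part, a minimal sketch of what those few lines can look
like, assuming an OSGi R4.2+ framework such as Apache Felix is on the Cassandra
classpath (where you call this from the daemon, and which packages you export,
is up to you; exception handling omitted):

  import java.util.HashMap;
  import java.util.ServiceLoader;
  import org.osgi.framework.launch.Framework;
  import org.osgi.framework.launch.FrameworkFactory;

  // somewhere in the daemon startup path:
  HashMap<String, String> config = new HashMap<String, String>();
  // expose server packages that your bundles will need
  config.put("org.osgi.framework.system.packages.extra", "org.apache.cassandra.service");
  FrameworkFactory factory = ServiceLoader.load(FrameworkFactory.class).iterator().next();
  Framework osgi = factory.newFramework(config);
  osgi.start();   // your application bundles can now be installed and started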
What would be the way to do this with cassandra?
embed app into server, use OSGi.
On 25.7.2013 20:03, Andrew Cobley wrote:
Any idea on how I can go about pinpointing the problem to raise a JIRA issue ?
http://www.ehow.com/how_8705297_create-java-heap-dump.html
I'm wondering if it's a GC issue ?
yes it is:
1039280992 used; max is 1052770304
Most likely a memory leak.
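To confirm it, grab a heap dump while the heap is near its limit and open it in
a heap analyzer; for example, assuming a Sun/Oracle JDK and <pid> being the
Cassandra process:

  jmap -dump:format=b,file=/tmp/cassandra-heap.bin <pid>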
Cassandra 2.0 beta 2:
https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/2.0.0-beta2-tentative
> and as a small startup time is our most valuable resource…
use technology you are most familiar with.
From my limited experience I think Cassandra is a dangerous choice for
a young start-up with limited funding/experience that expects to scale fast.
It's not dangerous; just do not try to be smart, and follow what other big
Cassandra users like Twitter, Netflix, Facebook, etc. are using. If they
are st
On 16.7.2013 20:45, Robert Coli wrote:
On Fri, Jul 12, 2013 at 2:28 AM, Radim Kolar <h...@filez.com> wrote:
With very little work (less than 10 KB of code) it is possible
to have an online sstable splitter and to export this functionality
over JMX.
Are you volunteering?
My understanding is that it is not possible to change the number of
tokens after the node has been initialized.
That was my conclusion too. Vnodes currently do not bring any
noticeable benefits to outweigh the trouble. Shuffle is very slow in a large
cluster. Recovery is faster with vnodes, but I h
Is it possible to change num_tokens on a node with data?
I changed it and restarted the node, but it still shows the same count in nodetool
status.
With very little work (less than 10 KB of code) it is possible to have
an online sstable splitter and to export this functionality over JMX.
Without manual flush the CPU goes mad after a couple of hours on each
instance.
increase heap size
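The heap is set in conf/cassandra-env.sh; for example (values are illustrative,
size them to your hardware):

  MAX_HEAP_SIZE="8G"
  HEAP_NEWSIZE="800M"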
OpsCenter collects anonymous usage data and reports it back to
DataStax. For example, number of nodes, keyspaces, column families,
etc. Stat reporting isn't required to run OpsCenter however. To turn
this feature off, see the docs here (stat_reporter):
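For reference, as I understand the docs the switch lives in opscenterd.conf;
verify the exact section and key against your OpsCenter version:

  [stat_reporter]
  interval = 0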
You never informed the user that installing y
In case you do not know yet, OpsCenter is sending certain data about
your Cassandra installation back to DataStax.
This fact is not visibly presented to the user; it's the same spyware crap as
EHCache.
On 13.6.2013 8:19, Michal Michalski wrote:
It could be doable to do something when they get converted to
tombstone, but I don't think it's the use case you're looking for.
actually, this would be good enough for me
Reading the changelog for Eclipse Kepler (4.3): BIRT has support for
creating reports from Cassandra.
Could this error message be pointing at a proximate cause?
no
On 12.5.2013 2:28, Techy Teck wrote:
I am running Cassandra 1.2.2 in production. What kind of problems are you
talking about? Maybe I can get some root cause for why I am seeing bad read
performance with the Astyanax client in the production cluster.
No support for the full Cassandra 1.2 feature set.
No/bad sup
On 11.5.2013 21:36, Techy Teck wrote:
Is anyone using the Astyanax client in production, mainly for reading
purposes?
With Cassandra 1.1; it has problems with 1.2.
Do not use Cassandra for implementing a queueing system with high
throughput. It does not scale because of tombstone management. Use
HornetQ; it's an amazingly fast broker, but it has quite slow persistence if
you want to create queues significantly larger than your memory and use
selectors for search
If the dataset fits into memory, and the data used in the test almost fits into
memory, then Cassandra is slow compared to other leading NoSQL databases;
it can go up to a 10:1 ratio. Check the Infinispan benchmarks. A common usage
pattern is to put memcached on top of Cassandra.
Cassandra is good if you have way mor
http://www.slideshare.net/Couchbase/benchmarking-couchbase#btnNext
Apply the patch and recompile.
Define the "max_sstable_size" compaction strategy property on the CF you want to split,
then run compaction.
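With the patch applied, something along these lines in cassandra-cli should do
it (the property name comes from the patch, not stock Cassandra; the CF name,
keyspace, and size are placeholders):

  update column family mycf
    with compaction_strategy_options = {max_sstable_size: 256};

  nodetool -h localhost compact mykeyspace mycf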
From time to time people ask here about splitting large sstables; here is
a patch that does that:
https://issues.apache.org/jira/browse/CASSANDRA-4897
I would be careful with the patch that was referred to above, it
hasn't been reviewed, and from a glance it appears that it will cause
an infinite compaction loop if you get more than 4 SSTables at max size.
It will; you need to set the max sstable size correctly.
On 8.11.2012 19:12, B. Todd Burruss wrote:
my question is would leveled compaction help to get rid of the
tombstoned data faster than size tiered, and therefore reduce the disk
space usage?
Leveled compaction will kill your performance. Get the patch from JIRA for
maximum sstable size per CF.
On 29.10.2012 23:24, Stephen Pierce wrote:
I'm running 1.1.5; the bug says it's fixed in 1.0.9/1.1.0.
How can I check to see why it keeps running HintedHandoff?
You have a tombstone in system.HintsColumnFamily; use the list command in
cassandra-cli to check.
Is it possible to disable all sstable compaction node-wide? I can't find
anything suitable in the JMX console.
On 18.10.2012 20:06, Bryan Talbot wrote:
In a 4 node cluster running Cassandra 1.1.5 with sun jvm 1.6.0_29-b11
(64-bit), the nodes are often getting "stuck" in state where CMS
collections of the old space are constantly running.
you need more java heap memory
What if the first node in the range is down? Then -pr would be ineffective.
We have a paid tool capable of downgrading Cassandra 1.2, 1.1, 1.0, 0.8.
The repair process by itself is going well in the background, but the issue
I'm concerned about is the large number of unnecessary compaction tasks.
The number in the compaction tasks counter is overestimated. For example, I have
1100 tasks left, and if I stop inserting data, all tasks will finish
within 30 minutes.
I
Are there any tested patches around for fixing this issue in 1.0 branch?
I have to do a keyspace-wide flush every 30 seconds to survive a delete-only
workload. This is very inefficient.
https://issues.apache.org/jira/browse/CASSANDRA-3741
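The stopgap is roughly this, run from cron or a loop on each node (the keyspace
name is a placeholder):

  while true; do nodetool -h localhost flush mykeyspace; sleep 30; done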
If you have steps to reproduce, post them here
https://issues.apache.org/jira/browse/CASSANDRA-4643
This is the first version from the 1.1 branch I used in pre-production stress
testing, and I got a lot of the following errors: decorated key -1 != some number
INFO [CompactionExecutor:10] 2012-09-11 02:22:13,586
CompactionController.java (line 172) Compacting large row
system/HintsColumnFamily:67fd0f04ca32
i would migrate to 1.0 because 1.1 is highly unstable.
INFO [AntiEntropySessions:6] 2012-09-02 15:46:23,022
AntiEntropyService.java (line 663) [repair #%s] No neighbors to repair
with on range %s: session completed
you have RF=1, or too many nodes are down.
You looking for the author of Spring Data Cassandra?
https://github.com/boneill42/spring-data-cassandra
If so, I guess that is me. =)
Did you get in touch with the Spring guys? They have Cassandra support on
their Spring Data todo list. They might have some todo or feature list
they want to impl
Is the author of Spring-Cassandra here? I am interested in getting this
merged into upstream Spring. They have Cassandra support on their todo
list.
On 25.5.2012 2:41, Edward Capriolo wrote:
Also it does not sound like you have run anti entropy repair. You
should do that when upping rf.
I run anti-entropy repairs and they still do not fix the counters. I have some
reports from users with the same problem, but nobody has discovered a repeatable
scenario.
I was thinking about putting both the commit log and the data
directory on a software raid partition spanning over the two disks.
Would this increase the general read performance? In theory I could
get twice the read performance, but I don't know how the commit log
will influence the read per
are there ubuntu packages?
1) I assume that I have to call the loadNewSSTables() on each node?
Is this the same as "nodetool refresh"?
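If it is, the per-node invocation would just be (keyspace and CF names are
placeholders):

  nodetool -h <node> refresh mykeyspace mycf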
is "upgradesstables" required upon "update column family with
compression_options" (or "compaction_strategy") ?
For compaction strategy, no; not sure about the other one.
On 19.7.2012 15:07, cbert...@libero.it wrote:
Hi all, I have a problem with counters I'd like to solve before going in
production.
I also have a similar problem with counters, but I do not think that
anything can be done about it. The developers are not interested in
discovering what is wrong and
I do not have experience with other clients, only Hector. But timeout
management in Hector is really broken. If you expect your nodes to
time out often (for example, if you are using a WAN), better to try
something else first.
On 13.6.2012 11:29, Viktor Jevdokimov wrote:
I remember that join and decommission didn't work since they use
streaming. All the problems were due to path differences between Windows
and Linux styles.
What about using the Unix-style File.separator in the streaming protocol to
make it OS-independent?
Do not delete empty rows. It refreshes the tombstones and they will never expire.
On 26.3.2012 19:17, aaron morton wrote:
Can you describe the situations where counter updates are lost or go
backwards ?
Do you ever get TimedOutExceptions when performing counter updates ?
We got a few timeouts per day but not many, less than 10. I do not think
that timeouts will be the root c
On 19.5.2012 0:09, Gurpreet Singh wrote:
Thanks Radim.
Radim, actually 100 reads per second is achievable even with 2 disks.
it will become worse as rows will get fragmented.
But achieving them with a really low avg latency per key is the issue.
I am wondering if anyone has played with in
To get 100 random reads per second on a large dataset (100 GB) you need
more disks in RAID 0 than 2.
Better to add more nodes than to stick too many disks into one node. You
also need to adjust the IO scheduler in the OS.
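On Linux, for example, switching the data disks to the deadline scheduler is a
common adjustment (the device name is a placeholder; the setting does not
survive a reboot unless you persist it):

  echo deadline > /sys/block/sdb/queue/scheduler
  cat /sys/block/sdb/queue/scheduler    # shows the active scheduler in brackets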
Here is part of the log. The actual record is 419.
ponto:(admin)log/cassandra>grep "to maximum of 64" system.log.1
WARN [MemoryMeter:1] 2012-02-03 00:00:19,444 Memtable.java (line 181)
setting live ratio to maximum of 64 instead of 64.9096047648211
WARN [MemoryMeter:1] 2012-02-08 00:00:17,379 Memtabl
Try reducing memtable_total_space_in_mb config setting. If the problem
is incorrect memory metering that should help.
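That is a cassandra.yaml knob; for example (the value is only illustrative):

  memtable_total_space_in_mb: 1024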
It does not help much because the difference between the correct calculation and
what Cassandra assumes is way too high. It would require me to shrink
memtables to about 10% of their correct si
Are you experiencing memory pressure you think may be attributed to
memtables not being flushed frequently enough ?
yes
A delete workload especially is really good at OOMing Cassandra for some reason.
The live ratio calculation logic also needs to be changed, because it is based
on the assumption that workloads do not change.
Can you give an example of the sort of workload change you are
thinking of ?
I have 3 workload types running in batches: a delete-only workload, insert-only,
and heavy update (lots of
> There is 2T data on each server. Can someone give me some advice?
do not do it
The live ratio calculation should do nothing if the memtable has 0 columns. I did a manual
flush before this.
WARN [MemoryMeter:1] 2012-05-10 13:21:19,430 Memtable.java (line 181)
setting live ratio to maximum of 64 instead of Infinity
INFO [MemoryMeter:1] 2012-05-10 13:21:19,431 Memtable.java (line 186)
CFS(
Is Cassandra a fit for this use-case or should we just stick with the
oldskool MySQL and put things like votes, reviews etc in our C* store?
If all your data fits on one computer and you expect only tens of
millions of records in a table, then go for SQL. It has far more features, and
people are co
On 8.5.2012 15:25, Sam Tunnicliffe wrote:
I couldn't say categorically that this would cause the deleted data to reappear
in read results, but I can see how
it could do.
Do you think it can remove the long-standing problems with counters?
On 18.4.2012 16:22, Jonathan Ellis wrote:
It's not that simple, unless you have an append-only workload.
I have an append-only workload, and probably most people using TTL do too.
Any compaction pass over A will first convert the TTL data into tombstones.
Then, any subsequent pass that includes A *and all other sstables
containing rows with the same key* will drop the tombstones.
That's why I proposed attaching a TTL to the entire CF. Tombstones would not be
needed.
On 4.4.2012 6:52, Igor wrote:
Here is small python script I run once per day. You have to adjust
size and/or age limits in the 'if' operator. Also I use mx4j interface
for jmx calls.
forceUserDefinedCompaction would be more useful if you could run
compaction on 2 tables. If I run it on a sin
What is the method for undoing the effect of CASSANDRA-3989 (too many unnecessary
levels)? Running a major compaction or cleanup does nothing.
It would be really helpful if leveled compaction printed the level in the system log.
Demo:
INFO [CompactionExecutor:891] 2012-04-05 22:39:27,043
CompactionTask.java (line 113) Compacting ***LEVEL 1***
[SSTableReader(path='/var/lib/cassandra/data/rapidshare/querycache-hc-19690-Data.db'),
SSTableReader(
Will 1500 bytes row size be large or small for Cassandra from your
understanding?
Performance degradation starts at 500 MB rows; it's very slow if you hit
this limit.
What OS are you using?
FreeBSD 8.3 64-bit PRERELEASE
Would you please share what filesystem you are using?
ZFS v28
On 3.4.2012 23:04, i...@4friends.od.ua wrote:
If you know for sure that you will free a lot of space by compacting some
old table, then you can call UserDefinedCompaction for this table (you
can do this from cron). There is also a ticket in JIRA with discussion
on per-sstable expired column an
There is a problem with the size-tiered compaction design: it compacts
together tables of similar size.
Sometimes it can happen that you will have some sstables sitting on
disk forever (Feb 23) because no other similarly sized tables were created
and probably never will be, because a flushed sstable is abo
I'm also trying to evaluate different strategies for RAID0 as drive
for cassandra data storage. If I need 2T space to keep node tables,
which drive configuration is better: 1T x 2drives or 500G x 4drives?
more drives is always better.
Which stripe size is optimal?
smaller stripe sizes are
On 28.3.2012 13:14, Ross Black wrote:
Radim,
We are only deleting columns. *Rows are never deleted.*
I suggest changing the app to delete rows; try composite keys.
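For example, in cassandra-cli a composite row key can fold what used to be a
column component into the key, so deletes happen at the row level (names and
types here are illustrative, not from this thread):

  create column family messages
    with key_validation_class = 'CompositeType(UTF8Type, LongType)'
    and comparator = 'UTF8Type'
    and default_validation_class = 'BytesType';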
RAID0 would help me use more efficiently the total disk space available at each
node, but tests have shown that under write load it behaves much worse than
using separate data dirs, one per disk.
There are different strategies for how RAID0 splits reads; also try changing the IO
scheduler and filesystem.
On 27.3.2012 11:13, Ross Black wrote:
Any pointers on what I should be looking for in our application that
would be stopping the deletion of tombstones?
Do not delete already-deleted rows. On read, Cassandra returns deleted
rows as empty in range slices.
How can I fix this?
Add more data; 1.5M is not enough to get reliable reports.
On 26.3.2012 0:36, aaron morton wrote:
1. Is it not possible to run them more often? There should be some
limit: run the live/serialized calculation at least once per hour. They take
just a few seconds.
The live ratio is updated every time the operation count (since
startup) for the CF doubles.
On 26.3.2012 3:39, aaron morton wrote:
Can you please reproduce the fault using the --debug cqlsh command
option and file a bug report here:
https://issues.apache.org/jira/browse/CASSANDRA
https://issues.apache.org/jira/browse/CASSANDRA-4083
I was wrong; it fails on the first non-tombstoned row.
Scenario 4
T1 write column
T2 flush memtable to S1
T3 delete row
T4 flush memtable to S5
T5 tombstone in S5 expires
T6 S5 is compacted, but not with S1
Result?
I still have wrong results (I simulated an event 5 times and it was
counted 3 times by some counters, 4 or 5 times by others).
I also have wrong results with counters in 1.0.8; many times updates to a
counter column are just lost, and sometimes counters go backwards
even though our app uses only
Example:
T1 < T2 < T3
at T1 write column
at T2 delete row
at T3 > tombstone expiration, compact (T1 + T2) and drop the expired
tombstone
Will the column from T1 be alive again?
cqlsh> select * from whois.ipbans;
KEY,80.65.56.165
KEY,204.229.100.77
KEY,75.144.148.1
KEY,111.191.88.7
'int' object has no attribute 'replace'
cqlsh>
It's a counter CF:
create column family ipbans
with column_type = 'Standard'
and comparator = 'AsciiType'
and default_validation_class = '
I wonder why the memtable estimations are so bad.
1. Is it not possible to run them more often? There should be some limit:
run the live/serialized calculation at least once per hour. They take just
a few seconds.
2. Why not use data from FlusherWriter to update the estimations? The flusher
knows the number of ops a
During compaction of selected sstables Cassandra checks the whole Column
Family for the latest timestamp of the column/row, including other
sstables and memtable.
You are explaining that if I have an expired row tombstone and there exists a
later timestamp for this row, then that tombstone is not deleted?
On 20.3.2012 15:46, Jeremiah Jordan wrote:
You need to create the tombstone in case the data was inserted without a
timestamp at some point.
Yes, I figured that out too. It would help if you could assign a TTL to a whole
sstable. The most common use of TTL is for a cache, and there is most likely
dedic
On 19.3.2012 23:33, ruslan usifov wrote:
Do you run major compactions?
No, I do cleanups only. Major compactions kill my node with OOM.
On 19.3.2012 21:46, Caleb Rackliffe wrote:
I've been wondering about this too, but every column has both a
timestamp /and/ a TTL. Unless the timestamp is not preserved, there
should be no need to adjust the TTL, assuming the expiration time is
determined from these two variables.
timestam
On 19.3.2012 20:28, i...@4friends.od.ua wrote:
Hello
Data size should decrease during minor compactions. Check the logs for
compaction results.
They do, but not as much as I expect. Look at the sizes and file dates:
-rw-r--r-- 1 root wheel 5.4G Feb 23 17:03 resultcache-hc-27045-Data.db
-rw
I suspect that running cluster-wide repair interferes with TTL-based
expiration. I am running repair every 7 days and using a TTL expiration
time of 7 days too. Data is never deleted.
Stored data in Cassandra is always growing (I have been watching it for 3 months),
but it should not be. If I run a manual cleanu
On 2.3.2012 13:24, Watanabe Maki wrote:
How about truncating HintsColumnFamily and then executing nodetool repair as a
workaround?
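For reference, the workaround as I understand it (run a repair afterwards):

  # in cassandra-cli
  use system;
  truncate HintsColumnFamily;

  # then from the shell
  nodetool -h localhost repair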
I got this exception in the CLI. It's weird; all my nodes are UP and there is no
exception message in the server log.
[default@unknown] use system;
Authenticated to keyspace: sy
> If you need to dump a lot of data consider the Hadoop integration.
http://wiki.apache.org/cassandra/HadoopSupport It can run a bit faster
than going through the thrift api.
Does the Cassandra Hadoop integration read sstables directly instead of going
via Thrift?
On 2.3.2012 9:49, Maki Watanabe wrote:
Fixed in 1.0?
https://issues.apache.org/jira/browse/CASSANDRA-3176
That patch tests whether the sstable is empty before continuing HH delivery, but in
my case the table is not empty; it contains one tombstoned row.
Can something be done to remove these empty delivery attempts from the log?
It's just a tombstoned row.
[default@system] list HintsColumnFamily;
Using default limit of 100
---
RowKey: 00
1 Row Returned.
Elapsed time: 234 msec(s).
INFO [HintedHandoff:1] 2012-03-02 05:44:32,359
Hinte
> if a node goes down, it will take longer for commitlog replay.
Commit log replay time is insignificant. Most of the time during node startup
is wasted on index sampling. Index sampling here runs for about 15 minutes.
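The knob for that is index_interval in cassandra.yaml; raising it from the
default of 128 samples fewer keys, which shortens startup at the cost of
slightly slower key lookups. For example:

  index_interval: 512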
Is there a way in the OpsCenter GUI to make a node with an agent join a specified
cluster?
I am thinking about: click on the node, select join cluster, type the IP
address of an existing cluster member, and data will be replicated onto the new
node.
Are there plans to write a partitioner based on a faster hash algorithm instead
of MD5? I profiled Cassandra and a lot of time is spent inside the MD5
function.
On 3.2.2012 17:46, Jonathan Ellis wrote:
You should come up with a way to reproduce so we can fix it. :)
It happens after HH delivery when the memtable contains a lot of deletes.
but a ratio of < 1 may occur
for column families with a very high update-to-insert ratio.
Better to ask why the minimum ratio is 1.0. What harm can be done by using
a ratio < 1.0?
On 26.1.2012 2:32, David Carlton wrote:
How stable is 1.0 these days?
Good, but Hector 1.0 is unstable.
Anyway, I can't find any reason to limit minimum value of
phi_convict_threshold to 5. maki
In the real world you often want to have it at 9, because Cassandra is too
sensitive to an overloaded LAN; nodes flip up/down often and create
chaos in the cluster if you have a larger number of nodes (let's