Re: Java GC pauses, reality check

2016-11-26 Thread Graham Sanderson
It was removed in the 3.0.x line, but not in the 3.x line (post 9472) as far as I can tell. It looks to be available in 3.11 and in 3.X branches > On Nov 26, 2016, at 1:17 PM, Oleksandr Shulgin > wrote: > > On Nov 26, 2016 20:04, "Graham Sanderson" <mailto:gra..

Re: Java GC pauses, reality check

2016-11-26 Thread Graham Sanderson
ra/browse/CASSANDRA-10969> when we restarted some nodes for other reasons. > On Nov 26, 2016, at 12:07 AM, Oleksandr Shulgin > wrote: > > On Nov 25, 2016 23:47, "Graham Sanderson" <mailto:gra...@vast.com>> wrote: > If you are seeing 25-30 second GC paus

Re: Java GC pauses, reality check

2016-11-25 Thread Graham Sanderson
If you are seeing 25-30 second GC pauses then (unless you are so badly configured) you are seeing full GC under CMS (though G1 may have similar problems). With CMS, eventual fragmentation causing promotion failure is inevitable (unless you cycle your nodes before it happens). Either your heap has way too

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Graham Sanderson
ct ... where organization_id = x, to get all employees in a > particular organization? > > And, this will put all those employees in the same node, right? > > On Sun, Oct 9, 2016 at 9:17 AM, Graham Sanderson <mailto:gra...@vast.com>> wrote: > Nomenclature is t

Re: Do partition keys create skinny or wide rows?

2016-10-08 Thread Graham Sanderson
Nomenclature is tricky, but PRIMARY KEY((organization_id, employee_id)) will make organization_id, employee_id the partition key which equates roughly to your latter sentence (I’m not sure about the 4 billion limit - that may be the new actual limit, but probably not a good idea). > On Oct 8, 2
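A rough illustration of the difference, as a Python sketch rather than anything Cassandra actually runs: rows land in the same partition only when every column of the partition key matches. The sample rows and the helper name group_into_partitions are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows; only the two key columns matter here.
SAMPLE_ROWS = [
    {"organization_id": 1, "employee_id": 10, "name": "a"},
    {"organization_id": 1, "employee_id": 11, "name": "b"},
    {"organization_id": 2, "employee_id": 10, "name": "c"},
]


def group_into_partitions(rows, partition_columns):
    """Group rows by the tuple of values in the partition key columns."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in partition_columns)
        partitions[key].append(row)
    return dict(partitions)


# PRIMARY KEY((organization_id, employee_id)): one skinny partition per employee.
skinny = group_into_partitions(SAMPLE_ROWS, ["organization_id", "employee_id"])

# PRIMARY KEY(organization_id, employee_id): one wide partition per organization.
wide = group_into_partitions(SAMPLE_ROWS, ["organization_id"])
```

With the composite partition key each employee is its own partition, so fetching all employees of an organization is no longer a single-partition read.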

Re: JVM safepoints, mmap, and slow disks

2016-10-08 Thread Graham Sanderson
a lot of cache and TLB misses > with out prefetching though. > > There is a system call to page the memory in which might be better for > larger reads. Still no guarantee things stay cached though. > > Ariel > > > On Sat, Oct 8, 2016, at 08:21 PM, Graham Sanderson wrote: >

Re: JVM safepoints, mmap, and slow disks

2016-10-08 Thread Graham Sanderson
I haven’t studied the read path that carefully, but there might be a spot at the C* level rather than JVM level where you could effectively do a JNI touch of the mmap region you’re going to need next. > On Oct 8, 2016, at 7:17 PM, Graham Sanderson wrote: > > We don’t use Azul’s Zin

Re: JVM safepoints, mmap, and slow disks

2016-10-08 Thread Graham Sanderson
We don’t use Azul’s Zing, but it does have the nice feature that all threads don’t have to reach safepoints at the same time. That said we make heavy use of Cassandra (with off heap memtables - not directly related but allows us a lot more GC headroom) and SOLR where we switched to mmap because

Re: large system hint partition

2016-09-19 Thread Graham Sanderson
The reason for large partitions is that the partition key is just the uuid of the target node. More recent versions (I think 2.2) don't have this problem since they write hints to the file system as per the commit log. Sadly the large partitions make things worse when you are hinting, hence presumably und

Re: Blog post on Cassandra's inner workings and performance - feedback welcome

2016-07-10 Thread Graham Sanderson
2 )”why more memory makes things worse” - I’d be interested to see you argue that - it really isn’t true with big boxes. (but yes off heap is good) - we run 24 gig JVMs with 8g new gen and never see more than a second or so STW and that is rare (but we do have a lot of -XX: options) > On Jul 9, 2

Re: cassandra full gc too often

2015-12-31 Thread Graham Sanderson
If you are lucky that might mask the real issue, but I doubt it… that is an insane number of compaction tasks and indicative of another problem. I would check release notes of 2.0.6+, if I recall that was not a stable version and may have had leaks. Aside from that, just FYI, if you use native_

Re: Cassandra Tuning Issue

2015-12-06 Thread Graham Sanderson
What version of C* are you using; what JVM version - you showed a partial GC config but if that is still CMS (not G1) then you are going to have insane GC pauses... Depending on C* versions are you using on/off heap memtables and what type Those are the sorts of issues related to fat nodes; I'

Re: Behavior difference between 2.0 and 2.1

2015-12-03 Thread Graham Sanderson
You didn’t specify which version of 2.0 you were on. There were a number of inconsistencies with static columns fixed in 2.0.10 for example CASSANDRA-7490, and CASSANDRA-7455, but there were others, and the same bugs may have caused a bunch of other issues. It very much depends exactly how you

Re: why cassanra max is 20000/s on a node ?

2015-11-05 Thread Graham Sanderson
Also it sounds like you are reading the data from a single file - the problem could easily be with your load tool; try (as someone suggested) using cassandra-stress > On Nov 5, 2015, at 9:06 PM, Graham Sanderson wrote: > > Agreed too. It also matters what you are inserting… i

Re: why cassanra max is 20000/s on a node ?

2015-11-05 Thread Graham Sanderson
Agreed too. It also matters what you are inserting… if you are inserting to the same (or small set of) partition key(s) you will be limited because writes to the same partition key on a single node are atomic and isolated. > On Nov 5, 2015, at 8:49 PM, Venkatesh Arivazhagan > wrote: > > I agr

Re: compression cpu overhead

2015-11-03 Thread Graham Sanderson
On read or write? https://issues.apache.org/jira/browse/CASSANDRA-7039 and friends in 2.2 should make some difference, I didn’t immediately find perf numbers though. > On Nov 3, 2015, at 5:42 PM, Dan Kinder wrote: > > Hey all, > > Just w

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Graham Sanderson
, > delivering Apache Cassandra to the world’s most innovative enterprises. > Datastax is built to be agile, always-on, and predictably scalable to any > size. With more than 500 customers in 45 countries, DataStax is the database > technology and transactional backbone of choice fo

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Graham Sanderson
you didn’t say what you upgraded from, but if it is 2.0.x, then look at CASSANDRA-9504 If so and you use commitlog_sync: batch Then you probably want to set commitlog_sync_batch_window_in_ms: 1 (or 2) Note I’m only slightly convinced this is the cause because of your READ_REPAIR issues (though i

Re: High cpu usage when the cluster is idle

2015-10-24 Thread Graham Sanderson
I would imagine you are running on fairly slow machines (given the CPU usage), but 2.0.12 and 2.1 use a fairly old version of the yammer/codahale metrics library. It is waking up every 5 seconds, and updating Meters… there are a bunch of these Meters per table (embedded in Timers), so your larg

Re: unusual GC log

2015-10-20 Thread Graham Sanderson
What version of C* are you running? any special settings in cassandra.yaml; are you running with stock GC settings in cassandra-env.sh? what JDK/OS? > On Oct 19, 2015, at 11:40 PM, 曹志富 wrote: > > INFO [Service Thread] 2015-10-20 10:42:47,854 GCInspector.java:252 - ParNew > GC in 476ms. CMS O

Re: BEWARE https://issues.apache.org/jira/browse/CASSANDRA-9504

2015-10-19 Thread Graham Sanderson
issue > On Oct 19, 2015, at 11:37 AM, Graham Sanderson wrote: > > - commitlog_sync_batch_window_in_ms behavior has changed from the > maximum time to wait between fsync to the minimum time. We are > working on making this more user-friendly (see CASSANDRA-9533) but in the >

Re: BEWARE https://issues.apache.org/jira/browse/CASSANDRA-9504

2015-10-19 Thread Graham Sanderson
from starving. The suggested default is now 2ms. was added retroactively to NEWS.txt in 2.1.6 which is why it is not obvious > On Oct 19, 2015, at 11:03 AM, Michael Shuler wrote: > > On 10/19/2015 10:55 AM, Graham Sanderson wrote: >> If you had Cassandra 2.0.x (possibly before)

BEWARE https://issues.apache.org/jira/browse/CASSANDRA-9504

2015-10-19 Thread Graham Sanderson
If you had Cassandra 2.0.x (possibly before) and upgraded to Cassandra 2.1, you may have had commitlog_sync: batch commitlog_sync_batch_window_in_ms: 25 in your cassandra.yaml. It turned out that this was pretty much broken in 2.0 (i.e. fsyncs just happened immediately), but fixed in 2.1, which

Re: Realtime data and (C)AP

2015-10-11 Thread Graham Sanderson
things like integrate Zipkin tracing at a driver level, and add other utility > like token aware batches, and concurrent token aware batch selects. > > On Sat, Oct 10, 2015 at 2:49 PM Graham Sanderson <mailto:gra...@vast.com>> wrote: > Cool - yeah we are still on astyanax

Re: Realtime data and (C)AP

2015-10-10 Thread Graham Sanderson
ed the Java driver's DowngradingConsistencyRetryPolicy for that in > cases where it makes sense. > > Ref: > http://docs.datastax.com/en/drivers/java/2.1/com/datastax/driver/core/policies/DowngradingConsistencyRetryPolicy.html > > Steve > > > >> On Fri, Oct 9, 2015 at 6:06

Re: Realtime data and (C)AP

2015-10-09 Thread Graham Sanderson
from my iPhone > On Oct 9, 2015, at 8:02 PM, Graham Sanderson wrote: > > Most of our writes are not user facing so local_quorum is good... We also > read at local_quorum because we prefer guaranteed consistency... But we very > quickly fall back to local_one in the cases where so

Re: Realtime data and (C)AP

2015-10-09 Thread Graham Sanderson
Most of our writes are not user facing so local_quorum is good... We also read at local_quorum because we prefer guaranteed consistency... But we very quickly fall back to local_one in the cases where some data fast is better than a failure. Currently we do that on a per read basis but we could
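A minimal sketch of that per-read fallback, in plain Python. It assumes a hypothetical execute(query, consistency_level) callable and a hypothetical ReadFailed exception standing in for whatever unavailable/timeout errors a real driver raises; neither is a real driver API.

```python
class ReadFailed(Exception):
    """Hypothetical stand-in for a driver's unavailable or timeout error."""


def read_with_fallback(execute, query, levels=("LOCAL_QUORUM", "LOCAL_ONE")):
    """Try each consistency level in order; return (result, level_used).

    `execute` is an assumed callable taking (query, consistency_level).
    """
    last_error = None
    for level in levels:
        try:
            return execute(query, level), level
        except ReadFailed as error:
            # Remember the failure and retry at the next, weaker level.
            last_error = error
    if last_error is None:
        raise ValueError("levels must not be empty")
    raise last_error
```

Returning the level that actually served the read lets the caller decide whether possibly stale data is acceptable for that request.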

Re: addition of nodes with auth enabled on a datacenter causes existing nodes to loose their permissions

2015-10-01 Thread Graham Sanderson
You are seeing https://issues.apache.org/jira/browse/CASSANDRA-9519 > On Oct 1, 2015, at 9:16 PM, K F wrote: > > Hi, > > I have 3 DCs out of which in one of the DC, I added 20 nodes. All of the DCs > had auth enabled, it was functioning

Re: Running Cassandra on Java 8 u60..

2015-09-27 Thread Graham Sanderson
IMHO G1 is still buggy on JDK8 (based solely on being subscribed to the gc-dev mailing list)… I think JDK9 will be the one. > On Sep 25, 2015, at 7:14 PM, Stefano Ortolani wrote: > > I think those were referring to Java7 and G1GC (early versions were buggy). > > Cheers, > Stefano > > > On Fr

Re: To batch or not to batch: A question for fast inserts

2015-09-27 Thread Graham Sanderson
We are about to prototype upgrading our batch inserts, so I’m really glad about this thread… we are able to saturate our dedicated network links from hadoop when inserting via thrift API (Astyanax) - at the time we wrote that code CQL wasn’t there. Reasons to replace our current solution: 1) W

Re: High CPU usage on some of nodes

2015-09-11 Thread Graham Sanderson
> > > On Thu, Sep 10, 2015 at 12:00 PM, Graham Sanderson <mailto:gra...@vast.com>> wrote: > Haven’t been following this thread, but we run beefy machines with 8gig new > gen, 12 gig old gen (down from 16g since moving memtables off heap, we can > probably go lower)…

Re: High CPU usage on some of nodes

2015-09-10 Thread Graham Sanderson
Haven’t been following this thread, but we run beefy machines with 8gig new gen, 12 gig old gen (down from 16g since moving memtables off heap, we can probably go lower)… Apart from making sure you have all the latest -XX: flags from cassandra-env.sh (and MALLOC_ARENA_MAX), I personally would r

Re: Slow performance because of used-up "Waste" in AtomicBTreeColumns

2015-07-23 Thread Graham Sanderson
Multiple writes to a single partition key are guaranteed to be atomic. Therefore there has to be some protection. First rule of thumb, don’t write at insanely high rates to the same partition key concurrently (you can probably avoid this, but hints as currently implemented suffer because the p

Re: Bulk loading performance

2015-07-13 Thread Graham Sanderson
Ironically in my experience the fastest ways to get data into C* are considered “anti-patterns” by most (but I have no problem saturating multiple gigabit network links if I really feel like inserting fast) It’s been a while since I tried some of the newer approaches though (my fast load code i

Re: What are problems with schema disagreement

2015-07-02 Thread graham sanderson
What version of C* are you running? Some versions of 2.0.x might occasionally fail to propagate schema changes in a timely fashion (though they would fix themselves eventually - in the order of a few minutes) > On Jul 2, 2015, at 9:37 PM, John Wong wrote: > > Hi. > > Here is a schema disagree

Re: How to measure disk space used by a keyspace?

2015-07-01 Thread graham sanderson
If you are pushing metric data to graphite, there is org.apache.cassandra.metrics.keyspace..LiveDiskSpaceUsed.value … for each node; Easy enough to graph the sum across machines. Metrics/JMX are tied together in C*, so there is an equivalent value exposed via JMX… I don’t know what it is called

Re: Cassandra 2.2, 3.0, and beyond

2015-06-11 Thread graham sanderson
I think the point is that 2.2 will replace 2.1.x + (i.e. the done/safe bits of 3.0 are included in 2.2).. so 2.2.x and 2.1.x are somewhat synonymous. > On Jun 11, 2015, at 8:14 PM, Mohammed Guller wrote: > > Considering that 2.1.6 was just released and it is the first “stable” release > ready

Re: Question about consistency in cassandra 2.0.9

2015-06-11 Thread graham sanderson
It looks (I’m guessing with entirely not enough info) that you only have two nodes in DC4, and are probably writing at QUORUM reading at LOCAL_ONE. But please specify your configuration > On Jun 11, 2015, at 7:01 PM, K F wrote: > > Hi, > > I am running a cassandra cluster with 4 dcs. Out of 4

Re: Throttle Heavy Read / Write Loads

2015-06-05 Thread Graham Sanderson
Are you doing large batch inserts via thrift - you need to be careful there Sent from my iPhone > On Jun 4, 2015, at 11:37 PM, Anishek Agarwal wrote: > > may be just increase the read and write timeouts at cassandra currently at 5 > sec i think. i think the datastax java client driver provides

Re: 10000+ CF support from Cassandra

2015-06-01 Thread graham sanderson
> > I strongly advise against this approach. > Jon, I think so too. But do you actually foresee any problems with this > approach? > I can think of a few. [I want to evaluate if we can live with this problem] Just to be clear, I’m not saying this is a great approach, I AM saying that it may be be

Re: GC pauses affecting entire cluster.

2015-06-01 Thread graham sanderson
Yes native_objects is the way to go… you can tell if memtables are your problem because you’ll see promotion failures of objects sized 131074 dwords. If your h/w is fast enough make your young gen as big as possible - we can collect 8G in sub second always, and this gives you your best chance of

Re: 10000+ CF support from Cassandra

2015-05-28 Thread Graham Sanderson
Depending on your use case and data types (for example if you can have a minimally nested JSON representation of the objects; then you could go with a common map representation where keys are top level object fields and values are valid JSON literals as strings; e.g. unquoted primitives, quoted str

Re: 10000+ CF support from Cassandra

2015-05-26 Thread graham sanderson
Are the CFs different, or all the same schema? Are you contractually obligated to actually separate data into separate CFs? It seems like you’d have a lot simpler time if you could use the part of the partition key to separate data. Note also, I don’t know what disks you are using, but disk cach

cassandra 2.2

2015-05-11 Thread graham sanderson
I think vast may have changed the release schedule of cassandra. I talk a lot with one of their key developers, and 3.0 was going to drop off heap memtables for several releases due to a rewrite of the storage engine to be more CQL friendly. 2.2 will take all of the improvements in 3.0 but not

DateTieredCompactionStrategy and static columns

2015-04-30 Thread graham sanderson
I have a potential use case I haven’t had a chance to prototype yet, which would normally be a good candidate for DTCS (i.e. data delivered in order and a fixed TTL), however with every write we’d also be updating some static cells (namely a few key/values in a static map CQL column). There coul

Re: Uderstanding Read after update

2015-04-13 Thread Graham Sanderson
Yes it will look in each sstable that according to the bloom filter may have data for that partition key and use time stamps to figure out the latest version (or none in case of newer tombstone) to return for each clustering key Sent from my iPhone > On Apr 12, 2015, at 11:18 PM, Anishek Agarwa
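A toy model of that merge, as a Python sketch only: each sstable is modelled as a dict from clustering key to a cell, a value of None marks a tombstone, and the newest timestamp wins per clustering key. The Cell type and merge_partition name are made up, and the sketch ignores bloom filters and Cassandra's tie-breaking rules for equal timestamps.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[str]  # None models a tombstone
    timestamp: int        # write timestamp


def merge_partition(sstables: Iterable[Dict[str, Cell]]) -> Dict[str, str]:
    """Return the live value per clustering key, newest timestamp winning."""
    latest: Dict[str, Cell] = {}
    for sstable in sstables:
        for clustering_key, cell in sstable.items():
            current = latest.get(clustering_key)
            if current is None or cell.timestamp > current.timestamp:
                latest[clustering_key] = cell
    # A newer tombstone hides the key entirely.
    return {key: cell.value for key, cell in latest.items() if cell.value is not None}
```

For example, merging one sstable holding an old value and a second holding a newer tombstone for the same clustering key returns nothing for that key.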

Re: Huge number of sstables after adding server to existing cluster

2015-04-04 Thread graham sanderson
I understand correctly 32 is the max > number for sstables for normally operating cassandra node? > > > Best regards > Mantas > > On Sat, Apr 4, 2015 at 4:47 AM, graham sanderson <mailto:gra...@vast.com>> wrote: > As does 2.1.3 > >> On

Re: Astyanax Thrift Frame Size Hardcoded - Breaks Ring Describe

2015-04-03 Thread graham sanderson
It is very stable for us; we don’t use it in many cases (generally older stuff where it was the best choice), but I think it is a little harsh to write it off > On Apr 3, 2015, at 1:55 PM, Robert Coli wrote: > > On Fri, Apr 3, 2015 at 11:16 AM, Eric Stevens > wrote: >

Re: Huge number of sstables after adding server to existing cluster

2015-04-03 Thread graham sanderson
As does 2.1.3 > On Apr 3, 2015, at 5:36 PM, Robert Coli wrote: > > On Fri, Apr 3, 2015 at 1:04 PM, Thomas Borg Salling > wrote: > I agree with Pranay. I have experienced exactly the same on C* 2.1.2. > > 2.1.2 had a serious bug which resulted in extra files, whic

Re: Disastrous profusion of SSTables

2015-03-26 Thread graham sanderson
you may be seeing https://issues.apache.org/jira/browse/CASSANDRA-8860 https://issues.apache.org/jira/browse/CASSANDRA-8635 related issues (which ends up with excessive numbers of sstab

Re: What are the reasons for holding off on 2.1.x at this point?

2015-03-09 Thread graham sanderson
2.1.3 has a few memory leaks/issues, resource management race conditions. That is horribly vague, however looking at some of the fixes in 2.1.4 I’d be tempted to wait on that. 2.1.3 is fine for testing though. > On Mar 9, 2015, at 6:42 PM, Jacob Rhoden wrote: > > I notice some of the discussi

Re: Upgrade from 2.0.9 to 2.1.3

2015-03-06 Thread graham sanderson
2015, at 3:15 PM, Robert Coli wrote: > > On Fri, Mar 6, 2015 at 6:25 AM, graham sanderson <mailto:gra...@vast.com>> wrote: > I would definitely wait for at least 2.1.4 > > +1 > > https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/ >

Re: best practices for time-series data with massive amounts of records

2015-03-06 Thread graham sanderson
Note that using static column(s) for the “head” value, and trailing TTLed values behind is something we’re considering. Note this is especially nice if your head state includes say a map which is updated by small deltas (individual keys) We have not yet studied the effect of static columns on s

Re: Upgrade from 2.0.9 to 2.1.3

2015-03-06 Thread graham sanderson
I would definitely wait for at least 2.1.4 > On Mar 6, 2015, at 8:13 AM, Fredrik Larsson Stigbäck > wrote: > > So no upgradeSSTables are required? > /Fredrik > >> 6 mar 2015 kl. 15:11 skrev Carlos Rolo > >: >> >> I would not recommend an upgrade to 2.1.x for now. Do y

Re: OOM and high SSTables count

2015-03-04 Thread graham sanderson
We can confirm a problem on 2.1.3 (sadly our beta sstable state obviously did not match our production ones in some critical way) We have about 20k sstables on each of 6 nodes right now; actually a quick glance shows 15k of those are from OpsCenter, which may have something to do with beta/prod

Re: Fastest way to map/parallel read all values in a table?

2015-02-09 Thread graham sanderson
Depending on whether you have deletes/updates, if this is an ad-hoc thing, you might want to just read the ss tables directly. > On Feb 9, 2015, at 12:56 PM, Kevin Burton wrote: > > I had considered using spark for this but: > > 1. we tried to deploy spark only to find out that it was missing

Re: No schema agreement from live replicas?

2015-02-03 Thread graham sanderson
What version of C* are you using; you could be seeing https://issues.apache.org/jira/browse/CASSANDRA-7734 which I think affects 2.0.7 thru 2.0.10 > On Feb 3, 2015, at 9:47 AM, Clint Kelly wrote: > > FWIW increasing the threshold for with

Re: Versioning in cassandra while indexing ?

2015-01-21 Thread graham sanderson
I believe you can use “USING TIMESTAMP XXX” with your inserts which will set the actual cell write times to the timestamp you provide. Then at least on read you’ll get the “latest” value… you may or may not incur an actual write of the old data to disk, but either way it’ll get cleaned up for yo

Re: Startup failure (Core dump) in Solaris 11 + JDK 1.8.0

2015-01-13 Thread graham sanderson
This might well be https://issues.apache.org/jira/browse/CASSANDRA-8325 try the latest patch for that if you can. > On Jan 13, 2015, at 4:50 AM, Bernardino Mota > wrote: > > Hi, > > Yes, with JDK1.7 it works but only in 32bits mode. It

Re: Error when dropping keyspaces; One row required, 0 found

2014-12-02 Thread graham sanderson
I don’t know what it is but I also saw “empty” keyspaces via CQL while migrating an existing test cluster from 2.0.9 to 2.1.0 (final release bits prior to labelling). Since I was doing this manually (and had cqlsh problems due to python change) I figured it might have been me. My observation w

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-11-28 Thread graham sanderson
Nov 28, 2014, at 6:54 PM, graham sanderson wrote: > > Your GC settings would be helpful, though you can see guesstimate by > eyeballing (assuming settings are the same across all 4 images) > > Bursty load can be a big cause of old gen fragmentation (as small working set >

Re: Nodes get stuck in crazy GC loop after some time, leading to timeouts

2014-11-28 Thread graham sanderson
Your GC settings would be helpful, though you can see guesstimate by eyeballing (assuming settings are the same across all 4 images). Bursty load can be a big cause of old gen fragmentation (as small working set objects tend to get spilled (promoted) along with memtable slabs which aren’t flush

Re: Trying to build Cassandra for FreeBSD 10.1

2014-11-17 Thread graham sanderson
Only thing I can see from looking at the exception, is that it looks like - I didn’t disassemble the code from hex - that the “peer” value in the RefCountedMemory object is probably 0 Given that Unsafe.allocateMemory should not return 0 even on allocation failure (which should throw OOM) - thou

Re: What actually causing java.lang.OutOfMemoryError: unable to create new native thread

2014-11-10 Thread graham sanderson
First question are you running 32bit or 64bit… on 32bit you can easily run out of virtual address space for thread stacks. > On Nov 10, 2014, at 8:25 AM, Jason Wee wrote: > > Hello people, below is an extraction from cassandra system log. > > ERROR [Thread-273] 2012-04-10 16:33:18,328 Abstract

Re: Why is one query 10 times slower than the other?

2014-11-05 Thread graham sanderson
In your “lookup_code” example “type” is not a clustering column, it is the partition key, and hence the first query only hits one partition. The second query is a range slice across all possible keys, so the sub-ranges are farmed out to nodes with the data. You are likely at CL_ONE, so it only needs re

Re: Client-side compression, cassandra or both?

2014-11-03 Thread graham sanderson
I wouldn’t do both. Unless a little server CPU or (and you’d have to measure it - I imagine it is probably not significant - as you say C* has more context, and hopefully most things can compress “0, “ repeatedly) disk space are an issue, I wouldn’t bother to compress yourself. Compression acros

Re: Intermittent long application pauses on nodes

2014-10-31 Thread graham sanderson
Dan van Kley <mailto:dvank...@salesforce.com>> wrote: > Excellent, thanks for the tips, Graham. I'll give SafepointTimeout a try and > see if that gives us anything to act on. > > On Fri, Oct 24, 2014 at 3:52 PM, graham sanderson <mailto:gra...@vast.com>>

Re: Intermittent long application pauses on nodes

2014-10-24 Thread graham sanderson
And -XX:SafepointTimeoutDelay=xxx to set how long before it dumps output (defaults to 1 I believe)… Note it doesn’t actually timeout by default, it just prints the problematic threads after that time and keeps on waiting > On Oct 24, 2014, at 2:44 PM, graham sanderson wrote: > >

Re: Intermittent long application pauses on nodes

2014-10-24 Thread graham sanderson
Actually - there is -XX:+SafepointTimeout which will print out offending threads (assuming you reach a 10 second pause)… That is probably your best bet. > On Oct 24, 2014, at 2:38 PM, graham sanderson wrote: > > This certainly sounds like a JVM bug. > > We are running C*

Re: Intermittent long application pauses on nodes

2014-10-24 Thread graham sanderson
This certainly sounds like a JVM bug. We are running C* 2.0.9 on pretty high end machines with pretty large heaps, and don’t seem to have seen this (note we are on 7u67, so that might be an interesting data point, though since the old thread predated that probably not) 1) From the app/java side

Re: describe tables… and vertical formatting?

2014-10-14 Thread graham sanderson
is that there are multiple entries > per table... > > On Sun, Oct 12, 2014 at 10:39 AM, graham sanderson wrote: > select keyspace_name, columnfamily_name from system.schema_columns; > ? > > On Oct 12, 2014, at 10:29 AM, Kevin Burton wrote: > >> It seems annoyin

Re: LOCAL_* consistency levels

2014-10-14 Thread graham sanderson
There were some versions of C* that didn’t allow you to use LOCAL_* with a single DC NetworkTopologyStrategy, or with SimpleStrategy. https://issues.apache.org/jira/browse/CASSANDRA-6238 I think. You should use a NetworkTopologyStrategy with one DC for now. On Oct 14, 2014, at 7:39 AM, Ro

Re: describe tables… and vertical formatting?

2014-10-12 Thread graham sanderson
select keyspace_name, columnfamily_name from system.schema_columns; ? On Oct 12, 2014, at 10:29 AM, Kevin Burton wrote: > It seems annoying that I can’t get “describe tables” to vertical. > > maybe there’s some option I’m missing? > > Kevin > > -- > > Founder/CEO Spinn3r.com > Location: S

Re: Bitmaps

2014-10-06 Thread graham sanderson
You certainly have plenty of freedom to trade off size vs access granularity using multiple blobs. It really depends on how mutable the data is, how you intend to read it, whether it is highly sparse and or highly dense (in which case you perhaps don’t need to store every bit) etc. On Oct 6, 20

Re: best practice for waiting for schema changes to propagate

2014-09-30 Thread graham sanderson
Also be aware of https://issues.apache.org/jira/browse/CASSANDRA-7734 if you are using C* 2.0.6+ (2.0.6 introduced a change that can sometimes cause initial schema propagation not to happen, introducing potentially long delays until some other code path repairs it later) On Sep 30, 2014, at 1:

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread graham sanderson
igned to your nodes need to be distributed throughout the > > entire possible range of tokens (0 to 2^127 -1) > > so it would need to be 2^63 -1 or 2^127-1 > > > > On Sun, Sep 28, 2014 at 1:19 PM, graham sanderson wrote: > It is expecting a 64 bit value … Murmur3 partit

Re: Unable to query with token range.. unable to make long from ‘...'

2014-09-28 Thread graham sanderson
It is expecting a 64 bit value … Murmur3 partitioner uses 64 bit long tokens… where did you get your 128 bit long from, and what partitioner are you using? On Sep 28, 2014, at 1:39 PM, Kevin Burton wrote: > I’m trying to query an entire table in parallel by splitting it up in token > ranges. >
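For splitting a table into token ranges under the 64 bit partitioner, a minimal Python sketch follows. It assumes the token space is the signed 64 bit range -2^63 to 2^63-1; the function name split_token_ranges is made up, and how each range is turned into a query is left to the client.

```python
MIN_TOKEN = -(2 ** 63)     # assumed lower bound of the 64 bit token space
MAX_TOKEN = 2 ** 63 - 1    # assumed upper bound of the 64 bit token space


def split_token_ranges(splits):
    """Return `splits` contiguous (start, end) pairs covering the token space.

    Each pair is meant to be read as start < token <= end, except that the
    first range should also include its start value.
    """
    if splits < 1:
        raise ValueError("splits must be >= 1")
    total = MAX_TOKEN - MIN_TOKEN
    # Python integers are arbitrary precision, so total * i cannot overflow.
    bounds = [MIN_TOKEN + (total * i) // splits for i in range(splits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))
```

Each pair can then be bound into a query that restricts token(partition key) to that range, with one query per worker.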

Re: ava.lang.OutOfMemoryError: unable to create new native thread

2014-09-17 Thread graham sanderson
Are you running on a 32 bit JVM? On Sep 17, 2014, at 9:43 AM, Yatong Zhang wrote: > Hi there, > > I am using leveled compaction strategy and have many sstable files. The error > was during the startup, so any idea about this? > > ERROR [FlushWriter:4] 2014-09-17 22:36:59,383 CassandraDaemon.

Re: Storage: upsert vs. delete + insert

2014-09-10 Thread graham sanderson
ing. > Moreover, it needs one op more to compute resulting row. > cheers, > Olek > > 2014-09-10 22:18 GMT+02:00 graham sanderson : >> delete inserts a tombstone which is likely smaller than the original record >> (though still (currently) has overhead of cost for full key/

Re: Storage: upsert vs. delete + insert

2014-09-10 Thread graham sanderson
delete inserts a tombstone which is likely smaller than the original record (though still (currently) has overhead of cost for full key/column name the data for the insert after a delete would be identical to the data if you just inserted/updated no real benefit I can think of for doing the dele

Re: update static column using partition key

2014-09-07 Thread graham sanderson
Note also (though you are likely not hitting them) there were a bunch of static column related edge cases fixed in 2.0.10 On Sep 7, 2014, at 1:18 PM, graham sanderson wrote: > Presumably you meant unread_ids to be a static column (it isn’t in your table > definition) > > On Sep

Re: update static column using partition key

2014-09-07 Thread graham sanderson
Presumably you meant unread_ids to be a static column (it isn’t in your table definition) On Sep 7, 2014, at 10:14 AM, tommaso barbugli wrote: > Hi, > I am trying to use a couple of static columns; I am using cassandra 2.0.7 and > when I try to set a value using the partition key only, I get a

Re: OOM(Java heap space) on start-up during commit log replaying

2014-08-12 Thread graham sanderson
Agreed need more details; and just start by increasing heap because that may well solve the problem. I have just observed (which makes sense when you think about it) while testing fix for https://issues.apache.org/jira/browse/CASSANDRA-7546, that if you are replaying a commit log which has a h

Re: Strange slow schema agreement on 2.0.9 ... anyone seen this? - knowsVersion may get stuck as false?

2014-08-10 Thread graham sanderson
version information for node B. On Aug 8, 2014, at 5:06 PM, graham sanderson wrote: > Actually I think it is a different issue (or a freak issue)… the invocation > in InternalResponseStage is part of the “schema pull” mechanism this ticket > relates to, and in my case this is actually

Re: Is per-table memory overhead due to SSTables or tables?

2014-08-08 Thread graham sanderson
google ;-) On Aug 8, 2014, at 7:33 PM, Kevin Burton wrote: > hm.. as a side note, it's amazing how much cassandra information is locked up > in JIRAs… wonder if there's a way to compute automatically the JIRAs with > important information. > > > On Fri, Au

Re: Is per-table memory overhead due to SSTables or tables?

2014-08-08 Thread graham sanderson
See https://issues.apache.org/jira/browse/CASSANDRA-5935 2.1 has a radically different implementation that side steps this (with off heap memtables), but if you really want lots of tables now you can do so as a trade off against GC behavior. The problem is not SSTables per se, but more potentia

Re: Strange slow schema agreement on 2.0.9 ... anyone seen this?

2014-08-08 Thread graham sanderson
happens again, I’ll have some more context to dig deeper, before just getting in and fixing the problem by restarting the nodes which I did today. On Aug 8, 2014, at 4:37 PM, graham sanderson wrote: > Ok thanks - I guess I can at least enable the debug logging added for that > issue to see if

Re: Strange slow schema agreement on 2.0.9 ... anyone seen this?

2014-08-08 Thread graham sanderson
Ok thanks - I guess I can at least enable the debug logging added for that issue to see if it is deliberately choosing not to pull the schema… no repro case, but it may happen again! On Aug 8, 2014, at 4:21 PM, Robert Coli wrote: > On Fri, Aug 8, 2014 at 1:45 PM, graham sanderson wrote:

Re: Delete By Partition Key Implementation

2014-08-08 Thread graham sanderson
A deletion of an entire row is a single row tombstone, and yes there are range tombstones for marking deletion of a range of columns also On Aug 8, 2014, at 2:17 PM, Kevin Burton wrote: > This is a good question.. I'd love to find out the answer. Seems like a > tombstone with prefixes for the

Strange slow schema agreement on 2.0.9 ... anyone seen this?

2014-08-08 Thread graham sanderson
We recently upgraded C* from 2.0.5 to 2.0.9. We have some data that is partitioned in tables created periodically (once a day). This morning, this automated process timed out because the schema did not reach agreement quickly enough after we created a new empty table. I was able to reproduce thi

Re: Full GC in cassandra

2014-07-28 Thread graham sanderson
1) Is that heap dump after full GC? 2) Are you doing a lot of concurrent writes to the same partition? 3) Is the system hinting at the time of the spike? 4) When you say “spike” do you mean this is unusually high? On Jul 28, 2014, at 11:07 AM, Keith Wright wrote: > What’s your cfhistograms look li

Re: Does SELECT … IN () use parallel dispatch?

2014-07-25 Thread Graham Sanderson
Of course the driver in question is allowed to be smarter, and can do so if you use a ? parameter for a list or even individual elements. I'm not sure which, if any, drivers currently do this, but we plan to combine this with token aware routing in our Scala driver in the future. Sent from my iPhone

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-23 Thread graham sanderson
I was being a little tongue in cheek! On Jul 23, 2014, at 3:20 PM, Jack Krupansky wrote: > Granted, for “normal” apps it is unlikely to be appropriate but... > > From an old post by Jonathan: > --- > Extreme write availability > > For applications that want Cassandra to accept writes even wh

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-23 Thread graham sanderson
Hey now; it is GREAT for a 100% write only use case ;-) On Jul 23, 2014, at 12:15 PM, Robert Coli wrote: > On Tue, Jul 22, 2014 at 7:46 PM, Andrew wrote: > ONE means write to one replica (in addition to the original). If you want to > write to any of them, use ANY. Is that the right understa

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-22 Thread graham sanderson
> so… basically, my entire cluster is offline during this join? > > I assume this is either a bug or some weird state base on growing from 1-2 > nodes? > > frustrating :-( > > > On Tue, Jul 22, 2014 at 8:13 PM, graham sanderson wrote: > Incorrect, ONE does not ref

Re: All writes fail with ONE consistency level when adding second node to cluster?

2014-07-22 Thread graham sanderson
Incorrect, ONE does not refer to the number of “other” nodes, it just refers to the number of nodes. So ONE under normal circumstances would only require one node to acknowledge the write. The confusing error message you are getting is related to https://issues.apache.org/jira/browse/CASSANDRA-

Re: "ghost" table is breaking compactions and won't go away… even during a drop.

2014-07-16 Thread graham sanderson
Known issue deleting and recreating a CF with the same name, fixed in 2.1 (manifests in lots of ways) https://issues.apache.org/jira/browse/CASSANDRA-5202 On Jul 16, 2014, at 8:53 PM, Kevin Burton wrote: > looks like a restart of cassandra and a "nodetool compact" fixed this… > > > On Wed,

Re: Write Inconsistency to update a row

2014-07-03 Thread graham sanderson
What is your keyspace replication_factor? What consistency level are you reading/writing with? Does the data show up eventually? I’m assuming you don’t have any errors (timeouts etc) on the write side. On Jul 3, 2014, at 7:55 AM, Sávio S. Teles de Oliveira wrote: > I have two Cassandra 2.0.5

Re: Pattern to store maps of maps...

2014-06-13 Thread graham sanderson
My personal opinion is that unless you are doing map operations on a CQL3 map and will always intend to read the whole thing (you don’t have any choice today), don’t use one at all - use a blob of whatever variety makes sense (e.g. JSON, Avro, Protobuf etc). On Jun 13, 2014, at 7:17 PM, Kevin Bu

Re: Dynamic Columns in Cassandra 2.X

2014-06-13 Thread graham sanderson
he server can generate events to send to the > client, e.g. schema changes - in general, 'triggers' become possible. > > ml > > > On Fri, Jun 13, 2014 at 6:21 PM, graham sanderson wrote: > My 2 cents… > > A motivation for CQL3 AFAIK was to make Cassandra
