Re: Unknown CF / Schema OK

2015-03-22 Thread Tim Olson
I did figure this out:

When adding a column family, the query timed out before all nodes replied,
and I sent the schema out again.  Half the nodes ended up with the new CF
under UUID A, and the other half ended up with the same CF under UUID B.
UnknownColumnFamilyExceptions were thrown until the enqueued data exceeded
available memory.  Eventually the nodes on one side crashed, leaving the
other half with a consistent view of the CF.  At that point I dropped the
offending CF schema in the active cluster, and the downed nodes could then
be re-added successfully.  We lost some data.  :(
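
(For anyone hitting the same thing: the per-node check for which UUID a
node holds looks roughly like this on 2.1, using the cf_id column of the
system schema tables; the keyspace name is illustrative:)

  SELECT keyspace_name, columnfamily_name, cf_id
  FROM system.schema_columnfamilies
  WHERE keyspace_name = 'myks';

Running that against each node in turn should show the two conflicting
UUIDs under the same table name.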



On Sun, Mar 22, 2015 at 11:39 AM, Tim Olson  wrote:

> After upgrading a schema, I'm getting lots of
> UnknownColumnFamilyException in the logs.  However, all nodes have the
> same schema as reported by nodetool describecluster.   I queried the
> system tables for the given column family UUID, but it doesn't appear in
> any of the schemas on any of the nodes.  I restarted all clients, but that
> didn't help either.
>
> The cluster was running 2.1.2 but I recently upgraded to 2.1.3.
>
> Any ideas?  This is basically making our production cluster highly
> unresponsive.
>
> Tim
>


cassandra triggers

2015-03-22 Thread Rahul Bhardwaj
Hi All,


I want to use triggers in Cassandra. Is there any tutorial on creating
triggers in Cassandra? Also, I am not good at Java.

Please help!

Regards:
Rahul Bhardwaj
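
(For reference: a trigger is a Java class implementing
org.apache.cassandra.triggers.ITrigger, packaged as a jar and placed in
each node's triggers directory; it is then attached to a table in CQL.
A minimal sketch, with hypothetical keyspace, table, and class names:)

  CREATE TRIGGER my_trigger ON my_keyspace.my_table
    USING 'com.example.MyTrigger';  -- hypothetical ITrigger implementation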



Really high read latency

2015-03-22 Thread Dave Galbraith
Hi! So I've got a table like this:

CREATE TABLE "default".metrics (row_time int,attrs varchar,offset int,value
double, PRIMARY KEY(row_time, attrs, offset)) WITH COMPACT STORAGE AND
bloom_filter_fp_chance=0.01 AND caching='KEYS_ONLY' AND comment='' AND
dclocal_read_repair_chance=0 AND gc_grace_seconds=864000 AND
index_interval=128 AND read_repair_chance=1 AND replicate_on_write='true'
AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND
speculative_retry='NONE' AND memtable_flush_period_in_ms=0 AND
compaction={'class':'DateTieredCompactionStrategy','timestamp_resolution':'MILLISECONDS'}
AND compression={'sstable_compression':'LZ4Compressor'};

and I'm running Cassandra on an EC2 m3.2xlarge out in the cloud, with 4 GB
of heap space. It's time-series data: I increment "row_time" each day,
"attrs" is additional identifying information about each series, and
"offset" is the number of milliseconds into the day for each data point.
For the past 5 days I've been inserting 3k points/second distributed
across 100k distinct "attrs"es. Now when I try to run queries on this data
that look like

"SELECT * FROM "default".metrics WHERE row_time = 5 AND attrs =
'potatoes_and_jam'"

it takes an absurdly long time and sometimes just times out. I ran
"nodetool cfstats default" and here's what I get:

Keyspace: default
Read Count: 59
Read Latency: 397.12523728813557 ms.
Write Count: 155128
Write Latency: 0.3675690719921613 ms.
Pending Flushes: 0
Table: metrics
SSTable count: 26
Space used (live): 35146349027
Space used (total): 35146349027
Space used by snapshots (total): 0
SSTable Compression Ratio: 0.10386468749216264
Memtable cell count: 141800
Memtable data size: 31071290
Memtable switch count: 41
Local read count: 59
Local read latency: 397.126 ms
Local write count: 155128
Local write latency: 0.368 ms
Pending flushes: 0
Bloom filter false positives: 0
Bloom filter false ratio: 0.0
Bloom filter space used: 2856
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 36904729268
Compacted partition mean bytes: 986530969
Average live cells per slice (last five minutes): 501.66101694915255
Maximum live cells per slice (last five minutes): 502.0
Average tombstones per slice (last five minutes): 0.0
Maximum tombstones per slice (last five minutes): 0.0

Ouch! 400ms of read latency, orders of magnitude higher than it has any
right to be. How could this have happened? Is there something fundamentally
broken about my data model? Thanks!
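
(For concreteness, each data point lands in this table via a write roughly
like the following; the values are illustrative:)

  INSERT INTO "default".metrics (row_time, attrs, offset, value)
  VALUES (5, 'potatoes_and_jam', 43200000, 42.0);  -- 43200000 ms = noon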


OutOfMemoryError in ReadStage

2015-03-22 Thread Ian Rose
Hi all -

I had a nasty streak of OOMs earlier today (several on one node, and a
single OOM on one other node).  I've downloaded a few of the hprof files
for local analysis.  In each case, there is a single ReadStage thread with
a huge (> 7.5GB) org.apache.cassandra.db.ArrayBackedSortedColumns
instance.  I'm trying to understand exactly what this means.

1) Does a ReadStage thread only process one query at a time?  If so, then a
reasonable conclusion (I think) would be that I had a single query that
produced a ton of results.  If not (if ReadStage threads can work on
multiple queries concurrently), then this volume of data might have been
produced by a combination of queries.

2) My driver (gocql) does not appear to enable paging by default.  Am I
correct in assuming that enabling paging should "solve the problem" (more
precisely: avoid OOMs due to my fetching a ton of rows, assuming that is
the problem and not that I am fetching a small number of very large rows)?

3) Is there any way for me (either from the system.log or from the hprof
dumps) to tell what query was currently executing when the process OOMed?
If I dig down in the object hierarchy, I see: Thread -> MessageDeliveryTask
-> message -> payload, which has the right ksName and cfName.  But the
"key" property is a byte array - is there an easy way for me to map this
onto my column key (which has multiple CQL columns in a composite key)?

4) Alternatively, is it possible for me to see how many rows had been read
for that query so far?  That way I can at least validate that the problem
was "too many rows" and not "rows are too big".

Many thanks!
- Ian
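
(A crude way to bound any single read while investigating, independent of
driver paging; the table, key, and limit are invented for illustration:)

  SELECT * FROM my_keyspace.my_table WHERE pk = 'x' LIMIT 5000;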


Re: CQL 3.x Update ...USING TIMESTAMP...

2015-03-22 Thread Sachin Nikam
@Eric Stevens
Thanks for representing my position until I could get back to this thread.

@Tyler
With your recommendation, won't I end up saving all versions of the
document? In my case the document is pretty huge (~5 MB) and each document
has up to 10 versions. And you already highlighted that lightweight
transactions are very expensive.

Also, as Eric mentions, can you elaborate on what kind of problems could
happen when we try to overwrite or delete data?
Regards
Sachin

On Fri, Mar 13, 2015 at 4:23 AM, Brice Dutheil 
wrote:

> I agree with Tyler: in the normal run of a live application I would not
> recommend the use of the write timestamp, and would use other ways to
> *version* *inserts*. Otherwise you may fall into the *upsert* pitfalls
> that Tyler mentions.
>
> However, I find there's a legitimate use of the USING TIMESTAMP trick when
> migrating data from another datastore.
>
> The trick is at some point to enable the application to start writing to
> Cassandra *without* any timestamp set on the statements. ⇐ for fresh
> data
> Then start a migration batch that uses a write time with an older date
> (i.e. one where there's *no* possible *collision* with other data). ⇐ for
> older data
>
> *This trick has been used in prod with billions of records.*
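>
> (A minimal sketch of such a migration write; the table, values, and the
> back-dated microsecond timestamp are all invented for illustration:)
>
>   INSERT INTO myks.docs (id, body) VALUES ('doc-1', 'migrated')
>     USING TIMESTAMP 1262304000000000;  -- 2010-01-01 UTC, predates live data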
>
> -- Brice
>
> On Thu, Mar 12, 2015 at 10:42 PM, Eric Stevens  wrote:
>
>> Ok, but if you're using a system of time that isn't server clock oriented
>> (Sachin's document revision ID, and my fixed and necessarily consistent
>> base timestamp [B's always know their parent A's exact recorded
>> timestamp]), isn't the principle of using timestamps to force a particular
>> update out of several to win still sound?
>>
>> > as using the clocks is only valid if clocks are perfectly sync'ed,
>> which they are not
>>
>> Clock skew is a problem which doesn't seem to be a factor in either use
>> case given that both have a consistent external source of truth for
>> timestamp.
>>
>> On Thu, Mar 12, 2015 at 12:58 PM, Jonathan Haddad 
>> wrote:
>>
>>> In most datacenters you're going to see significant variance in your
>>> server times.  Likely > 20ms between servers in the same rack.  Even
>>> Google, using atomic clocks, sees 1-7ms variance.  [1]
>>>
>>> I would +1 Tyler's advice here, as using the clocks is only valid if
>>> clocks are perfectly sync'ed, which they are not, and likely never will be
>>> in our lifetime.
>>>
>>> [1] http://queue.acm.org/detail.cfm?id=2745385
>>>
>>>
>>> On Thu, Mar 12, 2015 at 7:04 AM Eric Stevens  wrote:
>>>
 > It's possible, but you'll end up with problems when attempting to
 overwrite or delete entries

 I'm wondering if you can elucidate on that a little bit.  Do you just
 mean that it's easy to forget to always set your timestamp correctly, and
 that if you goof it up, it's difficult to recover from (i.e. you issue a
 delete with the system timestamp instead of the document version, which is
 way larger than your document version would ever be, so you can never
 write that document again)?  Or is there some bug in write timestamps that
 can cause the wrong entry to win the write contention?
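
 (To make that failure mode concrete: a hypothetical sequence against an
 invented table docs(id text PRIMARY KEY, body text), where an accidental
 untimestamped delete shadows every later versioned write until the
 tombstone is purged:)

   -- versioned writes use small "document version" timestamps
   INSERT INTO docs (id, body) VALUES ('x', 'v10') USING TIMESTAMP 10;
   -- a delete without USING TIMESTAMP gets the wall-clock timestamp in
   -- microseconds, vastly larger than any document version
   DELETE FROM docs WHERE id = 'x';
   -- this later write is silently shadowed: 11 < the delete's timestamp
   INSERT INTO docs (id, body) VALUES ('x', 'v11') USING TIMESTAMP 11;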

 We're looking at doing something similar to keep a live max value
 column in a given table, our setup is as follows:

 CREATE TABLE a (
   id text,          -- key type elided in the original; text assumed
   time timestamp,
   max_b_foo int,
   PRIMARY KEY (id)
 );
 CREATE TABLE b (
   b_id text,        -- key types elided in the original; text assumed
   a_id text,
   a_timestamp timestamp,
   foo int,
   PRIMARY KEY (a_id, b_id)
 );

 The idea being that there's a one-to-many relationship between *a* and
 *b*.  We want *a* to know what the maximum value is in *b* for field
 *foo* so we can avoid reading *all* *b* when we want to resolve *a*.
 You can see that we can't just use *b*'s clustering key to resolve
 that with LIMIT 1; also this is for DSE Solr, which wouldn't be able to
 query a by max b.foo anyway.  So when we write to *b*, we also write
 to *a* with something like

 UPDATE a USING TIMESTAMP ${b.a_timestamp.toMicros + b.foo} SET
 max_b_foo = ${b.foo} WHERE id = ${b.a_id}

 Assuming that we don't run afoul of related antipatterns such as
 repeatedly overwriting the same value indefinitely, this strikes me as
 sound if unorthodox practice, as long as conflict resolution in Cassandra
 isn't broken in some subtle way.  We also designed this to be safe from
 getting write timestamps greatly out of sync with clock time so that
 non-timestamped operations (especially delete) if done accidentally will
 still have a reasonable chance of having the expected results.

 So while it may not be the intended use case for write timestamps, and
 there are definitely gotchas if you are not careful or misunderstand the
 consequences, as far as I can see the logic behind it is sound but does
 rely on correct conflict resolution in Cassandra.  I'm curious if I'm
 missing or misunderstanding something.

Unknown CF / Schema OK

2015-03-22 Thread Tim Olson
After upgrading a schema, I'm getting lots of UnknownColumnFamilyException in
the logs.  However, all nodes have the same schema as reported by nodetool
describecluster.   I queried the system tables for the given column family
UUID, but it doesn't appear in any of the schemas on any of the nodes.  I
restarted all clients, but that didn't help either.

The cluster was running 2.1.2 but I recently upgraded to 2.1.3.

Any ideas?  This is basically making our production cluster highly
unresponsive.

Tim