Versioning in Cassandra while indexing?

2015-01-20 Thread Pandian R
Hi, I just wanted to know if there is any kind of versioning system in Cassandra while indexing new data (like the one we have for ElasticSearch). For example, I have a series of payloads, each coming with an id and an 'updatedAt' timestamp. I just want to maintain the latest state of any
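
Editor's sketch: Cassandra resolves conflicting writes to the same cell by keeping the one with the highest write timestamp, and CQL lets a client supply that timestamp with USING TIMESTAMP. Assuming 'updatedAt' were used as that timestamp, the effect would resemble the in-memory model below. The names upsert_latest and Versioned are hypothetical, the store is a plain dict rather than a Cassandra table, and ties are simply treated as stale, which is a simplification.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Versioned:
    updated_at: int  # version, e.g. the payload's 'updatedAt' as epoch microseconds
    payload: Any


def upsert_latest(store: Dict[str, Versioned], doc_id: str,
                  updated_at: int, payload: Any) -> bool:
    """Store the payload only if it is newer than what is already held.

    Returns True when the store changed, False when the write was stale.
    """
    current = store.get(doc_id)
    if current is not None and current.updated_at >= updated_at:
        return False  # an equal or newer version is already stored
    store[doc_id] = Versioned(updated_at, payload)
    return True


# Hypothetical payloads arriving out of order for the same id.
store: Dict[str, Versioned] = {}
upsert_latest(store, "a", 200, {"state": "new"})
upsert_latest(store, "a", 100, {"state": "old"})  # arrives late, ignored
```

With this rule the order of arrival does not matter: only the payload with the largest 'updatedAt' survives for each id.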

Re: Dynamic Columns

2015-01-20 Thread Xu Zhongxing
The original dynamic column idea in the Google BigTable paper is a mapping of (row key, raw bytes) -> raw bytes. The restriction imposed by CQL is, as far as I understand, that you need to have a type for each column. If the value types involved in the schema are limited, e.g. text or int or timesta

Re: Re: Dynamic Columns

2015-01-20 Thread Peter Lin
The thing is, CQL only handles some types of dynamic column use cases. There are plenty of examples on datastax.com that show how to do CQL-style dynamic columns. Based on what was described by Chetan, I don't feel CQL3 is a perfect fit for what he wants to do. To use CQL3, he'd have to change his

Re:Re: Dynamic Columns

2015-01-20 Thread Xu Zhongxing
I approximate dynamic columns by data_key and data_value columns. Is there a better way to get dynamic columns in CQL 3? At 2015-01-21 09:41:02, "Peter Lin" wrote: I think that table example misses the point of Chetan's functional requirement. He actually needs dynamic columns. On Tue, Jan

Re: Dynamic Columns

2015-01-20 Thread Peter Lin
I think that table example misses the point of Chetan's functional requirement. He actually needs dynamic columns. On Tue, Jan 20, 2015 at 8:12 PM, Xu Zhongxing wrote: > Maybe this is the closest thing to "dynamic columns" in CQL 3. > > create table review ( > product_id bigint, > create

Re: Dynamic Columns

2015-01-20 Thread Xu Zhongxing
Maybe this is the closest thing to "dynamic columns" in CQL 3. create table review ( product_id bigint, created_at timestamp, data_key text, data_tvalue text, data_ivalue int, primary key ((product_id, created_at), data_key) ); data_tvalue and data_ivalue are optional.
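
Editor's sketch: the table above keys each attribute by a data_key clustering column inside a (product_id, created_at) partition, so a subset of attributes can be read by key range instead of loading everything. The in-memory model below only illustrates that layout; it is not a Cassandra client. The function names put and slice_keys and the sample keys such as "pros:battery" are hypothetical, and a single Python value stands in for the data_tvalue / data_ivalue pair.

```python
from collections import defaultdict

# (product_id, created_at) -> {data_key: value}; the tuple plays the role of
# the partition key and data_key the role of the clustering column.
table = defaultdict(dict)


def put(product_id, created_at, data_key, value):
    """Write one attribute; value is text or int (data_tvalue / data_ivalue)."""
    table[(product_id, created_at)][data_key] = value


def slice_keys(product_id, created_at, start, end):
    """Return (data_key, value) pairs with start <= data_key < end, in key order."""
    row = table.get((product_id, created_at), {})
    return [(k, row[k]) for k in sorted(row) if start <= k < end]


# Hypothetical review attributes for one product and timestamp.
put(1, "2015-01-20", "pros:battery", "lasts two days")
put(1, "2015-01-20", "pros:screen", "very sharp")
put(1, "2015-01-20", "rating:sound", 4)
put(1, "2015-01-20", "summary", "Good value")

# Everything whose key starts with "pros:" (';' sorts right after ':').
pros = slice_keys(1, "2015-01-20", "pros:", "pros;")
```

Using a shared key prefix is one way to group related attributes so that a range over data_key returns just that group.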

Re: Should one expect to see hints being stored/delivered occasionally?

2015-01-20 Thread Robert Coli
On Sat, Jan 17, 2015 at 3:32 PM, Vasileios Vlachos < vasileiosvlac...@gmail.com> wrote: > Is there any other occasion that hints are stored and then being sent in a > cluster, other than network or other temporary or permanent failure? Could > it be that the client responsible for establishing a c

Re: Compaction failing to trigger

2015-01-20 Thread Eric Stevens
@Rob - he's probably referring to the thread titled "Reasons for nodes not compacting?" where Tyler speculates that the tables are falling below the cold read threshold for compaction. He speculated it may be a bug. At the same time in a different thread, Roland had a similar problem, and Tyler's

Re: keyspace not exists?

2015-01-20 Thread Robert Coli
On Sun, Jan 18, 2015 at 8:55 PM, Jason Wee wrote: > two nodes running cassandra 2.1.2 and one running cassandra 2.1.1 > For the record, this is an unsupported persistent configuration. You are only supposed to have split minor versions during an upgrade. I have no idea if it is causing the prob

Re: Compaction failing to trigger

2015-01-20 Thread Robert Coli
On Sun, Jan 18, 2015 at 6:06 PM, Flavien Charlon wrote: > It's set on all the tables, as I'm using the default for all the tables. > But for that particular table there are 41 SSTables between 60MB and 85MB, > it should only take 4 for the compaction to kick in. > What version of Cassandra are y
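
Editor's sketch: assuming the default size-tiered strategy with its default options (min_threshold 4, bucket_low 0.5, bucket_high 1.5), SSTables of similar size are grouped into buckets and a bucket becomes a compaction candidate once it holds at least min_threshold SSTables. The simplified model below shows why 41 SSTables between 60MB and 85MB would be expected to qualify. It deliberately ignores min_sstable_size, max_threshold and the cold-read threshold discussed elsewhere in this thread, and the function names are hypothetical.

```python
def bucket_sstables(sizes_mb, bucket_low=0.5, bucket_high=1.5):
    """Group SSTable sizes into buckets of similar size.

    A size joins the first bucket whose current average satisfies
    avg * bucket_low < size < avg * bucket_high; otherwise it starts a new one.
    """
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * bucket_low < size < avg * bucket_high:
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Return the buckets that hold at least min_threshold SSTables."""
    return [b for b in bucket_sstables(sizes_mb) if len(b) >= min_threshold]


# 41 hypothetical SSTable sizes spread evenly between 60MB and 85MB.
sizes = [60 + i * 25 / 40 for i in range(41)]
candidates = compaction_candidates(sizes)
```

Under these assumptions all 41 SSTables land in a single bucket, so size alone would not explain compaction failing to trigger.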

Re: number of replicas per data center?

2015-01-20 Thread Robert Coli
On Sun, Jan 18, 2015 at 8:50 PM, Kevin Burton wrote: > Ah.. six replicas. At least its super inexpensive that way (sarcasm!) > People with larger numbers of data centers do tend to reduce their replication factor per DC. It's all about how much consistency you want to risk, rebuild over the WAN

Re: How do replica become out of sync

2015-01-20 Thread Robert Coli
On Mon, Jan 19, 2015 at 5:44 PM, Flavien Charlon wrote: > Thanks Andi. The reason I was asking is that even though my nodes have > been 100% available and no write has been rejected, when running an > incremental repair, the logs still indicate that some ranges are out of > sync (which then resul

Re: Why Cassandra 2.1.2 couldn't populate row cache in between

2015-01-20 Thread Robert Coli
On Mon, Jan 19, 2015 at 11:57 PM, nitin padalia wrote: > If I've enable row cache for some column family, when I request some > row which is not from the begining of the partition, then cassandra > doesn't populate, row cache. > > Why it is so? For older version I think it was because we're sayin

Re: Dynamic Columns

2015-01-20 Thread chetan verma
Hi, Adding to previous mail. For example: We have a column family named review (with some arbitrary data in map). CREATE TABLE review( product_id bigint, created_at timestamp, data_int map, data_text map, PRIMARY KEY (product_id, created_at) ); Assume that these 2 maps I use to store arbitrary d

Re: Dynamic Columns

2015-01-20 Thread chetan verma
Hi, Most of the time I will be querying on product_id and created_at, but for analytics I need to query on almost all columns. The multiple-collections idea is good, but the only issue is that Cassandra reads a collection entirely. What if I need a slice of it, I mean columns for certain keys, which is possible wi

Re: How to know disk utilization by each row on a node

2015-01-20 Thread Jens Rantil
Hi, DataStax comes with sstablekeys, which does that. You could also use the sstable2json script to find keys. Cheers, Jens On Tue, Jan 20, 2015 at 2:53 PM, Edson Marquezani Filho wrote: > Hello, everybody. > Does anyone know a way to list, for an arbitrary column family, all > the rows owned (incl

Re: Dynamic Columns

2015-01-20 Thread Jonathan Lacefield
Hello, There are probably lots of options to this challenge. The more details around your use case that you can provide, the easier it will be for this group to offer advice. A few follow-up questions: - How will you query this data? - Do your queries require filtering on specific columns ot

Re: Dynamic Columns

2015-01-20 Thread chetan verma
Hi, I am creating a review system. For instance, let's assume the following are the attributes of the system: Review{ id bigint, product_id bigint, created_at timestamp, summary text, description text, pros set, cons set, feature_rating map etc } I created partition key as product_id (so that all the re

Re: Dynamic Columns

2015-01-20 Thread chetan verma
Could you please explain how we can achieve dynamic column behavior by clustering columns? On Wed, Jan 21, 2015 at 12:10 AM, chetan verma wrote: > Hi, > > I am creating a review system. for instance lets assume following are the > attibutes of system: > > Review{ > id bigint, > product_id bigint

[RELEASE] Apache Cassandra 2.0.12 released

2015-01-20 Thread Jake Luciani
The Cassandra team is pleased to announce the release of Apache Cassandra version 2.0.12. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source a

Re: Dynamic Columns

2015-01-20 Thread Jonathan Lacefield
Hello, Have you looked at solving this challenge with clustering columns? Also, please describe the problem set details for more specific advice from this group. Starting new projects on Thrift isn't the recommended approach. Jonathan Jonathan Lacefield Solutio

Dynamic Columns

2015-01-20 Thread chetan verma
Hi, I am starting a new project with Cassandra as the database. I have unstructured data, so I need dynamic columns. In CQL3 we can achieve this via collections, but there are some downsides to it. 1. Collections are used to store small amounts of data. 2. The maximum size of an item in a collectio

How to know disk utilization by each row on a node

2015-01-20 Thread Edson Marquezani Filho
Hello, everybody. Does anyone know a way to list, for an arbitrary column family, all the rows owned (including replicas) by a given node and the data size (real size or disk occupation) of each one of them on that node? I would like to do that because I have data on one of my nodes growing faste

Comparison of multiple ways to query cassandra

2015-01-20 Thread Parth Setya
Hi, Could someone please shed some light on which is the more efficient way to retrieve data from Cassandra: using a Range Slice Query (I'm using Hector) or filtering using secondary indexes? Best, Parth

Why Cassandra 2.1.2 couldn't populate row cache in between

2015-01-20 Thread nitin padalia
Hi, If I've enabled the row cache for some column family and I request a row that is not from the beginning of the partition, Cassandra doesn't populate the row cache. Why is that? For older versions I think it was because we're saying that it's caching the complete merged partition, so incomplete pa