Re: Tuning cassandra (compactions overall)

2012-05-24 Thread Alain RODRIGUEZ
I already had this kind of trouble while repairing a month ago. I have problems that I am the only one to have. I guess I have something wrong either in the configuration of my nodes or in my data that makes them wrong after a restart/repair. I am planning to try deploying an EC2 cluster with data

Data Versioning Support

2012-05-24 Thread Felipe Schmidt
Does Cassandra support data versioning? I'm trying to find it in many places but I'm not quite sure about it. Regards, Felipe Mathias Schmidt (Computer Science UFRGS, RS, Brazil)

Re: how to get list of snapshots

2012-05-24 Thread aaron morton
> 1) ok, I can schedule snapshots using cron (snapshot's name will be generated > from current date) > how can I remove snapshots older than a week ? Delete the directory on disk. > 2) ok, I can enable incremental backups. How can I remove incremental SSTables > older than 1 week ? > it's more
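
A minimal Python sketch of the "delete the directory on disk" advice for date-named snapshots. It assumes every snapshot is a subdirectory of a single snapshots directory and that the directory's modification time reflects when the snapshot was taken. The function name `prune_old_snapshots` is hypothetical, and the real snapshot location depends on the Cassandra version and data directory layout.

```python
import os
import shutil
import time


def prune_old_snapshots(snapshots_dir, max_age_days=7, now=None):
    """Delete snapshot subdirectories older than max_age_days.

    Returns the sorted names of the directories that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    removed = []
    for name in os.listdir(snapshots_dir):
        path = os.path.join(snapshots_dir, name)
        # Only consider directories; stray files are left alone.
        if os.path.isdir(path) and os.path.getmtime(path) < cutoff:
            shutil.rmtree(path)
            removed.append(name)
    return sorted(removed)
```

Run from cron after the snapshot job, this would keep roughly one week of snapshots. Try it against a copy of the directory first, since the deletion is not reversible.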

Re: Data Versioning Support

2012-05-24 Thread R. Verlangen
Hi Felipe, There was recently a thread about this ( http://www.mail-archive.com/user@cassandra.apache.org/msg22298.html ). The short answer: no. However, you can build your own data model to support it. Cheers! 2012/5/24 Felipe Schmidt > Does Cassandra support data versioning? > > I'm trying to
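
One way to read "build your own data model" is to store each version of a value as its own column, with the version (for example a timestamp) as part of the column name. The Python sketch below models that idea in memory only. The class `VersionedRow` and its methods are hypothetical illustrations, not a Cassandra or client API, and the thread itself does not prescribe this layout.

```python
class VersionedRow:
    """In-memory model of a row whose column names are (field, version)."""

    def __init__(self):
        self._columns = {}  # (field, version) -> value

    def put(self, field, value, version):
        # A new version is a new column, so older values are never overwritten.
        self._columns[(field, version)] = value

    def history(self, field):
        """Return [(version, value), ...] for a field, oldest first."""
        versions = sorted(v for (f, v) in self._columns if f == field)
        return [(v, self._columns[(field, v)]) for v in versions]

    def get(self, field, as_of=None):
        """Return the newest value, or the newest one at or before as_of."""
        candidates = [
            (v, value) for (v, value) in self.history(field)
            if as_of is None or v <= as_of
        ]
        return candidates[-1][1] if candidates else None
```

The trade-off is that old versions accumulate until the application deletes them, so a retention policy has to be part of the design.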

Re: Confusion regarding the terms "replica" and "replication factor"

2012-05-24 Thread aaron morton
This is partly historical. NTS (as it is now) has not always existed and was not always the default. In days gone by used to be a fella could run a mighty fine key-value store using just a Simple Replication Strategy. A different way to visualise it is a single ring with a Z axis for the DCs.

Re: Retrieving old data version for a given row

2012-05-24 Thread Felipe Schmidt
Ok... it's really strange to me that Cassandra doesn't support data versioning, because all the other key-value databases support it (at least the ones I know). I have one remaining question: -in the case that I have more than 1 SSTable on disk for the same column but with different data versions

Re: Error loading data: Internal error processing get_range_slices / Unavailable Exception

2012-05-24 Thread aaron morton
I'm not sure what those log messages are from. But… > UnknownException: [host=192.168.2.13(192.168.2.13):9160, latency=11(31), > attempts=1] SchemaDisagreementException() Sounds a bit like http://wiki.apache.org/cassandra/FAQ#schema_disagreement Cheers - Aaron Morton Freel

Re: Replication factor

2012-05-24 Thread aaron morton
ReadRepair means including all UP replicas in the request, waiting asynchronously after the read has completed, resolving and repairing differences. If you read at QUORUM with RR running, ALL (replica) nodes will perform a read. At any CL > ONE the responses from CL nodes are reconciled and di

Re: Replication factor

2012-05-24 Thread aaron morton
So your experience is that, when using CL ONE, the Dynamic Snitch is moving local reads off to other nodes and this is causing spikes in read latency? Did you notice what was happening on the node for the DS to think it was so slow? Was compaction or repair going on? Have you played with the badnes

Re: unknown exception with hector

2012-05-24 Thread aaron morton
Dropped read messages occur when the node could not process a read task within rpc_timeout. It generally means the cluster has been overwhelmed at some point: too many requests, too much GC, compaction hurting, etc. Check the server side logs for errors but I doubt it is related to the call st

Re: Retrieving old data version for a given row

2012-05-24 Thread aaron morton
> Ok... it's really strange to me that Cassandra doesn't support data > versioning, because all the other key-value databases support it (at least > the ones I know). You can design it into your data model if you need it. > I have one remaining question: > -in the case that I have more than 1 SSTabl

RE: Replication factor

2012-05-24 Thread Viktor Jevdokimov
All data is in the page cache. No repairs. Compactions are not hitting disk for reads. CPU <50%. ParNew GC <100 ms on average. After a compaction completes, the new sstable is not in the page cache, so there may be a disk usage spike before the data is cached, and local reads get slower for a moment, comparing w

what about an "hybrid" partitioner for CF with composite row key ?

2012-05-24 Thread DE VITO Dominique
Hi, We have defined a CF with a composite row key that looks like (folder id, doc id). For our app, one very common pattern is accessing, through one UI action, a bunch of data with the following row keys: (id, id_1), (id, id_2), (id, id_3)... So, multiple rows are accessed, but all row key

Re: Query on how to count the total number of rowkeys and columns in them

2012-05-24 Thread Віталій Тимчишин
You should read multiple "batches", specifying the last key received from the previous batch as the first key for the next one. For large databases I'd recommend using a statistical approach (if it's feasible). With the random partitioner it works well. Don't read the whole db. Knowing the whole keyspace you can read pa
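
The batching advice above can be sketched in Python as follows. The callable `fetch_batch(start_key, count)` is a hypothetical stand-in for whatever range query the client exposes, and `make_fake_fetch` is an in-memory substitute used only to exercise the loop. The sketch assumes the start key is inclusive, so every batch after the first begins with a key that has already been seen and must be skipped.

```python
def iter_row_keys(fetch_batch, batch_size=100):
    """Yield every row key, reading at most batch_size keys per request.

    fetch_batch(start_key, count) must return up to `count` keys in a stable
    order, starting at start_key inclusive (None means "from the beginning").
    """
    if batch_size < 2:
        # With one key per batch the inclusive start key would never advance.
        raise ValueError("batch_size must be at least 2")
    start_key = None
    first = True
    while True:
        batch = fetch_batch(start_key, batch_size)
        # Skip the repeated first key on every batch after the first.
        for key in (batch if first else batch[1:]):
            yield key
        if len(batch) < batch_size:
            break  # a short batch means the end of the range was reached
        start_key = batch[-1]
        first = False


def make_fake_fetch(keys):
    """In-memory stand-in for a range query (hypothetical, for testing)."""
    ordered = sorted(keys)

    def fetch_batch(start_key, count):
        begin = 0 if start_key is None else ordered.index(start_key)
        return ordered[begin:begin + count]

    return fetch_batch
```

Counting rows is then `sum(1 for _ in iter_row_keys(fetch))`. Against a real cluster with the random partitioner the order is token order rather than key order, which does not change the loop.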

Re: inconsistent snapshots

2012-05-24 Thread Vijay
Can you attach the Cassandra logs? How was it restored? Priam chunks and compresses the data before uploading it to S3, and when you restore you have to go through Priam to restore... Regards, On Thu, May 24, 2012 at 3:58 AM, Jose Flexa wrote: > Hi all, > > Having issues with cassandra 1.0.8 on AWS EC

Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
Hey everyone, We're trying to migrate a cassandra cluster from a bunch of Windows machines to a bunch of (newer and more powerful) Linux machines. Our initial plan was to simply bootstrap the Linux servers into the cluster one by one, and then decommission the old servers one by one. However, whe

Composite keys question

2012-05-24 Thread Roland Mechler
Suppose I have a table in CQL3 with a 2 part composite, and I do a select that specifies just the second part of the key (not the partition key), will this result in a full table scan, or is the second part of the key indexed? Example: cqlsh:"Keyspace1"> CREATE TABLE test_table (part1 text, part2

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
Hey, we thought a bit about it and came up with another solution: We shut down Cassandra on one of the Windows servers, copy over the data directory to one of the Linux servers, delete the LocationInfo files from the system keyspace, and start it up. It should read the saved token from the datafi

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 12:41 PM, Henrik Schröder wrote: > We're running version 1.0.8. Is this fixed in a later release? Will this be > fixed in a later release? No, mixed-OS clusters are unsupported. > Are there any other ways of doing the migration? What happens if we join the > new servers w

Why does a large compaction on one node affect the entire cluster?

2012-05-24 Thread Thomas van Neerijnen
Hi all, I am running Cassandra 1.0.10 installed from the Apache debs on Ubuntu 11.10 on a 7 node cluster. I moved some tokens around my cluster and now have one node compacting a large Leveled compaction column family. It has done about 5k out of 10k outstanding compactions today. The other nodes

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
On Thu, May 24, 2012 at 8:07 PM, Brandon Williams wrote: > > Are there any other ways of doing the migration? What happens if we join > the > > new servers without bootstrapping and run repair? Are there any other > ugly > > hacks or workaround we can do? We're not looking to run a mixed cluster,

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 1:50 PM, Henrik Schröder wrote: > Ok. It's important for us to not have any downtime, so how about this > solution: > > We startup the Linux cluster independently. > We configure our application to send all Cassandra writes to both clusters, > but only read from the Windows
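
A rough Python sketch of the dual-write idea quoted above: every write goes to both clusters, while reads are served only by the old one. `DictCluster` is a hypothetical in-memory stand-in for a real client, and `DualWriteStore` is an illustrative name. Recording failed writes to the new cluster instead of raising is an assumption of this sketch, not something the thread specifies.

```python
class DictCluster:
    """Hypothetical in-memory stand-in for a cluster client."""

    def __init__(self):
        self.rows = {}

    def write(self, key, value):
        self.rows[key] = value

    def read(self, key):
        return self.rows.get(key)


class DualWriteStore:
    """Send writes to both clusters; read only from the old one."""

    def __init__(self, old_cluster, new_cluster):
        self.old = old_cluster
        self.new = new_cluster
        self.failed_new_writes = []  # (key, exception) pairs to replay later

    def write(self, key, value):
        # The old cluster stays authoritative, so its errors propagate.
        self.old.write(key, value)
        try:
            self.new.write(key, value)
        except Exception as exc:
            # Assumption: a failure on the new cluster must not break the app.
            self.failed_new_writes.append((key, exc))

    def read(self, key):
        return self.old.read(key)
```

Once the new cluster has also received the historical data, switching over amounts to pointing `read` at the new cluster and dropping the old one from `write`.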

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Steve Neely
It also seems like a dark deployment of your new cluster is a great method for testing the Linux-based systems *before* switching your mission-critical traffic over. Monitor them for a while with real traffic and you can have confidence that they'll function correctly when you perform the switchover

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Henrik Schröder
On Thu, May 24, 2012 at 9:28 PM, Brandon Williams wrote: > > That sounds fine, with the caveat that you can't run sstableloader > from a machine running Cassandra before 1.1, so copying the sstables > manually (assuming both clusters are the same size and have the same > tokens) might be better.

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Brandon Williams
On Thu, May 24, 2012 at 3:36 PM, Henrik Schröder wrote: >> That sounds fine, with the caveat that you can't run sstableloader >> from a machine running Cassandra before 1.1, so copying the sstables >> manually (assuming both clusters are the same size and have the same >> tokens) might be better.

Re: Migrating from a windows cluster to a linux cluster.

2012-05-24 Thread Rob Coli
On Thu, May 24, 2012 at 12:44 PM, Steve Neely wrote: > It also seems like a dark deployment of your new cluster is a great method > for testing the Linux-based systems before switching your mission-critical > traffic over. Monitor them for a while with real traffic and you can have > confidence tha

Re: Counters and replication factor

2012-05-24 Thread Edward Capriolo
Also it does not sound like you have run anti-entropy repair. You should do that when upping RF. On Monday, May 21, 2012, Radim Kolar wrote: > On 26.3.2012 19:17, aaron morton wrote: >> >> Can you describe the situations where counter updates are lost or go backwards ? >> >> Do you ever get T

Re: unknown exception with hector

2012-05-24 Thread Deno Vichas
I'm not sure if using framed transport is an option with Hector. What should I be looking for in the logs to find the cause of these dropped reads? Thanks, On 5/24/2012 3:04 AM, aaron morton wrote: Dropped read messages occur when the node could not process a read task within rpc_timeout. It

about multitenant datamodel

2012-05-24 Thread Toru Inoko
Hi, all. I'm designing a data API service (like cassandra.io, but not using a dedicated server for each user) on Cassandra 1.1, on which users can run DML/DDL operations like CQL. The following are the APIs users can use (almost the same as the Cassandra API). - create/read/delete ColumnFamilies/Rows/Columns Now

Help deleting data using cql v3

2012-05-24 Thread Stephen Powis
I have the following schema setup in cassandra 1.1 with cql 3: CREATE TABLE testCol ( my_id varchar, time_id TimeUUIDType, my_value int, PRIMARY KEY (my_id, time_id) ); and the following data already inserted: my_id| time_id | my_valu

Re: Help deleting data using cql v3

2012-05-24 Thread Roland Mechler
This is a known issue, see https://issues.apache.org/jira/browse/CASSANDRA-4193. In the meantime, a workaround is to specify all the column names to be deleted. I.e., delete my_value from testCol where my_id='1_71548' and time_id=2fc39fa0-1dd5-11b2-9b6a-395f35722afe; should work. (I had the s

Re: Help deleting data using cql v3

2012-05-24 Thread Stephen Powis
Thanks for the info Roland! I guess I missed that bug report in my google searching. Stephen On Fri, May 25, 2012 at 12:26 AM, Roland Mechler wrote: > This is a known issue, see > https://issues.apache.org/jira/browse/CASSANDRA-4193. > > In the meantime, a workaround is to specify all the colum

High CPU load on Cassandra Node

2012-05-24 Thread Shubham Srivastava
I have a multi-DC ring with 6 nodes in each DC. I have a single node which runs some jobs (including Hadoop Map-Reduce with PIG) every 15 minutes. Lately there has been high CPU load and memory issues on this node. What I could see from Ganglia is high CPU load on this server and also number of