RE: nodetool repair -pr enough in this scenario?

2012-06-04 Thread Viktor Jevdokimov
Understand the simple mechanics first, then decide how to act. Without -PR it makes no difference which host you run repair from: it runs for the whole 100% range, from start to end, the whole cluster, all nodes, at once. With -PR it runs only for the primary range of the node on which you run the repair. L
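
To make the idea of a "primary range" concrete, here is a minimal Python sketch. It assumes a toy ring of 100 token positions and four made-up hosts; the function names are hypothetical and are not part of nodetool or Cassandra. Each node's primary range is taken to be the span from the previous node's token (exclusive) up to its own token (inclusive), so the primary ranges of all nodes together cover the ring exactly once.

```python
from typing import Dict, Tuple


def primary_ranges(tokens: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Map each node to its primary range (previous_token, own_token]."""
    ordered = sorted(tokens.items(), key=lambda kv: kv[1])
    ranges = {}
    for i, (node, token) in enumerate(ordered):
        # For i == 0, index -1 wraps around to the node with the highest token.
        prev_token = ordered[i - 1][1]
        ranges[node] = (prev_token, token)
    return ranges


def range_width(rng: Tuple[int, int], ring_size: int) -> int:
    """Number of token positions in a (start, end] range on a circular ring."""
    start, end = rng
    # A range whose start equals its end spans the whole ring (single node).
    return (end - start) % ring_size or ring_size


if __name__ == "__main__":
    ring = {"host1": 0, "host2": 25, "host3": 50, "host4": 75}  # toy tokens
    for node, rng in primary_ranges(ring).items():
        print(node, rng, range_width(rng, 100))
```

Under these assumptions, running a primary-range repair once on every node touches each token range exactly once, which is why the -pr form is usually scheduled per node.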

RE: Cassandra Upgrade from 0.8.1

2012-06-04 Thread Harshvardhan Ojha
You can follow these steps for your version as well: http://www.datastax.com/docs/1.0/install/upgrading If you keep the same data directory in cassandra.yaml, the data will be picked up by the new node. Regards Harsh From: Adeel Akbar [mailto:adeel.ak...@panasiangroup.com] Sent: Tuesday, June 05, 2012

Cassandra Upgrade from 0.8.1

2012-06-04 Thread Adeel Akbar
Dear Guys, Thank you so much for your reply. Currently I have two Cassandra nodes running in ring. I have installed Cassandra on following location; /root/apache-cassandra-0.8.1 Now my questions are; 1. How we upgrade (Step by Step version like 0.8.1 to 0.8.5, then 0.8.5

nodetool repair -pr enough in this scenario?

2012-06-04 Thread David Daeschler
Hello, Currently I have a 4 node cassandra cluster on CentOS64. I have been running nodetool repair (no -pr option) on a weekly schedule like: Host1: Tue, Host2: Wed, Host3: Thu, Host4: Fri In this scenario, if I were to add the -pr option, would this still be sufficient to prevent forgotten del

Re: How to use Hector to retrieve data from Cassandra

2012-06-04 Thread Toru Inoko
Please refer to the following URL. You can find some examples of how to use Hector: https://github.com/zznate/hector-examples/tree/master/src/main/java/com/riptano/cassandra/hector/example Toru On Tue, 05 Jun 2012 13:08:31 +0900, Prakrati Agrawal wrote: Dear all, I am unable to find a good elabora

RE: memory issue on 1.1.0

2012-06-04 Thread Poziombka, Wade L
Thanks for this pointer I will retest with 1.1.1, which seems to be when this is fixed. -Original Message- From: Brandon Williams [mailto:dri...@gmail.com] Sent: Monday, June 04, 2012 11:24 PM To: user@cassandra.apache.org Subject: Re: memory issue on 1.1.0 Perhaps the deletes: https://

Re: memory issue on 1.1.0

2012-06-04 Thread Brandon Williams
Perhaps the deletes: https://issues.apache.org/jira/browse/CASSANDRA-3741 -Brandon On Sun, Jun 3, 2012 at 6:12 PM, Poziombka, Wade L wrote: > Running a very write intensive (new column, delete old column etc.) process > and failing on memory.  Log file attached. > > Curiously when I add new dat

How to use Hector to retrieve data from Cassandra

2012-06-04 Thread Prakrati Agrawal
Dear all, I am unable to find a good elaborate example on how to use Hector to get data stored in Cassandra. Please help me. Thanks and Regards Prakrati Agrawal | Developer - Big Data(I&D)| 9731648376 | www.mu-sigma.com This email message may contain proprieta

RE: memory issue on 1.1.0

2012-06-04 Thread Poziombka, Wade L
I have repeated the test on two quite large machines 12 core, 64 GB as5 boxes and still observed the problem. Interestingly about at the same point. Anything I can monitor... perhaps I'll hook the Yourkit profiler up to it to see if there is some kind of leak? Wade From: Poziombka, Wade L Sen

Re: Mixing Ec2MultiregionSnitch with private network

2012-06-04 Thread Chris Marino
Hi Patrick, I'm not sure if it's doable, but I can tell you for sure that there are lots of differences in the way the networks will need to be set up. If you've got to secure client traffic, it's going to get even more complicated with encrypted traffic, etc. We did some performance testing and co

Re: about multitenant datamodel

2012-06-04 Thread Toru Inoko
IMHO a model that allows external users to create CF's is a bad one. Why do you think so? I'll let users create restricted CFs, and limit the number of CFs that users can create. Is it still a bad one? On Thu, 31 May 2012 06:44:05 +0900, aaron morton wrote: - Do a lot of keyspaces cause some

RE: memory issue on 1.1.0

2012-06-04 Thread Poziombka, Wade L
What JVM settings do you have? -Xms8G -Xmx8G -Xmn800m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiati

Mixing Ec2MultiregionSnitch with private network

2012-06-04 Thread Patrick Lu
Hi All, Does anyone have experience with a Cassandra deployment that mixes EC2 with their own data center? We plan to use Ec2MultiRegionSnitch to build a Cassandra cluster across EC2 regions, and at the same time to have a couple of nodes (in the cluster) sitting in our own data center. Any comment whether

RE: 1.1 not removing commit log files?

2012-06-04 Thread Bryce Godfrey
I'll try to get some log files for this with DEBUG enabled. Tough on production though. From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Monday, June 04, 2012 11:15 AM To: user@cassandra.apache.org Subject: Re: 1.1 not removing commit log files? Apply the local hint mutation follows th

Re: Retrieving old data version for a given row

2012-06-04 Thread aaron morton
This is an old issue with sstable2json https://issues.apache.org/jira/browse/CASSANDRA-4054 Internally the tombstone is associated with the o.a.c.db.AbstractColumnContainer; see o.a.c.db.RowMutation.delete() to see how a row level delete works. Cheers - Aaron Morton Freelance D

Re: Node join streaming stuck at 100%

2012-06-04 Thread aaron morton
Are there any errors in the logs about failed streaming? If you are getting timeouts, 1.0.8 added a streaming socket timeout https://github.com/apache/cassandra/blob/trunk/CHANGES.txt#L323 Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On

Re: Cassandra upgrade from 0.8.1 to 1.1.0

2012-06-04 Thread aaron morton
In addition always read the NEWS.txt file in the distribution and glance at the CHANGES.txt file. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 4/06/2012, at 12:19 PM, Roshan wrote: > Hi > > Hope this will help to you. > > http://www.

Re: memory issue on 1.1.0

2012-06-04 Thread aaron morton
Had a look at the log, this message > INFO [ScheduledTasks:1] 2012-06-03 17:49:01,559 StorageService.java (line > 2772) Unable to reduce heap usage since there are no dirty column families appears correct, it happens after some flush activity and there are not CF's with memtable data. But the h

Re: row_cache_provider = 'SerializingCacheProvider'

2012-06-04 Thread ruslan usifov
I think that SerializingCacheProvider has a larger Java heap footprint than I thought 2012/6/4 ruslan usifov : > I have set up 5 GB of Java heap with the following tuning: > > MAX_HEAP_SIZE="5G" > HEAP_NEWSIZE="800M" > > JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC" > JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC" > JVM_O

RE: nodes moving spontaneously

2012-06-04 Thread Curt Allred
Thanks for the tip. Checked "nodetool ring" on all nodes and they all have a consistent view of the ring. We have had other problems like nodes crashing etc., so anything could have happened, but we're sure we didn't issue a "nodetool move" command. From: Tyler Hobbs [mailto:ty...@datastax.com]

[RELEASE] Apache Cassandra 1.1.1 released

2012-06-04 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 1.1.1. Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model. You can read more here: http://cassand

Re: Errors with Cassandra 1.0.10, 1.1.0, 1.1.1-SNAPSHOT and 1.2.0-SNAPSHOT

2012-06-04 Thread aaron morton
I remember someone having the "file exists" issue a few weeks ago; IIRC it magically went away. Do you have steps to reproduce this fault? If you can reproduce it on a release version please create a ticket on https://issues.apache.org/jira/browse/CASSANDRA and update the email thread. Cheers

Re: TimedOutException()

2012-06-04 Thread aaron morton
> Is the node we are connecting to try to proxy requests ? Wouldn't our > configuration ensure all nodes have replicas ? It can still time out even when reading locally. (The thread running the query is waiting on the read thread). Look in the server side logs to see if there are any errors. If

Re: Can't delete from SCF wide row

2012-06-04 Thread aaron morton
Delete is a no look write operation, like normal writes. So it should not be directly causing a lot of memory allocation. It may be causing a lot of compaction activity, which due to the wide row may be throwing up lots of GC. Try the following to get through the deletions: * disable compact

Re: Secondary Indexes, Quorum and Cluster Availability

2012-06-04 Thread aaron morton
IIRC index slices work a little differently with consistency, they need to have CL level nodes available for all token ranges. If you drop it to CL ONE the read is local only for a particular token range. The problem when doing index reads is the nodes that contain the results can no longer be

Re: 1.1 not removing commit log files?

2012-06-04 Thread aaron morton
Applying the local hint mutation follows the same code path as regular mutations. When the commit log is being truncated you should see flush activity, logged from the ColumnFamilyStore with "Enqueuing flush of " messages. If you set DEBUG logging for the org.apache.cassandra.db.ColumnFamilySto

Re: batch isolation

2012-06-04 Thread Todd Burruss
I don't think I'm being clear. I just was wondering if a "row delete" is isolated with all the other inserts or deletes to a specific column family and key in the same batch. On 6/4/12 1:58 AM, "Sylvain Lebresne" wrote: >On Sun, Jun 3, 2012 at 6:05 PM, Todd Burruss wrote: >> I just meant there

Replication factor via hector

2012-06-04 Thread Roshni Rajagopal
Hi , I'm trying to see the effect of different replication factors and consistency levels for a keyspace on a 4 node cassandra cluster. I'm doing this using hector client. I could not find an api to set replication factor for a keyspace though I could find ways to modify consistency level.

Re: repair

2012-06-04 Thread Tamar Fraenkel
Thanks, one more question. On a regular basis, should I run repair for the system keyspace? *Tamar Fraenkel * Senior Software Engineer, TOK Media [image: Inline image 1] ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Jun 4, 2012 at 5:02 PM, Vikto

RE: repair

2012-06-04 Thread Viktor Jevdokimov
Why without -PR when recovering from a crash? Repair without -PR runs a full repair of the cluster; the node that receives the command is the repair controller, and ALL nodes synchronize replicas at the same time, streaming data between each other. Problems may arise: · When streaming hangs (

Re: Integration Testing for Cassandra

2012-06-04 Thread David McNelis
That article is a good starting point. To make your life a bit easier, consider checking out CassandraUnit that provides facilities to load example data in a variety of ways. https://github.com/jsevellec/cassandra-unit Then you just need to be able to pass in which cassandra instance to connect

Re: repair

2012-06-04 Thread Tamar Fraenkel
Thank you all! *Tamar Fraenkel * Senior Software Engineer, TOK Media [image: Inline image 1] ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Jun 4, 2012 at 3:16 PM, R. Verlangen wrote: > The "repair -pr" only repairs the nodes primary range: so

Re: repair

2012-06-04 Thread R. Verlangen
The "repair -pr" only repairs the node's primary range, so it is only useful in day-to-day use. When you're recovering from a crash use it without -pr. 2012/6/4 Romain HARDOUIN > > Run "repair -pr" in your cron. > > Tamar Fraenkel wrote on 04/06/2012 13:44:32 : > > > Thanks. > > > > I actually

Which client to use for Cassandra real time insertion and retrieval

2012-06-04 Thread Samuel CARRIERE
I'm assuming you are looking for a Java client. From my own experience, Hector is a good client that can be used in real time applications (it supports connection pooling and automatic retries). But I would suggest having a look at astyanax from netflix (https://github.com/Netflix/astyanax). I

Re: repair

2012-06-04 Thread Romain HARDOUIN
Run "repair -pr" in your cron. Tamar Fraenkel wrote on 04/06/2012 13:44:32 : > Thanks. > > I actually did just that with cron jobs running on different hours. > > I asked the question because I saw that when one of the logs was > running the repair, all nodes logged some repair related en

Re: repair

2012-06-04 Thread Tamar Fraenkel
Thanks. I actually did just that with cron jobs running on different hours. I asked the question because I saw that when one of the logs was running the repair, all nodes logged some repair related entries in /var/log/cassandra/system.log Thanks again, *Tamar Fraenkel * Senior Software Engineer

RE RPM of Cassandra 1.1.0

2012-06-04 Thread Samuel CARRIERE
Hi, The RPM from datastax : http://rpm.datastax.com/community/noarch/apache-cassandra11-1.1.0-2.noarch.rpm Regards, Samuel "Adeel Akbar" 04/06/2012 13:20 Please reply to user@cassandra.apache.org To cc Subject RPM of Cassandra 1.1.0 Hi, I need to install Apache Cassandra 1.1.0

RE repair

2012-06-04 Thread Samuel CARRIERE
Hi, It is not enough to run the repair on one node, except if the node contains all the data (e.g. a 3 node cluster with RF=3). In the general case, the best is to launch the repair on every node, with the "-pr" option (use -pr to repair only the first range returned by the partitioner) Tamar
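
The "3 node cluster with RF=3" exception can be checked with a small Python sketch. It assumes the simplest placement rule, where each range is stored on its owner and on the next RF-1 nodes clockwise around the ring; the node names and the function are hypothetical and purely illustrative.

```python
from typing import Dict, List, Set


def ranges_held(nodes: List[str], rf: int) -> Dict[str, Set[str]]:
    """Return, for each node, the set of primary ranges it stores a replica of.

    `nodes` is listed in ring order; each range is named after its owner.
    Assumes replicas go to the owner and the next rf-1 nodes clockwise.
    """
    held: Dict[str, Set[str]] = {node: set() for node in nodes}
    replicas = min(rf, len(nodes))  # cannot have more replicas than nodes
    for i, owner in enumerate(nodes):
        for k in range(replicas):
            held[nodes[(i + k) % len(nodes)]].add(owner)
    return held


if __name__ == "__main__":
    # 3 nodes, RF=3: every node holds every range.
    print(ranges_held(["n1", "n2", "n3"], rf=3))
    # 4 nodes, RF=3: every node is missing one range.
    print(ranges_held(["n1", "n2", "n3", "n4"], rf=3))
```

With three nodes and RF=3 each node stores all three ranges; with four nodes and RF=3 each node stores only three of the four, so no single node holds all the data.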

RE: repair

2012-06-04 Thread Rishabh Agrawal
Hello, As far as I know, it works on a per-node basis, so you have to run it on different nodes. I would suggest that you not execute it simultaneously on all nodes in a production environment. Regards Rishabh Agrawal From: Tamar Fraenkel [mailto:ta...@tok-media.com] Sent: Monday, June 04,

repair

2012-06-04 Thread Tamar Fraenkel
Hi! I apologize for this naive question. When I run nodetool repair, is it enough to run on one of the nodes, or do I need to run on each one of them? Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media [image: Inline image 1] ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8

RPM of Cassandra 1.1.0

2012-06-04 Thread Adeel Akbar
Hi, I need to install Apache Cassandra 1.1.0 from RPM. Please provide me link to download rpm for CentOS. Thanks & Regards Adeel Akbar

Which client to use for Cassandra real time insertion and retrieval

2012-06-04 Thread Prakrati Agrawal
Dear all, I am trying to explore Cassandra for real time applications. Can you please suggest which client is the best to use? Is the client choice based on the user's comfort level or on use cases? Thanks and Regards Prakrati Agrawal | Developer - Big Data(I&D)| 9731648376 | www.mu-sigma

Re: row_cache_provider = 'SerializingCacheProvider'

2012-06-04 Thread ruslan usifov
I have set up 5 GB of Java heap with the following tuning: MAX_HEAP_SIZE="5G" HEAP_NEWSIZE="800M" JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC" JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC" JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled" JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8" JVM_OPTS="$JVM_OPTS -XX:MaxTenuring

Re: Retrieving old data version for a given row

2012-06-04 Thread Felipe Schmidt
I was taking a look at the tombstones stored in an SSTable and I noticed that if I perform a key deletion, the tombstone doesn't have any timestamp; it has this appearance: “key”:[ ] In all the other deletion granularities the tombstone has a timestamp. Without this information it seems to be not possible

Re: Adding a new node to Cassandra cluster

2012-06-04 Thread R. Verlangen
Connection pooling involves things like: - (transparent) failover / retry - disposal of connections after X messages - keep track of connections Again: take a look at the hector connection pool. Source: https://github.com/rantav/hector/tree/master/core/src/main/java/me/prettyprint/cassandra/connec
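
The three responsibilities listed above (failover / retry, disposal after X uses, tracking connections) can be sketched in a few lines of Python. This is not Hector's implementation and not the Thrift API; the class, the `connect` callable, and the host names are hypothetical, and the sketch is single-threaded for brevity.

```python
import itertools
from typing import Any, Callable, Dict, Iterable, List


class SimplePool:
    """Toy connection pool: round-robin over hosts, retry on failure,
    and replace a connection after it has been used max_uses times."""

    def __init__(self, hosts: Iterable[str],
                 connect: Callable[[str], Any], max_uses: int = 100):
        self._hosts: List[str] = list(hosts)
        self._connect = connect          # hypothetical: opens a connection
        self._max_uses = max_uses
        self._conns: Dict[str, list] = {}  # host -> [connection, use_count]
        self._cycle = itertools.cycle(self._hosts)

    def _get(self, host: str) -> Any:
        entry = self._conns.get(host)
        if entry is None or entry[1] >= self._max_uses:
            # No connection yet, or it is worn out: open a fresh one.
            entry = [self._connect(host), 0]
            self._conns[host] = entry
        entry[1] += 1
        return entry[0]

    def execute(self, operation: Callable[[Any], Any]) -> Any:
        """Run operation(connection), failing over to the next host."""
        last_error = None
        for _ in range(len(self._hosts)):
            host = next(self._cycle)
            try:
                return operation(self._get(host))
            except ConnectionError as exc:
                # Forget the broken connection and try the next host.
                self._conns.pop(host, None)
                last_error = exc
        raise ConnectionError("all hosts failed") from last_error
```

A caller would pass its own `connect` function (for example one that opens a Thrift transport to the given host) and wrap each request in `pool.execute(...)`; a production pool would also need locking and health checks, which this sketch leaves out.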

RE: Adding a new node to Cassandra cluster

2012-06-04 Thread Prakrati Agrawal
Yes, I know I am trying to reinvent the wheel, but I have to. The requirement is such that I have to use the Java Thrift API without any client like Hector. Can you please tell me how to do it? Prakrati Agrawal | Developer - Big Data(I&D)| 9731648376 | www.mu-sigma.com From: samal [mailto:samalgo...

Re: Query

2012-06-04 Thread Franc Carter
On Mon, Jun 4, 2012 at 7:36 PM, MOHD ARSHAD SALEEM < marshadsal...@tataelxsi.co.in> wrote: > Hi all, > > I wanted to know how to read and write data using cassandra API's . is > there any link related to sample program . > I did a Proof of Concept using a python client -PyCassa ( https://github.

Re: Query

2012-06-04 Thread Amresh Singh
Here is a link that will help you out if you use Kundera as high level client for Cassandra: https://github.com/impetus-opensource/Kundera/wiki/Getting-Started-in-5-minutes

Re: Adding a new node to Cassandra cluster

2012-06-04 Thread samal
If you use the Thrift API, you have to maintain a lot of low level code yourself that is already polished in high-level clients (HLC) such as Hector and pycassa. Also, with an HLC you can easily switch between Thrift and the growing CQL. On Mon, Jun 4, 2012 at 3:00 PM, R. Verlangen wrote: > You might consider using a higher level

RE: Query

2012-06-04 Thread Rishabh Agrawal
If you are using Java try out Kundera or Hector, both are good and have good documentation available. From: MOHD ARSHAD SALEEM [mailto:marshadsal...@tataelxsi.co.in] Sent: Monday, June 04, 2012 2:37 AM To: user@cassandra.apache.org Subject: Query Hi all, I wanted to know how to read and write d

Re: Adding a new node to Cassandra cluster

2012-06-04 Thread Roshni Rajagopal
Prakrati, I believe even though you would specify one node in your code, internally the request would be going to any – perhaps more than 1 node based on your replication factors & consistency level settings. You can try this by connecting to one node and writing to it and then reading the sa

Query

2012-06-04 Thread MOHD ARSHAD SALEEM
Hi all, I wanted to know how to read and write data using cassandra API's . is there any link related to sample program . Regards Arshad

Re: Adding a new node to Cassandra cluster

2012-06-04 Thread R. Verlangen
You might consider using a higher level client (like Hector indeed). If you don't want this you will have to write your own connection pool. For a start, take a look at Hector. But keep in mind that you might be reinventing the wheel. 2012/6/4 Prakrati Agrawal > Hi, > > I am using Th

RE: Adding a new node to Cassandra cluster

2012-06-04 Thread Prakrati Agrawal
Hi, I am using Thrift API and I am not able to find anything on the internet about how to configure it for multiple nodes. I am not using any proper client like Hector. Prakrati Agrawal | Developer - Big Data(I&D)| 9731648376 | www.mu-sigma.com From: R. Verlangen [mailto:ro...@us2.nl] Sent: Mo

Re: Adding a new node to Cassandra cluster

2012-06-04 Thread R. Verlangen
Hi there, When you speak to one node it will internally redirect the request to the proper node (local / external): but you won't be able to failover on a crash of the localhost. For adding another node to the connection pool you should take a look at the documentation of your java client. Good l

Adding a new node to Cassandra cluster

2012-06-04 Thread Prakrati Agrawal
Dear all, I successfully added a new node to my cluster, so now it's a 2 node cluster. But how do I mention it in my Java code? When I am retrieving data, it's retrieving only from the one node that I am specifying as the localhost. How do I specify more than one node in the localhost? Please help me

Re: batch isolation

2012-06-04 Thread Sylvain Lebresne
On Sun, Jun 3, 2012 at 6:05 PM, Todd Burruss wrote: > I just meant there is a "row delete" in the same batch as inserts - all to > the same column family and key Then it's the timestamp that will decide what happens. Whatever has a timestamp lower or equal to the tombstone timestamp will be delet
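
The timestamp rule described here can be written down as a small Python sketch. It models only the comparison stated above (a column whose timestamp is lower than or equal to the row tombstone's timestamp is treated as deleted); the data layout and function name are hypothetical and do not reflect Cassandra's internal classes.

```python
from typing import Dict, Tuple

Column = Tuple[str, int]  # (value, timestamp)


def surviving_columns(columns: Dict[str, Column],
                      row_tombstone_ts: int) -> Dict[str, Column]:
    """Keep only columns written with a timestamp newer than the row delete."""
    return {
        name: (value, ts)
        for name, (value, ts) in columns.items()
        if ts > row_tombstone_ts  # lower or equal means shadowed by the delete
    }


if __name__ == "__main__":
    batch = {"a": ("x", 10), "b": ("y", 11), "c": ("z", 9)}
    # A row delete with timestamp 10 in the same batch shadows "a" and "c".
    print(surviving_columns(batch, row_tombstone_ts=10))
```

So if the inserts in a batch should survive a row delete in that same batch, this rule implies they need a timestamp strictly greater than the delete's.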

Re: row_cache_provider = 'SerializingCacheProvider'

2012-06-04 Thread aaron morton
Yes SerializingCacheProvider is the off heap caching provider. Can you do some more digging into what is using the heap ? Cheers A - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 1/06/2012, at 9:52 PM, ruslan usifov wrote: > Hello > > I begin

Re: Finding whether a new node is successfully added or not

2012-06-04 Thread Pushpalanka Jayawardhana
Hi Prakrati, bin/nodetool -host <host> ring Refer here at Cassandra wiki for more details. As far as I know a restart is needed, since the node needs to communicate with the seeds and make sense of the cluster it is in. On Mon, Jun 4, 2012 at 1:41 PM, Prakrati Agrawal < p

Re: Finding whether a new node is successfully added or not

2012-06-04 Thread R. Verlangen
Hi there, You can check the ring info with nodetool. Furthermore you can take a look at the streaming statistics: lots of pending indicates a node that is still receiving data from its seed(s). As far as I'm aware the seed value is read at startup, so a restart is required. Good luck. 2

Finding whether a new node is successfully added or not

2012-06-04 Thread Prakrati Agrawal
Dear all, I added a new node to my 1 node Cassandra cluster. Now I want to find out whether it is added successfully or not. Also do I need to restart the already running node after entering the seed value. Please help me. Thanks and Regards Prakrati Agrawal | Developer - Big Data(I&D)| 973164

Re: Getting error on adding seed to add a new node

2012-06-04 Thread Pushpalanka Jayawardhana
Hi, As it says in cassandra.yaml, you need to define it as # Ex: "<ip1>,<ip2>,<ip3>" with the whole list as one string. On Mon, Jun 4, 2012 at 1:17 PM, Prakrati Agrawal < prakrati.agra...@mu-sigma.com> wrote: > Dear all, > > I am trying to add a new node to my existing one node Cassandra. So I >

Getting error on adding seed to add a new node

2012-06-04 Thread Prakrati Agrawal
Dear all, I am trying to add a new node to my existing one node Cassandra. So I edited the seeds value in cassandra.yaml and added the IP addresses of both nodes. But it's giving me the following error: ERROR 13:16:48,342 Fatal configuration error error while parsing a block mapping in