Re: Equalizing nodes storage load

2011-07-22 Thread Mina Naguib
Hi Peter That was precisely it. Thank you :) Doing a major compaction on the heaviest node (74.65GB) reduced it to 33.55GB. I'll compact the other 2 nodes as well. I anticipate they will also settle around that size. On 2011-07-22, at 5:00 PM, Peter Tillotson wrote: > I'm not sure if this

Re: CQL COUNT Not Accurate?

2011-07-22 Thread Jonathan Ellis
Yes, this is broken. We'll fix this for https://issues.apache.org/jira/browse/CASSANDRA-2474 On Fri, Jul 22, 2011 at 4:18 PM, Hefeng Yuan wrote: > Hi, > > I just noticed that the count(*) in CQL seems to be having wrong answer, when > I have only one row, the count(*) returns two. > > Below are

"select * from A join B using(common_id) where A.id == a and B.id == b "

2011-07-22 Thread Yang
This is a common pattern in RDBMSs; is there an existing idiom for doing it in Cassandra? If the result of "select * from A where id == a" is very large, and similarly for B, while the join of A.id == a and B.id == b is small, then doing a get() for both and then merging seems excessively slow.

Re: question on setup for writes into 2 datacenters

2011-07-22 Thread Sameer Farooqui
It sounds like what you're looking for is write consistency of local_quorum: http://www.datastax.com/docs/0.8/consistency/index#write-consistency local_quorum would mean the write has to be successful on a majority of nodes in DC1 (so 2) before it is considered successful. If you use just quorum
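As a rough illustration of the arithmetic behind quorum and local_quorum, here is a minimal Python sketch. It assumes the layout described in the original question (replication factor 4, with 2 replicas in each of two data centers); the names `quorum` and `replicas_per_dc` are made up for this example and are not part of any Cassandra API.

```python
def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


# Assumed layout (from the question in this thread): 2 replicas per data center.
replicas_per_dc = {"DC1": 2, "DC2": 2}

# local_quorum only counts replicas in the coordinator's own data center.
local_quorum_dc1 = quorum(replicas_per_dc["DC1"])

# Plain quorum counts replicas across every data center.
cluster_quorum = quorum(sum(replicas_per_dc.values()))

print("local_quorum in DC1 needs", local_quorum_dc1, "acks")
print("quorum across both DCs needs", cluster_quorum, "acks")
```

Under these assumptions a plain quorum write cannot be satisfied by one data center alone, which is why local_quorum is the better fit when the remote data center may be slow or unreachable.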

Re: CQL COUNT Not Accurate?

2011-07-22 Thread Eric Evans
On Fri, 2011-07-22 at 14:18 -0700, Hefeng Yuan wrote: > Hi, > > I just noticed that the count(*) in CQL seems to be having wrong answer, when > I have only one row, the count(*) returns two. > > Below are the commands I tried: > > cqlsh> SELECT COUNT(*) FROM UserProfile USING CONSISTENCY QUORUM

CQL COUNT Not Accurate?

2011-07-22 Thread Hefeng Yuan
Hi, I just noticed that the count(*) in CQL seems to be having wrong answer, when I have only one row, the count(*) returns two. Below are the commands I tried: cqlsh> SELECT COUNT(*) FROM UserProfile USING CONSISTENCY QUORUM WHERE KEY IN ('00D760DB1730482D81BC6845F875A97D'); (2,) cqlsh> selec

Re: Equalizing nodes storage load

2011-07-22 Thread Peter Tillotson
I'm not sure if this is the answer, but try a major compaction on each node for each column family. I suspect the data shuffle has left quite a few deleted keys which may get cleaned out on major compaction. As I remember, major compaction doesn't run automatically in 0.7.x; I'm not sure if it is triggered by r

question on setup for writes into 2 datacenters

2011-07-22 Thread Dean Hiller
Ideally, we would want to have a replication factor of 4, and a minimum write consistency of 2 (which looking at the default in cassandra.yaml is to memory first with asynch to disk...perfect so far!!!) Now, obviously, I can get the partitioner setup to make sure I get 2 replicas in each data cent

Re: [SPAM] Fwd: Counter consistency - are counters idempotent?

2011-07-22 Thread Aaron Turner
On Fri, Jul 22, 2011 at 9:27 AM, Donal Zang wrote: > On 22/07/2011 18:08, Yang wrote: >> >> btw, this "issue" of  not knowing whether a write is persisted or not >> when client reports error, is not limited to counters,  for regular >> columns, it's the same: if client reports write failure, the v

Re: b-tree

2011-07-22 Thread Peter Tillotson
I've not tried this, but a speculative schema would probably be something like the following: Super Col family for structure hash(nodeId): { root: { left="nodeId1", right="nodeId2" } nodeId1: { left="nodeId3", right="nodeId4" }

Re: Repair fails with java.io.IOError: java.io.EOFException

2011-07-22 Thread Sameer Farooqui
I don't see a JVM crash log (hs_err_pid[pid].log) in ~/brisk/resources/cassandra/bin or /tmp. So maybe the JVM didn't crash? We're running a pretty up-to-date Sun Java: ubuntu@ip-10-2-x-x:/tmp$ java -version java version "1.6.0_24" Java(TM) SE Runtime Environment (build 1.6.0_24-b07) Java HotSpo

Re: CL=N-1?

2011-07-22 Thread Edward Capriolo
On Fri, Jul 22, 2011 at 3:24 PM, Yang wrote: > is there such an option? > > in some cases I want to distribute some small lookup tables to all the > nodes, so that everyone has a local copy, and loaded in memory. so the > lookup is fast. supposedly I want to write to all N nodes, but that > expos

CL=N-1?

2011-07-22 Thread Yang
is there such an option? in some cases I want to distribute some small lookup tables to all the nodes, so that everyone has a local copy, and loaded in memory. so the lookup is fast. supposedly I want to write to all N nodes, but that exposes me to failure in case of just one node down. so I'd lik

Re: how to stop the whole cluster, start the whole cluster like in hadoop/hbase?

2011-07-22 Thread Dean Hiller
Yes, I am wondering more about the yaml file and the settings like the autobootstrap setting and such. I guess I will find out once they enable my amazon service and I can get running with it. NOTE: anyone doing 1.0 or prototype I think constantly uses start/stop whole cluster to upgrade/install

Re: b-tree

2011-07-22 Thread Mike Malone
On Fri, Jul 22, 2011 at 12:05 AM, Eldad Yamin wrote: > In order to split the nodes. > SimpleGeo have max 1,000 records (i.e. places) on each node in the tree, if > the number is >1,000 they split the node. > In order to avoid more than 1 process editing/splitting the node - > transaction i

[RELEASE] Apache Cassandra 0.7.8 released

2011-07-22 Thread Sylvain Lebresne
The Cassandra team is pleased to announce the release of Apache Cassandra version 0.7.8. This version is a bug fix release[1] and in particular it fixes a regression in Cassandra 0.7.7 that prevented hinted handoff delivery from being triggered automatically (you could still force delivery through JMX).

Predictable low RW latency, SLABS and STW GC

2011-07-22 Thread Milind Parikh
In order to be predictable @ big data scale, the intensity and periodicity of STW Garbage Collection has to be brought down. Assume that SLABS (Cass 2252) will be available in the main line at some time and assume that this will have the impact that other projects (hbase etc) are reporting. I wonder

Re: CompositeType for row Keys

2011-07-22 Thread Patrick Julien
Exactly. In any case, I just answered my own question. If I need range, I can just make another column family where the column names are these keys. On Fri, Jul 22, 2011 at 12:37 PM, Nate McCall wrote: >> yes,but why would you use CompositeType if you don't need range query? > > If you were doing

Re: CompositeType for row Keys

2011-07-22 Thread Nate McCall
> yes,but why would you use CompositeType if you don't need range query? If you were doing composite keys anyway (common approach with time series data for example), you would not have to write parsing and concatenation code. Particularly useful if you had mixed types in the key.
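To make the point about parsing and concatenation code concrete, here is a sketch of the kind of hand-rolled key packing a composite type would spare you. It is plain Python with an invented layout (a length-prefixed string followed by a 64-bit integer); it is NOT Cassandra's CompositeType wire format, and `pack_key` / `unpack_key` are hypothetical helper names.

```python
import struct


def pack_key(sensor_id: str, day: int) -> bytes:
    """Concatenate a string and an integer into one opaque row key.

    Invented layout: 2-byte big-endian length, UTF-8 bytes, 8-byte big-endian int.
    """
    raw = sensor_id.encode("utf-8")
    return struct.pack(">H", len(raw)) + raw + struct.pack(">q", day)


def unpack_key(key: bytes) -> tuple:
    """Reverse pack_key: split the row key back into its typed components."""
    (length,) = struct.unpack_from(">H", key, 0)
    sensor_id = key[2:2 + length].decode("utf-8")
    (day,) = struct.unpack_from(">q", key, 2 + length)
    return sensor_id, day


key = pack_key("sensor-7", 20110722)
print(unpack_key(key))
```

Every application that builds mixed-type keys by hand ends up maintaining a pair of functions like these, which is the boilerplate being referred to.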

Re: [SPAM] Fwd: Counter consistency - are counters idempotent?

2011-07-22 Thread Donal Zang
On 22/07/2011 18:08, Yang wrote: btw, this "issue" of not knowing whether a write is persisted or not when client reports error, is not limited to counters, for regular columns, it's the same: if client reports write failure, the value may well be replicated to all replicas later. this is even

Re: CompositeType for row Keys

2011-07-22 Thread Donal Zang
On 22/07/2011 17:56, Patrick Julien wrote: I can still use it for keys if I don't need ranges then? Because for what we are doing we can always re-assemble keys yes,but why would you use CompositeType if you don't need range query? On Fri, Jul 22, 2011 at 11:38 AM, Donal Zang wrote: If you a

Re: Counter consistency - are counters idempotent?

2011-07-22 Thread Jonathan Ellis
If that's the case, your client is being misleading. Cassandra distinguishes between Unavailable (we knew we couldn't achieve CL before we started, and nothing changed) and TimedOut (didn't get reply in a timely fashion; it may or may not have gone through). TimedOut != Failed. On Fri, Jul 22, 2
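A small Python sketch of how a client might act on this distinction. The exception classes below are stand-ins defined for this example only (real client libraries expose their own types), and whether an operation is idempotent is something the caller has to decide.

```python
class Unavailable(Exception):
    """Stand-in: the coordinator knew up front it could not reach CL; nothing was written."""


class TimedOut(Exception):
    """Stand-in: no reply in time; the write may or may not have gone through."""


def write_outcome(exc: Exception) -> str:
    """Map an error to what the client actually knows about the write."""
    if isinstance(exc, Unavailable):
        return "not_applied"
    if isinstance(exc, TimedOut):
        return "unknown"
    raise exc  # anything else is outside the scope of this sketch


def safe_to_retry(exc: Exception, idempotent: bool) -> bool:
    """Retrying is always safe when nothing was applied; otherwise only if idempotent."""
    return write_outcome(exc) == "not_applied" or idempotent


print(safe_to_retry(Unavailable(), idempotent=False))
print(safe_to_retry(TimedOut(), idempotent=False))
print(safe_to_retry(TimedOut(), idempotent=True))
```

The middle case is the one this thread is about: a timed-out operation that is not idempotent cannot be blindly retried without risking a double application.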

Fwd: Counter consistency - are counters idempotent?

2011-07-22 Thread Yang
btw, this "issue" of not knowing whether a write is persisted or not when client reports error, is not limited to counters, for regular columns, it's the same: if client reports write failure, the value may well be replicated to all replicas later. this is even the same with all other systems: Z

Re: CompositeType for row Keys

2011-07-22 Thread Patrick Julien
I can still use it for keys if I don't need ranges then? Because for what we are doing we can always re-assemble keys On Fri, Jul 22, 2011 at 11:38 AM, Donal Zang wrote: > If you are using OPP, then you can use CompositeType on both key and column > name; otherwise(Random Partition), just use it

Re: CompositeType for row Keys

2011-07-22 Thread Donal Zang
If you are using OPP, then you can use CompositeType on both key and column name; otherwise(Random Partition), just use it for columns. On 22/07/2011 17:10, Patrick Julien wrote: With the current implementation of CompositeType in Cassandra 0.8.1, is it recommended practice to try to use a Compo

CompositeType for row Keys

2011-07-22 Thread Patrick Julien
With the current implementation of CompositeType in Cassandra 0.8.1, is it recommended practice to try to use a CompositeType as the key? Or are both, column and key, equally well supported? The documentation on CompositeType is light, well non-existent really, with key_validation_class set to Co

Re: Equalizing nodes storage load

2011-07-22 Thread Mina Naguib
I'm trying to balance Load ( 41.98GB vs 59.4GB vs 74.65GB ) Owns looks ok. They're all 33.33% which is what I want. It was calculated simply by 2^127 / num_nodes. The only reason the first one doesn't start at 0 is that I've actually carved the ring planning for 9 machines (2 new data cente
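For reference, the "2^127 / num_nodes" calculation can be sketched in a few lines of Python. This assumes a token space of 0 to 2^127 and evenly spaced nodes starting at 0; `balanced_tokens` is a name made up for this example, and it does not reproduce the offset ring layout described in the message.

```python
def balanced_tokens(num_nodes: int) -> list:
    """Evenly spaced initial tokens over a 0..2**127 ring."""
    ring_size = 2 ** 127
    return [i * ring_size // num_nodes for i in range(num_nodes)]


for node, token in enumerate(balanced_tokens(3)):
    print(f"node {node}: initial_token = {token}")
```

Evenly spaced tokens equalize ownership of the key range ("Owns"), not the bytes on disk ("Load"), which is why the two can diverge as they do in this thread.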

Re: Counter consistency - are counters idempotent?

2011-07-22 Thread Sylvain Lebresne
On Fri, Jul 22, 2011 at 4:52 PM, Kenny Yu wrote: > As of Cassandra 0.8.1, are counter increments and decrements idempotent? If, > for example, a client sends an increment request and the increment occurs, > but the network subsequently fails and reports a failure to the client, will > Cassandra re

Re: Equalizing nodes storage load

2011-07-22 Thread Sasha Dolgy
are you trying to balance "load" or "owns" ? "owns" looks fine ... 33.33% each ... which to me says balanced. how did you calculate your tokens? On Fri, Jul 22, 2011 at 4:37 PM, Mina Naguib wrote: > > Address         Status State   Load            Owns    Token > xx.xx.x.105     Up     Normal

Counter consistency - are counters idempotent?

2011-07-22 Thread Kenny Yu
As of Cassandra 0.8.1, are counter increments and decrements idempotent? If, for example, a client sends an increment request and the increment occurs, but the network subsequently fails and reports a failure to the client, will Cassandra retry the increment (thus leading to an overcount and incons

Equalizing nodes storage load

2011-07-22 Thread Mina Naguib
Hi everyone I've been struggling trying to get the data volume ("load") to equalize across a balanced cluster, and I'm not sure what else I can try. Background: This was originally a 5-node cluster. We re-balanced the 3 faster machines across the ring, and decommissioned the 2 older ones. We

Re: Is it safe to stop a read repair and any suggestion on speeding up repairs

2011-07-22 Thread Adi
> Short answer, yes it's safe to kill cassandra during a repair. It's one of > the nice things about never mutating data. > > Longer answer: If nodetool compactionstats says there are no Validation > compactions running (and the compaction queue is empty) and netstats says > there is nothing s

Re: Stress test using Java-based stress utility

2011-07-22 Thread Jonathan Ellis
What does nodetool ring say? On Fri, Jul 22, 2011 at 12:43 AM, Nilabja Banerjee wrote: > Hi All, > > I am following this following link " > http://www.datastax.com/docs/0.7/utilities/stress_java " for a stress test. > I am getting this notification after running this command > > xxx.xxx.xxx.xx= m

Re: Re: eliminate need to repair by using column TTL??

2011-07-22 Thread jonathan . colby
good points Aaron. I realize now how expensive repair on reads are. I'm going to keep doing repairs regularly but still have a max TTL on all columns to make sure we don't have really old data we no longer need getting buried in the cluster. On , aaron morton wrote: Read repair will only r

Re: Stress test using Java-based stress utility

2011-07-22 Thread Nilabja Banerjee
Running only one node. I don't think it is coming from the replication factor... I will try to sort this out. Any other suggestions from your side are always helpful. :) Thank you On 22 July 2011 14:36, aaron morton wrote: > UnavailableException is raised server side when there is les

Re: eliminate need to repair by using column TTL??

2011-07-22 Thread aaron morton
Read repair will only repair data that is read on the nodes that are up at that time, and does not guarantee that any changes it detects will be written back to the nodes. The diff mutations are async fire and forget messages which may go missing or be dropped or ignored by the recipient just li

Re: cassandra fatal error when compaction

2011-07-22 Thread lebron james
It happened again. I turned off compaction by setting the max and min compaction thresholds to zero and ran 5 threads of inserts; after the database reached 27GB, Cassandra failed with the same error. OS: Windows Server 2008 Datacenter, JVM has a 1.5 GB heap, Cassandra version 0.8.1, all parameters in conf file are defa

Re: cassandra fatal error when compaction

2011-07-22 Thread aaron morton
Something has shutdown the mutation stage thread pool. This happens during drain or decommission / move. Restart the service and it should be ok. if it happens again without anyone running something like drain, decommission or move let us know. Cheers - Aaron Morton Freelan

Re: b-tree

2011-07-22 Thread aaron morton
You can use something like ZooKeeper to coordinate processes doing page splits. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 22 Jul 2011, at 19:05, Eldad Yamin wrote: > In order to split the nodes. > SimpleGeo have max 1

Re: Stress test using Java-based stress utility

2011-07-22 Thread aaron morton
UnavailableException is raised server side when fewer than CL nodes are UP when the request starts. It seems odd to get it in this case because the default replication factor used by stress test is 1. How many nodes do you have and have you made any changes to the RF ? Also check the serv

eliminate need to repair by using column TTL??

2011-07-22 Thread jonathan . colby
One of the main reasons for regularly running repair is to make sure deletes are propagated in the cluster, ie, data is not resurrected if a node never received the delete call. And repair-on-read takes care of repairing inconsistencies "on-the-fly". So if I were to set a universal TTL on al

Re: cassandra fatal error when compaction

2011-07-22 Thread lebron james
ERROR [pool-2-thread-3] 2011-07-22 10:34:59,102 Cassandra.java (line 3294) Internal error processing insert java.util.concurrent.RejectedExecutionException: ThreadPoolExecutor has shut down at org.apache.cassandra.concurrent.DebuggableThreadPoolExecutor$1.rejectedExecution(DebuggableThreadPoolExecu

Re: b-tree

2011-07-22 Thread Eldad Yamin
In order to split the nodes. SimpleGeo have max 1,000 records (i.e. places) on each node in the tree, if the number is >1,000 they split the node. In order to avoid more than 1 process editing/splitting the node - transaction is needed. On Jul 22, 2011 1:01 AM, "aaron morton" wrote: >> But