Re: Need help json2sstable

2011-07-20 Thread Nilabja Banerjee
Thank you, but I have already gone through that and it is still not working. I am getting "You must supply exactly one sstable". Can you tell me why I am getting this? On 21 July 2011 02:41, Tyler Hobbs wrote: > The sstable2json/json2sstable format is detailed here: > http://www.datastax.c

reset keys_cached

2011-07-20 Thread 魏金仙
Can anyone tell me how to reset "keys_cached"? Thanks.

Re: Cassandra training in Bangalore, India

2011-07-20 Thread Sasha Dolgy
I am quite certain if you find enough people and pony up the fees a few people on this list would be willing to make the journey... On Jul 21, 2011 8:02 AM, "samal" wrote: > As per my knowledge, there is not such expert training available in India as > of now. > As Sameer said there is enough onli

Re: Cassandra training in Bangalore, India

2011-07-20 Thread samal
As far as I know, there is no such expert training available in India as of now. As Sameer said, there is enough online material available to learn from. I have been playing with Cassandra since the beginning. We can plan a meetup/learning session near the Mumbai/Pune region.

Cassandra Storage Sizing

2011-07-20 Thread Todd Burruss
I put together a blog post on Cassandra Storage Sizing so I don't need to keep figuring it out again and again. Hope everyone finds it useful, and give feedback if you find errors. http://btoddb-cass-storage.blogspot.com/ ... enjoy

Re: with proof Re: cassandra goes infinite loop and data lost.....

2011-07-20 Thread Yan Chunlu
Thanks for the reply. Now the problem is how I can get rid of the "N of 2147483647" messages; they never seem to end, and the node never goes UP. Last time it happened I ran "node cleanup", and it turned out some data was lost (not sure if that was caused by the cleanup). On Thu, Jul 21, 2011 at 11:37 AM, aaron morton wrote: > Pe

Re: with proof Re: cassandra goes infinite loop and data lost.....

2011-07-20 Thread aaron morton
Personally I would do a repair first if you need to do one, just so you are confident everything is where it should be. Then do the move as described in the wiki. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 21 Jul 2011, at

Re: with proof Re: cassandra goes infinite loop and data lost.....

2011-07-20 Thread Yan Chunlu
Sorry for the misunderstanding. I saw many "N of 2147483647" lines where N=0 and thought it was not doing anything. My node was very unbalanced and I intended to rebalance it with "nodetool move" after a "node repair"; does that make the slices much larger? Address Status State Load

Re: with proof Re: cassandra goes infinite loop and data lost.....

2011-07-20 Thread Jonathan Ellis
This is not an infinite loop, you can see the column objects being iterated over are different. Like I said last time, "I do see that it's saying "N of 2147483647" which looks like you're doing slices with a much larger limit than is advisable." On Wed, Jul 20, 2011 at 9:00 PM, Yan Chunlu wrote:

Re: node repair eat up all disk io and slow down entire cluster(3 nodes)

2011-07-20 Thread Yan Chunlu
Thank you very much for the help. I will try to adjust minor compaction and also deal with a single CF at a time. On Thu, Jul 21, 2011 at 7:56 AM, Aaron Morton wrote: > If you have never run repair also check the section on repair on this page > http://wiki.apache.org/cassandra/Operations About

with proof Re: cassandra goes infinite loop and data lost.....

2011-07-20 Thread Yan Chunlu
This time it is another node. The node went down during repair and came back, but never reaches UP. I changed the log level to "DEBUG" and found that it prints the following message endlessly: DEBUG [main] 2011-07-20 20:58:16,286 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 76616c7565:false:6

Re: What is the nodeId for?

2011-07-20 Thread Boris Yen
Not sure if this is the same. I saw exceptions like this: INFO 15:33:49,336 Finished reading /root/commitlog_tmp/CommitLog-1311135088656.log ERROR 15:33:49,336 Exception encountered during startup. java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError

Re: PHPCassa get number of rows

2011-07-20 Thread Aaron Morton
Cassandra does not provide a way to count the number of rows; the best you can do is make a series of range calls and count the rows on the client side http://thobbs.github.com/phpcassa/tutorial.html If this is something you need in your app consider creating a custom secondary index to store the row ke
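
The client-side counting described above can be sketched roughly as follows. This is a minimal illustration in Python rather than PHPCassa, and `fetch_range` and `ROW_KEYS` are hypothetical stand-ins for a real range call against a cluster, not part of any client library. It assumes the range start is inclusive, so each page after the first repeats the last key already seen, and it assumes keys come back in a stable order (a real cluster returns them in partitioner order, not necessarily sorted by key).

```python
from bisect import bisect_left

# Hypothetical stand-in for the rows in a column family: sorted row keys.
ROW_KEYS = sorted(f"user{i:04d}" for i in range(2500))

def fetch_range(start_key, count):
    """Hypothetical range call: up to `count` keys >= start_key, in order."""
    i = bisect_left(ROW_KEYS, start_key)
    return ROW_KEYS[i:i + count]

def count_rows(page_size=1000):
    """Count rows by paging through the key range on the client side."""
    if page_size < 2:
        # With an inclusive start, a page of 1 would only ever repeat itself.
        raise ValueError("page_size must be at least 2")
    total = 0
    start = ""
    first_page = True
    while True:
        page = fetch_range(start, page_size)
        # Skip the overlapping first key on every page after the first.
        new_keys = page if first_page else page[1:]
        total += len(new_keys)
        if len(page) < page_size:
            return total
        start = page[-1]
        first_page = False
```

The count is only a snapshot: rows written or deleted while the loop runs may or may not be included.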

Re: node repair eat up all disk io and slow down entire cluster(3 nodes)

2011-07-20 Thread Aaron Morton
If you have never run repair, also check the section on repair on this page http://wiki.apache.org/cassandra/Operations for how frequently it should be run. There is an issue where repair can stream too much data, and this can lead to excessive disk use. My non scientific approach to the neve

Re: Repair taking a long, long time

2011-07-20 Thread aaron morton
The first thing to do is understand what the server is doing. As Edward said, there are two phases to the repair: first the differences are calculated, and then they are shared between the neighbours. Let's add a third step: once the neighbour gets the data it has to rebuild the indexes and bloom

Re: Data Visualization Best Practices

2011-07-20 Thread aaron morton
This project may provide some inspiration https://github.com/driftx/chiton Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 21 Jul 2011, at 06:36, Selcuk Bozdag wrote: > Hi, > > Cassandra provides a flexible scheme-less data stor

Re: best example of indexing

2011-07-20 Thread Sameer Farooqui
More info: http://www.datastax.com/docs/0.8/data_model/secondary_indexes http://www.datastax.com/docs/0.8/data_model/cfs_as_indexes On Wed, Jul 20, 2011 at 10:49 AM, Konstantin Naryshkin wrote: > In the Cassandra CLI tutorial( > http://wiki.apache.org/cassandra/CassandraCli), there is an exampl

Re: Cassandra training in Bangalore, India

2011-07-20 Thread Sameer Farooqui
Check out this DataStax documentation: http://www.datastax.com/docs/0.8/index DataStax has training in California and the U.S. every once in a while. No training in India as far as I know. O'Reilly has a book on Cassandra called "Cassandra: The Definitive Guide" and it's kinda outdated, but it's

Re: Cassandra & CLOUD . How its related

2011-07-20 Thread Sameer Farooqui
Are you talking about cloudsandra.com? Check out their website. Cassandra is a database. Cloud is just a fancy term for remote hosting. The two aren't really related. On Wed, Jul 20, 2011 at 3:19 AM, CASSANDRA learner < cassandralear...@gmail.com> wrote: > Hi Guys, > > When we talk about cassand

Re: b-tree

2011-07-20 Thread aaron morton
Just throwing out a (half baked) idea, perhaps the Nested Set Model of trees would work http://en.wikipedia.org/wiki/Nested_set_model * Every row would represent a set with a left and right encoded into the key * Members are inserted as columns into *every* set / row they are a member of. So we are
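
To make the Nested Set Model idea above a little more concrete, here is a rough Python sketch of the standard left/right numbering and of how each set could map to a row holding its members as columns. The place names, the `"left:right"` row-key format, and the helper names are all hypothetical choices for illustration; the message itself does not specify them.

```python
def number_tree(tree, root):
    """Assign (left, right) bounds to every node by depth-first traversal."""
    bounds = {}
    counter = 1

    def visit(node):
        nonlocal counter
        left = counter
        counter += 1
        for child in tree.get(node, []):
            visit(child)
        bounds[node] = (left, counter)
        counter += 1

    visit(root)
    return bounds

def members_of(bounds, node):
    """A node belongs to every set whose bounds enclose its own."""
    left, right = bounds[node]
    return sorted(n for n, (l, r) in bounds.items() if left <= l and r <= right)

def as_rows(bounds):
    """One row per set, keyed by its bounds, with members as column names."""
    return {f"{l}:{r}": members_of(bounds, n) for n, (l, r) in bounds.items()}

# Hypothetical example tree: parent -> children.
tree = {"world": ["europe", "asia"], "europe": ["paris", "rome"]}
bounds = number_tree(tree, "world")
rows = as_rows(bounds)
```

Note the trade-off this layout implies: reading a subtree is a single row read, but inserting a node renumbers bounds and touches every enclosing row, which matters when there are no transactions.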

Re: Need help json2sstable

2011-07-20 Thread Tyler Hobbs
The sstable2json/json2sstable format is detailed here: http://www.datastax.com/docs/0.7/utilities/sstable2json On Wed, Jul 20, 2011 at 4:58 AM, Nilabja Banerjee wrote: > > > > > On 20 July 2011 11:33, Nilabja Banerjee wrote: >> >> Hi All, >> >> Here Is my Json structure. >> >> >> {"Fetch_CC" :{

Re: disable compaction

2011-07-20 Thread Nikolai Kopylov
Thanks a lot Edward, will follow your advice. On Wed, Jul 20, 2011 at 7:28 PM, Edward Capriolo wrote: > > > On Wed, Jul 20, 2011 at 11:13 AM, Nikolai Kopylov wrote: > >> Hi everyone, >> >> finding out recently that cassandra have no upper limit for sstable files >> to grow, I decided to move to de

Re: b-tree

2011-07-20 Thread Jeffrey Kesselman
I'm not sure if I have an answer for you, but I'm curious. A b-tree and a binary tree are not the same thing. A binary tree is a basic, fundamental data structure; a b-tree is an approach to storing and indexing data on disk for a database. Which do you mean? On Wed, Jul 20, 2011 at 4

b-tree

2011-07-20 Thread Eldad Yamin
Hello, Is there any good way of storing a binary-tree in Cassandra? I wonder if someone has already implemented something like that, and how they accomplished it without transaction support (while the tree keeps evolving)? I'm asking because I want to save geospatial-data, and SimpleGeo did it using b-

Re: My "nodetool" in Java

2011-07-20 Thread Jeremy Hanna
If you look at the bin/nodetool file, it's just a shell script to run org.apache.cassandra.tools.NodeCmd. You could probably call that directly from your code. On Jul 20, 2011, at 3:18 PM, cbert...@libero.it wrote: > Hi all, > I'd like to build something like "nodetool" to show the status of t

My "nodetool" in Java

2011-07-20 Thread cbert...@libero.it
Hi all, I'd like to build something like "nodetool" to show the status of the ring (nodes up-down, info on single node) all via Java. Do you have any tips for this? (I don't want to run the nodetool through java and capture the output ...). I have really no idea on how to do it ... :-)

Data Visualization Best Practices

2011-07-20 Thread Selcuk Bozdag
Hi, Cassandra provides a flexible schema-less data storage facility which is a perfect match for one of our projects. However, regarding the requirements it is also necessary to list the CFs in a tabular fashion. I searched on the Internet for some guidelines but could not get a handy practice for

Re: best example of indexing

2011-07-20 Thread Konstantin Naryshkin
In the Cassandra CLI tutorial(http://wiki.apache.org/cassandra/CassandraCli), there is an example of creating a secondary index. Konstantin - Original Message - From: "CASSANDRA learner" To: user@cassandra.apache.org Sent: Wednesday, July 20, 2011 9:47:28 AM Subject: best example of ind

Re: How to keep only exactly column of key

2011-07-20 Thread Sylvain Lebresne
The ticket for pluggable compaction is https://issues.apache.org/jira/browse/CASSANDRA-1610. It's not released yet, so there is no real documentation for this yet. But if you really want to look into it, you can start looking at AbstractCompactionStrategy in trunk. -- Sylvain On Wed, Jul 20, 20

Re: What is the nodeId for?

2011-07-20 Thread Sylvain Lebresne
Possibly, you've hit this: https://issues.apache.org/jira/browse/CASSANDRA-2824 Should be fixed in next minor release. In the meantime, your "fix" should be alright. -- Sylvain On Wed, Jul 20, 2011 at 3:47 PM, Boris Yen wrote: > Hi Sam, > Thanks for the explanation. > The NodeIds do appear i

Re: disable compaction

2011-07-20 Thread Edward Capriolo
On Wed, Jul 20, 2011 at 11:13 AM, Nikolai Kopylov wrote: > Hi everyone, > > finding out recently that cassandra have no upper limit for sstable files > to grow, I decided to move to deletion of CF with obsolete data. > So that I will not remove columns and there is no need in compaction at > all.

disable compaction

2011-07-20 Thread Nikolai Kopylov
Hi everyone, having recently found out that Cassandra has no upper limit on how large sstable files can grow, I decided to switch to deleting whole CFs with obsolete data. That way I will not remove columns and there is no need for compaction at all. How can I completely disable the compaction process? Thanx for your

Re: Repair taking a long, long time

2011-07-20 Thread David Boxenhorn
As I indicated below (but didn't say specifically) another option is to set read repair chance to 1.0 for all your CFs and loop over all your data, since read triggers a read repair. On Wed, Jul 20, 2011 at 4:58 PM, Maxim Potekhin wrote: > ** > I can re-load all data that I have in the cluster,

Re: Repair taking a long, long time

2011-07-20 Thread Boris Yen
We also got the same problem when using 0.8.0. As far as I know, a few issues related to 'repair' have been marked as resolved in 0.8.1. Hopefully this will really solve our problem. On Wed, Jul 20, 2011 at 8:47 PM, David Boxenhorn wrote: > I have this problem too, and I don't understand w

Re: Repair taking a long, long time

2011-07-20 Thread Maxim Potekhin
I can re-load all data that I have in the cluster, from a flat-file cache I have on NFS, many times faster than the nodetool repair takes. And that's not even accurate because as others noted nodetool repair eats up disk space for breakfast and takes more than 24hrs on 200GB data load, at which po

Re: What is the nodeId for?

2011-07-20 Thread Boris Yen
Hi Sam, Thanks for the explanation. The NodeIds do appear in the Local row of NodeIdInfo, and after manually deleting two (I got three before I deleted them) of them from CurrentLocal row, Cassandra can be restarted now. I was just thinking what could be the possible cause for this? and wonde

Re: Repair taking a long, long time

2011-07-20 Thread David Boxenhorn
I have this problem too, and I don't understand why. I can repair my nodes very quickly by looping through all my data (when you read your data it does read-repair), but nodetool repair takes forever. I understand that nodetool repair builds merkle trees, etc. etc., so it's a different algorithm, b

Re: Re: 2800 file descriptors?

2011-07-20 Thread Jonathan Ellis
Repair does normally stream lots of small sstables. It's normal to set open fd to unlimited, but a higher limit like 64K would also be reasonable. On Wed, Jul 20, 2011 at 7:02 AM, cbert...@libero.it wrote: >> For the "too many open files" issue, maybe you could try:  ulimit -n 5000 >> && . > > O
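
As a quick way to see where a process stands relative to the 64K figure mentioned above, the following Python sketch reads the open-file limit of the current process through the standard `resource` module (Unix only). The 65536 threshold and the function name `fd_limit_ok` are illustrative assumptions; note that this reports the limit of the process running the script, not of an already running Cassandra JVM, which inherits its limit from the shell or service that started it.

```python
import resource

def fd_limit_ok(wanted=65536):
    """Return (ok, soft, hard) for this process's open-file limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # RLIM_INFINITY is how the platform reports "unlimited".
    ok = soft == resource.RLIM_INFINITY or soft >= wanted
    return ok, soft, hard

if __name__ == "__main__":
    ok, soft, hard = fd_limit_ok()
    print(f"soft={soft} hard={hard} enough_for_64K={ok}")
```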

Re: network bandwidth question

2011-07-20 Thread Jonathan Ellis
You can assume that's negligible compared to the data traffic. On Wed, Jul 20, 2011 at 7:02 AM, Arijit Mukherjee wrote: > Hi All > > We're trying to set up a Cassandra cluster (initially with 3 nodes). Each > node will generate data @ 32MB per second. What would be the likely network > usage for

R: Re: 2800 file descriptors?

2011-07-20 Thread cbert...@libero.it
> For the "too many open files" issue, maybe you could try: ulimit -n 5000 > && . Ok, thanks for the tip but I get this error running nodetool repair and not during cassandra execution. I however wonder if this is normal or not ... in production do you get similar numbers? Isn't it too much? b

network bandwidth question

2011-07-20 Thread Arijit Mukherjee
Hi All We're trying to set up a Cassandra cluster (initially with 3 nodes). Each node will generate data @ 32MB per second. What would be the likely network usage for this (say with a replication factor of 3)? I mean, if I use simple arithmetic, I can say 32MBps per node, and hence 96MBps in tota
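
The arithmetic in the question above can be worked through with a small Python sketch. It rests on several simplifying assumptions that are not stated in the thread: each node acts as coordinator for the data it generates, replicas are spread evenly over the ring, and only replication of writes is counted (no reads, repair, hinted handoff, gossip, or protocol overhead). The function name is hypothetical.

```python
def replication_traffic(nodes, write_mb_s, rf):
    """Estimate replication traffic in MB/s under the stated assumptions.

    Returns (outbound per node, inbound per node, cluster-wide total).
    """
    # On average rf/nodes of the replicas are local to the coordinator,
    # so rf * (nodes - 1) / nodes copies must cross the network.
    remote_copies = rf * (nodes - 1) / nodes
    out_per_node = write_mb_s * remote_copies
    # With evenly spread load, what a node sends it also receives.
    in_per_node = out_per_node
    total = out_per_node * nodes
    return out_per_node, in_per_node, total

out_mb, in_mb, total_mb = replication_traffic(nodes=3, write_mb_s=32, rf=3)
```

With 3 nodes and a replication factor of 3, every node holds every row, so each node forwards its 32 MB/s to the two other nodes: 64 MB/s out and 64 MB/s in per node, or 192 MB/s crossing the network in total under these assumptions.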

Re: What is the nodeId for?

2011-07-20 Thread Sam Overton
The NodeId is used in counter replication. Counters are stored on each replica as a set of "shards," where each shard corresponds to the local count of one of the replicas for that counter, as identified by the NodeId. A NodeId is generated the first time cassandra starts, and might be renewed dur
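
The shard structure described above can be modelled with a few lines of Python. This is a deliberately simplified illustration, not Cassandra's actual counter code: it assumes each shard is a (clock, count) pair keyed by NodeId, that a node only ever updates its own shard, and that merging two replicas keeps, for each NodeId, the shard with the higher clock. All function names are hypothetical.

```python
def increment(shards, node_id, delta):
    """Apply a local increment to the shard owned by node_id."""
    clock, count = shards.get(node_id, (0, 0))
    shards[node_id] = (clock + 1, count + delta)

def merge(a, b):
    """Combine two replicas' views, keeping the newest shard per NodeId."""
    merged = dict(a)
    for node_id, shard in b.items():
        if node_id not in merged or shard[0] > merged[node_id][0]:
            merged[node_id] = shard
    return merged

def total(shards):
    """The counter value is the sum of all shard counts."""
    return sum(count for _clock, count in shards.values())

replica_a, replica_b = {}, {}
increment(replica_a, "node-A", 5)
increment(replica_a, "node-A", 2)
increment(replica_b, "node-B", 10)
merged = merge(replica_a, replica_b)
```

Because each NodeId contributes exactly one shard to the sum, merging the same data twice does not double count, which is why it matters that a node's NodeId identifies its shard unambiguously.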

Re: 2800 file descriptors?

2011-07-20 Thread Boris Yen
For the "too many open files" issue, maybe you could try: ulimit -n 5000 && . On Wed, Jul 20, 2011 at 6:47 PM, cbert...@libero.it wrote: > Hi all, > I wonder if is normal that Cassandra (5 nodes, 0.75) has more than 2800 fd > open and growing. > I still have the problem that during repair I get

Re: node repair eat up all disk io and slow down entire cluster(3 nodes)

2011-07-20 Thread Yan Chunlu
Just found this: https://issues.apache.org/jira/browse/CASSANDRA-2156 but it seems to be available only for 0.8, and people submitted a patch for 0.6. I am using 0.7.4; do I need to dig into the code and make my own patch? Does adding compaction throttling solve the IO problem? Thanks! On Wed, Jul 20, 2011 at

What is the nodeId for?

2011-07-20 Thread Boris Yen
Hi, I think we might have screwed our data up. I saw multiple columns inside record: System.NodeIdInfo.CurrentLocal. It makes our cassandra dead forever. I was wondering if anyone could tell me what the NodeId is for? so that I might be able to duplicate this. Thanks in advance Boris

2800 file descriptors?

2011-07-20 Thread cbert...@libero.it
Hi all, I wonder if it is normal that Cassandra (5 nodes, 0.75) has more than 2800 fd open and growing. I still have the problem that during repair I run into the "too many open files" error. Best regards

Cassandra & CLOUD . How its related

2011-07-20 Thread CASSANDRA learner
Hi Guys, When we talk about Cassandra, we somehow always connect it to the cloud. I don't understand how it is connected to the cloud. What is this Cassandra Cloud?

Re: best example of indexing

2011-07-20 Thread CASSANDRA learner
Where can I get that? Can you please help me out? On Wed, Jul 20, 2011 at 3:39 PM, Sasha Dolgy wrote: > Examples exist in the conf directory of the distribution... > On Jul 20, 2011 11:48 AM, "CASSANDRA learner" > wrote: > > Hi Guys, > > > > Can you please give me the best example of creating in

Re: best example of indexing

2011-07-20 Thread Sasha Dolgy
Examples exist in the conf directory of the distribution... On Jul 20, 2011 11:48 AM, "CASSANDRA learner" wrote: > Hi Guys, > > Can you please give me the best example of creating index on a column > family. As I am completely new to this, Can you please give me a simple and > good example.

Re: Need help json2sstable

2011-07-20 Thread Nilabja Banerjee
On 20 July 2011 11:33, Nilabja Banerjee wrote: > Hi All, > > Here Is my Json structure. > > > {"Fetch_CC" :{ > "cc":{ "":"1000", > ":"ICICI", > "":"", > "city":{ >

Re: Need help json2sstable

2011-07-20 Thread Nilabja Banerjee
Yes. Actually, I was just asking you guys to give me one example with one sample of small json structure. Thank you in advance :) On 20 July 2011 11:53, Sasha Dolgy wrote: > You are missing " after > > On Wed, Jul 20, 2011 at 8:03 AM, Nilabja Banerjee > wrote: > > Hi All, > > > >

best example of indexing

2011-07-20 Thread CASSANDRA learner
Hi Guys, Can you please give me the best example of creating an index on a column family? As I am completely new to this, can you please give me a simple and good example?

RE: How to keep only exactly column of key

2011-07-20 Thread Lior Golan
Thanks Sylvain. Can you please point us to what interface should be implemented in order to write our own custom compaction? And how is it supposed to be configured? -Original Message- From: Sylvain Lebresne [mailto:sylv...@datastax.com] Sent: Tuesday, July 19, 2011 11:40 AM To: user@cas

Cassandra training in Bangalore, India

2011-07-20 Thread CASSANDRA learner
Is training on Cassandra available anywhere in Bangalore, India?

node repair eat up all disk io and slow down entire cluster(3 nodes)

2011-07-20 Thread Yan Chunlu
When I started using Cassandra, I had no idea that I should run "node repair" frequently. So basically, I have 3 nodes with RF=3 and have not run node repair for months; the data size is 20G. The problem is that when I start running node repair now, it eats up all disk io and the server load becam