AVRO client API

2010-06-17 Thread Atul Gosain
Hi, is the client API for Cassandra available in Avro? If so, are there any links to examples or documentation, and is there any comparison between the Thrift and Avro APIs to determine which is better? Thanks, Atul

Re: Chiton not showing any keyspaces

2010-06-17 Thread John Schneider
Jonathan - correct. Brandon clued me in that framed mode was required and it woke right up when I booted in framed mode. Tnx, jos On Thu, Jun 17, 2010 at 7:47 PM, Jonathan Ellis wrote: > probably Cassandra isn't configured to use thrift framed mode, which > Chiton requires. > > On Wed, Jun 16,

Re: Best documentation for Java and Cassandra?

2010-06-17 Thread Benjamin Black
On Thu, Jun 17, 2010 at 9:00 PM, Anthony Ikeda wrote: > I'm certain I didn't mention Lucandra, but yes I get the idea about the > concepts: Lucandra is Lucene on Cassandra. Unclear to me how else Lucene is relevant, but, ok! > * Use indexing tool to reference with queries that returns keys > *

RE: Get all rows back from a ColumnFamily

2010-06-17 Thread Anthony Ikeda
Great thanks! I'm slowly getting the hang of all this now. From: Benjamin Black [mailto:b...@b3k.us] Sent: Friday, 18 June 2010 1:07 PM To: user@cassandra.apache.org Subject: Re: Get all rows back from a ColumnFamily you either need to use OPP and issue a get_range() request with empty

RE: Best documentation for Java and Cassandra?

2010-06-17 Thread Anthony Ikeda
I'm certain I didn't mention Lucandra, but yes I get the idea about the concepts: * Use indexing tool to reference with queries that returns keys * Store and retrieve data from Cassandra by the keys Anthony -Original Message- From: Benjamin Black [mailto:b...@b3k.us] Sent: Friday, 18 Ju

Re: Some questions about using Cassandra

2010-06-17 Thread Benjamin Black
You download the patch and apply it. On Thu, Jun 17, 2010 at 4:10 PM, Anthony Ikeda wrote: > Thanks Sylvain, I would actually like to do that. Any idea how I can > get started? > > -Original Message- > From: Sylvain Lebresne [mailto:sylv...@yakaz.com] > Sent: Thursday, 17 June 20

Re: Best documentation for Java and Cassandra?

2010-06-17 Thread Benjamin Black
Lucandra is not a framework, it is Lucene on top of Cassandra. You seem to be falling into the most common trap for relational folks coming to column stores: trying to model your _data_ instead of trying to model your _queries_. You _must_ use the tools as intended to achieve good results. b O

Re: Get all rows back from a ColumnFamily

2010-06-17 Thread Jonathan Ellis
look at get_range_slices in http://wiki.apache.org/cassandra/API On Thu, Jun 17, 2010 at 6:36 PM, Anthony Ikeda < anthony.ik...@cardlink.com.au> wrote: > Is there any way at all (In Java) to get all the data from a > ColumnFamily? > > > > I’ve inserted data into Cassandra and I don’t seem to hav

Re: Get all rows back from a ColumnFamily

2010-06-17 Thread Benjamin Black
you either need to use OPP and issue a get_range() request with empty strings for start and end keys (and you'll generally want to paginate using a count option and saving the last entry in a given page), or you need to index your rows. same as with any other sort of query you might want to perfor
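The pagination pattern described in this reply (empty start and end keys, a count per page, and the last entry of each page saved as the next start) can be sketched roughly as follows. This is only an illustration under stated assumptions: `get_range` here is a hypothetical in-memory stand-in, not the Cassandra Thrift call, and the sorted key list imitates the key ordering an order-preserving partitioner would give.

```python
from bisect import bisect_left

# Hypothetical stand-in for a column family under an order-preserving
# partitioner: row keys are available in sorted order.
ROWS = {f"user{i:03d}": {"name": f"name-{i}"} for i in range(25)}
SORTED_KEYS = sorted(ROWS)


def get_range(start_key, end_key, count):
    """Hypothetical range call: up to `count` rows with start_key <= key <= end_key.

    An empty string means "unbounded" on that side.
    """
    lo = bisect_left(SORTED_KEYS, start_key) if start_key else 0
    keys = SORTED_KEYS[lo:]
    if end_key:
        keys = [k for k in keys if k <= end_key]
    return [(k, ROWS[k]) for k in keys[:count]]


def iter_all_rows(page_size=10):
    """Walk every row page by page, remembering the last key of each page."""
    if page_size < 2:
        # With an inclusive start key, a page of one row would never advance.
        raise ValueError("page_size must be at least 2")
    start = ""
    first_page = True
    while True:
        page = get_range(start, "", page_size)
        # The start key is inclusive, so every page after the first repeats
        # the last row of the previous page; drop that duplicate.
        rows = page if first_page else page[1:]
        for key, columns in rows:
            yield key, columns
        if len(page) < page_size:
            break  # a short page means the end of the range was reached
        start = page[-1][0]
        first_page = False


all_keys = [key for key, _ in iter_all_rows(page_size=10)]
```

Because the start key is treated as inclusive in this sketch, each page after the first yields one row fewer than it fetches; a real client would need to check how its own range call treats the start key.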

Re: Chiton not showing any keyspaces

2010-06-17 Thread Jonathan Ellis
probably Cassandra isn't configured to use thrift framed mode, which Chiton requires. On Wed, Jun 16, 2010 at 3:29 PM, John Schneider wrote: > Hi, > I've got Chiton coming up on a mac, however when I connect to my Cassandra > instance, it never comes back from "Fetching keyspaces..." > > Log fro

Get all rows back from a ColumnFamily

2010-06-17 Thread Anthony Ikeda
Is there any way at all (In Java) to get all the data from a ColumnFamily? I've inserted data into Cassandra and I don't seem to have a way to browse what's in there. Anthony Ikeda Java Analyst/Programmer Cardlink Services Limited Level 4, 3 Rider Boulevard Rhodes NSW 2138 Web: www.

Re: extending multiget

2010-06-17 Thread Jonathan Ellis
No. At that point you basically have no overhead advantage vs just doing multiple single-row requests. On Thu, Jun 17, 2010 at 2:39 PM, Sonny Heer wrote: > Any plans for this sort of call? > > > Instead of: > >    public Map<String, List<ColumnOrSuperColumn>> multiget_slice(String > keyspace, List<String> keys, ColumnParent column_paren

Load balancing

2010-06-17 Thread Mubarak Seyed
Hi all, I have a requirement where around 100 application server instances all need to write/read data from a Cassandra cluster; the write rate is around 300k records per instance (approximately 30 million for 100 instances). - How does the client (application) connect to cassandr

RE: Best documentation for Java and Cassandra?

2010-06-17 Thread Anthony Ikeda
Okay, so that is where frameworks such as Lucene come into play. Right now we have set up Cassandra and are trying to come up with the best schema, per se, that is going to be the most effective and I think I’ll need to start looking at something like Lucene to ensure the model can be more vi

RE: Some questions about using Cassandra

2010-06-17 Thread Anthony Ikeda
Thanks Sylvain, I would actually like to do that. Any idea how I can get started? -Original Message- From: Sylvain Lebresne [mailto:sylv...@yakaz.com] Sent: Thursday, 17 June 2010 5:46 PM To: user@cassandra.apache.org Subject: Re: Some questions about using Cassandra On Thu, Jun

extending multiget

2010-06-17 Thread Sonny Heer
Any plans for this sort of call? Instead of: public Map<String, List<ColumnOrSuperColumn>> multiget_slice(String keyspace, List<String> keys, ColumnParent column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level) throws InvalidRequestException, UnavailableException, TimedOutException, TException; --- public

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
These are physical machines. storage-conf.xml.fs03 is here: http://pastebin.com/weL41NB1 Diffs from that for the other two storage-confs are inline here: a...@worm:../Z3/cassandra/conf/dev$ diff storage-conf.xml.lpc03 storage-conf.xml.fs01 185c185 > 71603818521973537678586548668074777838 229

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
The machines in question have 8GB of RAM each and generally never touch swap. I shall try to monitor memory/swap overnight and see if something strange happens. Would swapping really take 10s? AJ On Thu, Jun 17, 2010 at 1:54 PM, Jonathan Ellis wrote: > The explanation that best fits the symptom

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
The behavior was seen with row caching off. I now have row caching on. key cache hit rate is 0.75-0.8 row cache hit rate is 0 (row cache capacity=1, RowsCached="100%") looks like I should try another format for RowsCached, like "0.8" or "90%" or something. On Thu, Jun 17, 2010 at 1:47 PM, aaron

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread Benjamin Black
Are these physical machines or virtuals? Did you post your cassandra.in.sh and storage-conf.xml someplace? On Thu, Jun 17, 2010 at 10:31 AM, AJ Slater wrote: > Total data size in the entire cluster is about twenty 12k images. With > no other load on the system. I just ask for one column and I ge

Re: munin is great for monitoring Cassandra

2010-06-17 Thread Mike Subelsky
T On Thu, Jun 17, 2010 at 4:53 PM, Peter Schuller wrote: >> drive on an EBS volume).  Does that pattern look bad or is it >> relatively normal? If it's not normal, any advice for what I should >> change to get better performance? > > Assuming you're running with the default cassandra VM options o

Re: munin is great for monitoring Cassandra

2010-06-17 Thread Jason Dixon
On Thu, Jun 17, 2010 at 04:01:41PM -0400, Mike Subelsky wrote: > I just started munin to monitor the performance of my cluster which is > working great. I noticed the attached pattern for JVM heap size - > lots of peaks and valleys, which looks like GC is firing pretty > frequently. I don't have

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread Jonathan Ellis
The explanation that best fits the symptoms you describe is that you are swapping. On Thu, Jun 17, 2010 at 10:12 AM, AJ Slater wrote: > I'm seeing 10s timeouts on reads a few times a day. It's hard to reproduce > consistently but seems to happen most often after it's been a long time > between reads.

Re: munin is great for monitoring Cassandra

2010-06-17 Thread Peter Schuller
> drive on an EBS volume).  Does that pattern look bad or is it > relatively normal? If it's not normal, any advice for what I should > change to get better performance? Assuming you're running with the default cassandra VM options or similar (mainly, the CMS GC), the graph looks fairly normal to

Re: munin is great for monitoring Cassandra

2010-06-17 Thread Jonathan Ellis
That looks pretty normal to me. The JVM doesn't do a full GC until the old gen crosses a (dynamic, defaults to about 70% of heap) threshold. On Thu, Jun 17, 2010 at 1:01 PM, Mike Subelsky wrote: > I just started munin to monitor the performance of my cluster which is > working great.  I noticed

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread aaron morton
Do you have Row Caching enabled ? You can check in the JMX console to see if you're hitting the cache. Try turning on DEBUG level logging and look at the log on a machine you connect to do the read. Aaron On 18 Jun 2010, at 05:31, AJ Slater wrote: > Total data size in the entire cluster i

Re: Replicating nodes through firewall

2010-06-17 Thread Jonathan Ellis
jmx does this fun thing where the node you're connecting to, connects back to jconsole over a random port: http://blogs.sun.com/jmxetc/entry/connecting_through_firewall_using_jmx On Wed, Jun 16, 2010 at 7:34 AM, wrote: > I am trying to set up replication among three cassandra nodes. I believe it

Lucandra issues

2010-06-17 Thread Maxim Kramarenko
Hello! I am trying to rework our current lucene-based application to lucandra. Note the following problem: when I try to use NumericRangeQuery like this one: query.add(NumericRangeQuery.newLongRange("deliveryTimestampMinute", 6, fromDate, toDate, true, true), BooleanClause.Occur.MUST); I got

Re: 3-node balanced system

2010-06-17 Thread Ran Tavory
+ user, - dev (bcc actually) If you use a random partitioner use the following InitialToken for your nodes: $ bc bc 1.06 Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc. This is free software with ABSOLUTELY NO WARRANTY. For details type `warranty'. (2^127)/3 *5671372782015641
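The `bc` arithmetic above divides the random partitioner's token range into equal thirds. The same calculation can be sketched for any node count. This is a minimal illustration that assumes the token space runs from 0 to 2^127 as in the post; placing the first node at token 0 is an assumption, since the original message is cut off, and the function name `initial_tokens` is made up for this example.

```python
def initial_tokens(node_count):
    """Evenly spaced tokens over a ring of size 2**127 (integer division)."""
    ring_size = 2 ** 127
    return [(i * ring_size) // node_count for i in range(node_count)]


tokens = initial_tokens(3)
for node, token in enumerate(tokens):
    # Each value would go into that node's InitialToken setting.
    print(f"node {node}: {token}")
```

Python integers are arbitrary precision, so no special big-number type is needed for values of this size.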

Re: Best documentation for Java and Cassandra?

2010-06-17 Thread Benjamin Black
Columnar data stores like Cassandra require you to construct indices to answer the queries of interest to you. http://www.slideshare.net/benjaminblack/cassandra-basics-indexing On Thu, Jun 17, 2010 at 12:04 AM, Anthony Ikeda < anthony.ik...@cardlink.com.au> wrote: > I’m wondering if anyone can s
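The idea of constructing your own index can be sketched with plain dictionaries standing in for two column families: one holding the data rows, and one mapping a column value to the row keys that carry it. This is only a rough illustration, not Cassandra client code; the names `rows`, `color_index`, `insert`, and `keys_with_color` are invented for the example, and the "color" column echoes the kind of query asked about elsewhere in this thread.

```python
from collections import defaultdict

# Stand-in for the data column family: row key -> columns.
rows = {}
# Stand-in for the index column family: color value -> set of row keys.
color_index = defaultdict(set)


def insert(row_key, columns):
    """Write a row and keep the color index in step with it."""
    old = rows.get(row_key)
    if old is not None and "color" in old:
        # Remove the stale index entry before writing the new one.
        color_index[old["color"]].discard(row_key)
    rows[row_key] = dict(columns)
    if "color" in columns:
        color_index[columns["color"]].add(row_key)


def keys_with_color(color):
    """Answer 'which rows have this color?' from the index alone."""
    return sorted(color_index.get(color, ()))


insert("car1", {"color": "red", "make": "A"})
insert("car2", {"color": "blue", "make": "B"})
insert("car3", {"color": "red", "make": "C"})
red_keys = keys_with_color("red")
```

The point of the sketch is that the index is written at insert time for a query known in advance; the query itself then becomes a single lookup by key.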

Re: Cassandra questions

2010-06-17 Thread Ran Tavory
On Thu, Jun 17, 2010 at 9:09 PM, F. Hugo Zwaal wrote: > Hi, > > Being fairly new to Cassandra I have a couple of questions: > > 1) Is there a way to remove multiple keys/rows in one operation (batch) or > must keys be removed one by one? > yes, batch_mutate > 2) I see API references to version 0

Cassandra questions

2010-06-17 Thread F. Hugo Zwaal
Hi, Being fairly new to Cassandra I have a couple of questions: 1) Is there a way to remove multiple keys/rows in one operation (batch) or must keys be removed one by one? 2) I see API references to version 0.7, but I couldn't find an alpha or beta anywhere? Does it exist already and if so, wher

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
Total data size in the entire cluster is about twenty 12k images. With no other load on the system. I just ask for one column and I get these timeouts. Performing multiple gets on the columns leads to multiple timeouts for a period of a few seconds or minutes and then the situation magically resolv

Re: Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
Cassandra 0.6.2 from the apache debian source. Ubuntu Jaunty. Sun Java6 jvm. All nodes in separate racks at 365 main. On Thu, Jun 17, 2010 at 10:12 AM, AJ Slater wrote: > I'm seeing 10s timeouts on reads a few times a day. It's hard to reproduce > consistently but seems to happen most often after i

Occasional 10s Timeouts on Read

2010-06-17 Thread AJ Slater
I'm seeing 10s timeouts on reads a few times a day. It's hard to reproduce consistently but seems to happen most often after it's been a long time between reads. After presenting itself for a couple minutes the problem then goes away. I've got a three node cluster with replication factor 2, reading at

Re: Best documentation for Java and Cassandra?

2010-06-17 Thread Ezra Epstein
Hi Anthony, As I understand it, Cassandra, like most other "no SQL" alternative data stores, does not support arbitrary queries. Translation: you can't directly query by field (that is to say, column) value. You are left with doing a "row scan" client side. Hence you need to know what queri

Re: Best documentation for Java and Cassandra?

2010-06-17 Thread Per Olesen
If you are new to Cassandra and to how it achieves its goals, I would recommend going with the "raw" thrift API to start with, until you get a good understanding of which API calls there are and how they work. I am sure there are nicer APIs, but they also somewhat hide some of the complexity, that I thin

Re: Some questions about using Cassandra

2010-06-17 Thread Sylvain Lebresne
On Thu, Jun 17, 2010 at 1:20 AM, Anthony Ikeda wrote: > Thanks Gary, I’m looking at that plug-in feature at the moment but there > seems to be very little documentation on how to use it. There is no documentation whatsoever. This is just a feature proposition right now (it's not included in Cassa

Re: Best documentation for Java and Cassandra?

2010-06-17 Thread Ran Tavory
I can offer Hector which I've authored and maintain with the help of a few other folks http://wiki.github.com/rantav/hector/ http://github.com/rantav/hector Feel free to post questions to our mailing list http://groups.google.com/group/hector-users On Thu, Jun 17, 2010 at 10:04 AM, Anthony Ikeda

Best documentation for Java and Cassandra?

2010-06-17 Thread Anthony Ikeda
I'm wondering if anyone can suggest the best resources for Java based Cassandra access. I've been able to create a client that can create data and retrieve a row but I'm struggling to understand how to set up queries based on criteria (e.g. return all keys that have the field "color" equal to "red