Re: ec2 tests

2010-05-28 Thread Mark Greene
If you give us an objective of the test that will help. Trying to get max
write throughput? Read throughput? Weak consistency?

On Thu, May 27, 2010 at 8:48 PM, Chris Dean ctd...@sokitomi.com wrote:

 I'm interested in performing some simple performance tests on EC2.  I
 was thinking of using py_stress and Cassandra deployed on 3 servers with
 one separate machine to run py_stress.

 Are there any particular configuration settings I should use?  I was
 planning on changing the JVM heap size to reflect the Large Instances
 we're using.

 Thanks!

 Cheers,
 Chris Dean



Re: ec2 tests

2010-05-28 Thread Chris Dean
Mark Greene green...@gmail.com writes:
 If you give us an objective of the test that will help. Trying to get max
 write throughput? Read throughput? Weak consistency?

I would like reading to be as fast as I can get.  My real-world problem
is write heavy, but the latency requirements are minimal on that side.
If there are any particular config settings that would help with the slow
EC2 I/O, that would be great to know.

Cheers,
Chris Dean


Re: Batch_Mutate throws Uncaught exception

2010-05-28 Thread Moses Dinakaran
Hi, sorry for the post. I had misunderstood the key and the column family.

I was thinking cache_pages was the column family and Page was the key,
but it's the other way around, right?

I will update my code and check it again.

Thanks,
Moses

On Thu, May 27, 2010 at 3:46 PM, Mishail mishail.mish...@gmail.com wrote:

 Hi,

 Just to clarify. Are you trying to insert a couple of columns with key
 cache_pages in the ColumnFamily Page?

 Moses Dinakaran wrote:
 Hi,

  I am trying to use batch_mutate() with PHP Thrift. I was getting the
  following error.
 




remove a row

2010-05-28 Thread huajun qi
Is there any way to remove a row completely?

I use the Thrift client's remove method, but it only deletes the columns under a
row; the row with its key is still there.

How can I remove it completely?



Re: remove a row

2010-05-28 Thread gabriele renzi
On Fri, May 28, 2010 at 11:05 AM, huajun qi qih...@gmail.com wrote:
 Is there any way to remove a row completely?
 I use the Thrift client's remove method, but it only deletes the columns under a
 row; the row with its key is still there.
 How can I remove it completely?


You can't really, with the Thrift API; see
 http://spyced.blogspot.com/2010/02/distributed-deletes-in-cassandra.html
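
For context: a delete in Cassandra is recorded as a tombstone, so a removed row's key can keep showing up in range scans with no columns until the tombstone is garbage-collected. A common client-side workaround is to skip rows that come back empty. The sketch below assumes a hypothetical result shape (a dict of row key to list of columns), not the actual Thrift return type:

```python
def live_rows(range_result):
    """Drop rows that came back with no columns (likely deleted rows)."""
    return {key: cols for key, cols in range_result.items() if cols}


# Hypothetical scan result: 'page2' was removed, so only its key remains.
scan = {
    "page1": [("title", "Home")],
    "page2": [],
    "page3": [("title", "About")],
}

print(sorted(live_rows(scan)))  # ['page1', 'page3']
```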


-- 
blog en: http://www.riffraff.info
blog it: http://riffraff.blogsome.com


Re: ec2 tests

2010-05-28 Thread Mark Greene
First thing I would do is stripe your EBS volumes. I've seen blogs that say
this helps and blogs that say it's fairly marginal. (You may want to try
Rackspace Cloud, as their local storage is much faster.)

Second, I would start out with N=2 and set W=1 and R=1. That will mirror
your data across two of the three nodes and possibly give you stale data on
the reads. If you feel you need stronger durability, increase N and W.
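
To illustrate why N=2, W=1, R=1 can return stale data: a read is only guaranteed to overlap the most recent successful write when R + W > N. A minimal sketch of that check (the function name is made up for illustration):

```python
def reads_overlap_writes(n, w, r):
    """True when every read set must intersect every write set (R + W > N)."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must be between 1 and N")
    return r + w > n


print(reads_overlap_writes(2, 1, 1))  # False: a read may miss the latest write
print(reads_overlap_writes(3, 2, 2))  # True: quorum reads and writes overlap
```

With N=2, either W=2 or R=2 would be enough to close the gap, at the cost of latency on that side.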

As for heap memory, don't give the JVM 100% of the available physical RAM.
Remember, the object heap is smaller than the overall memory footprint of the JVM process.

That should get you started.


On Fri, May 28, 2010 at 3:10 AM, Chris Dean ctd...@sokitomi.com wrote:

 Mark Greene green...@gmail.com writes:
  If you give us an objective of the test that will help. Trying to get max
  write throughput? Read throughput? Weak consistency?

 I would like reading to be as fast as I can get.  My real-world problem
 is write heavy, but the latency requirements are minimal on that side.
 If there are any particular config settings that would help with the slow
 EC2 I/O, that would be great to know.

 Cheers,
 Chris Dean



Avro: C# support

2010-05-28 Thread Stephan Pfammatter
Q: Are there plans for Avro to support C#?


Re: Avro: C# support

2010-05-28 Thread Eric Hauser
There is a JIRA ticket for .NET support in Avro:
https://issues.apache.org/jira/browse/AVRO-533

On Fri, May 28, 2010 at 10:01 AM, Stephan Pfammatter 
stephan.pfammat...@logmein.com wrote:

  Q: Are there plans for Avro to support C#?



how does communication between nodes works?

2010-05-28 Thread Gabriel Sosa
I've been trying to find in-depth documentation on how the nodes
communicate with each other.

I've been reading
http://wiki.apache.org/cassandra/ArchitectureInternals but I couldn't
find anything there.

I'm trying to learn how this works. Does it use Thrift, or is Thrift
only used for client communication?


thank you

-- 
Gabriel Sosa
If you want different results, don't always do the same thing. - Einstein


Re: Thoughts on adding complex queries to Cassandra

2010-05-28 Thread Jeremy Davis
I wonder if any of the main project committers would like to weigh in on
what a desired API would look like, or perhaps we should start an
unscheduled Jira ticket?

On Thu, May 27, 2010 at 5:39 PM, Jake Luciani jak...@gmail.com wrote:

 I had this:


 string slice_dice_reduce(1:required list<binary> key,
   2:required ColumnParent column_parent,
   3:required SlicePredicate predicate,
   4:required ConsistencyLevel consistency_level=ONE,
   5:required string dice_js,
   6:required string reduce_js)
 throws (1:InvalidRequestException ire,
   2:UnavailableException ue, 3:TimedOutException te),

 I guess it could use a union of sorts and return either.



 On Thu, May 27, 2010 at 8:36 PM, Jeremy Davis 
 jerdavis.cassan...@gmail.com wrote:


 I agree, I had more than filter results in mind.
 Though I had envisioned the results continuing to use
 List<ColumnOrSuperColumn> (and not JSON). You could still create new result
 columns that do not in any way exist in Cassandra, and you could still stuff
 JSON into any of the result columns.

 I had envisioned:
 list<ColumnOrSuperColumn> get_slice(keyspace, key, column_parent, predicate,
   consistency_level, javascript_blob)

 -JD





 On Thu, May 27, 2010 at 5:01 PM, Jake Luciani jak...@gmail.com wrote:

 I've secretly started working on this but nothing to show yet :( I'm
 calling it SliceDiceReduce or SliceReduce.

  The plan is to use the js thrift bindings I've added for the 0.3 release of
 thrift (out very soon?)

 This will allow the supplied js to access the results like any other
 thrift client.

 This means adding a new verb handler and SEDA stage that will execute on a
 local node and pass this node's slice data into the supplied js dice function
 via the thrift js bindings.

 The resulting js from each node would then be passed into another
 supplied js reduce function on the starting node.

 The result of this would then return a single JSON or string result.
  The reason I'm keeping the results in json is you can do more than filter.
 You can do things like word count etc.
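
The dice-then-reduce flow described above can be sketched with the word-count example. This is only an illustration in Python standing in for the user-supplied JavaScript; the function names and the shape of the per-node slice data (a list of column values) are assumptions, not part of the proposed API:

```python
from collections import Counter


def dice(node_slice):
    """Per-node step: count words in the column values held locally."""
    counts = Counter()
    for value in node_slice:
        counts.update(value.split())
    return counts


def reduce_counts(partials):
    """Coordinator step: merge the partial counts returned by each node."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


# Hypothetical slice data as seen by two different nodes.
node_slices = [["red fish", "blue fish"], ["one fish"]]
result = reduce_counts(dice(s) for s in node_slices)
print(result["fish"])  # 3
```

Only the small per-node counts travel to the starting node, which is the point of running the dice step where the data lives.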

 Anyway this is little more than an idea now. But if people like this
 approach maybe I'll get motivated!

 Jake





 On May 27, 2010, at 7:36 PM, Steve Lihn stevel...@gmail.com wrote:

 Mongo has it too. It could save a lot of development time if someone can
 figure out how to port Mongo's query API and stored JavaScript to Cassandra.
 It would be great if Scala's list comprehensions could be used to
 write query-like code against a Cassandra schema.

 On Thu, May 27, 2010 at 11:05 AM, Vick Khera vi...@khera.org wrote:

 On Thu, May 27, 2010 at 9:50 AM, Jonathan Ellis jbel...@gmail.com wrote:
  There definitely seems to be demand for something like this.  Maybe
 for 0.8?
 

 The Riak data store has something like this: you can submit queries
 (and map reduce jobs) written in javascript that run on the data nodes
 using data local to that node.  It is a very compelling feature.







Help to understand a strange behavior with DCQUORUM

2010-05-28 Thread Patricio Echagüe
Hi all, I need help understanding how DCQUORUM works.

This is my setup:

- Cluster with 3 Cassandra Nodes
- RF = 3
- ReplicaPlacementStrategy = RackUnawareStrategy

My test:
- I write/read with DCQUORUM

Results:
- While the 3 nodes are UP, all my writes and reads succeed. (The nodes are
reached, and the third one, to complete the RF=3, is done by replication,
right?)
- When I killed one node, the test FAILED with UnavailableException.
- When I performed the same test but with QUORUM instead of DCQUORUM, it
succeeded.
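
The QUORUM result is what simple arithmetic predicts: a quorum is a majority of the replicas, RF / 2 + 1 with integer division, so RF=3 needs 2 live replicas. A small sketch of that check (the function names are made up, and this says nothing about how DCQUORUM counts replicas):

```python
def quorum(rf):
    """Number of replicas that must respond for QUORUM."""
    return rf // 2 + 1


def can_satisfy_quorum(rf, live_replicas):
    """True when enough replicas are up to reach a plain QUORUM."""
    return live_replicas >= quorum(rf)


print(quorum(3))                 # 2
print(can_satisfy_quorum(3, 2))  # True: one node down, QUORUM still works
print(can_satisfy_quorum(3, 1))  # False
```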

Could someone please explain why reads and writes with DCQUORUM worked fine
while the 3 nodes were up and running but failed when 1 was down, even though
I have only one data center?

Thanks in advance

-- 
Patricio.-