Re: Why cassandra single node so slow?

2009-11-14 Thread Michael Greene
lazyboy had a thread-based connection pool last time I checked, so he should just have to thread his test loop for it to use multiple connections. MySQL should suffer the same general problem though, although I don't know the specifics of the MySQLdb module. Based on his test, it's accurate to sa

Re: out of memory error on malformed Thrift protocol

2009-11-13 Thread Michael Greene
This is a known issue and is out of Cassandra's specific hands. The Thrift issue is: http://issues.apache.org/jira/browse/THRIFT-601 The temporary workaround is "don't send random data to your Cassandra instance." Michael 2009/11/13 Ted Zlatanov : > The sequence to trigger the bug: > > 1) telnet

Re: java API?

2009-11-13 Thread Michael Greene
It would be helpful to know the specific errors you're experiencing. Michael On Fri, Nov 13, 2009 at 10:51 AM, Mark Vigeant wrote: > I’m completely new to Cassandra and I think the wiki and documentation are > really well done. Now I’m trying to construct an application to upload data > to uploa

Re: [VOTE] Website

2009-11-11 Thread Michael Greene
+1 On Wed, Nov 11, 2009 at 4:12 PM, Brandon Williams wrote: > On Wed, Nov 11, 2009 at 4:10 PM, Eric Evans wrote: >> >> The current website is quite ugly, and I don't know about you, but I'm >> itching to put the new project logo to use, so I'd like to propose >> publishing http://cassandra.deadc

Re: why does remove need a timestamp?

2009-11-08 Thread Michael Greene
The remove call does not need the timestamp of what you inserted, it just needs a timestamp *at least* that large. The current time will work. Cassandra needs this to ensure that you are not removing later-added data. e.g. Client 1 sends node X: Add 'a' at 11:00:00 Client 2 sends node Y: Remove
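
The timestamp rule described in this message can be illustrated with a toy last-write-wins store. This is a minimal sketch, not Cassandra's implementation: the ToyColumnStore class and its methods are hypothetical, and it assumes a remove only takes effect against data whose timestamp is less than or equal to the remove's own timestamp.

```python
class ToyColumnStore:
    """Hypothetical in-memory model of timestamped inserts and removes."""

    def __init__(self):
        # column name -> (timestamp, value); a value of None marks a removal
        self._cells = {}

    def insert(self, name, value, timestamp):
        current = self._cells.get(name)
        # Keep the write only if it is strictly newer than what is stored.
        if current is None or timestamp > current[0]:
            self._cells[name] = (timestamp, value)

    def remove(self, name, timestamp):
        current = self._cells.get(name)
        # A remove applies only to data that is not newer than the remove.
        if current is None or timestamp >= current[0]:
            self._cells[name] = (timestamp, None)

    def get(self, name):
        current = self._cells.get(name)
        return None if current is None else current[1]
```

In this model a remove stamped earlier than the stored value is ignored, which is why passing the current time works while passing a stale timestamp may not.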

Re: Incr/Decr Counters in Cassandra

2009-11-04 Thread Michael Greene
This has come up before at http://markmail.org/thread/w3mrh4h64xpf3vuj and http://markmail.org/message/vnmsuddlrhaziq7g I am in favor of adding eventually-consistent atomic operations such as this, but I'm not sure how one would implement it. Some sort of UUID + bloomfilter for the individual atom

Fwd: HBase vs. Cassandra: new article!

2009-10-29 Thread Michael Greene
Forwarding this along: -- Forwarded message -- From: Bradford Stephens Date: Thu, Oct 29, 2009 at 3:48 PM Subject: HBase vs. Cassandra: new article! To: hbase-u...@hadoop.apache.org, core-u...@hadoop.apache.org Hey there, Thought you guys would be interested in a new blog arti

Re: are columns of a supercolumn name sorted?

2009-10-25 Thread Michael Greene
"reversed" is a boolean option on the SliceRange that you pass in the get_slice method via its predicate parameter. Michael On Sun, Oct 25, 2009 at 8:26 PM, kevin wrote: > hi jonathan > thanks for the clarification. > subcolumns is what you want (with the reverse option to slice, to get
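
A rough model of what the reversed flag does to a slice, for readers who have not used it. ToySliceRange and slice_columns are hypothetical stand-ins, not the Thrift-generated SliceRange or the server logic. Only the existence of the reversed option comes from this message; treating empty strings as "no bound" and letting start act as the upper end in reversed mode are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class ToySliceRange:
    """Hypothetical stand-in for the Thrift SliceRange struct."""
    start: str = ""       # "" is treated here as "no bound"
    finish: str = ""      # "" is treated here as "no bound"
    reversed: bool = False
    count: int = 100


def slice_columns(names, rng):
    """Return the column names selected by rng, in the requested order."""
    ordered = sorted(names, reverse=rng.reversed)
    # In reversed mode the bounds swap roles: start is the upper end.
    if rng.reversed:
        low, high = rng.finish, rng.start
    else:
        low, high = rng.start, rng.finish
    selected = [
        n for n in ordered
        if (low == "" or n >= low) and (high == "" or n <= high)
    ]
    return selected[: rng.count]
```

With this model, a reversed range with count=2 over columns a, b, c, d yields d then c, which is the "latest N subcolumns" pattern the thread is discussing.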

Re: RMI occupies port 8080

2009-10-22 Thread Michael Greene
This is passed in on the command line, and can be found in bin/cassandra.in.sh if using the default scripts. Michael On Oct 22, 2009, at 5:08 AM, Johannes Schaback > wrote: Hi, I am not sure how far this is actually a RMI thing than concerns Cassandra, but Cassandra appears to start RMI on

Re: Starting in bootstrap mode...

2009-10-08 Thread Michael Greene
No, that's "trunk" or what will become 0.5 The 0.4 branch can be found here: http://svn.apache.org/repos/asf/incubator/cassandra/branches/cassandra-0.4/ A 0.4.1 release will be made off of that branch eventually. The 0.4.0 final release in SVN can be found here: http://svn.apache.org/repos/asf/i

Re: [VOTE] Project Logo

2009-10-05 Thread Michael Greene
I recognize my esteemed colleague in the Chicago delegation and echo his truths before putting forth my own ballot: ~~{ Ballot }~~ [ 10 ] 2    http://99designs.com/contests/28940/entries/002 [ 9 ] 30   http://99designs.com/contests/28940/en

Re: Cassandra and Oracle Coherence Comparison

2009-10-05 Thread Michael Greene
Briefly * Coherence is in-memory, Cassandra is persisted * Coherence has a transactional model, Cassandra is eventually consistent * Coherence has specially written adapters for different environments/languages, Cassandra supports most languages through Thrift * They both are distributed repositori

Re: data distribution among DataFileDirectories

2009-09-29 Thread Michael Greene
On Tue, Sep 29, 2009 at 2:46 PM, Jonathan Ellis wrote: > On Tue, Sep 29, 2009 at 2:22 PM, Igor Katkov wrote: >> 2. Is there a way to control key distribution, for the cases when >> hard-drives are of different capacity? > > No.  (That wouldn't be hard to add, but nobody's needed it.) This certai

Re: Cassandra cluster setup [WAS Re: usage]

2009-09-28 Thread Michael Greene
wrote: > At Mon, 28 Sep 2009 17:27:35 -0500, > Michael Greene wrote: >> >> This is a new thread continued from the Facebook-usage thread. >> > > sure > >> Cassandra automatically shards your data based on the Partitioner you >> have setup in storag

Cassandra cluster setup [WAS Re: usage]

2009-09-28 Thread Michael Greene
This is a new thread continued from the Facebook-usage thread. Cassandra automatically shards your data based on the Partitioner you have setup in storage-conf.xml. The copies are controlled by the ReplicationFactor setting in the same configuration file. If all your nodes are in the same data c
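
To make "sharding by partitioner plus N copies" concrete, here is a toy token ring. It is only a sketch under stated assumptions: hashing with MD5 and placing extra copies on the next nodes around the ring are choices made for illustration, the node names are hypothetical, and the real placement depends on the Partitioner and ReplicationFactor configured in storage-conf.xml.

```python
import bisect
import hashlib


def token(value):
    """Map a string to a position on the ring (MD5 chosen for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def replicas_for(key, nodes, replication_factor):
    """Return the nodes that would hold `key` in this toy model."""
    ring = sorted((token(n), n) for n in nodes)
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token, wrapping around the ring.
    first = bisect.bisect_left(tokens, token(key)) % len(ring)
    # Never ask for more copies than there are nodes.
    copies = min(replication_factor, len(ring))
    return [ring[(first + i) % len(ring)][1] for i in range(copies)]
```

Calling replicas_for("user42", ["node1", "node2", "node3"], 2) returns two distinct node names, and the same key always maps to the same pair.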

Re: usage

2009-09-28 Thread Michael Greene
You must have been talking to the wrong developers. Facebook open-sourced Cassandra in early 2008.  In August 2008 they wrote about this publicly, at http://www.new.facebook.com/note.php?note_id=24413138919 where they state: "First deployment of Cassandra system within Facebook was for the Inbox

Re: differences between keyspaces and tables

2009-09-25 Thread Michael Greene
t recommended to use the latest beta > version? > > Joe > > On Fri, Sep 25, 2009 at 1:00 PM, Michael Greene > wrote: > > They were renamed between 0.3 and 0.4. They are the same thing. > > See https://issues.apache.org/jira/browse/CASSANDRA-271 > > >

Re: differences between keyspaces and tables

2009-09-25 Thread Michael Greene
They were renamed between 0.3 and 0.4. They are the same thing. See https://issues.apache.org/jira/browse/CASSANDRA-271 Michael On Fri, Sep 25, 2009 at 2:58 PM, Joe Van Dyk wrote: > Hi, > > In some storage.conf's, I see and in others I see . > > Are they the same thing? > > -- > Joe Van Dyk > h

Re: triggers in cassandra

2009-09-25 Thread Michael Greene
cks for a column, so when there is any other guy > > changing the data I can get notified. That way I don't need to do > continuous > > polling. > > > > Is there any functionality right now that I could use to implement this? > > > > Thanks > > -

Re: perfomance issue

2009-09-25 Thread Michael Greene
Thanks for the results. Perhaps you could shed further light: Is this a single node system? Is the log level changed from DEBUG to INFO? Are the commit log and data directories on the same drive? Are the sets/gets being processed interleaved in parallel, or one then the other? Note that writes are

Re: triggers in cassandra

2009-09-25 Thread Michael Greene
Hector, Can you describe explicitly what you'd want to see from callouts/triggers? One of the reasons I advocated for removal is that no one had a need for it or was working on it in the open source project. What's your scenario? Michael On Fri, Sep 25, 2009 at 9:40 AM, Eric Evans wrote: > On

Re: Got Logo?

2009-09-16 Thread Michael Greene
It has been discussed and there are several proposals on http://issues.apache.org/jira/browse/CASSANDRA-231 I personally like a few of them. We're still looking for some direction or perhaps a vote -- chime in if you're interested. Michael On Wed, Sep 16, 2009 at 4:26 PM, Sal Fuentes wrote: > J

Re: how to get the super row slice in the new cassandra

2009-08-22 Thread Michael Greene
Sorry, typo: SlicePredicate On Sat, Aug 22, 2009 at 8:28 AM, Michael Greene wrote: > predicate is a SlicePredice object and consistency_level is a > ConsistencyLevel enum > It is helpful to look back at the cassandra.thrift definition instead of > only consuming the generated c

Re: how to get the super row slice in the new cassandra

2009-08-22 Thread Michael Greene
predicate is a SlicePredice object and consistency_level is a ConsistencyLevel enum It is helpful to look back at the cassandra.thrift definition instead of only consuming the generated code. We are working on adding more documentation to the interface though. Michael On Sat, Aug 22, 2009 at 4:2

Re: quorum read timeout

2009-08-20 Thread Michael Greene
That's not precisely true. See https://issues.apache.org/jira/browse/CASSANDRA-44 for techniques that can be used to modify the current column families. Eventually this will be made more dynamic. Michael On Thu, Aug 20, 2009 at 6:37 PM, Jun Rao wrote: > You should be aware that Cassandra doesn

Re: Cassandra performance

2009-08-19 Thread Michael Greene
re is > only one call per second." > > the internal Cassandra MessagingService uses nonblocking io, but the > Thrift stuff is just your standard thread pool with blocking sockets. > > On Wed, Aug 19, 2009 at 5:32 PM, Huming Wu wrote: > > On Tue, Aug 18, 2009 at 12:02 P

Re: Pls, help with fetching of super-column's value

2009-08-19 Thread Michael Greene
start and finish in SliceRange are non-optional. Try empty strings. 2009/8/19 Teodor Sigaev : > Some more news, I added printing of stack trace to perl's client, and I see > that problem is in getting answer from server, not in sending. It breaks on > reading of exception (TMessageType::EXCEPTION

Re: Cassandra performance

2009-08-18 Thread Michael Greene
According to the HBase guys and confirmed by http://java.sun.com/javase/technologies/hotspot/gc/gc_tuning_6.html#icms, on an 8-core machine you are not going to want to enable -XX:+CMSIncrementalMode -- it is for 1 or 2 core machines that are using the CMS GC. This should affect your latency numbe

Re: Cassandra performance

2009-08-17 Thread Michael Greene
What sort of slicing are you doing? This will impact CPU usage. Michael Huming Wu wrote: > I did some performance test and I am not impressed :). The data set is > 880K unique keys and there are 4 columns with 2 columns being string > and the other 2 are integers (from client side, to the backen

Re: Valid consistency level values on trunk

2009-08-11 Thread Michael Greene
and congratulations to Cassandra's local hero, Sammy Yu, who apparently had the same idea. On Tue, Aug 11, 2009 at 9:50 PM, Jonathan Ellis wrote: > Thanks for looking into it, though. > > On Tue, Aug 11, 2009 at 10:59 AM, Mark McBride wrote: >> My shot at hero status has been thwarted, due to a pr

Re: .NET client example

2009-08-04 Thread Michael Greene
The Java examples are easily portable. For what it's worth, I have been using C# with Cassandra for a while. I have some wrapper classes and connection management code that I'm still working on getting released, but for testing the code generated by Thrift is largely usable out of the box. Michael

Re: ERROR - Fatal error: Unable to load class org.apache.cassandra.db.marshal.Name for CompareWith attribute

2009-08-02 Thread Michael Greene
That's the error that would be printed if you had CompareWith="Name" set, which is invalid. Valid values for CompareWith are listed in the XML file, and are: AsciiType UTF8Type BytesType UUIDType LongType Michael On Sun, Aug 2, 2009 at 10:20 PM, Tom Melendez wrote: > Is your configuration valid
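
A small pre-flight check along these lines can catch the mistake before the server starts. This is a sketch, not a Cassandra tool: the set of valid values is copied from this message, while the ColumnFamily element name and its Name attribute are assumptions about the configuration layout.

```python
import xml.etree.ElementTree as ET

# Values listed in the message above.
VALID_COMPARE_WITH = {"AsciiType", "UTF8Type", "BytesType", "UUIDType", "LongType"}


def invalid_compare_with(xml_text):
    """Return (Name, CompareWith) pairs whose CompareWith is not in the list."""
    root = ET.fromstring(xml_text)
    problems = []
    # Assumed layout: <ColumnFamily Name="..." CompareWith="..."/> elements.
    for cf in root.iter("ColumnFamily"):
        value = cf.get("CompareWith")
        if value is not None and value not in VALID_COMPARE_WITH:
            problems.append((cf.get("Name"), value))
    return problems
```

An empty result means every CompareWith attribute found uses one of the listed types.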

Re: Thrift interface: struct field names

2009-07-29 Thread Michael Greene
That's substantially true, yes. Any other utilities built on top of the original generated C# code will likely fail, or any introspection utilities expecting those field names, but at the protocol and transport layers (and on the Java server side) you will see no difference. Michael On Wed, Jul

Re: hadoop + cassandra

2009-07-24 Thread Michael Greene
Cassandra does not need Hadoop for functionality and is a "standalone" project. Hadoop is many things. One of those is HDFS, which as you describe is a GFS clone. Hadoop also includes a MapReduce implementation, job tracking, and various other services that a distributed system using it would ne

Re: how to add table in storage-conf.xml

2009-07-23 Thread Michael Greene
columnPath in 0.3 used to be defined as a composite string "columnFamilyName:columnName" or "columnFamilyName:superColumnName:columnName" Now in trunk/0.4 it is defined as a ColumnPath struct/object that can be instanced as Gasol has written below. Michael 2009/7/23 李楠 : > and > > package org.ap
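
The move from a composite string to a struct can be pictured as follows. ToyColumnPath is a hypothetical mirror of the ColumnPath struct, and its field names are assumed rather than taken from cassandra.thrift; the two string forms are the ones quoted in this message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyColumnPath:
    """Hypothetical mirror of the ColumnPath struct; field names are assumed."""
    column_family: str
    super_column: Optional[str] = None
    column: Optional[str] = None


def parse_legacy_path(path):
    """Convert a 0.3-style 'cf:col' or 'cf:super:col' string to a struct."""
    parts = path.split(":")
    if len(parts) == 2:
        return ToyColumnPath(parts[0], column=parts[1])
    if len(parts) == 3:
        return ToyColumnPath(parts[0], super_column=parts[1], column=parts[2])
    raise ValueError(f"expected 2 or 3 colon-separated parts, got {len(parts)}")
```

One practical difference worth noting: with separate fields, a column name no longer has to avoid the ":" separator.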

Re: idiomatic way to do zeitgeist kind of counter behavior in cassandra

2009-07-23 Thread Michael Greene
> On Wed, Jul 22, 2009 at 9:36 PM, Michael Greene > wrote: >> >> See this previous discussion of a related topic >> http://markmail.org/thread/w3mrh4h64xpf3vuj >> >> Michael >> >> On Wed, Jul 22, 2009 at 11:31 PM, wrote: >> > I m trying to f

Re: idiomatic way to do zeitgeist kind of counter behavior in cassandra

2009-07-22 Thread Michael Greene
See this previous discussion of a related topic http://markmail.org/thread/w3mrh4h64xpf3vuj Michael On Wed, Jul 22, 2009 at 11:31 PM, wrote: > I m trying to figure out how to implement a zeitgeist in cassandra > > How do we implement the counter, do we need to the get the value increment > and i

Re: a talk on building an email app on Cassandra

2009-07-20 Thread Michael Greene
That looks really great, thanks for sharing with us. Can you add that to http://wiki.apache.org/cassandra/ArticlesAndPresentations ? Michael On Mon, Jul 20, 2009 at 12:43 PM, Jun Rao wrote: > Last Friday, I gave an IEEE talk on an email app that we built on top of > Cassandra. Below is the link

Re: how do we updgrade cassandra

2009-07-19 Thread Michael Greene
There's 'snapshot' functionality coming in https://issues.apache.org/jira/browse/CASSANDRA-279 slated for 0.4, but to my knowledge no tooling in the works for later using that data. It should be as simple as tar'ing and restoring somewhere else after that. Michael On Jul 19, 2009, at 1:25

Re: how do we updgrade cassandra

2009-07-19 Thread Michael Greene
If you run 'ant clean' before running ant or Cassandra, it will ensure that no old artifacts are still around from the code that could be causing problems. To dump the db, just delete the db directories. There should be tooling for this in the future, but that's the most surefire way to know that

Re: Concurrent updates

2009-07-17 Thread Michael Greene
Even if CQL SET allowed for the operation you're describing, it's at odds with the availability and consistency constraints of Cassandra. Another process, somewhere else, could be reading and writing that frequency value at the same time. Reducing the operation to one statement does not make it tra
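
The hazard described here is the classic lost update. The deterministic sketch below replays one bad interleaving of two read-increment-write clients against a plain dictionary; it involves no Cassandra calls, and the "frequency" key simply echoes the value discussed in the thread.

```python
def lost_update_demo(initial=5):
    """Interleave two read-increment-write clients against one shared value."""
    store = {"frequency": initial}

    # Both clients read before either one writes.
    seen_by_a = store["frequency"]
    seen_by_b = store["frequency"]

    # Each writes back its own incremented copy; the second overwrites the first.
    store["frequency"] = seen_by_a + 1
    store["frequency"] = seen_by_b + 1

    return store["frequency"]
```

Two increments were issued, yet the stored value rises by only one. Folding the read and the write into a single statement on the client does not help if another process can still interleave on the server side.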

Re: one server or more servers?

2009-07-15 Thread Michael Greene
02 doesnt trying nodeprobe >> >> On Wed, Jul 15, 2009 at 4:03 PM, Anthony Molinaro >> wrote: >>> >>> Alternatively if you are using the 0.3 release you can point a browser >>> at port 7002 of one of the boxes and should see all the nodes in the >>

Re: one server or more servers?

2009-07-15 Thread Michael Greene
You can use the nodeprobe utility in bin/ to contact each node and make sure they see the same information. Run it with no arguments to see the commands you can pass it. There is also an open issue at https://issues.apache.org/jira/browse/CASSANDRA-252 for making this a little more automatic (ins

Re: problem running cassandra

2009-07-14 Thread Michael Greene
Cassandra uses many ports in successful operation. As Jonathan mentions, the bind to (for JMX) was successful. The other default ports are 7000, 7001, 9160 (used by Cassandra) and 8080 (used by JMX). Based on what you pasted, your server failed to bind to the storage port defined in storage-co
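
When a bind failure like this shows up, a quick check of which default ports are already taken can narrow things down. The sketch below is a generic helper, not part of Cassandra: the port list is copied from this message, the function names are hypothetical, and it tests TCP binds on the loopback address only, since the thread does not say which protocol or interface each port uses.

```python
import socket

# Default ports named in the message above.
DEFAULT_PORTS = [7000, 7001, 9160, 8080]


def tcp_port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
        return True


def busy_ports(ports=DEFAULT_PORTS, host="127.0.0.1"):
    """List the ports from `ports` that cannot be bound right now."""
    return [p for p in ports if not tcp_port_is_free(p, host)]
```

Running busy_ports() before starting the server lists any of the four ports that another process already holds on that address.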

Re: Scaling from 1 to x (was: one server or more servers?)

2009-07-14 Thread Michael Greene
Cassandra borrows many concepts from Dynamo, and its paper describes this well: http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf http://wiki.apache.org/cassandra/DataModelAndOperations contains some documentation that references block_for, but this documentation needs

Re: A few issues with the latest code

2009-06-28 Thread Michael Greene
Not sure about the port issue. You should be able to find all the defined ports in conf/storage-conf.xml I filed https://issues.apache.org/jira/browse/CASSANDRA-260 a couple days ago about the Cli/Cql problem. In a recent check-in, the API for reading all columns changed, and the Cli/Cql wasn't