Re: State Of: CQL - driver devs
On Sun, 20 Mar 2011 19:56:39 -0500 Eric Evans wrote: EE> (Hopefully) for the next version, we'll replace Thrift with a dedicated EE> protocol, one that eliminates the Thrift dependency, and more EE> importantly, implements streaming. This should be transparent to EE> applications for the most part though. That would be wonderful. I hope you'll consider HTTP as the transport protocol. But regardless, CQL (or whatever it's called in the end) is going to be a great feature for Cassandra. Thank you for working on it. Ted
Re: Reducing confusion around client libraries
On Sun, 12 Dec 2010 01:56:17 +0100 Bjorn Borud wrote: BB> (users ought to be named, because an anonymous "upvote" or "downvote" BB> conveys next to no meaningful information to me) Alternatively the votes could be kept as two separate sets for authenticated vs. anonymous users. Ted
Re: NoSQL, YesCQL?
On Fri, 29 Oct 2010 10:07:43 -0500 (CDT) "Stu Hood" wrote: SH> Most reasonable languages these days have a way to define what looks SH> like a DSL: giving people a text DSL which is subject to injection SH> attacks and can't be type checked without support from a client SH> driver anyway is brain dead. I don't think SQL-like query languages are DSLs in the classic sense. Injection attacks are a red herring: they are a client issue, not a library or a server problem. Type checking is a valid complaint and I think it's balanced out by the flexibility of a text protocol. SH> Regarding performance: assuming optimized RPC libraries (which we do SH> not yet have in Avro, and which Thrift is getting better at), SH> serializing to a string and back will never be as performant as SH> using a pre-parsed representation of the statement on both SH> sides. "Oh but we can add prepared statements!" Poppycock. Consider that Cassandra's statements will never be as complicated as a regular RDBMS, so parsing them efficiently is not so hard. The parameters can be attached to the query, not necessarily inlined. A native JDBC level 4 driver could be a very efficient answer to this problem, too. SH> The stated problem is that backwards compatibility is hard to SH> provide: if that is the core complaint, then changing to a text SH> based serialization format with a sexy name in order to add SH> backwards compatibility is a severe overreaction to the SH> problem. Instead, I would propose evolving the API in a manner that SH> simplifies it. I think it would be great to allow multiple APIs in Cassandra (when I proposed it in the past, it was not allowed, and AFAIK still isn't beyond Avro and Thrift). Then this wouldn't be a yes-or-no choice and the Thrift API would still be available to those who need it. Ted
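To illustrate the point about attaching parameters to a text query instead of inlining them, here is a minimal Python sketch. The JSON wire format, the function names, and the query syntax are all assumptions made up for this example; none of them is a real Cassandra protocol or API.

```python
import json


def encode_request(query, params):
    """Serialize a text query with its parameters carried separately.

    Hypothetical wire format (a JSON object); not a real Cassandra protocol.
    """
    # Naive placeholder count: it would miscount a '?' inside a string literal.
    placeholders = query.count("?")
    if placeholders != len(params):
        raise ValueError(
            "expected %d parameters, got %d" % (placeholders, len(params))
        )
    return json.dumps({"query": query, "params": list(params)}).encode("utf-8")


def decode_request(payload):
    """Recover the query text and the parameter list on the receiving side."""
    message = json.loads(payload.decode("utf-8"))
    return message["query"], message["params"]


# A hostile value travels as data and never becomes part of the statement text.
hostile = "x'; DROP everything; --"
payload = encode_request("SELECT name FROM users WHERE key = ?", [hostile])
query, params = decode_request(payload)
```

Because the value never gets spliced into the statement, the receiving side parses the same query text no matter what the client supplies as a parameter.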
Re: NoSQL, YesCQL?
On Fri, 29 Oct 2010 09:29:43 -0500 Gary Dusbabek wrote: GD> 2010/10/29 Ted Zlatanov : >> On Thu, 28 Oct 2010 14:46:15 -0700 Chip Salzenberg wrote: >> CS> Short answer: "YES Please, but we will still want a side channel for CS> minimum overhead." >> >> 100% agreed on both counts. But IIRC the fastest side channel is to >> become a Cassandra node. Is that an option? GD> Yes. We call it the fat client. It's a non-storage GD> gossip-participating cassandra node that speaks directly in terms of GD> RowMutations, etc. Sorry, I meant to ask Chip "is that an option for you, as opposed to a general side channel usable from any language?" Ted
Re: NoSQL, YesCQL?
On Thu, 28 Oct 2010 14:46:15 -0700 Chip Salzenberg wrote: CS> Short answer: "YES Please, but we will still want a side channel for CS> minimum overhead." 100% agreed on both counts. But IIRC the fastest side channel is to become a Cassandra node. Is that an option? CS> Long answer: Query languages only work reliably when you have data CS> binding assistance (insert "Bobby Tables" xkcd here). However, they do CS> have the wonderful property of evolving aggressively without requiring CS> upgrades of the driver plumbing. This is, of course, emphatically *not* CS> true of anything like the current Thrift and Avro interfaces. So that's CS> why I say "Yes." On the other hand, a very simple interface for very CS> simple queries has a lot of value, too; see, for example, CS> http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html CS> So that's why I think we will still want to bypass the full language for CS> minimum latency in some circumstances. I think the sane, reasonable, simple path is to make the query language as similar to SQL as possible (which EricQL seems to aim for). Just making the queries pure text would be terrific, in any case. Then a JDBC driver or a Perl DBD driver (and their parallels in Ruby, Python, etc.) would be so much easier to write and Cassandra clients wouldn't have to be so damn complicated. So I'd rather see specialized tools for minimum latency and overhead, especially for inserts and dumps (like MySQL provides mysqlinsert and mysqldump). Ted
Re: admin web UI
On Fri, 7 May 2010 09:24:40 +0200 gabriele renzi wrote: gr> On Thu, May 6, 2010 at 7:00 PM, Nathan McCall wrote: >> FYI - I asked a similar question in #cassandra-dev yesterday (based on >> this message thread actually) and was directed to this issue: >> https://issues.apache.org/jira/browse/CASSANDRA-754 gr> Interesting, but it seems the objection is more on the general idea of gr> "multiple APIs are a pain to maintain" than on the idea of having a gr> simple way for plugging external components with a lifecycle. gr> Maybe there is still space for such a minor patch? gr> (Also, though I understand the reasoning "one API should be enough for gr> everyone and two is too much for us" I don't see why such an option gr> should be ruled out for third parties, but I guess other people have gr> already put more thought than me in this) I'm sure if Cassandra maintainers heard from other people besides myself in favor of multiple APIs they would be more likely to listen. I've argued enough on that ticket. FWIW I would again contribute in that direction if there was a chance of acceptance. I am concerned about the operational risks of the Thrift API and about the state of the Avro API. Ted
Re: loading schema in trunk
On Tue, 13 Apr 2010 10:19:23 -0500 Ted Zlatanov wrote: TZ> I think everyone agrees loadSchemaFromXML can go away after 0.7 but just TZ> to be clear, you don't think Cassandra after 0.7 should come bundled TZ> with a tool that can dump, clear, and restore the schema? It's trivial TZ> to implement some very basic support for that without trying to provide TZ> a full management tool. TZ> I think it would be a big help for new users, troubleshooting (because TZ> you don't depend on toolkit X or language Y to know the true schema from TZ> the server's POV), those who want to share schema definitions and tests TZ> without external dependencies, and sysadmins who don't want to install TZ> another language to do a schema backup. I have had no reply to this so I'll just do it from the Perl side; I opened https://issues.apache.org/jira/browse/CASSANDRA-979 which is necessary to implement this tool from the client side. cassidy.pl, which is bundled with Net::Cassandra::Easy, already supports keyspace and family define/rename/delete operations so if this ticket is done then it (and other Thrift clients) can do schema introspection. At least in cassidy.pl I've implemented these commands:

(with active keyspace "system")
kdefine testks org.apache.cassandra.locator.RackUnawareStrategy 1 org.apache.cassandra.locator.EndPointSnitch
krename testks testks2
kdelete testks2

(with active keyspace "testks")
fdefine testcf Super LongType BytesType comment=statuschanges,row_cache_size=0,key_cache_size=2
frename testcf testcf2
fdelete testcf2

but as I said, it would be nice if there was a neutral format to express this schema. YAML would be best. Ted
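As a rough illustration of what a neutral, language-independent schema dump could look like, here is a minimal Python sketch of a dump/restore round trip. It uses JSON rather than YAML only because the Python standard library has no YAML module. The field names (`strategy_class`, `replication_factor`, `comparator`, and so on) and the reading of the positional arguments of `kdefine` and `fdefine` are assumptions for this example, not the actual Cassandra or cassidy.pl schema format.

```python
import json


def dump_schema(schema):
    """Render a schema description as stable, diff-friendly text."""
    return json.dumps(schema, indent=2, sort_keys=True)


def restore_schema(text):
    """Parse a dumped schema back into plain data structures."""
    return json.loads(text)


# Hypothetical layout mirroring the testks/testcf example; all keys are assumed.
schema = {
    "testks": {
        "strategy_class": "org.apache.cassandra.locator.RackUnawareStrategy",
        "replication_factor": 1,
        "snitch_class": "org.apache.cassandra.locator.EndPointSnitch",
        "column_families": {
            "testcf": {
                "column_type": "Super",
                "comparator": "LongType",
                "subcomparator": "BytesType",
                "comment": "statuschanges",
                "row_cache_size": 0,
                "key_cache_size": 2,
            }
        },
    }
}

dumped = dump_schema(schema)
restored = restore_schema(dumped)
```

Sorting the keys keeps the dump stable between runs, which makes it usable for sharing schema definitions or comparing two servers.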
Re: need help regarding Cassandra Setup in eclipse
On Fri, 16 Apr 2010 18:17:48 +0500 bilal ahmed wrote: ba> hi ba> i have started playing with Cassandra from couple of days. i downloaded ba> its binary and configured it successfully now i want to contribute in this ba> project but i m unable to configure ba> its source code in eclipse. i followed same steps which were written on ba> this page "http://wiki.apache.org/cassandra/RunningCassandraInEclipse" but ba> i m facing some issues. ba> issue is... ba> when i check out the code its directory structure looks like this... ba> project-name ba> | ba> src -> java ->org -> apache ->cassandra ba> but when i open any class its package statement looks like this... ba> package org.apache.cassandra.auth; or any other package ba> so here eclipse gives me error "*org.apache.cassandra.auth" does not match ba> the expected package "java.org.apache.cassandra.auth*" i tried a lot but i m ba> unable to resolve it. Your source folder should not be "src" but "src/java". You also need to add the "interface/thrift/gen-java" and "test/unit" source folders (only the former is necessary but the latter is good to have for code searches and to run the tests). Ted
dropped keyspace directories
Should the dropped keyspace directories, being empty, get rmdir()ed? When I run tests against a server the directory gets polluted but it's not really a bug so I wasn't sure if it's worth a ticket. Ted
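For what it's worth, the "remove only if empty" behavior is easy to get from the operating system, since rmdir refuses to delete a directory that still has entries. A minimal Python sketch follows; the function name is made up for this example and is not Cassandra code.

```python
import os
import tempfile


def remove_if_empty(path):
    """Remove a directory only if it has no entries; return True if removed.

    os.rmdir raises OSError when the directory is not empty (or is missing),
    so no separate listing step is needed. This sketch also swallows other
    errors such as permission problems, which real code should report.
    """
    try:
        os.rmdir(path)
        return True
    except OSError:
        return False


# Demonstration on throwaway directories.
empty_dir = tempfile.mkdtemp()
busy_dir = tempfile.mkdtemp()
marker = os.path.join(busy_dir, "data.db")
with open(marker, "w") as handle:
    handle.write("x")

removed_empty = remove_if_empty(empty_dir)
removed_busy = remove_if_empty(busy_dir)

# Clean up the directory that was (correctly) left in place.
os.remove(marker)
os.rmdir(busy_dir)
```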
Re: loading schema in trunk
On Tue, 13 Apr 2010 09:08:44 -0500 Eric Evans wrote: EE> On Tue, 2010-04-13 at 08:44 -0500, Gary Dusbabek wrote: >> 2010/4/13 Ted Zlatanov : >> > Should the functionality be exposed only through JMX, through nodetool, >> > or through cassandra-cli? I'll create the ticket if you like and then I >> > or whoever wants to can work on it. >> I prefer thrift (and nodetool), but I'd like to hear thoughts from the >> community. EE> If we're going to do this, I suggest a separate utility scoped at EE> migrating existing definitions from an 0.6 configuration (and nothing EE> else), and then deprecate it right off the bat (read: for removal in EE> 0.8). EE> The point here is to be as clear as possible that it is transient, and EE> shouldn't be adopted as a management tool. I think everyone agrees loadSchemaFromXML can go away after 0.7 but just to be clear, you don't think Cassandra after 0.7 should come bundled with a tool that can dump, clear, and restore the schema? It's trivial to implement some very basic support for that without trying to provide a full management tool. I think it would be a big help for new users, troubleshooting (because you don't depend on toolkit X or language Y to know the true schema from the server's POV), those who want to share schema definitions and tests without external dependencies, and sysadmins who don't want to install another language to do a schema backup. Ted
Re: loading schema in trunk
On Tue, 13 Apr 2010 07:57:42 -0500 Gary Dusbabek wrote: GD> 2010/4/13 Ted Zlatanov : >> I agree loadSchemaFromXML should go away, although IMHO there should be >> an easy way through something bundled with Cassandra (nodetool or >> cassandra-cli) to dump, wipe, and restore the schema even though the >> general schema support is punted to external tools. Are you against >> providing even that rudimentary support? GD> Not at all. Should the functionality be exposed only through JMX, through nodetool, or through cassandra-cli? I'll create the ticket if you like and then I or whoever wants to can work on it. Ted
Re: loading schema in trunk
On Mon, 12 Apr 2010 20:25:15 -0500 Gary Dusbabek wrote: GD> 2010/4/12 Ted Zlatanov : >> >> OK, I was hoping nodetool would support that operation. I wanted to use >> something on the same machine as the Cassandra instance so I can >> automate a complete install in QA, and jconsole won't work unattended >> AFAIK. I don't know JMX well so I'll look for something suitable; >> recommendations are welcome. GD> This was deliberate. I fully intend to deprecate loadSchemaFromXML in GD> 0.7+1 and remove it completely in 0.7+2. Hopefully by then the tool GD> support (provided by high-level clients) will be such that updating GD> the schema using thrift is a no-brainer. I already started work on it in Net::Cassandra::Easy but needed to keep things going with our current setup and jconsole wasn't working. I agree loadSchemaFromXML should go away, although IMHO there should be an easy way through something bundled with Cassandra (nodetool or cassandra-cli) to dump, wipe, and restore the schema even though the general schema support is punted to external tools. Are you against providing even that rudimentary support? Ted
Re: loading schema in trunk
On Tue, 13 Apr 2010 00:15:32 +0100 Ryan Daum wrote: RD> jmxterm is a nice cli jmx tool Sweet. And it supports tab-completion to boot. Thanks! Ted
Re: loading schema in trunk
On Mon, 12 Apr 2010 17:28:12 -0500 Eric Evans wrote: EE> On Mon, 2010-04-12 at 17:16 -0500, Ted Zlatanov wrote: >> In my checkout, nodetool can't load the schemas as the wiki suggests >> for 0.6 upgrades. Is that coming or planned? Or is the user supposed >> to put together their own JMX invocation manually? EE> I can't see where the wiki suggests that. This is the first I'd heard of EE> using nodetool. Those were two separate thoughts that got merged, sorry: 1) nodetool can't load the schemas 2) the wiki suggests to load the schemas for 0.6 upgrades EE> Any general purpose JMX client should work; I used jconsole. OK, I was hoping nodetool would support that operation. I wanted to use something on the same machine as the Cassandra instance so I can automate a complete install in QA, and jconsole won't work unattended AFAIK. I don't know JMX well so I'll look for something suitable; recommendations are welcome. Thanks Ted
Re: loading schema in trunk
On Mon, 12 Apr 2010 13:08:29 -0500 Ted Zlatanov wrote: >>> In trunk, the schema is loaded through Thrift. Is there a way to load >>> it from the storage-conf.xml AKA cassandra.xml file without writing >>> custom code? In my checkout, nodetool can't load the schemas as the wiki suggests for 0.6 upgrades. Is that coming or planned? Or is the user supposed to put together their own JMX invocation manually? Thanks Ted
Re: loading schema in trunk
On Mon, 12 Apr 2010 12:39:49 -0500 Eric Evans wrote: EE> On Mon, 2010-04-12 at 11:50 -0500, Ted Zlatanov wrote: >> In trunk, the schema is loaded through Thrift. Is there a way to load >> it from the storage-conf.xml AKA cassandra.xml file without writing >> custom code? EE> http://wiki.apache.org/cassandra/LiveSchemaUpdates Thanks, Eric. Thanks to Gary as well for doing all that work and documenting it. I didn't know about this page, though; I'll subscribe to the wiki's Recent Changes feed. Also, the StorageConfiguration page should probably point to LiveSchemaUpdates as well. Ted
loading schema in trunk
In trunk, the schema is loaded through Thrift. Is there a way to load it from the storage-conf.xml AKA cassandra.xml file without writing custom code? Thanks Ted
Re: Thrift out of memory crashes
On Fri, 26 Mar 2010 09:44:23 -0500 Jonathan Ellis wrote: JE> The workarounds we can apply at the Cassandra level have too high a JE> cost:benefit ratio. The long term fix is to move to Avro. Can you list the workarounds you've considered? Is TBinaryProtocol.setReadLength completely useless? Can we at least do a minimal sanity check of incoming messages? When the benefit is "you won't crash because someone telnetted to the wrong port" I'm willing to pay a pretty high cost. Ted
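To make "a minimal sanity check of incoming messages" concrete, here is a Python sketch of the general idea: validate a declared length before allocating a buffer for it. It assumes a generic 4-byte big-endian length prefix and an arbitrary size limit; it is not Thrift's actual code, not TBinaryProtocol.setReadLength, and not the patch mentioned in this thread.

```python
import io
import struct

MAX_MESSAGE_BYTES = 16 * 1024 * 1024  # assumed limit; tune per deployment


class MessageTooLarge(Exception):
    """Raised when a declared length is negative or exceeds the limit."""


def read_frame(stream, max_bytes=MAX_MESSAGE_BYTES):
    """Read one length-prefixed frame, rejecting implausible lengths
    before any buffer of that size is allocated."""
    header = stream.read(4)
    if len(header) != 4:
        raise EOFError("truncated length prefix")
    (length,) = struct.unpack(">i", header)  # signed 32-bit, big-endian
    if length < 0 or length > max_bytes:
        raise MessageTooLarge("refusing frame of %d bytes" % length)
    body = stream.read(length)
    if len(body) != length:
        raise EOFError("truncated frame body")
    return body


# A well-formed frame is read normally.
good = read_frame(io.BytesIO(struct.pack(">i", 5) + b"hello"))
```

With this check, stray text sent to the port (for example the bytes `GET ` from an HTTP client, which decode to a length far above the limit) is rejected with an ordinary exception instead of triggering a huge allocation.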
Re: Thrift out of memory crashes
On Fri, 26 Mar 2010 07:48:43 -0500 Jonathan Ellis wrote: JE> 2010/3/26 Ted Zlatanov : >> I know this has been discussed in tickets and here previously. I just >> wanted to comment on it because of the upcoming 0.6 release. >> >> In my environment I patch Cassandra to prevent the OOM errors from >> malformed incoming Thrift data, which as everyone knows let anyone crash >> the servers hard with a netcat invocation. For those who don't know the >> story, see https://issues.apache.org/jira/browse/THRIFT-601 >> >> I think the OOM guard should be in the Cassandra releases, at least as >> an option. Just because Thrift doesn't give us airbags doesn't mean we >> don't need brakes. JE> Catching OOME is a bug, not a fix. OOME is the JVM saying "I give up; JE> you're screwed." The JVM isn't stable anymore. I didn't know that, thanks for explaining. I thought the JVM could recover. Can we patch the Thrift-generated Java code, at least, set the read length, or do something else? I hate to give up on this just because Thrift is broken (as we've discussed, there's no viable Thrift replacement yet, and we won't allow users to replace the Thrift API with their own implementation as I proposed with IPluggableAPI). Thanks Ted
Thrift out of memory crashes
I know this has been discussed in tickets and here previously. I just wanted to comment on it because of the upcoming 0.6 release. In my environment I patch Cassandra to prevent the OOM errors from malformed incoming Thrift data, which as everyone knows let anyone crash the servers hard with a netcat invocation. For those who don't know the story, see https://issues.apache.org/jira/browse/THRIFT-601 I think the OOM guard should be in the Cassandra releases, at least as an option. Just because Thrift doesn't give us airbags doesn't mean we don't need brakes. Ted
Re: Standardizing Timestamps Across Clients
On Thu, 18 Mar 2010 13:20:34 -0700 Michael Malone wrote: MM> A standard default would be nice, but while we're making MM> recommendations I'd also suggest that client libs should make this MM> parameter easy to override. Client apps can do lots of interesting MM> things by setting timestamps explicitly. You can get a sort of quasi- MM> transaction by using the same timestamp for a set of operations, for MM> example. That's a good idea. I made the change in Net::Cassandra::Easy 0.05 (you just pass a subroutine reference to the constructor if you don't want the default microseconds). Thanks Ted
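A minimal Python sketch of the two ideas in this message: a client whose timestamp source can be overridden, and a group of operations that share one timestamp. The class and method names are hypothetical; this is not the Net::Cassandra::Easy API, and the `sent` list merely stands in for the network.

```python
import time


def microseconds_since_epoch():
    """Default timestamp source: microseconds since the UTC epoch."""
    return int(time.time() * 1_000_000)


class SketchClient:
    """Hypothetical client used only to illustrate timestamp handling."""

    def __init__(self, timestamp_fn=microseconds_since_epoch):
        self.timestamp_fn = timestamp_fn
        self.sent = []  # records (key, column, value, timestamp) tuples

    def mutate(self, key, column, value, timestamp=None):
        """Record one write, stamping it unless a timestamp is supplied."""
        ts = self.timestamp_fn() if timestamp is None else timestamp
        self.sent.append((key, column, value, ts))
        return ts

    def mutate_group(self, mutations):
        """Apply several writes under one shared timestamp."""
        ts = self.timestamp_fn()
        for key, column, value in mutations:
            self.mutate(key, column, value, timestamp=ts)
        return ts


# Override the default with a fixed source, then write a group.
client = SketchClient(timestamp_fn=lambda: 42)
shared = client.mutate_group([("row1", "a", "1"), ("row1", "b", "2")])
```

Passing the timestamp function to the constructor keeps the default behavior for most callers while leaving explicit control available to those who want it.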
Re: Standardizing Timestamps Across Clients
On Thu, 18 Mar 2010 02:36:35 -0500 Jonathan Hseu wrote: JH> Jonathan Ellis suggested that I bring this issue to the dev mailing list: JH> Cassandra should recommend a default timestamp across all client JH> libraries. ... JH> Here's what different clients are using:
JH> 1. Cassandra CLI: Milliseconds since UTC epoch.
JH> 2. lazyboy: Seconds since UTC epoch. It used to be seconds since local time epoch. Now it's changing again to microseconds since UTC epoch.
JH> 3. driftx's client: Milliseconds since UTC epoch.
JH> 4. The example app, Twissandra: Microseconds since UTC epoch.
JH> 5. pycassa: Microseconds since UTC epoch. It used to be seconds since local time epoch.
JH> 6. The most popular Cassandra Ruby client: Microseconds since UTC epoch.
It's good to standardize :) In Perl land, Net::Cassandra::Easy is using seconds but should be using microseconds. I'll change it for 0.4 (the underlying Thrift code will DTRT for the 64-bit encoding using Bit::Vector). Net::Cassandra uses seconds and should also be changed; CC-d to that module's maintainer. Ted
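A small worked example of why mixed units matter. Cassandra keeps the column version with the highest timestamp, so a client writing in seconds can silently lose to an older write made in microseconds. The `resolve` function below is a simplified stand-in for that rule (ties keep the current value in this sketch), and the client labels are made up.

```python
def resolve(current, incoming):
    """Keep the (value, timestamp) pair with the higher timestamp."""
    return incoming if incoming[1] > current[1] else current


epoch_seconds = 1_268_900_000  # roughly March 2010

# First write: a client that stamps in microseconds.
older_write = ("from-microsecond-client", epoch_seconds * 1_000_000)

# Second write, an hour later in wall-clock time, from a client using seconds.
newer_write = ("from-seconds-client", epoch_seconds + 3600)

winner = resolve(older_write, newer_write)
```

The later write loses because its timestamp is about a million times smaller, which is the practical argument for agreeing on a single unit.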
Re: GMane groups updated with new mailing list addresses
On Tue, 16 Mar 2010 14:26:10 -0500 Jonathan Ellis wrote: JE> 2010/3/16 Ted Zlatanov : >> I requested this yesterday and it's done: you can read the mailing lists >> through GMane again, they are changed to the new addresses. The >> addresses are (NNTP protocol) >> >> news.gmane.org:gmane.comp.db.cassandra.devel >> news.gmane.org:gmane.comp.db.cassandra.user >> >> This is very convenient if you don't want to subscribe to the mailing >> lists. JE> Thanks for getting that done! No problem. The dev group is showing cross posts from the ActiveMQ group due to a misconfiguration and I've already told the GMane admins, but Cassandra dev articles are flowing so this is a temporary inconvenience. The user list/group gateway is working great. Ted
GMane groups updated with new mailing list addresses
I requested this yesterday and it's done: you can read the mailing lists through GMane again, they are changed to the new addresses. The addresses are (NNTP protocol) news.gmane.org:gmane.comp.db.cassandra.devel news.gmane.org:gmane.comp.db.cassandra.user This is very convenient if you don't want to subscribe to the mailing lists. HTH Ted
Re: thinking about dropping hinted handoff
On Wed, 10 Mar 2010 15:59:55 -0600 Jonathan Ellis wrote: JE> Read-only for a specific client is completely different from trying to JE> read-only the entire node / cluster. So no, nothing wrong with that. Cool, thanks. See CASSANDRA-900 for my proposal. Ted