On Wed, Aug 22, 2012 at 4:53 PM, Christoph Hack <christ...@tux21b.org> wrote:

> Hi,
>
> I am currently developing a client for Cassandra's new native protocol
> in Go. Everything is working fine so far and I am quite happy with the
> new protocol. Good work! Here are just a couple of questions and notes:
>

That's great. The sooner this new protocol can take off, the better!

> 1) The native_protocol.spec should probably mention that it is based
> on TCP. I grepped the whole document for "UDP" and "TCP" and found
> nothing.
>

Good point. I'll see about getting that into the doc.

> 2) Streaming
>
> I'm currently allocating and reading one frame after another on the
> client side. The thing that worries me - if I have understood the
> current specification correctly - is that all rows are returned in a
> single frame. If the database is quite large, these frames might not
> fit into memory.
>

This may not be as bad as it sounds here, since CQL3 doesn't have wide rows
(the underlying storage engine rows may be wide, but those don't matter
here). Normal CQL queries which are expected to be performant (not too many
partition keys touched) should generally fit just fine in memory. But yes,
definitely it will be possible (and even sometimes necessary) to have
resultsets that are too big to handle. I thought about how to handle this
best in the Python driver, but I couldn't come up with a good general
solution; just lots of half-solutions that would only help in certain
cases. When CASSANDRA-4415 lands, it should help mitigate this
significantly.

> I was thinking about using buffered I/O instead, but that's probably
> not a good solution either, because then a single thread that iterates
> over the rows slowly will block the whole connection. In my opinion it
> would be a good idea to split the response into a couple of frames, so
> that other threads (with different stream ids) are still able to use
> the connection concurrently and push notifications are still
> delivered. A "last" flag might indicate the end of the response.
>

This is pretty much what #4415 will do.
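
To make the idea concrete, here is a small Go sketch of how a client might
route the frames of a split response to per-stream handlers. It is purely
hypothetical: the "last" flag (flagLast below) does not exist in the
current spec, and the dispatcher type and its methods are invented for
this example rather than taken from any driver.

```go
package main

// flagLast is a hypothetical flag bit marking the final frame of a response.
// The specification discussed in this thread defines no such flag.
const flagLast = 0x80

// dispatcher hands each incoming frame body to the handler registered for
// its stream id, so frames of different responses may be interleaved.
type dispatcher struct {
	handlers map[int8]func(body []byte, last bool)
}

func newDispatcher() *dispatcher {
	return &dispatcher{handlers: make(map[int8]func(body []byte, last bool))}
}

// register installs the handler for one in-flight request.
func (d *dispatcher) register(stream int8, h func(body []byte, last bool)) {
	d.handlers[stream] = h
}

// dispatch delivers one frame body. It returns false if no request is
// waiting on that stream id.
func (d *dispatcher) dispatch(stream int8, flags byte, body []byte) bool {
	h, ok := d.handlers[stream]
	if !ok {
		return false
	}
	last := flags&flagLast != 0
	if last {
		delete(d.handlers, stream) // the stream id can be reused from here on
	}
	h(body, last)
	return true
}
```

In this sketch the handlers run on the reader's goroutine, so a real client
would still need to keep them short (for example by handing the body to a
buffered channel) to avoid the blocking problem described above.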

> 3) Stability
>
> During the development of the client, I happened to send some invalid
> requests to the server. In particular, I sent a startup message
> with a body length of 0 (and no body afterwards). In those cases,
> Cassandra immediately started to use all of my 8 GB of RAM and spawned
> up to 1000 threads. The log files were full of "out of memory" and
> "couldn't spawn process" messages and it was quite difficult to kill
> Cassandra again.
>

Ouch, that's pretty bad. File a jira ticket for that, for sure. The binary
protocol will still be "beta" for the 1.2 release, and not enabled by
default, so it's not earth-stopping, but I'm sure we want it more stable
than THAT.
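
For the ticket, a reproduction along these lines might help. This Go
sketch only builds the bytes of the malformed request; it assumes the v1
draft header layout (request version 0x01, STARTUP opcode 0x01, 4-byte
big-endian length), and buildFrame and emptyStartupFrame are names made up
for this example.

```go
package main

import "encoding/binary"

// Assumed values from the v1 draft of the native protocol spec.
const (
	versionRequest = 0x01
	opStartup      = 0x01
)

// buildFrame serialises a request frame: an 8-byte header followed by body.
func buildFrame(opcode byte, stream int8, body []byte) []byte {
	frame := make([]byte, 8, 8+len(body))
	frame[0] = versionRequest
	frame[1] = 0 // no flags
	frame[2] = byte(stream)
	frame[3] = opcode
	binary.BigEndian.PutUint32(frame[4:8], uint32(len(body)))
	return append(frame, body...)
}

// emptyStartupFrame builds the malformed request described above:
// a STARTUP header announcing a zero-length body, with nothing after it.
func emptyStartupFrame() []byte {
	return buildFrame(opStartup, 0, nil)
}
```

Writing these eight bytes to a freshly opened connection should be enough
to show whether a given build still misbehaves, assuming the header layout
above matches the server's version of the spec.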

> 4) Prepared Statements
>
> It should be possible to prepare statements that do not take any
> arguments. This simplifies the client development significantly
> (otherwise everybody has to write his own parser to determine the
> number of arguments) and might also speed up common queries.
> The current implementation returns a "java.lang.AssertionError" error
> response in that case.
>

Agreed. I remember hearing about this happening with the thrift interface
too -
http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/issues/detail?id=21&can=1
came from there - but it doesn't look like anyone bothered to file a jira
ticket. Want to do that one too?
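
For reference, the kind of client-side workaround being described (working
out the number of arguments before deciding whether to prepare) could look
roughly like this in Go. It is only a sketch: countBindMarkers is a made-up
helper, it assumes '?' is the only bind marker, and it skips quoted text
but does not recognise comments.

```go
package main

// countBindMarkers counts '?' placeholders in a CQL query, skipping any that
// appear inside single-quoted strings or double-quoted identifiers.
// Simplification: comments are not recognised, so a '?' inside a comment
// would be counted.
func countBindMarkers(query string) int {
	count := 0
	var quote byte // 0 when outside quotes, otherwise the opening quote character
	for i := 0; i < len(query); i++ {
		c := query[i]
		switch {
		case quote != 0:
			// A doubled quote ('') closes and immediately reopens the
			// literal, which leaves the count unaffected.
			if c == quote {
				quote = 0
			}
		case c == '\'' || c == '"':
			quote = c
		case c == '?':
			count++
		}
	}
	return count
}
```

Even a small scanner like this has to track quoting rules that the server
already knows, which is the argument for letting the server accept
statements with no arguments.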

> Regards,
> Christoph
>

Best of luck with progress!

p
