Benjamin Black <b <at> b3k.us> writes:

> 
> I am only saying something obvious: if you don't have sufficient
> resources to handle the demand, you should reduce demand, increase
> resources, or expect errors.  Doing lots of writes without much heap
> space is such a situation (whether or not it is happening in this
> instance), but there are many others.  This constraint is not specific
> to Cassandra.  Hence, there is no free lunch.
> 
> b

I guess my point is that I have rarely run across database servers that die
from either too many client connections, or too rapid client requests.  They
generally stop accepting incoming connections when there are too many connection
requests, and further they do not queue and acknowledge an unbounded number of
client requests on any given connection.

In the example at hand, Julie has 8 clients, each of which is in a loop that
writes 100 rows at a time (via batch_mutate), waits for successful completion,
then writes another bunch of 100, until it completes all of the rows it is
supposed to write (typically 100,000).  So at any one time, each client should
have about 10 MB of request data in flight (100 rows x 100 KB/row), times 8
clients, for a maximum of no more than 80 MB of pending requests.
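
To spell out that arithmetic, here is a quick Python sketch.  The constants are
the figures quoted above; the function name is only illustrative, the units are
decimal (1 MB = 1000 KB) to match the round numbers, and the bound assumes each
client really has at most one batch outstanding at a time.

```python
ROWS_PER_BATCH = 100   # rows per batch_mutate call, as described above
ROW_SIZE_KB = 100      # approximate size of one row
CLIENTS = 8            # concurrent writer clients


def max_pending_mb(clients, rows_per_batch, row_size_kb):
    """Upper bound on in-flight request payload in MB.

    Assumes every client has at most one batch outstanding,
    i.e. it waits for completion before sending the next one.
    """
    per_client_kb = rows_per_batch * row_size_kb
    return clients * per_client_kb / 1000


print(max_pending_mb(1, ROWS_PER_BATCH, ROW_SIZE_KB))        # one client: 10.0
print(max_pending_mb(CLIENTS, ROWS_PER_BATCH, ROW_SIZE_KB))  # all clients: 80.0
```

This counts raw payload only.  Whatever overhead the server adds per request
or per row is not included, which is part of what I am asking about.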

Further, each request is running with CL=ALL, so in theory, the request should
not complete until each row has been handed off to the ultimate destination
node, and perhaps written to the commit log (that part is not clear to me).

It sounds like something else must be gobbling up either an unbounded amount
of heap, or alternatively, a bounded, but large amount of heap.  In the former
case it is unclear how to make the application robust.  In the latter, it would
be helpful to understand what the heap usage upper bound is, and what
parameters might have a significant effect on that value.

To clarify the history here -- initially we were writing with CL=0 and had
great performance but ended up killing the server.  It was pointed out that
we were really asking the server to accept and acknowledge an unbounded
number of requests without waiting for any final disposition of the rows.
So we had a "doh!" moment.  That is why we went to the other extreme of
CL=ALL, to let the server fully dispose of each request before acknowledging
it and getting the next.
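
A toy model of the difference, in Python.  This is not Cassandra code and says
nothing about what the server actually does internally: the function, its
parameters, and the rates (server_rate, send_rate) are all made up for
illustration.  It only shows why waiting for each acknowledgement caps the
backlog at one batch per client, while fire-and-forget lets it grow with
however fast the clients can send.

```python
def peak_backlog(clients, batches_per_client, wait_for_ack,
                 server_rate=1, send_rate=4):
    """Return the peak number of batches queued on a hypothetical server.

    server_rate: batches the server finishes per tick (assumed value)
    send_rate:   batches a fire-and-forget client sends per tick (assumed)
    """
    remaining = [batches_per_client] * clients   # batches not yet sent
    outstanding = [0] * clients                  # sent but not yet finished
    queue = []                                   # FIFO of client ids
    peak = 0
    while any(remaining) or queue:
        for c in range(clients):
            if wait_for_ack:
                # Send one batch only when the previous one has completed.
                n = 1 if remaining[c] and outstanding[c] == 0 else 0
            else:
                # Fire and forget: send as fast as the client can.
                n = min(send_rate, remaining[c])
            remaining[c] -= n
            outstanding[c] += n
            queue.extend([c] * n)
        peak = max(peak, len(queue))
        # The server completes up to server_rate batches this tick.
        for _ in range(min(server_rate, len(queue))):
            done = queue.pop(0)
            outstanding[done] -= 1
    return peak


bounded = peak_backlog(8, 10, wait_for_ack=True)
unbounded = peak_backlog(8, 10, wait_for_ack=False)
print(bounded, unbounded)
```

With wait_for_ack=True the peak can never exceed the number of clients, which
is the behaviour we expected to get from CL=ALL.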

TIA
-- Charlie



