Actually, you shouldn't expect errors in the general case, unless you
are simply trying to use data that can't fit in available heap. There
are some practical limitations, as always.

If there aren't enough resources on the server side to service the
clients, the expectation should be that the servers degrade gracefully,
or in the worst case throw an error specific to resource exhaustion or
explicit resource throttling. The fact that Cassandra does some
background processing complicates this a bit. There are things that
can cause errors after the fact, but these are generally considered
resource-tuning issues and are somewhat clear-cut. There are specific
changes in the works to bring background load exceptions into view of
a client session, where users normally expect them.

@see https://issues.apache.org/jira/browse/CASSANDRA-685
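
As a concrete illustration of what "expect errors" should mean on the
client side: treat a throttling or exhaustion error as retryable and
back off, rather than as a crash. The sketch below is generic Python
with placeholder names (ResourceExhaustedError, write_batch), not the
actual Thrift API.

    import random
    import time

    class ResourceExhaustedError(Exception):
        """Hypothetical stand-in for a server-side throttling or
        resource-exhaustion error; not a real Cassandra exception."""

    def write_batch(rows):
        # Placeholder for the real client call (e.g. a Thrift batch_mutate).
        # Here it just simulates the server occasionally shedding load.
        if random.random() < 0.2:
            raise ResourceExhaustedError("server is shedding load")

    def write_with_backoff(rows, max_retries=5):
        # Retry with exponential backoff instead of treating throttling as
        # fatal; give up only after repeated failures.
        delay = 0.1
        for _ in range(max_retries):
            try:
                write_batch(rows)
                return
            except ResourceExhaustedError:
                time.sleep(delay)
                delay *= 2  # back off so the cluster can catch up
        raise RuntimeError("giving up after %d retries" % max_retries)

    if __name__ == "__main__":
        write_with_backoff([{"key": "row-%d" % i} for i in range(100)])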

But otherwise, users shouldn't expect that simply increasing client
load can blow up their Cassandra cluster. Any time this happens, it
should be considered a bug or a misfeature. Devs, please correct me
here if I'm wrong.

Jonathan


On Tue, Jun 15, 2010 at 6:44 PM, Charles Butterfield
<charles.butterfi...@nextcentury.com> wrote:
> Benjamin Black <b <at> b3k.us> writes:
>
>>
>> I am only saying something obvious: if you don't have sufficient
>> resources to handle the demand, you should reduce demand, increase
>> resources, or expect errors.  Doing lots of writes without much heap
>> space is such a situation (whether or not it is happening in this
>> instance), but there are many others.  This constraint is not specific
>> to Cassandra.  Hence, there is no free lunch.
>>
>> b
>
> I guess my point is that I have rarely run across database servers that die
> from either too many client connections, or too rapid client requests.  They
> generally stop accepting incoming connections when there are too many
> connection requests, and further they do not queue and acknowledge an
> unbounded number of client requests on any given connection.
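>
> As a toy illustration of that bounded behavior (plain Python sockets,
> nothing Cassandra-specific), a conventional server loop looks roughly
> like this sketch:
>
>     import socket
>
>     server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>     server.bind(("127.0.0.1", 9999))
>     # The listen backlog bounds how many pending connections the OS will
>     # queue; beyond that, new connection attempts are turned away rather
>     # than being queued without limit.
>     server.listen(16)
>
>     while True:
>         conn, addr = server.accept()
>         # Read and fully handle one request before reading the next, so a
>         # single connection cannot pile up an unbounded amount of work.
>         request = conn.recv(65536)
>         conn.sendall(b"ack")
>         conn.close()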
>
> In the example at hand, Julie has 8 clients, each of which is in a loop that
> writes 100 rows at a time (via batch_mutate), waits for successful completion,
> then writes another bunch of 100, until it completes all of the rows it is
> supposed to write (typically 100,000).  So at any one time, each client should
> have about 10 MB of requests in flight (100 rows x 100 KB/row), times 8
> clients, for a maximum of no more than 80 MB of pending requests.
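>
> Roughly, each client's loop amounts to the sketch below; the client object
> and row layout here are placeholders for illustration, not our actual test
> code, which goes through Thrift batch_mutate at CL=ALL:
>
>     TOTAL_ROWS = 100000
>     BATCH_SIZE = 100
>     ROW_BYTES = 100 * 1024      # ~100 KB per row
>
>     class FakeClient(object):
>         def batch_mutate(self, rows):
>             # Stand-in for the real Thrift call; in the real test this
>             # blocks until the write is acknowledged at CL=ALL.
>             pass
>
>     def run_client(client):
>         # Write 100,000 rows in batches of 100, waiting for each batch to
>         # be acknowledged before sending the next one.
>         for start in range(0, TOTAL_ROWS, BATCH_SIZE):
>             rows = {"row-%08d" % i: "x" * ROW_BYTES
>                     for i in range(start, start + BATCH_SIZE)}
>             client.batch_mutate(rows)   # blocks until acknowledged
>
>     run_client(FakeClient())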
>
> Further, each request is running with CL=ALL, so in theory the request
> should not complete until each row has been handed off to the ultimate
> destination node, and perhaps written to the commit log (that part is not
> clear to me).
>
> It sounds like something else must be gobbling up either an unbounded amount
> of heap, or alternatively, a bounded but large amount of heap.  In the former
> case it is unclear how to make the application robust.  In the latter, it
> would be helpful to understand what the heap usage upper bound is, and what
> parameters might have a significant effect on that value.
>
> To clarify the history here -- initially we were writing with CL=0 and had
> great performance but ended up killing the server.  It was pointed out that
> we were really asking the server to accept and acknowledge an unbounded
> number of requests without waiting for any final disposition of the rows.
> So we had a "doh!" moment.  That is why we went to the other extreme of
> CL=ALL, to let the server fully dispose of each request before acknowledging
> it and getting the next.
>
> TIA
> -- Charlie
