Re: Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL

2010-12-08 Thread Reverend Chip
On 12/8/2010 7:30 AM, Jonathan Ellis wrote:
> On Tue, Dec 7, 2010 at 4:00 PM, Reverend Chip wrote:
>> Full DEBUG level logs would be a space problem; I'm loading at least 1T
>> per node (after 3x replication), and these events are rare. Can the
>> DEBUG logs be limited to the specific modules hel

Re: Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL

2010-12-08 Thread Jonathan Ellis
On Tue, Dec 7, 2010 at 4:00 PM, Reverend Chip wrote:
> On 12/7/2010 1:10 PM, Jonathan Ellis wrote:
>> I'm inclined to think there's a bug in your client, then.
>
> That doesn't pass the smell test. The very same client has logged
> timeout and unavailable exceptions on other occasions, e.g. when

Re: Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL

2010-12-07 Thread Reverend Chip
On 12/7/2010 1:10 PM, Jonathan Ellis wrote:
> I'm inclined to think there's a bug in your client, then.
That doesn't pass the smell test. The very same client has logged timeout and unavailable exceptions on other occasions, e.g. when there are too many clients or (in a previous configuration) wh

Re: Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL

2010-12-07 Thread Jonathan Ellis
I'm inclined to think there's a bug in your client, then. DEBUG-level logs could confirm or refute this by logging for each insert how many replicas are being blocked for, which nodes it got responses from, and whether a TimedOutException from not getting ALL replies was returned to the client. O

Re: Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL

2010-12-07 Thread Reverend Chip
No, I'm afraid that's not it:
replica_placement_strategy: org.apache.cassandra.locator.SimpleStrategy
replication_factor: 3
On 12/7/2010 6:37 AM, Jonathan Ellis wrote:
> If you are using NetworkTopologyStrategy you are probably hitting
> https://issues.apache.org/jira/browse/CASSANDRA-1804 whi

Re: Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL

2010-12-07 Thread Jonathan Ellis
If you are using NetworkTopologyStrategy you are probably hitting https://issues.apache.org/jira/browse/CASSANDRA-1804 which is fixed in rc2.
On Mon, Dec 6, 2010 at 6:58 PM, Reverend Chip wrote:
> I'm running a big test -- ten nodes with 3T disk each. I'm using
> 0.7.0rc1. After some tuning hel

Node goes AWOL briefly; failed replication does not report error to client, though consistency=ALL

2010-12-06 Thread Reverend Chip
I'm running a big test -- ten nodes with 3T disk each. I'm using 0.7.0rc1. After some tuning help (thanks Tyler) lots of this is working as it should. However a serious event occurred as well -- the server froze up -- and though mutations were dropped, no error was reported to the client. Here'