I suspect the issue in my case is saturation, or that the node has gone down. In the case of saturation, sleeping and retrying seems to be the fix. Otherwise, set up a new transport/protocol/client to a different host in the cluster.

OK, got it.
Ian
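
For the archives, here is a minimal sketch of that approach, not a definitive implementation. The exception classes are hypothetical stand-ins for the Thrift-generated ones so the snippet runs on its own, and `call_with_failover`, `connect`, `operation`, `retries_per_host` and `base_delay` are names I made up for illustration. It assumes a timeout or unavailable error may just be saturation (back off and retry on the same client), while a transport failure means the connection is dead (rebuild everything against the next host).

```python
import time


# Hypothetical stand-ins for the Thrift-generated exception classes, so the
# sketch runs without a Cassandra client library installed.
class UnavailableException(Exception):
    pass


class TimedOutException(Exception):
    pass


class TransportError(Exception):
    """Stand-in for a generic Thrift transport failure."""


def call_with_failover(hosts, connect, operation,
                       retries_per_host=3, base_delay=0.5, sleep=time.sleep):
    """Run operation(client), retrying on one host before failing over.

    connect(host) must build a fresh transport/protocol/client for that host.
    """
    last_error = None
    for host in hosts:
        try:
            client = connect(host)
        except TransportError as exc:
            last_error = exc
            continue  # cannot reach this host at all, try the next one
        for attempt in range(retries_per_host):
            try:
                return operation(client)
            except (TimedOutException, UnavailableException) as exc:
                # Possibly saturation: back off, then retry on the same client.
                last_error = exc
                if attempt < retries_per_host - 1:
                    sleep(base_delay * (2 ** attempt))
            except TransportError as exc:
                # Connection is broken: abandon this client and fail over.
                last_error = exc
                break
    raise last_error if last_error else RuntimeError("no hosts given")
```

In real use, `connect` would open the TTransport, wrap it in a protocol and return a `Cassandra.Client`, and `operation` would be something like `lambda c: c.batch_mutate(...)` with whatever arguments your version of the API expects.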

On Thu, May 13, 2010 at 2:20 PM, Roger Schildmeijer <schildmei...@gmail.com> wrote:

> All the exceptions are documented on the API page
> (http://wiki.apache.org/cassandra/API) on the wiki.
>
> * UnavailableException -- "Not all the replicas required could be created
>   and/or read."
> * TimedOutException -- "The node responsible for the write or read did not
>   respond during the rpc interval specified in your configuration (default
>   10s). This can happen if the request is too large, the node is oversaturated
>   with requests, or the node is down but the failure detector has not yet
>   realized it (usually this takes < 30s)."
>
> It's hard to give a generic solution. The "proper course of action"
> depends on your application domain.
> As stated on the wiki, a TimedOutException can have several different causes:
>
> * "request is too large" -- Proposal: try to narrow your request.
>
> * "node is oversaturated with requests" -- Proposal: are you using the order
>   preserving partitioner? Try the random partitioner for better load balancing.
>   Do you need more nodes in your cluster?
>
> * "node is down but the failure detector has not yet realized it" -- have you
>   altered the phi constant in o.a.c.gms.FailureDetector
>   (phiConvictThreshold_, default == 8)?
>
> // Roger Schildmeijer
>
> On 13 May 2010, at 19:53, Ian Soboroff wrote:
>
> I searched the wiki and the mailing list archives a bit but couldn't find
> the answer.
>
> If I catch an exception from a Cassandra.Client method, in my case
> batch_mutate, what's the proper course of action?
>
> Ignoring InvalidRequestException, we have Unavailable, TimedOut, and
> generic Thrift exceptions.
>
> Do I just gin up a new client? Do I need to rebuild the TTransport/TProtocol
> bits as well?
>
> Thanks,
> Ian