First work out whether the timeout is between the client and the server, or 
between the server nodes. 

If it's server side, a TimedOutException will be thrown from Thrift. Take a look 
at nodetool tpstats on the servers; you will probably see lots of "Pending" 
tasks, which basically means the cluster is overloaded. Consider:

* checking the IO, CPU and GC state on the servers. 
* ensuring the data and requests are evenly spread around the cluster. 
* reducing the number of columns read in a select. 
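
To tell the two kinds of timeout apart in client code, one option is to walk 
the exception's cause chain. What follows is only a minimal sketch, not Hector's 
own API: the class TimeoutClassifier, the enum Kind and the nested stand-in 
TimedOutException are hypothetical names made up for illustration. It assumes a 
server-side timeout shows up somewhere in the chain as a class named 
TimedOutException (as the Thrift one is), and that a client-to-server timeout 
shows up as java.net.SocketTimeoutException. A client library may wrap either 
in its own exception type, which is why the whole chain is checked.

```java
import java.net.SocketTimeoutException;

final class TimeoutClassifier {

    /** Where the timeout appears to have happened. */
    enum Kind { SERVER_SIDE, CLIENT_SIDE, UNKNOWN }

    /**
     * Stand-in for the Thrift TimedOutException so this sketch compiles
     * without the Cassandra jars. Hypothetical, for the demo only.
     */
    static class TimedOutException extends Exception { }

    /** Walks the cause chain and reports the first timeout type it finds. */
    static Kind classify(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            // Assumption: matching on the simple class name is enough to
            // recognise the server-side timeout raised through Thrift.
            if (t.getClass().getSimpleName().equals("TimedOutException")) {
                return Kind.SERVER_SIDE;
            }
            // A socket read or connect timeout between client and server.
            if (t instanceof SocketTimeoutException) {
                return Kind.CLIENT_SIDE;
            }
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        Throwable server = new RuntimeException("wrapped", new TimedOutException());
        Throwable client = new RuntimeException("wrapped",
                new SocketTimeoutException("Read timed out"));

        System.out.println(classify(server));                          // SERVER_SIDE
        System.out.println(classify(client));                          // CLIENT_SIDE
        System.out.println(classify(new IllegalStateException("x")));  // UNKNOWN
    }
}
```

If most of the timeouts classify as server side, the tpstats and load checks 
above are the place to look; if they are client side, the network path and the 
client's socket timeout are the more likely suspects.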

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/04/2012, at 5:30 AM, Daning Wang wrote:

> Hi all,
> 
> We are using Hector and often we see lots of timeout exceptions in the log. I 
> know that Hector can fail over to another node, but I want to reduce the 
> number of timeouts.
> 
> Is there any Hector parameter I should change to reduce this error?
> 
> Also, on the server side, is there any tuning needed for the timeouts?
>  
> 
> Thanks in advance.
> 
> 
> 12/04/04 15:13:20 ERROR 
> com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
> 12/04/04 15:13:25 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN 
> TRIGGERED for host 10.28.78.123(10.28.78.123):9160
> 12/04/04 15:13:25 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: Pool state on 
> shutdown: 
> <ConcurrentCassandraClientPoolByHost>:{10.28.78.123(10.28.78.123):9160}; 
> IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:44 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN 
> TRIGGERED for host 10.240.113.171(10.240.113.171):9160
> 12/04/04 15:13:44 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: Pool state on 
> shutdown: 
> <ConcurrentCassandraClientPoolByHost>:{10.240.113.171(10.240.113.171):9160}; 
> IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN 
> TRIGGERED for host 10.28.78.123(10.28.78.123):9160
> 12/04/04 15:13:46 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: Pool state on 
> shutdown: 
> <ConcurrentCassandraClientPoolByHost>:{10.28.78.123(10.28.78.123):9160}; 
> IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN 
> TRIGGERED for host 10.123.83.114(10.123.83.114):9160
> 12/04/04 15:13:46 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: Pool state on 
> shutdown: 
> <ConcurrentCassandraClientPoolByHost>:{10.123.83.114(10.123.83.114):9160}; 
> IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:46 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN 
> TRIGGERED for host 10.6.115.239(10.6.115.239):9160
> 12/04/04 15:13:46 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: Pool state on 
> shutdown: 
> <ConcurrentCassandraClientPoolByHost>:{10.6.115.239(10.6.115.239):9160}; 
> IsActive?: true; Active: 1; Blocked: 0; Idle: 5; NumBeforeExhausted: 19
> 12/04/04 15:13:49 ERROR 
> com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
> 12/04/04 15:13:49 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN 
> TRIGGERED for host 10.120.205.48(10.120.205.48):9160
> 12/04/04 15:13:49 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: Pool state on 
> shutdown: 
> <ConcurrentCassandraClientPoolByHost>:{10.120.205.48(10.120.205.48):9160}; 
> IsActive?: true; Active: 3; Blocked: 0; Idle: 3; NumBeforeExhausted: 17
> 12/04/04 15:13:50 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: MARK HOST AS DOWN 
> TRIGGERED for host 10.28.20.200(10.28.20.200):9160
> 12/04/04 15:13:50 ERROR 
> me.prettyprint.cassandra.connection.HConnectionManager: Pool state on 
> shutdown: 
> <ConcurrentCassandraClientPoolByHost>:{10.28.20.200(10.28.20.200):9160}; 
> IsActive?: true; Active: 2; Blocked: 0; Idle: 4; NumBeforeExhausted: 18
> 12/04/04 15:13:51 ERROR 
> com.netseer.services.keywordstat.io.KeywordServiceImpl: Timout 10000 ms
