[jira] [Created] (TAJO-613) Hedging against unusually slow TajoWorker

Keuntae Park (JIRA) Wed, 19 Feb 2014 21:27:06 -0800

Keuntae Park created TAJO-613:
---------------------------------

             Summary: Hedging against unusually slow TajoWorker
                 Key: TAJO-613
                 URL: https://issues.apache.org/jira/browse/TAJO-613
             Project: Tajo
          Issue Type: Improvement
            Reporter: Keuntae Park



When one of the disks in my Tajo cluster becomes unhealthy (that is, its 
response time becomes slow because of a hardware problem), query processing 
becomes extremely slow.

The following is the kernel log of the server that has the unhealthy disk:
{noformat}
Feb 18 15:20:12 ceo-tajo03 kernel: sd 0:2:4:0: [sde] Unhandled error code
Feb 18 15:20:12 ceo-tajo03 kernel: sd 0:2:4:0: [sde] Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK
Feb 18 15:20:12 ceo-tajo03 kernel: sd 0:2:4:0: [sde] CDB: Read(16): 88 00 00 00 00 01 57 ec 66 32 00 00 01 00 00 00
...
{noformat}

Because of this problem, a TaskRunner that normally takes less than 3 seconds 
for the given query takes 1700 seconds, and the total query execution time 
grows to 1750 seconds from the usual 70 seconds.

I think Tajo needs a mechanism like MapReduce's speculative execution.
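
To make the idea concrete, here is a minimal sketch of the kind of straggler check that a speculative-execution mechanism could start from. It is written in Python for brevity and is not Tajo code: the function name `find_stragglers`, its parameters, and the default thresholds are all assumptions made for illustration. It flags a running task once its elapsed time exceeds a multiple of the median duration of tasks that have already completed.

```python
from statistics import median


def find_stragglers(completed_secs, running_secs,
                    slowdown_factor=10.0, min_samples=3):
    """Return the ids of running tasks that look unusually slow.

    completed_secs  -- durations (seconds) of tasks that already finished
    running_secs    -- dict of task id -> elapsed seconds for running tasks
    slowdown_factor -- hypothetical multiplier over the median duration
    min_samples     -- hypothetical minimum of finished tasks needed
                       before any task is judged
    """
    # With too few finished tasks there is no reliable baseline yet.
    if len(completed_secs) < min_samples:
        return []

    threshold = slowdown_factor * median(completed_secs)

    # A task is a candidate for a backup attempt on another worker
    # once it has run longer than the threshold.
    return sorted(task_id for task_id, elapsed in running_secs.items()
                  if elapsed > threshold)


if __name__ == "__main__":
    completed = [2.5, 2.8, 3.0, 2.9]            # typical tasks: about 3 seconds
    running = {"task-7": 45.0, "task-8": 1.2}   # task-7 sits on the slow disk
    print(find_stragglers(completed, running))
```

With typical task durations of about 3 seconds, a check like this would flag the task on the unhealthy disk after a few tens of seconds instead of letting it run for 1700 seconds. What to do with a flagged task (for example, launching a duplicate attempt on a different worker and taking whichever finishes first) is a separate design decision.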



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
