Bill Farner created AURORA-1090:
-----------------------------------

             Summary: Optimize or remove shard uniqueness check from StorageBackfill
                 Key: AURORA-1090
                 URL: https://issues.apache.org/jira/browse/AURORA-1090
             Project: Aurora
          Issue Type: Task
          Components: Scheduler, Technical Debt
            Reporter: Bill Farner
            Priority: Critical


We have noticed that during scheduler startup, there can be a significant 
amount of time spent between the following log lines:

{noformat}
Performing shard uniqueness sanity check.
storage state machine transition PREPARED -> READY
{noformat}

Looking at what happens in the scheduler between those points, the expensive 
operation seems to be {{guaranteeShardUniqueness}}.

This operation aims to validate the integrity of the storage, but its value is 
dubious.  There are many other things that could be done to validate integrity, 
but they should probably not be done every time the scheduler loads its 
database.

If the operation is kept, it can be dramatically optimized.  It currently 
performs an O(n^2) scan of tasks, and this could trivially be reduced to O(n).
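
A minimal sketch of the O(n) approach, not the scheduler's actual code: it 
assumes uniqueness means that no two active tasks share the same (job key, 
instance ID) pair.  {{InstanceKey}}, {{findDuplicates}} and 
{{ShardUniquenessSketch}} are hypothetical names introduced for illustration.  
A single pass with a hash set replaces the pairwise comparison, giving 
expected linear time.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class ShardUniquenessSketch {

  // Hypothetical stand-in for the identity of a task: job key plus instance ID.
  static final class InstanceKey {
    final String jobKey;
    final int instanceId;

    InstanceKey(String jobKey, int instanceId) {
      this.jobKey = jobKey;
      this.instanceId = instanceId;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof InstanceKey)) {
        return false;
      }
      InstanceKey other = (InstanceKey) o;
      return instanceId == other.instanceId && jobKey.equals(other.jobKey);
    }

    @Override
    public int hashCode() {
      return Objects.hash(jobKey, instanceId);
    }
  }

  // Single pass: Set.add returns false when the key was already present,
  // so each task costs one expected O(1) hash lookup instead of a scan
  // over every other task.
  static List<InstanceKey> findDuplicates(List<InstanceKey> activeTasks) {
    Set<InstanceKey> seen = new HashSet<>();
    List<InstanceKey> duplicates = new ArrayList<>();
    for (InstanceKey key : activeTasks) {
      if (!seen.add(key)) {
        duplicates.add(key);
      }
    }
    return duplicates;
  }
}
```

An empty result from {{findDuplicates}} would mean the check passed; any 
returned keys identify the colliding instances.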



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
