[GitHub] [incubator-tvm] comaniac opened a new pull request #6835: [AutoScheduler] Improve tuning with random cost model

GitBox Tue, 03 Nov 2020 11:18:24 -0800


comaniac opened a new pull request #6835:
URL: https://github.com/apache/incubator-tvm/pull/6835



   When tuning a task with all invective operators on GPU, Ansor inlines all 
operators for better performance. However, this makes the expression complex 
and causes high overhead of lowering the TE schedule, which is required for 
cost model feature extraction. Since the tuning space of these tasks is 
relatively small, it is sufficient to use random cost model so that we can 
avoid the lowering overhead.
   
   However, the flow of InitPopulation -> RandomStates only gave me one state. 
After diving into details, I found that all states generated by initial 
populations are the same in terms of state.ToStr(). The reasons I can think of 
are either 2,048 is insufficient to produce two different states, or ToStr() 
does not reflect different states. Here is an example output of initial 
population sampling with de-duplication (throw away the states if it is already 
in the out_states):
   
   ```
   ------------------------------------------------------------
   -------------------------  [ Search ]
   ------------------------------------------------------------
   Generate Sketches               #s: 1
   Sample Iter: 5  #Pop: 1 #Target: 50     fail_ct: 10239  Time elapsed: 46.81
   #Target has been reduced to 25 due to too many failures or duplications
   Sample Iter: 10 #Pop: 1 #Target: 25     fail_ct: 20479  Time elapsed: 92.52
   #Target has been reduced to 12 due to too many failures or duplications
   Sample Iter: 15 #Pop: 1 #Target: 12     fail_ct: 30719  Time elapsed: 138.50
   #Target has been reduced to 6 due to too many failures or duplications
   Sample Iter: 20 #Pop: 1 #Target: 6      fail_ct: 40959  Time elapsed: 184.33
   #Target has been reduced to 3 due to too many failures or duplications
   Sample Iter: 25 #Pop: 1 #Target: 3      fail_ct: 51199  Time elapsed: 229.81
   #Target has been reduced to 1 due to too many failures or duplications
   Sample Initial Population       #s: 1   fail_ct: 53247  Time elapsed: 239.00
   ```
   
   As can be seen, even 50K samples cannot even produce the second state. I 
added a set to check the number of unique states before and after infer bound, 
and here is the results:
   
   ```
   ------------------------------------------------------------
   -------------------------  [ Search ]
   ------------------------------------------------------------
   Generate Sketches               #s: 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Sample Iter: 5  #Pop: 1 #Target: 25     fail_ct: 10239  Time elapsed: 95.63
   Unique before InferBound 1
   Unique after InferBound 1
   #Target has been reduced to 12 due to too many failures or duplications
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Sample Iter: 10 #Pop: 1 #Target: 12     fail_ct: 20479  Time elapsed: 190.91
   Unique before InferBound 1
   Unique after InferBound 1
   #Target has been reduced to 6 due to too many failures or duplications
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Sample Iter: 15 #Pop: 1 #Target: 6      fail_ct: 30719  Time elapsed: 285.54
   Unique before InferBound 1
   Unique after InferBound 1
   #Target has been reduced to 3 due to too many failures or duplications
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Unique before InferBound 1
   Unique after InferBound 1
   Sample Iter: 20 #Pop: 1 #Target: 3      fail_ct: 40959  Time elapsed: 380.32
   Unique before InferBound 1
   Unique after InferBound 1
   #Target has been reduced to 1 due to too many failures or duplications
   Sample Initial Population       #s: 1   fail_ct: 43007  Time elapsed: 399.26
   Unique before InferBound 1
   Unique after InferBound 1
   GA Iter: 0      Max score: 0.0121       Min score: 0.0121       #Pop: 1 #M+: 
0  #M-: 0
   Unique before InferBound 1
   Unique after InferBound 7
   GA Iter: 1      Max score: 0.0121       Min score: 0.0121       #Pop: 7 #M+: 
0  #M-: 0
   Unique before InferBound 7
   Unique after InferBound 12
   GA Iter: 2      Max score: 0.0121       Min score: 0.0121       #Pop: 12 
#M+: 0  #M-: 0
   Unique before InferBound 12
   ...
   ```
   
   This log implies two points:
   
   1. We should call ToStr() after infer bound to make sure we can 
differentiate states. As we can see from the log, 2,048 candidates have the 
same ToStr() before infer bound, but we got 7 different ToStr() after infer 
bound.
   
   2. Systematically mutate tile sizes is way more efficient than random 
sampling. As can be seen from the log, we can get 7 different states in only 
2,048 mutated states, but we only get 1 state in 43,007 random states.
   
   Accordingly, this PR runs evolutionary search even we are using the random 
cost model and here is the new log (initial poulation is set to 1 and retry is 
set to 2):
   
   ```
   ------------------------------------------------------------
   -------------------------  [ Search ]
   ------------------------------------------------------------
   Generate Sketches               #s: 1
   Sample Initial Population       #s: 1   fail_ct: 2047   Time elapsed: 19.30
   GA iteration number has been adjusted to 3 due to random cost model
   GA Iter: 0      Max score: 0.3863       Min score: 0.3863       #Pop: 1 #M+: 
0  #M-: 0
   GA Iter: 3      Max score: 0.9023       Min score: 0.0289       #Pop: 12     
   #M+: 1422       #M-: 161
   EvolutionarySearch              #s: 12  Time elapsed: 31.20
   ------------------------------------------------------------
   -------------------------  [ Measure ]
   ------------------------------------------------------------
   Get 12 programs for measure. (This may take a while)
   ............************
   // skip
   ------------------------------------------------------------
   -------------------------  [ Train cost model ]
   ------------------------------------------------------------
   ------------------------------------------------------------
   -------------------------  [ Search ]
   ------------------------------------------------------------
   Sample Initial Population       #s: 1   fail_ct: 2047   Time elapsed: 19.09
   GA iteration number has been adjusted to 3 due to random cost model
   GA Iter: 0      Max score: N/A  Min score: N/A  #Pop: 0 #M+: 0  #M-: 0
   GA Iter: 3      Max score: N/A  Min score: N/A  #Pop: 0 #M+: 1425       #M-: 
155
   EvolutionarySearch              #s: 0   Time elapsed: 32.12
   ------------------------------------------------------------
   -------------------------  [ Search ]
   ------------------------------------------------------------
   Sample Initial Population       #s: 1   fail_ct: 2047   Time elapsed: 19.14
   GA iteration number has been adjusted to 3 due to random cost model
   GA Iter: 0      Max score: N/A  Min score: N/A  #Pop: 0 #M+: 0  #M-: 0
   GA Iter: 3      Max score: N/A  Min score: N/A  #Pop: 0 #M+: 1431       #M-: 
166
   EvolutionarySearch              #s: 0   Time elapsed: 30.53
   ------------------------------------------------------------
   -------------------------  [ Search ]
   ------------------------------------------------------------
   Sample Initial Population       #s: 1   fail_ct: 2047   Time elapsed: 19.34
   GA iteration number has been adjusted to 3 due to random cost model
   GA Iter: 0      Max score: N/A  Min score: N/A  #Pop: 0 #M+: 0  #M-: 0
   GA Iter: 3      Max score: N/A  Min score: N/A  #Pop: 0 #M+: 1427       #M-: 
157
   EvolutionarySearch              #s: 0   Time elapsed: 29.05
   It seems all candidates in the search space have been measured.
   ------------------------------------------------------------
   -------------------------  [ Done ]
   ------------------------------------------------------------
   ```
   
   cc @merrymercy @jcf94 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

[GitHub] [incubator-tvm] comaniac opened a new pull request #6835: [AutoScheduler] Improve tuning with random cost model

Reply via email to