comaniac commented on a change in pull request #7197:
URL: https://github.com/apache/tvm/pull/7197#discussion_r551479201



##########
File path: src/auto_scheduler/feature.cc
##########
@@ -1462,12 +1462,18 @@ void GetPerStoreFeaturesFromMeasurePairs(const Array<MeasureInput>& inputs,
     if (find_res == task_cache.end()) {
       if (inputs[i]->task->compute_dag.defined()) {  // the measure input is complete
         task = inputs[i]->task;
-      } else {  // the measure input is incomplete
-        // rebuild task for incomplete measure pairs read from file
-        Array<te::Tensor> tensors = (*workload_key_to_tensors)(workload_key);
-        task = SearchTask(ComputeDAG(tensors), workload_key, inputs[i]->task->target,
-                          inputs[i]->task->target_host, inputs[i]->task->hardware_params,
-                          inputs[i]->task->layout_rewrite_option);
+      } else {
+        // The measure input is incomplete, rebuild task for incomplete measure pairs read from file
+        try {
+          Array<te::Tensor> tensors = (*workload_key_to_tensors)(workload_key);
+          task = SearchTask(ComputeDAG(tensors), workload_key, inputs[i]->task->target,
+                            inputs[i]->task->target_host, inputs[i]->task->hardware_params,
+                            inputs[i]->task->layout_rewrite_option);
+        } catch (std::exception& e) {
+          // Cannot build ComputeDAG from workload key, the task may have not been registered in
+          // this search round
+          continue;

Review comment:
       Yeah, I know. I was thinking of a case where you have a `log.json` that has no records for the task you are going to tune, so all of them are ignored when loading. I was thinking of printing a message at the end of feature extraction like we did previously ("encountered XXX errors which are safely ignored"), but since that message has been removed, maybe we're good for now.
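   For illustration, a minimal Python-level sketch of the kind of end-of-extraction summary being discussed; `rebuild_task` here is a hypothetical stand-in for the ComputeDAG reconstruction done on the C++ side in `feature.cc`, not a real TVM API:

```python
# Hypothetical sketch of the "encountered XXX errors which are safely
# ignored" summary. rebuild_task() stands in for the ComputeDAG
# reconstruction performed on the C++ side; it is not a real TVM API.
def extract_features(records, rebuild_task):
    features = []
    num_skipped = 0
    for record in records:
        try:
            task = rebuild_task(record)
        except Exception:
            # The task may not be registered in this search round.
            num_skipped += 1
            continue
        features.append(task)
    if num_skipped:
        print("Feature extraction encountered %d errors which are safely ignored."
              % num_skipped)
    return features
```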

##########
File path: python/tvm/auto_scheduler/cost_model/xgb_model.py
##########
@@ -141,6 +146,12 @@ def update(self, inputs, results):
         self.inputs.extend(inputs)
         self.results.extend(results)
 
+        if len(self.inputs) - self.last_train_length < self.last_train_length / 5:

Review comment:
       Now I get your point, but I'm not sure whether reducing the training frequency could solve the problem in general. At first glance, we could stop training the cost model if 1) the accuracy of the current model is sufficient, or 2) we already have a sufficient number of records. Your solution is similar to (2), so I'm curious: if we already have a sufficient number of records, how much could the accuracy be further improved by training the model again next time with >20% more data?
   
   On the other hand, I'm wondering if we could leverage the first solution. For example, we could calculate the test accuracy on the measured records after every round; if the accuracy is higher than a threshold, we skip the training in the next round.
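   A rough sketch of that idea, assuming a generic `fit`/`predict` model interface; the class, threshold, and attribute names here are hypothetical, not the actual `XGBModel` API:

```python
import numpy as np

# Hypothetical sketch: skip retraining when the current model already
# predicts a held-out slice of the measured records accurately enough.
class ThrottledCostModel:
    def __init__(self, model, error_threshold=0.1, holdout_ratio=0.2):
        self.model = model                    # any fit()/predict() cost model
        self.error_threshold = error_threshold
        self.holdout_ratio = holdout_ratio
        self.inputs, self.results = [], []
        self.trained = False

    def update(self, inputs, results):
        self.inputs.extend(inputs)
        self.results.extend(results)
        if self.trained:
            # Test the current model on the newest slice of records.
            n_test = max(1, int(len(self.inputs) * self.holdout_ratio))
            pred = np.asarray(self.model.predict(self.inputs[-n_test:]))
            truth = np.asarray(self.results[-n_test:], dtype=float)
            rel_error = float(np.mean(
                np.abs(pred - truth) / np.maximum(np.abs(truth), 1e-8)))
            if rel_error < self.error_threshold:
                return  # accurate enough; skip training this round
        self.model.fit(self.inputs, self.results)
        self.trained = True
```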
   



