[GitHub] [doris] Kikyou1997 commented on a diff in pull request #12765: [feature-wip](statistics) collect statistics by sql task

GitBox Wed, 21 Sep 2022 00:50:05 -0700


Kikyou1997 commented on code in PR #12765:
URL: https://github.com/apache/doris/pull/12765#discussion_r976151896



##########
fe/fe-core/src/main/java/org/apache/doris/statistics/SQLStatisticsTask.java:
##########
@@ -17,47 +17,119 @@
 
 package org.apache.doris.statistics;
 
-import org.apache.doris.analysis.SelectStmt;
+import org.apache.doris.catalog.Database;
+import org.apache.doris.catalog.Env;
+import org.apache.doris.catalog.Table;
+import org.apache.doris.common.DdlException;
+import org.apache.doris.common.InvalidFormatException;
+import org.apache.doris.statistics.StatisticsTaskResult.TaskResult;
+import org.apache.doris.statistics.StatsGranularity.Granularity;
+import org.apache.doris.statistics.util.InternalQuery;
+import org.apache.doris.statistics.util.InternalQueryResult;
+import org.apache.doris.statistics.util.InternalQueryResult.ResultRow;
+import org.apache.doris.statistics.util.InternalSqlTemplate;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
 
 import java.util.List;
+import java.util.Map;
 
 /**
  * A statistics task that collects statistics by executing query.
  * The results of the query will be returned as @StatisticsTaskResult.
  */
 public class SQLStatisticsTask extends StatisticsTask {
-    private SelectStmt query;
+    private String statement;
 
     public SQLStatisticsTask(long jobId, List<StatisticsDesc> statsDescs) {
         super(jobId, statsDescs);
     }
 
     @Override
     public StatisticsTaskResult call() throws Exception {
-        // TODO
-        // step1: construct query by statsDescList
-        constructQuery();
-        // step2: execute query
-        // the result should be sequence by @statsTypeList
-        List<String> queryResultList = executeQuery(query);
-        // step3: construct StatisticsTaskResult by query result
-        constructTaskResult(queryResultList);
-        return null;
+        checkStatisticsDesc();
+        List<TaskResult> taskResults = Lists.newArrayList();
+
+        for (StatisticsDesc statsDesc : statsDescs) {
+            statement = constructQuery(statsDesc);
+            TaskResult taskResult = executeQuery(statsDesc);
+            taskResults.add(taskResult);
+        }
+
+        return new StatisticsTaskResult(taskResults);
     }
 
-    protected void constructQuery() {
-        // TODO
-        // step1: construct FROM by @granularityDesc
-        // step2: construct SELECT LIST by @statsTypeList
+    protected String constructQuery(StatisticsDesc statsDesc) throws DdlException,
+            InvalidFormatException {
+        Map<String, String> params = getQueryParams(statsDesc);
+        List<StatsType> statsTypes = statsDesc.getStatsTypes();
+        StatsType type = statsTypes.get(0);
+
+        StatsGranularity statsGranularity = statsDesc.getStatsGranularity();
+        Granularity granularity = statsGranularity.getGranularity();
+        boolean nonPartitioned = granularity != Granularity.PARTITION;
+
+        switch (type) {

Review Comment:
   First, I don't think a single combined statement would take much longer to 
execute than many small SQL statements. Besides, combining those SQL statements 
into a single statement actually makes resource control for statistics easier. 
Otherwise you would have to set resource limits for each small statistics SQL, 
control their total resource usage, and make sure that not too many small 
statistics SQL statements are executing at the same time. All of this is 
unnecessary and makes resource management and code maintenance much more 
complicated.
   Also, is it really necessary to let users specify which concrete column 
metrics are collected by the ANALYZE SQL statement? I think this consideration 
is totally redundant.
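
   To make the suggestion concrete, here is a minimal sketch of building one 
combined statement per column instead of one statement per metric. Everything 
in it is hypothetical and for illustration only: `CombinedStatsQuery`, its 
`Metric` enum, and the aggregate expressions are not taken from the PR or from 
the existing `StatsType` / `InternalSqlTemplate` code, and the table and column 
names are assumed to be trusted identifiers that need no escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Doris: builds a single SELECT that
// computes several column metrics in one pass over the table.
final class CombinedStatsQuery {

    // Illustrative metric names; the real StatsType enum may differ.
    enum Metric { ROW_COUNT, NDV, NUM_NULLS, MIN_VALUE, MAX_VALUE }

    // Maps one metric to a standard SQL aggregate expression over the column.
    static String toExpr(Metric metric, String column) {
        switch (metric) {
            case ROW_COUNT:
                return "COUNT(1) AS row_count";
            case NDV:
                return "COUNT(DISTINCT " + column + ") AS ndv";
            case NUM_NULLS:
                return "SUM(CASE WHEN " + column + " IS NULL THEN 1 ELSE 0 END) AS num_nulls";
            case MIN_VALUE:
                return "MIN(" + column + ") AS min_value";
            case MAX_VALUE:
                return "MAX(" + column + ") AS max_value";
            default:
                throw new IllegalArgumentException("Unsupported metric: " + metric);
        }
    }

    // Joins all requested metrics into one statement, so only one query
    // needs to be scheduled and resource-limited per column.
    // Assumption: table and column are trusted identifiers (no escaping here).
    static String build(String table, String column, List<Metric> metrics) {
        if (metrics.isEmpty()) {
            throw new IllegalArgumentException("At least one metric is required");
        }
        List<String> exprs = new ArrayList<>();
        for (Metric metric : metrics) {
            exprs.add(toExpr(metric, column));
        }
        return "SELECT " + String.join(", ", exprs) + " FROM " + table;
    }

    public static void main(String[] args) {
        // Prints one statement covering every metric for column c1.
        System.out.println(build("db1.t1", "c1", Arrays.asList(Metric.values())));
    }
}
```

   With this shape there is only one query per column to limit and schedule, 
which is the resource-control point made above. Whether the combined statement 
is actually no slower than the separate ones would still need to be measured.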



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

