Guangyuan Feng created KYLIN-5571:
-------------------------------------
Summary: Calculating the data size during query pushdown takes too
long, which makes the queries unstoppable.
Key: KYLIN-5571
URL: https://issues.apache.org/jira/browse/KYLIN-5571
Project: Kylin
Issue Type: Improvement
Components: Query Engine
Affects Versions: 5.0-alpha
Reporter: Guangyuan Feng
Assignee: Guangyuan Feng
Fix For: 5.0-alpha
During query pushdown, KE tries to calculate the size of the data involved
in order to set the number of Spark partitions. If there are too many files on
HDFS, this calculation takes a long time to complete.
To improve this situation, the following changes will be made:
# Use a bounded thread pool to calculate the data size
# Add a timeout to the calculation, so that the query can be stopped as soon as possible
After these changes, we can expect the query to complete, or fail with a timeout, within a bounded duration.
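The two changes can be sketched roughly as follows. This is only an illustration of the idea, not the actual patch: the class BoundedSizeEstimator, the method totalSize, and its parameters are hypothetical names, and the sizeOf callback stands in for whatever per-path HDFS size lookup is really used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.ToLongFunction;

// Hypothetical helper: sums per-path sizes on a bounded pool with an overall deadline.
class BoundedSizeEstimator {

    static long totalSize(List<String> paths,
                          ToLongFunction<String> sizeOf, // assumed per-path size lookup
                          int maxThreads,
                          long timeoutMillis)
            throws TimeoutException, InterruptedException {
        // Bounded pool: at most maxThreads lookups run concurrently.
        ExecutorService pool = Executors.newFixedThreadPool(maxThreads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (String path : paths) {
                tasks.add(() -> sizeOf.applyAsLong(path));
            }
            // invokeAll with a timeout cancels every task that has not
            // finished when the deadline expires.
            List<Future<Long>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            long total = 0L;
            for (Future<Long> future : futures) {
                if (future.isCancelled()) {
                    // Fail fast so the caller can stop the query.
                    throw new TimeoutException(
                            "data size calculation exceeded " + timeoutMillis + " ms");
                }
                try {
                    total += future.get();
                } catch (ExecutionException e) {
                    throw new IllegalStateException("size lookup failed", e.getCause());
                }
            }
            return total;
        } finally {
            // Interrupt any lookups still running and release the threads.
            pool.shutdownNow();
        }
    }
}
```

A caller would catch the TimeoutException and either abort the query or fall back to a default partition count; which of the two is appropriate is a design decision this sketch does not settle.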
--
This message was sent by Atlassian Jira
(v8.20.10#820010)