Doris is built on a column-oriented storage format. In high-concurrency
serving scenarios, users often want to fetch whole rows, but a column-oriented
format massively amplifies random read IO when the table is wide. The Doris
query engine and planner are also too heavy for simple queries such as point
queries, so we need a short, fast path for them. In addition, FE is the access
layer service for SQL queries and is written in Java; analyzing and parsing SQL
causes very high CPU overhead under high-concurrency queries.
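To make the read amplification concrete, here is a toy Python model (not Doris
code). The class names ColumnStore and RowStore and the "one random read per
column" cost model are assumptions made for illustration only; a real engine
reads pages and has caches, so actual numbers will differ.

```python
# Toy model: count the random reads needed to fetch one whole row.
# ColumnStore / RowStore are hypothetical names, not Doris classes.

class ColumnStore:
    """Each column is stored as its own contiguous array."""

    def __init__(self, rows, column_names):
        self.column_names = list(column_names)
        self.columns = {
            name: [row[i] for row in rows]
            for i, name in enumerate(self.column_names)
        }
        self.random_reads = 0

    def get_row(self, row_id):
        values = []
        for name in self.column_names:
            # Assumption: touching each column costs one random read.
            self.random_reads += 1
            values.append(self.columns[name][row_id])
        return tuple(values)


class RowStore:
    """All columns of a row are stored together."""

    def __init__(self, rows):
        self.rows = [tuple(row) for row in rows]
        self.random_reads = 0

    def get_row(self, row_id):
        # Assumption: the whole row comes back in one random read.
        self.random_reads += 1
        return self.rows[row_id]


# A wide table: 200 columns, 1000 rows.
column_names = ["c%d" % i for i in range(200)]
rows = [tuple(r * 1000 + c for c in range(200)) for r in range(1000)]

column_store = ColumnStore(rows, column_names)
row_store = RowStore(rows)

assert column_store.get_row(42) == row_store.get_row(42)
print("column store reads:", column_store.random_reads)  # 200
print("row store reads:", row_store.random_reads)        # 1
```

In this model the cost of a whole-row fetch grows linearly with the number of
columns in the column layout and stays constant in the row layout, which is
the gap a wide table exposes.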
To address these drawbacks, the following optimization methods can be
applied:
1. Row Store Format Optimization: In high-concurrency serving scenarios, users
often want to retrieve entire rows. To address the issue of high random read IO
in wide tables, a row store format can be introduced in the system. This format
stores all columns of a row together, so an entire row can be retrieved in a
single read operation, reducing the number of disk accesses required and
improving performance.
2. Short Path Optimization for Point Queries: The heavy query engine and
planner in the system add high overhead to simple point queries. To address
this, a short path optimization can be implemented for point queries, bypassing
the heavy query engine and using a fast and efficient path to directly retrieve
the required data, improving performance.
3. Prepared Statement Optimization: High CPU overhead in high-concurrency
queries can be partly attributed to the CPU-intensive process of analyzing and
parsing SQL in the frontend (FE) layer. To address this, a prepared statement
optimization can be implemented. A prepared statement is a precompiled SQL
statement that can be executed multiple times, reducing the overhead of
analyzing and parsing SQL and improving performance.
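Items 2 and 3 can be sketched together with a toy frontend in Python. This is
an illustration under stated assumptions, not the FE implementation: the class
ToyFrontend, its methods, and the regex-based "parser" are all hypothetical,
and the "short path" here is simply a direct primary-key lookup in a dict.

```python
import re

# Hypothetical toy frontend: parse a point-query template once, then reuse it.
POINT_QUERY = re.compile(
    r"\s*SELECT\s+\*\s+FROM\s+(\w+)\s+WHERE\s+(\w+)\s*=\s*\?\s*",
    re.IGNORECASE,
)


class ToyFrontend:
    def __init__(self, tables):
        # tables: {table_name: {primary_key: row}}
        self.tables = tables
        self.prepared = {}    # SQL template -> parsed plan
        self.parse_count = 0  # how many times parsing was paid for

    def _parse(self, sql):
        # Stand-in for the expensive analyze/parse step.
        self.parse_count += 1
        match = POINT_QUERY.fullmatch(sql)
        if match is None:
            raise ValueError("only point-query templates are supported")
        return {"table": match.group(1), "key_column": match.group(2)}

    def _point_lookup(self, plan, key):
        # Short path: no general planner, just a direct key lookup.
        return self.tables[plan["table"]].get(key)

    def query_unprepared(self, sql, key):
        # Every call parses the statement again.
        return self._point_lookup(self._parse(sql), key)

    def prepare(self, sql):
        # Parse once and cache the plan; the template acts as the handle.
        if sql not in self.prepared:
            self.prepared[sql] = self._parse(sql)
        return sql

    def execute(self, handle, key):
        # Reuse the cached plan; only the bound key changes per call.
        return self._point_lookup(self.prepared[handle], key)


frontend = ToyFrontend({"users": {1: (1, "alice"), 2: (2, "bob")}})
sql = "SELECT * FROM users WHERE id = ?"

for _ in range(1000):
    frontend.query_unprepared(sql, 1)
print("parses without prepare:", frontend.parse_count)  # 1000

frontend.parse_count = 0
handle = frontend.prepare(sql)
for _ in range(1000):
    frontend.execute(handle, 1)
print("parses with prepare:", frontend.parse_count)     # 1
```

The point of the sketch is only the shape of the saving: parsing cost is paid
once per statement template instead of once per execution, and each execution
goes straight to a key lookup.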
In conclusion, these optimizations can help address the performance issues
faced by Doris in high concurrency scenarios. By providing a row store format,
implementing a short path optimization for point queries, and using prepared
statements, Doris can deliver fast and efficient performance for high
concurrency queries.
李航宇
[email protected]