[ https://issues.apache.org/jira/browse/KYLIN-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934465#comment-14934465 ]
Shaofeng SHI commented on KYLIN-1048:
-------------------------------------
The problem is in Cuboid.translateToValidCuboid(), which finds an existing
parent cuboid for the given dimension/filter list. The old algorithm was a
breadth-first search with O(N^2) complexity. I rewrote this method to compute
the parent cuboid directly, which reduces the complexity to O(1). After
deploying this patch, CPU and memory usage stay at a normal level for the same
query, and the result is returned quickly.
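
To illustrate the difference between searching for a parent cuboid and computing it directly, here is a minimal sketch. It is not the code from the patch: the class CuboidSketch, its method names, the bitmask layout (dimension i is bit i of the cuboid ID), and the validity rule (all mandatory dimensions present, and any hierarchy level present only together with every level above it) are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch, not Kylin's implementation.
// A cuboid ID is a bitmask of dimensions. Each hierarchy is an array of
// single-bit masks ordered from the top level to the bottom level.
final class CuboidSketch {

    // Assumed validity rule: all mandatory bits are set, and within each
    // hierarchy a level may only be present if every higher level is present.
    static boolean isValid(long cuboid, long mandatory, long[][] hierarchies) {
        if ((cuboid & mandatory) != mandatory) {
            return false;
        }
        for (long[] h : hierarchies) {
            boolean lowerSeen = false;
            for (int i = h.length - 1; i >= 0; i--) {
                boolean present = (cuboid & h[i]) != 0;
                if (lowerSeen && !present) {
                    return false;
                }
                lowerSeen |= present;
            }
        }
        return true;
    }

    // Search approach: breadth-first, adding one missing dimension per step
    // until a valid cuboid is reached. The number of candidates visited grows
    // quickly when many dimensions are missing.
    static long searchParent(long cuboid, int dimCount, long mandatory, long[][] hierarchies) {
        ArrayDeque<Long> queue = new ArrayDeque<>();
        Set<Long> seen = new HashSet<>();
        queue.add(cuboid);
        seen.add(cuboid);
        while (!queue.isEmpty()) {
            long c = queue.poll();
            if (isValid(c, mandatory, hierarchies)) {
                return c;
            }
            for (int d = 0; d < dimCount; d++) {
                long next = c | (1L << d);
                if (next != c && seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        throw new IllegalStateException("no valid parent cuboid found");
    }

    // Direct approach: add the mandatory bits, then in each hierarchy fill in
    // every level above the lowest level that is present. The work depends on
    // the number of hierarchy levels, not on how many cuboids exist.
    static long directParent(long cuboid, long mandatory, long[][] hierarchies) {
        long result = cuboid | mandatory;
        for (long[] h : hierarchies) {
            boolean lowerSeen = false;
            for (int i = h.length - 1; i >= 0; i--) {
                if ((result & h[i]) != 0) {
                    lowerSeen = true;
                }
                if (lowerSeen) {
                    result |= h[i];
                }
            }
        }
        return result;
    }
}
```

With six dimensions, two disjoint hierarchies {1, 2, 4} and {8, 16}, and a mandatory mask of 32, both methods map the request 20 (the lowest level of each hierarchy) to 63, but only the search has to enumerate intermediate candidates to get there.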
> CPU and memory killer in Cuboid.findById()
> ------------------------------------------
>
> Key: KYLIN-1048
> URL: https://issues.apache.org/jira/browse/KYLIN-1048
> Project: Kylin
> Issue Type: Improvement
> Components: Metadata
> Affects Versions: v1.0, v0.7.2, v0.7.1
> Reporter: Shaofeng SHI
> Assignee: Shaofeng SHI
> Fix For: v2.0, v1.1
>
>
> In a cube which has 37 dimensions and a couple of aggregation groups (each
> group has a hierarchy), a SQL query whose columns span multiple aggregation
> groups cannot be returned in time. CPU usage becomes very high and memory
> usage keeps increasing, until the Kylin server finally crashes with an
> OutOfMemory error.
> After some analysis and profiling, we identified that when the cube is a
> sparse partial cube, Cuboid.findById() performs badly and consumes a lot of
> CPU and memory.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)