[ https://issues.apache.org/jira/browse/KYLIN-1079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
hongbin ma resolved KYLIN-1079.
-------------------------------
    Resolution: Fixed

> Manage large number of entries in metadata store
> ------------------------------------------------
>
>                 Key: KYLIN-1079
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1079
>             Project: Kylin
>          Issue Type: Improvement
>    Affects Versions: v2.0, v1.1, v1.0
>            Reporter: hongbin ma
>            Assignee: hongbin ma
>              Labels: newbie
>             Fix For: v2.1
>
>
> Kylin saves cube metadata, table metadata, and job history/output in a
> metadata store. The HBaseMetadataStore is a fault-tolerant implementation
> that brings no extra dependencies to the system. We use it in real-world
> deployments.
> When a cube or Hive table is updated, the corresponding entries in the
> metadata store are simply updated in place (so there is no way to trace the
> history of cube definitions, though this is not a commonly expected
> feature). However, job histories and outputs are a little special: each
> cubing job's definition and output are saved as new entries in the metadata
> store. As more and more jobs accumulate, a lot of job histories will reside
> in the metadata store. This might harm frontend performance when a user
> wants to query job histories.
> We should tackle the problem from two perspectives:
> 1. A backend tool to delete/archive job history based on given conditions,
> e.g. "all jobs that are older than one month and not referenced by any cube
> segment" (each cube segment keeps track of which job created it).
> 2. The frontend display enforces a timestamp filter when retrieving from the
> metadata store, for efficiency. When showing job lists, for example, a
> "Show last N days" filter is enforced, where N is configurable by the user.
> For HBaseMetadataStore, we save a timestamp for each entry in a separate
> column; this is where HBase SingleColumnValueFilter can help.
> We can start working on this on the 2.x-staging branch (as it is the latest
> dev branch and is more friendly to developers), and backport it to the
> 1.x-staging branch if necessary.
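
The two rules proposed in the issue can be illustrated with a minimal, in-memory sketch. It is not Kylin's actual implementation: `JobHistoryRetention`, `JobEntry`, `purgeCandidates`, and `lastNDays` are hypothetical names introduced here, and "one month" is assumed to mean 30 days. In a real HBaseMetadataStore deployment the timestamp comparison would presumably be pushed down to HBase (for example with SingleColumnValueFilter on the timestamp column) rather than evaluated on the client as it is below.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class JobHistoryRetention {

    /** Hypothetical, simplified view of one job entry in the metadata store. */
    public static final class JobEntry {
        public final String id;
        public final Instant lastModified;

        public JobEntry(String id, Instant lastModified) {
            this.id = id;
            this.lastModified = lastModified;
        }
    }

    /**
     * Perspective 1: ids of jobs that are older than maxAge AND are not
     * referenced by any cube segment. Only these are safe to delete/archive.
     */
    public static List<String> purgeCandidates(List<JobEntry> jobs,
                                               Set<String> referencedJobIds,
                                               Instant now,
                                               Duration maxAge) {
        Instant cutoff = now.minus(maxAge);
        List<String> result = new ArrayList<>();
        for (JobEntry job : jobs) {
            boolean tooOld = job.lastModified.isBefore(cutoff);
            boolean referenced = referencedJobIds.contains(job.id);
            if (tooOld && !referenced) {
                result.add(job.id);
            }
        }
        return result;
    }

    /**
     * Perspective 2: ids of jobs modified within the last `days` days,
     * i.e. what a "Show last N days" filter would return.
     */
    public static List<String> lastNDays(List<JobEntry> jobs, Instant now, int days) {
        Instant cutoff = now.minus(Duration.ofDays(days));
        List<String> result = new ArrayList<>();
        for (JobEntry job : jobs) {
            if (!job.lastModified.isBefore(cutoff)) {
                result.add(job.id);
            }
        }
        return result;
    }
}
```

Calling `purgeCandidates(jobs, referencedIds, Instant.now(), Duration.ofDays(30))` would list the entries a cleanup tool could remove, while `lastNDays(jobs, Instant.now(), 7)` mirrors the frontend filter with N = 7. Keeping the "referenced by a cube segment" check separate from the age check ensures that an old job which built a live segment is never purged.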
-- This message was sent by Atlassian JIRA (v6.3.4#6332)