[
https://issues.apache.org/jira/browse/COMDEV-511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maxim Solodovnik updated COMDEV-511:
------------------------------------
Labels: Doris Mentor full-time gsoc2023 (was: ApacheDoris Mentor full-time
gsoc2023)
> [GSoC][Doris]Dictionary Encoding Acceleration
> ---------------------------------------------
>
> Key: COMDEV-511
> URL: https://issues.apache.org/jira/browse/COMDEV-511
> Project: Community Development
> Issue Type: Task
> Components: GSoC/Mentoring ideas
> Reporter: Zhijing Lu
> Priority: Major
> Labels: Doris, Mentor, full-time, gsoc2023
>
> *Apache Doris*
> Apache Doris is a real-time analytical database based on MPP architecture. As
> a unified platform that supports multiple data processing scenarios, it
> ensures high performance for low-latency and high-throughput queries, allows
> for easy federated queries on data lakes, and supports various data ingestion
> methods.
> {*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
> {*}Github{*}: [https://github.com/apache/doris]
> h3. *Background*
> In Apache Doris, dictionary encoding is performed during data writing and
> compaction. It is applied to string data types by default. The dictionary
> size of a column within one segment is at most 1M. Dictionary encoding
> accelerates queries on string columns, for example by converting the strings
> into INT values.
>
> h3. *Task*
> * Phase One: Get familiar with the implementation of Apache Doris dictionary
> encoding and learn how it accelerates queries.
> * Phase Two: Evaluate the effectiveness of full dictionary encoding and
> figure out how to optimize memory in such a case.
> h3. *Learning Material*
> {*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
> {*}Github{*}: [https://github.com/apache/doris]
> h3. Mentor
> * Mentor: Chen Zhang, Apache Doris Committer,
> [[email protected]|mailto:[email protected]]
> * Mentor: Zhijing Lu, Apache Doris Committer,
> [[email protected]|mailto:[email protected]]
> * Mailing List: [email protected]
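
For readers new to the idea in the Background section quoted above (strings converted into INT to speed up queries), here is a minimal sketch of dictionary encoding in general. It is not Apache Doris code: the function names `dict_encode` and `filter_equals` and the sample column are hypothetical, and Doris's actual storage layer is assumed to differ in layout and detail.

```python
from typing import Dict, List, Tuple


def dict_encode(values: List[str]) -> Tuple[List[str], List[int]]:
    """Return (dictionary, codes): each distinct string gets one integer code."""
    dictionary: List[str] = []      # code -> string
    code_of: Dict[str, int] = {}    # string -> code
    codes: List[int] = []           # one code per row
    for v in values:
        if v not in code_of:
            code_of[v] = len(dictionary)
            dictionary.append(v)
        codes.append(code_of[v])
    return dictionary, codes


def filter_equals(dictionary: List[str], codes: List[int], needle: str) -> List[int]:
    """Row indexes where the column equals needle, comparing integers per row."""
    try:
        target = dictionary.index(needle)  # one string lookup per query
    except ValueError:
        return []  # needle is not in the dictionary, so no row can match
    return [row for row, code in enumerate(codes) if code == target]


# Hypothetical sample column, for illustration only.
column = ["beijing", "shanghai", "beijing", "shenzhen", "shanghai", "beijing"]
dictionary, codes = dict_encode(column)
matches = filter_equals(dictionary, codes, "beijing")
```

The point of the sketch is that the string comparison happens once against the dictionary, after which every row is checked with an integer comparison. The memory question raised in Phase Two follows from the same structure: the dictionary grows with the number of distinct values in the column.
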
--
This message was sent by Atlassian Jira
(v8.20.10#820010)