[ https://issues.apache.org/jira/browse/HIVE-16663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
BELUGA BEHR updated HIVE-16663:
-------------------------------
    Description:
It is very common that there are many repeated values in the result set of a query, especially when JOINs are present in the query. As it currently stands, beeline does not attempt to cache any of these values and therefore it consumes a lot of memory.

Adding a string cache may save a lot of memory. There are organizations that use beeline to perform ETL processing of result sets into CSV. This will better support those organizations.

  was:
It is very common that there are many repeated values in the result set of a query. As it currently stands, beeline does not attempt to cache any of these values and therefore it consumes a lot of memory.

Adding a string cache may save a lot of memory. There are organizations that use beeline to perform ETL processing of result sets into CSV. This will better support those organizations.

> String Caching For Rows
> -----------------------
>
>                 Key: HIVE-16663
>                 URL: https://issues.apache.org/jira/browse/HIVE-16663
>             Project: Hive
>          Issue Type: Improvement
>          Components: Beeline
>    Affects Versions: 2.0.1
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>         Attachments: HIVE-16663.1.patch, HIVE-16663.2.patch,
> HIVE-16663.3.patch, HIVE-16663.4.patch, HIVE-16663.5.patch,
> HIVE-16663.6.patch, HIVE-16663.7.patch
>
>
> It is very common that there are many repeated values in the result set of a
> query, especially when JOINs are present in the query. As it currently
> stands, beeline does not attempt to cache any of these values and therefore
> it consumes a lot of memory.
> Adding a string cache may save a lot of memory. There are organizations that
> use beeline to perform ETL processing of result sets into CSV. This will
> better support those organizations.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
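
For illustration only, the kind of string cache the description refers to might look roughly like the sketch below. It is not taken from any of the attached patches: the class name `RowStringCache`, the method names `dedupe` and `entryCount`, and the LRU bound are all assumptions made for this example. The idea is that equal column values read from a result set are replaced by one shared `String` instance, so repeated values (common after a JOIN) do not each hold their own copy in memory.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded string cache; NOT the code from the HIVE-16663 patches. */
final class RowStringCache {
    private final Map<String, String> cache;

    RowStringCache(final int maxEntries) {
        // accessOrder=true keeps entries ordered from least to most recently used,
        // so removeEldestEntry evicts the least recently used value once the
        // (assumed) bound is exceeded.
        this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns a shared instance equal to value, remembering it on first sight. */
    String dedupe(String value) {
        if (value == null) {
            return null;
        }
        String cached = cache.get(value);
        if (cached != null) {
            return cached; // repeated value: reuse the instance already held
        }
        cache.put(value, value);
        return value;
    }

    /** Number of distinct values currently held. */
    int entryCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        RowStringCache cache = new RowStringCache(1024);
        // new String(...) forces distinct objects, standing in for values
        // decoded separately for each row of a result set.
        String first = cache.dedupe(new String("US"));
        String second = cache.dedupe(new String("US"));
        System.out.println(first == second); // prints true: one shared instance
    }
}
```

Bounding the cache is a design choice in this sketch, so that a column with mostly unique values cannot grow the cache without limit; whether and how the actual patches bound their cache is not stated in this message.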