[ https://issues.apache.org/jira/browse/IMPALA-6028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Russell resolved IMPALA-6028.
----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.10.0

> Document max_row_size for upgrade awareness
> -------------------------------------------
>
>                 Key: IMPALA-6028
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6028
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Docs
>    Affects Versions: Impala 2.10.0
>            Reporter: Balazs Jeszenszky
>            Assignee: John Russell
>            Priority: Blocker
>             Fix For: Impala 2.10.0
>
>
> Impala 2.10 introduced the max_row_size query option, which replaces the
> earlier read_size startup option. Both control the maximum width of a row
> that Impala can process, but their defaults differ: read_size defaulted to
> 8 MB, while max_row_size defaults to 512 KB. There is a tradeoff between
> being able to read large rows and memory usage. The 512 KB default is
> expected to work well for most use cases; an 8 MB default would reserve
> unnecessarily large chunks of memory in the new buffer pool even when the
> rows are smaller.
> We should advise users that, when upgrading to 2.10, they may have to add
> the query option to their workflows or change the 512 KB default.
> Also, if users set --read_size > 8 MB to process larger rows, we should
> recommend that they revert --read_size to the default and set the
> max_row_size query option to the size of the largest rows they need to
> process. This should greatly reduce memory consumption of HDFS scans. The
> query option gives them the flexibility of overriding the value per query,
> per pool, or globally.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
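
The upgrade advice in the issue description can be illustrated with a minimal Python sketch. It only decides whether a workflow needs to override the 512 KB default and, if so, builds a per-session SET statement. The helper name max_row_size_statement and the constant DEFAULT_MAX_ROW_SIZE are hypothetical and not part of Impala. The sketch also assumes that the option accepts a plain byte count in a SET statement, which should be checked against the Impala documentation for the release in use.

```python
# Hypothetical helper for illustration only; not part of Impala.
# 512 KB is the max_row_size default stated in the issue description.
DEFAULT_MAX_ROW_SIZE = 512 * 1024


def max_row_size_statement(largest_row_bytes):
    """Return a SET statement sized to the largest expected row.

    Returns None when the row fits within the 512 KB default, so the
    workflow does not need to override the query option at all.
    """
    if largest_row_bytes <= 0:
        raise ValueError("largest_row_bytes must be positive")
    if largest_row_bytes <= DEFAULT_MAX_ROW_SIZE:
        return None
    # Assumption: the option takes a byte count, e.g. SET MAX_ROW_SIZE=2097152;
    return f"SET MAX_ROW_SIZE={largest_row_bytes};"


if __name__ == "__main__":
    # Example row widths: one under the default, two that need an override.
    for size in (100 * 1024, 2 * 1024 * 1024, 12 * 1024 * 1024):
        print(size, max_row_size_statement(size))
```

A workflow that previously relied on a raised --read_size would issue the returned statement before the queries that scan wide rows, and leave everything else on the default.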