[ https://issues.apache.org/jira/browse/IMPALA-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tim Armstrong updated IMPALA-2138:
----------------------------------
    Priority: Major  (was: Critical)

> Get rid of unused columns by upstream operators at points of materialization
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-2138
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2138
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 1.4, Impala 2.0, Impala 2.2
>            Reporter: Ippokratis Pandis
>            Priority: Major
>              Labels: performance
>         Attachments: 0001-Projection-prototype.patch
>
>
> It would be a very good performance improvement if we could get rid of
> columns as soon as we know that no upstream operator is going to use them.
> This would reduce the amount of data we handle, making network and I/O
> (spilling) transfers more efficient. It would also improve cache performance.
> The current row-wise in-memory format does not make it easy to get rid of
> such unused columns. However, there are points of materialization where we
> copy out the tuples and can actually perform these projections. There are
> multiple points of materialization, notably:
> * The exchange operator
> * The build side of hash join
> * The probe side of hash join when we have spilling
> * The aggregation
> * Sorts and analytic function evaluation
> In order to do these projections we need to modify the FE so that it knows,
> at each operator, the minimum set of columns referenced by that operator and
> all the upstream ones. (That minimum set is easy to calculate during an
> additional top-down traversal of the plan.) We also need to modify the BE and
> make the copy-out operation aware of such projections.
> Assigning first to Alex, because of the needed FE changes. Happy to take care
> of the needed BE changes. Perhaps we could split this issue into 2 sub-tasks,
> the FE and the BE changes.
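
The top-down traversal mentioned in the description could look roughly like the sketch below. It is only an illustration under stated assumptions: PlanNode, its fields, available_columns and assign_required_columns are hypothetical names and are not Impala's frontend classes or the attached prototype, and the example plan and column names are made up. Each operator keeps the columns that it or any upstream operator references, restricted to the columns that actually exist in its subtree.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Hypothetical plan operator; not Impala's actual frontend class."""
    name: str
    references: frozenset = frozenset()  # columns this operator reads
    produces: frozenset = frozenset()    # columns this operator introduces
    children: list = field(default_factory=list)
    required: frozenset = frozenset()    # filled in by assign_required_columns


def available_columns(node):
    """Columns that exist in the rows flowing out of node's subtree."""
    cols = set(node.produces)
    for child in node.children:
        cols |= available_columns(child)
    return frozenset(cols)


def assign_required_columns(node, needed_upstream=frozenset()):
    """Top-down pass: record the columns each operator has to keep."""
    # Columns referenced by this operator and by every operator above it.
    needed = frozenset(node.references) | needed_upstream
    # Only columns present in this subtree can be kept (or dropped) here.
    node.required = needed & available_columns(node)
    for child in node.children:
        assign_required_columns(child, needed)


# Made-up plan: scan two tables, join them, aggregate, return the result.
scan_orders = PlanNode(
    "scan orders",
    produces=frozenset({"o_id", "o_cust", "o_total", "o_comment"}))
scan_cust = PlanNode(
    "scan customers",
    produces=frozenset({"c_id", "c_name", "c_address"}))
join = PlanNode(
    "hash join",
    references=frozenset({"o_cust", "c_id"}),
    children=[scan_orders, scan_cust])
agg = PlanNode(
    "aggregate",
    references=frozenset({"c_name", "o_total"}),
    produces=frozenset({"sum_total"}),
    children=[join])
root = PlanNode(
    "plan root",
    references=frozenset({"c_name", "sum_total"}),
    children=[agg])

assign_required_columns(root)
```

In this made-up plan, the copy-out on the build side of the join (the customers scan) would only need to keep c_id and c_name, so c_address could be dropped at that point of materialization. Recomputing available_columns at every node keeps the sketch short; a real pass would presumably cache it.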