[ https://issues.apache.org/jira/browse/SPARK-26061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-26061:
------------------------------------

    Assignee: Apache Spark

> Reduce the number of unused UnsafeRowWriters created in whole-stage codegen
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-26061
>                 URL: https://issues.apache.org/jira/browse/SPARK-26061
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0, 2.3.1, 2.3.2, 2.4.0
>            Reporter: Kris Mok
>            Assignee: Apache Spark
>            Priority: Trivial
>
> Reduce the number of unused UnsafeRowWriters created in whole-stage generated
> code.
> They come from CodegenSupport.consume() calling prepareRowVar(), which
> uses GenerateUnsafeProjection.createCode() and registers an UnsafeRowWriter
> mutable state, regardless of whether the downstream (parent) operator
> will use the rowVar.
> Even when the downstream doConsume function doesn't use the rowVar (i.e.
> doesn't put row.code as a part of this operator's codegen template), the
> registered UnsafeRowWriter stays there, which makes the init function of the
> generated code a bit bloated.
> This ticket doesn't track the root issue, but makes it slightly less painful:
> when the doConsume function is split out, the prepareRowVar() function is
> called twice, so it's double the pain of unused UnsafeRowWriters. This fix
> simply moves the original call to prepareRowVar() down into the doConsume
> split/no-split branch so that we're back to just 1x the pain.
> To fix the root issue, something that allows the CodegenSupport operators to
> indicate whether or not they're going to use the rowVar would be needed.
> That's a much more elaborate change so I'd like to just make a minor fix
> first.
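
The effect described in the ticket can be illustrated with a small toy model. This is only a sketch, not Spark's actual code: `CodegenContext`, `prepare_row_var`, `consume_before`, and `consume_after` are hypothetical Python stand-ins for the Scala codegen machinery. The only assumption is the one the ticket states: preparing the row var registers an UnsafeRowWriter mutable state as a side effect, whether or not the result is used.

```python
class CodegenContext:
    """Toy stand-in for a codegen context; it only records mutable states."""

    def __init__(self):
        self.mutable_states = []

    def add_mutable_state(self, java_type, name):
        # Give each registered state a unique name, as generated code would need.
        unique_name = f"{name}_{len(self.mutable_states)}"
        self.mutable_states.append((java_type, unique_name))
        return unique_name


def prepare_row_var(ctx):
    """Hypothetical model of prepareRowVar(): registering the writer is a side effect."""
    writer = ctx.add_mutable_state("UnsafeRowWriter", "rowWriter")
    return f"row built by {writer}"


def consume_before(ctx, split_consume):
    """Before the fix: prepared once up front, then again on the split path."""
    row_var = prepare_row_var(ctx)
    if split_consume:
        row_var = prepare_row_var(ctx)  # second, redundant registration
        return ("split", row_var)
    return ("inline", row_var)


def consume_after(ctx, split_consume):
    """After the fix: the call lives inside whichever branch is taken."""
    if split_consume:
        return ("split", prepare_row_var(ctx))
    return ("inline", prepare_row_var(ctx))


if __name__ == "__main__":
    for split in (False, True):
        before, after = CodegenContext(), CodegenContext()
        consume_before(before, split)
        consume_after(after, split)
        # Count the writers registered by each variant.
        print(split, len(before.mutable_states), len(after.mutable_states))
```

In this model the split path registers two writers before the change and one after it, while the non-split path registers one either way. The remaining writer is still registered even if the downstream operator never uses the row, which is the root issue the ticket leaves open.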