[ https://issues.apache.org/jira/browse/SPARK-17480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15478040#comment-15478040 ]
Apache Spark commented on SPARK-17480:
--------------------------------------

User 'seyfe' has created a pull request for this issue:
https://github.com/apache/spark/pull/15032

> CompressibleColumnBuilder inefficiently call gatherCompressibilityStats
> ------------------------------------------------------------------------
>
>                 Key: SPARK-17480
>                 URL: https://issues.apache.org/jira/browse/SPARK-17480
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Ergin Seyfe
>            Priority: Minor
>
> When we profiled one of our Spark jobs, we saw that
> 6.24% of the CPU time is spent in List.length.
> Scala List's length method is O(N):
> https://github.com/scala/scala/blob/2.10.x/src/library/scala/collection/LinearSeqOptimized.scala#L36
> Since we call it on every iteration of a loop, the loop as a whole becomes O(N^2).
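The complexity claim in the issue description can be illustrated with a minimal Scala sketch. This is not the Spark code and not the change in the pull request; `ListLengthDemo`, `sumQuadratic`, and `sumLinear` are hypothetical names chosen for illustration. It assumes only that the collection is an immutable `List`, whose `length` walks every cell.

```scala
object ListLengthDemo {
  // Hypothetical example of the slow pattern: the loop condition calls
  // `items.length` on every iteration, and List.length is O(N),
  // so the loop as a whole is O(N^2).
  def sumQuadratic(items: List[Int]): Long = {
    var i = 0
    var total = 0L
    while (i < items.length) { // walks the whole list on each check
      total += items(i)        // indexed access on a List is also linear in i
      i += 1
    }
    total
  }

  // Hypothetical linear alternative: traverse the list once by following
  // its tail, so neither `length` nor indexed access is needed.
  def sumLinear(items: List[Int]): Long = {
    var total = 0L
    var rest = items
    while (rest.nonEmpty) {
      total += rest.head
      rest = rest.tail
    }
    total
  }
}
```

Both methods return the same result; only the cost differs. Other ways to avoid the repeated traversal include reading `length` once into a local `val` before the loop, using `foreach`, or storing the elements in an `Array` or `Vector`, where length lookup does not walk the collection.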